A well-organized research project saves you time, improves reproducibility, and makes your data easier to share.
How you organize your data depends on the specific characteristics of your research project. A well-organized project includes:
- a well-thought-out folder structure where you can easily locate files
- sustainable file formats that will last and are independent of specific software
- an appropriate file naming convention that will make your file names comprehensible
- detailed information on the data collection and processing procedures (see Metadata and documentation)
- a README file or similar solution that describes the organization system (see Metadata and documentation)
Folder Structure
Structuring your data files in folders makes it easier to locate and organize files and to keep track of different versions of files. A proper folder structure is especially needed when collaborating with others. The folder structure also affects how your files can be processed and analysed. Once your structure has been filled with data, changing it is laborious and time-consuming. Here are some tips:
- Do not use your computer desktop as a storage place
- Make a folder hierarchy and use descriptive folder names
- Avoid folders that become too general; create more subfolders instead
- Create a structure and follow it. The structure should be:
  - Systematic, logical, and clear before you start (!)
  - Quick and easy to navigate
  - Simple enough to be used all the time
  - Scalable
- Keep active and finished parts of your project separate
- Periodically take the time to tidy
You can find an example of a folder structure that is systematic, simple, and scalable on the CodeRefinery pages. You can find another example (that you can also download) here.
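The tips above can be sketched as a small Python script that sets up a folder hierarchy before any data is added. This is only an illustration: the folder names in `LAYOUT` and the function name `create_project` are hypothetical and should be adapted to your own project.

```python
from pathlib import Path

# Hypothetical layout; adapt the folder names to your own project.
LAYOUT = [
    "data/raw",
    "data/processed",
    "docs",
    "scripts",
    "results/figures",
    "archive",  # finished parts, kept separate from active work
]


def create_project(root, layout=LAYOUT):
    """Create the folders in `layout` under `root` and return their paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    # A top-level README is the place to describe the organization system.
    readme = root / "README.txt"
    if not readme.exists():
        readme.write_text("Describe the folder structure here.\n", encoding="utf-8")
    return created
```

Calling `create_project("my_project")` creates the folders inside `my_project`. Running it again does not overwrite existing folders or an existing README.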
File Formats
At an early stage of your research, you need to decide which formats you will use for your data files. Consider this initial decision carefully. An important part of your project’s metadata and documentation may be embedded in the data file itself. For example, when you take a picture with your mobile phone’s camera, the date and location of the picture are embedded as metadata in the image file that it creates. This information can aid in data analysis, documentation, and reusability.
To ensure that your data can be accessed well into the future, it is often a good idea to store your data (or copies of your data) in sustainable file formats. For example, plain text files (.txt) are more sustainable than Microsoft Word files (.docx) because their format is open, non-proprietary, and widely used.
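As an illustration of storing a copy of tabular data in a plain-text format, the following minimal sketch writes records to a UTF-8 CSV file using Python's standard `csv` module and reads them back. The function names `save_as_csv` and `load_csv` and the column names in the usage note are hypothetical; CSV is one common open format, not the only suitable choice.

```python
import csv
from pathlib import Path


def save_as_csv(rows, fieldnames, path):
    """Write a list of dicts to a UTF-8 CSV file, a plain-text tabular format."""
    path = Path(path)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


def load_csv(path):
    """Read the CSV file back as a list of dicts (all values are strings)."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `save_as_csv([{"sample_id": "s01", "temperature_c": 21.5}], ["sample_id", "temperature_c"], "measurements.csv")` produces a file that any text editor can open. Note that CSV does not store data types, so numbers come back as strings when the file is read.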
Here you will find information on sustainable file formats:
File Naming Conventions
A file naming convention is a set of rules that govern how files are named in your project. Using a file naming convention can help you save time when trying to find a specific file. It will also make your data easier to reuse and reproduce.
Information that you might consider including in a file name:
- Dates or times that are relevant for the contents of the file
- Name of the project or experiment
- A version number for the file
- Short information on the file’s content
- Name or initials of a researcher
- Unique identifiers such as experiment number or number in a series
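One possible way to combine these elements consistently is to generate file names with a small helper instead of typing them by hand. The sketch below is an assumption about how such a helper could look: the function name `build_file_name`, the order of the elements, and the `v` prefix for versions are illustrative choices, not a prescribed convention.

```python
from datetime import date


def build_file_name(project, description, when, version, extension, researcher=None):
    """Join name elements with underscores: general information first, version last."""
    # The date is written year first (YYYYMMDD) so that names sort chronologically.
    parts = [project, when.strftime("%Y%m%d"), description]
    if researcher:
        parts.append(researcher)
    # Two-digit, zero-padded version so that v02 sorts before v10.
    parts.append(f"v{version:02d}")
    return "_".join(parts) + "." + extension.lstrip(".")
```

With these assumptions, `build_file_name("soilsurvey", "ph-measurements", date(2024, 3, 5), 2, "csv")` returns `soilsurvey_20240305_ph-measurements_v02.csv`.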
When naming your files, consider the following best practices:
- Use short and descriptive names
- Put general information first, then add details
- Separate words with underscores or hyphens
- Write dates year first (YYYYMMDD), following the ISO 8601 standard, so that files sort chronologically
- Give numbers (e.g. version or experiment numbers) the same number of digits
  - use 01 and not 1 if numbers will reach 10 or more
  - use 0001 and not 1 if numbers will reach 1000 or more
- Put the version number at the end
- Avoid special characters
  - #, %, &, \, /, ', ", !, $, >, <, {, }, *, ?, =
- DO NOT use spaces in file names
- DO NOT start or end your file name with a space, period, hyphen, or underscore
- Some operating systems (e.g. Linux) are case sensitive, while others (e.g. Windows and, by default, macOS) are not; always use lowercase so that names behave the same everywhere
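Several of these rules can be checked automatically. The following is a minimal sketch, assuming a hypothetical helper named `check_file_name`; it covers only the rules on spaces, case, leading and trailing characters, and the special characters listed above, and leaves judgments such as "short and descriptive" to you.

```python
# Special characters from the list above (straight quote forms).
FORBIDDEN = set("#%&\\/'\"!$><{}*?=")
# Characters a file name should not start or end with.
EDGE_CHARS = (" ", ".", "-", "_")


def check_file_name(name):
    """Return a list of problems found in `name`; an empty list means it passes."""
    problems = []
    if " " in name:
        problems.append("contains a space")
    if name != name.lower():
        problems.append("contains uppercase letters")
    if name.startswith(EDGE_CHARS) or name.endswith(EDGE_CHARS):
        problems.append("starts or ends with a space, period, hyphen, or underscore")
    found = sorted(FORBIDDEN.intersection(name))
    if found:
        problems.append("contains special characters: " + " ".join(found))
    return problems
```

For example, `check_file_name("soilsurvey_20240305_ph-measurements_v02.csv")` returns an empty list, while `check_file_name("Final Report?.docx")` reports a space, uppercase letters, and the special character `?`.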
Need advice?
Contact us at: research-data@uio.no