File naming, organization, and versioning
Plan your file management strategy before starting a project; decisions made early save time later on. When developing file organization conventions, be consistent, document them, and share them with anyone who may access the data.
Directory structures
Consider creating a readme.txt file, in your project's main file folder, that gives an explanation of the directory structure and describes the contents of the major folders. See Massachusetts Institute of Technology's README file & folder schema example.
Best practices
- The main folder should have an informative name. For example: title, unique identifier, and date (year).
- Subfolders should be divided by common theme. For example:
- research activity (interviews, surveys, experiment)
- parameter assessed
- data type (images, text, databases)
- kind of material (publications, deliverables, documentation)
- Consider limiting the folder hierarchy to three or four levels deep, with no more than ten items per folder.
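The depth and item-count guidelines above can be checked automatically. The following is a minimal sketch, not a prescribed tool: the function name `audit_tree` and its parameters are hypothetical, and it assumes that "levels deep" is counted from the main project folder (a subfolder directly inside it is level 1).

```python
from pathlib import Path


def audit_tree(root, max_depth=4, max_items=10):
    """Return warnings for folders nested too deeply or holding too many items.

    Assumption: depth is counted from the main folder, so the main folder
    itself is level 0 and its direct subfolders are level 1.
    """
    root = Path(root)
    folders = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    warnings = []
    for folder in folders:
        depth = len(folder.relative_to(root).parts)
        n_items = sum(1 for _ in folder.iterdir())  # files and subfolders
        if depth > max_depth:
            warnings.append(f"{folder}: nested {depth} levels deep (limit {max_depth})")
        if n_items > max_items:
            warnings.append(f"{folder}: {n_items} items (limit {max_items})")
    return warnings
```

Calling `audit_tree("path/to/project")` returns a list of messages; an empty list means the folder tree stays within both limits. The limits are parameters because the guideline is a suggestion rather than a fixed rule.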
Directory structure examples
Psychology research directory structure example
Source: Berenson, K.R. 2018. Managing your research data and documentation (American Psychological Association).
Marketing research directory structure example
Source: Organising (UK Data Service).
File naming
Consider creating a readme.txt file, in your project's main file folder, that explains your file naming convention, as well as any abbreviations or codes.
Best practices
Common elements in folder or file names:
- Project or experiment name or acronym
- File creator's name/initials
- Date
- Version number
- Data characteristics, for example:
- Location/spatial coordinates
- Type of data (e.g., Survey)
- Conditions (e.g., Lab instrument, Solvent, Temperature, etc.)
Rules of thumb for file names:
- Keep file names as short as possible while including all necessary information.
- Do not use spaces, full stops (.), or special characters (e.g., &, *%#;()!@$^~'{}[]?<>)
- Use hyphens (-), underscores (_), or camel case (FileName) to separate elements in a file name
- Dates should use consistent formatting (e.g., YYYYMMDD)
- Version numbers should have leading zeros to allow for multi-digit versions (e.g., v_05, v_023)
| Examples of useful file names | Examples of poor file names |
|---|---|
| FG1_CONS_20100212.rtf: interview transcript of the first focus group with consumers, that took place on 12 February 2010 | |
| Int024_AP_20080605.doc: interview with participant 024, interviewed by Anne Parsons on 5 June 2008 | Focus group consumers 12 Feb?.doc |
| BDHSurveyProcedures_v04.pdf: version 4 of the survey procedures for the British Dental Health Survey | Health&Safety Procedures1 |
Source: Organising (UK Data Service).
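To illustrate the rules of thumb above, here is a short sketch that assembles a file name from common elements and checks a name against the character rules. The helper names (`build_file_name`, `is_safe_file_name`), the element order, and the `_vNN` version tag are assumptions chosen for illustration; adapt them to the convention documented in your own readme.txt.

```python
import re
from datetime import date

# Letters, digits, hyphens, and underscores only, followed by one extension.
# This encodes the "no spaces, full stops, or special characters" rule.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def build_file_name(project, descriptor, when, version, extension):
    """Join the elements with underscores, using YYYYMMDD and a zero-padded version."""
    return f"{project}_{descriptor}_{when:%Y%m%d}_v{version:02d}.{extension}"


def is_safe_file_name(name):
    """Return True if the name uses only the permitted characters."""
    return SAFE_NAME.fullmatch(name) is not None
```

For example, `build_file_name("FG1", "CONS", date(2010, 2, 12), 1, "rtf")` gives `FG1_CONS_20100212_v01.rtf`, while `is_safe_file_name("Focus group consumers 12 Feb?.doc")` is `False` because of the spaces and the question mark.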
File renaming
Software is available for batch renaming multiple files in a single automated step. Examples include Renamer (Mac) and Bulk Rename Utility (Windows).
Further reading: Batch renaming (Wikipedia).
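Batch renaming can also be scripted. The sketch below is one possible approach, not a replacement for the tools named above: the functions `clean_stem`, `plan_renames`, and `apply_renames` are hypothetical, and the cleaning rule (replace every run of disallowed characters with a single underscore) is an assumption. The plan is built first so that it can be reviewed before any file is touched.

```python
import re
from pathlib import Path


def clean_stem(stem):
    """Replace runs of characters other than letters, digits, '-' and '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")


def plan_renames(folder):
    """Return (source, target) pairs for files whose names would change."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        # Skip subfolders and hidden files such as .gitignore.
        if not path.is_file() or path.name.startswith("."):
            continue
        new_stem = clean_stem(path.stem)
        if not new_stem:
            continue  # nothing usable left after cleaning; leave the file alone
        target = path.with_name(new_stem + path.suffix)
        if target != path:
            plan.append((path, target))
    return plan


def apply_renames(plan):
    """Carry out the planned renames, refusing to overwrite existing files."""
    for source, target in plan:
        if target.exists():
            raise FileExistsError(f"{target} already exists; not overwriting")
        source.rename(target)
```

Printing the result of `plan_renames(folder)` and checking it by eye before calling `apply_renames` acts as a dry run, which is worth doing on a copy of the data first.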
File versioning
Two separate tables should be used for tracking file versions: 1) document control table, and 2) version history. The document control table should include the document's title, file name, description, creator name, maintainer name, date created, and date modified. The version history table should list the version numbers, the person(s) responsible for the version, explanatory notes, and a last amended date.
Best practices
- Keep a copy of the original data, and never edit it.
- Include version information in your file naming convention (e.g., creation or modification date, or version number)
- Use tools or software to help track file versioning. This could include:
- Tools that automatically assign version numbers (e.g., Electronic Lab Notebooks)
- File sharing services (e.g., Dropbox, Google Docs)
- Version control software (e.g., Subversion, Git)
- Version control tables (see below)
Further reading: Versioning (UK Data Service).
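When version numbers are part of the file name, the next name can be derived rather than typed by hand. This is a minimal sketch that assumes names end in a `_vNN` tag before the extension, as in HearingScreenResults_v05.csv; the function name `next_version_name` is hypothetical.

```python
import re
from pathlib import PurePath

# Assumed convention: the stem ends with "_v" followed by digits, e.g. "_v05".
VERSION_TAG = re.compile(r"_v(\d+)$")


def next_version_name(file_name):
    """Return the file name with its version number increased by one.

    The zero padding of the original number is preserved.
    """
    path = PurePath(file_name)
    match = VERSION_TAG.search(path.stem)
    if match is None:
        raise ValueError(f"no _vNN version tag found in {file_name!r}")
    digits = match.group(1)
    bumped = str(int(digits) + 1).zfill(len(digits))
    return f"{path.stem[:match.start()]}_v{bumped}{path.suffix}"
```

For example, `next_version_name("HearingScreenResults_v05.csv")` returns `HearingScreenResults_v06.csv`. Saving changes under the new name, instead of overwriting the previous file, keeps earlier versions and the original data intact.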
Examples of version control tables
| Title | Hearing screening tests in Montreal daycares |
|---|---|
| File name | HearingScreenResults_v05.csv |
| Description | Results data of 120 hearing screening tests carried out in 7 daycares in Montreal during June 2017 |
| Created by | Kate Smith |
| Maintained by | Mandy Watson |
| Created | 04/07/2017 |
| Last modified | 25/11/2017 |
| Version | Responsible | Notes | Last amended |
|---|---|---|---|
| 05 | Mandy Watson | Versions 03 and 04 compared and merged by MW | 25/11/2017 |
| 04 | Alex Thakor | Entries checked by AT, independent from SK | 17/10/2017 |
| 03 | Steve Knight | Entries checked by SK | 29/07/2017 |
| 02 | Karen Miller | Test results 81-120 entered | 05/07/2017 |
| 01 | Mandy Watson | Test results 1-80 entered | 04/07/2017 |
Source: Versioning (UK Data Service).
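A version history like the one above can be kept as a plain CSV file stored next to the data. The sketch below is one possible way to do this, under stated assumptions: the function name `append_version_entry` is hypothetical, the column names mirror the example table, and new rows are appended at the end of the file (oldest first), whereas the example table lists the newest version first.

```python
import csv
from pathlib import Path

# Column names taken from the example version history table.
FIELDS = ["Version", "Responsible", "Notes", "Last amended"]


def append_version_entry(history_file, version, responsible, notes, last_amended):
    """Append one row to the version history, writing the header for a new file."""
    history_file = Path(history_file)
    is_new = not history_file.exists()
    with history_file.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "Version": version,          # keep as text to preserve leading zeros
            "Responsible": responsible,
            "Notes": notes,
            "Last amended": last_amended,
        })
```

Passing the version as a string such as `"05"` keeps its leading zero; some spreadsheet programs strip leading zeros when opening a CSV, so it may be worth importing that column as text.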
Help and resources
Research data management consultations are available for Concordia faculty, students, and staff. Find out more about how librarians on the Library's RDM team can provide guidance. This service is part of Concordia's Institutional Research Data Management Strategy.