Data storage and backup
Active data storage
Things to consider
- How much data will your project generate? This is something to consider during the planning phase, because storage costs should be factored into the overall data management plan.
- Who will need access to the data during the project's active phase? Collaborative research means additional challenges to storage and access.
- Will the project involve confidential or sensitive information? If so, you'll need to take extra precautions to avoid accidental disclosure.
Source: Research Data Management: Document & Organize Your Data (University of Saskatchewan Library).
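The first question above can be turned into a rough calculation. The sketch below is illustrative only: the function name, its parameters, the number of copies and the growth margin are all assumptions rather than figures from this guide, so substitute values from your own project.

```python
def estimate_storage_gb(files_per_month, avg_file_mb, months,
                        copies=3, growth_margin=0.2):
    """Return a rough storage estimate in GB (1 GB = 1024 MB).

    copies: working copy plus backups (assumed value, adjust to your plan).
    growth_margin: extra headroom as a fraction (assumed value).
    """
    raw_mb = files_per_month * avg_file_mb * months
    total_mb = raw_mb * copies * (1 + growth_margin)
    return total_mb / 1024


# Hypothetical project: 200 files a month at 50 MB each, over 24 months.
estimate = estimate_storage_gb(files_per_month=200, avg_file_mb=50, months=24)
print(f"Estimated storage need: {estimate:.1f} GB")
```

Comparing the result with the capacities listed for each service below can help decide whether a standard option is enough or whether research storage should be requested.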
Where can I store my active data?
The options below are some of the solutions available for active research data storage.
Microsoft Teams and SharePoint
Concordia faculty and staff have been granted a licence for Office 365, which includes access to the standard suite of office applications (Word, Excel, PowerPoint), file storage (OneDrive), intranet (SharePoint), collaboration tools (Microsoft Teams) and more.
Brief description
- Teams: Collaboration platform combining chat, video meetings, file storage (including collaboration on files), and application integration. Meant to replace services like Google Drive or Dropbox.
- SharePoint: Browser-based document management platform.
Example use
- Teams: Use to share and collaborate with colleagues, as well as for instant communication to reduce the number of emails.
- SharePoint: Use to store files for retention purposes, manage versions, co-edit documents.
Storage capacity
Each Team site has a SharePoint site behind it with a storage limit of 25 TB. Each user can create up to 250 Teams.
Access and collaboration
Files can be accessed and shared with a group. Different permission levels can be granted to different members of a group.
Data allowed
Microsoft employs security measures that meet standards set by the security community. However, Concordia recommends taking additional security measures when storing sensitive or confidential data. These can include password-protecting documents, using multi-factor authentication to protect your account, or encrypting files. While not always needed, these measures and general cyber security awareness help protect the privacy of your data.
See Concordia's Office 365 FAQ for more information as well as Concordia's Privacy Impact Assessments.
Server location
Quebec and Toronto
Versioning
Automatic file versioning for Office 365 files.
Backups
Automatic across both Quebec and Toronto data sites. Files can be retrieved using SharePoint for up to 90 days after a user deletes them. The SharePoint site has a Recycle Bin and a Second Stage Recycle Bin where users can easily rescue deleted files themselves.
OneDrive
Concordia faculty and staff have been granted a licence for Office 365, which includes access to the standard suite of office applications (Word, Excel, PowerPoint), file storage (OneDrive), intranet (SharePoint), collaboration tools (Microsoft Teams) and more.
Brief description
Used to store personal and work-related files. Files stored in OneDrive are private by default, but they can be shared with others for collaboration. Desktop files can be synced with OneDrive to keep a backup in the cloud. OneDrive is meant to replace personal drives such as the C:\ drive or P:\ drive.
Example use
Use to store personal working and reference documents that you don’t necessarily want to share.
Storage capacity
100 GB to 1 TB, depending on faculty eligibility.
Access and collaboration
Meant for personal use; however, individual files and folders can be shared with users within Concordia.
Data allowed
Microsoft employs security measures that meet standards set by the security community. However, Concordia recommends taking additional security measures when storing sensitive or confidential data. These can include password-protecting documents, using multi-factor authentication to protect your account, or encrypting files. While not always needed, these measures and general cyber security awareness help protect the privacy of your data.
See Concordia's Office 365 FAQ for more information as well as Concordia's Privacy Impact Assessments.
Server location
Quebec and Toronto
Versioning
Automatic file versioning for Office 365 files.
Backups
Automatic across both Quebec and Toronto data sites. Files can be retrieved using SharePoint for up to 90 days after a user deletes them. The SharePoint site has a Recycle Bin and a Second Stage Recycle Bin where users can easily rescue deleted files themselves.
For more information, see Office 365 - faculty & staff (Concordia IT Services).
Note that OSF is not a service provided by Concordia.
Brief description
Free online collaboration tool with both hosted and add-on storage options. Use OSF to organize, document and share projects, including files, data, code and protocols.
Example use
Use to work on research projects with multiple collaborators who need varying levels of access.
Storage capacity
5 GB storage limit per project or component for private projects. 50 GB storage limit per project or component for public projects. Add-on storage extends capacity, but its limits are set by the add-on provider and vary from one provider to another. Read more about OSF storage capacity.
Access and collaboration
Can include non-Concordia users. Collaborators can be granted read-only, read-write or administrative permissions.
Data allowed
Most unpublished research data can be added to OSF. However, confidential, restricted, or high-risk data should not be stored there.
Server location
Montreal (default storage location must be set when creating a new project).
Versioning
Automatic file versioning.
Backups
Redundant data centers and infrastructure.
For more information, see the OSF website.
Storing large, sensitive or confidential data can be challenging. Although Microsoft Teams, SharePoint and OneDrive can be used for sensitive data if files are password protected or encrypted (see the FAQ on Concordia's Office 365 webpage), the services below can also be considered.
Concordia's IT Research Support team
Provides consultation services as well as research storage, research server hosting, and research virtualized servers. See Research Support (Concordia IT Services).
Digital Research Alliance of Canada (the Alliance)
The Rapid Access Service allows Principal Investigators (PIs) to request a modest amount of storage. Resource Allocation Competitions are an application-based process for requesting storage and compute resources beyond what is available through the Rapid Access Service.
Find out more about:
- Storage and file management from the Alliance;
- Advanced Research Computing at Concordia, in partnership with the Alliance and Calcul Québec;
- Training opportunities through the Alliance and Calcul Québec.
REDCap
Note that this is not a service provided by Concordia.
"A secure web application for building and managing online surveys and databases. While REDCap can be used to collect virtually any type of data in any environment (including compliance with regulations such as 21 CFR Part 11, FISMA, HIPAA, and GDPR), it is specifically geared to support online and offline data capture for research studies and operations."
Pillar Science
Note that this is not a service provided by Concordia.
A software solution for research data management, research project management and research data analysis, designed for collaborative and interdisciplinary research. Pillar Science is a Montreal-based company.
See also: Active Storage and Security section of the Human Participant Research Data Risk Matrix (p. 6) (Sensitive Data Expert Group of the Portage Network).
Data backup
What files should I back up?
Ideally, you should back up all the data files and associated documentation files, e.g. metadata files, files describing the methodology and/or the instruments used to obtain the data, files describing any manipulation or transformation of the dataset.
What should be the backup frequency?
There are no absolute rules prescribing how often data files should be backed up. However, critical files, especially datasets under construction, should be backed up every time they are modified. Less crucial files can be backed up at regular intervals, for instance daily or weekly. Use a software or hardware solution that automates your backup plan and can handle incremental backups, that is, backups that copy only the files changed since the previous run.
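To illustrate what an automated incremental backup does, here is a minimal Python sketch. It is a simplified, mirror-style approach and not a recommended tool: it copies only files that are new or changed, but it overwrites the previous backup copy instead of keeping older versions, which dedicated backup software would normally do. The function name, folder names and sample file are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def incremental_backup(source_dir, backup_dir):
    """Copy files that are new or modified since the last run.

    Returns the relative paths of the files that were copied.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        # Copy when the backup copy is missing, older, or a different size.
        if (not dst.exists()
                or src.stat().st_mtime > dst.stat().st_mtime
                or src.stat().st_size != dst.stat().st_size):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(rel.as_posix())
    return copied


# Demo in a throwaway directory; "data" and "backup" are hypothetical names.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "survey.csv").write_text("id,score\n1,42\n")
    first_run = incremental_backup(data, Path(tmp) / "backup")
    second_run = incremental_backup(data, Path(tmp) / "backup")

print("First run copied:", first_run)
print("Second run copied:", second_run)
```

A script like this could be run by a scheduler at the chosen interval, but it should be pointed at a backup folder on a different device or location than the working copy.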
What type of storage should I use to back up my files?
No storage medium is perfect; you should use multiple backup media and store at least one copy in a remote location. Here are some storage solutions.
Networked drive
Generally managed at the university or departmental level, these devices are regularly backed up (usually on tape) and provide for easy and secure access to your data.
PC or laptop hard drive
A flexible solution while you are working on a dataset, but it should not be the only storage solution you use. Hard drives can fail and computers can be stolen or lost.
External storage device (USB flash drive, CDs, DVDs)
Although this is a common and affordable backup solution, there are several issues and precautions to be aware of:
- Depending on the size of your dataset, you may have to use multiple devices
- The longevity of these media is questionable
- Follow the care and handling instructions carefully
- Regularly check your files to see if they are accessible and complete
- Make sure to “refresh” your data by making new copies on a new CD or USB drive
- Encrypt any confidential data or protect it with a password
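One way to check regularly that files are still accessible and complete is to record a checksum for each file when the backup is made, then recompute and compare the checksums later. The sketch below assumes SHA-256 checksums and uses hypothetical function and file names; it is one possible approach, not a procedure prescribed by this guide.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's relative path to its checksum."""
    folder = Path(folder)
    return {p.relative_to(folder).as_posix(): sha256_of(p)
            for p in sorted(folder.rglob("*")) if p.is_file()}


def verify(folder, manifest):
    """Return the files that are missing or whose content has changed."""
    current = build_manifest(folder)
    return sorted(name for name, checksum in manifest.items()
                  if current.get(name) != checksum)


# Demo in a throwaway directory; "results.csv" is a hypothetical file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "results.csv"
    sample.write_text("id,value\n1,3.14\n")
    manifest = build_manifest(tmp)          # recorded at backup time
    intact = verify(tmp, manifest)          # checked later, nothing changed
    sample.write_text("id,value\n1,3.1\n")  # simulate a damaged copy
    damaged = verify(tmp, manifest)

print("Problems before damage:", intact)
print("Problems after damage:", damaged)
```

In practice the manifest would be saved alongside the backup (for example as a text file) so that the same check can be repeated each time the media are refreshed.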
Remote/cloud storage
Services like Dropbox, Google Drive or OneDrive provide some free storage space on remote servers (more space can be obtained on a subscription basis). Most cloud storage solutions provide automated syncing and data encryption. However, remember that there are drawbacks to using third-party online storage.
- Legal issues (copyright, data protection licences) can be complicated or unsatisfactory, especially if the server is located outside of Canada. It is generally not recommended to use cloud-storage for sensitive data that includes identifying information on human subjects.
- Bandwidth may be a concern, especially if you have large datasets.
- You are subject to changes in policies and commercial terms.
Source: Adapted from Research data management training (MANTRA, CC-BY).
Resources
- Storage solutions overview (Consortium of European Social Science Data Archives): Data sensitivity, ease of access, file size and overall data volume will influence storage choice. Advantages and disadvantages are detailed as well as precautions that should be taken when working with personal (sensitive) data.
Help and resources
Research data management consultations are available for Concordia faculty, students, and staff. Find out more about how librarians on the Library's RDM team can provide guidance. This service is part of Concordia's Institutional Research Data Management Strategy.