Safe storage of your working data and regular backups are essential during your research project.
- Data storage refers to where and how you keep your data. It involves both:
  - Selecting appropriate locations and media for physical storage of data (for example, local hard drives, networked storage, and servers)
  - Selecting appropriate file formats (for example, deciding between options such as plain text, rich text, or proprietary formats such as Microsoft Excel)
- Backup refers to creating additional copies of your ‘live’ or ‘working’ data. Backing up your data is essential to avoid the risk of losing data through accidental deletion, hard-drive failure, or theft or damage of equipment.
  - Some data storage systems will automatically back up your data: see below for more information.
  - Files stored on a desktop or laptop computer may not be automatically backed up, so you will need to set up your own backup solution.
- Security refers to keeping your data safe. This means both:
  - Ensuring that data is not lost, and is kept free from corruption.
  - Controlling access to your data as appropriate, so that only people who are authorised to see your data can do so. This may be achieved in a variety of ways, including physical security (e.g. storing data in a locked room), password protection of files, and encryption.
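If you do need to set up your own backup for files on a desktop or laptop, the idea of copying data and then checking that the copy is free from corruption can be sketched in a few lines of Python. This is a minimal illustration only, not a University-recommended tool: the function names (`sha256_of`, `backup_and_verify`) and the folder layout are assumptions made for the example, and it is no substitute for a managed backup service.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`,
    then check that every copied file has the same checksum as the original.

    Assumption: `backup_root` is not located inside `source`.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # raises an error if target already exists

    for original in source.rglob("*"):
        if original.is_file():
            copy = target / original.relative_to(source)
            if sha256_of(original) != sha256_of(copy):
                raise RuntimeError(f"Checksum mismatch for {copy}")
    return target
```

Each run creates a separate dated copy, so an accidental deletion in the working folder does not overwrite an earlier backup. The backup location should be on a different device or service from the working data.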
Below are some questions you may wish to consider – and resources and tools which may help.
Questions to consider
What storage options are available to me at the University?
Data storage in Oxford is often provided at the departmental level: ask your local IT Officer if there is server space available that you or your research group could use. The Medical Sciences Division provides some storage options for its members.
Some of the tools and services provided by the University for researchers include space for data and document storage:
- SharePoint Online is a web-based collaborative platform that can be used for storing and sharing material securely. It offers up to 1 TB of storage space.
- OneDrive for Business, provided as part of the Nexus365 suite of tools, gives University members 5 TB of secure cloud-hosted storage space. Files and folders can be shared with colleagues within the University and beyond. (Note, however, that this service is not intended for long-term archiving: anything in your OneDrive space will be deleted shortly after you leave the University. You will therefore need to make alternative arrangements for the continued storage of data before the end of your course or contract.)
- LabArchives is the University’s electronic lab notebook service. It can be used to (among other things) store research data, and share this with colleagues in and outside the University.
These services offer a number of advantages: they are all free at the point of use, they have been assessed by the University’s Information Security team and are deemed suitable places to store confidential data, and they all offer automatic backup.
Other University-provided storage solutions include:
- The Sustainable Digital Scholarship (SDS) service provides a Figshare-based platform for storing, working with, and publishing research data. The service is based in the Humanities Division, but is available to researchers across the University. Charges apply for some categories of project.
- For larger projects, the Infrastructure Chargeable Services team at IT Services offer managed server space for a fee.
The University is furthermore undertaking a project to scope and provide dedicated storage for research data. Further details will appear on this site in due course.
What backup options are available to me at the University?
Some University storage services include automatic backup: see the section on storage above for more details.
IT Services also provides a centrally funded, University-wide backup service for staff and postgraduate students, known as the HFS (Hierarchical File Server). It offers an automated backup service for personal computers and for servers. Find out more on the HFS web pages.
How can I encrypt my research data?
Where possible, it is recommended that researchers encrypt the whole hard drive of a device, especially if it is portable. Some operating systems offer built-in whole-disk encryption solutions: for example, BitLocker for Windows 10 (unfortunately not available in the Home edition) and FileVault for macOS. The safe retention and handling of passwords is of course essential, and both companies provide a way to back up recovery keys as part of their online account management.
Where single files or small groups of files need to be protected, free tools such as 7-Zip (Windows) and Keka (macOS) can be used to compress and password-protect files and folders. This is especially useful when data needs to be sent by email. Extensive advice on the secure use of email is available on the University of Oxford InfoSec website.
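For those who prefer to script this step, the following Python sketch shows how 7-Zip's command-line program might be called to create a password-protected archive. It assumes that the command-line executable is installed and available on your PATH under the name `7z` (on some systems it has a different name or is not on the PATH by default); the function names are hypothetical and chosen for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_7z_command(source: Path, archive: Path) -> list[str]:
    """Build a 7-Zip command that creates an encrypted .7z archive.

    a        add files to an archive
    -p       prompt for a password interactively, so that it is not
             written into the script or the shell history
    -mhe=on  also encrypt the file names (supported by the .7z format)
    """
    return ["7z", "a", "-p", "-mhe=on", str(archive), str(source)]


def encrypt_folder(source: Path, archive: Path) -> None:
    """Run 7-Zip if it can be found; otherwise raise a clear error."""
    if shutil.which("7z") is None:  # assumption: executable is named "7z"
        raise FileNotFoundError("7z was not found on PATH")
    subprocess.run(build_7z_command(source, archive), check=True)
```

Calling `encrypt_folder(Path("results"), Path("results.7z"))` would prompt for a password in the terminal. The password should be shared with the recipient by a separate channel, not in the same email as the archive.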
What storage media and file formats should I choose?
The answer to this will vary depending on the type and quantity of data. If departmental storage is available to you, this is often a good option, as backup and maintenance may be taken care of automatically (though it’s always worth checking the details of what’s being offered). In general, it is good practice to avoid relying heavily on storage media that can easily be lost or damaged, such as USB sticks.
Cloud storage can be convenient, but be wary of using commercial cloud providers to store research data, as these often do not meet the University’s security and data protection requirements. OneDrive for Business (part of the Nexus365 suite of tools) offers a University-approved alternative. See above for details of other University storage systems.
In some cases, file formats will be dictated by the software you use to store and analyse your data. But if you’re using proprietary software, it’s worth considering whether you can also store a copy of your data in a more open format (that is, one that can be read by a wider range of software packages: common examples are plain text files and .csv (comma-separated values) files). Software companies sometimes go bankrupt, or stop making a particular package, and you don’t want to find that your data is locked into a format that’s no longer easily readable. Additionally, if you’re planning to share your data later on, it makes sense to have it in a format that’s useful to as many people as possible.
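As a small illustration of keeping a copy in an open format, the sketch below writes a table of records to a .csv file and reads it back, using only Python's standard library. The sample records, column names, and function names are invented for the example; how you get data out of a particular proprietary package will depend on that package's own export options.

```python
import csv
from pathlib import Path

# Hypothetical example measurements, standing in for data exported
# from whatever software you normally use.
ROWS = [
    {"sample_id": "S001", "temperature_c": 21.5, "notes": "baseline"},
    {"sample_id": "S002", "temperature_c": 22.1, "notes": "after 10 min, stirred"},
]


def write_csv(rows: list[dict], path: Path) -> None:
    """Write a list of records to a UTF-8 CSV file with a header row."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def read_csv(path: Path) -> list[dict]:
    """Read the CSV file back; note that every value is returned as text."""
    with path.open("r", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Because a .csv file stores everything as text, information such as data types, units, and formulas is not preserved, so it is worth recording these separately (for example, in a short README file kept alongside the data).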
What do I need to bear in mind if I’m working with sensitive or confidential data?
This topic is considered in more detail in the Ethical Issues and Data Protection section of this site.
- Academic research often results in the creation of sensitive data. At the very least you may wish to control who has access to your research data (prior to peer review or publication, if appropriate), and be able to determine, and keep track of, what others are authorised to do with your data.
- Research data may also be of a type where you are legally or contractually obliged to keep it safe and confidential.
- Data which includes any information about living identifiable individuals counts as personal data, with concomitant responsibilities under the Data Protection Act. The Data Protection and Research web pages and the Staff Guidance on Data Protection web pages provide more information on this: the latter includes details of the University’s Data Protection by Design process, which any group that is collecting or working with personal data should work through.
- Certain types of data may be commercially sensitive or be protected by intellectual property agreements.
- The University’s Information Security website provides general advice about keeping data secure, including guidance on the management of information security. Within a devolved University, responsibility for implementing the relevant policies lies with individual researchers. In general, maintaining data security includes consideration of:
  - the available skills and expertise required to ensure an adequate level of data security;
  - a risk assessment to determine the value of data, the level of confidentiality required, applicable statutory requirements, the impact of unauthorised access to, or loss of, the data, and the steps required to provide appropriate data protection;
  - the prevention of unauthorised and malicious access to buildings and rooms where computers and other devices holding data may be housed;
  - how access to data is managed, authorised and logged;
  - how data is protected from loss or damage, for example by regular backups, implementing version control and installing anti-malware software;
  - the means to access data from both within Oxford and from outside the Oxford network, and the transmission of data from one computer to another (e.g. via email, FTP, web server);
  - the storage and encryption of data taken offsite (for example, on an external drive, laptop, or mobile device);
  - the process to verify the deletion of confidential data (for example, when equipment is re-deployed or in line with a project’s exit strategy).
- Data that is sensitive or confidential should be stored in a way that has been approved by the University’s Information Security team.
  - The most straightforward way to do this is to use a University-provided storage solution: see the section on storage above for some suggestions.
  - If it is necessary to use an external service, then a Third Party Security Assessment should be completed.
What’s the best way to keep track of different versions of my data?
There are various ways of doing this, and the important thing is to select a method that’s appropriate for the type of data you’re working with.
If your data is fairly straightforward, it may be sufficient simply to add a version number to the file name when a new version of the dataset is created. For more complex projects – especially those with multiple collaborators – you may wish to explore a system such as SharePoint Online, the LabArchives electronic lab notebook service, or specialist file management software.
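If you choose the file-name approach, a consistent pattern makes versions easier to sort and find. The sketch below shows one possible convention, a `_v01`, `_v02`, ... suffix placed before the file extension. Both the naming pattern and the function name are assumptions made for this example, not a University standard.

```python
import re
from pathlib import Path

# Matches names that already end in a version suffix such as "_v03".
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<number>\d+)$")


def next_version_name(filename: str, width: int = 2) -> str:
    """Return the file name to use for the next version of a dataset.

    "survey.csv"     -> "survey_v01.csv"
    "survey_v01.csv" -> "survey_v02.csv"
    """
    path = Path(filename)
    match = VERSION_PATTERN.match(path.stem)
    if match:
        stem = match.group("stem")
        number = int(match.group("number")) + 1
    else:
        stem = path.stem
        number = 1
    # Zero-padding keeps versions in order when files are sorted by name.
    return f"{stem}_v{number:0{width}d}{path.suffix}"
```

Saving each new version under a new name, without overwriting the previous file, means earlier versions remain available if a mistake is discovered later.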
There are many commercial storage platforms which offer file syncing services (so you always have the most recent copy of a file) and versioning. However, extreme caution is needed when using these, as they often fail to meet the University’s security and data protection requirements. Unless you are working entirely with data that could be made public (that is, data that could be published on the open web without this raising any concerns), you will need to use a storage solution which has been approved by the University’s Information Security team: see the section on sensitive and confidential data above for more details.