Data collection forms the foundation of many research projects. Ensuring that this foundation is as solid as possible gives the rest of the project the best chance of accomplishing its goals. Some issues pertaining to specific types of data collection are discussed in the sub-sections below, but there are a few general key principles to consider:
Check for existing data first
Before you start planning your data collection, it's good practice to check whether any relevant pre-existing datasets are available to you. See the Finding and reusing existing data section below for more on this.
Data collection needs to be consistent and accurate
Written protocols or standard processes can help with this. These are particularly useful in research groups where multiple people are involved in collection, but are valuable even for lone researchers. It's also important that appropriate quality assurance measures are in place. Depending on the type of data being collected, this might include calibration of instruments, validation of data entry, repetition of samples or measurements, or peer review of data.
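As an illustration of what validation of data entry might look like in practice, here is a minimal Python sketch. The record type (`Reading`), its fields, and the plausible temperature range are all hypothetical and are not taken from any University tool or protocol; the checks themselves would need to reflect your own written protocol.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One manually entered record (hypothetical fields, for illustration only)."""
    sample_id: str
    temperature_c: float


def validate_reading(reading: Reading) -> list[str]:
    """Return a list of problems found in one record; an empty list means it passed."""
    problems = []
    if not reading.sample_id.strip():
        problems.append("sample_id is empty")
    # Assumed plausible range: replace with limits appropriate to your instrument.
    if not -50.0 <= reading.temperature_c <= 60.0:
        problems.append(f"temperature_c out of range: {reading.temperature_c}")
    return problems


if __name__ == "__main__":
    for entry in [Reading("S001", 21.5), Reading("", 99.0)]:
        print(entry, validate_reading(entry))
```

Returning a list of problems, rather than stopping at the first one, makes it easier to correct a whole record in one pass and to keep a log of what was flagged.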
Data should be made as straightforward as possible to work with
Depending on the nature of the project, this may apply to the data collection process itself, or to what is done with the data immediately after collection. The progress of the rest of the project can be smoothed through (for example) clear documentation and labelling, standardised file names, and a well-ordered file or folder structure. See the later sections on this page for more on these.
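One way to keep file names standardised is to generate them rather than type them by hand. The sketch below is only an example: the naming pattern (project, collection date, description, version) and the function name `standard_file_name` are assumptions chosen for illustration, not a University convention.

```python
import re
from datetime import date


def standard_file_name(project: str, description: str, collected: date,
                       version: int, extension: str) -> str:
    """Build a name such as 'soil-study_2024-03-15_site-a-interview_v02.csv'."""
    def slug(text: str) -> str:
        # Lower-case the text and replace each run of characters other than
        # a-z and 0-9 with a single hyphen, trimming hyphens from the ends.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted by name.
    return (f"{slug(project)}_{collected.isoformat()}_"
            f"{slug(description)}_v{version:02d}.{extension.lstrip('.').lower()}")


if __name__ == "__main__":
    print(standard_file_name("Soil Study", "Site A interview",
                             date(2024, 3, 15), 2, ".CSV"))
```

Avoiding spaces and special characters, and zero-padding the version number, tends to keep names portable across operating systems and easy to sort.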
Data collection must be ethical and compliant
The collection process and any interactions with research subjects need to be conducted ethically. It's also important to make sure you're aware of relevant laws and regulations, and that you have a plan to ensure compliance with these. See the next section for more on this.
Data from or about human subjects
Any research activity involving human participants requires ethical approval, even if no personal data is being collected. For further information, see the Research Ethics pages on the Research Support website.
Any personal data - that is, data about an identifiable living individual - needs to be handled in a way that is compliant with the General Data Protection Regulation (GDPR). Any University of Oxford activity which involves processing personal data will need to make use of the Data Protection by Design framework, which may involve completing a Data Protection Impact Assessment (DPIA).
If personal data is being collected directly from participants (e.g. via a survey or interview), a privacy notice or participant information sheet should be used to provide participants with all the relevant information about what will happen to their data: see the Creating Privacy Notices page on the Compliance website for more details.
Additional information is available from the Data Protection and Research pages on the Research Support website.
If people other than the members of the research team are involved in data collection or digitisation (if, for example, a transcription service is used), additional steps or processes may be needed. These might include gaining ethical approval for the involvement of the additional personnel, ensuring that the participant information sheet mentions that they will be involved in processing the data, or a Third Party Security Assessment for an external service. In some cases, a confidentiality agreement or data sharing agreement may be needed: Research Services can advise on this.
This topic is covered in more detail in the Ethical and legal issues section of this website.
Surveys and questionnaires are a common method of collecting quantitative (and sometimes qualitative) data. They may be conducted via an online platform, via electronic devices such as tablets, or using pen and paper.
As there is a high chance that data collected via a survey will include personal data, it is important that proper processes are put in place, and appropriate tools are used: see the general notes on Data from or about human subjects above for more on this.
Online survey platforms
Online survey platforms can offer a quick and convenient way of creating and running a survey. Two platforms are available free at the point of use to all University members: Jisc Online Surveys, and Microsoft Forms. These have both been approved by the University's Information Security team as suitable places to store University data, including confidential data. They offer solid, user-friendly functionality, though they do not have the more advanced features offered by some survey tools.
Other tools offering a wider range of features are available to specific groups within the University: for example, the Medical Sciences Division runs an instance of REDCap for its members, and some departments have subscriptions to other survey platforms (for example Qualtrics) for the use of their members. Consult your local admin staff, research support staff, or IT officer to find out what is available to you.
If you need to use a survey tool other than those provided by the University, a Third Party Security Assessment (TPSA) should be completed. A handful of survey tools (including Qualtrics and SmartSurvey) have already been through this process, and are deemed suitable for use with confidential data. New subscriptions should be arranged through your department, rather than via a personal account, so that a contractual relationship exists between the University (which ultimately has legal responsibility for data gathered in the course of University research) and the service provider.
The IT Services Survey Advice Service can provide advice about selecting a suitable online platform, and training is available via the IT Learning Centre.
Electronic devices
If a reliable internet connection is not available, it may be more convenient to conduct a survey using an app on an electronic device, such as a tablet or a smartphone. This may be a standalone piece of software, or it may be an offline app that works in tandem with an online survey platform (responses are stored on the device, and then synced with the online platform when internet access is available).
Thought needs to be given to data storage and handling both on the device, and after data has been transferred elsewhere. If at all possible, portable devices used for storing personal data should be encrypted: if this is not feasible, data should be transferred from the device to secure storage at the earliest opportunity, and the original unencrypted copy deleted. If an app provided by an online survey platform is used, the platform should be selected in line with the guidance given above. Qualtrics and SmartSurvey are examples of suitably secure survey platforms which offer an offline app.
Hard copy surveys
In some cases, the traditional method of collecting survey responses using pen and paper may be most appropriate. Responses may then be digitised, either by scanning, or by entering the data manually into an electronic survey platform or database. As with electronic collection, care needs to be taken to ensure that the data is handled properly both during collection and subsequently. For example, it may be necessary to store completed survey forms in a locked filing cabinet, and any involvement of non-University personnel in collection or transcription may require a data sharing agreement.
Some types of fieldwork raise additional data management challenges: for example, lack of reliable access to electricity or an internet connection may limit data collection and storage options.
In general:
- Data collection and storage should be as secure as practically possible under the circumstances
- Where compromises have to be made, data should be transferred to secure, properly backed-up storage as soon as feasible
Thus, for example, if fieldwork involves making recordings of interviews, the recording device should ideally be encrypted. If this is not possible, the recordings should be transferred to encrypted storage (and the unencrypted originals deleted) as soon as practically possible.
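The transfer-then-delete step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `secure_dir` is assumed to be a folder on storage that is already encrypted (the code does not perform any encryption itself), and the function names are hypothetical. The original is deleted only after the copy's SHA-256 checksum matches. Note that ordinary deletion does not necessarily make data unrecoverable from the device, so locking the device away when not in use is still sensible.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_to_secure_storage(source: Path, secure_dir: Path) -> Path:
    """Copy `source` into `secure_dir`, verify the copy, then delete the original.

    Assumes `secure_dir` is on already-encrypted storage.
    """
    secure_dir.mkdir(parents=True, exist_ok=True)
    destination = secure_dir / source.name
    if destination.exists():
        # Refuse to overwrite an earlier recording with the same name.
        raise FileExistsError(f"{destination} already exists; not overwriting")
    shutil.copy2(source, destination)
    if sha256_of(source) != sha256_of(destination):
        # Remove the bad copy and keep the original in place.
        destination.unlink()
        raise OSError(f"checksum mismatch copying {source}")
    source.unlink()
    return destination
```

A call might look like `move_to_secure_storage(Path("E:/REC001.WAV"), Path("D:/encrypted/interviews"))`, where both paths are placeholders for your own recorder and encrypted volume.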
If a sufficiently robust internet connection is available, consider uploading data to secure cloud-based storage such as the University's Nexus365 OneDrive for Business at the earliest opportunity. This reduces the risk of data loss as a result of portable storage media being damaged, lost, or stolen while travelling. It is also possible to make back-up copies of hard copy data (e.g. consent forms or written questionnaires) by taking photos of these and uploading them to Nexus365 OneDrive. If portable media need to be used, these should be encrypted, and it is sensible to make multiple copies, and keep these in different places.
However, some countries impose restrictions on travelling with encrypted devices. This has the potential to cause problems at customs, and in extreme cases, can lead to devices being confiscated or to fines or other penalties. Check the situation in your destination before leaving the UK, so you can plan accordingly. If using an encrypted device may be problematic, uploading your data to Nexus365 OneDrive is a good option: this means that sensitive data does not need to be stored on the portable device, but can still be accessed easily once back in the UK.
Research that forms part of a University of Oxford project needs to comply with relevant UK legislation such as GDPR, even if collection is taking place in another country. You may also need to take local legislation into account.
Recording an interview saves you from having to take detailed notes, and allows you to concentrate on asking questions and responding to the interviewee's answers. Having a transcript of the interview means you can quickly search, browse, and refer to the material without having to listen to the whole recording.
You should always obtain explicit permission from the participants before recording any conversations. A video or audio recording will contain personally identifiable information, even if that is just the voice or face of the interviewee. As with all personal data, you need to make sure the recording is secure, for example by encrypting the file or only keeping it on secure, encrypted storage (see the Keeping working data safe section). Many recording devices, such as digital voice recorders or dictaphones, cannot be encrypted and cannot store recordings in an encrypted form. If alternative equipment is not available, you can mitigate the risk by copying the recording to your encrypted storage as soon as possible, then deleting it from the device, and by locking the device and its storage media away when not in use.
Another option is to record using Microsoft Stream, which uploads the recording directly to the University’s secure Nexus365 service. As Nexus365 has been approved for use with all kinds of University data, it is a good place to store any personal data that you're collecting. Nexus365 also gives you access to the transcription feature of Microsoft Stream or Word online, which you can use to automatically transcribe material at no cost. For more information, see our guide to transcribing interviews.
For remote video interviews, Microsoft Teams is the University's recommended platform. Participants do not need a Microsoft account: it is possible to send a link to a Teams meeting that will let them attend as a guest, using either a browser or the Teams app (which is free to download). Recording of Teams meetings can be enabled on request.
The Research Support website offers Guidance for Researchers Working Remotely with Participant Data.