
Research data management: Storage

Considerations

A data storage strategy is important because digital storage media are liable to fail, and all file formats and physical storage media will eventually become obsolete. Creating a data storage strategy therefore helps to minimise the risk of loss or destruction of data.

Considering storage requirements at the start of the project will help to keep the data accessible.

View storage arrangements from example data management plan.

View safeguarding measures from example data management plan.

Locations

When selecting a storage location, please consider:

  • Are you working with confidential data?
  • What volume of data will be produced?
  • Who needs to access the data?
  • Do you require remote access to the data?

A decision matrix is available to help determine the best option at Curtin University for your digital research data.

Accessing the Research drive (R: drive)

The R: drive is a shared network drive for storing research data at Curtin University. Storage is allocated on a per-project basis, and supervisors request R: drive access on behalf of their students. There are two steps to access the R: drive:

  1. Create a data management plan
  2. Submit a storage request

The storage request requires three pieces of information:

  • A short name for the R: drive folder
  • How much storage is needed (in gigabytes)
  • Who needs access: the data management plan (DMP) owner and supervisor (if applicable) are added automatically

To request storage, the following steps are required:

  1. Log into the Data Management Planning Tool (http://dmp.curtin.edu.au/) with your Curtin ID and OASIS password
  2. Go to the My Data Management Plans or My Students’ Plans (if you are a supervisor) section
  3. Locate the data management plan associated with the project
  4. Using the drop-down list, initiate the Request Storage process
  5. Enter the name for the folder, the amount of storage required and the individuals who need access
  6. Submit

The Curtin Information Technology Services (CITS) Service Desk will send a notification to the requester when the storage has been provisioned.

If you need additional storage for existing research project folders, please follow the instructions here.

Safeguarding data

Safeguarding the data refers to steps taken to minimise the risk of loss or destruction of data. Data loss can occur for a variety of reasons, including:

  • Software or hardware failure
  • Viruses, hacking or theft of physical media
  • Human error (such as losing a USB thumb drive or accidentally deleting files)

Storing research data on a Curtin networked drive ensures that several data protection mechanisms, provided automatically by CITS, are in place:

  • Creation of redundant copies
  • Monitoring to predict and prevent hardware failure
  • Enterprise-grade security

If the research data cannot be stored on a Curtin networked drive, back up the data regularly so that they can be recovered if they are lost.

Backups should occur at regular intervals as well as when major changes are made. If data are not backed up automatically, alternative arrangements will need to be put in place to ensure data are backed up regularly.

An additional measure to safeguard the data is to make multiple redundant copies and keep them in different physical locations. However, each copy represents a single point in time and will not reflect later updates to the data, so copies should be made regularly.
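As an illustration of the advice above, the following is a minimal Python sketch of a manual backup routine for data that cannot live on a networked drive. It copies a project folder into a dated folder under each backup location and then compares every copied file with its original. The function name `back_up`, the folder naming scheme and the example paths are assumptions made for this sketch, not a Curtin-provided tool.

```python
import filecmp
import shutil
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Sequence


def back_up(source_dir: Path, destinations: Sequence[Path],
            stamp: Optional[str] = None) -> List[Path]:
    """Copy source_dir into a dated folder under each destination and verify it.

    Hypothetical helper: each run creates a new point-in-time copy rather
    than overwriting an earlier one.
    """
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    created = []
    for destination in destinations:
        target = destination / f"{source_dir.name}-{stamp}"
        # copytree raises FileExistsError if this dated folder already exists.
        shutil.copytree(source_dir, target)
        for original in source_dir.rglob("*"):
            if original.is_file():
                copy = target / original.relative_to(source_dir)
                # shallow=False compares file contents, not just size and time.
                if not filecmp.cmp(original, copy, shallow=False):
                    raise OSError(f"Copy does not match original: {copy}")
        created.append(target)
    return created
```

A call such as `back_up(Path("survey_data"), [Path("/media/usb_drive"), Path("/media/second_drive")])` (both paths are placeholders) would leave one dated copy in each location. Keeping the destinations on separate devices in separate places is what provides the redundancy described above.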

File formats

Selecting standard, interchangeable and longer-lasting formats for the data will ensure ongoing access and preservation and may avoid a potentially difficult and expensive migration to an alternate format at a later point. 

Factors influencing the selection of file formats and software for data include: 

  • Method of data analysis
  • Hardware used
  • Software available
  • Discipline-specific standards 

Other factors to consider when selecting file formats include:

  • Proprietary and open formats and whether specific software is required
  • Maintaining data integrity by using lossless formats
  • “Bit-rot” resulting from the gradual decay of storage media
  • File format obsolescence which may be the result of software or hardware upgrades
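One common way to notice bit-rot or other silent changes is to record a checksum for every file and re-check the files against that record later. The sketch below shows the idea in Python using SHA-256; the function names `build_manifest` and `changed_files` are hypothetical, and this is an illustration rather than a mechanism the guide prescribes.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    return {
        path.relative_to(data_dir).as_posix(): file_digest(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """List files that are missing or no longer match the recorded digest."""
    problems = []
    for relative, expected in manifest.items():
        path = data_dir / relative
        if not path.is_file() or file_digest(path) != expected:
            problems.append(relative)
    return problems
```

In practice the manifest would be saved alongside the data (for example as a JSON file) when the files are first stored, and `changed_files` would be run periodically. A file reported here needs to be restored from a backup copy, since a checksum can detect damage but cannot repair it.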

Estimating file sizes

Estimating file sizes is difficult. Text files are typically quite small, in the order of megabytes. Multimedia files such as photos and videos are much larger.

Many multimedia file formats offer built-in compression and quality options that can drastically change file sizes. The table below is a rough guide to the file sizes produced by current-generation digital still and video cameras.

Data type                                     Approximate size
Photo, 20 MP, RAW                             30 MB per photo
Photo, 20 MP, JPEG                            5 MB per photo
Video, 4K ultra-high definition, H.264 MP4    40 GB per hour
Video, 1080p high definition, H.264 MP4       10 GB per hour
Video, 720p high definition, H.264 MP4        5 GB per hour

 

Many modern digital video cameras record at a resolution of 1080p high definition by default.
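The approximate sizes in the table above can be turned into a rough storage estimate, which is useful when a storage request asks for a figure in gigabytes. The Python sketch below only multiplies out the table values; the function name `estimate_storage_gb`, the `copies` parameter and the assumption that 1 GB equals 1000 MB are choices made for this illustration.

```python
# Approximate sizes from the table above, in megabytes.
# Assumption for this sketch: 1 GB = 1000 MB.
MB_PER_PHOTO = {"raw": 30, "jpeg": 5}
MB_PER_VIDEO_HOUR = {"4k": 40_000, "1080p": 10_000, "720p": 5_000}


def estimate_storage_gb(photos=0, photo_format="jpeg",
                        video_hours=0.0, video_resolution="1080p",
                        copies=1):
    """Return a rough storage estimate in gigabytes.

    copies is the number of complete copies to be kept (hypothetical
    parameter, for example the working copy plus redundant copies).
    """
    photo_mb = photos * MB_PER_PHOTO[photo_format]
    video_mb = video_hours * MB_PER_VIDEO_HOUR[video_resolution]
    return (photo_mb + video_mb) * copies / 1000
```

For example, 2,000 RAW photos (about 60 GB) plus 12 hours of 1080p video (about 120 GB) come to roughly 180 GB for a single copy. Because compression and quality settings vary so much, it is sensible to treat the result as a lower bound and allow some headroom.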

Example storage arrangements

These are example storage arrangements from the data management plan for a fictitious research project.

For the duration of the project, the physical data sheets will be stored in a filing cabinet in the principal investigator’s office. Upon completion, the principal investigator will work with Curtin Information Management and Archives to find a suitable long-term storage location.

When in the field, data will be stored on the principal investigator’s laptop and backed up to an external USB hard drive on a nightly basis.

Upon return to Curtin University, all digital data will be transferred to Curtin’s R: drive.

Example safeguarding measures

These are example safeguarding measures from the data management plan for a fictitious research project.

In the field, redundant copies of data will be kept on a password-protected laptop and a USB hard drive. Backups will be performed on a nightly basis after the data are transcribed from the physical datasheets.

When the field survey is complete, the data will be transferred to the Curtin R: drive, which is set up according to standard Curtin Information Technology Services security and safeguarding protocols.

Weekly snapshots of the survey data analysis file will be made and stored on the R: drive.
