
How to Find

What are data and datasets?

Data is the raw information produced directly by research conducted as part of a study, experiment or survey. It typically needs to be processed with software before it can be analysed.

Datasets are groupings of data collected and arranged in a set or structured manner. The records in a dataset can be organised in various ways, depending on how you wish to access the information. When a dataset has been created by another researcher or research group, it is classed as secondary data.

Image: "Code programming data" by markusspiske, Pixabay

Why use datasets?

The Open Data movement sees huge potential in using a nation's extensive datasets to stimulate research, innovation and industry.

There are benefits associated with using datasets:

  • much of the background work has already been completed, making further research easier to undertake
  • time-saving and cost-efficient: avoids the cost of duplicating data collection
  • pre-established validity and reliability

Where can I find datasets?

Finding data for your research can be time-consuming and may involve searching major sources of published research data such as:

  • Government websites
  • Data directories
  • Subject based repositories
  • Institutional repositories
  • Research centre websites
  • Internet search engines, e.g. Google or Google Scholar

Key Australian repositories

Data portals provide datasets with a defined scope. Directories of data portals can be useful for finding data if you have details about the kind of dataset you're seeking.

Commercial data repositories and registries

International data archives

How do I know if the dataset is useful?

Once you have selected a dataset you wish to use, consider the following before you start using it:

  • Is there enough description about the content of the data? Is the context of the research relevant?
  • Is the source trustworthy?
  • Is the file format useable?
  • Are the conditions for re-use clear?
  • Does the data have a persistent identifier, such as a DOI?
  • Do you know how long the data will be stored and made available?
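If you are assessing several candidate datasets, it can help to record your answers to the questions above in a consistent way. The following is only a minimal sketch in Python: the class name, field names and the open_questions helper are hypothetical and simply mirror the checklist. They do not come from any official tool.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetAssessment:
    """Yes/no answers to the checklist above (field names are illustrative)."""
    well_described: bool             # enough description of the content?
    context_relevant: bool           # is the research context relevant?
    source_trustworthy: bool         # is the source trustworthy?
    format_usable: bool              # can you open and use the file format?
    reuse_conditions_clear: bool     # are the conditions for re-use clear?
    has_persistent_identifier: bool  # e.g. a DOI
    retention_period_known: bool     # how long will the data stay available?


def open_questions(assessment: DatasetAssessment) -> list[str]:
    """Return the names of checklist items answered 'no', in checklist order."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


# Hypothetical assessment of a made-up dataset.
example = DatasetAssessment(
    well_described=True,
    context_relevant=True,
    source_trustworthy=True,
    format_usable=True,
    reuse_conditions_clear=False,
    has_persistent_identifier=True,
    retention_period_known=False,
)

print(open_questions(example))
```

Any item returned by open_questions is a prompt to investigate further before relying on the dataset, not an automatic reason to reject it.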

More information on assessing datasets can be found on the ANDS Data Reuse website.

Using and citing datasets

Using datasets

Research data that is intended for sharing and re-use should have an assigned license. When you use datasets that have been licensed, ensure you use them in a way that is permitted by the license.

If the data has not been licensed, contact the data owner (rights holder) to obtain permission to re-use the data.

Citing datasets

It is important that you attribute or give credit to any datasets that you use, modify or adapt.

The Australian National Data Service's (ANDS) guide provides information on how to cite data.
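As an illustration only, the sketch below assembles a dataset citation from its basic elements in Python. It assumes a common pattern of creator, publication year, title, optional version, publisher and identifier, loosely based on the elements DataCite recommends. The function name and all example values are hypothetical, and the format required by the ANDS guide or your referencing style may differ, so check those before citing.

```python
from typing import Optional


def format_dataset_citation(
    creator: str,
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
) -> str:
    """Join citation elements as: Creator (Year). Title. [Version.] Publisher. Identifier"""
    parts = [f"{creator} ({year})", title]
    if version:
        parts.append(f"Version {version}")
    parts.extend([publisher, identifier])
    return ". ".join(parts)


# All values below are placeholders, not a real dataset or DOI.
citation = format_dataset_citation(
    creator="Example, A",
    year=2020,
    title="Hypothetical survey dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.xxxx/example",
)
print(citation)
```

Keeping the identifier as the final element makes it easy for readers to follow the citation straight to the data.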