Statistics and Datasets: Finding datasets

A guide to who collects statistics and datasets, and where to find them

QUT datasets resources

Students in the Faculty of Science and Engineering or in Humanities may find the QUT LibGuides on Datasets useful. They contain background information on datasets and how to use them, as well as links to a number of data portals.

Finding datasets

Due to the uptake of Open Data principles, an enormous number of datasets are freely available on the internet. This makes it important to define clearly what you are searching for before you begin.

Consider the following:

  1. Variables - What variables will be described in this dataset? What are the critical components/fields? Is there a specific subset of events or individuals you are looking to examine?
  2. Time - When will the data have been collected? Will the data be from a single point in time or longitudinal?
  3. Geographical area - Does the data need to have been collected at a specific location or area?
  4. Author details - Will you need to contact the person who created the data? What kind of creator authority is required for the dataset to be useful?
  5. Research area context - Are there any ethical concerns that would prevent the dataset you want from being shared? Are there likely to be any commercialisation or intellectual property concerns that would prevent this data from being shared?
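
If you are comparing many candidate datasets, it can help to write some of these criteria down in a structured form. The following is only a minimal Python sketch: the `DatasetCriteria` class, the `meets_criteria` function and the metadata keys (`variables`, `start_year`, `end_year`, `region`) are hypothetical names chosen for illustration and do not correspond to any particular data portal's metadata format. It covers only the variables, time and geographical area questions; author details and research context still need to be assessed by hand.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCriteria:
    """Hypothetical record of what a dataset must offer (illustrative only)."""
    required_variables: set[str]   # fields the dataset must describe
    start_year: int                # earliest year your project needs
    end_year: int                  # latest year your project needs
    region: Optional[str] = None   # None means any location is acceptable


def meets_criteria(metadata: dict, criteria: DatasetCriteria) -> bool:
    """Return True if a candidate's metadata satisfies every criterion."""
    # Every required variable must appear among the dataset's variables.
    has_variables = criteria.required_variables <= set(metadata.get("variables", []))
    # The dataset's collection period must span the whole period needed.
    covers_period = (
        metadata["start_year"] <= criteria.start_year
        and metadata["end_year"] >= criteria.end_year
    )
    # The region is only checked when one was specified.
    in_region = criteria.region is None or metadata.get("region") == criteria.region
    return has_variables and covers_period and in_region


# Example usage with made-up metadata (assumed keys, not a real portal record).
criteria = DatasetCriteria({"age", "income"}, 2010, 2015, region="Queensland")
candidate = {
    "variables": ["age", "income", "postcode"],
    "start_year": 2005,
    "end_year": 2020,
    "region": "Queensland",
}
print(meets_criteria(candidate, criteria))
```

Writing the criteria down before searching makes it easier to reject unsuitable datasets quickly and to explain later why a dataset was chosen.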

Directories of data portals

Data portals provide datasets with a defined scope. Directories of data portals can be useful for finding data if you have details about the kind of dataset you're seeking.

The directories below can be very useful if you know:

  • the geographic region of your desired data source; or
  • the type of organisation (government/private/academic) that would have created your data; or
  • the subject area/discipline categorisation of your desired dataset

Large data portals

Multidisciplinary dataset portals provide large numbers of datasets through a searchable interface. They differ from other data portals in that they usually have a broad scope and place few restrictions on the types of datasets they link to.

Contacting researchers directly

After your literature review, you may find publications that are based on data similar to the data your project requires.

In this case, consider contacting the author or authors of the paper and asking if they would be able to make their data available to you for further analysis. However, be aware that:

  • Some forms of data may require de-identification and anonymisation before they can be shared
  • Some data cannot be shared because of funding requirements or the circumstances of the initial research