Data is the raw information produced directly by research conducted as part of a study, experiment or survey. It typically needs to be manipulated using software.
Datasets are collections of data gathered and arranged in a structured manner. The records in a dataset can be organised in various ways, depending on how you wish to access the information. When a dataset has been created by another researcher or research group, it is classed as secondary data.
The Open Data movement sees great potential in using a nation's extensive datasets to stimulate research, innovation and industry.
There are benefits associated with using datasets:
Finding data for your research can be time consuming and may involve searching major sources of published research data such as:
Key Australian repositories
Data portals provide datasets with a defined scope. Directories of data portals can be useful for finding data if you have details about the kind of dataset you're seeking.
Commercial data repositories and registries
International data archives
Once you have selected a dataset you wish to use, consider the following before you start using it:
More information on assessing datasets can be found on the ANDS Data Reuse website.
Using datasets
Research data that is intended for sharing and re-use should have an assigned license. When you use datasets that have been licensed, ensure you use them in the way that is permitted by the license.
If the data has not been licensed, contact the data owner (rights holder) to obtain permission to re-use the data.
Citing datasets
It is important that you attribute or give credit to any datasets that you use, modify or adapt.
The Australian National Data Service's (ANDS) guide provides information on how to cite data.
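As an illustration only, the sketch below assembles a citation string from a dataset's descriptive details. It assumes a common "Creator (Year). Title. Publisher. Identifier" pattern; the elements and order you need may differ, so check the ANDS guide and your referencing style. The `DatasetRecord` class, its field names, the `format_citation` function and the example values are all hypothetical placeholders, not part of any guide or library.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    # Hypothetical field names, chosen for illustration only.
    creator: str
    year: int
    title: str
    publisher: str
    identifier: str  # a DOI or other persistent identifier


def format_citation(record: DatasetRecord) -> str:
    """Build a 'Creator (Year). Title. Publisher. Identifier' string."""
    return (
        f"{record.creator} ({record.year}). {record.title}. "
        f"{record.publisher}. {record.identifier}"
    )


# Placeholder values, not a real dataset.
example = DatasetRecord(
    creator="Example, A.",
    year=2020,
    title="Placeholder survey dataset",
    publisher="Example Data Repository",
    identifier="https://doi.org/10.0000/example",
)
print(format_citation(example))
```

Keeping the details in separate fields makes it easier to reorder or restyle them if your referencing style asks for a different layout.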