How to find and re-use data?
Finding Data
The best sources of published research data are usually data centres and disciplinary data repositories established by research funders and research groups. Several strategies can be used to find sources of research data for study and re-use:
- Search for a suitable disciplinary data repository or data centre (see 1)
- Search in a data portal or index service (see 2)
- Look for data that is mentioned or provided as supplementary material in journal articles
1. Repositories
A disciplinary data centre or data repository can be discovered by searching indexes of research data repositories:
- Re3Data provides a searchable directory of research data repositories
- OpenDOAR offers a searchable directory of repositories with different types of content, including research data
- German repositories can be found via the DFG-funded RIsources portal
2. Portals
Data portals and index services enable searches across multiple repositories:
- DataCite - search of registered datasets
- European Union Open Data Portal - Open data portal of the European Union
- Research Data Australia - Australia's research data discovery portal
- B2FIND - search service based on metadata continuously harvested from the research data holdings of EUDAT data centres and other repositories
- EMBL-EBI - European Bioinformatics Institute
- Google Dataset Search (note: proprietary service)
- gesisDataSearch - search for data on social and economic research in data repositories and metadata services
- VerbundFDB - Search for studies, research data and instruments of empirical educational research
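Some of these portals can also be queried programmatically. The following is a minimal sketch for DataCite, assuming its public REST API endpoint https://api.datacite.org/dois and the `query` and `page[size]` parameters; the function name is hypothetical, and the parameters should be checked against the current DataCite documentation before use. The sketch only builds the search URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumption: public DataCite REST API endpoint for DOI records.
DATACITE_DOIS_ENDPOINT = "https://api.datacite.org/dois"


def build_datacite_query_url(search_terms: str, page_size: int = 5) -> str:
    """Build a search URL for DataCite (hypothetical helper, no network access)."""
    params = {
        "query": search_terms,     # free-text search terms
        "page[size]": page_size,   # number of records per result page
    }
    # urlencode percent-encodes the brackets and turns spaces into '+'.
    return f"{DATACITE_DOIS_ENDPOINT}?{urlencode(params)}"


url = build_datacite_query_url("lactate steady state")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response is a JSON document describing the matching records.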
Re-Using Data
Have you found data that you want to re-use? Great. When re-using data, it is important to cite it appropriately, because citing research data is part of good scientific practice. Some basic principles of data citation:
- Data are legitimate, citable products of scientific activity. Citing research data is just as essential as citing publications or other results of scientific work.
- Data should be cited in such a way that all persons who contributed to the data production are acknowledged.
- If a claim in the scientific literature is based on data, the related data should always be cited.
- Citation of data should include a persistent method of identification that is machine-actionable, globally unique and widely used by a community.
- Citation of research data should facilitate access to the data and associated metadata, documentation, programme code and other materials to the extent necessary for meaningful use by humans and machines.
Elements of a data citation:
- Author(s)/Creator
- Title: “Data for ….”
- Year of publication: The date when the dataset was published or released (rather than the collection or coverage date)
- Publisher: the data centre or repository that holds the dataset
- Any applicable identifier (including edition or version)
- Availability and access: Persistent identifier (URL / DOI) or other location information
Example:
Vobejda, C. (2025). Data: Stability of heart rate at maximal lactate steady state in recreational sportsmen. Bielefeld University. DOI: https://doi.org/10.4119/unibi/3000793
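The citation elements listed above can also be assembled programmatically, for example when generating reference lists from repository metadata. The following is a minimal sketch, not a prescribed citation style: the class `DataCitation`, its field names and the output layout are assumptions made for illustration. It prints the DOI as a plain resolvable URL, without the "DOI:" label used in the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a data citation."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    identifier: str                 # DOI or other persistent identifier
    version: Optional[str] = None   # edition or version, if applicable

    def identifier_url(self) -> str:
        """Turn a bare DOI into a resolvable https://doi.org/ URL."""
        ident = self.identifier.strip()
        if ident.lower().startswith("doi:"):
            ident = ident[4:].strip()
        if ident.startswith("10."):
            return "https://doi.org/" + ident
        # Anything else (e.g. a full URL) is returned unchanged.
        return ident

    def render(self) -> str:
        """Assemble the elements into a single citation string."""
        creators = "; ".join(self.creators)
        version = f" (Version {self.version})" if self.version else ""
        return (
            f"{creators} ({self.year}). {self.title}{version}. "
            f"{self.publisher}. {self.identifier_url()}"
        )


citation = DataCitation(
    creators=["Vobejda, C."],
    year=2025,
    title=("Data: Stability of heart rate at maximal lactate steady state "
           "in recreational sportsmen"),
    publisher="Bielefeld University",
    identifier="10.4119/unibi/3000793",
)
print(citation.render())
```

Storing the identifier as a bare DOI and expanding it only on output keeps the record independent of the resolver address.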