100+ datasets found
  1. Data repository

    • osf.io
    Updated Aug 9, 2015
    Cite
    Yarrow Dunham; Amy Rakei; Chen Fang; Abhishek Giri; Filip Verroens (2015). Data repository [Dataset]. https://osf.io/q5j8g
    Explore at:
    Dataset updated
    Aug 9, 2015
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Yarrow Dunham; Amy Rakei; Chen Fang; Abhishek Giri; Filip Verroens
    Description

    Data and variable key for Dunham, Dotsch, Clark, & Stepanova, "The development of White-Asian categorization: Contributions from skin color and other physiognomic cues"

  2. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    • s.cnmilf.com
    • +2 more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve both as a current landscape analysis and as a baseline for future studies of ag research data.

    Purpose

    As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication.

    The goals of this study were to:
    • establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
    • compare how much data is in institutional vs. domain-specific vs. federal platforms
    • determine which repositories are recommended by top journals that require or recommend the publication of supporting data
    • ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

    Approach

    The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

    Search methods

    We first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects.

    We then used search engines such as Bing and Google to find top agricultural university repositories, using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if the institution had a repository for its unique, independent research data, if one was not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

    Next we searched the internet for open general data repositories using a variety of search engines. Repositories containing a mix of data, journals, books, and other types of records were tested to determine whether they could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.

    Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories.

    Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

    Evaluation

    We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

    Results

    A summary of the major findings from our data review:
    • Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
    • There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
    • Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

    See included README file for descriptions of each individual data file in this dataset.

    Resources in this dataset:
    • Resource Title: Journals. File Name: Journals.csv
    • Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
    • Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
    • Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
    • Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
    • Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
    • Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
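
    The evaluation thresholds described for this dataset (a search term comprising at least 1% or 5% of a collection, and returning more than 100 or more than 500 results) can be illustrated with a minimal sketch. The function name, the dictionary keys, and the sample counts are hypothetical and are not taken from the dataset files.

```python
def evaluate_term(hits: int, total: int) -> dict:
    """Flag one search term against the thresholds named in the description.

    hits  -- number of results returned for the term (hypothetical input)
    total -- total number of records in the repository (hypothetical input)
    """
    # Share of the collection matched by the term, as a percentage.
    share = (hits / total * 100) if total else 0.0
    return {
        "percent_of_collection": round(share, 2),
        "at_least_1_percent": share >= 1.0,
        "at_least_5_percent": share >= 5.0,
        "over_100_results": hits > 100,
        "over_500_results": hits > 500,
    }


# Illustrative numbers only: 600 hits in a repository of 10,000 records.
print(evaluate_term(600, 10_000))
```

    A repository that is both large and ag-heavy in the sense used by the summary findings would satisfy both `over_500_results` and `at_least_5_percent` for at least one term.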

  3. Data from the International Open Data Repository Survey

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated May 25, 2022
    Cite
    Markus von der Heyde (2022). Data from the International Open Data Repository Survey [Dataset]. http://doi.org/10.5281/zenodo.2643493
    Explore at:
    Available download formats: zip
    Dataset updated
    May 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Markus von der Heyde
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file collection is part of the ORD Landscape and Cost Analysis Project (DOI: 10.5281/zenodo.2643460), a study jointly commissioned by the SNSF and swissuniversities in 2018.

    Please cite this data collection as:
    von der Heyde, M. (2019). Data from the International Open Data Repository Survey. Retrieved from https://doi.org/10.5281/zenodo.2643493

    Further information is given in the corresponding data paper:
    von der Heyde, M. (2019). International Open Data Repository Survey: Description of collection, collected data, and analysis methods [Data paper]. Retrieved from https://doi.org/10.5281/zenodo.2643450

    Contact

    Swiss National Science Foundation (SNSF)

    Open Research Data Group

    E-mail: ord@snf.ch

    swissuniversities

    Program "Scientific Information"

    Gabi Schneider

    E-Mail: isci@swissuniversities.ch

  4. DHS Public Access Data Repository

    • catalog.data.gov
    • datasets.ai
    Updated Nov 20, 2023
    Cite
    Unspecified (2023). DHS Public Access Data Repository [Dataset]. https://catalog.data.gov/dataset/dhs-public-access-data-repository
    Explore at:
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    Unspecified
    Description

    ST - DHS Public Access Database: The 2013 OSTP Memorandum, “Increasing Access to the Results of Federally Funded Scientific Research” (updated in 2022), directed all agencies with more than $100 million in annual R&D expenditures to prepare a plan for improving the public’s access to the results of federally funded research, specifically peer-reviewed scholarly publications and digital data. In response to the memorandum, DHS developed a DHS Public Access Plan and intends to make available to the public digitally formatted scientific data that support the conclusions in peer-reviewed scholarly publications resulting from DHS R&D funding. This data repository site, with a customized DHS Storefront, allows DHS to post releasable scientific digital data from peer-reviewed publications resulting from DHS-funded research. The data repository is configured to allow DHS users (and publishers acting on behalf of these users) to deposit data sets into the repository, making them available to the general public.

  5. Data from: Scientific production on data repositories and open science...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated May 2, 2024
    Cite
    Sinval Rodrigues-Junior (2024). Scientific production on data repositories and open science published in the Web of Science database – Bibliometric conceptual analysis [Dataset]. http://doi.org/10.7910/DVN/MZ1EUP
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 2, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Sinval Rodrigues-Junior
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This document describes data collected from the Main Collection of the Web of Science database. Records of published studies addressing the intersection of Open Science and data repositories were searched up to January 15th, 2024, and the final dataset comprised 545 records for bibliometric analysis.

  6. How to deposit research data in the University of Guelph Research Data...

    • borealisdata.ca
    • dataone.org
    Updated Aug 14, 2025
    Cite
    Research & Scholarship (2025). How to deposit research data in the University of Guelph Research Data Repositories [Dataset]. http://doi.org/10.5683/SP2/CPHFGA
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Aug 14, 2025
    Dataset provided by
    Borealis
    Authors
    Research & Scholarship
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Guelph
    Description

    This dataset provides guidance materials and templates to help you prepare your research datasets for deposit in the U of G Research Data Repositories. Please refer to the U of G Research Data Repositories LibGuide for detailed information about the U of G Research Data Repositories, including additional resources for preparing datasets for deposit.

    The library offers a self-deposit with curation service. The deposit workflow is as follows:

    • Create your repository account. If you are a first-time depositor, complete the U of G Research Data Repositories Dataset Deposit Intake Form. Activate your Data Repositories account by logging in with your U of G username and password. Once your account is created, contact us to set up your dataset creator access to your home department’s collection in the Data Repositories. Note: If you already have a Data Repositories account and dataset creator access, you can log in and begin a new deposit to your home department’s collection right away.
    • Prepare your dataset. Assemble your dataset following the Dataset Deposit Guidelines. Use the README file template to capture data documentation.
    • Create a draft dataset record. Log in to the Data Repositories and create a draft dataset record following the instructions in the Dataset Submission Guide. Submit your draft dataset for review.
    • Dataset review. Data Repositories staff will review (also referred to as curate) your dataset for alignment with the Dataset Deposit Guidelines using a standard curation workflow. The curator will collaborate with you to enhance the dataset.
    • Public release. Once ready, the dataset curator will make the dataset publicly available in the Data Repositories, with appropriate file access controls.

    Support: If you have any questions about preparing and depositing your dataset, please make a Publishing and Author Support Request.

  7. Data from the Swiss Open Data Repository Landscape survey

    • data.niaid.nih.gov
    Updated May 16, 2022
    + more versions
    Cite
    von der Heyde, Markus (2022). Data from the Swiss Open Data Repository Landscape survey [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2643486
    Explore at:
    Dataset updated
    May 16, 2022
    Dataset provided by
    vdH-IT
    Authors
    von der Heyde, Markus
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file collection is part of the ORD Landscape and Cost Analysis Project (DOI: 10.5281/zenodo.2643460), a study jointly commissioned by the SNSF and swissuniversities in 2018.

    Please cite this data collection as: von der Heyde, M. (2019). Data from the Swiss Open Data Repository Landscape survey. Retrieved from https://doi.org/10.5281/zenodo.2643487

    Further information is given in the corresponding data paper: von der Heyde, M. (2019). Open Data Landscape: Repository Usage of the Swiss Research Community: Description of collection, collected data, and analysis methods [Data paper]. Retrieved from https://doi.org/10.5281/zenodo.2643430

    Contact

    Swiss National Science Foundation (SNSF)

    Open Research Data Group

    E-mail: ord@snf.ch

    swissuniversities

    Program "Scientific Information"

    Gabi Schneider

    E-Mail: isci@swissuniversities.ch

  8. NSF Public Access Repository

    • catalog.data.gov
    • s.cnmilf.com
    Updated Sep 19, 2021
    + more versions
    Cite
    National Science Foundation (2021). NSF Public Access Repository [Dataset]. https://catalog.data.gov/dataset/nsf-public-access-repository
    Explore at:
    Dataset updated
    Sep 19, 2021
    Dataset provided by
    National Science Foundation (http://www.nsf.gov/)
    Description

    The NSF Public Access Repository contains an initial collection of journal publications and the final accepted version of the peer-reviewed manuscript or the version of record. To do this, NSF draws upon services provided by the publisher community including the Clearinghouse of Open Research for the United States, CrossRef, and International Standard Serial Number. When clicking on a Digital Object Identifier number, you will be taken to an external site maintained by the publisher. Some full text articles may not be available without a charge during the embargo, or administrative interval. Some links on this page may take you to non-federal websites. Their policies may differ from this website.

  9. Awesome Dataset Repository on GitHub

    • kaggle.com
    zip
    Updated Jun 10, 2020
    Cite
    Rajesh Kumar Pandey (2020). Awesome Dataset Repository on GitHub [Dataset]. https://www.kaggle.com/datasets/rajeshkpandey/awesome-dataset-repository-on-github
    Explore at:
    Available download formats: zip (48522 bytes)
    Dataset updated
    Jun 10, 2020
    Authors
    Rajesh Kumar Pandey
    Description

    Dataset

    This dataset was created by Rajesh Kumar Pandey

  10. Scientific Data recommended repositories

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1 more
    txt
    Updated May 30, 2023
    Cite
    Scientific Data (2023). Scientific Data recommended repositories [Dataset]. http://doi.org/10.6084/m9.figshare.1434640.v16
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Scientific Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spreadsheet listing data repositories that are recommended by Scientific Data (Springer Nature) as being suitable for hosting data associated with peer-reviewed articles. Please see the repository list on Scientific Data's website for the most up to date list.

  11. Canadian National Marine Seismic Data Repository

    • open.canada.ca
    • catalogue.arctic-sdi.org
    • +1 more
    esri rest, fgdb/gdb +1
    Updated Dec 9, 2020
    + more versions
    Cite
    Natural Resources Canada (2020). Canadian National Marine Seismic Data Repository [Dataset]. https://open.canada.ca/data/en/dataset/e1fa0090-4b06-e476-5c71-e2326666a4d0
    Explore at:
    Available download formats: fgdb/gdb, wms, esri rest
    Dataset updated
    Dec 9, 2020
    Dataset provided by
    Ministry of Natural Resources of Canada (https://www.nrcan.gc.ca/)
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Dec 31, 1959 - Aug 10, 2020
    Area covered
    Canada
    Description

    The Geological Survey of Canada (Atlantic and Pacific) has collected marine survey field records on marine expeditions for over 50 years. This release makes available the results of an ongoing effort to scan and convert our inventory of analog marine survey field records (seismic, sidescan and sounder) to digital format. These records were scanned at 300 dpi and converted into JPEG2000 format. Typically, each of these files was between 1 and 2 GB in size before compression and was compressed by a factor of 10:1. Empirical tests with a number of data sets suggest that there is minimal visual distortion of the scanned data at this level of compression. In this KML file, scanned data are available in a reduced-scale thumbnail format and a compressed full-resolution JPEG2000 format.

  12. Data from: Common Metadata Framework for Research Data Repository: Necessity...

    • dataverse.harvard.edu
    Updated Mar 4, 2024
    Cite
    Kavya Asok; Snigdha Dandpat; Dinesh K. Gupta; Prashant Shrivastava (2024). Common Metadata Framework for Research Data Repository: Necessity to Support Open Science [Dataset]. http://doi.org/10.7910/DVN/JK6HBB
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 4, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kavya Asok; Snigdha Dandpat; Dinesh K. Gupta; Prashant Shrivastava
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    These research datasets are the updated version of the conference poster "Research data repositories and their metadata: A comparative study," presented by Ms. Kavya Asok and Ms. Snigdha Dandpat at the Conference on Open and FAIR Data Ecosystem: Principles, Policies, and Platforms, held from 11th to 13th September 2023 at IIC, New Delhi. The study describes the features of a select number of research data repositories (RDRs) and analyzes their metadata practices.

  13. Covid-19 Public Repository Data

    • kaggle.com
    zip
    Updated Jan 16, 2022
    Cite
    Nirav Das (2022). Covid-19 Public Repository Data [Dataset]. https://www.kaggle.com/datasets/niravdas/covid19-open-source-project-github-repositories
    Explore at:
    Available download formats: zip (8587523 bytes)
    Dataset updated
    Jan 16, 2022
    Authors
    Nirav Das
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A comprehensive versioned dataset of the repositories and relevant related metadata about public projects hosted on GitHub related to the 2019 Novel Coronavirus and associated COVID-19 disease.

    GitHub had received a number of enquiries from researchers and the community surrounding open collaboration on projects on the platform related to the disease COVID-19 caused by the SARS-CoV-2 virus. Many projects, ordered by star count, can be found using the covid-19 topic on GitHub; however, discovery of other important projects is difficult due to differences in the way users self-identify their work.

    to hear about it so that we can help ensure it becomes more prominently featured. Please open a PR against the file USER_SUBMISSIONS.md with a link to your research. We are especially interested in highlighting the most promising and impactful projects in need of community help and support.

    Open data

    Open source is bigger than any company or community. The dataset is released under CC0-1.0 for anyone to use and learn from.

    There are two main sets of files, released via TSV and json formats for public consumption in the directory data/. A comprehensive data dictionary that explains the contents of these files is here. The files are sorted in descending order by the count of distinct contributors at the time of extract.

    The files have been versioned based on a weekly snapshot of identified repositories from the week of 2020-01-20 onward.

    We will update this repository with new data files on a monthly basis, generally on the first Tuesday of a month. We will revisit this each month and provide an update on continuing this commitment.

    Identification methodology

    Rather than relying on any one GitHub topic to identify potential COVID-19 related projects, the data set is produced using a more comprehensive set of search criteria to identify projects likely to be COVID-19 related. Note: This has the potential to include a small number of false positives; however, we figured it was better to cast a wide net and allow consumers of the data to perform additional cleaning if they desire. Furthermore, since this data is versioned based on the week the repo was initially created, there may exist data for repos that were originally public but have since been made private and are currently inaccessible.

    The following parts of public metadata are currently being used to identify public projects (those licensed and not) as COVID-19 related:
    • The repo's description
    • The name of the repo
    • The topics associated with the repo
    • The organization bio description where that exists

    Search terms against these metadata include variations of: covid, coronavirus, ncov and sars-cov-2
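
    The identification approach described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary keys (`description`, `name`, `topics`, `org_bio`) are hypothetical, and a plain case-insensitive substring match stands in for the unspecified "variations" of each search term. It is not the actual query used to build the dataset.

```python
# Search terms listed in the description; matching "variations" is
# approximated here by a case-insensitive substring test (an assumption).
SEARCH_TERMS = ("covid", "coronavirus", "ncov", "sars-cov-2")


def is_covid_related(repo: dict) -> bool:
    """Return True if any search term appears in the repo's public metadata.

    The keys read below are hypothetical names for the four metadata
    parts listed in the description.
    """
    fields = [
        repo.get("description") or "",
        repo.get("name") or "",
        " ".join(repo.get("topics") or []),
        repo.get("org_bio") or "",
    ]
    text = " ".join(fields).lower()
    return any(term in text for term in SEARCH_TERMS)


# Hypothetical records for illustration.
print(is_covid_related({"name": "COVID-tracker", "description": None}))
print(is_covid_related({"name": "weather-app", "description": "forecasts"}))
```

    A substring test this broad is one way false positives of the kind mentioned in the note could arise, which is why downstream cleaning may still be needed.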

  14. Research Data Repositories Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 22, 2025
    Cite
    Growth Market Reports (2025). Research Data Repositories Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/research-data-repositories-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Research Data Repositories Market Outlook



    According to our latest research, the global research data repositories market size reached USD 3.1 billion in 2024, reflecting a robust expansion fueled by the rising demand for data-driven research and open science initiatives. The market is anticipated to grow at a CAGR of 10.2% from 2025 to 2033, with the total market forecasted to reach USD 7.4 billion by 2033. Key growth factors include the proliferation of digital research outputs, increasing mandates for data sharing by funding agencies, and the rapid evolution of cloud-based repository solutions.
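
    As a quick consistency check on the figures quoted above, the forecast can be reproduced with the standard compound-growth formula. The sketch assumes the 2024 value is the base and that growth compounds over nine annual periods (2025 through 2033); the report does not state its compounding convention, so that period count is an assumption.

```python
def project(value: float, rate: float, years: int) -> float:
    """Compound a starting value at a fixed annual rate."""
    return value * (1 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate implied by a start value, end value, and period count."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the description: USD 3.1 billion (2024), 10.2% CAGR,
# USD 7.4 billion forecast for 2033. Nine periods is an assumption.
forecast = project(3.1, 0.102, 9)
rate = implied_cagr(3.1, 7.4, 9)
print(f"Projected 2033 value: USD {forecast:.2f} billion")
print(f"Implied CAGR: {rate:.1%}")
```

    Under that assumption the projected value rounds to the stated USD 7.4 billion, so the three quoted numbers are mutually consistent.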




    One of the primary growth drivers for the research data repositories market is the accelerating adoption of open science policies by governments, research councils, and academic institutions worldwide. These policies mandate that research data be made openly accessible, reusable, and interoperable, which has led to a surge in the establishment and utilization of repositories. Additionally, the exponential growth in research output, especially in fields such as genomics, climate science, and social sciences, necessitates efficient data management and sharing platforms. As the volume, variety, and velocity of research data increase, organizations are investing in sophisticated repository solutions to ensure data integrity, discoverability, and long-term preservation, further propelling the market’s expansion.




    Technological advancements have also played a pivotal role in shaping the research data repositories market. The integration of artificial intelligence, machine learning, and advanced metadata management tools within repository platforms has significantly enhanced data curation, searchability, and security. These innovations are helping institutions and researchers manage large and complex datasets more effectively, driving adoption across diverse end-user segments. Moreover, the shift towards cloud-based deployment models has enabled scalable, cost-effective, and collaborative environments for data storage and sharing, making research data repositories more accessible to a broader range of organizations, including those with limited IT infrastructure.




    Another critical factor fueling market growth is the increasing emphasis on research reproducibility and transparency. Funding agencies and scientific publishers are increasingly requiring researchers to deposit their data in trusted repositories as a prerequisite for grant approval or publication. This trend is particularly prominent in regions such as North America and Europe, where regulatory frameworks and research assessment policies are more mature. The growing recognition of data as a valuable research output, alongside publications, is encouraging institutions to invest in robust repository infrastructure, driving sustained market growth across the globe.




    Regionally, North America continues to dominate the research data repositories market, accounting for the largest revenue share in 2024, followed closely by Europe. These regions benefit from well-established research ecosystems, high digital literacy, and strong policy support for open data initiatives. Meanwhile, the Asia Pacific region is emerging as the fastest-growing market, driven by substantial investments in research infrastructure, increasing international collaborations, and government-led digitalization programs. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a lower base, as academic and governmental institutions in these regions gradually embrace open data practices and invest in repository solutions.





    Type Analysis



    The research data repositories market is segmented by type into institutional repositories, subject-based repositories, general-purpose repositories, and others. Institutional repositories are primarily managed by universities, research institutes, and academic organizations to store and disseminate the scholarly output of their members. These repositories have gained significant traction due

  15. GitHub Public Repository Metadata

    • kaggle.com
    zip
    Updated Oct 26, 2025
    Cite
    Peter (2025). GitHub Public Repository Metadata [Dataset]. https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars
    Available download formats: zip (606866859 bytes)
    Dataset updated
    Oct 26, 2025
    Authors
    Peter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was obtained from the GitHub API and contains only public repository-level metadata. It may be useful for anyone interested in studying the GitHub ecosystem. It contains approximately 3.1 million entries.

    The GitHub API Terms of Service apply.

    You may not use this dataset for spamming purposes, including for the purposes of selling GitHub users' personal information, such as to recruiters, headhunters, and job boards.

    Please see the sample exploration notebook for some examples of what you can do! The data format is a JSON array of entries, an example of which is given below.

    Example entry

    {
     "owner": "pelmers",
     "name": "text-rewriter",
     "stars": 13,
     "forks": 5,
     "watchers": 4,
     "isFork": false,
     "isArchived": false,
     "languages": [ { "name": "JavaScript", "size": 21769 }, { "name": "HTML", "size": 2096 }, { "name": "CSS", "size": 2081 } ],
     "languageCount": 3,
     "topics": [ { "name": "chrome-extension", "stars": 43211 } ],
     "topicCount": 1,
     "diskUsageKb": 75,
     "pullRequests": 4,
     "issues": 12,
     "description": "Webextension to rewrite phrases in pages",
     "primaryLanguage": "JavaScript",
     "createdAt": "2015-03-14T22:35:11Z",
     "pushedAt": "2022-02-11T14:26:00Z",
     "defaultBranchCommitCount": 54,
     "license": null,
     "assignableUserCount": 1,
     "codeOfConduct": null,
     "forkingAllowed": true,
     "nameWithOwner": "pelmers/text-rewriter",
     "parent": null
    }
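
    As a rough sketch of how an entry in this format might be processed: the snippet below parses a JSON array holding a trimmed copy of the example entry and computes each language's share of the repository's total size. The helper name `language_shares` and the inline `SAMPLE` string are assumptions for illustration only; the listing does not name the file inside the zip archive, so no file path is used here.

```python
import json

# A trimmed copy of the example entry above, wrapped in a JSON array
# because the dataset is described as "a JSON array of entries".
SAMPLE = """
[
  {
    "owner": "pelmers",
    "name": "text-rewriter",
    "stars": 13,
    "languages": [
      {"name": "JavaScript", "size": 21769},
      {"name": "HTML", "size": 2096},
      {"name": "CSS", "size": 2081}
    ],
    "primaryLanguage": "JavaScript",
    "license": null
  }
]
"""


def language_shares(entry):
    """Return {language name: fraction of total size} for one entry.

    Hypothetical helper, not part of the dataset's own tooling.
    """
    languages = entry.get("languages") or []
    total = sum(lang["size"] for lang in languages)
    if total == 0:
        return {}
    return {lang["name"]: lang["size"] / total for lang in languages}


entries = json.loads(SAMPLE)
shares = language_shares(entries[0])

# Print languages from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

    With roughly 3.1 million entries, reading the full file in one `json.load` call may need a lot of memory, so a streaming JSON parser could be a better fit for the real data.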
    

    The collection script and exploration notebook are also available on GitHub: https://github.com/pelmers/github-repository-metadata. For more background info, you can read my blog post.

  16. List of research data repositories that were shut down

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Strecker, Dorothea; Pampel, Heinz; Schabinger, Rouven; Weisweiler, Nina Leonie (2024). List of research data repositories that were shut down [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7802441
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Humboldt-Universität zu Berlin, Berlin School of Library and Information Science
    Helmholtz Association, Helmholtz Open Science Office
    Swiss Library Service Platform (SLSP)
    Authors
    Strecker, Dorothea; Pampel, Heinz; Schabinger, Rouven; Weisweiler, Nina Leonie
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset aggregates information about 191 research data repositories that were shut down. The data collection was based on re3data, the registry of research data repositories, and a comprehensive content analysis of repository websites and related materials. The dataset documents the period in which each repository was active, the risks that led to its shutdown, and the repositories that took over custody of the data afterward.

  17. Research Data Repositories Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Research Data Repositories Market Research Report 2033 [Dataset]. https://dataintelo.com/report/research-data-repositories-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Research Data Repositories Market Outlook



    According to our latest research, the global research data repositories market size reached USD 4.12 billion in 2024, driven by the surging demand for secure, accessible, and scalable data management solutions across academic, government, and corporate sectors. The market is projected to expand at a robust CAGR of 8.7% from 2025 to 2033, reaching a forecasted value of USD 8.65 billion by 2033. This impressive growth trajectory is primarily attributed to the increasing emphasis on open science, data transparency, and regulatory compliance, which are compelling organizations to invest in advanced research data repository solutions.
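
    The compound-growth arithmetic behind these figures can be sketched as follows. This is a minimal illustration that assumes annual compounding over the nine years from the 2024 base value to 2033; the report does not state its compounding convention or base year, and the function names are hypothetical.

```python
def project_value(start, rate, years):
    """Value after compounding `start` at annual `rate` for `years` years."""
    return start * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the description above (USD billions).
START_2024 = 4.12
END_2033 = 8.65
STATED_CAGR = 0.087

# Assumption: nine compounding periods, 2024 -> 2033.
YEARS = 2033 - 2024

projected = project_value(START_2024, STATED_CAGR, YEARS)
implied = implied_cagr(START_2024, END_2033, YEARS)

print(f"Projected 2033 value at the stated CAGR: USD {projected:.2f} billion")
print(f"CAGR implied by the 2024 and 2033 values: {implied:.2%}")
```

    Under these assumptions the two results need not match the quoted figures to the last digit, since the report's rounding and base year are unknown; small differences should not be over-interpreted.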




    One of the primary growth factors driving the research data repositories market is the global shift towards open data policies and mandates by funding agencies and governments. The proliferation of open-access initiatives, such as the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles, has significantly increased the need for robust data repositories that can support data sharing, reproducibility, and long-term preservation. As research outputs become more data-intensive and collaborative, the ability to store, manage, and disseminate large datasets efficiently has become a strategic imperative for research institutions and organizations worldwide. This trend is further reinforced by the growing recognition of data as a critical asset in scientific discovery, innovation, and policy-making.




    Another major driver is the rapid digital transformation occurring across academia, government, and the corporate sector. Organizations are increasingly leveraging cloud-based research data repositories to overcome traditional storage limitations, enhance data security, and streamline workflows. The adoption of advanced technologies such as artificial intelligence, machine learning, and blockchain within these repositories is also enhancing data curation, metadata management, and access control. This technological evolution is enabling researchers and organizations to extract greater value from their data assets while ensuring compliance with evolving data governance standards and privacy regulations, such as GDPR and HIPAA.




    The expansion of interdisciplinary and international research collaborations is also fueling the demand for scalable and interoperable research data repositories. As research projects become more complex and involve multiple stakeholders across different geographies, there is a growing need for standardized platforms that facilitate seamless data exchange and integration. This is particularly evident in domains such as health sciences, environmental research, and social sciences, where data sharing and cross-institutional collaboration are essential for addressing global challenges. Furthermore, the increasing availability of funding for research infrastructure development, particularly in emerging economies, is creating new opportunities for market growth.




    From a regional perspective, North America currently dominates the research data repositories market, owing to its advanced research ecosystem, strong government support, and the presence of leading technology providers. Europe follows closely, driven by stringent data protection regulations and a vibrant academic landscape. The Asia Pacific region is expected to witness the fastest growth over the forecast period, supported by significant investments in research infrastructure, rising adoption of digital technologies, and increasing participation in global research initiatives. Latin America and the Middle East & Africa are also emerging as promising markets, albeit from a smaller base, as governments and institutions in these regions ramp up their efforts to enhance research capacity and data management capabilities.



    Type Analysis



    The research data repositories market is segmented by type into institutional repositories, disciplinary repositories, generalist repositories, and others. Institutional repositories form the backbone of most academic and research organizations, serving as centralized platforms for storing, managing, and disseminating research outputs generated by faculty, students, and staff. These repositories are increasingly being adopted as part of open access and research data management policies, enabling institutions to showcase their research impact, comply with funder mandates, and facilitate knowledge sharing. The growing emphasis o

  18. Living HHS Open Data Plan - m9xc-txya - Archive Repository

    • healthdata.gov
    csv, xlsx, xml
    Updated Jul 31, 2025
    Cite
    (2025). Living HHS Open Data Plan - m9xc-txya - Archive Repository [Dataset]. https://healthdata.gov/dataset/Living-HHS-Open-Data-Plan-m9xc-txya-Archive-Reposi/7bgr-wqru
    Available download formats: xlsx, csv, xml
    Dataset updated
    Jul 31, 2025
    Description

    This dataset tracks the updates made on the dataset "Living HHS Open Data Plan" as a repository for previous versions of the data and metadata.

  19. Data Repository

    • kaggle.com
    zip
    Updated Jul 2, 2023
    Cite
    Felipe Mardones (2023). Data Repository [Dataset]. https://www.kaggle.com/datasets/felipemardones/data-repository
    Available download formats: zip (214407574 bytes)
    Dataset updated
    Jul 2, 2023
    Authors
    Felipe Mardones
    Description

    Dataset

    This dataset was created by Felipe Mardones

    Contents

  20. Biologic Specimen and Data Repository Information Coordinating Center...

    • data.virginia.gov
    • healthdata.gov
    • +3 more
    Updated Jul 26, 2023
    Cite
    National Institutes of Health (NIH) (2023). Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC) [Dataset]. https://data.virginia.gov/dataset/biologic-specimen-and-data-repository-information-coordinating-center-biolincc
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    National Institutes of Health (NIH)
    Description

    The goal of BioLINCC is to facilitate and coordinate the existing activities of the NHLBI Biorepository and the Data Repository and to expand their scope and usability to the scientific community through a single web-based user interface.

Cite
Yarrow Dunham; Amy Rakei; Chen Fang; Abhishek Giri; Filip Verroens (2015). Data repository [Dataset]. https://osf.io/q5j8g

Data repository

4 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Aug 9, 2015
Dataset provided by
Center for Open Science (https://cos.io/)
Authors
Yarrow Dunham; Amy Rakei; Chen Fang; Abhishek Giri; Filip Verroens
Description

Data and variable key for Dunham, Dotsch, Clark, & Stepanova, "The development of White-Asian categorization: Contributions from skin color and other physiognomic cues"
