100+ datasets found
  1. Repository-Dataset

    • huggingface.co
    Updated Feb 14, 2025
    Cite
    vicky A (2025). Repository-Dataset [Dataset]. https://huggingface.co/datasets/Mr-Vicky-01/Repository-Dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Feb 14, 2025
    Authors
    vicky A
    Description

    Mr-Vicky-01/Repository-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. Most Popular Github Repositories (Projects)

    • kaggle.com
    zip
    Updated Oct 1, 2023
    Cite
    Canard (2023). Most Popular Github Repositories (Projects) [Dataset]. https://www.kaggle.com/datasets/donbarbos/github-repos
    Explore at:
    Available download formats: zip (24421413 bytes)
    Dataset updated
    Oct 1, 2023
    Authors
    Canard
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About

    This dataset lists over 215k top GitHub projects by star count, each with more than 167 stars. It includes many useful attributes for each repository.

    I collected this dataset using the GitHub Search API. The API returns only the first thousand results for a query, so I looped through low/high star-count pairs, each chosen so that the query stars:{low}..{high} returns fewer than a thousand repositories.
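The range-splitting idea described above can be sketched as follows. This is a minimal illustration, not the author's actual script: `split_star_ranges`, `build_query`, `SAMPLE_STARS`, and `fake_count` are hypothetical names, and the counting function is a local stand-in for whatever call reports how many repositories match a query.

```python
from typing import Callable, List, Tuple

RESULT_CAP = 1000  # only the first thousand results are reachable per query


def build_query(low: int, high: int) -> str:
    """Format the star-range qualifier quoted in the description."""
    return f"stars:{low}..{high}"


def split_star_ranges(
    low: int,
    high: int,
    count_fn: Callable[[int, int], int],
    cap: int = RESULT_CAP,
) -> List[Tuple[int, int]]:
    """Split [low, high] into contiguous ranges that each match fewer than `cap` repos.

    A single star value that still exceeds the cap cannot be split further
    and is returned as-is.
    """
    if low > high:
        return []
    if low == high or count_fn(low, high) < cap:
        return [(low, high)]
    mid = (low + high) // 2
    return (split_star_ranges(low, mid, count_fn, cap)
            + split_star_ranges(mid + 1, high, count_fn, cap))


# Stand-in data (invented): 10 repositories at every star value from 168 to 667.
SAMPLE_STARS = [168 + (i % 500) for i in range(5000)]


def fake_count(low: int, high: int) -> int:
    """Hypothetical replacement for a real result-count request."""
    return sum(1 for s in SAMPLE_STARS if low <= s <= high)


if __name__ == "__main__":
    for lo, hi in split_star_ranges(168, 667, fake_count):
        print(build_query(lo, hi), fake_count(lo, hi))
```

Bisection keeps the number of count requests small when stars are unevenly distributed. A real run would also need to respect the API's rate limits.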

    The Github API Terms of Service apply.

    You may not use this dataset for spamming purposes, including for the purposes of selling GitHub users' personal information, such as to recruiters, headhunters, and job boards.

    Columns

    Column name: Description
    Name: The name of the GitHub repository
    Description: A brief textual description that summarizes the purpose or focus of the repository
    URL: The URL or web address that links to the GitHub repository, which is a unique identifier for the repository
    Created At: The date and time when the repository was initially created on GitHub, in ISO 8601 format
    Updated At: The date and time of the most recent update or modification to the repository, in ISO 8601 format
    Homepage: The URL to the homepage or landing page associated with the repository, providing additional information or resources
    Size: The size of the repository in bytes, indicating the total storage space used by the repository's files and data
    Stars: The number of stars or likes that the repository has received from other GitHub users, indicating its popularity or interest
    Forks: The number of times the repository has been forked by other GitHub users
    Issues: The total number of open issues
    Watchers: The number of GitHub users who are "watching" or monitoring the repository for updates and changes
    Language: The primary programming language
    License: Information about the software license using a license identifier
    Topics: A list of topics or tags associated with the repository, helping users discover related projects and topics of interest
    Has Issues: A boolean value indicating whether the repository has an issue tracker enabled. In this case, it's true, meaning it has an issue tracker
    Has Projects: A boolean value indicating whether the repository uses GitHub Projects to manage and organize tasks and work items
    Has Downloads: A boolean value indicating whether the repository offers downloadable files or assets to users
    Has Wiki: A boolean value indicating whether the repository has an associated wiki with additional documentation and information
    Has Pages: A boolean value indicating whether the repository has GitHub Pages enabled, allowing the creation of a website associated with the repository
    Has Discussions: A boolean value indicating whether the repository has GitHub Discussions enabled, allowing community discussions and collaboration
    Is Fork: A boolean value indicating whether the repository is a fork of another repository. In this case, it's false, meaning it is not a fork
    Is Archived: A boolean value indicating whether the repository is archived. Archived repositories are typically read-only and are no longer actively maintained
    Is Template: A boolean value indicating whether the repository is set up as a template
    Default Branch: The name of the default branch
  3. NSF Public Access Repository

    • catalog.data.gov
    Updated Sep 19, 2021
    Cite
    National Science Foundation (2021). NSF Public Access Repository [Dataset]. https://catalog.data.gov/dataset/nsf-public-access-repository
    Explore at:
    Dataset updated
    Sep 19, 2021
    Dataset provided by
    National Science Foundation (http://www.nsf.gov/)
    Description

    The NSF Public Access Repository contains an initial collection of journal publications and the final accepted version of the peer-reviewed manuscript or the version of record. To do this, NSF draws upon services provided by the publisher community including the Clearinghouse of Open Research for the United States, CrossRef, and International Standard Serial Number. When clicking on a Digital Object Identifier number, you will be taken to an external site maintained by the publisher. Some full text articles may not be available without a charge during the embargo, or administrative interval. Some links on this page may take you to non-federal websites. Their policies may differ from this website.

  4. Research Data Repository Requirements and Features Review

    • borealisdata.ca
    Updated Aug 24, 2015
    Cite
    Amber Leahey; Peter Webster; Claire Austin; Nancy Fong; Julie Friddell; Chuck Humphrey; Susan Brown; Walter Stewart (2015). Research Data Repository Requirements and Features Review [Dataset]. http://doi.org/10.5683/SP3/UPABVH
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Aug 24, 2015
    Dataset provided by
    Borealis
    Authors
    Amber Leahey; Peter Webster; Claire Austin; Nancy Fong; Julie Friddell; Chuck Humphrey; Susan Brown; Walter Stewart
    License

    https://borealisdata.ca/api/datasets/:persistentId/versions/4.0/customlicense?persistentId=doi:10.5683/SP3/UPABVH

    Time period covered
    Sep 2014 - Feb 2015
    Area covered
    Europe, United Kingdom, United States, Canada, International
    Description

    Data collected from major Canadian and international research data repositories cover data storage, preservation, metadata, interchange, data file types, and other standard features used in the retention and sharing of research data. The outputs of this project primarily aim to assist in the establishment of recommended minimum requirements for a Canadian research data infrastructure. The committee also aims to further develop guidelines and criteria for the assessment and selection of repositories for deposit of Canadian research data by researchers, data managers, librarians, archivists, etc.

  5. GitHub Repos

    • kaggle.com
    zip
    Updated Mar 20, 2019
    Cite
    Github (2019). GitHub Repos [Dataset]. https://www.kaggle.com/datasets/github/github-repos
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Mar 20, 2019
    Dataset provided by
    GitHub (https://github.com/)
    Authors
    Github
    Description

    GitHub is how people build software and is home to the largest community of open source developers in the world, with over 12 million people contributing to 31 million projects on GitHub since 2008.

    This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.github_repos.[TABLENAME]. Fork this kernel to get started to learn how to safely manage analyzing large BigQuery datasets.
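As a hedged sketch of the workflow described above, the snippet below builds a fully qualified table name from the `bigquery-public-data.github_repos.[TABLENAME]` pattern and estimates a query's scan size with a dry run before anything is executed. The helper names are hypothetical, and the `licenses` table with `repo_name` and `license` columns is an assumption used only for illustration. `estimate_bytes` needs the third-party `google-cloud-bigquery` package and valid credentials, so it is defined but not called here.

```python
from typing import Sequence

DATASET = "bigquery-public-data.github_repos"


def table_id(table_name: str) -> str:
    """Return the fully qualified table name for this public dataset."""
    return f"{DATASET}.{table_name}"


def build_query(table_name: str, columns: Sequence[str], limit: int = 10) -> str:
    """Select named columns only; scanning fewer columns reads fewer bytes."""
    cols = ", ".join(columns)
    return f"SELECT {cols} FROM `{table_id(table_name)}` LIMIT {int(limit)}"


def estimate_bytes(sql: str) -> int:
    """Dry-run the query and return the bytes it would process (nothing is run)."""
    from google.cloud import bigquery  # third-party; imported lazily

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


if __name__ == "__main__":
    # Table and column names below are assumptions for illustration.
    print(build_query("licenses", ["repo_name", "license"], limit=5))
```

Checking the dry-run estimate first is one way to avoid accidentally scanning a multi-terabyte table. A `LIMIT` clause alone does not reduce the bytes read.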

    Acknowledgements

    This dataset was made available per GitHub's terms of service. This dataset is available via Google Cloud Platform's Marketplace, GitHub Activity Data, as part of GCP Public Datasets.

    Inspiration

    • This is the perfect dataset for fighting language wars.
    • Can you identify any signals that predict which packages or languages will become popular, in advance of their mass adoption?
  6. The Experience Sampling Method (ESM) Item Repository

    • osf.io
    Updated Aug 11, 2025
    Cite
    Olivia Kirtley; Gudrun Eisele; Yoram Kunkels; Anu Hiekkaranta; Laura Van Heck; Milla Pihlajamäki; Benjamin Kunc; Steffie Schoefs; Nieke Vermaelen; Inez Myin-Germeys (2025). The Experience Sampling Method (ESM) Item Repository [Dataset]. http://doi.org/10.17605/OSF.IO/KG376
    Explore at:
    Dataset updated
    Aug 11, 2025
    Dataset provided by
    Center For Open Science
    Authors
    Olivia Kirtley; Gudrun Eisele; Yoram Kunkels; Anu Hiekkaranta; Laura Van Heck; Milla Pihlajamäki; Benjamin Kunc; Steffie Schoefs; Nieke Vermaelen; Inez Myin-Germeys
    Description

    This project has built a repository of items (www.esmitemrepository.com) used in experience sampling method (ESM), ecological momentary assessment (EMA) and ambulatory assessment (AA) studies. The idea for this repository arose out of discussions during the Open Science hackathon at the 2018 Belgian-Dutch ESM Network Meeting.

    In order to contribute items to the repository, you will need to download all five documents in the Contributors' Pack. When you have downloaded the ESM Item Repository submission template (spreadsheet) document, you can enter your items into it and then send it back to us via email (submissions [at] esmitemrepository.com). We will then collate all the submitted items into a repository and publish them here.

    If you would like to browse the full repository and download items and their information, visit www.esmitemrepository.com.

  7. NIH Common Data Elements Repository

    • catalog.data.gov
    • datadiscovery.nlm.nih.gov
    • +1more
    Updated Jun 19, 2025
    Cite
    National Library of Medicine (2025). NIH Common Data Elements Repository [Dataset]. https://catalog.data.gov/dataset/nih-common-data-elements-repository-f6b3a
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    The NIH Common Data Elements (CDE) Repository has been designed to provide access to structured human and machine-readable definitions of data elements that have been recommended or required by NIH Institutes and Centers and other organizations for use in research and for other purposes. Visit the NIH CDE Resource Portal for contextual information about the repository.

  8. Thesis Data Repository

    • figshare.unimelb.edu.au
    zip
    Updated Oct 11, 2023
    Cite
    Gregory White (2023). Thesis Data Repository [Dataset]. http://doi.org/10.26188/24295243.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 11, 2023
    Dataset provided by
    The University of Melbourne
    Authors
    Gregory White
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Availability of data, code, and plot creation for various figures throughout my PhD thesis. Rough organisation currently. Pertains to Figures 5.4, 5.8, 6.11, 6.18, 7.3, 7.12, and Table 6.1.

  9. Administrative Data Repository (ADR)

    • catalog.data.gov
    Updated Aug 2, 2025
    Cite
    Department of Veterans Affairs (2025). Administrative Data Repository (ADR) [Dataset]. https://catalog.data.gov/dataset/administrative-data-repository-adr
    Explore at:
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    United States Department of Veterans Affairs (http://va.gov/)
    Description

    The Administrative Data Repository (ADR) was established to provide support for the administrative data elements relative to multiple categories of a person entity such as demographic and eligibility information. Although initially focused on the computing needs of the Veterans Health Administration, the ADR is positioned to provide identity management and demographics support for all IT systems within the Department of Veterans Affairs.

  10. Clinical Trial Data Repository Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 23, 2024
    Cite
    Dataintelo (2024). Clinical Trial Data Repository Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-clinical-trial-data-repository-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Sep 23, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2025 - 2034
    Area covered
    Global
    Description

    Clinical Trial Data Repository Market Outlook




    The global clinical trial data repository market size was estimated to be approximately $1.8 billion in 2023 and is projected to grow at a compound annual growth rate (CAGR) of 9.5% to reach around $4.1 billion by 2032. The primary growth factors include the increasing volume and complexity of clinical trials, rising need for efficient data management systems, and stringent regulatory requirements for data accuracy and integrity. The advent of advanced technologies such as artificial intelligence and big data analytics further drives market expansion by enhancing data processing capabilities and providing actionable insights.
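The compound-growth arithmetic behind these figures can be checked with a short sketch. It assumes nine annual compounding periods (2023 to 2032), which is an interpretation rather than something the report states; `project_value` is a hypothetical helper.

```python
def project_value(start: float, rate: float, years: int) -> float:
    """Compound `start` at annual growth `rate` for `years` periods."""
    return start * (1 + rate) ** years


# Figures from the paragraph above: $1.8 billion in 2023, 9.5% CAGR, target year 2032.
# The nine-period count is an assumption.
projected = project_value(1.8, 0.095, 2032 - 2023)

if __name__ == "__main__":
    print(f"Projected 2032 market size: ${projected:.2f} billion")
```

Under that assumption the result rounds to the quoted figure of around $4.1 billion.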




    The growth of the clinical trial data repository market is significantly influenced by the increasing number of clinical trials being conducted globally. With the rise in chronic diseases, the need for innovative treatments and therapies has surged, leading to an upsurge in clinical trials. This increase in clinical trials necessitates robust data management systems to handle vast amounts of data generated, thereby propelling the demand for clinical trial data repositories. Moreover, the complexity of modern clinical trials, which often involve multiple sites and diverse patient populations, further amplifies the need for sophisticated data management solutions.




    Another critical driver for the market is the stringent regulatory landscape governing clinical trial data. Regulatory bodies such as the FDA, EMA, and other local authorities mandate rigorous data management standards to ensure data integrity, accuracy, and accessibility. These regulations necessitate the adoption of advanced data repository systems that can comply with regulatory requirements, thereby fueling market growth. Additionally, regulatory frameworks are becoming increasingly stringent, prompting pharmaceutical and biotechnology companies to invest in state-of-the-art data management systems to avoid compliance issues and potential financial penalties.




    Technological advancements play a pivotal role in the market's growth. The integration of artificial intelligence, machine learning, and big data analytics into data repository systems enhances data processing and analysis capabilities. These technologies enable real-time data monitoring, predictive analytics, and improved decision-making, thereby improving the efficiency of clinical trials. Furthermore, the shift towards cloud-based solutions offers scalability, flexibility, and cost-effectiveness, making advanced data management systems accessible to even small and medium-sized enterprises.




    Regionally, North America dominates the clinical trial data repository market owing to its robust healthcare infrastructure, high R&D investments, and presence of major pharmaceutical and biotechnology companies. Europe follows closely due to stringent regulatory standards and a strong focus on clinical research. The Asia Pacific region is expected to witness the highest growth rate during the forecast period due to increasing clinical trial activities, growing healthcare expenditure, and the rising adoption of advanced technologies. Latin America and the Middle East & Africa are also likely to experience growth, albeit at a slower pace, driven by improving healthcare systems and increasing focus on clinical research.



    Component Analysis




    The clinical trial data repository market is segmented by components into software and services. The software segment is anticipated to hold a significant share of the market due to the essential role software plays in data management. Advanced software solutions offer capabilities such as data storage, management, retrieval, and analysis, which are critical for effective clinical trial management. The integration of AI and machine learning algorithms into these software systems further enhances their efficiency by enabling predictive analytics and real-time monitoring, thus driving the software segment's growth.




    Software solutions in clinical trial data repositories also offer interoperability, enabling seamless integration with other clinical trial management systems (CTMS) and electronic data capture (EDC) systems. This interoperability is crucial for ensuring data consistency and accuracy across different platforms, thereby enhancing overall data management. Additionally, the increasing adoption of cloud-based software solutions provides scalability, cost-effectiveness, and remote acce

  11. Open Repository and Bibliography

    • data.europa.eu
    unknown
    Updated Mar 22, 2024
    Cite
    University of Luxembourg (2024). Open Repository and Bibliography [Dataset]. https://data.europa.eu/88u/dataset/open-repository-and-bibliography
    Explore at:
    Available download formats: unknown
    Dataset updated
    Mar 22, 2024
    Dataset authored and provided by
    University of Luxembourg
    Description

    Digital Repository for Open Access to University of Luxembourg publications.

    ORBilu was officially launched on the 22nd April 2013. The acronym ORBi stands for “Open Repository and Bibliography”. It also expresses the Latin word “orbi” (“for the world”) and signals the will of the University to make its academic research available to everyone, without barriers, be they legal, financial or technical. By keeping the ORBi name and adding “lu”, the University of Luxembourg wants to show its appreciation for the work done by the University of Liège but also clearly indicates that this is a version adapted to the UL context.

    The API format is described at https://www.openarchives.org/pmh/.
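The linked page describes OAI-PMH, a harvesting protocol in which a client sends a verb such as `ListRecords` with a `metadataPrefix` and receives XML. The sketch below is illustrative only: the endpoint `https://example.org/oai` is a placeholder (the repository's real endpoint is not given here), and `SAMPLE_RESPONSE` is an invented record used to show the parsing step.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def build_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text: str) -> List[str]:
    """Collect every Dublin Core title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text or "" for el in root.iter(f"{{{DC_NS}}}title")]


# Invented sample response, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example record title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(build_list_records_url("https://example.org/oai"))  # placeholder endpoint
    print(extract_titles(SAMPLE_RESPONSE))
```

A full harvester would also follow the `resumptionToken` that the protocol uses to page through large result sets.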

  12. NIH Pediatric MRI Data Repository

    • neuinfo.org
    Updated Jan 29, 2022
    Cite
    (2022). NIH Pediatric MRI Data Repository [Dataset]. http://identifiers.org/RRID:SCR_014149
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    A database which contains longitudinal structural MRIs, spectroscopy, DTI and correlated clinical/behavioral data from approximately 500 healthy, normally developing children, ages newborn to young adult.

  13. Biologic Specimen and Data Repository Information Coordinating Center...

    • data.virginia.gov
    Updated Jul 26, 2023
    Cite
    National Institutes of Health (NIH) (2023). Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC) [Dataset]. https://data.virginia.gov/dataset/biologic-specimen-and-data-repository-information-coordinating-center-biolincc
    Explore at:
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    National Institutes of Health (NIH)
    Description

    The goal of BioLINCC is to facilitate and coordinate the existing activities of the NHLBI Biorepository and the Data Repository and to expand their scope and usability to the scientific community through a single web-based user interface.

  14. Language values for DataCite dataset records

    • databank.illinois.edu
    Updated Jun 23, 2016
    Cite
    Elizabeth Wickes (2016). Language values for DataCite dataset records [Dataset]. http://doi.org/10.13012/B2IDB-1065549_V1
    Explore at:
    Dataset updated
    Jun 23, 2016
    Authors
    Elizabeth Wickes
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset was extracted from a set of metadata files harvested from the DataCite metadata store (http://search.datacite.org/ui) during December 2015. Metadata records for items with a resourceType of dataset were collected. 1,647,949 total records were collected. This dataset contains four files: 1) readme.txt: a readme file. 2) language-results.csv: A CSV file containing three columns: DOI, DOI prefix, and language text contents 3) language-counts.csv: A CSV file containing counts for unique language text content values. 4) language-grouped-counts.txt: A text file containing the results of manually grouping these language codes.
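The relationship between the per-record file and the counts file can be illustrated with a small sketch. It assumes the per-record CSV has a header row and that the language text is the third column, following the order listed above; the sample rows and the `count_languages` helper are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_languages(handle: TextIO, language_column: int = 2) -> Counter:
    """Count occurrences of each distinct language text value.

    Assumes a header row and that the language text sits in the third
    column (index 2); both are assumptions about the file layout.
    """
    reader = csv.reader(handle)
    next(reader, None)  # skip the assumed header row
    return Counter(row[language_column] for row in reader if len(row) > language_column)


# Invented sample rows standing in for the per-record file.
SAMPLE_CSV = (
    "doi,doi_prefix,language\n"
    "10.1234/a,10.1234,en\n"
    "10.1234/b,10.1234,eng\n"
    "10.5678/c,10.5678,en\n"
)

if __name__ == "__main__":
    counts = count_languages(io.StringIO(SAMPLE_CSV))
    for value, n in counts.most_common():
        print(value, n)
```

Values such as `en` and `eng` are counted separately here, which is the kind of variation the manual grouping file is described as addressing.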

  15. RISEnergy_D3.2 Annex MD platform and Data repository/dataset list v.2

    • zenodo.org
    txt
    Updated Oct 27, 2025
    Cite
    Nakamae; Lazher Mejdi; Kourosh Malek; Kai Heussen (2025). RISEnergy_D3.2 Annex MD platform and Data repository/dataset list v.2 [Dataset]. http://doi.org/10.5281/zenodo.17452376
    Explore at:
    Available download formats: txt
    Dataset updated
    Oct 27, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nakamae; Lazher Mejdi; Kourosh Malek; Kai Heussen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 26, 2025
    Description

    Annex to Deliverable 3.2 of project RISEnergy. Contains lists of metadata platforms, data repository sites, and database services relevant to the 10 renewable energy sectors of concern to RISEnergy.

  16. Unified Data Repository Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Unified Data Repository Market Research Report 2033 [Dataset]. https://dataintelo.com/report/unified-data-repository-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2025 - 2034
    Area covered
    Global
    Description

    Unified Data Repository Market Outlook



    According to our latest research, the unified data repository market size reached USD 8.4 billion in 2024 on a global scale. The market is witnessing robust momentum, driven by the exponential growth of enterprise data and the need for streamlined data management solutions. The market is projected to expand at a notable CAGR of 14.7% during the forecast period, with the total value anticipated to reach USD 26.2 billion by 2033. This significant growth trajectory is underpinned by the increasing adoption of cloud-based solutions, the proliferation of big data analytics, and a growing emphasis on regulatory compliance and data governance across various industries.




    One of the primary growth factors propelling the unified data repository market is the relentless surge in data volumes generated by organizations across all sectors. With the proliferation of digital transformation initiatives, enterprises are experiencing unprecedented data growth, originating from diverse sources such as IoT devices, customer interactions, business operations, and social media. Managing, integrating, and extracting value from this deluge of data has become a strategic imperative. Unified data repositories offer a centralized platform that enables organizations to consolidate disparate data silos, improve data accessibility, and enhance decision-making capabilities. As businesses increasingly recognize the value of data-driven insights, the demand for robust unified data repository solutions is set to accelerate further.




    Another critical driver for the unified data repository market is the growing need for compliance with stringent data protection and privacy regulations. Regulatory frameworks such as GDPR in Europe, CCPA in California, and other local data governance mandates require organizations to maintain high levels of data integrity, security, and transparency. Unified data repositories facilitate centralized control and monitoring of data assets, ensuring that organizations can efficiently manage data lineage, access controls, and audit trails. This capability not only helps mitigate compliance risks but also fosters trust among stakeholders and customers. Consequently, sectors such as BFSI, healthcare, and government are increasingly investing in unified data repository solutions to uphold regulatory standards and safeguard sensitive information.




    Technological advancements and the integration of artificial intelligence (AI) and machine learning (ML) capabilities are further enhancing the value proposition of unified data repositories. Modern solutions are equipped with advanced analytics, automated data classification, and intelligent data integration features that empower organizations to derive actionable insights from their data assets. The ability to seamlessly integrate with existing IT infrastructure and support multi-cloud deployments is also a key differentiator. These technological innovations are enabling organizations to unlock new business opportunities, optimize operational efficiency, and gain a competitive edge in the digital economy. As a result, the unified data repository market is experiencing heightened adoption across both large enterprises and small and medium-sized enterprises (SMEs).




    From a regional perspective, North America continues to dominate the unified data repository market, accounting for the largest revenue share in 2024. The region’s leadership is attributed to the high concentration of technology-driven enterprises, early adoption of advanced data management solutions, and a mature regulatory environment. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digitalization, expanding IT infrastructure, and increasing investments in cloud technologies. Europe remains a significant market, driven by stringent data protection regulations and strong demand from the BFSI and healthcare sectors. The Middle East & Africa and Latin America are also witnessing steady growth, supported by rising awareness of data management best practices and ongoing digital transformation initiatives.



    Component Analysis



    The unified data repository market is segmented by component into software, hardware, and services, each playing a crucial role in the overall ecosystem. The software segment holds the largest share, driven by the widespread adoption of advanced data management platforms that enable seamless integration, storage, and retriev

  17. KMASH Data Repository for outlier detection

    • research-repository.rmit.edu.au
    zip
    Updated May 30, 2023
    Cite
    Sevvandi Kandanaarachchi; Mario Andres Munoz Acosta; Kate Smith-Miles; Rob J Hyndman (2023). KMASH Data Repository for outlier detection [Dataset]. http://doi.org/10.26180/5c6253c0b3323
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    RMIT University
    Authors
    Sevvandi Kandanaarachchi; Mario Andres Munoz Acosta; Kate Smith-Miles; Rob J Hyndman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The zip files contain 12338 datasets for outlier detection investigated in the following papers: (1) Instance space analysis for unsupervised outlier detection. Authors: Sevvandi Kandanaarachchi, Mario A. Munoz, Kate Smith-Miles. (2) On normalization and algorithm selection for unsupervised outlier detection. Authors: Sevvandi Kandanaarachchi, Mario A. Munoz, Rob J. Hyndman, Kate Smith-Miles. Some of these datasets were originally discussed in the paper: On the evaluation of unsupervised outlier detection: measures, datasets and an empirical study. Authors: G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenkova, E. Schubert, I. Assent, M. E. Houle.

  18. UCI Machine Learning Repository

    • rrid.site
    Updated Sep 30, 2024
    Cite
    (2024). UCI Machine Learning Repository [Dataset]. http://identifiers.org/RRID:SCR_026571
    Explore at:
    Dataset updated
    Sep 30, 2024
    Description

    Collection of databases, domain theories, and data generators that are used by the machine learning community for empirical analysis of machine learning algorithms. Datasets approved to be in the repository will be assigned a Digital Object Identifier (DOI) if they do not already possess one. Datasets will be licensed under a Creative Commons Attribution 4.0 International license (CC BY 4.0), which allows for the sharing and adaptation of the datasets for any purpose, provided that appropriate credit is given.

  19. Biospecimen Repository Access and Data Sharing (BRADS)

    • healthdata.gov
    • data.virginia.gov
    • +1more
    csv, xlsx, xml
    Updated Feb 13, 2021
    Cite
    (2021). Biospecimen Repository Access and Data Sharing (BRADS) [Dataset]. https://healthdata.gov/dataset/Biospecimen-Repository-Access-and-Data-Sharing-BRA/eqzy-py4a
    Explore at:
    Available download formats: xml, csv, xlsx
    Dataset updated
    Feb 13, 2021
    Description

    BRADS is a repository for data and biospecimens from population health research initiatives and clinical or interventional trials designed and implemented by NICHD’s Division of Intramural Population Health Research (DIPHR). Topics include human reproduction and development, pregnancy, child health and development, and women’s health. The website is maintained by DIPHR.

  20. Global Weather Repository

    • kaggle.com
    zip
    Updated May 20, 2024
    Cite
    salhi rahma (2024). Global Weather Repository [Dataset]. https://www.kaggle.com/datasets/salhirahma/global-weather-repository
    Explore at:
    Available download formats: zip (230967 bytes)
    Dataset updated
    May 20, 2024
    Authors
    salhi rahma
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by salhi rahma

    Released under Attribution 4.0 International (CC BY 4.0)

    Contents
