https://borealisdata.ca/api/datasets/:persistentId/versions/4.0/customlicense?persistentId=doi:10.5683/SP3/UPABVH
Data collected from major Canadian and international research data repositories cover data storage, preservation, metadata, interchange, data file types, and other standard features used in the retention and sharing of research data. The outputs of this project primarily aim to assist in the establishment of recommended minimum requirements for a Canadian research data infrastructure. The committee also aims to further develop guidelines and criteria for the assessment and selection of repositories for deposit of Canadian research data by researchers, data managers, librarians, archivists, etc.
Mr-Vicky-01/Repository-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
The NSF Public Access Repository contains an initial collection of journal publications and the final accepted version of the peer-reviewed manuscript or the version of record. To do this, NSF draws upon services provided by the publisher community including the Clearinghouse of Open Research for the United States, CrossRef, and International Standard Serial Number. When clicking on a Digital Object Identifier number, you will be taken to an external site maintained by the publisher. Some full text articles may not be available without a charge during the embargo, or administrative interval. Some links on this page may take you to non-federal websites. Their policies may differ from this website.
GitHub is how people build software and is home to the largest community of open source developers in the world, with over 12 million people contributing to 31 million projects on GitHub since 2008.
This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.github_repos.[TABLENAME]. Fork this kernel to get started and learn how to safely analyze large BigQuery datasets.
This dataset was made available per GitHub's terms of service. This dataset is available via Google Cloud Platform's Marketplace, GitHub Activity Data, as part of GCP Public Datasets.
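As a rough illustration of querying a 3TB+ dataset cautiously, the sketch below builds a query against one table, estimates the bytes it would scan with a dry run, and only then runs it under a billing cap. The `licenses` table and its `license` column are assumptions about the dataset's schema, the helper names are made up for this example, and running the last two functions requires the `google-cloud-bigquery` package and valid credentials.

```python
DATASET = "bigquery-public-data.github_repos"


def table_ref(table_name: str) -> str:
    """Return a backtick-quoted, fully qualified reference to one table."""
    # Reject anything that is not a plain identifier before it reaches SQL.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"`{DATASET}.{table_name}`"


def license_count_query(limit: int = 10) -> str:
    """Build a query counting repositories per license.

    Assumption: a `licenses` table with a `license` column exists.
    """
    return (
        "SELECT license, COUNT(*) AS repo_count\n"
        f"FROM {table_ref('licenses')}\n"
        "GROUP BY license\n"
        "ORDER BY repo_count DESC\n"
        f"LIMIT {int(limit)}"
    )


def estimate_bytes(sql: str, client=None) -> int:
    """Dry-run the query and return the bytes it would scan, without running it."""
    from google.cloud import bigquery  # lazy import; needs google-cloud-bigquery

    client = client or bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    return client.query(sql, job_config=config).total_bytes_processed


def run_capped(sql: str, max_bytes: int, client=None) -> list:
    """Run the query, failing instead of billing more than max_bytes."""
    from google.cloud import bigquery  # lazy import; needs google-cloud-bigquery

    client = client or bigquery.Client()
    config = bigquery.QueryJobConfig(maximum_bytes_billed=max_bytes)
    return list(client.query(sql, job_config=config).result())
```

A typical flow would be to call `estimate_bytes(license_count_query())` first and pass a limit slightly above that estimate to `run_capped`, so an accidentally expensive query fails rather than scanning the whole dataset.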
To address the increasing complexity of network management and the limitations of data repositories in handling the various network operational data, this paper proposes a novel repository design that uniformly represents network operational data while allowing access to the information at multiple levels of abstraction. This smart repository simplifies network management functions by enabling network verification directly within the repository. The data is organized in a knowledge graph compatible with any general-purpose graph database, offering a comprehensive and extensible network repository. Performance evaluations confirm the feasibility of the proposed design. The repository's ability to natively support 'what-if' scenario evaluation is demonstrated by verifying Border Gateway Protocol (BGP) route policies and analyzing forwarding behavior with virtual Traceroute.
The Administrative Data Repository (ADR) was established to provide support for the administrative data elements relating to multiple categories of a person entity, such as demographic and eligibility information. Although initially focused on the computing needs of the Veterans Health Administration, the ADR is positioned to provide identity management and demographics support for all IT systems within the Department of Veterans Affairs.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset lists over 215k of the top GitHub projects by star count, each with more than 167 stars. It includes a range of attributes for every repository.
I collected this dataset using the GitHub Search API. The API returns only the first thousand results for a query, so I looped through low/high (stars) pairs that return fewer than a thousand repositories when query=stars:{low}..{high}.
The GitHub API Terms of Service apply.
You may not use this dataset for spamming purposes, including for the purposes of selling GitHub users' personal information, such as to recruiters, headhunters, and job boards.
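The low/high looping described above can be sketched as a bisection over star ranges. This is only an illustration of the idea, not the author's actual script: the function names are hypothetical, the counting function is injected so the splitting logic can be tried offline, and `github_count` assumes unauthenticated access to the public search endpoint, which is heavily rate limited.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, List, Tuple


def star_query(low: int, high: int) -> str:
    """Format the search qualifier for an inclusive star range."""
    return f"stars:{low}..{high}"


def github_count(query: str) -> int:
    """Ask the GitHub search endpoint how many repositories match a query."""
    url = ("https://api.github.com/search/repositories?per_page=1&q="
           + urllib.parse.quote(query))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["total_count"]


def partition_star_ranges(count_fn: Callable[[str], int], low: int, high: int,
                          cap: int = 1000) -> List[Tuple[int, int]]:
    """Split [low, high] into sub-ranges that each match fewer than cap repos."""
    ranges = []
    stack = [(low, high)]
    while stack:
        lo, hi = stack.pop()
        n = count_fn(star_query(lo, hi))
        # A single-star range cannot be split further, so it is kept even if
        # it exceeds the cap (its results would then be truncated).
        if n < cap or lo == hi:
            if n > 0:
                ranges.append((lo, hi))
            continue
        mid = (lo + hi) // 2
        stack.append((mid + 1, hi))  # pushed first so the lower half pops first
        stack.append((lo, mid))
    return ranges
```

Each returned pair can then be paged through in full, since every range stays under the thousand-result limit. Passing `github_count` as `count_fn` would issue one request per candidate range, so a real run would need authentication and pauses between calls.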
| Column name | Description |
|---|---|
| Name | The name of the GitHub repository |
| Description | A brief textual description that summarizes the purpose or focus of the repository |
| URL | The URL or web address that links to the GitHub repository, which is a unique identifier for the repository |
| Created At | The date and time when the repository was initially created on GitHub, in ISO 8601 format |
| Updated At | The date and time of the most recent update or modification to the repository, in ISO 8601 format |
| Homepage | The URL to the homepage or landing page associated with the repository, providing additional information or resources |
| Size | The size of the repository in bytes, indicating the total storage space used by the repository's files and data |
| Stars | The number of stars or likes that the repository has received from other GitHub users, indicating its popularity or interest |
| Forks | The number of times the repository has been forked by other GitHub users |
| Issues | The total number of open issues |
| Watchers | The number of GitHub users who are "watching" or monitoring the repository for updates and changes |
| Language | The primary programming language |
| License | Information about the software license using a license identifier |
| Topics | A list of topics or tags associated with the repository, helping users discover related projects and topics of interest |
| Has Issues | A boolean value indicating whether the repository has an issue tracker enabled |
| Has Projects | A boolean value indicating whether the repository uses GitHub Projects to manage and organize tasks and work items |
| Has Downloads | A boolean value indicating whether the repository offers downloadable files or assets to users |
| Has Wiki | A boolean value indicating whether the repository has an associated wiki with additional documentation and information |
| Has Pages | A boolean value indicating whether the repository has GitHub Pages enabled, allowing the creation of a website associated with the repository |
| Has Discussions | A boolean value indicating whether the repository has GitHub Discussions enabled, allowing community discussions and collaboration |
| Is Fork | A boolean value indicating whether the repository is a fork of another repository |
| Is Archived | A boolean value indicating whether the repository is archived. Archived repositories are typically read-only and are no longer actively maintained |
| Is Template | A boolean value indicating whether the repository is set up as a template |
| Default Branch | The name of the default branch |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our dataset "repository_survey" summarizes a comprehensive survey of over 150 data repositories, characterizing their metadata documentation and standardization, data curation and validation, and tracking of dataset use in the literature. In addition, "survey_model_evaluation" includes our findings on model evaluation for five methodological repositories. Column descriptions and further details can be found in "README.pdf." The data are associated with our paper "Towards an Ideal Methodological Data Repository: Lessons and Recommendations."
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Availability of data, code, and plot creation for various figures throughout my PhD thesis. Rough organisation currently. Pertains to Figures 5.4, 5.8, 6.11, 6.18, 7.3, 7.12, and Table 6.1.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These research datasets are the updated version of the conference poster "Research data repositories and their metadata: A comparative study," presented by Ms. Kavya Asok and Ms. Snigdha Dandpat at the Conference on Open and FAIR Data Ecosystem: Principles, Policies, and Platforms, scheduled for 11th-13th September 2023 at IIC, New Delhi. The study describes the features of a select number of RDRs and analyzes their metadata practices.
The goal of BioLINCC is to facilitate and coordinate the existing activities of the NHLBI Biorepository and the Data Repository and to expand their scope and usability to the scientific community through a single web-based user interface.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file collection is part of the ORD Landscape and Cost Analysis Project (DOI: 10.5281/zenodo.2643460), a study jointly commissioned by the SNSF and swissuniversities in 2018.
Please cite this data collection as: von der Heyde, M. (2019). Data from the International Open Data Repository Survey. Retrieved from https://doi.org/10.5281/zenodo.2643493
Further information is given in the corresponding data paper: von der Heyde, M. (2019). International Open Data Repository Survey: Description of collection, collected data, and analysis methods [Data paper]. Retrieved from https://doi.org/10.5281/zenodo.2643450
Contact
Swiss National Science Foundation (SNSF)
Open Research Data Group
E-mail: ord@snf.ch
swissuniversities
Program "Scientific Information"
Gabi Schneider
E-Mail: isci@swissuniversities.ch
Digital Repository for Open Access to University of Luxembourg publications.
ORBilu was officially launched on 22 April 2013. The acronym ORBi stands for “Open Repository and Bibliography”. It also echoes the Latin word “orbi” (“for the world”) and signals the will of the University to make its academic research available to everyone, without barriers, be they legal, financial or technical. By keeping the ORBi name and adding “lu”, the University of Luxembourg wants to show its appreciation for the work done by the University of Liège while clearly indicating that this is a version adapted to the UL context.
The API format is described at https://www.openarchives.org/pmh/.
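For readers unfamiliar with the OAI-PMH protocol referenced above, here is a minimal sketch of building a `ListRecords` request and reading Dublin Core titles from a response. The base URL is a placeholder, not the repository's real endpoint (which is not given here), and the helper names are made up for this example; only the verb, parameters, and XML namespaces come from the OAI-PMH specification.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Placeholder endpoint: substitute the repository's actual OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL.

    Per the protocol, a resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent alongside it.
    """
    params = {"verb": "ListRecords"}
    if resumption_token:
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
    return f"{base_url}?{urllib.parse.urlencode(params)}"


def parse_titles(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return (titles, next resumption token or None) from a response body."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iter(f"{DC_NS}title") if el.text]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token
```

A harvesting loop would fetch `list_records_url(BASE_URL)`, parse the body, and keep requesting with the returned token until `parse_titles` reports `None`.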
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file collection is part of the ORD Landscape and Cost Analysis Project (DOI: 10.5281/zenodo.2643460), a study jointly commissioned by the SNSF and swissuniversities in 2018.
Please cite this data collection as: von der Heyde, M. (2019). Data from the Swiss Open Data Repository Landscape survey. Retrieved from https://doi.org/10.5281/zenodo.2643487
Further information is given in the corresponding data paper: von der Heyde, M. (2019). Open Data Landscape: Repository Usage of the Swiss Research Community: Description of collection, collected data, and analysis methods [Data paper]. Retrieved from https://doi.org/10.5281/zenodo.2643430
Contact
Swiss National Science Foundation (SNSF)
Open Research Data Group
E-mail: ord@snf.ch
swissuniversities
Program "Scientific Information"
Gabi Schneider
E-Mail: isci@swissuniversities.ch
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This public data repository (https://public.spider.surfsara.nl/project/lidarac/MAMBO/) provides the LiDAR point cloud datasets which were clipped using the boundary polygons (shapefiles) of the MAMBO demonstration sites. The raw LiDAR point cloud tiles were first downloaded from the national repository in the respective country based on the approximate location of each demonstration site. The data repository uses the storage services from the Dutch IT infrastructure SURF (https://www.surf.nl/en). The code for downloading, clipping and uploading the LiDAR point cloud datasets is available on GitHub (https://github.com/Jinhu-Wang/Retile_Clip_LAZ).
https://dataintelo.com/privacy-and-policy
The global clinical trial data repository market size was estimated to be approximately $1.8 billion in 2023 and is projected to grow at a compound annual growth rate (CAGR) of 9.5% to reach around $4.1 billion by 2032. The primary growth factors include the increasing volume and complexity of clinical trials, rising need for efficient data management systems, and stringent regulatory requirements for data accuracy and integrity. The advent of advanced technologies such as artificial intelligence and big data analytics further drives market expansion by enhancing data processing capabilities and providing actionable insights.
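The growth figures above can be sanity-checked with the standard compound-growth formula. The sketch below assumes 2023 is the base year, giving nine compounding periods to 2032; that period count is an assumption, since the forecast window is not spelled out.

```python
def project(start_value: float, cagr: float, years: int) -> float:
    """Value after compounding start_value at rate cagr for the given years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Constant annual rate that takes start_value to end_value over years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in billions of US dollars.
START_2023 = 1.8
END_2032 = 4.1
YEARS = 2032 - 2023  # assumption: 2023 is the base year, so 9 periods

projected = project(START_2023, 0.095, YEARS)
rate = implied_cagr(START_2023, END_2032, YEARS)
print(f"projected 2032 value: {projected:.2f} billion")
print(f"implied CAGR: {rate:.2%}")
```

Under that assumption, the 9.5% rate lands close to the quoted 2032 figure, so the stated numbers are consistent with one another to within rounding.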
The growth of the clinical trial data repository market is significantly influenced by the increasing number of clinical trials being conducted globally. With the rise in chronic diseases, the need for innovative treatments and therapies has surged, leading to an upsurge in clinical trials. This increase in clinical trials necessitates robust data management systems to handle vast amounts of data generated, thereby propelling the demand for clinical trial data repositories. Moreover, the complexity of modern clinical trials, which often involve multiple sites and diverse patient populations, further amplifies the need for sophisticated data management solutions.
Another critical driver for the market is the stringent regulatory landscape governing clinical trial data. Regulatory bodies such as the FDA, EMA, and other local authorities mandate rigorous data management standards to ensure data integrity, accuracy, and accessibility. These regulations necessitate the adoption of advanced data repository systems that can comply with regulatory requirements, thereby fueling market growth. Additionally, regulatory frameworks are becoming increasingly stringent, prompting pharmaceutical and biotechnology companies to invest in state-of-the-art data management systems to avoid compliance issues and potential financial penalties.
Technological advancements play a pivotal role in the market's growth. The integration of artificial intelligence, machine learning, and big data analytics into data repository systems enhances data processing and analysis capabilities. These technologies enable real-time data monitoring, predictive analytics, and improved decision-making, thereby improving the efficiency of clinical trials. Furthermore, the shift towards cloud-based solutions offers scalability, flexibility, and cost-effectiveness, making advanced data management systems accessible to even small and medium-sized enterprises.
Regionally, North America dominates the clinical trial data repository market owing to its robust healthcare infrastructure, high R&D investments, and presence of major pharmaceutical and biotechnology companies. Europe follows closely due to stringent regulatory standards and a strong focus on clinical research. The Asia Pacific region is expected to witness the highest growth rate during the forecast period due to increasing clinical trial activities, growing healthcare expenditure, and the rising adoption of advanced technologies. Latin America and the Middle East & Africa are also likely to experience growth, albeit at a slower pace, driven by improving healthcare systems and increasing focus on clinical research.
The clinical trial data repository market is segmented by components into software and services. The software segment is anticipated to hold a significant share of the market due to the essential role software plays in data management. Advanced software solutions offer capabilities such as data storage, management, retrieval, and analysis, which are critical for effective clinical trial management. The integration of AI and machine learning algorithms into these software systems further enhances their efficiency by enabling predictive analytics and real-time monitoring, thus driving the software segment's growth.
Software solutions in clinical trial data repositories also offer interoperability, enabling seamless integration with other clinical trial management systems (CTMS) and electronic data capture (EDC) systems. This interoperability is crucial for ensuring data consistency and accuracy across different platforms, thereby enhancing overall data management. Additionally, the increasing adoption of cloud-based software solutions provides scalability, cost-effectiveness, and remote acce
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This poster is for the RDA P19 poster exhibition.
Although Taiwan does not yet have holistic national-level support for the pursuit of excellence in research data management, a culture of research data management is starting to take shape. As a research data repository operating in Taiwan, we report in this poster on our work to help advance good research data practices in Taiwan.
The depositar is a general-purpose data repository open to all for the deposit, discovery, and reuse of research data. It has been in service since early 2018. Its development has been supported by Academia Sinica and, in part, by a grant from Taiwan’s Ministry of Science and Technology. In addition to developing and operating the repository, since early 2019 the depositar team has been active in advocating good research data practices in Taiwan. From the perspective of depositar, researchers in Taiwan will be more likely to share data—hence to deposit data to depositar or to any other data repositories—when their data is well managed and in a state ready to be reused and shared. The funding we receive from the Ministry of Science and Technology also has a focus on facilitating better research data management in Taiwan (though initially only applied to grants awarded in the area of sustainable development research).
For the last few years, the depositar team has been working to cultivate a culture of research data management in Taiwan. We hold co-learning workshops where domain experts share their practices in managing research data. We work closely with several research projects about implementing data management plans. Above all, we strive to produce and make available guidelines and toolkits on research data management and on using research data repositories. At the same time we constantly improve the functionalities of depositar in response to the feedback we received from our users and from the above activities.
This poster will report on these activities and the lessons we have learned. We will also reflect on the strategy aspects of advocating for good research data practices, especially in the settings of limited resources and/or missing policies.
The VHA Data Sharing Agreement Repository serves as a centralized location to collect and report on agreements that share VHA data with entities outside of VA. It provides senior management an overall view of existing data sharing agreements; fosters productive sharing of health information with VHA's external partners; and streamlines data acquisition to improve data management responsibilities overall. Agreements that VHA has established with entities within the VA are not candidates for this Repository.
This project has built a repository of items (www.esmitemrepository.com) used in experience sampling method (ESM), ecological momentary assessment (EMA) and ambulatory assessment (AA) studies. The idea for this repository arose out of discussions during the Open Science hackathon at the 2018 Belgian-Dutch ESM Network Meeting.
In order to contribute items to the repository, you will need to download all five documents in the Contributors' Pack. When you have downloaded the ESM Item Repository submission template (spreadsheet) document, you can enter your items into it and then send it back to us via email (submissions [at] esmitemrepository.com). We will then collate all the submitted items into a repository and publish them here.
If you would like to browse the full repository and download items and their information, visit www.esmitemrepository.com.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Datasets as well as R and Python code of the empirical examples in the book "Causal Analysis" by Martin Huber (2023), published by MIT Press.