Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the age of digital transformation, scientific and social interest in data and data products is constantly on the rise. The quantity and variety of digital research data are increasing significantly. This raises the question of how such data should be governed: for example, how to store the data so that it is presented transparently, is freely accessible, and remains available for re-use in the context of good scientific practice. Research data repositories provide solutions to these issues.
Considering the variety of repository software, it is sometimes difficult to identify a fitting solution for a specific use case. This requires a detailed analysis of existing software. The table of requirements presented here can serve as a starting point and decision-making guide for choosing the repository software best suited to your purposes. The table is supplementary material for the paper "How to choose a research data repository software? Experience report." (a persistent identifier for the paper will be added as soon as the paper is published).

Data and variable key for Dunham, Dotsch, Clark, & Stepanova, "The development of White-Asian categorization: Contributions from skin color and other physiognomic cues"

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Bear Lake Data Repository (BLDR) is an active archive containing a growing compilation of biological, chemical, and physical datasets collected from Bear Lake and its surrounding watershed. The datasets have been digitized from historical records and reports, extracted from papers and theses, and obtained from public and private entities including, among others, the United States Geological Survey, PacifiCorp, and Ecosystems Research Institute.
Contributions are welcome. The BLDR accepts biological, chemical, or physical datasets obtained at Bear Lake, irrespective of funding source. There is no submission size limit at present; workarounds will be found if submissions exceed Hydroshare limits (20 GB). Contributions are published with an open access license and will serve many use cases. The current repository steward, Bear Lake Watch, will advise on submissions and make accepted contributions available promptly.
Metadata files are provided for each dataset; however, users are encouraged to contact the original contributor(s) with questions and for additional details before using the data. The BLDR and its contributors shall not be liable for any damages resulting from misinterpretation or misuse of the data or metadata.

The NIH Common Data Elements (CDE) Repository has been designed to provide access to structured human- and machine-readable definitions of data elements that have been recommended or required by NIH Institutes and Centers and other organizations for use in research and for other purposes. Visit the NIH CDE Resource Portal for contextual information about the repository.

https://borealisdata.ca/api/datasets/:persistentId/versions/4.0/customlicense?persistentId=doi:10.5683/SP3/UPABVH
Data collected from major Canadian and international research data repositories cover data storage, preservation, metadata, interchange, data file types, and other standard features used in the retention and sharing of research data. The outputs of this project primarily aim to assist in the establishment of recommended minimum requirements for a Canadian research data infrastructure. The committee also aims to further develop guidelines and criteria for the assessment and selection of repositories for the deposit of Canadian research data by researchers, data managers, librarians, archivists, etc.

https://creativecommons.org/share-your-work/public-domain/pdm
This collection consists of geospatial data layers and summary data at the country and country sub-division levels that are part of USAID's Demographic Health Survey Spatial Data Repository. This collection includes geographically-linked health and demographic data from the DHS Program and the U.S. Census Bureau for mapping in a geographic information system (GIS). The data includes indicators related to: fertility, family planning, maternal and child health, gender, HIV/AIDS, literacy, malaria, nutrition, and sanitation. Each set of files is associated with a specific health survey for a given year for over 90 different countries that were part of the following surveys:
- Demographic Health Survey (DHS)
- Malaria Indicator Survey (MIS)
- Service Provisions Assessment (SPA)
- Other qualitative surveys (OTH)
Individual files are named with identifiers that indicate: country, survey year, survey, and in some cases the name of a variable or indicator. A list of the two-letter country codes is included in a CSV file.
Datasets are subdivided into the following folders:
- Survey boundaries: polygon shapefiles of administrative subdivision boundaries for countries used in specific surveys.
- Indicator data: polygon shapefiles and geodatabases of countries and subdivisions with 25 of the most common health indicators collected in the DHS. Estimates generated from survey data.
- Modeled surfaces: geospatial raster files that represent gridded population and health indicators generated from survey data, for several countries.
- Geospatial covariates: CSV files that link survey cluster locations to ancillary data (known as covariates) that contain data on topics including population, climate, and environmental factors.
- Population estimates: spreadsheets and polygon shapefiles for countries and subdivisions with 5-year age/sex group population estimates and projections for 2000-2020 from the US Census Bureau, for designated countries in the PEPFAR program.
- Workshop materials: a tutorial with sample data for learning how to map health data using DHS SDR datasets with QGIS.
Documentation that is specific to each dataset is included in the subfolders, and a methodological summary for all of the datasets is included in the root folder as an HTML file. File-level metadata is available for most files.
Countries for which data are included in the repository: Afghanistan, Albania, Angola, Armenia, Azerbaijan, Bangladesh, Benin, Bolivia, Botswana, Brazil, Burkina Faso, Burundi, Cape Verde, Cambodia, Cameroon, Central African Republic, Chad, Colombia, Comoros, Congo, Congo (Democratic Republic of the), Cote d'Ivoire, Dominican Republic, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Eswatini (Swaziland), Ethiopia, Gabon, Gambia, Ghana, Guatemala, Guinea, Guyana, Haiti, Honduras, India, Indonesia, Jordan, Kazakhstan, Kenya, Kyrgyzstan, Lesotho, Liberia, Madagascar, Malawi, Maldives, Mali, Mauritania, Mexico, Moldova, Morocco, Mozambique, Myanmar, Namibia, Nepal, Nicaragua, Niger, Nigeria, Pakistan, Papua New Guinea, Paraguay, Peru, Philippines, Russia, Rwanda, Samoa, Sao Tome and Principe, Senegal, Sierra Leone, South Africa, Sri Lanka, Sudan, Tajikistan, Tanzania, Thailand, Timor-Leste, Togo, Trinidad and Tobago, Tunisia, Turkey, Turkmenistan, Uganda, Ukraine, Uzbekistan, Viet Nam, Yemen, Zambia, Zimbabwe

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Status of the repository https://github.com/building-envelope-data/database , branch develop, on February 20, 2025.

The Administrative Data Repository (ADR) was established to provide support for the administrative data elements relative to multiple categories of a person entity such as demographic and eligibility information. Although initially focused on the computing needs of the Veterans Health Administration, the ADR is positioned to provide identity management and demographics support for all IT systems within the Department of Veterans Affairs.

https://dataintelo.com/privacy-and-policy
The global clinical trial data repository market size was estimated to be approximately $1.8 billion in 2023 and is projected to grow at a compound annual growth rate (CAGR) of 9.5% to reach around $4.1 billion by 2032. The primary growth factors include the increasing volume and complexity of clinical trials, rising need for efficient data management systems, and stringent regulatory requirements for data accuracy and integrity. The advent of advanced technologies such as artificial intelligence and big data analytics further drives market expansion by enhancing data processing capabilities and providing actionable insights.
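The headline figures can be sanity-checked with ordinary compound-growth arithmetic. The sketch below is only an illustration: it assumes nine compounding periods (2023 to 2032), which the text does not state explicitly, and the function names are hypothetical. Under that assumption, $1.8 billion growing at 9.5% per year comes out at roughly $4.07 billion, which is consistent with the "around $4.1 billion" quoted above.

```python
def project_value(start: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return start * (1.0 + annual_rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` over `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# Assumption: 2023 is the base year, so 2032 is nine compounding periods later.
YEARS = 2032 - 2023

# Figures quoted in the text, in billions of US dollars.
projected_2032 = project_value(1.8, 0.095, YEARS)
rate_check = implied_cagr(1.8, 4.1, YEARS)

print(f"Projected 2032 market size: {projected_2032:.2f} billion USD")
print(f"CAGR implied by 1.8 -> 4.1 over {YEARS} years: {rate_check:.2%}")
```

Running the calculation in both directions shows how sensitive the end value is to rounding in the quoted rate.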
The growth of the clinical trial data repository market is significantly influenced by the increasing number of clinical trials being conducted globally. With the rise in chronic diseases, the need for innovative treatments and therapies has surged, leading to an upsurge in clinical trials. This increase in clinical trials necessitates robust data management systems to handle vast amounts of data generated, thereby propelling the demand for clinical trial data repositories. Moreover, the complexity of modern clinical trials, which often involve multiple sites and diverse patient populations, further amplifies the need for sophisticated data management solutions.
Another critical driver for the market is the stringent regulatory landscape governing clinical trial data. Regulatory bodies such as the FDA, EMA, and other local authorities mandate rigorous data management standards to ensure data integrity, accuracy, and accessibility. These regulations necessitate the adoption of advanced data repository systems that can comply with regulatory requirements, thereby fueling market growth. Additionally, regulatory frameworks are becoming increasingly stringent, prompting pharmaceutical and biotechnology companies to invest in state-of-the-art data management systems to avoid compliance issues and potential financial penalties.
Technological advancements play a pivotal role in the market's growth. The integration of artificial intelligence, machine learning, and big data analytics into data repository systems enhances data processing and analysis capabilities. These technologies enable real-time data monitoring, predictive analytics, and improved decision-making, thereby improving the efficiency of clinical trials. Furthermore, the shift towards cloud-based solutions offers scalability, flexibility, and cost-effectiveness, making advanced data management systems accessible to even small and medium-sized enterprises.
Regionally, North America dominates the clinical trial data repository market owing to its robust healthcare infrastructure, high R&D investments, and presence of major pharmaceutical and biotechnology companies. Europe follows closely due to stringent regulatory standards and a strong focus on clinical research. The Asia Pacific region is expected to witness the highest growth rate during the forecast period due to increasing clinical trial activities, growing healthcare expenditure, and the rising adoption of advanced technologies. Latin America and the Middle East & Africa are also likely to experience growth, albeit at a slower pace, driven by improving healthcare systems and increasing focus on clinical research.
The clinical trial data repository market is segmented by components into software and services. The software segment is anticipated to hold a significant share of the market due to the essential role software plays in data management. Advanced software solutions offer capabilities such as data storage, management, retrieval, and analysis, which are critical for effective clinical trial management. The integration of AI and machine learning algorithms into these software systems further enhances their efficiency by enabling predictive analytics and real-time monitoring, thus driving the software segment's growth.
Software solutions in clinical trial data repositories also offer interoperability, enabling seamless integration with other clinical trial management systems (CTMS) and electronic data capture (EDC) systems. This interoperability is crucial for ensuring data consistency and accuracy across different platforms, thereby enhancing overall data management. Additionally, the increasing adoption of cloud-based software solutions provides scalability, cost-effectiveness, and remote acce

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Availability of data, code, and plot creation for various figures throughout my PhD thesis. Rough organisation currently. Pertains to Figures 5.4, 5.8, 6.11, 6.18, 7.3, 7.12, and Table 6.1.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This database represents a survey of open-source, version-controlled repositories, with the purpose of exploring best practice adoption and the extent of development centered around the University of Wisconsin. Data was obtained through the use of the xDD and GitHub APIs, with code publicly available at https://github.com/UW-Madison-DSI/OSPO_Data_Management.
This release represents an initial data compilation.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PATRON is a human ethics approved program of research incorporating an enduring de-identified repository of Primary Care data facilitating research and knowledge generation. PATRON is a part of the 'Data for Decisions' initiative of the Department of General Practice, University of Melbourne. 'Data for Decisions' is a research initiative in partnership with general practices. It is an exciting undertaking that makes possible primary care research projects to increase knowledge and improve healthcare practices and policy.
Principal Researcher: Jon Emery
Data Custodian: Lena Sanci
Data Steward: Douglas Boyle
Manager: Rachel Canaway
More information about Data for Decisions and utilising PATRON data is available from the Data for Decisions website.

A database of flow cytometry experiments where users can query and download data collected and annotated according to the MIFlowCyt data standard.

To address the increasing complexity of network management and the limitations of data repositories in handling the various kinds of network operational data, this paper proposes a novel repository design that uniformly represents network operational data while allowing access to the information at multiple levels of abstraction. This smart repository simplifies network management functions by enabling network verification directly within the repository. The data is organized in a knowledge graph compatible with any general-purpose graph database, offering a comprehensive and extensible network repository. Performance evaluations confirm the feasibility of the proposed design. The repository's ability to natively support 'what-if' scenario evaluation is demonstrated by verifying Border Gateway Protocol (BGP) route policies and analyzing forwarding behavior with virtual Traceroute.

A database which contains longitudinal structural MRIs, spectroscopy, DTI and correlated clinical/behavioral data from approximately 500 healthy, normally developing children, ages newborn to young adult.

Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and a baseline for future studies of ag research data.
Purpose
As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data-sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multidisciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:
Approach
The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analyzed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.
Search Methods
- We first compiled a list of known domain-specific USDA/ARS datasets/databases represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN.
- We then searched using search engines such as Bing and Google for non-USDA/federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal/USDA results). Most of these results were domain-specific, though some contained a mix of data subjects.
- We searched using search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university website to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search.
- We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota).
- We then split out NCBI (National Center for Biotechnology Information) repositories.
- Next, we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo.
- Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals were compi...

These data contain the CONSORT flow chart and checklist, with references to article page numbers.

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset aggregates information about 191 research data repositories that were shut down. The data collection was based on the registry of research data repositories re3data and a comprehensive content analysis of repository websites and related materials. Documented in the dataset are the period in which a repository was active, the risks resulting in its shutdown, and the repositories that subsequently took over custody of the data.

Five (5) year PRDR and regional energy database development and implementation strategy and actions. The strategy was compiled by Dr Herbert Wade, via technical assistance from the World Bank to SPC. It is in draft format, circulated here for your review and comments.
Also attached in this record are references used in the compilation of this strategy:
SPC - A Pacific Island Region Plan for the Implementation of Initiatives for Strengthening Statistical Services through Regional Approaches 2010 - 2020
Pacific Statistics Strategy Action Plan Phase 1 (2011 - 2014) - Activities & Budget
44th Pacific Islands Forum in Majuro, RMI 2013
ADB Statistics and Databases
Pacific Regionalism Factsheet
SPC Circular to Energy Ministers, 2015
ECOWAS Database User Guide
Pacific Energy Ministers Resolutions, 2014
Guidelines for 2013 United Nations Statistics Division Annual Questionnaire on Energy Statistics
IEA Member Countries
IRENA Renewable Energy Statistics Activities
Asia Pacific Energy Portal presentation
Open data essentials
The Framework for Pacific Regionalism
Meeting Outcomes of the 5th Meeting of the PEAG, 2014
PRDR Declaration
Trust Fund for Statistics Capacity Building - Guidelines and Procedures
Trust Fund for Statistics Capacity Building - Template
International Recommendations for Energy Statistics (IRES)

Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship adjusted hm-index, citations to papers in different authorship positions and a composite indicator (c-score). Separate data are shown for career-long impact and for single recent year impact. Metrics with and without self-citations and ratio of citations to citing papers are given, and data on retracted papers (based on the Retraction Watch database) as well as citations to/from retracted papers have been added in the most recent iteration. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2023 and single recent year data pertain to citations received during calendar year 2023. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (7) is based on the August 1, 2024 snapshot from Scopus, updated to end of citation year 2023. This work uses Scopus data. Calculations were performed using all Scopus author profiles as of August 1, 2024. If an author is not on the list it is simply because the composite indicator value was not high enough to appear on the list. It does not mean that the author does not do good work. PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation. We thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US.
They should be sent directly to Scopus, preferably by use of the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/) so that the correct data can be used in any future annual updates of the citation indicator databases. The c-score focuses on impact (citations) rather than productivity (number of publications) and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, see attached file on FREQUENTLY ASKED QUESTIONS. Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden manifesto: https://www.nature.com/articles/520429a
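Two of the metrics named above, the h-index and the co-authorship adjusted hm-index, can be illustrated with a short sketch. This is not the database's own code: the function names and sample data are hypothetical, and the hm-index follows the commonly cited fractional-counting definition (each paper adds 1/number-of-authors to an effective rank), which is an assumption here because the description does not spell out the formula. The composite c-score is not reproduced.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def hm_index(papers):
    """Co-authorship adjusted index (assumed fractional-counting definition).

    `papers` is a list of (citations, n_authors) tuples. Each paper advances
    an effective rank by 1 / n_authors; the result is the largest effective
    rank that is still matched by that paper's citation count.
    """
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    effective_rank = 0.0
    hm = 0.0
    for cites, n_authors in ranked:
        effective_rank += 1.0 / n_authors
        if cites >= effective_rank:
            hm = effective_rank
        else:
            break
    return hm


# Hypothetical publication record: (citations, number of authors).
record = [(10, 1), (8, 2), (5, 5), (4, 1), (3, 2)]

print("h-index:", h_index([cites for cites, _ in record]))
print("hm-index:", round(hm_index(record), 2))
```

For this made-up record the h-index is 4 and the hm-index is about 2.7, showing how heavily co-authored papers count for less under the adjusted metric. Papers with tied citation counts may be ordered differently by other implementations, which can shift the hm-index slightly.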