This data provides results from chemistry and field analyses, from the California Environmental Data Exchange Network (CEDEN). The data set contains two provisionally assigned values (“DataQuality” and “DataQualityIndicator”) to help users interpret the data quality metadata provided with the associated result.
Due to file size limitations, the data has been split into individual resources by year. The entire dataset can also be downloaded in bulk using the zip files on this page (in csv format or parquet format), and developers can also use the API associated with each year's dataset to access the data. Example R code using the API to access data across all years can be found here.
Users who want to manually download more specific subsets of the data can also use the CEDEN query tool, at: https://ceden.waterboards.ca.gov/AdvancedQueryTool
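Because the data is split into one resource per year, working across years means combining the yearly files. A minimal sketch of that step, assuming each yearly resource has been downloaded as a CSV: only the "DataQuality" column name comes from the description above; the other column names, the sample rows, and the values "Passed" and "Unknown" are hypothetical stand-ins.

```python
import csv
import io
from collections import Counter


def combine_years(sources):
    """Merge per-year CSV sources into one list of rows, tagging each row with its year.

    `sources` maps a year to an open text file (or any iterable of CSV lines).
    """
    rows = []
    for year in sorted(sources):
        for row in csv.DictReader(sources[year]):
            row["SourceYear"] = year  # hypothetical helper column, not in the real data
            rows.append(row)
    return rows


# In-memory stand-ins for two yearly CSV downloads (made-up rows and values).
sample_2020 = io.StringIO("Analyte,Result,DataQuality\nNitrate,1.2,Passed\n")
sample_2021 = io.StringIO(
    "Analyte,Result,DataQuality\nNitrate,0.9,Passed\nCopper,3.4,Unknown\n"
)

combined = combine_years({2021: sample_2021, 2020: sample_2020})

# Count rows per data quality category across all years.
quality_counts = Counter(row["DataQuality"] for row in combined)
print(quality_counts)
```

With real downloads, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`.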
Groundwater quality data and related groundwater well information available on the page was queried from the GAMA Groundwater information system (**[GAMA GIS](https://gamagroundwater.waterboards.ca.gov/gama/datadownload)**). Data provided represent a collection of groundwater quality results from various federal, state, and local groundwater sources. Results have been filtered to only represent untreated sampling results for the purpose of characterizing ambient conditions. Data have been standardized across multiple data sets, including chemical names and units. Standardization has not been performed for the chemical result modifier and some other fields (although we are currently working to standardize most fields). Only chemicals that have been standardized are included in the data sets; other chemicals may have been analyzed for but are not included in GAMA downloads. Groundwater samples have been collected from well types including domestic, irrigation, monitoring, and municipal. Wells that cannot accurately be attributed to a category are labeled as "water supply, other". For additional information regarding the GAMA GIS data system, please reference our **[factsheet](https://www.waterboards.ca.gov/publications_forms/publications/factsheets/docs/gama_gis_factsheet.pdf)**.
This data package contains datasets on clinical trials conducted in the United States. Diseases include cervical cancer, diabetes, and acute respiratory infection, as well as stress. This data package also includes a clinical trials registry and results database.
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
The Pacific Community Results Report highlights the results achieved by SPC with our 26 Member countries and territories, and development partners. This dataset provides the data used in the Results Report provided in Excel and CSV formats.
This data has been visualised in the Results Explorer Dashboard: https://pacificdata.org/results-explorer
A Data Dictionary for the TSS Individual Reports with Comments reports.
Supplementary data from carbon-14 analysis of Oldbury reactor graphite
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
We studied and compared three automated FAIRness evaluation tools, namely F-UJI, the FAIR Evaluator, and FAIR Checker, examining three aspects: 1) tool characteristics, 2) the evaluation metrics, and 3) metrics tests for three public datasets. We find significant differences in the evaluation results for tested resources, along with differences in the design, implementation, and documentation of the evaluation metrics and platforms.
This data contains the comparison results we summarized from the study. All results are reported in our manuscript, for which this data is the supplementary material.
Attribution 1.0 (CC BY 1.0) https://creativecommons.org/licenses/by/1.0/
The dataset contains all the data produced by running the research software for the study: "Open Science for Social Sciences and Humanities: Open Access availability and distribution across disciplines and Countries in OpenCitations Meta".
Disclaimer: these results are not considered to be representative, because we have found that Mega Journals significantly skewed some of the data. The result datasets without Mega Journals are published here.
Description of datasets:
SSH_Publications_in_OC_Meta_and_Open_Access_status.csv: contains information about OpenCitations Meta coverage of ERIH PLUS Journals as well as their Open Access availability. In this dataset, every row holds data for an ERIH PLUS Journal that is also covered by the OpenCitations Meta database. It is structured with the following columns: "EP_id", the internal ERIH PLUS identifier; "Publications_in_venue", the number of Publications counted in each venue; "OC_omid", the internal OpenCitations Meta identifier for the venue; "issn", the ISSN values associated with the venue; "Open Access", a value indicating whether the journal is OA, either "True" or "Unknown".
SSH_Publications_by_Discipline.csv: contains information about the number of publications per discipline (the number of journals per discipline is also included). The dataset has three columns: the first, labeled "Discipline", contains single disciplines of the ERIH classification; the second and the third, labeled "Journal_count" and "Publication_count", contain respectively the number of Journals and the number of Publications counted for each discipline.
SSH_Publications_and_Journals_by_Country: contains information about the number of publications and journals per country. The dataset has three columns: the first, labeled "Country", contains single countries of the ERIH classification; the second and the third, labeled "Journal_count" and "Publication_count", contain respectively the number of Journals and the number of Publications counted for each country.
result_disciplines.json: the dictionary containing all disciplines as key and a list of related ERIH PLUS venue identifiers as value.
result_countries.json: the dictionary containing all countries as key and a list of related ERIH PLUS venue identifiers as value.
duplicate_omids.csv: a dataset containing the duplicated Journal entries in OpenCitations Meta, structured with two columns: "OC_omid", the internal OC Meta identifier; "issn", the ISSN values associated with that identifier.
eu_data.csv: contains the data specific to European countries' SSH Journals covered in OCMeta. It is structured with the following columns: "EP_id", the internal ERIH PLUS identifier; "Publications_in_venue", the number of Publications counted in each venue; "Original_Title", "Country_of_Publication", "ERIH_PLUS_Disciplines", "disc_count", the number of disciplines per Journal.
eu_disciplines_count.csv: contains information about the number of publications per discipline and the number of journals per discipline for European countries. The dataset has three columns: the first, labeled "Discipline", contains single disciplines of the ERIH classification; the second and the third, labeled "Journal_count" and "Publication_count", contain respectively the number of Journals and the number of Publications counted for each discipline.
meta_coverage_eu.csv: contains the data specific to European countries' SSH Journals covered in OCMeta. It is structured with the following columns: "EP_id", the internal ERIH PLUS identifier; "Publications_in_venue", the number of Publications counted in each venue; "OC_omid", the internal OpenCitations Meta identifier for the venue; "issn", the ISSN values associated with the venue; "Open Access", a value indicating whether the journal is OA, either "True" or "Unknown".
us_data.csv: contains the data specific to the United States' SSH Journals covered in OCMeta. It is structured with the following columns: "EP_id", the internal ERIH PLUS identifier; "Publications_in_venue", the number of Publications counted in each venue; "Original_Title", "Country_of_Publication", "ERIH_PLUS_Disciplines", "disc_count", the number of disciplines per Journal.
us_disciplines_count.csv: contains information about the number of publications per discipline and the number of journals per discipline for the United States. The dataset has three columns: the first, labeled "Discipline", contains single disciplines of the ERIH classification; the second and the third, labeled "Journal_count" and "Publication_count", contain respectively the number of Journals and the number of Publications counted for each discipline.
meta_coverage_us.csv: contains the data specific to the United States' SSH Journals covered in OCMeta. It is structured with the following columns: "EP_id", the internal ERIH PLUS identifier; "Publications_in_venue", the number of Publications counted in each venue; "OC_omid", the internal OpenCitations Meta identifier for the venue; "issn", the ISSN values associated with the venue; "Open Access", a value indicating whether the journal is OA, either "True" or "Unknown".
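The column layouts described above lend themselves to simple summaries. The following is a minimal sketch, not the study's own software: it computes the share of journals marked "True" in the "Open Access" column of a coverage file and counts venues per key in a dictionary shaped like result_disciplines.json. The sample rows, identifier formats, and function names are hypothetical; only the column names and the "True"/"Unknown" values come from the descriptions.

```python
import csv
import io
import json


def open_access_share(rows):
    """Return the fraction of rows whose "Open Access" value is "True"."""
    rows = list(rows)
    if not rows:
        return 0.0
    open_access = sum(1 for row in rows if row["Open Access"] == "True")
    return open_access / len(rows)


def venues_per_key(mapping):
    """Count distinct venue identifiers for each key (discipline or country)."""
    return {key: len(set(ids)) for key, ids in mapping.items()}


# Made-up rows using the column names of the coverage CSV files.
sample_csv = io.StringIO(
    "EP_id,Publications_in_venue,OC_omid,issn,Open Access\n"
    "1,120,omid-a,1111-1111,True\n"
    "2,45,omid-b,2222-2222,Unknown\n"
    "3,300,omid-c,3333-3333,Unknown\n"
    "4,10,omid-d,4444-4444,True\n"
)
rows = list(csv.DictReader(sample_csv))
share = open_access_share(rows)
total_publications = sum(int(row["Publications_in_venue"]) for row in rows)

# Made-up dictionary in the shape described for result_disciplines.json.
sample_json = '{"Psychology": [1, 3], "History": [2, 3, 4]}'
counts = venues_per_key(json.loads(sample_json))

print(share, total_publications, counts)
```

For the real files, `csv.DictReader(open(path, newline=""))` and `json.load(open(path))` would replace the in-memory samples.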
Abstract of the research:
Purpose: this study aims to investigate the representation and distribution of Social Science and Humanities (SSH) journals within the OpenCitations Meta database, with a particular emphasis on their Open Access (OA) status, as well as their spread across different disciplines and countries. The underlying premise is that open infrastructures play a pivotal role in promoting transparency, reproducibility, and trust in scientific research. Study Design and Methodology: the study is grounded on the premise that open infrastructures are crucial for ensuring transparency, reproducibility, and fostering trust in scientific research. The research methodology involved the use of secondary data sources, namely the OpenCitations Meta database, the ERIH PLUS bibliographic index, and the DOAJ index. A custom research software was developed in Python to facilitate the processing and analysis of the data. Findings: the results reveal that 78.1% of SSH journals listed in the European Reference Index for the Humanities (ERIH-PLUS) are included in the OpenCitations Meta database. The discipline of Psychology has the highest number of publications. The United States and the United Kingdom are the leading contributors in terms of the number of publications. However, the study also uncovers that only 38% of the SSH journals in the OpenCitations Meta database are OA. Originality: this research adds to the existing body of knowledge by providing insights into the representation of SSH in open bibliographic databases and the role of open access in this domain. The study highlights the necessity for advocating OA practices within SSH and the significance of open data for bibliometric studies. It further encourages additional research into the impact of OA on various facets of citation patterns and the factors leading to disparity across disciplinary representation.
Related resources:
Ghasempouri S., Ghiotto M., & Giacomini S. (2023). Open Science for Social Sciences and Humanities: Open Access availability and distribution across disciplines and Countries in OpenCitations Meta - RESEARCH ARTICLE. https://doi.org/10.5281/zenodo.8263908
Ghasempouri, S., Ghiotto, M., Giacomini, S., (2023). Open Science for Social Sciences and Humanities: Open Access availability and distribution across disciplines and Countries in OpenCitations Meta - DATA MANAGEMENT PLAN (Version 4). Zenodo. https://doi.org/10.5281/zenodo.8174644
Ghasempouri, S., Ghiotto, M., Giacomini, S. (2023e). Open Science for Social Sciences and Humanities: Open Access availability and distribution across disciplines and Countries in OpenCitations Meta - PROTOCOL. V.5. (https://dx.doi.org/10.17504/protocols.io.5jyl8jo1rg2w/v5)
This report includes results for the New York State Math exams for the years 2013-2023. For the results for the New York State Math exams for the years 2006-2012, please follow this link.
The most recent school level results for New York City on AP exams. Results are available at the school level for the school year. Records contain AP Test Taking and Passing Rates.
The INTEGRAL Public Data Results Catalog is based on publicly available data from the two main instruments (IBIS and SPI) on board INTEGRAL (see Winkler et al. 2003, A&A, 411, L1 for a description of the INTEGRAL spacecraft and instrument packages). INTEGRAL began collecting data in October 2002. This catalog will be regularly updated as data become public (~14 months after they are obtained). This catalog is a collaborative effort between the INTEGRAL Science Data Center (ISDC) in Switzerland and the NASA Goddard Space Flight Center (GSFC) INTEGRAL Guest Observer Facility (GOF). The results presented here come from a semi-automated analysis and should be considered approximate: they are intended to serve as a guideline to those interested in pursuing more detailed follow-up analyses. The data from the imager ISGRI (Lebrun et al. 2003, A&A, 411, L141) have been analyzed at the INTEGRAL Science Data Centre (ISDC), while the SPI (Vedrenne et al. 2003, A&A, 411, L63) data analysis was performed at GSFC as a service of the INTEGRAL GOF. Note: For cases where two or more proposals have been amalgamated (entries with pi_lname = 'Amalgamated') for a given observation, the same observation is listed for each of the amalgamated proposal numbers. This database table was first created in September 2004. It is based on the online web page maintained by the INTEGRAL GOF at the URL http://heasarc.gsfc.nasa.gov/docs/integral/obslist.html and was updated on a weekly basis whenever that web page was updated. Automatic updates were discontinued in June 2019. Duplicate entries were also removed in June 2019. This is a service provided by NASA HEASARC.
http://data.europa.eu/eli/dec/2011/833/oj
This dataset contains information about projects and their results funded by the European Union under the Horizon Europe framework programme for research and innovation from 2021 to 2027.
The dataset is composed of six (6) different subsets (in different formats):
Reference data (programmes, topics, topic keywords, funding schemes (types of action), organisation types and countries) can be found in this dataset: https://data.europa.eu/euodp/en/data/dataset/cordisref-data
EuroSciVoc is available here: https://data.europa.eu/data/datasets/euroscivoc-the-european-science-vocabulary
CORDIS datasets are produced monthly. Therefore, inconsistencies may occur between what is presented on the CORDIS live website and the datasets.
Source data from the National Aquatic Resource Survey's National Rivers and Streams Assessment containing benthic macroinvertebrate and fish taxa data and environmental predictor variables for stream sites. Results data contain estimates of taxa richness for invertebrates and fish for both ecoregions and hydrologic units. This dataset is associated with the following publication: Hughes, R.M., A. Herlihy, R. Comeleo, D. Peck, R. Mitchell, and S. Paulsen. Patterns in and predictors of stream and river macroinvertebrate genera and fish species richness across the conterminous USA. Knowledge and Management of Aquatic Ecosystems. EDP Sciences, LES ULIS, FRANCE, 424: 2023014, (2023).
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
GWAS results in cardiovascular research
The Surveillance, Epidemiology, and End Results (SEER) Program provides information on cancer statistics in an effort to reduce the cancer burden among the U.S. population. SEER is supported by the Surveillance Research Program (SRP) in NCI's Division of Cancer Control and Population Sciences (DCCPS).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Descriptive statistics for variables.
No description was included in this Dataset collected from the OSF
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Contains data and statistical code to reproduce the tables and figures for: Austin N, Harper S, Kaufman JS, Hamra GB. Challenges in reproducing results from publicly available data: an example of sexual orientation and cardiovascular disease risk. Published in Journal of Epidemiology & Community Health: doi:10.1136/jech-2015-206698
Reporting units of sample results [where 1 picoCurie (pCi) = 1 trillionth (1E-12) Curie (Ci)]:
• Other samples are reported in pCi/g.
Data Quality Disclaimer: This database is for informational use and is not a controlled quality database. Efforts have been made to ensure accuracy of data in the database; however, errors and omissions may occur. Examples of potential errors include:
• Data entry errors.
• Lab results not reported for entry into the database.
• Missing results due to equipment failure, or samples that could not be retrieved because they were lost or because of environmental hazards.
• Translation errors: the data has been migrated to newer data platforms numerous times, and each time there have been errors and data losses.
Error Results are the calculated uncertainty for the sample measurement results and are reported as (+/-).
Environmental Sample Records are from the year 1998 until present. Prior to 1998, results were stored in hardcopy, in a non-database format. Results from samples taken prior to 1998, or results subject to quality assurance, are available from archived records; requests can be made through the DEEP Freedom of Information Act (FOIA) administrator at deep.foia@ct.gov. Information on FOIA requests can be found on the DEEP website.
FOIA Administrator
Office of the Commissioner
Department of Energy and Environmental Protection
79 Elm Street, 3rd Floor
Hartford, CT 06106
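The unit definition above (1 pCi = 1E-12 Ci) can be applied directly when converting reported values. A small sketch of that arithmetic follows; the function names and sample numbers are hypothetical, and the becquerel conversion relies on the standard definition 1 Ci = 3.7E10 Bq, which is not stated in the description itself.

```python
import math

PCI_PER_CI = 1e-12   # 1 pCi = 1E-12 Ci, as stated in the description
BQ_PER_CI = 3.7e10   # 1 Ci = 3.7E10 Bq (standard definition, assumed here)


def pci_to_ci(value_pci):
    """Convert an activity in picocuries to curies."""
    return value_pci * PCI_PER_CI


def pci_to_bq(value_pci):
    """Convert an activity in picocuries to becquerels."""
    return pci_to_ci(value_pci) * BQ_PER_CI


def format_result(value, error, unit="pCi/g"):
    """Render a result with its reported uncertainty in the (+/-) style."""
    return f"{value} +/- {error} {unit}"


# Hypothetical sample: 2.5 pCi/g with a reported error of 0.3 pCi/g.
print(format_result(2.5, 0.3))
print(pci_to_ci(2.5), "Ci/g")
print(pci_to_bq(2.5), "Bq/g")
```

The same factors apply to the Error Results, since the uncertainty is expressed in the same unit as the measurement.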
This dataset contains all data used in this study, including site ID, latitude, longitude, watershed land cover, water chemistry, and carbon and nitrogen stable isotope ratios of periphyton, invertebrate functional feeding groups, and five most frequently observed invertebrate families. Also included is a list of all invertebrates collected in this study along with their functional feeding group and stable isotope ratios. This dataset is associated with the following publication: Smucker, N., A. Kuhn, C. Cruz-Quinones, J. Serbst, and J. Lake. Stable isotopes of algae and macroinvertebrates in streams respond to watershed urbanization, inform management goals, and indicate food web relationships. ECOLOGICAL INDICATORS. Elsevier Science Ltd, New York, NY, USA, 90: 295-304, (2018).