51 datasets found
  1. Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov"

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Laura Miron; Rafael Gonçalves; Mark A. Musen (2023). Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov" [Dataset]. http://doi.org/10.6084/m9.figshare.12743939.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Laura Miron; Rafael Gonçalves; Mark A. Musen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov.

    Description of files

    Original data files:
    - AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. The set contains 302,091 records downloaded on April 3, 2019.
    - public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.

    BioPortal API query results:
    - condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns={filename, condition, url, bioportal term, cuis, tuis}.
    - intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. Columns={filename, intervention, url, bioportal term, cuis, tuis}.

    Data element definitions:
    - supplementary_table_1.xlsx maps element names, element types, and whether elements are required across the ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.xsd), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations. Column and value definitions:
      - CT.gov Data Dictionary Section: Section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
      - CT.gov Data Dictionary Element Name: Name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html and https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
      - CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value; "Group Heading" if the element is a group heading for several sub-fields but is not itself associated with a user-provided value
      - Required in CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary; "CR" if the element is conditionally required; "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule; "-" if the element is not applicable to interventional records (only observational or expanded access)
      - Required in CT.gov for Observational Records: same coding as above, applied to observational records; "-" if the element is not applicable to observational records (only interventional or expanded access)
      - Required in CT.gov for Expanded Access Records: same coding as above, applied to expanded access records; "-" if the element is not applicable to expanded access records (only interventional or observational)
      - CT.gov XSD Element Definition: abbreviated XPath to the corresponding element in the ClinicalTrials.gov XSD (public.xsd). The full XPath prefixes every element with 'clinical_study/' (there is a single top-level element called "clinical_study" for all other elements)
      - Required in XSD?: "Yes" if the element is required according to public.xsd; "No" if the element is optional; "-" if the element is not made public or included in the XSD
      - Type in XSD: "text" if the XSD type was "xs:string" or "textblock"; the name of the enum if the type was an enum; "integer" if the type was "xs:integer" or "xs:integer" extended with the "type" attribute; "struct" if the type was a struct defined in the XSD
      - PRS Element Name: Name of the corresponding entry field in the PRS system
      - PRS Entry Type: Entry type in the PRS system. This column contains some free-text explanations/observations
      - FDAAA801 Final Rule Field Name: Name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA
      - WHO Field Name: Name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)

    Analytical results:
    - EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. The table gives the filename, the criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
    - completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
    - industry_completeness.xlsx contains percentages of interventional records missing required fields, broken up by the agency class of the trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
    - location_completeness.xlsx contains percentages of interventional records missing required fields, broken up by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.

    Intermediate results:
    - cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running the analysis steps from the Jupyter notebooks in our GitHub repository.
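A minimal sketch of working with the BioPortal query-result CSVs described above, using only the Python standard library; the row below is a synthetic placeholder (hypothetical filename and URL), not a real query result:

```python
import csv
import io

# condition_matches.csv columns, as documented above:
# {filename, condition, url, bioportal term, cuis, tuis}
# The sample row is a made-up illustration, not actual data.
sample = io.StringIO(
    "filename,condition,url,bioportal term,cuis,tuis\n"
    "NCT00000102.xml,Congenital Adrenal Hyperplasia,"
    "http://example.org/term1,Congenital adrenal hyperplasia,C0001627,T047\n"
)
rows = list(csv.DictReader(sample))

# Each row maps a column name to its value.
first = rows[0]
cuis = first["cuis"]  # UMLS concept identifier(s) of the matched term
```

The same pattern applies to intervention_matches.csv, with "intervention" in place of "condition".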

  2. Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km...

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    Updated Apr 23, 2025
    + more versions
    Cite
    nasa.gov (2025). Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019, v1 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/annual-mean-pm2-5-components-trace-elements-tes-50m-urban-and-1km-non-urban-area-grids-for
    Explore at:
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019, v1 data set contains annual predictions of trace element concentrations at a hyper resolution (50m x 50m grid cells) in urban areas and a high resolution (1km x 1km grid cells) in non-urban areas, for the years 2000 to 2019. Particulate matter with an aerodynamic diameter of less than 2.5 µm (PM2.5) is a silent killer of millions worldwide and contains many trace elements (TEs); understanding their relative toxicity is largely limited by the lack of data. In this work, ensembles of machine learning models were used to generate approximately 163 billion predictions estimating annual mean PM2.5 TEs, namely Bromine (Br), Calcium (Ca), Copper (Cu), Iron (Fe), Potassium (K), Nickel (Ni), Lead (Pb), Silicon (Si), Vanadium (V), and Zinc (Zn).

    Monitored data from approximately 600 locations were integrated with more than 160 predictors, such as time and location, satellite observations, composite predictors, meteorological covariates, and many novel land use variables, using several machine learning algorithms and ensemble methods. Multiple machine-learning models were developed covering urban and non-urban areas. Their predictions were then ensembled using either a Generalized Additive Model (GAM) Ensemble Geographically-Weighted-Averaging (GAM-ENWA) or Super-Learners. The best model R-squared values for the test sets ranged from 0.79 for Copper to 0.88 for Zinc in non-urban areas; in urban areas, they ranged from 0.80 for Copper to 0.88 for Zinc.

    The Coordinate Reference System (CRS) used in the predictions is the World Geodetic System 1984 (WGS84), and the units for the PM2.5 component TEs are ng/m^3. The data are provided in RDS tabular format, a file format native to the R programming language, which can also be opened by other languages such as Python.

  3. Data from: Historical Gridded Meteorological Dataset in Japan

    • search.diasjp.net
    Updated May 31, 2025
    Cite
    Yasushi Ishigooka (2025). Historical Gridded Meteorological Dataset in Japan [Dataset]. http://doi.org/10.20783/DIAS.670
    Explore at:
    Dataset updated
    May 31, 2025
    Dataset provided by
    Institute for Agro-Environmental Sciences, NARO (NIAES)
    Authors
    Yasushi Ishigooka
    Area covered
    Japan
    Description

    The Historical Gridded Meteorological Dataset in Japan (HGMD-Japan) is a set of gridded, high-resolution (approximately 1km x 1km) daily (and in some cases hourly or yearly) meteorological datasets intended for use in agricultural climate change analysis, created continuously from 1978 to the latest year. The daily data were created by overlaying the spatially interpolated differences between observations and climate normals at the meteorological observation stations onto the 1km-resolution gridded climate data. In this process, in order to maintain time-series homogeneity in each variable, possible sources of time-series heterogeneity unrelated to climate change, such as changes in statistical methods and instrument types, were corrected as much as possible.

    The details of this dataset are described as follows.

    ■ Common items
    Projection: Geographic
    Geodetic system: Tokyo Datum

    ◆ Daily data
    Directory structure: HGMDJ_NARO(YYYY)daily[file]
    File name: (YYYY)_d_(element).bin
    Element names (element):
    • Mean temperature (tmp) [0.1 °C]
    • Maximum temperature (hourly) (tmx) [0.1 °C]
    • Minimum temperature (hourly) (tmn) [0.1 °C]
    • Precipitation (pre) [0.1 mm]
    • Solar radiation (srd) [0.1 MJ/m2/d]
    • Sunshine duration (sdr) [0.1 hour]
    • Relative humidity (rhu) [0.1 %]
    • Wind speed at 2.5m height (wsd) [0.1 m/s]
    • Downward long wave radiation (lrd) [0.1 MJ/m2/d]
    • Potential evapotranspiration (pet) [0.1 mm]
    • FAO reference evapotranspiration (eto) [0.1 mm]
    • Paddy water temperature (LAI=0) (tw0) [0.1 °C]
    • Paddy water temperature (LAI=∞) (twi) [0.1 °C]
    Error value: -999
    Record format:
    Data format: binary (little endian)
    Data size: 278,237,440 bytes
    Record length: 736 bytes (4 + 366*2: see below)
    Number of rows (meshes): 378,040
    Structure:
    1) 3rd mesh code (4-byte long), data (2-byte short) x 366 days
    2) 3rd mesh code (4-byte long), data (2-byte short) x 366 days
    ...
    378040) 3rd mesh code (4-byte long), data (2-byte short) x 366 days
    * Dummy value (-999) for the 366th day in non-leap years

    ◆ Hourly data
    Directory structure: HGMDJ_NARO(YYYY)hourly[element][file]
    File name: (YYYYMMDD)_h_(element).bin
    Element names (element):
    • Rice panicle temperature (tp) [0.1 °C]
    • Air temperature (ta) [0.1 °C]
    Error value: -999
    Record format:
    Data format: binary (little endian)
    Data size: 19,658,080 bytes
    Record length: 52 bytes (4 + 24*2: see below)
    Number of rows (meshes): 378,040
    Structure:
    1) 3rd mesh code (4-byte long), data (2-byte short) x 24 hours
    2) 3rd mesh code (4-byte long), data (2-byte short) x 24 hours
    ...
    378040) 3rd mesh code (4-byte long), data (2-byte short) x 24 hours

    ◆ Yearly data
    Directory structure: HGMDJ_NARO(YYYY)yearly[file]
    File name: (YYYY)_y_(element).bin
    Element names (element):
    • Heat-dose of daily maximum temperature above 35 °C (HD_x35) (hdx35) [0.1 °C day]
    • Heat-dose of daily minimum temperature above 25 °C (HD_n25) (hdn25) [0.1 °C day]
    • Heat-dose of daily mean temperature above 26 °C (HD_m26) (hdm26) [0.1 °C day]
    • Mean air temperature during 20 days after heading date (hed20atm) [0.1 °C]
    • HD_m26 during 20 days after heading date (hed20hdm26) [0.1 °C day]
    • Mean panicle temperature during daytime within 5 days around heading date (ptm5dc) [0.1 °C]
    • Mean panicle temperature during daytime within 7 days around heading date (ptm7dc) [0.1 °C]
    Error value: -999
    Record format:
    Data format: binary (little endian)
    Data size: 2,268,240 bytes
    Record length: 6 bytes (4 + 2: see below)
    Number of rows (meshes): 378,040
    Structure:
    1) 3rd mesh code (4-byte long), data (2-byte short)
    2) 3rd mesh code (4-byte long), data (2-byte short)
    ...
    378040) 3rd mesh code (4-byte long), data (2-byte short)
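Given the record layout described above (a little-endian 4-byte mesh code followed by 366 2-byte values), one daily record can be parsed with Python's struct module. This is a sketch against the documented layout, exercised on a synthetic record rather than a real file:

```python
import struct

# One daily record: 4-byte long (3rd mesh code) + 366 little-endian
# 2-byte shorts (values in the 0.1-unit scale; -999 marks errors/dummies).
RECORD_FMT = "<l366h"
RECORD_LEN = struct.calcsize(RECORD_FMT)  # 736 bytes, matching the spec

def parse_daily_record(buf):
    """Return (mesh_code, values) for one 736-byte record."""
    fields = struct.unpack(RECORD_FMT, buf)
    return fields[0], list(fields[1:])

# Synthetic record: made-up mesh code, 25.1 °C on each day, dummy 366th day.
record = struct.pack(RECORD_FMT, 53394500, *([251] * 365 + [-999]))
mesh, values = parse_daily_record(record)
```

Reading a whole file is then a loop over the 378,040 records; divide the stored integers by 10 to recover physical units.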

  4. Global Surface Summary of the Day - GSOD

    • data.noaa.gov
    • ncei.noaa.gov
    • +3more
    csv, https
    Updated Feb 12, 2025
    Cite
    (2025). Global Surface Summary of the Day - GSOD [Dataset]. https://data.noaa.gov/onestop/collections/details/33ac52f2-4da3-471c-9059-7f4485baa498
    Explore at:
    Available download formats: https, csv
    Dataset updated
    Feb 12, 2025
    Time period covered
    Jan 1, 1929 - Present
    Area covered
    Earth, Geographic Region > Global Land, geographic bounding box, Vertical Location > Land Surface
    Description

    Global Surface Summary of the Day is derived from the Integrated Surface Hourly (ISH) dataset. The ISH dataset includes global data obtained from the USAF Climatology Center, located in the Federal Climate Complex with NCDC. The latest daily summary data are normally available 1-2 days after the date-time of the observations used in the daily summaries. The online data files begin with 1929 and are, at the time of this writing, at the Version 8 software level. Over 9000 stations' data are typically available. The daily elements included in the dataset (as available from each station) are:

    • Mean temperature (.1 Fahrenheit)
    • Mean dew point (.1 Fahrenheit)
    • Mean sea level pressure (.1 mb)
    • Mean station pressure (.1 mb)
    • Mean visibility (.1 miles)
    • Mean wind speed (.1 knots)
    • Maximum sustained wind speed (.1 knots)
    • Maximum wind gust (.1 knots)
    • Maximum temperature (.1 Fahrenheit)
    • Minimum temperature (.1 Fahrenheit)
    • Precipitation amount (.01 inches)
    • Snow depth (.1 inches)
    • Indicator for occurrence of: Fog, Rain or Drizzle, Snow or Ice Pellets, Hail, Thunder, Tornado/Funnel Cloud

    Global summary of day data for 18 surface meteorological elements are derived from the synoptic/hourly observations contained in USAF DATSAV3 Surface data and Federal Climate Complex Integrated Surface Hourly (ISH). Historical data are generally available for 1929 to the present, with data from 1973 to the present being the most complete. For some periods, one or more countries' data may not be available due to data restrictions or communications problems. In deriving the summary of day data, a minimum of 4 observations for the day must be present (this allows for stations which report 4 synoptic observations per day). Since the data are converted to constant units (e.g., knots), slight rounding error from the originally reported values may occur (e.g., 9.9 instead of 10.0). The mean daily values described below are based on the hours of operation for the station. For some stations/countries, the visibility will sometimes 'cluster' around a value (such as 10 miles) due to the practice of not reporting visibilities greater than certain distances. The daily extremes and totals (maximum wind gust, precipitation amount, and snow depth) will only appear if the station reports the data sufficiently to provide a valid value; therefore, these three elements will appear less frequently than other values. Also, these elements are derived from the stations' reports during the day, and may comprise a 24-hour period which includes a portion of the previous day. The data are reported and summarized based on Greenwich Mean Time (GMT, 0000Z - 2359Z), since the original synoptic/hourly data are reported and based on GMT.
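The rounding effect noted above (9.9 knots instead of 10.0) can be reproduced with a toy round trip; the exact upstream workflow sketched here is an assumption, but any conversion through one-decimal constant units behaves this way:

```python
# Hypothetical round trip: a station value of 10.0 knots archived as m/s
# to one decimal, then converted back to knots to one decimal for the
# daily summary.
MS_PER_KNOT = 0.514444

reported_knots = 10.0
archived_ms = round(reported_knots * MS_PER_KNOT, 1)  # 5.1 m/s
summary_knots = round(archived_ms / MS_PER_KNOT, 1)   # 9.9, not 10.0
```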

  5. INSPIRE Soil / Medium element content in topsoil BB

    • data.europa.eu
    wfs
    + more versions
    Cite
    INSPIRE-Zentrale im Land Brandenburg, INSPIRE Soil / Medium element content in topsoil BB [Dataset]. https://data.europa.eu/data/datasets/f0628e6c-9d74-446a-b06f-1dd2d7ff7970?locale=en
    Explore at:
    Available download formats: wfs
    Dataset authored and provided by
    INSPIRE-Zentrale im Land Brandenburg
    Description

    The interoperable INSPIRE dataset contains data from the LBGR on the mean element contents in the Brandenburg topsoil, transformed into the INSPIRE Annex schema Soil. The dataset is provided via compliant INSPIRE view and download services.

  6. Scanning Multichannel Microwave Radiometer (SMMR) Monthly Mean Atmospheric...

    • datasets.ai
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +4more
    Updated Sep 15, 2024
    Cite
    National Aeronautics and Space Administration (2024). Scanning Multichannel Microwave Radiometer (SMMR) Monthly Mean Atmospheric Liquid Water (ALW) By Prabhakara [Dataset]. https://datasets.ai/datasets/scanning-multichannel-microwave-radiometer-smmr-monthly-mean-atmospheric-liquid-water-alw--80d58
    Explore at:
    Available download formats
    Dataset updated
    Sep 15, 2024
    Dataset provided by
    NASA (http://nasa.gov/)
    Authors
    National Aeronautics and Space Administration
    Description

    SMMR_ALW_PRABHAKARA data are Scanning Multichannel Microwave Radiometer (SMMR) Monthly Mean Atmospheric Liquid Water (ALW) data by Prabhakara. The Prabhakara SMMR ALW files were generated by Dr. Prabhakara Cuddapah at the Goddard Space Flight Center (GSFC) using SMMR antenna temperatures. A discussion of the SMMR antenna temperatures is available from the Langley Distributed Active Archive Center (DAAC). Each ALW file contains one month of 3 degree by 5 degree gridded mean liquid water. Each element of data is in units of mg/cm2. The data span the period from February 1979 to May 1984.
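As a rough sketch of the grid geometry described above (assuming the 3 x 5 degree cells tile the full globe, which the description does not state explicitly), the per-file cell count works out as:

```python
# Cell count of a global 3 x 5 degree grid: an illustrative assumption,
# since the files' actual coverage/layout may differ.
n_lat = 180 // 3   # latitude rows
n_lon = 360 // 5   # longitude columns
cells_per_month = n_lat * n_lon  # gridded mean liquid water values per file
```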

  7. US Gross Rent ACS Statistics

    • kaggle.com
    Updated Aug 23, 2017
    Cite
    Golden Oak Research Group (2017). US Gross Rent ACS Statistics [Dataset]. https://www.kaggle.com/datasets/goldenoakresearch/acs-gross-rent-us-statistics/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 23, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Golden Oak Research Group
    Description

    What you get:

    Upvote! The database contains over 40,000 records on US gross rent and geo-locations. The field descriptions of the database are documented in the attached PDF file. To access all 325,272 records, at a scale roughly equivalent to a neighborhood (census tract), see the link below and make sure to upvote. Enjoy!

    Get the full free database with coupon code FreeDatabase; see directions at the bottom of the description. The coupon ends at 2:00 pm, 8-23-2017.

    Gross Rent & Geographic Statistics:

    • Mean Gross Rent (double)
    • Median Gross Rent (double)
    • Standard Deviation of Gross Rent (double)
    • Number of Samples (double)
    • Square area of land at location (double)
    • Square area of water at location (double)

    Geographic Location:

    • Longitude (double)
    • Latitude (double)
    • State Name (character)
    • State abbreviated (character)
    • State_Code (character)
    • County Name (character)
    • City Name (character)
    • Name of city, town, village or CPD (character)
    • Primary: defines whether the location is a tract and block group
    • Zip Code (character)
    • Area Code (character)

    Abstract

    The data set was originally developed for real estate and business investment research. Income is a vital element when determining both the quality and the socioeconomic features of a given geographic location. The data were derived from over 36,000 files and cover 348,893 location records.

    License

    Only proper citation is required; please see the documentation for details. Have fun!

    Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.

    For any questions, you may reach us at research_development@goldenoakresearch.com. For immediate assistance, you may reach me at 585-626-2965 (please note: it is my personal number; email is preferred).

    Check our data's accuracy: Census Fact Checker

    Access all 325,272 locations with the free database coupon code:

    Don't settle. Go big and win big. Optimize your potential. Access all gross rent records and more on a scale roughly equivalent to a neighborhood; see the link below.

    A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.

  8. Dataset: The plural interpretability of German linking elements...

    • live.european-language-grid.eu
    • explore.openaire.eu
    • +2more
    csv
    Updated Aug 15, 2021
    + more versions
    Cite
    (2021). Dataset: The plural interpretability of German linking elements ("Morphology") [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7422
    Explore at:
    Available download formats: csv
    Dataset updated
    Aug 15, 2021
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset accompanies a paper to be published in "Morphology" (JOMO, Springer). Under the present DOI, all data generated for this research as well as all scripts used are stored. The paper itself is not CC-licensed; refer to Springer's "Morphology" website for details!

    Abstract

    In this paper, we take a closer theoretical and empirical look at the linking elements in German N1+N2 compounds which are identical to the plural marker of N1 (such as -er with umlaut, as in Häus-er-meer 'sea of houses'). Various perspectives on the actual extent of plural interpretability of these pluralic linking elements are expressed in the literature. We aim to clarify this question by empirically examining to what extent there may be a relationship between plural form and meaning which informs in which sorts of compounds pluralic linking elements appear. Specifically, we investigate whether pluralic linking elements occur especially frequently in compounds where a plural meaning of the first constituent is induced either externally (through plural inflection of the entire compound) or internally (through a relation between the constituents such that N2 forces N1 to be conceptually plural, as in the example above). The results of a corpus study using the DECOW16A corpus and a split-100 experiment show that in the internal but not external plural meaning conditions, a pluralic linking element is preferred over a non-pluralic one, though there is considerable inter-speaker variability, and limitations imposed by other constraints on linking element distribution also play a role. However, we show the overall tendency that German language users do use pluralic linking elements as cues to the plural interpretation of N1+N2 compounds. Our interpretation does not reference a specific morphological framework. Instead, we view our data as strengthening the general approach of probabilistic morphology.

  9. Wrist-mounted IMU data towards the investigation of free-living human eating...

    • data.niaid.nih.gov
    Updated Jun 20, 2022
    + more versions
    Cite
    Kyritsis, Konstantinos (2022). Wrist-mounted IMU data towards the investigation of free-living human eating behavior - the Free-living Food Intake Cycle (FreeFIC) dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4420038
    Explore at:
    Dataset updated
    Jun 20, 2022
    Dataset provided by
    Kyritsis, Konstantinos
    Diou, Christos
    Delopoulos, Anastasios
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    The Free-living Food Intake Cycle (FreeFIC) dataset was created by the Multimedia Understanding Group towards the investigation of in-the-wild eating behavior. This is achieved by recording the subjects' meals as a small part of their everyday, unscripted activities. The FreeFIC dataset contains the 3D acceleration and orientation velocity signals (6 DoF) from 22 in-the-wild sessions provided by 12 unique subjects. All sessions were recorded using a commercial smartwatch (6 using the Huawei Watch 2™ and the MobVoi TicWatch™ for the rest) while the participants performed their everyday activities. In addition, FreeFIC also contains the start and end moments of each meal session as reported by the participants.

    Description

    FreeFIC includes 22 in-the-wild sessions that belong to 12 unique subjects. Participants were instructed to wear the smartwatch on the hand of their preference well ahead of any meal and to continue wearing it throughout the day until the battery was depleted. In addition, we followed a self-report labeling model, meaning that the ground truth is provided by the participants, who documented the start and end moments of their meals to the best of their abilities, as well as the hand on which they wore the smartwatch. The total duration of the 22 recordings sums up to 112.71 hours, with a mean duration of 5.12 hours. Additional data statistics can be obtained by executing the provided Python script stats_dataset.py. Furthermore, the accompanying Python script viz_dataset.py will visualize the IMU signals and ground truth intervals for each of the recordings. Information on how to execute the Python scripts can be found below.

    The script(s) and the pickle file must be located in the same directory.

    Tested with Python 3.6.4

    Requirements: Numpy, Pickle and Matplotlib

    Calculate and echo dataset statistics

    $ python stats_dataset.py

    Visualize signals and ground truth

    $ python viz_dataset.py

    FreeFIC is also tightly related to Food Intake Cycle (FIC), a dataset we created in order to investigate the in-meal eating behavior. More information about FIC can be found here and here.

    Publications

    If you plan to use the FreeFIC dataset or any of the resources found in this page, please cite our work:

    @article{kyritsis2020data,
      title={A Data Driven End-to-end Approach for In-the-wild Monitoring of Eating Behavior Using Smartwatches},
      author={Kyritsis, Konstantinos and Diou, Christos and Delopoulos, Anastasios},
      journal={IEEE Journal of Biomedical and Health Informatics},
      year={2020},
      publisher={IEEE}}

    @inproceedings{kyritsis2017automated,
      title={Detecting Meals In the Wild Using the Inertial Data of a Typical Smartwatch},
      author={Kyritsis, Konstantinos and Diou, Christos and Delopoulos, Anastasios},
      booktitle={2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
      year={2019},
      organization={IEEE}}

    Technical details

    We provide the FreeFIC dataset as a pickle. The file can be loaded using Python in the following way:

    import pickle as pkl
    import numpy as np

    with open('./FreeFIC_FreeFIC-heldout.pkl', 'rb') as fh:
        dataset = pkl.load(fh)

    The dataset variable in the snippet above is a dictionary with 5 keys, namely:

    'subject_id'

    'session_id'

    'signals_raw'

    'signals_proc'

    'meal_gt'

    The contents under a specific key can be obtained by:

    sub = dataset['subject_id']     # subject id
    ses = dataset['session_id']     # session id
    raw = dataset['signals_raw']    # raw IMU signals
    proc = dataset['signals_proc']  # processed IMU signals
    gt = dataset['meal_gt']         # meal ground truth

    The sub, ses, raw, proc and gt variables in the snippet above are lists of length 22. Elements across all lists are aligned; e.g., the 3rd element of the list under the 'session_id' key corresponds to the 3rd element of the list under the 'signals_proc' key.

    sub: list. Each element of the sub list is a scalar (integer) that corresponds to the unique identifier of the subject, which can take the following values: [1, 2, 3, 4, 13, 14, 15, 16, 17, 18, 19, 20]. It should be emphasized that the subjects with ids 15, 16, 17, 18, 19 and 20 belong to the held-out part of the FreeFIC dataset (more information can be found in the publication titled "A Data Driven End-to-end Approach for In-the-wild Monitoring of Eating Behavior Using Smartwatches" by Kyritsis et al.). Moreover, the subject identifier in FreeFIC is in line with the subject identifier in the FIC dataset (more info here and here); i.e., FIC's subject with id equal to 2 is the same person as FreeFIC's subject with id equal to 2.

    ses: list. Each element of this list is a scalar (integer) that corresponds to the unique identifier of the session, which can range between 1 and 5. It should be noted that not all subjects have the same number of sessions.

    raw: list. Each element of this list is a dictionary with the 'acc' and 'gyr' keys. The data under the 'acc' key is an N_acc x 4 numpy.ndarray that contains the timestamps in seconds (first column) and the 3D raw accelerometer measurements in g (second, third and fourth columns, representing the x, y and z axes, respectively). The data under the 'gyr' key is an N_gyr x 4 numpy.ndarray that contains the timestamps in seconds (first column) and the 3D raw gyroscope measurements in degrees/second (second, third and fourth columns, representing the x, y and z axes, respectively). All sensor streams are transformed in such a way as to reflect all participants wearing the smartwatch on the same hand with the same orientation, thus achieving data uniformity. This transformation is on par with the signals in the FIC dataset (more info here and here). Finally, the lengths of the raw accelerometer and gyroscope numpy.ndarrays differ (N_acc ≠ N_gyr). This behavior is expected and is caused by the Android platform.

    proc: list Each element of this list is an (M\times7) numpy.ndarray that contains the timestamps and the (3D) accelerometer and gyroscope measurements for each meal. Specifically, the first column contains the timestamps in seconds, the second, third and fourth columns contain the (x), (y) and (z) accelerometer values in (g), and the fifth, sixth and seventh columns contain the (x), (y) and (z) gyroscope values in ({degrees}/{second}). Unlike elements in the raw list, processed measurements (in the proc list) have a constant sampling rate of (100) Hz and the accelerometer/gyroscope measurements are aligned with each other. In addition, all sensor streams are transformed in such a way as to reflect all participants wearing the smartwatch on the same hand with the same orientation, thus achieving data uniformity. This transformation is consistent with the signals in the FIC dataset (more info here and here). No other preprocessing is performed on the data; e.g., the acceleration component due to the Earth's gravitational field is present in the processed acceleration measurements. Researchers can consult the article "A Data Driven End-to-end Approach for In-the-wild Monitoring of Eating Behavior Using Smartwatches" by Kyritsis et al. on how to further preprocess the IMU signals (i.e., smooth and remove the gravitational component).

    meal_gt: list Each element of this list is a (K\times2) matrix. Each row represents a meal interval for the specific in-the-wild session. The first column contains the timestamps of the meal start moments, whereas the second one contains the timestamps of the meal end moments. All timestamps are in seconds. The number of meals (K) varies across recordings (e.g., there are recordings where a participant consumed two meals).
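    The parallel lists above can be traversed as in the following sketch. The dictionary here is a toy stand-in built to match the described structure; the actual file name and deserialization step (e.g., via pickle) are assumptions, not part of the dataset documentation.

```python
import numpy as np

# Toy stand-in for the deserialized FreeFIC dictionary; only the structure
# (parallel lists, one element per in-the-wild recording) follows the text.
t = np.arange(1000) / 100.0                      # 100 Hz timestamps in seconds
dataset = {
    "sub": [2],                                  # subject identifiers
    "ses": [1],                                  # session identifiers
    "proc": [np.column_stack([t] + [np.zeros(1000)] * 6)],  # M x 7 arrays
    "meal_gt": [np.array([[2.0, 5.0]])],         # K x 2 meal intervals (s)
}

# Slice the processed IMU stream of each recording down to its annotated meals.
for sub, ses, proc, meals in zip(dataset["sub"], dataset["ses"],
                                 dataset["proc"], dataset["meal_gt"]):
    for start, end in meals:
        mask = (proc[:, 0] >= start) & (proc[:, 0] <= end)
        meal_samples = proc[mask]                # timestamps + 6 IMU channels
```

    The same pattern applies to the raw list, except that the 'acc' and 'gyr' arrays must be sliced separately since their lengths differ.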

    Ethics and funding

    Informed consent, including permission for third-party access to anonymised data, was obtained from all subjects prior to their engagement in the study. The work has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No 727688 - BigO: Big data against childhood obesity.

    Contact

    Any inquiries regarding the FreeFIC dataset should be addressed to:

    Dr. Konstantinos KYRITSIS

    Multimedia Understanding Group (MUG), Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, University Campus, Building C, 3rd floor, Thessaloniki, Greece, GR54124

    Tel: +30 2310 996359, 996365 Fax: +30 2310 996398 E-mail: kokirits [at] mug [dot] ee [dot] auth [dot] gr

  10. Historical static RÚIAN data for basic data set distributed by...

    • gimi9.com
    • data.gov.cz
    • +1more
    Updated Sep 3, 2020
    (2020). Historical static RÚIAN data for basic data set distributed by municipalities in the VFR format [Dataset]. https://gimi9.com/dataset/eu_cz-00025712-cuzk_series-md_ruian-h-za-u
    Explore at:
    Dataset updated
    Sep 3, 2020
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Ruian
    Description

    Dataset contains original (historical) data of RÚIAN elements in which any change has occurred in the past. The user can use it to re-create changes in RÚIAN data (since 2012). Only descriptive data are provided for each element; the dataset contains no spatial location (polygons, definition lines and centroids of RÚIAN elements). It is possible to download a file for the whole state territory or for a selected municipality only. The file covering the whole state territory contains the following elements: state, cohesion region, higher territorial self-governing entity (VÚSC), municipality with extended competence (ORP), authorized municipal office (POU), region (old ones, defined in 1960), county, municipality, municipality part, town district (MOMC), Prague city district (MOP), town district of Prague (SOP), cadastral units and basic urban units (ZSJ). Files for a specified municipality contain the following elements: municipality, municipality part, MOMC (for territorially structured statutory cities), MOP (for Prague), SOP (for Prague), cadastral unit, ZSJ, streets, building objects and address points. Dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates). Data is created once a month in the RÚIAN exchange format (VFR), which is based on XML and fulfils the GML 3.2.1 standard (according to ISO 19136:2007). Dataset is compressed (ZIP) for downloading. More in Act No. 111/2009 Coll., on the Basic Registers, and in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.

  11. US Household Income Statistics

    • kaggle.com
    zip
    Updated Apr 16, 2018
    Golden Oak Research Group (2018). US Household Income Statistics [Dataset]. https://www.kaggle.com/goldenoakresearch/us-household-income-stats-geo-locations
    Explore at:
    zip(2344717 bytes)Available download formats
    Dataset updated
    Apr 16, 2018
    Dataset authored and provided by
    Golden Oak Research Group
    Description

    New Upload:

    Added 32,000+ more locations. For information on data calculations, please refer to the methodology PDF document. Information on how to calculate the data yourself is also provided, as well as how to buy data for $1.29.

    What you get:

    The database contains 32,000 records on US Household Income Statistics & Geo Locations. The field description of the database is documented in the attached PDF file. To access all 348,893 records on a scale roughly equivalent to a neighborhood (census tract), see the link below, and please upvote. Enjoy!

    Household & Geographic Statistics:

    • Mean Household Income (double)
    • Median Household Income (double)
    • Standard Deviation of Household Income (double)
    • Number of Households (double)
    • Square area of land at location (double)
    • Square area of water at location (double)

    Geographic Location:

    • Longitude (double)
    • Latitude (double)
    • State Name (character)
    • State abbreviated (character)
    • State_Code (character)
    • County Name (character)
    • City Name (character)
    • Name of city, town, village or CPD (character)
    • Primary: defines whether the location is a tract and block group.
    • Zip Code (character)
    • Area Code (character)

    Abstract

    The dataset was originally developed for real estate and business investment research. Income is a vital element when determining both quality and socioeconomic features of a given geographic location. The data were derived from over 36,000 files and cover 348,893 location records.

    License

    Only proper citing is required please see the documentation for details. Have Fun!!!

    Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.

    Sources, don't have 2 dollars? Get the full information yourself!

    2011-2015 ACS 5-Year Documentation was provided by the U.S. Census Reports. Retrieved August 2, 2017, from https://www2.census.gov/programs-surveys/acs/summary_file/2015/data/5_year_by_state/

    Found Errors?

    Please tell us so we may provide you the most accurate data possible. You may reach us at: research_development@goldenoakresearch.com

    for any questions you can reach me on at 585-626-2965

    please note: it is my personal number and email is preferred

    Check our data's accuracy: Census Fact Checker

    Access all 348,893 location records and more:

    Don't settle. Go big and win big. Optimize your potential. Overcome limitation and outperform expectation. Access all household income records on a scale roughly equivalent to a neighborhood, see link below:

    Website: Golden Oak Research Kaggle Deals all databases $1.29 Limited time only

    A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.

  12. Collated apatite trace element data (ppm) from the literature - Dataset -...

    • b2find.eudat.eu
    Updated Apr 25, 2023
    (2023). Collated apatite trace element data (ppm) from the literature - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/c515471a-116d-5dee-8bcb-c8275bf8f72f
    Explore at:
    Dataset updated
    Apr 25, 2023
    Description

    This database contains trace element compositional data for apatite from 23 published (cited) datasets on bedrock apatite trace element compositions.

    • IM = Mafic I-type granitoids and mafic igneous rocks
    • S = Felsic granitoids (i.e. Aluminium Saturation Index > 1.1)
    • LM = Low- and medium-grade metamorphic rocks (i.e. sub-upper-amphibolite facies) and eclogites
    • HM = High-grade (HT) metamorphic rocks and migmatites
    • UM = Ultramafic rocks
    • ALK = Alkali-rich igneous rocks
    • AUT = Authigenic and fossil apatite

    Explanation for this database: The intended use of this database is to support provenance studies by providing a database of apatite from known bedrocks of known composition against which detritus can be compared. It will also find use for tephra vectoring and ore-deposit vectoring. The database incorporates apatite trace element data from almost all common lithologies at the Earth's surface, though data from orthogneisses are lacking. It is the authors' intention to update this database as more data become available or are deemed suitable for inclusion. The authors of this submission have added category labels to the data. These are derived partly from the results of K-means tests and PCA transformations previously performed on the data, and partly from the names by which the original authors of each constituent paper identified the rocks. Users should feel free to use these categories or ignore them as they wish. Data were selectively incorporated into this dataset; not all data points from the collated papers are published here. In particular, rocks that the original authors identified as having been metasomatised were avoided, as they defy easy categorisation. Only the central values (in ppm) are provided. Some of the bedrocks were published with only mean values, others with each spot analysis; this is indicated in the database. All data were collected by ICPMS.

  13. Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Aug 22, 2025
    SEDAC (2025). Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019, v1 [Dataset]. https://catalog.data.gov/dataset/annual-mean-pm2-5-components-trace-elements-tes-50m-urban-and-1km-non-urban-area-grids-for
    Explore at:
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    SEDAC
    Area covered
    United States
    Description

    The Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019, v1 data set contains annual predictions of trace element concentrations at a hyper resolution (50m x 50m grid cells) in urban areas and a high resolution (1km x 1km grid cells) in non-urban areas, for the years 2000 to 2019. Particulate matter with an aerodynamic diameter of less than 2.5 µm (PM2.5) is a silent killer of millions worldwide, and contains many trace elements (TEs). Understanding their relative toxicity is largely limited by the lack of data. In this work, ensembles of machine learning models were used to generate approximately 163 billion predictions estimating annual mean PM2.5 TEs, namely Bromine (Br), Calcium (Ca), Copper (Cu), Iron (Fe), Potassium (K), Nickel (Ni), Lead (Pb), Silicon (Si), Vanadium (V), and Zinc (Zn). The monitored data from approximately 600 locations were integrated with more than 160 predictors, such as time and location, satellite observations, composite predictors, meteorological covariates, and many novel land use variables, using several machine learning algorithms and ensemble methods. Multiple machine-learning models were developed covering urban and non-urban areas. Their predictions were then ensembled using either a Generalized Additive Model (GAM) Ensemble Geographically-Weighted-Averaging (GAM-ENWA) or Super-Learners. The overall best model R-squared values for the test sets ranged from 0.79 for Copper to 0.88 for Zinc in non-urban areas. In urban areas, the R-squared model values ranged from 0.80 for Copper to 0.88 for Zinc. The Coordinate Reference System (CRS) used in the predictions is the World Geodetic System 1984 (WGS84), and the units for the PM2.5 Components TEs are ng/m^3. The data are provided in RDS tabular format, a file format native to the R programming language, but can also be opened by other languages such as Python.

  14. RÚIAN current status-basic dataset — municipality: The Rear [597121]

    • data.europa.eu
    gml
    Updated May 6, 2023
    Český úřad zeměměřický a katastrální (2023). RÚIAN current status-basic dataset — municipality: The Rear [597121] [Dataset]. https://data.europa.eu/data/datasets/https-atom-cuzk-cz-api-responses-cz-00025712-cuzk_ruian-s-za-u_597121-jsonld?locale=en
    Explore at:
    gmlAvailable download formats
    Dataset updated
    May 6, 2023
    Dataset provided by
    Czech Office for Surveying, Mapping and Cadastre
    Authors
    Český úřad zeměměřický a katastrální
    License

    https://data.gov.cz/zdroj/datové-sady/00025712/0e6875cb1df1a8b59c04f2ebf2ce5293/distribuce/15ef30d794ca6898a65bee78aba506e9/podmínky-užití

    Description

    The data set contains basic descriptive current RÚIAN data, i.e. descriptive data on territorial elements and territorial registration units, either for the whole state or for a chosen municipality. The data set does not contain the spatial delimitation of RÚIAN elements. The state-wide file (ST_UZSZ) contains the following elements: state, cohesion regions, higher territorial self-governing units (VÚSC), municipalities with extended competence (ORP), municipalities with a mandated municipal authority (POU), regions (from 1960), districts, municipalities, parts of the municipality, municipal districts/urban districts (MOMC), municipal districts of Prague (MOP), administrative districts of Prague (SOP), cadastral territory and basic settlement units (ZSJ). The files for each municipality (OB_UZSZ) contain the following elements: municipality, parts of the municipality, MOMC (for territorially divided statutory cities), MOP (for Prague), SOP (for Prague), cadastral territory, ZSJ, streets, plots, building objects and address points. For each element, its code, definition point (if any) and all available descriptive attributes, including the parent element code, are given. The data set is provided as open data (CC-BY 4.0 license). Data is based on RÚIAN (registry of territorial identification, addresses and real estate). Data are generated once a month in the RÚIAN exchange format (VFR), which is based on XML and conforms to GML 3.2.1 (according to ISO 19136:2007). For download, each file is compressed as a ZIP. More in Act No. 111/2009 Coll., on basic registers, and in Decree No. 359/2011 Coll., on the basic register of territorial identification, addresses and real estate.

  15. Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic...

    • zenodo.org
    bin, csv, zip
    Updated Dec 24, 2022
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa (2022). Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials [Dataset]. http://doi.org/10.5281/zenodo.6965147
    Explore at:
    bin, zip, csvAvailable download formats
    Dataset updated
    Dec 24, 2022
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials

    Background

    This dataset contains data from monotonic and cyclic loading experiments on structural metallic materials. The materials are primarily structural steels; one iron-based shape memory alloy is also included. Summary files provide an overview of the database, and data from the individual experiments are also included.

    The files included in the database are outlined below and the format of the files is briefly described. Additional information regarding the formatting can be found through the post-processing library (https://github.com/ahartloper/rlmtp/tree/master/protocols).

    Usage

    • The data is licensed through the Creative Commons Attribution 4.0 International.
    • If you have used our data and are publishing your work, we ask that you please reference both:
      1. this database through its DOI, and
      2. any publication that is associated with the experiments. See the Overall_Summary and Database_References files for the associated publication references.

    Included Files

    • Overall_Summary_2022-08-25_v1-0-0.csv: summarises the specimen information for all experiments in the database.
    • Summarized_Mechanical_Props_Campaign_2022-08-25_v1-0-0.csv: summarises the average initial yield stress and average initial elastic modulus per campaign.
    • Unreduced_Data-#_v1-0-0.zip: contains the original (not downsampled) data
      • Where # is one of: 1, 2, 3, 4, 5, 6. The unreduced data is broken into separate archives because of upload limitations to Zenodo. Together they provide all the experimental data.
      • We recommend you un-zip all the folders and place them in one "Unreduced_Data" directory, similar to the "Clean_Data" directory
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the unreduced data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Clean_Data_v1-0-0.zip: contains all the downsampled data
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the clean data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Database_References_v1-0-0.bib
      • Contains a bibtex reference for many of the experiments in the database. Corresponds to the "citekey" entry in the summary files.

    File Format: Downsampled Data

    These are the "LP_

    • The header of the first column is empty: the first column corresponds to the index of the sample point in the original (unreduced) data
    • Time[s]: time in seconds since the start of the test
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: the surface temperature in degC

    These data files can be easily loaded using the pandas library in Python through:

    import pandas
    data = pandas.read_csv(data_file, index_col=0)

    The data is formatted so it can be used directly in RESSPyLab (https://github.com/AlbanoCastroSousa/RESSPyLab). Note that the column names "e_true" and "Sigma_true" were kept for backwards compatibility reasons with RESSPyLab.

    File Format: Unreduced Data

    These are the "LP_

    • The first column is the index of each data point
    • S/No: sample number recorded by the DAQ
    • System Date: Date and time of sample
    • Time[s]: time in seconds since the start of the test
    • C_1_Force[kN]: load cell force
    • C_1_Déform1[mm]: extensometer displacement
    • C_1_Déplacement[mm]: cross-head displacement
    • Eng_Stress[MPa]: engineering stress
    • Eng_Strain[]: engineering strain
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: specimen surface temperature in degC

    The data can be loaded and used similarly to the downsampled data.
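    For reference, the true stress and strain columns follow from the engineering columns through the standard conversion relations (valid up to the onset of necking), which can be used to cross-check the files. The sample values below are invented; only the column names come from the description above.

```python
import numpy as np

# Hypothetical samples standing in for the Eng_Strain[] and Eng_Stress[MPa]
# columns of an unreduced data file.
eng_strain = np.array([0.00, 0.01, 0.05, 0.10])
eng_stress = np.array([0.0, 210.0, 420.0, 480.0])   # MPa

e_true = np.log1p(eng_strain)                 # true strain = ln(1 + e_eng)
sigma_true = eng_stress * (1.0 + eng_strain)  # true stress = s_eng * (1 + e_eng)
```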

    File Format: Overall_Summary

    The overall summary file provides data on all the test specimens in the database. The columns include:

    • hidden_index: internal reference ID
    • grade: material grade
    • spec: specifications for the material
    • source: base material for the test specimen
    • id: internal name for the specimen
    • lp: load protocol
    • size: type of specimen (M8, M12, M20)
    • gage_length_mm_: unreduced section length in mm
    • avg_reduced_dia_mm_: average measured diameter for the reduced section in mm
    • avg_fractured_dia_top_mm_: average measured diameter of the top fracture surface in mm
    • avg_fractured_dia_bot_mm_: average measured diameter of the bottom fracture surface in mm
    • fy_n_mpa_: nominal yield stress
    • fu_n_mpa_: nominal ultimate stress
    • t_a_deg_c_: ambient temperature in degC
    • date: date of test
    • investigator: person(s) who conducted the test
    • location: laboratory where test was conducted
    • machine: setup used to conduct test
    • pid_force_k_p, pid_force_t_i, pid_force_t_d: PID parameters for force control
    • pid_disp_k_p, pid_disp_t_i, pid_disp_t_d: PID parameters for displacement control
    • pid_extenso_k_p, pid_extenso_t_i, pid_extenso_t_d: PID parameters for extensometer control
    • citekey: reference corresponding to the Database_References.bib file
    • yield_stress_mpa_: computed yield stress in MPa
    • elastic_modulus_mpa_: computed elastic modulus in MPa
    • fracture_strain: computed average true strain across the fracture surface
    • c,si,mn,p,s,n,cu,mo,ni,cr,v,nb,ti,al,b,zr,sn,ca,h,fe: chemical compositions in units of %mass
    • file: file name of corresponding clean (downsampled) stress-strain data

    File Format: Summarized_Mechanical_Props_Campaign

    Meant to be loaded in Python as a pandas DataFrame with multi-indexing, e.g.,

    import pandas as pd

    tab1 = pd.read_csv('Summarized_Mechanical_Props_Campaign_' + date + version + '.csv',
                       index_col=[0, 1, 2, 3], skipinitialspace=True, header=[0, 1],
                       keep_default_na=False, na_values='')
    • citekey: reference in "Campaign_References.bib".
    • Grade: material grade.
    • Spec.: specifications (e.g., J2+N).
    • Yield Stress [MPa]: initial yield stress in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign
    • Elastic Modulus [MPa]: initial elastic modulus in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign
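    Once loaded this way, individual statistics are selected with two-level column keys. The frame below is a toy stand-in with an invented citekey and values; only the two-level column layout follows the description above (the real file also carries the Grade and Spec. index levels).

```python
import pandas as pd

# Toy frame mimicking the two-level column header of the campaign summary.
cols = pd.MultiIndex.from_product(
    [["Yield Stress [MPa]", "Elastic Modulus [MPa]"],
     ["size", "count", "mean", "coefvar"]])
tab1 = pd.DataFrame([[3, 3, 355.0, 0.02, 3, 3, 205000.0, 0.01]],
                    index=pd.Index(["hartloper2021"], name="citekey"),
                    columns=cols)

# Column access uses a (level-0, level-1) tuple key.
mean_fy = tab1[("Yield Stress [MPa]", "mean")]
```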

    Caveats

    • The files in the following directories were tested before the protocol was established. Therefore, only the true stress-strain is available for each:
      • A500
      • A992_Gr50
      • BCP325
      • BCR295
      • HYP400
      • S460NL
      • S690QL/25mm
      • S355J2_Plates/S355J2_N_25mm and S355J2_N_50mm
  16. INSPIRE Soil / Medium element contents in the subsoil BB - Vdataset - LDM

    • service.tib.eu
    Updated Feb 4, 2025
    (2025). INSPIRE Soil / Medium element contents in the subsoil BB - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/govdata_d83768b6-cd55-4f07-86b2-40ae9e39924e--1
    Explore at:
    Dataset updated
    Feb 4, 2025
    Description

    The interoperable INSPIRE dataset contains data from the LBGR on the mean element contents in the Brandenburg subsurface, transformed into the INSPIRE Annex schema Soil. The dataset is provided via interoperable view and download services.

  17. Soil Chemistry England and Wales (version 3)

    • data.europa.eu
    • metadata.bgs.ac.uk
    • +3more
    unknown
    Updated Oct 12, 2021
    British Geological Survey (BGS) (2021). Soil Chemistry England and Wales (version 3) [Dataset]. https://data.europa.eu/data/datasets/soil-chemistry-england-and-wales-version-3?locale=nl
    Explore at:
    unknownAvailable download formats
    Dataset updated
    Oct 12, 2021
    Dataset provided by
    British Geological Surveyhttps://www.bgs.ac.uk/
    Authors
    British Geological Survey (BGS)
    Area covered
    Wales, England
    Description

    This dataset has now been superseded; please see the Estimated Ambient Background Soil Chemistry England and Wales dataset. This dataset indicates the estimated topsoil Arsenic (As), Cadmium (Cd), Chromium (Cr), Nickel (Ni) and Lead (Pb) concentrations (mg kg-1) derived by spatial interpolation of the point-source urban soil PHE (potentially harmful elements) data. Urban soil geochemical data generally have large positive skewness coefficients, so they were transformed by taking natural logarithms. To overcome the bias associated with traditional measures of location (mean) and scale (standard deviation) for log-normal data, the inverse distance weighted (IDW) mean and standard deviation of log-transformed element concentrations were used for mapping the spatial variation in As, Cd, Cr, Ni and Pb concentrations. The soil chemistry data are based on GBASE (Geochemical Baseline Survey of the Environment) soil geochemical data where these are available. Elsewhere, stream sediment data are converted to surface-soil-equivalent potentially harmful element (PHE) concentrations. This dataset covers England and Wales, but data are available for the whole of Great Britain, with the exception of the London area, where an inadequate number of geochemical samples is available at the moment.
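    The interpolation idea described above (an IDW mean of log-transformed concentrations, back-transformed for mapping) can be sketched for a single target cell as follows. The sample points and values are invented, and the distance power of 2 is an assumption; the dataset documentation does not state the exact weighting function.

```python
import numpy as np

# Invented sample locations (km) and topsoil concentrations (mg/kg).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
conc = np.array([12.0, 30.0, 55.0])
target = np.array([0.5, 0.5])            # centre of the cell to estimate

d = np.linalg.norm(points - target, axis=1)
w = 1.0 / d**2                           # inverse-distance weights (power 2)
log_mean = np.sum(w * np.log(conc)) / np.sum(w)
estimate = np.exp(log_mean)              # back-transform to mg/kg
```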

  18. Data from: Dataset of Experimental Investigations of a Full-Scale Louvre...

    • data.niaid.nih.gov
    Updated Jan 27, 2025
    Bugenings, Laura Annabelle (2025). Dataset of Experimental Investigations of a Full-Scale Louvre Element [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14614813
    Explore at:
    Dataset updated
    Jan 27, 2025
    Dataset authored and provided by
    Bugenings, Laura Annabelle
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the raw data and the processed results of the experimental investigations of a full-scale louvre element conducted in September 2024 in the laboratories of the Department of Civil and Architectural Engineering, Aarhus University.

    The dataset is structured as follows:

    Dataset
    ¦   Result summary.xlsx
    ¦
    +---WHS 3ACH
    ¦       WHS_3ACH_CS_velocity.csv
    ¦       WHS_3ACH_DL1_temperature.csv
    ¦       WHS_3ACH_DL2_temperature.csv
    ¦       WHS_3ACH_DL3_temperature.csv
    ¦       WHS_3ACH_DL4_temperature.csv
    ¦       WHS_3ACH_flow_meter.csv
    ¦       WHS_3ACH_VIVO_velocity.csv
    ¦
    +---WHS 5ACH
    ¦       WHS_5ACH_CS_velocity.csv
    ¦       WHS_5ACH_DL1_temperature.csv
    ¦       WHS_5ACH_DL2_temperature.csv
    ¦       WHS_5ACH_DL3_temperature.csv
    ¦       WHS_5ACH_DL4_temperature.csv
    ¦       WHS_5ACH_flow_meter.csv
    ¦       WHS_5ACH_VIVO_velocity.csv
    ¦
    +---WHS 7ACH
    ¦       WHS_7ACH_CS_velocity.csv
    ¦       WHS_7ACH_DL1_temperature.csv
    ¦       WHS_7ACH_DL2_temperature.csv
    ¦       WHS_7ACH_DL3_temperature.csv
    ¦       WHS_7ACH_DL4_temperature.csv
    ¦       WHS_7ACH_flow_meter.csv
    ¦       WHS_7ACH_VIVO_velocity.csv
    ¦
    +---WOHS 3ACH
    ¦       WOHS_3ACH_CS_velocity.csv
    ¦       WOHS_3ACH_DL1_temperature.csv
    ¦       WOHS_3ACH_DL2_temperature.csv
    ¦       WOHS_3ACH_DL3_temperature.csv
    ¦       WOHS_3ACH_DL4_temperature.csv
    ¦       WOHS_3ACH_flow_meter.csv
    ¦       WOHS_3ACH_VIVO_velocity.csv
    ¦
    +---WOHS 5ACH
    ¦       WOHS_5ACH_CS_velocity.csv
    ¦       WOHS_5ACH_DL1_temperature.csv
    ¦       WOHS_5ACH_DL2_temperature.csv
    ¦       WOHS_5ACH_DL3_temperature.csv
    ¦       WOHS_5ACH_DL4_temperature.csv
    ¦       WOHS_5ACH_flow_meter.csv
    ¦       WOHS_5ACH_VIVO_velocity.csv
    ¦
    +---WOHS 7ACH
            WOHS_7ACH_CS_velocity.csv
            WOHS_7ACH_DL1_temperature.csv
            WOHS_7ACH_DL2_temperature.csv
            WOHS_7ACH_DL3_temperature.csv
            WOHS_7ACH_DL4_temperature.csv
            WOHS_7ACH_flow_meter.csv
            WOHS_7ACH_VIVO_velocity.csv
    

    The result summary contains 8 sheets with the following information:

    Overview:

    Measurement cases with target flow rate and heat source presence.

    The date of the experiment and the time period in which the data was averaged for the processed results.

    The allocation of the thermocouples to the dataloggers.

    The sensor location on the stands (temperature and velocity).

    The sensor location on the surfaces (temperature).

    The sensors used for the ice point references.

    The sensors used in the anteroom.

    The sensors used for the heat source.

    Graphical representation of sensor location and room.

    Calibration curves:

    Calibration curves for all thermocouples according to datalogger.

    WOHS/WHS:

    Mean temperature according to sensor, datalogger, location, height.

    Standard deviation according to sensor, datalogger, location, height.

    Mean velocity according to sensor, datalogger, location, height.

    Standard deviation according to sensor, datalogger, location, height.

    Turbulence intensity.

    u_u0: mean velocity at the sensor divided by the mean velocity at the flow meter.

    Mean temperature at flow meter.

    Mean velocity at flow meter.

    Mean flow rate at flow meter.

    Files with the ending _temperature.csv contain the following:

    Column 1 (datetime): date and time in ISO8601 format (YYYY-MM-DDThh:mm:ssZ)

    Column 2 (sensorname): temperature at sensor in °C

    Files with the ending _flow_meter.csv contain the following:

    Column 1 (datetime): date and time in ISO8601 format (YYYY-MM-DDThh:mm:ssZ)

    Column 2 (velocity): velocity at flow meter in m/s

    Column 3 (exhaust_temperature): temperature at flow meter in °C

    Column 4 (flow_rate): flow rate at flow meter in m³/h

    Files with the ending _CS_velocity.csv (CS stands for the comfort sense sensors) contain the following:

    Column 1 (datetime): date and time in ISO8601 format (YYYY-MM-DDThh:mm:ssZ)

    Columns 2-17 (sensorname): velocity at sensor in m/s

    Files with the ending _VIVO_velocity.csv contain the following:

    Column 1 (datetime): date and time in ISO8601 format (YYYY-MM-DDThh:mm:ssZ)

    Columns 2-7 (sensorname): velocity at sensor in m/s

    Note that for the VIVO system, each sensor logged its results individually, which means the measurements do not share the same time stamps. This leads to NA entries.
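    Read with pandas, these layouts parse directly. A minimal sketch (the inline sample rows are illustrative, not taken from the dataset) showing how the unaligned VIVO logging produces NA entries that should be skipped when averaging:

    ```python
    import io
    import pandas as pd

    # Illustrative sample in the documented layout: an ISO 8601 datetime column
    # followed by per-sensor velocity columns; empty cells appear where a VIVO
    # sensor has no sample at that time stamp.
    sample = io.StringIO(
        "datetime,V1,V2\n"
        "2024-09-02T10:00:00Z,0.12,\n"
        "2024-09-02T10:00:01Z,,0.18\n"
        "2024-09-02T10:00:02Z,0.14,0.20\n"
    )
    df = pd.read_csv(sample, parse_dates=["datetime"])

    # Per-sensor mean velocity; pandas skips the NA entries by default.
    sensor_means = df.drop(columns=["datetime"]).mean()
    ```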

  19. d

    Oregon Average Annual Mean Temperature 1991-2020

    • catalog.data.gov
    • data.oregon.gov
    • +3 more
    Updated May 17, 2025
    State of Oregon (2025). Oregon Average Annual Mean Temperature 1991-2020 [Dataset]. https://catalog.data.gov/dataset/oregon-average-annual-mean-temperature-1991-2020
    Explore at:
    Dataset updated
    May 17, 2025
    Dataset provided by
    State of Oregon
    Area covered
    Oregon
    Description

    This is a dataset download, not a document. The Open button will start the download. This data layer is an element of the Oregon GIS Framework. Monthly 30-year "normal" dataset covering Oregon, averaged over the climatological period 1991-2020. Contains spatially gridded average daily mean temperature at 800 m grid cell resolution. Distribution of the point measurements to the spatial grid was accomplished using the PRISM model, developed and applied by Dr. Christopher Daly of the PRISM Climate Group at Oregon State University. This dataset is available free of charge on the PRISM website.

  20. e

    DNS of Turbulent Heat Transfer in Impinging Jets at Different Reynolds and...

    • b2find.eudat.eu
    Updated May 31, 2024
    (2024). DNS of Turbulent Heat Transfer in Impinging Jets at Different Reynolds and Prandtl Numbers - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/4f1e3e94-2b31-5ded-ac96-44a8ea9f17dd
    Explore at:
    Dataset updated
    May 31, 2024
    Description

    The heat transfer between an impinging circular jet and a flat plate is studied by means of direct numerical simulations for different Prandtl numbers of the fluid. The thermal field is resolved for $Pr=1$, $0.72$, $0.025$, and $0.01$. The flow is incompressible and the temperature is treated as a passive scalar field. The jet originates from a fully developed turbulent pipe flow and impinges perpendicularly on a smooth solid heated plate placed at two pipe diameters' distance from the jet exit section. The Reynolds numbers based on the pipe diameter and bulk mean velocity in the pipe are set to $Re=5300$ and $Re=10000$. Inflow boundary conditions are enforced using a precursor simulation. Heat transfer at the wall is addressed through the Nusselt number distribution and main flow field statistics. At fixed Reynolds number it is shown that the Prandtl number influences the intensity of the Nusselt number at a given radial location, and that the Nusselt number distribution along the plate exhibits similar features at different Prandtl numbers. The characteristic secondary peak in the Nusselt number distribution is found for both Reynolds numbers for $Pr=0.025$ and $Pr=0.01$. All the simulations presented in this study were performed with the high-order spectral element code Nek5000.

    The data contain mean flow field statistics along the impingement wall of a turbulent impinging jet at $Re=5300$ and $Re=10000$, separated into directories for each Reynolds number case. For each case, mean flow statistics of the precursor pipe flow simulation are also reported. Data are stored as CSV ASCII files, and information about the columns is found directly in the files.

Laura Miron; Rafael Gonçalves; Mark A. Musen (2023). Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov" [Dataset]. http://doi.org/10.6084/m9.figshare.12743939.v2

Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov"


Description

This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Metadata in ClinicalTrials.gov.

Description of files

Original data files:
- AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. The set contains 302,091 records downloaded on April 3, 2019.
- public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.

BioPortal API query results:
- condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns = {filename, condition, url, bioportal term, cuis, tuis}.
- intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. Columns = {filename, intervention, url, bioportal term, cuis, tuis}.

Data element definitions:
- supplementary_table_1.xlsx maps element names, element types, and whether elements are required across the ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.XSD), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations.

Column and value definitions:
- CT.gov Data Dictionary Section: section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
- CT.gov Data Dictionary Element Name: name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html and https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
- CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value; "Group Heading" if the element is a group heading for several sub-fields but is not itself associated with a user-provided value.
- Required in CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary; "CR" if conditionally required; "Jan 2017" if required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule; "-" if the element is not applicable to interventional records (only observational or expanded access).
- Required in CT.gov for Observational Records: same coding, with "-" indicating the element is not applicable to observational records (only interventional or expanded access).
- Required in CT.gov for Expanded Access Records: same coding, with "-" indicating the element is not applicable to expanded access records (only interventional or observational).
- CT.gov XSD Element Definition: abbreviated XPath to the corresponding element in the ClinicalTrials.gov XSD (public.XSD). The full XPath prefixes every element with 'clinical_study/'. (There is a single top-level element called "clinical_study" for all other elements.)
- Required in XSD?: "Yes" if the element is required according to public.XSD; "No" if optional; "-" if the element is not made public or included in the XSD.
- Type in XSD: "text" if the XSD type was "xs:string" or "textblock"; the name of the enum if the type was an enum; "integer" if the type was "xs:integer" or "xs:integer" extended with the "type" attribute; "struct" if the type was a struct defined in the XSD.
- PRS Element Name: name of the corresponding entry field in the PRS system.
- PRS Entry Type: entry type in the PRS system. This column contains some free-text explanations/observations.
- FDAAA801 Final Rule Field Name: name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA.
- WHO Field Name: name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)

Analytical results:
- EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. The table gives the filename, the criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
- completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
- industry_completeness.xlsx contains percentages of interventional records missing required fields, broken up by agency class of the trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
- location_completeness.xlsx contains percentages of interventional records missing required fields, broken up by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.

Intermediate results:
- cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running analysis steps from the Jupyter notebooks in our GitHub repository.
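The BioPortal match files described above can be loaded straight into pandas using the documented column sets. A minimal sketch with hypothetical rows (the NCT filenames, terms, CUIs, and URLs below are invented for illustration):

```python
import io
import pandas as pd

# Hypothetical two-row extract in the documented column layout of
# condition_matches.csv: {filename, condition, url, bioportal term, cuis, tuis}.
sample = io.StringIO(
    "filename,condition,url,bioportal term,cuis,tuis\n"
    "NCT00000001.xml,Asthma,http://example.org/term/1,Asthma,C0004096,T047\n"
    "NCT00000002.xml,Obesity,http://example.org/term/2,Obesity,C0028754,T047\n"
)
matches = pd.read_csv(sample)

# Count how many condition strings resolved to an exact-match ontology term
n_matched = matches["bioportal term"].notna().sum()
```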
