86 datasets found

f
Dataset for fig 2 (NMDS)
figshare.com
txt
Updated Mar 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Angela Bartlett (2023). Dataset for fig 2 (NMDS) [Dataset]. http://doi.org/10.6084/m9.figshare.22361590.v2
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.22361590.v2
Dataset updated
Mar 30, 2023
Dataset provided by
figshare
Authors
Angela Bartlett
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset for fig 2 (NMDS)
d
Data from: Microbial volatile organic compounds mediate attraction by a...
catalog.data.gov
agdatacommons.nal.usda.gov
Updated Apr 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Agricultural Research Service (2025). Data from: Microbial volatile organic compounds mediate attraction by a primary but not secondary stored product insect pest in wheat [Dataset]. https://catalog.data.gov/dataset/data-from-microbial-volatile-organic-compounds-mediate-attraction-by-a-primary-but-not-sec-ce3b9
Explore at:
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Service
Description
This dataset is associated with the forthcoming publication entitled, "Microbial volatile organic compounds mediate attraction by a primary but not secondary stored product insect pest in wheat", and includes data on grain damage from near infrared spectroscopy, behavioral data from wind tunnel and release-recapture experiments, as well as volatile characterization of headspace from moldy grain. For all files, incubation intervals 9, 18, and 27 d represent how long grain was incubated after being tempered to a grain moisture of 12, 15, or 19% or left untempered (ctrl; 10.8% grain moisture). TSO = Trece storgard oil; empty = negative control (no stimulus), LGB = lesser grain borer (Rhzyopertha dominica), and RFB = red flour beetle (Tribolium castaneum). Note: The resource 'GC/MS Grain MVOC Headspace Data' was added 2021-08-04 with the deletion of some compounds as unlikely natural compounds and potential contaminants. This is the dataset that undergirds the non-metric multidimensional scaling analysis. See the included file list for more information about methods and results of each file in this dataset. Resources in this dataset:Resource Title: GC-MS/Headspace Data. File Name: tvw_final_gc_ms_data.csvResource Software Recommended: Microsoft Excel,url: https://www.microsoft.com/en-us/microsoft-365/excel Resource Title: Microbial damage on wheat evaluated with near-infrared spectroscopy. File Name: tvw_nearinfrared_sorting_damaged_grain_fungal_exp.csvResource Software Recommended: Microsoft Excel,url: https://www.microsoft.com/en-us/microsoft-365/excel Resource Title: Release-Recapture Datasets with LGB & RFB. File Name: tvw_rr_lgb_rfb_microbial_cues.csvResource Software Recommended: Microsoft Excel,url: https://www.microsoft.com/en-us/microsoft-365/excel Resource Title: Wind tunnel response by RGB & LGB. File Name: tvw_wt_lgb_rfb_data_microbial_cues.csvResource Software Recommended: Microsoft Excel,url: https://www.microsoft.com/en-us/microsoft-365/excel Resource Title: GC/MS Grain MVOC Headspace Data. File Name: taylor_headspace_final_data_peer_reviewed_ag_commons.csvResource Software Recommended: Microsoft Excel,url: https://www.microsoft.com/en-us/microsoft-365/excel Resource Title: README file list. File Name: file_list_MVOCwheat.txt
d
Non-metric multidimensional scaling analyses species scores of ostracode...
search.dataone.org
doi.pangaea.de
Updated Feb 14, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yamaguchi, Tatsuhiko; Bornemann, André; Matsui, Hiroki; Nishi, Hiroshi (2018). Non-metric multidimensional scaling analyses species scores of ostracode taxa from the Maastrichtian-Thanetian samples [Dataset]. http://doi.org/10.1594/PANGAEA.872186
Explore at:
Unique identifier
https://doi.org/10.1594/PANGAEA.872186
Dataset updated
Feb 14, 2018
Dataset provided by
PANGAEA Data Publisher for Earth and Environmental Science
Authors
Yamaguchi, Tatsuhiko; Bornemann, André; Matsui, Hiroki; Nishi, Hiroshi
Area covered

Description
No description is available. Visit https://dataone.org/datasets/868c087baba217e46370fe2f9dd37505 for complete metadata about this dataset.
B
Cranial Nonmetric Trait Database, 2013
borealisdata.ca
search.dataone.org
Updated May 12, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nancy Ossenberg (2021). Cranial Nonmetric Trait Database, 2013 [Dataset]. http://doi.org/10.5683/SP3/FUYXAS
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/FUYXAS
Dataset updated
May 12, 2021
Dataset provided by
Borealis
Authors
Nancy Ossenberg
License
https://borealisdata.ca/api/datasets/:persistentId/versions/7.6/customlicense?persistentId=doi:10.5683/SP3/FUYXAShttps://borealisdata.ca/api/datasets/:persistentId/versions/7.6/customlicense?persistentId=doi:10.5683/SP3/FUYXAS
Area covered
South America, Greenland, Northeast Asia, United States, Japan, Africa, Canada, Armenia, Italy, Iceland
Dataset funded by
The Canada Council
Boreal Institute of the University of Alberta
The National Science and Engineering Research Council
Description
A database of cranial measurements covering the Arctic and Northwestern North America as well as Northeast Asia, Eurasia, Africa, and the South Pacific.
f
Frail or hale: Skeletal frailty indices in Medieval London skeletons
plos.figshare.com
figshare.com
docx
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kathryn E. Marklein; Douglas E. Crews (2023). Frail or hale: Skeletal frailty indices in Medieval London skeletons [Dataset]. http://doi.org/10.1371/journal.pone.0176025
Explore at:
docxAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0176025
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Kathryn E. Marklein; Douglas E. Crews
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
London
Description
To broaden bioarchaeological applicability of skeletal frailty indices (SFIs) and increase sample size, we propose indices with fewer biomarkers (2–11 non-metric biomarkers) and compare these reduced biomarker SFIs to the original metric/non-metric 13-biomarker SFI. From the 2-11-biomarker SFIs, we choose the index with the fewest biomarkers (6-biomarker SFI), which still maintains the statistical robusticity of a 13-biomarker SFI, and apply this index to the same Medieval monastic and nonmonastic populations, albeit with an increased sample size. For this increased monastic and nonmonastic sample, we also propose and implement a 4-biomarker SFI, comprised of biomarkers from each of four stressor categories, and compare these SFI distributions with those of the non-metric biomarker SFIs. From the Museum of London WORD database, we tabulate multiple SFIs (2- to 13-biomarkers) for Medieval monastic and nonmonastic samples (N = 134). We evaluate associations between these ten non-metric SFIs and the 13-biomarker SFI using Spearman’s correlation coefficients. Subsequently, we test non-metric 6-biomarker and 4-biomarker SFI distributions for associations with cemetery, age, and sex using Analysis of Variance/Covariance (ANOVA/ANCOVA) on larger samples from the monastic and nonmonastic cemeteries (N = 517). For Medieval samples, Spearman’s correlation coefficients show a significant association between the 13-biomarker SFI and all non-metric SFIs. Utilizing a 6-biomarker and parsimonious 4-biomarker SFI, we increase the nonmonastic and monastic samples and demonstrate significant lifestyle and sex differences in frailty that were not observed in the original, smaller sample. Results from the 6-biomarker and parsimonious 4-biomarker SFIs generally indicate similarities in means, explained variation (R2), and associated P-values (ANOVA/ANCOVA) within and between nonmonastic and monastic samples. We show that non-metric reduced biomarker SFIs provide alternative indices for application to other bioarchaeological collections. These findings suggest that a SFI, comprised of six or more non-metric biomarkers available for the specific sample, may have greater applicability than, but comparable statistical characteristics to, the originally proposed 13-biomarker SFI.
Data from: Non-metric multidimensional scaling analyses of ostracode taxa...
doi.pangaea.de
html, tsv
Updated Feb 16, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tatsuhiko Yamaguchi; André Bornemann; Hiroki Matsui; Hiroshi Nishi (2017). Non-metric multidimensional scaling analyses of ostracode taxa from the Maastrichtian-Thanetian samples at IODP Site 342-U1407 [Dataset]. http://doi.org/10.1594/PANGAEA.872185
Explore at:
html, tsvAvailable download formats
Unique identifier
https://doi.org/10.1594/PANGAEA.872185
Dataset updated
Feb 16, 2017
Dataset provided by
PANGAEA
Authors
Tatsuhiko Yamaguchi; André Bornemann; Hiroki Matsui; Hiroshi Nishi
License
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Area covered

Variables measured
Axis 1, Axis 2, Rarefaction, Depth, top/min, Depth, bottom/max, DEPTH, sediment/rock, Shannon Diversity Index, Rarefaction, standard deviation
Description
This dataset is about: Non-metric multidimensional scaling analyses of ostracode taxa from the Maastrichtian-Thanetian samples at IODP Site 342-U1407. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.872187 for more information.
d
Stream Channel and Floodplain Metric Toolbox
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2024). Stream Channel and Floodplain Metric Toolbox [Dataset]. https://catalog.data.gov/dataset/stream-channel-and-floodplain-metric-toolbox
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Description
The Stream Channel and Floodplain Metric Toolbox was developed to demonstrate the feasibility of mapping fluvial geomorphic features from high-resolution bare-earth elevation data. A Python toolbox for ArcGIS was built to calculate key metrics describing channel and floodplain geometry. Channel and Floodplain Metric Toolbox provides this ability in an automated fashion, allowing for regional analyses based solely on digital elevation models (DEMs). This manual describes the general operation of the toolbox and technical details describing the specific algorithms. The toolbox works best in a watershed no larger than HUC 10 (< 1,000 sq. km).
f
Goodness-of-fit filtering in classical metric multidimensional scaling with...
tandf.figshare.com
pdf
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jan Graffelman (2023). Goodness-of-fit filtering in classical metric multidimensional scaling with large datasets [Dataset]. http://doi.org/10.6084/m9.figshare.11389830.v1
Explore at:
pdfAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11389830.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
Jan Graffelman
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Metric multidimensional scaling (MDS) is a widely used multivariate method with applications in almost all scientific disciplines. Eigenvalues obtained in the analysis are usually reported in order to calculate the overall goodness-of-fit of the distance matrix. In this paper, we refine MDS goodness-of-fit calculations, proposing additional point and pairwise goodness-of-fit statistics that can be used to filter poorly represented observations in MDS maps. The proposed statistics are especially relevant for large data sets that contain outliers, with typically many poorly fitted observations, and are helpful for improving MDS output and emphasizing the most important features of the dataset. Several goodness-of-fit statistics are considered, and both Euclidean and non-Euclidean distance matrices are considered. Some examples with data from demographic, genetic and geographic studies are shown.
4
Data and code underlying the publication "Alternative climate metrics to the...
data.4tu.nl
zip
Updated May 8, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Liam Megill; Kathrin Deck; Volker Grewe (2024). Data and code underlying the publication "Alternative climate metrics to the Global Warming Potential are more suitable for assessing aviation non-CO2 effects" [Dataset]. http://doi.org/10.4121/344e24ad-b2f5-4ed9-8d49-6efa2081d30c.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.4121/344e24ad-b2f5-4ed9-8d49-6efa2081d30c.v1
Dataset updated
May 8, 2024
Dataset provided by
4TU.ResearchData
Authors
Liam Megill; Kathrin Deck; Volker Grewe
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset funded by
German Federal Government
Airbus / DLR
Clean Sky 2 Joint Undertaking
Description
Data and code accompanying Megill et al. (2024): "Alternative climate metrics to the Global Warming Potential are more suitable for assessing aviation non-CO2 effects", published in Communications Earth & Environment. This dataset contains all data and code developed during research towards the linked article and contains all elements required to reproduce the linked figures and analysis.
The data was generated using the V2.1 of the climate-chemistry response model AirClim (Grewe and Stenke, 2008; Dahlmann et al., 2016, see References). The dataset includes analyses of the full aviation industry (scenarios CORSIA, COVID15s, CurTec, Fa1 and FP2050 from Grewe et al., 2021, see References) and of individual, theoretical aircraft designs of Category 4 (152 - 201 seats). Trajectory data is taken from DLR WeCare (Grewe et al., 2017, see References). The analysis of the results is performed with four Jupyter notebooks running Python.
All data files are licensed under a CC-BY 4.0. All Jupyter notebooks and the Python script are licensed under an Apache License v2.0.
Please note: The software code AirClim is confidential proprietary information of the DLR and cannot be made available to the public or readers without restrictions. Licensing of the code to third parties is conditioned upon the prior conclusion of a licensing agreement with the DLR. Qualified researchers can request an agreement on reasonable request from the corresponding author.
Heard Island, Laurens Peninsula, Coastal Orthophoto Mosaic derived from...
data.gov.au
data.aad.gov.au
+2more
cfm, pdf, shp
Updated Dec 11, 2015
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Australian Antarctic Division (2015). Heard Island, Laurens Peninsula, Coastal Orthophoto Mosaic derived from Non-Metric Photography [Dataset]. https://data.gov.au/data/dataset/aad-photo-mosaic-laurens-or
Explore at:
cfm, shp, pdfAvailable download formats
Dataset updated
Dec 11, 2015
Dataset provided by
Australian Antarctic Divisionhttps://www.antarctica.gov.au/
License
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Area covered
Heard Island and McDonald Islands
Description
The orthophoto mosaic is a rectified georeferenced image of the Heard Island, Laurens Peninsula Coastal Area. Distortions due to relief and tilt displacement have been removed. Orthophotos were derived from non-metric cameras (focal length unknown).
g
Vision Zero Benchmarking
gimi9.com
data.sfgov.org
Updated Jun 13, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). Vision Zero Benchmarking [Dataset]. https://gimi9.com/dataset/data-gov_vision-zero-benchmarking/
Explore at:
Dataset updated
Jun 13, 2024
Description
A. SUMMARY This dataset contains the underlying data for the Vision Zero Benchmarking website. Vision Zero is the collaborative, citywide effort to end traffic fatalities in San Francisco. The goal of this benchmarking effort is to provide context to San Francisco’s work and progress on key Vision Zero metrics alongside its peers. The Controller's Office City Performance team collaborated with the San Francisco Municipal Transportation Agency, the San Francisco Department of Public Health, the San Francisco Police Department, and other stakeholders on this project. B. HOW THE DATASET IS CREATED The Vision Zero Benchmarking website has seven major metrics. The City Performance team collected the data for each metric separately, cleaned it, and visualized it on the website. This dataset has all seven metrics and some additional underlying data. The majority of the data is available through public sources, but a few data points came from the peer cities themselves. C. UPDATE PROCESS This dataset is for historical purposes only and will not be updated. To explore more recent data, visit the source website for the relevant metrics. D. HOW TO USE THIS DATASET This dataset contains all of the Vision Zero Benchmarking metrics. Filter for the metric of interest, then explore the data. Where applicable, datasets already include a total. For example, under the Fatalities metric, the "Total Fatalities" category within the metric shows the total fatalities in that city. Any calculations should be reviewed to not double-count data with this total. E. RELATED DATASETS N/A
Heard Island Topographic Mapping from Orthophotos derived from Non-Metric...
data.wu.ac.at
data.aad.gov.au
+3more
cfm, pdf, shp
Updated Dec 11, 2015
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Australian Antarctic Division (2015). Heard Island Topographic Mapping from Orthophotos derived from Non-Metric Photography [Dataset]. https://data.wu.ac.at/schema/data_gov_au/ZTkwODRlZTItYjYyOC00YWYxLTlmM2QtODYxNjMwMjE4YWNj
Explore at:
shp, cfm, pdfAvailable download formats
Dataset updated
Dec 11, 2015
Dataset provided by
Australian Antarctic Divisionhttps://www.antarctica.gov.au/
License
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Area covered
a587ef6d157389b6af3ee660aa9fa626be1922d0, Heard Island and McDonald Islands
Description
The Heard Island Topographic Data was mapped from Ortho-rectified non-metric photography. The data consists of Coastline, Glacier, Lagoon, Offshore Rocks, Water Storage and Watercourse datasets digitised from the photography, all of which are available for download at the url given below.
d
Long-term monotonic trends in annual and monthly streamflow metrics at...
catalog.data.gov
data.usgs.gov
+1more
Updated Oct 5, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2024). Long-term monotonic trends in annual and monthly streamflow metrics at streamgages in the United States (ver. 2.0, October 2024) [Dataset]. https://catalog.data.gov/dataset/long-term-monotonic-trends-in-annual-and-monthly-streamflow-metrics-at-streamgages-in-the-
Explore at:
Dataset updated
Oct 5, 2024
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Area covered
United States
Description
The U.S. Geological Survey (USGS) Water Resources Mission Area (WMA) is working to address a need to understand where the Nation is experiencing water shortages or surpluses relative to the demand for water need by delivering routine assessments of water supply and demand and an understanding of the natural and human factors affecting the balance between supply and demand. A key part of these national assessments is identifying long-term trends in water availability, including groundwater and surface water quantity, quality, and use. This data release contains Mann-Kendall monotonic trend analyses for 18 observed annual and monthly streamflow metrics at 6,347 U.S. Geological Survey streamgages located in the conterminous United States, Alaska, Hawaii, and Puerto Rico. Streamflow metrics include annual mean flow, maximum 1-day and 7-day flows, minimum 7-day and 30-day flows, and the date of the center of volume (the date on which 50% of the annual flow has passed by a gage), along with the mean flow for each month of the year. Annual streamflow metrics are computed from mean daily discharge records at U.S. Geological Survey streamgages that are publicly available from the National Water Information System (NWIS). Trend analyses are computed using annual streamflow metrics computed through climate year 2022 (April 2022- March 2023) for low-flow metrics and water year 2022 (October 2021 - September 2022) for all other metrics. Trends at each site are available for up to four different periods: (i) the longest possible period that meets completeness criteria at each site, (ii) 1980-2020, (iii) 1990-2020, (iv) 2000-2020. Annual metric time series analyzed for trends must have 80 percent complete records during fixed periods. In addition, each of these time series must have 80 percent complete records during their first and last decades. All longest possible period time series must be at least 10 years long and have annual metric values for at least 80% of the years running from 2013 to 2022. This data release provides the following five CSV output files along with a model archive: (1) streamflow_trend_results.csv - contains test results of all trend analyses with each row representing one unique combination of (i) NWIS streamgage identifiers, (ii) metric (computed using Oct 1 - Sep 30 water years except for low-flow metrics computed using climate years (Apr 1 - Mar 31), (iii) trend periods of interest (longest possible period through 2022, 1980-2020, 1990-2020, 2000-2020) and (iv) records containing either the full trend period or only a portion of the trend period following substantial increases in cumulative upstream reservoir storage capacity. This is an output from the final process step (#5) of the workflow. (2) streamflow_trend_trajectories_with_confidence_bands.csv - contains annual trend trajectories estimated using Theil-Sen regression, which estimates the median of the probability distribution of a metric for a given year, along with 90 percent confidence intervals (5th and 95h percentile values). This is an output from the final process step (#5) of the workflow. (3) streamflow_trend_screening_all_steps.csv - contains the screening results of all 7,873 streamgages initially considered as candidate sites for trend analysis and identifies the screens that prevented some sites from being included in the Mann-Kendall trend analysis. (4) all_site_year_metrics.csv - contains annual time series values of streamflow metrics computed from mean daily discharge data at 7,873 candidate sites. This is an output of Process Step 1 in the workflow. (5) all_site_year_filters.csv - contains information about the completeness and quality of daily mean discharge at each streamgage during each year (water year, climate year, and calendar year). This is also an output of Process Step 1 in the workflow and is combined with all_site_year_metrics.csv in Process Step 2. In addition, a .zip file contains a model archive for reproducing the trend results using R 4.4.1 statistical software. See the README file contained in the model archive for more information. Caution must be exercised when utilizing monotonic trend analyses conducted over periods of up to several decades (and in some places longer ones) due to the potential for confounding deterministic gradual trends with multi-decadal climatic fluctuations. In addition, trend results are available for post-reservoir construction periods within the four trend periods described above to avoid including abrupt changes arising from the construction of larger reservoirs in periods for which gradual monotonic trends are computed. Other abrupt changes, such as changes to water withdrawals and wastewater return flows, or episodic disturbances with multi-year recovery periods, such as wildfires, are not evaluated. Sites with pronounced abrupt changes or other non-monotonic trajectories of change may require more sophisticated trend analyses than those presented in this data release.
Fitness Tracker Dataset
kaggle.com
Updated Dec 1, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nadeem Majeed (2024). Fitness Tracker Dataset [Dataset]. https://www.kaggle.com/datasets/nadeemajeedch/fitness-tracker-dataset/discussion
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 1, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Nadeem Majeed
License
https://cdla.io/sharing-1-0/https://cdla.io/sharing-1-0/
Description
The Fitness Tracker Dataset contains detailed information about individuals' fitness metrics, exercise routines, and health parameters. This dataset is designed to provide insights into fitness trends, workout habits, and overall health patterns. It is ideal for exploratory data analysis (EDA), machine learning applications, and health analytics. The dataset can help identify relationships between physical activity, body metrics, and health outcomes.

Features: Age: Age of the individual in years. Gender:Gender of the individual (e.g., Male, Female). Weight (kg): Weight of the individual in kilograms. Height (m): Height of the individual in meters. Max_BPM:Maximum heartbeats per minute recorded during exercise. Avg_BPM: Average heartbeats per minute during a workout session. Resting_BPM:Resting heartbeats per minute. Session_Duration (hours):Duration of the workout session in hours. Calories_Burned:Total calories burned during a workout session. Workout_Type:Type of workout performed (e.g., Cardio, Strength, Yoga). Fat_Percentage:Percentage of body fat. Water_Intake (liters):Water intake in liters during or after the workout. Workout_Frequency (days/week): Number of days per week the individual exercises. Experience_Level:Level of fitness experience (e.g., Beginner, Intermediate, Advanced). BMI:Body Mass Index, calculated as weight (kg) / height (m)^2.

##Usage: This dataset is suitable for: - Analyzing the impact of fitness routines on health metrics. Exploring trends in heart rate, calorie burn, and workout habits. Correlating body metrics like BMI and fat percentage with exercise patterns. Building predictive models for fitness and health analytics. This is a synthetic dataset created for educational and analytical purposes and does not represent real-world data.
Data from: The FAIR Assessment Conundrum: Reflections on Tools and Metrics -...
zenodo.org
data.niaid.nih.gov
csv
Updated Apr 17, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Leonardo Candela; Leonardo Candela; Dario Mangione; Dario Mangione; Gina Pavone; Gina Pavone (2024). The FAIR Assessment Conundrum: Reflections on Tools and Metrics - Data Set [Dataset]. http://doi.org/10.5281/zenodo.10986748
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10986748
Dataset updated
Apr 17, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Leonardo Candela; Leonardo Candela; Dario Mangione; Dario Mangione; Gina Pavone; Gina Pavone
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data sets accompanying the paper "The FAIR Assessment Conundrum: Reflections on Tools and Metrics", an analysis of a comprehensive set of FAIR assessment tools and the metrics used by these tools for the assessment.

The data set "metrics.csv" consists of the metrics collected from several sources linked to the analysed FAIR assessments tools. It is structured into 11 columns: (i) tool_id, (ii) tool_name, (iii) metric_discarded, (iv) metric_fairness_scope_declared, (v) metric_fairness_scope_observed, (vi) metric_id, (vii) metric_text, (viii) metric_technology, (ix) metric_approach, (x) last_accessed_date, and (xi) provenance.

The columns tool_id and tool_name are used for the identifier we assigned to each tool analysed and the full name of the tool respectively.

The metric_discarded column refers to the selection we operated on the collected metrics, since we excluded the metrics created for testing purposes or written in a language different from English. The possible values are boolean. We assigned TRUE if the metric was discarded.

The columns metric_fairness_scope_declared and metric_fairness_scope_observed are used for indicating the declared intent of the metrics, with respect to the FAIR principle assessed, and the one we observed respectively. Possible values are: (a) a letter of the FAIR acronym (for the metrics without a link declared to a specific FAIR principle), (b) one or more identifiers of the FAIR principles (F1, F2…), (c) n/a, if no FAIR references were declared, or (d) none, if no FAIR references were observed.

The metric_id and metric_text columns are used for the identifiers of the metrics and the textual and human-oriented content of the metrics respectively.

The column metric_technology is used for enumerating the technologies (a term used in its widest acceptation) mentioned or used by the metrics for the specific assessment purpose. Such technologies include very diverse typologies ranging from (meta)data formats to standards, semantic technologies, protocols, and services. For tools implementing automated assessments, the technologies listed take into consideration also the available code and documentation, not just the metric text.

The column metric_approach is used for identifying the type of implementation observed in the assessments. The identification of the implementation types followed a bottom-to-top approach applied to the metrics organised by the metric_fairness_scope_declared values. Consequently, while the labels used for creating the implementation type strings are the same, their combination and specialisation varies based on the characteristics of the actual set of metrics analysed. The main labels used are: (a) 3rd party service-based, (b) documentation-centred, (c) format-centred, (d) generic, (e) identifier-centred, (f) policy-centred, (g) protocol-centred, (h) metadata element-centred, (i) metadata schema-centred, (j) metadata value-centred, (k) service-centred, and (l) na.

The columns provenance and last_accessed_date are used for the main source of information about each metric (at least with regard to the text) and the date we last accessed it respectively.

The data set "classified_technologies.csv" consists of the technologies mentioned or used by the metrics for the specific assessment purpose. It is structured into 3 columns: (i) technology, (ii) class, and (iii) discarded.

The column technology is used for the names of the different technologies mentioned or used by the metrics.

The column class is used for specifying the type of technology used. Possible values are: (a) application programming interface, (b) format, (c) identifier, (d) library, (e) licence, (f) protocol, (g) query language, (h) registry, (i) repository, (j) search engine, (k) semantic artefact, and (l) service.

The discarded column refers to the exclusion of the value 'linked data' from the accepted technologies since it is too generic. The possible values are boolean. We assigned TRUE if the technology was discarded.
Z
Data from: Multi-Source Distributed System Data for AI-powered Analytics
data.niaid.nih.gov
explore.openaire.eu
+1more
Updated Nov 10, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jorge Cardoso (2022). Multi-Source Distributed System Data for AI-powered Analytics [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3484800
Explore at:
Dataset updated
Nov 10, 2022
Dataset provided by
Jorge Cardoso
Sasho Nedelkoski
Jasmin Bogatinovski
Ajay Kumar Mandapati
Soeren Becker
Odej Kao
Description
Abstract:

In recent years there has been an increased interest in Artificial Intelligence for IT Operations (AIOps). This field utilizes monitoring data from IT systems, big data platforms, and machine learning to automate various operations and maintenance (O&M) tasks for distributed systems. The major contributions have been materialized in the form of novel algorithms. Typically, researchers took the challenge of exploring one specific type of observability data sources, such as application logs, metrics, and distributed traces, to create new algorithms. Nonetheless, due to the low signal-to-noise ratio of monitoring data, there is a consensus that only the analysis of multi-source monitoring data will enable the development of useful algorithms that have better performance.
Unfortunately, existing datasets usually contain only a single source of data, often logs or metrics. This limits the possibilities for greater advances in AIOps research. Thus, we generated high-quality multi-source data composed of distributed traces, application logs, and metrics from a complex distributed system. This paper provides detailed descriptions of the experiment, statistics of the data, and identifies how such data can be analyzed to support O&M tasks such as anomaly detection, root cause analysis, and remediation.

General Information:

This repository contains the simple scripts for data statistics, and link to the multi-source distributed system dataset.

You may find details of this dataset from the original paper:

Sasho Nedelkoski, Ajay Kumar Mandapati, Jasmin Bogatinovski, Soeren Becker, Jorge Cardoso, Odej Kao, "Multi-Source Distributed System Data for AI-powered Analytics". [link very soon]

If you use the data, implementation, or any details of the paper, please cite!

The multi-source/multimodal dataset is composed of distributed traces, application logs, and metrics produced from running a complex distributed system (Openstack). In addition, we also provide the workload and fault scripts together with the Rally report which can serve as ground truth (all at the Zenodo link below). We provide two datasets, which differ on how the workload is executed. The openstack_multimodal_sequential_actions is generated via executing workload of sequential user requests. The openstack_multimodal_concurrent_actions is generated via executing workload of concurrent user requests.

The difference of the concurrent dataset is that:

Due to the heavy load on the control node, the metric data for wally113 (control node) is not representative and we excluded it.

Three rally actions are executed in parallel: boot_and_delete, create_and_delete_networks, create_and_delete_image, whereas for the sequential there were 5 actions executed.

The raw logs in both datasets contain the same files. If the user wants the logs filetered by time with respect to the two datasets, should refer to the timestamps at the metrics (they provide the time window). In addition, we suggest to use the provided aggregated time ranged logs for both datasets in CSV format.

Important: The logs and the metrics are synchronized with respect time and they are both recorded on CEST (central european standard time). The traces are on UTC (Coordinated Universal Time -2 hours). They should be synchronized if the user develops multimodal methods.

Our GitHub repository can be found at: https://github.com/SashoNedelkoski/multi-source-observability-dataset/
Weekly Hospital Respiratory Data (HRD) Metrics by Jurisdiction, National...
data.cdc.gov
healthdata.gov
+1more
application/rdfxml +5
Updated Jun 27, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CDC Division of Healthcare Quality Promotion (DHQP) Surveillance Branch, National Healthcare Safety Network (NHSN) (2025). Weekly Hospital Respiratory Data (HRD) Metrics by Jurisdiction, National Healthcare Safety Network (NHSN) [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/Weekly-Hospital-Respiratory-Data-HRD-Metrics-by-Ju/ua7e-t2fy
Explore at:
xml, application/rdfxml, application/rssxml, csv, json, tsvAvailable download formats
Dataset updated
Jun 27, 2025
Dataset provided by
Centers for Disease Control and Preventionhttp://www.cdc.gov/
Authors
CDC Division of Healthcare Quality Promotion (DHQP) Surveillance Branch, National Healthcare Safety Network (NHSN)
License
https://www.usa.gov/government-workshttps://www.usa.gov/government-works
Description
This dataset represents weekly hospital respiratory data and metrics aggregated to national and state/territory levels reported to CDC’s National Health Safety Network (NHSN) beginning August 2020. Data for reporting dates through April 30, 2024 represent data reported during a previous mandated reporting period as specified by the HHS Secretary. Data for reporting dates May 1, 2024 – October 31, 2024 represent voluntarily reported data in the absence of a mandate. Data for reporting dates beginning November 1, 2024 represent data reported during a current mandated reporting period. All data and metrics capturing information on respiratory syncytial virus (RSV) were voluntarily reported until November 1, 2024. All data included in this dataset represent aggregated counts, and include metrics capturing information specific to hospital capacity, occupancy, hospitalizations, and new hospital admissions with corresponding metrics indicating reporting coverage for a given reporting week. NHSN monitors national and local trends in healthcare system stress and capacity for all acute care and critical access hospitals in the United States.

For more information on the reporting mandate per the Centers for Medicare and Medicaid Services (CMS) requirements, visit: Updates to the Condition of Participation (CoP) Requirements for Hospitals and Critical Access Hospitals (CAHs) To Report Acute Respiratory Illnesses.

For more information regarding NHSN’s collection of these data, including full reporting guidance, visit: NHSN Hospital Respiratory Data.

Source: CDC National Healthcare Safety Network (NHSN).
Data source description (updated November 15, 2024): As of October 9, 2024, Hospital Respiratory Data (HRD; formerly Respiratory Pathogen, Hospital Capacity, and Supply data or 'COVID-19 hospital data') are reported to HHS through CDC's National Healthcare Safety Network (NHSN) based on updated requirements from the Centers for Medicare and Medicaid Services (CMS). These data were voluntarily reported to NHSN May 1, 2024 until November 1, 2024, at which time CMS began requiring acute care and critical access hospitals to electronically report information via NHSN about COVID-19, influenza, and RSV, hospital bed census and capacity. Hospital bed capacity and occupancy data for all patients and for patients with COVID-19 or influenza for collection dates prior to May 1, 2024, represent data reported during a previously mandated reporting period as specified by the HHS Secretary, and data for collection dates May 1, 2024 – October 31, 2024 represent data reported voluntarily to NHSN. All RSV data through October 31, 2024 represent voluntarily reported data; as such, all voluntarily reported data included in this dataset represent reporting hospitals only for a given week and might not be complete or representative of all hospitals during the specified reporting periods.
NHSN monitors national and local trends in healthcare system stress and capacity for all acute care and critical access hospitals in the United States. Data reported by hospitals to NHSN represent aggregated counts and include metrics capturing information specific to hospital capacity, occupancy, hospitalizations, and admissions. Find more information about reporting to NHSN: https://www.cdc.gov/nhsn/psc/hospital-respiratory-reporting.html.
Data quality: While CDC reviews reported data for completeness and errors and corrects those found, some reporting errors might still exist within the data. CDC and partners work with reporters to correct these errors and update the data in subsequent weeks. Data reported as of December 1, 2020 are subject to thorough, routine data quality review procedures, including identifying and excluding invalid values from metric calculations and application of error correction methodology; data prior to this date may have anomalies that are not yet resolved. Data prior to August 1, 2020, are unavailable. As a result of data quality implementation and submission of any backfilled data, data and metrics might fluctuate or change week-over-week after initial posting.
Inclusion criteria and metric calculations:
Facility types and status: Many hospital subtypes, including acute care and critical access hospitals, are included in the metric calculations displayed on this page. Psychiatric, rehabilitation, and religious non-medical hospital types are excluded from calculations. Number of reporting hospitals is determined based on the NHSN unique hospital identifier and not aggregated to the CMS certification number (CCN). Only hospitals indicated as active reporters in NHSN are included.
For occupancy metrics through week ending October 5, 2024: hospitals that reported those data at least one day during a given week are included in the metric calculation, which are displayed as weekly averages.
For occupancy metrics beginning week ending October 12, 2024: hospitals that reported those data for Wednesday during a given week are included in the metric calculation, which are displayed as single day (i.e. Wednesday) values.
For new hospital admissions metrics through week ending October 5, 2024: hospitals that reported those data at least one day during a given week are included in the metric calculation, which are displayed as weekly totals. Under previous reporting requirements, new hospital admissions data were reported daily to NHSN, as the number of new hospital admissions for the previous day.
For new hospital admissions metrics beginning week ending October 12, 2024: hospitals that reported those data for an entire reporting week are included in the metric calculation, which are displayed as weekly totals. Under current reporting requirements, new admissions data are reported to represent the number of new admissions occurring on a given reporting date (rather than previous day) or during a given reporting week.
Find full details on NHSN Hospital Respiratory Data (HRD) reporting guidance, including additional information on bed type definitions at https://www.cdc.gov/nhsn/psc/hospital-respiratory-reporting.html.

Archived datasets updated during the mandatory hospital reporting period from August 1, 2020, to April 30, 2024:
https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-COVID-19-Hospitalization-Metr/akn2-qxic/about_data
https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-COVID-19-Hospitalization-Metr/82ci-krud/about_data
https://data.cdc.gov/Public-Health-Surveillance/Respiratory-Virus-Response-RVR-United-States-Hospi/9t9r-e5a3/about_data
https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-COVID-19-Hospitalization-Metr/7dk4-g6vg/about_data
https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-Hospitalization-Metrics-by-/39z2-9zu6/about_data

Archived datasets updated during the voluntary hospital reporting period from May 1, 2024, to October 31, 2024:
https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-COVID-19-Hospitalization-Metr/akn2-qxic/about_data
https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-Hospitalization-Metrics-by-Ju/ype6-idgy

Note: June 13th, 2025: Data for American Samoa (AS) for the June 1st, 2025 through June 7th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on June 13th, 2025.

June 6th, 2025: Data for American Samoa (AS) for the May 25th, 2025 through May 31th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on June 6th, 2025.

May 30th, 2025: Data for American Samoa (AS) for the May 18th, 2025 through May 24th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on May 30th, 2025.

May 23rd, 2025: Data for American Samoa (AS) for the May 11th, 2025 through May 17th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on May 23rd, 2025.

April 25th, 2025: Data for American Samoa (AS) for the April 13th, 2025 through April 19th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 25th, 2025.

April 18th, 2025: Data for American Samoa (AS) for the April 6th, 2025 through April 12th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 18th, 2025.

April 11th, 2025: Data for American Samoa (AS) for the March 30th, 2025 through April 5th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 11th, 2025.

March 28th, 2025: Data for Guam (GU) for the March 16th, 2025 through March 22nd, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on March 28th, 2025.

March 21st, 2025: Data for the Commonwealth of the Northern Mariana Islands (CNMI) for the March 9th, 2025 through March 15th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on March 21st, 2025.

March 14th, 2025: Data for American Samoa (AS) and the Commonwealth of the Northern Mariana Islands (CNMI) for the March 2nd, 2025 through March 8th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report
Z
Dataset on the Human Body as a Signal Propagation Medium
data.niaid.nih.gov
zenodo.org
Updated Jul 11, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
A. Sevcenko (2024). Dataset on the Human Body as a Signal Propagation Medium [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8214496
Explore at:
Dataset updated
Jul 11, 2024
Dataset provided by
V. Aristovs
J. Ormanis
V. Medvedevs
V. Abolins
A. Elsts
A. Sevcenko
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview: This is a large-scale dataset with impedance and signal loss data recorded on volunteer test subjects using low-voltage alternate current sine-shaped signals. The signal frequencies are from 50 kHz to 20 MHz.

Applications: The intention of this dataset is to allow to investigate the human body as a signal propagation medium, and capture information related to how the properties of the human body (age, sex, composition etc.), the measurement locations, and the signal frequencies impact the signal loss over the human body.

Overview statistics:

Number of subjects: 30

Number of transmitter locations: 6

Number of receiver locations: 6

Number of measurement frequencies: 19

Input voltage: 1 V

Load resistance: 50 ohm and 1 megaohm

Measurement group statistics:

Height: 174.10 (7.15)

Weight: 72.85 (16.26)

BMI: 23.94 (4.70)

Body fat %: 21.53 (7.55)

Age group: 29.00 (11.25)

Male/female ratio: 50%

Included files:

experiment_protocol_description.docx - protocol used in the experiments

electrode_placement_schematic.png - schematic of placement locations

electrode_placement_photo.jpg - visualization on the experiment, on a volunteer subject

RawData - the full measurement results and experiment info sheets

all_measurements.csv - the most important results extracted to .csv

all_measurements_filtered.csv - same, but after z-score filtering

all_measurements_by_freq.csv - the most important results extracted to .csv, single frequency per row

all_measurements_by_freq_filtered.csv - same, but after z-score filtering

summary_of_subjects.csv - key statistics on the subjects from the experiment info sheets

process_json_files.py - script that creates .csv from the raw data

filter_results.py - outlier removal based on z-score

plot_sample_curves.py - visualization of a randomly selected measurement result subset

plot_measurement_group.py - visualization of the measurement group

CSV file columns:

subject_id - participant's random unique ID

experiment_id - measurement session's number for the participant

height - participant's height, cm

weight - participant's weight, kg

BMI - body mass index, computed from the valued above

body_fat_% - body fat composition, as measured by bioimpedance scales

age_group - age rounded to 10 years, e.g. 20, 30, 40 etc.

male - 1 if male, 0 if female

tx_point - transmitter point number

rx_point - receiver point number

distance - distance, in relative units, between the tx and rx points. Not scaled in terms of participant's height and limb lengths!

tx_point_fat_level - transmitter point location's average fat content metric. Not scaled for each participant individually.

rx_point_fat_level - receiver point location's average fat content metric. Not scaled for each participant individually.

total_fat_level - sum of rx and tx fat levels

bias - constant term to simplify data analytics, always equal to 1.0

CSV file columns, frequency-specific:

tx_abs_Z_... - transmitter-side impedance, as computed by the process_json_files.py script from the voltage drop

rx_gain_50_f_... - experimentally measured gain on the receiver, in dB, using 50 ohm load impedance

rx_gain_1M_f_... - experimentally measured gain on the receiver, in dB, using 1 megaohm load impedance

Acknowledgments: The dataset collection was funded by the Latvian Council of Science, project “Body-Coupled Communication for Body Area Networks”, project No. lzp-2020/1-0358.

References: For a more detailed information, see this article: J. Ormanis, V. Medvedevs, A. Sevcenko, V. Aristovs, V. Abolins, and A. Elsts. Dataset on the Human Body as a Signal Propagation Medium for Body Coupled Communication. Submitted to Elsevier Data in Brief, 2023.

Contact information: info@edi.lv
f
Metricating #respbib18 and #ResponsibleMetrics: A Comparison
city.figshare.com
xlsx
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ernesto Priego (2023). Metricating #respbib18 and #ResponsibleMetrics: A Comparison [Dataset]. http://doi.org/10.25383/city.5868756.v2
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.25383/city.5868756.v2
Dataset updated
May 30, 2023
Dataset provided by
City, University of London
Authors
Ernesto Priego
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summaries of Twitter numerical data captured from archiving #respbib18 (Responsible use of Bibliometrics in Practice, London, 31 January 2018) and #ResponsibleMetrics (The turning tide: A new culture of responsible metrics for research, London, 8 February 2018).No sensitive, personal nor personally-identifiable data is contained in this dataset. Usernames and names of individuals were removed from text analysis results. Text analysis performed with Voyant Tools. Categories included:

table { }tr { }col { }br { }td { padding-top: 1px; padding-right: 1px; padding-left: 1px; color: black; font-size: 12pt; font-weight: 400; font-style: normal; text-decoration: none; font-family: Calibri, sans-serif; vertical-align: bottom; border: medium none; white-space: nowrap; }

Event title

Date

Times

URL

Sheet ID

Hashtag

Number of links

Number of RTs

Number of Tweets

Number of Unique tweets

First Tweet in Archive

Last Tweet in Archive

Number of In Reply Ids

Number of In Reply @s

Number of UsernamesNumber of Unique Usernames who used tag only once30 Most Frequent Terms in each archiveRaw FrequenciesRelative FrequenciesDistributions Stop words were applied, including usernames.
Dental Morphology data for "Demographic history of early centralized...
zenodo.org
tsv
Updated Apr 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Demet Delibaş; Demet Delibaş; Nefize Ezgi Altınışık; Nefize Ezgi Altınışık; Marin Pilloud; Marin Pilloud; Meliha Melis Koruyucu; Meliha Melis Koruyucu; Yilmaz Selim Erdal; Yilmaz Selim Erdal (2025). Dental Morphology data for "Demographic history of early centralized societies: A biodistance study on prehistoric Anatolia" [Dataset]. http://doi.org/10.5281/zenodo.15058168
Explore at:
tsvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.15058168
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Demet Delibaş; Demet Delibaş; Nefize Ezgi Altınışık; Nefize Ezgi Altınışık; Marin Pilloud; Marin Pilloud; Meliha Melis Koruyucu; Meliha Melis Koruyucu; Yilmaz Selim Erdal; Yilmaz Selim Erdal
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes non-metric and metric dental trait data used in the article Demographic History of Early Centralized Societies: A Biodistance Study on Prehistoric Anatolia, published in Journal of Archaeological Science: Reports. The dataset consists of two files:

delibas_etal_nonmetric_rawdata.tsv: Contains non-metric dental trait data.

delibas_etal_metric_rawdata.tsv: Contains metric dental trait data.

Both files include data from six prehistoric settlements in Anatolia: Bakla Tepe, İkiztepe, Küllüoba, Titriş Höyük, Çatalhöyük, and Aşıklı Höyük. Data from Bakla Tepe, İkiztepe, Küllüoba, and Titriş Höyük are presented in their most raw form, while right and left jaws were merged in Çatalhöyük and Aşıklı Höyük to account for missing data.

All analyses, preprocessing procedures and abbreviations are detailed in the published article and its related supplementary files.