Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for fig 2 (NMDS)
This dataset is associated with the forthcoming publication "Microbial volatile organic compounds mediate attraction by a primary but not secondary stored product insect pest in wheat" and includes data on grain damage from near-infrared spectroscopy, behavioral data from wind tunnel and release-recapture experiments, and volatile characterization of headspace from moldy grain. For all files, incubation intervals of 9, 18, and 27 d represent how long grain was incubated after being tempered to a grain moisture of 12, 15, or 19%, or left untempered (ctrl; 10.8% grain moisture). TSO = Trece Storgard oil; empty = negative control (no stimulus); LGB = lesser grain borer (Rhyzopertha dominica); RFB = red flour beetle (Tribolium castaneum). Note: the resource 'GC/MS Grain MVOC Headspace Data' was added 2021-08-04 after deletion of some compounds judged to be unlikely natural compounds and potential contaminants. This is the dataset that undergirds the non-metric multidimensional scaling (NMDS) analysis. See the included file list for more information about the methods and results of each file in this dataset.
Resources in this dataset:
- GC-MS/Headspace Data. File name: tvw_final_gc_ms_data.csv. Recommended software: Microsoft Excel (https://www.microsoft.com/en-us/microsoft-365/excel)
- Microbial damage on wheat evaluated with near-infrared spectroscopy. File name: tvw_nearinfrared_sorting_damaged_grain_fungal_exp.csv. Recommended software: Microsoft Excel
- Release-Recapture Datasets with LGB & RFB. File name: tvw_rr_lgb_rfb_microbial_cues.csv. Recommended software: Microsoft Excel
- Wind tunnel response by RFB & LGB. File name: tvw_wt_lgb_rfb_data_microbial_cues.csv. Recommended software: Microsoft Excel
- GC/MS Grain MVOC Headspace Data. File name: taylor_headspace_final_data_peer_reviewed_ag_commons.csv. Recommended software: Microsoft Excel
- README file list. File name: file_list_MVOCwheat.txt
No description is available. Visit https://dataone.org/datasets/868c087baba217e46370fe2f9dd37505 for complete metadata about this dataset.
https://borealisdata.ca/api/datasets/:persistentId/versions/7.6/customlicense?persistentId=doi:10.5683/SP3/FUYXAS
A database of cranial measurements covering the Arctic and Northwestern North America as well as Northeast Asia, Eurasia, Africa, and the South Pacific.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To broaden the bioarchaeological applicability of skeletal frailty indices (SFIs) and increase sample size, we propose indices with fewer biomarkers (2–11 non-metric biomarkers) and compare these reduced-biomarker SFIs to the original metric/non-metric 13-biomarker SFI. From the 2- to 11-biomarker SFIs, we choose the index with the fewest biomarkers (the 6-biomarker SFI) that still maintains the statistical robustness of the 13-biomarker SFI, and apply this index to the same Medieval monastic and nonmonastic populations, albeit with an increased sample size. For this increased monastic and nonmonastic sample, we also propose and implement a 4-biomarker SFI, composed of biomarkers from each of four stressor categories, and compare these SFI distributions with those of the non-metric biomarker SFIs. From the Museum of London WORD database, we tabulate multiple SFIs (2 to 13 biomarkers) for Medieval monastic and nonmonastic samples (N = 134). We evaluate associations between these ten non-metric SFIs and the 13-biomarker SFI using Spearman's correlation coefficients. Subsequently, we test the non-metric 6-biomarker and 4-biomarker SFI distributions for associations with cemetery, age, and sex using Analysis of Variance/Covariance (ANOVA/ANCOVA) on larger samples from the monastic and nonmonastic cemeteries (N = 517). For the Medieval samples, Spearman's correlation coefficients show a significant association between the 13-biomarker SFI and all non-metric SFIs. Utilizing the 6-biomarker and parsimonious 4-biomarker SFIs, we increase the nonmonastic and monastic samples and demonstrate significant lifestyle and sex differences in frailty that were not observed in the original, smaller sample. Results from the 6-biomarker and parsimonious 4-biomarker SFIs generally indicate similarities in means, explained variation (R2), and associated P-values (ANOVA/ANCOVA) within and between the nonmonastic and monastic samples.
We show that non-metric reduced-biomarker SFIs provide alternative indices for application to other bioarchaeological collections. These findings suggest that an SFI composed of six or more non-metric biomarkers available for the specific sample may have greater applicability than, but comparable statistical characteristics to, the originally proposed 13-biomarker SFI.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: Non-metric multidimensional scaling analyses of ostracode taxa from the Maastrichtian-Thanetian samples at IODP Site 342-U1407. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.872187 for more information.
The Stream Channel and Floodplain Metric Toolbox was developed to demonstrate the feasibility of mapping fluvial geomorphic features from high-resolution bare-earth elevation data. A Python toolbox for ArcGIS was built to calculate key metrics describing channel and floodplain geometry. The toolbox provides this ability in an automated fashion, allowing for regional analyses based solely on digital elevation models (DEMs). This manual describes the general operation of the toolbox and the technical details of its specific algorithms. The toolbox works best in a watershed no larger than a HUC 10 (< 1,000 sq. km).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Metric multidimensional scaling (MDS) is a widely used multivariate method with applications in almost all scientific disciplines. The eigenvalues obtained in the analysis are usually reported in order to calculate the overall goodness-of-fit of the distance matrix. In this paper, we refine MDS goodness-of-fit calculations, proposing additional point and pairwise goodness-of-fit statistics that can be used to filter poorly represented observations in MDS maps. The proposed statistics are especially relevant for large data sets that contain outliers, with typically many poorly fitted observations, and are helpful for improving MDS output and emphasizing the most important features of the dataset. Several goodness-of-fit statistics are considered, for both Euclidean and non-Euclidean distance matrices. Some examples with data from demographic, genetic, and geographic studies are shown.
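The eigenvalue-based overall goodness-of-fit mentioned above can be illustrated with a minimal classical (metric) MDS sketch in plain NumPy. This is the standard textbook computation, not the paper's refined point and pairwise statistics:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (metric) MDS from a distance matrix, with the usual
    eigenvalue-based overall goodness-of-fit."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # map coordinates from the top-k eigenpairs
    X = eigvecs[:, :k] * np.sqrt(np.maximum(eigvals[:k], 0))
    # goodness-of-fit: share of positive eigenvalue mass retained by k axes
    gof = eigvals[:k].sum() / eigvals[eigvals > 0].sum()
    return X, gof

# toy example: collinear points are perfectly recovered in one dimension
pts = np.array([[0.0], [1.0], [3.0]])
D = np.abs(pts - pts.T)
X, gof = classical_mds(D, k=1)
print(round(gof, 3))  # → 1.0
```

A low `gof` signals that the chosen dimensionality discards substantial structure, which is exactly the situation where per-observation fit statistics become useful.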
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and code accompanying Megill et al. (2024): "Alternative climate metrics to the Global Warming Potential are more suitable for assessing aviation non-CO2 effects", published in Communications Earth & Environment. This dataset contains all data and code developed during research towards the linked article and contains all elements required to reproduce the linked figures and analysis.
The data were generated using version 2.1 of the climate-chemistry response model AirClim (Grewe and Stenke, 2008; Dahlmann et al., 2016; see References). The dataset includes analyses of the full aviation industry (scenarios CORSIA, COVID15s, CurTec, Fa1 and FP2050 from Grewe et al., 2021; see References) and of individual, theoretical aircraft designs of Category 4 (152–201 seats). Trajectory data are taken from DLR WeCare (Grewe et al., 2017; see References). The analysis of the results is performed with four Jupyter notebooks running Python.
All data files are licensed under CC BY 4.0. All Jupyter notebooks and the Python script are licensed under the Apache License v2.0.
Please note: the software code AirClim is confidential proprietary information of the DLR and cannot be made available to the public or readers without restrictions. Licensing of the code to third parties is conditioned upon the prior conclusion of a licensing agreement with the DLR. Qualified researchers can request such an agreement from the corresponding author.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The orthophoto mosaic is a rectified georeferenced image of the Heard Island, Laurens Peninsula Coastal Area. Distortions due to relief and tilt displacement have been removed. Orthophotos were derived from non-metric cameras (focal length unknown).
A. SUMMARY
This dataset contains the underlying data for the Vision Zero Benchmarking website. Vision Zero is the collaborative, citywide effort to end traffic fatalities in San Francisco. The goal of this benchmarking effort is to provide context to San Francisco's work and progress on key Vision Zero metrics alongside its peers. The Controller's Office City Performance team collaborated with the San Francisco Municipal Transportation Agency, the San Francisco Department of Public Health, the San Francisco Police Department, and other stakeholders on this project.
B. HOW THE DATASET IS CREATED
The Vision Zero Benchmarking website has seven major metrics. The City Performance team collected the data for each metric separately, cleaned it, and visualized it on the website. This dataset has all seven metrics and some additional underlying data. The majority of the data is available through public sources, but a few data points came from the peer cities themselves.
C. UPDATE PROCESS
This dataset is for historical purposes only and will not be updated. To explore more recent data, visit the source website for the relevant metrics.
D. HOW TO USE THIS DATASET
This dataset contains all of the Vision Zero Benchmarking metrics. Filter for the metric of interest, then explore the data. Where applicable, datasets already include a total. For example, under the Fatalities metric, the "Total Fatalities" category within the metric shows the total fatalities in that city. Any calculations should be reviewed to avoid double-counting data with this total.
E. RELATED DATASETS
N/A
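Filtering for one metric while excluding its pre-computed total, as section D advises, can be sketched with pandas. The frame below is hypothetical; the actual column and category names in the published dataset may differ:

```python
import pandas as pd

# Hypothetical frame mirroring the described layout (metric / category / city / value).
df = pd.DataFrame({
    "metric":   ["Fatalities", "Fatalities", "Fatalities"],
    "category": ["Pedestrian", "Other", "Total Fatalities"],
    "city":     ["San Francisco"] * 3,
    "value":    [12, 8, 20],
})

fatalities = df[df["metric"] == "Fatalities"]
# exclude the pre-computed total so sums do not double-count
parts = fatalities[fatalities["category"] != "Total Fatalities"]
print(parts["value"].sum())  # → 20, matching the "Total Fatalities" row
```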
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The Heard Island Topographic Data was mapped from Ortho-rectified non-metric photography. The data consists of Coastline, Glacier, Lagoon, Offshore Rocks, Water Storage and Watercourse datasets digitised from the photography, all of which are available for download at the url given below.
The U.S. Geological Survey (USGS) Water Resources Mission Area (WMA) is working to address the need to understand where the Nation is experiencing water shortages or surpluses relative to the demand for water, by delivering routine assessments of water supply and demand and an understanding of the natural and human factors affecting the balance between supply and demand. A key part of these national assessments is identifying long-term trends in water availability, including groundwater and surface-water quantity, quality, and use. This data release contains Mann-Kendall monotonic trend analyses for 18 observed annual and monthly streamflow metrics at 6,347 U.S. Geological Survey streamgages located in the conterminous United States, Alaska, Hawaii, and Puerto Rico. Streamflow metrics include annual mean flow, maximum 1-day and 7-day flows, minimum 7-day and 30-day flows, and the date of the center of volume (the date on which 50% of the annual flow has passed by a gage), along with the mean flow for each month of the year. Annual streamflow metrics are computed from mean daily discharge records at U.S. Geological Survey streamgages that are publicly available from the National Water Information System (NWIS). Trend analyses are computed using annual streamflow metrics computed through climate year 2022 (April 2022 - March 2023) for low-flow metrics and water year 2022 (October 2021 - September 2022) for all other metrics. Trends at each site are available for up to four different periods: (i) the longest possible period that meets completeness criteria at each site, (ii) 1980-2020, (iii) 1990-2020, and (iv) 2000-2020. Annual metric time series analyzed for trends must have 80 percent complete records during fixed periods. In addition, each of these time series must have 80 percent complete records during their first and last decades.
All longest possible period time series must be at least 10 years long and have annual metric values for at least 80% of the years running from 2013 to 2022. This data release provides the following five CSV output files along with a model archive: (1) streamflow_trend_results.csv - contains test results of all trend analyses, with each row representing one unique combination of (i) NWIS streamgage identifier, (ii) metric (computed using Oct 1 - Sep 30 water years, except for low-flow metrics, which are computed using climate years (Apr 1 - Mar 31)), (iii) trend period of interest (longest possible period through 2022, 1980-2020, 1990-2020, 2000-2020), and (iv) records containing either the full trend period or only a portion of the trend period following substantial increases in cumulative upstream reservoir storage capacity. This is an output from the final process step (#5) of the workflow. (2) streamflow_trend_trajectories_with_confidence_bands.csv - contains annual trend trajectories estimated using Theil-Sen regression, which estimates the median of the probability distribution of a metric for a given year, along with 90 percent confidence intervals (5th and 95th percentile values). This is an output from the final process step (#5) of the workflow. (3) streamflow_trend_screening_all_steps.csv - contains the screening results of all 7,873 streamgages initially considered as candidate sites for trend analysis and identifies the screens that prevented some sites from being included in the Mann-Kendall trend analysis. (4) all_site_year_metrics.csv - contains annual time series values of streamflow metrics computed from mean daily discharge data at 7,873 candidate sites. This is an output of Process Step 1 in the workflow. (5) all_site_year_filters.csv - contains information about the completeness and quality of daily mean discharge at each streamgage during each year (water year, climate year, and calendar year).
This is also an output of Process Step 1 in the workflow and is combined with all_site_year_metrics.csv in Process Step 2. In addition, a .zip file contains a model archive for reproducing the trend results using R 4.4.1 statistical software. See the README file contained in the model archive for more information. Caution must be exercised when utilizing monotonic trend analyses conducted over periods of up to several decades (and in some places longer ones) due to the potential for confounding deterministic gradual trends with multi-decadal climatic fluctuations. In addition, trend results are available for post-reservoir construction periods within the four trend periods described above to avoid including abrupt changes arising from the construction of larger reservoirs in periods for which gradual monotonic trends are computed. Other abrupt changes, such as changes to water withdrawals and wastewater return flows, or episodic disturbances with multi-year recovery periods, such as wildfires, are not evaluated. Sites with pronounced abrupt changes or other non-monotonic trajectories of change may require more sophisticated trend analyses than those presented in this data release.
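The core of a Mann-Kendall trend test paired with a Theil-Sen trajectory can be sketched with SciPy. This is an illustration only, not the code in the model archive, which also applies the completeness screens and reservoir-period splits described above; `trend_summary` and the toy series are hypothetical:

```python
import numpy as np
from scipy.stats import kendalltau, theilslopes

def trend_summary(years, values, alpha=0.90):
    """Mann-Kendall monotonic trend test (via Kendall's tau against time)
    and a Theil-Sen median trajectory with a 90% confidence band."""
    tau, p_value = kendalltau(years, values)            # Mann-Kendall test
    slope, intercept, lo, hi = theilslopes(values, years, alpha=alpha)
    trajectory = intercept + slope * np.asarray(years)  # median trajectory per year
    return {"tau": tau, "p": p_value, "slope": slope,
            "slope_90ci": (lo, hi), "trajectory": trajectory}

# toy example: a steadily increasing annual mean flow with noise
years = np.arange(1990, 2021)
flow = 100 + 0.8 * (years - 1990) + np.random.default_rng(0).normal(0, 2, years.size)
res = trend_summary(years, flow)
print(res["p"] < 0.05, round(res["slope"], 2))
```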
https://cdla.io/sharing-1-0/
The Fitness Tracker Dataset contains detailed information about individuals' fitness metrics, exercise routines, and health parameters. This dataset is designed to provide insights into fitness trends, workout habits, and overall health patterns. It is ideal for exploratory data analysis (EDA), machine learning applications, and health analytics. The dataset can help identify relationships between physical activity, body metrics, and health outcomes.
Features:
- Age: Age of the individual in years.
- Gender: Gender of the individual (e.g., Male, Female).
- Weight (kg): Weight of the individual in kilograms.
- Height (m): Height of the individual in meters.
- Max_BPM: Maximum heartbeats per minute recorded during exercise.
- Avg_BPM: Average heartbeats per minute during a workout session.
- Resting_BPM: Resting heartbeats per minute.
- Session_Duration (hours): Duration of the workout session in hours.
- Calories_Burned: Total calories burned during a workout session.
- Workout_Type: Type of workout performed (e.g., Cardio, Strength, Yoga).
- Fat_Percentage: Percentage of body fat.
- Water_Intake (liters): Water intake in liters during or after the workout.
- Workout_Frequency (days/week): Number of days per week the individual exercises.
- Experience_Level: Level of fitness experience (e.g., Beginner, Intermediate, Advanced).
- BMI: Body Mass Index, calculated as weight (kg) / height (m)^2.
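The BMI column can be recomputed from the Weight (kg) and Height (m) columns using the formula stated above, which is a quick consistency check on the data:

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight (kg) / height (m)^2, as defined for the
    dataset's BMI column."""
    return weight_kg / height_m ** 2

print(round(bmi(72.0, 1.75), 1))  # → 23.5
```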
Usage: This dataset is suitable for:
- Analyzing the impact of fitness routines on health metrics.
- Exploring trends in heart rate, calorie burn, and workout habits.
- Correlating body metrics like BMI and fat percentage with exercise patterns.
- Building predictive models for fitness and health analytics.
This is a synthetic dataset created for educational and analytical purposes and does not represent real-world data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data sets accompanying the paper "The FAIR Assessment Conundrum: Reflections on Tools and Metrics", an analysis of a comprehensive set of FAIR assessment tools and the metrics used by these tools for the assessment.
The data set "metrics.csv" consists of the metrics collected from several sources linked to the analysed FAIR assessments tools. It is structured into 11 columns: (i) tool_id, (ii) tool_name, (iii) metric_discarded, (iv) metric_fairness_scope_declared, (v) metric_fairness_scope_observed, (vi) metric_id, (vii) metric_text, (viii) metric_technology, (ix) metric_approach, (x) last_accessed_date, and (xi) provenance.
The columns tool_id and tool_name are used for the identifier we assigned to each tool analysed and the full name of the tool respectively.
The metric_discarded column records the selection we applied to the collected metrics: we excluded metrics created for testing purposes or written in a language other than English. The values are boolean; we assigned TRUE if the metric was discarded.
The columns metric_fairness_scope_declared and metric_fairness_scope_observed are used for indicating the declared intent of the metrics, with respect to the FAIR principle assessed, and the one we observed respectively. Possible values are: (a) a letter of the FAIR acronym (for the metrics without a link declared to a specific FAIR principle), (b) one or more identifiers of the FAIR principles (F1, F2…), (c) n/a, if no FAIR references were declared, or (d) none, if no FAIR references were observed.
The metric_id and metric_text columns are used for the identifiers of the metrics and the textual and human-oriented content of the metrics respectively.
The column metric_technology is used for enumerating the technologies (a term used here in its broadest sense) mentioned or used by the metrics for the specific assessment purpose. Such technologies cover very diverse typologies, ranging from (meta)data formats to standards, semantic technologies, protocols, and services. For tools implementing automated assessments, the technologies listed also take into consideration the available code and documentation, not just the metric text.
The column metric_approach is used for identifying the type of implementation observed in the assessments. The identification of the implementation types followed a bottom-up approach applied to the metrics organised by their metric_fairness_scope_declared values. Consequently, while the labels used for creating the implementation type strings are the same, their combination and specialisation vary based on the characteristics of the actual set of metrics analysed. The main labels used are: (a) 3rd party service-based, (b) documentation-centred, (c) format-centred, (d) generic, (e) identifier-centred, (f) policy-centred, (g) protocol-centred, (h) metadata element-centred, (i) metadata schema-centred, (j) metadata value-centred, and (l) na.
The columns provenance and last_accessed_date are used for the main source of information about each metric (at least with regard to the text) and the date we last accessed it respectively.
The data set "classified_technologies.csv" consists of the technologies mentioned or used by the metrics for the specific assessment purpose. It is structured into 3 columns: (i) technology, (ii) class, and (iii) discarded.
The column technology is used for the names of the different technologies mentioned or used by the metrics.
The column class is used for specifying the type of technology used. Possible values are: (a) application programming interface, (b) format, (c) identifier, (d) library, (e) licence, (f) protocol, (g) query language, (h) registry, (i) repository, (j) search engine, (k) semantic artefact, and (l) service.
The discarded column refers to the exclusion of the value 'linked data' from the accepted technologies since it is too generic. The possible values are boolean. We assigned TRUE if the technology was discarded.
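Reading and filtering these files might look as follows. The inline rows are invented for illustration; only the column names come from the descriptions above:

```python
import io
import pandas as pd

# Stand-in for metrics.csv (a subset of its 11 columns); rows are invented.
metrics_csv = io.StringIO(
    "tool_id,tool_name,metric_discarded,metric_fairness_scope_declared\n"
    "T01,ToolA,FALSE,F1\n"
    "T01,ToolA,TRUE,n/a\n"
    "T02,ToolB,FALSE,A\n"
)
metrics = pd.read_csv(metrics_csv)

# keep only metrics that survived the selection (metric_discarded != TRUE)
kept = metrics[~metrics["metric_discarded"].astype(str).str.upper().eq("TRUE")]
print(len(kept))  # → 2
```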
Abstract:
In recent years there has been an increased interest in Artificial Intelligence for IT Operations (AIOps). This field utilizes monitoring data from IT systems, big data platforms, and machine learning to automate various operations and maintenance (O&M) tasks for distributed systems.
The major contributions have been materialized in the form of novel algorithms.
Typically, researchers have taken on the challenge of exploring one specific type of observability data source, such as application logs, metrics, or distributed traces, to create new algorithms.
Nonetheless, due to the low signal-to-noise ratio of monitoring data, there is a consensus that only the analysis of multi-source monitoring data will enable the development of useful algorithms that have better performance.
Unfortunately, existing datasets usually contain only a single source of data, often logs or metrics. This limits the possibilities for greater advances in AIOps research.
Thus, we generated high-quality multi-source data composed of distributed traces, application logs, and metrics from a complex distributed system. This paper provides detailed descriptions of the experiment, statistics of the data, and identifies how such data can be analyzed to support O&M tasks such as anomaly detection, root cause analysis, and remediation.
General Information:
This repository contains simple scripts for data statistics and a link to the multi-source distributed system dataset.
You may find details of this dataset from the original paper:
Sasho Nedelkoski, Ajay Kumar Mandapati, Jasmin Bogatinovski, Soeren Becker, Jorge Cardoso, Odej Kao, "Multi-Source Distributed System Data for AI-powered Analytics". [link very soon]
If you use the data, implementation, or any details of the paper, please cite!
The multi-source/multimodal dataset is composed of distributed traces, application logs, and metrics produced by running a complex distributed system (OpenStack). In addition, we also provide the workload and fault scripts together with the Rally report, which can serve as ground truth (all at the Zenodo link below). We provide two datasets, which differ in how the workload is executed. The openstack_multimodal_sequential_actions dataset is generated by executing a workload of sequential user requests. The openstack_multimodal_concurrent_actions dataset is generated by executing a workload of concurrent user requests.
The concurrent dataset differs as follows:
Due to the heavy load on the control node, the metric data for wally113 (the control node) is not representative, so we excluded it.
Three Rally actions are executed in parallel: boot_and_delete, create_and_delete_networks, and create_and_delete_image, whereas five actions were executed in the sequential dataset.
The raw logs in both datasets contain the same files. Users who want the logs filtered by time with respect to the two datasets should refer to the timestamps in the metrics, which provide the time window. In addition, we suggest using the provided aggregated, time-ranged logs for both datasets in CSV format.
Important: The logs and the metrics are synchronized in time, and both are recorded in CEST (Central European Summer Time). The traces are in UTC (Coordinated Universal Time, two hours behind CEST). Users developing multimodal methods should synchronize them.
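Aligning a trace timestamp (UTC) with the log/metric timeline (CEST = UTC+2) can be done with Python's standard library; the timestamp value here is invented:

```python
from datetime import datetime, timedelta, timezone

# fixed +02:00 offset matching CEST, the timeline of the logs and metrics
CEST = timezone(timedelta(hours=2))

# a hypothetical trace timestamp, recorded in UTC
trace_ts = datetime(2020, 5, 1, 10, 0, 0, tzinfo=timezone.utc)
aligned = trace_ts.astimezone(CEST)  # same instant, CEST wall-clock time
print(aligned.strftime("%Y-%m-%d %H:%M"))  # → 2020-05-01 12:00
```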
Our GitHub repository can be found at: https://github.com/SashoNedelkoski/multi-source-observability-dataset/
https://www.usa.gov/government-works
This dataset represents weekly hospital respiratory data and metrics aggregated to national and state/territory levels reported to CDC's National Healthcare Safety Network (NHSN) beginning August 2020. Data for reporting dates through April 30, 2024 represent data reported during a previous mandated reporting period as specified by the HHS Secretary. Data for reporting dates May 1, 2024 – October 31, 2024 represent voluntarily reported data in the absence of a mandate. Data for reporting dates beginning November 1, 2024 represent data reported during a current mandated reporting period. All data and metrics capturing information on respiratory syncytial virus (RSV) were voluntarily reported until November 1, 2024. All data included in this dataset represent aggregated counts, and include metrics capturing information specific to hospital capacity, occupancy, hospitalizations, and new hospital admissions with corresponding metrics indicating reporting coverage for a given reporting week. NHSN monitors national and local trends in healthcare system stress and capacity for all acute care and critical access hospitals in the United States.
For more information on the reporting mandate per the Centers for Medicare and Medicaid Services (CMS) requirements, visit: Updates to the Condition of Participation (CoP) Requirements for Hospitals and Critical Access Hospitals (CAHs) To Report Acute Respiratory Illnesses.
For more information regarding NHSN’s collection of these data, including full reporting guidance, visit: NHSN Hospital Respiratory Data.
Source: CDC National Healthcare Safety Network (NHSN).
Archived datasets updated during the mandatory hospital reporting period from August 1, 2020, to April 30, 2024:
Archived datasets updated during the voluntary hospital reporting period from May 1, 2024, to October 31, 2024:
Note: June 13th, 2025: Data for American Samoa (AS) for the June 1st, 2025 through June 7th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on June 13th, 2025.
June 6th, 2025: Data for American Samoa (AS) for the May 25th, 2025 through May 31st, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on June 6th, 2025.
May 30th, 2025: Data for American Samoa (AS) for the May 18th, 2025 through May 24th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on May 30th, 2025.
May 23rd, 2025: Data for American Samoa (AS) for the May 11th, 2025 through May 17th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on May 23rd, 2025.
April 25th, 2025: Data for American Samoa (AS) for the April 13th, 2025 through April 19th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 25th, 2025.
April 18th, 2025: Data for American Samoa (AS) for the April 6th, 2025 through April 12th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 18th, 2025.
April 11th, 2025: Data for American Samoa (AS) for the March 30th, 2025 through April 5th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on April 11th, 2025.
March 28th, 2025: Data for Guam (GU) for the March 16th, 2025 through March 22nd, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on March 28th, 2025.
March 21st, 2025: Data for the Commonwealth of the Northern Mariana Islands (CNMI) for the March 9th, 2025 through March 15th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on March 21st, 2025.
March 14th, 2025: Data for American Samoa (AS) and the Commonwealth of the Northern Mariana Islands (CNMI) for the March 2nd, 2025 through March 8th, 2025 reporting period are not available for the Weekly NHSN Hospital Respiratory Data report released on March 14th, 2025.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This is a large-scale dataset with impedance and signal loss data recorded on volunteer test subjects using low-voltage, alternating-current, sine-shaped signals. The signal frequencies range from 50 kHz to 20 MHz.
Applications: The intention of this dataset is to enable investigation of the human body as a signal propagation medium, and to capture how the properties of the human body (age, sex, composition, etc.), the measurement locations, and the signal frequencies affect signal loss across the human body.
Overview statistics:
Number of subjects: 30
Number of transmitter locations: 6
Number of receiver locations: 6
Number of measurement frequencies: 19
Input voltage: 1 V
Load resistance: 50 ohm and 1 megaohm
Measurement group statistics, reported as mean (standard deviation):
Height: 174.10 (7.15)
Weight: 72.85 (16.26)
BMI: 23.94 (4.70)
Body fat %: 21.53 (7.55)
Age group: 29.00 (11.25)
Male/female ratio: 50%
Included files:
experiment_protocol_description.docx - protocol used in the experiments
electrode_placement_schematic.png - schematic of placement locations
electrode_placement_photo.jpg - photo of the experiment setup on a volunteer subject
RawData - the full measurement results and experiment info sheets
all_measurements.csv - the most important results extracted to .csv
all_measurements_filtered.csv - same, but after z-score filtering
all_measurements_by_freq.csv - the most important results extracted to .csv, single frequency per row
all_measurements_by_freq_filtered.csv - same, but after z-score filtering
summary_of_subjects.csv - key statistics on the subjects from the experiment info sheets
process_json_files.py - script that creates .csv from the raw data
filter_results.py - outlier removal based on z-score
plot_sample_curves.py - visualization of a randomly selected measurement result subset
plot_measurement_group.py - visualization of the measurement group
CSV file columns:
subject_id - participant's random unique ID
experiment_id - measurement session's number for the participant
height - participant's height, cm
weight - participant's weight, kg
BMI - body mass index, computed from the values above
body_fat_% - body fat composition, as measured by bioimpedance scales
age_group - age rounded to 10 years, e.g. 20, 30, 40 etc.
male - 1 if male, 0 if female
tx_point - transmitter point number
rx_point - receiver point number
distance - distance, in relative units, between the tx and rx points. Not scaled in terms of participant's height and limb lengths!
tx_point_fat_level - transmitter point location's average fat content metric. Not scaled for each participant individually.
rx_point_fat_level - receiver point location's average fat content metric. Not scaled for each participant individually.
total_fat_level - sum of rx and tx fat levels
bias - constant term to simplify data analytics, always equal to 1.0
CSV file columns, frequency-specific:
tx_abs_Z_... - transmitter-side impedance, as computed from the voltage drop by the process_json_files.py script
rx_gain_50_f_... - experimentally measured gain on the receiver, in dB, using 50 ohm load impedance
rx_gain_1M_f_... - experimentally measured gain on the receiver, in dB, using 1 megaohm load impedance
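The z-score outlier removal performed by filter_results.py can be sketched as follows; the threshold value here is an assumption, and the script's actual criterion may differ:

```python
import numpy as np

def zscore_filter(values, threshold=2.5):
    """Drop measurements more than `threshold` standard deviations from the
    mean (the idea behind filter_results.py; its exact threshold may differ)."""
    values = np.asarray(values, dtype=float)
    z = np.abs(values - values.mean()) / values.std()
    return values[z <= threshold]

# nine plausible receiver gains (dB) and one obviously corrupted reading
gains = [-42.1, -41.8, -42.5, -41.9, -42.2, -42.0, -41.7, -42.3, -42.4, 0.0]
print(len(zscore_filter(gains)))  # → 9
```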
Acknowledgments: The dataset collection was funded by the Latvian Council of Science, project “Body-Coupled Communication for Body Area Networks”, project No. lzp-2020/1-0358.
References: For more detailed information, see this article: J. Ormanis, V. Medvedevs, A. Sevcenko, V. Aristovs, V. Abolins, and A. Elsts, "Dataset on the Human Body as a Signal Propagation Medium for Body Coupled Communication". Submitted to Elsevier Data in Brief, 2023.
Contact information: info@edi.lv
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summaries of Twitter numerical data captured from archiving #respbib18 (Responsible use of Bibliometrics in Practice, London, 31 January 2018) and #ResponsibleMetrics (The turning tide: A new culture of responsible metrics for research, London, 8 February 2018). No sensitive, personal, or personally identifiable data is contained in this dataset. Usernames and names of individuals were removed from text analysis results. Text analysis was performed with Voyant Tools. Categories included:
Event title
Date
Times
URL
Sheet ID
Hashtag
Number of links
Number of RTs
Number of Tweets
Number of Unique tweets
First Tweet in Archive
Last Tweet in Archive
Number of In Reply Ids
Number of In Reply @s
Number of Usernames
Number of Unique Usernames who used tag only once
30 Most Frequent Terms in each archive
Raw Frequencies
Relative Frequencies
Distributions
Stop words were applied, including usernames.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes non-metric and metric dental trait data used in the article Demographic History of Early Centralized Societies: A Biodistance Study on Prehistoric Anatolia, published in Journal of Archaeological Science: Reports. The dataset consists of two files:
Both files include data from six prehistoric settlements in Anatolia: Bakla Tepe, İkiztepe, Küllüoba, Titriş Höyük, Çatalhöyük, and Aşıklı Höyük. Data from Bakla Tepe, İkiztepe, Küllüoba, and Titriş Höyük are presented in their most raw form, while right and left jaws were merged in Çatalhöyük and Aşıklı Höyük to account for missing data.
All analyses, preprocessing procedures and abbreviations are detailed in the published article and its related supplementary files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically