100+ datasets found

Frequency table for different selected variables.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Aug 26, 2020
+ more versions
Cite
Barna, Sutapa Dey; Raihan, Hasin; Khan, Nafiul Alam; Hossain, Tanvir; Islam, Akhtarul (2020). Frequency table for different selected variables. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000486086
Explore at:
Dataset updated
Aug 26, 2020
Authors
Barna, Sutapa Dey; Raihan, Hasin; Khan, Nafiul Alam; Hossain, Tanvir; Islam, Akhtarul
Description
Frequency table for different selected variables.
Weighted frequency distribution for selected variables.
plos.figshare.com
datasetcatalog.nlm.nih.gov
+1more
xls
Updated Jun 14, 2023
Cite
Mst. Tanmin Nahar; S. M. Farhad Ibn Anik; Md. Akhtarul Islam; Sheikh Mohammed Shariful Islam (2023). Weighted frequency distribution for selected variables. [Dataset]. http://doi.org/10.1371/journal.pone.0267660.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0267660.t001
Dataset updated
Jun 14, 2023
Dataset provided by
PLOS ONE
Authors
Mst. Tanmin Nahar; S. M. Farhad Ibn Anik; Md. Akhtarul Islam; Sheikh Mohammed Shariful Islam
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Weighted frequency distribution for selected variables.
ORCA-VFD Dataset: Reliability, Degradation, and Remaining Useful Life Data...
figshare.com
csv
Updated Nov 26, 2025
Cite
Carl Tolbert (2025). ORCA-VFD Dataset: Reliability, Degradation, and Remaining Useful Life Data for Variable Frequency Drives [Dataset]. http://doi.org/10.6084/m9.figshare.30727865.v1
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.30727865.v1
Dataset updated
Nov 26, 2025
Dataset provided by
Figshare http://figshare.com/
Authors
Carl Tolbert
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ORCA-VFD is a multi-domain dataset for reliability modeling and remaining useful life (RUL) estimation of variable frequency drives (VFDs). The dataset integrates physics-derived sequences, processed fault-injection data, and field-informed degradation patterns to create a unified framework for lifecycle analysis and predictive maintenance research.

This Figshare record contains the full synthesized ORCA-VFD lifecycle dataset, including:
- 100,000-hour lifecycle trajectories spanning infant-mortality, useful-life, and wearout phases.
- Anomaly score sequences derived from physics-informed feature engineering.
- Core-8 reliability features computed from VFD electrical signatures.
- Processed versions of physics, fault, and field datasets, transformed into the standardized ORCA-VFD format.
- Training, validation, and test sets used for remaining useful life model development.
- Metadata files, including feature definitions and lifecycle documentation.

Raw third-party datasets (e.g., Hanke physics data, PMSM fault data) are not redistributed here and are available at their original sources as cited in the ORCA-VFD manuscript. This Figshare package includes only newly created or transformed data, compliant with open data licensing practices.

The ORCA-VFD dataset supports research in:
- predictive maintenance
- physics-informed machine learning
- VFD degradation modeling
- reliability engineering
- RUL prediction
- domain adaptation and cross-domain validation
- economic optimization of maintenance actions

The companion GitHub repository provides the full modeling code, lifecycle synthesis scripts, feature engineering tools, and sample files: https://github.com/gencaddy2/ORCA-VFD
Response variables derived from predicted high-frequency chloride...
catalog.data.gov
data.usgs.gov
+1more
Updated Nov 26, 2025
+ more versions
Cite
U.S. Geological Survey (2025). Response variables derived from predicted high-frequency chloride concentrations and specific conductance values [Dataset]. https://catalog.data.gov/dataset/response-variables-derived-from-predicted-high-frequency-chloride-concentrations-and-speci
Explore at:
Dataset updated
Nov 26, 2025
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Description
This data set contains 18 metrics used to describe patterns in specific conductance (SC) and chloride concentrations in 93 streams located across the eastern United States. These data were quantified for an analysis described in Moore and others (in review). All metrics were quantified for a water year and a median was taken across all years for which data were available to provide a single value for each site. High-frequency SC and chloride were measured or estimated at sub-daily time steps from 2-minute intervals to hourly intervals (e.g., high-frequency) depending on the site. Moore, J., R. Fanelli, and A. Sekellick. In review. High-frequency data reveal deicing salts drive elevated conductivity and chloride along with pervasive and frequent exceedances of the EPA aquatic life criteria for chloride in urban streams. Submitted to Environmental Science and Technology.
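The aggregation described above (one value per water year, then a median across years) can be sketched as follows. This is only an illustration: the readings are made up, `site_metric` is a hypothetical helper, and the annual maximum stands in for any one of the 18 metrics, which this listing does not enumerate. The water-year rule used is the standard USGS one (1 October through 30 September, labeled by the year in which it ends).

```python
from collections import defaultdict
from datetime import date
from statistics import median


def water_year(d: date) -> int:
    """USGS water year: 1 Oct through 30 Sep, labeled by the year in which it ends."""
    return d.year + 1 if d.month >= 10 else d.year


def site_metric(records, metric=max):
    """Apply `metric` within each water year, then take the median across years."""
    by_year = defaultdict(list)
    for d, value in records:
        by_year[water_year(d)].append(value)
    return median(metric(values) for values in by_year.values())


# Made-up specific conductance readings for one hypothetical site
records = [
    (date(2019, 11, 5), 410.0),
    (date(2020, 2, 14), 980.0),
    (date(2020, 10, 3), 450.0),
    (date(2021, 1, 20), 1250.0),
    (date(2021, 12, 1), 700.0),
]
site_value = site_metric(records)  # single value for the site
```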
Frequency distributions of the background variables (N = 542).
plos.figshare.com
figshare.com
xls
Updated Jun 1, 2023
Cite
Joseph T. F. Lau; Zixin Wang; Jean H. Kim; Mason Lau; Coco H. Y. Lai; Phoenix K. H. Mo (2023). Frequency distributions of the background variables (N = 542). [Dataset]. http://doi.org/10.1371/journal.pone.0057204.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0057204.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Joseph T. F. Lau; Zixin Wang; Jean H. Kim; Mason Lau; Coco H. Y. Lai; Phoenix K. H. Mo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Frequency distributions of the background variables (N = 542).
COVID-19 High Frequency Phone Survey of Households 2020 - World Bank LSMS...
microdata.worldbank.org
catalog.ihsn.org
+1more
Updated Oct 25, 2021
+ more versions
Cite
Central Statistics Agency of Ethiopia (2021). COVID-19 High Frequency Phone Survey of Households 2020 - World Bank LSMS Harmonized Dataset - Ethiopia [Dataset]. https://microdata.worldbank.org/index.php/catalog/4072
Explore at:
Dataset updated
Oct 25, 2021
Dataset authored and provided by
Central Statistics Agency of Ethiopia
Time period covered
2018 - 2021
Area covered
Ethiopia
Description
Abstract

To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created the harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.

The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.

Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors, and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.

Geographic coverage

National coverage

Analysis unit

Households

Individuals

Universe

The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

Kind of data

Sample survey data [ssd]

Sampling procedure

See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.

Mode of data collection

Computer Assisted Personal Interview [capi]

Cleaning operations

Ethiopia Socioeconomic Survey (ESS) 2018-2019 and Ethiopia COVID-19 High Frequency Phone Survey of Households (HFPS) 2020 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).

The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, a variable with “_r3” in its name was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.
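
A minimal sketch of how the “_rX” convention might be decoded when loading the harmonized datafiles. It assumes the round marker appears as a suffix at the end of the variable name; the function `split_round` and the variable names in the usage lines are hypothetical and are not taken from the LSMS documentation.

```python
import re

# Assumption: the round marker is a trailing "_r<digits>" suffix
ROUND_SUFFIX = re.compile(r"_r(\d+)$")


def split_round(name: str) -> tuple[str, int]:
    """Return (base variable name, round number); no suffix means Round 0."""
    match = ROUND_SUFFIX.search(name)
    if match is None:
        return name, 0
    return name[: match.start()], int(match.group(1))


# Hypothetical variable names, for illustration only
phone_round = split_round("food_insecure_r3")
baseline = split_round("hhsize")
```

Grouping columns by the returned round number is one way to separate the face-to-face baseline (Round 0) from the phone-survey rounds.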

Response rate

See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.
High-Frequency Phone Survey on COVID-19 - World Bank LSMS Harmonized Dataset...
microdata.worldbank.org
catalog.ihsn.org
Updated Oct 25, 2021
+ more versions
Cite
Malawi National Statistical Office (NSO) (2021). High-Frequency Phone Survey on COVID-19 - World Bank LSMS Harmonized Dataset - Malawi [Dataset]. https://microdata.worldbank.org/index.php/catalog/4071
Explore at:
Dataset updated
Oct 25, 2021
Dataset provided by
National Statistical Office of Malawi http://www.nsomalawi.mw/
Authors
Malawi National Statistical Office (NSO)
Time period covered
2019 - 2021
Area covered
Malawi
Description
Abstract

To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created the harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.

The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.

Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors, and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.

Geographic coverage

National coverage

Analysis unit

Households

Individuals

Universe

The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

Kind of data

Sample survey data [ssd]

Sampling procedure

See “Malawi - Integrated Household Panel Survey 2010-2013-2016-2019 (Long-Term Panel, 102 EAs)” and “Malawi - High-Frequency Phone Survey on COVID-19” available in the Microdata Library for details.

Mode of data collection

Computer Assisted Personal Interview [capi]

Cleaning operations

Malawi Integrated Household Panel Survey (IHPS) 2019 and Malawi High-Frequency Phone Survey on COVID-19 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).

The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, a variable with “_r3” in its name was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.

Response rate

See “Malawi - Integrated Household Panel Survey 2010-2013-2016-2019 (Long-Term Panel, 102 EAs)” and “Malawi - High-Frequency Phone Survey on COVID-19” available in the Microdata Library for details.
Number and frequency (% within parentheses) of a variable being found...
plos.figshare.com
datasetcatalog.nlm.nih.gov
+1more
xls
Updated Jun 2, 2023
Cite
S. Matthew Drenner; Timothy D. Clark; Charlotte K. Whitney; Eduardo G. Martins; Steven J. Cooke; Scott G. Hinch (2023). Number and frequency (% within parentheses) of a variable being found significant out of the total number of significant findings for behaviour (n = 151) or survival (n = 66). [Dataset]. http://doi.org/10.1371/journal.pone.0031311.t005
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0031311.t005
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS http://plos.org/
Authors
S. Matthew Drenner; Timothy D. Clark; Charlotte K. Whitney; Eduardo G. Martins; Steven J. Cooke; Scott G. Hinch
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Note that the table is based on studies focusing solely on behaviour or survival, but not both.
Pre and Post-Exercise Heart Rate Analysis
kaggle.com
zip
Updated Sep 29, 2024
Cite
Abdullah M Almutairi (2024). Pre and Post-Exercise Heart Rate Analysis [Dataset]. https://www.kaggle.com/datasets/abdullahmalmutairi/pre-and-post-exercise-heart-rate-analysis
Explore at:
Available download formats: zip (3857 bytes)
Dataset updated
Sep 29, 2024
Authors
Abdullah M Almutairi
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview:

This dataset contains simulated (hypothetical) but almost realistic (based on AI) data related to sleep, heart rate, and exercise habits of 500 individuals. It includes both pre-exercise and post-exercise resting heart rates, allowing for analyses such as a dependent t-test (Paired Sample t-test) to observe changes in heart rate after an exercise program. The dataset also includes additional health-related variables, such as age, hours of sleep per night, and exercise frequency.

The data is designed for tasks involving hypothesis testing, health analytics, or even machine learning applications that predict changes in heart rate based on personal attributes and exercise behavior. It can be used to understand the relationships between exercise frequency, sleep, and changes in heart rate.

File:
- Filename: heart_rate_data.csv
- File Format: CSV

Features (Columns):

Age:
- Description: The age of the individual.
- Type: Integer
- Range: 18-60 years
- Relevance: Age is an important factor in determining heart rate and the effects of exercise.

Sleep Hours:
- Description: The average number of hours the individual sleeps per night.
- Type: Float
- Range: 3.0 - 10.0 hours
- Relevance: Sleep is a crucial health metric that can impact heart rate and exercise recovery.

Exercise Frequency (Days/Week):
- Description: The number of days per week the individual engages in physical exercise.
- Type: Integer
- Range: 1-7 days/week
- Relevance: More frequent exercise may lead to greater heart rate improvements and better cardiovascular health.

Resting Heart Rate Before:
- Description: The individual’s resting heart rate measured before beginning a 6-week exercise program.
- Type: Integer
- Range: 50 - 100 bpm (beats per minute)
- Relevance: This is a key health indicator, providing a baseline measurement for the individual’s heart rate.

Resting Heart Rate After:
- Description: The individual’s resting heart rate measured after completing the 6-week exercise program.
- Type: Integer
- Range: 45 - 95 bpm (lower than the "Resting Heart Rate Before" due to the effects of exercise).
- Relevance: This variable is essential for understanding how exercise affects heart rate over time, and it can be used to perform a dependent t-test analysis.

Max Heart Rate During Exercise:
- Description: The maximum heart rate the individual reached during exercise sessions.
- Type: Integer
- Range: 120 - 190 bpm
- Relevance: This metric helps in understanding cardiovascular strain during exercise and can be linked to exercise frequency or fitness levels.

Potential Uses: Dependent T-Test Analysis: The dataset is particularly suited for a dependent (paired) t-test where you compare the resting heart rate before and after the exercise program for each individual.

Exploratory Data Analysis (EDA): Investigate relationships between sleep, exercise frequency, and changes in heart rate. Potential analyses include correlations between sleep hours and resting heart rate improvement, or regression analyses to predict heart rate after exercise.

Machine Learning: Use the dataset for predictive modeling, and build a beginner regression model to predict post-exercise heart rate using age, sleep, and exercise frequency as features.

Health and Fitness Insights: This dataset can be useful for studying how different factors like sleep and age influence heart rate changes and overall cardiovascular health.
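
The dependent (paired) t-test mentioned among the uses above can be sketched with the standard library alone. The five before/after values below are made up for illustration and are not rows from heart_rate_data.csv; `paired_t` is a hypothetical helper. The statistic is the mean of the per-person differences divided by its standard error, with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Made-up resting heart rates (bpm) for five hypothetical individuals
before = [72, 80, 65, 90, 77]
after = [68, 75, 66, 84, 73]
t_stat, dof = paired_t(before, after)
```

A positive statistic here means the resting heart rate was lower after the program on average; a p-value would still need a t-distribution lookup, for which a statistics library is the usual route.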

License: Choose an appropriate open license, such as:

CC BY 4.0 (Attribution 4.0 International).

Inspiration for Kaggle Users: How does exercise frequency influence the reduction in resting heart rate? Is there a relationship between sleep and heart rate improvements post-exercise? Can we predict the post-exercise heart rate using other health variables? How do age and exercise frequency interact to affect heart rate?

Acknowledgments: This is a simulated dataset for educational purposes, generated to demonstrate statistical and machine learning applications in the field of health analytics.
COVID-19 National Longitudinal Phone Survey 2020 – World Bank LSMS...
microdata.worldbank.org
catalog.ihsn.org
+1more
Updated Oct 25, 2021
+ more versions
Cite
National Bureau of Statistics (NBS) (2021). COVID-19 National Longitudinal Phone Survey 2020 – World Bank LSMS Harmonized Dataset - Nigeria [Dataset]. https://microdata.worldbank.org/index.php/catalog/3856
Explore at:
Dataset updated
Oct 25, 2021
Dataset authored and provided by
National Bureau of Statistics (NBS)
Time period covered
2018 - 2021
Area covered
Nigeria
Description
Abstract

To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created the harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.

The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.

Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors, and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.

Geographic coverage

National coverage

Analysis unit

Households

Individuals

Universe

The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

Kind of data

Sample survey data [ssd]

Sampling procedure

See “Nigeria - General Household Survey, Panel 2018-2019, Wave 4” and “Nigeria - COVID-19 National Longitudinal Phone Survey 2020” available in the Microdata Library for details.

Mode of data collection

Computer Assisted Personal Interview [capi]

Cleaning operations

Nigeria General Household Survey, Panel (GHS-Panel) 2018-2019 and Nigeria COVID-19 National Longitudinal Phone Survey (COVID-19 NLPS) 2020 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).

The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, a variable with “_r3” in its name was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.

Response rate

See “Nigeria - General Household Survey, Panel 2018-2019, Wave 4” and “Nigeria - COVID-19 National Longitudinal Phone Survey 2020” available in the Microdata Library for details.
frequency and percentage distribution of the dependent variables.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Feb 10, 2025
Cite
Belay, Denekew Bitew; Mulat, Seniat; Birhan, Nigussie Adam; Chen, Ding-Geng (2025). frequency and percentage distribution of the dependent variables. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001376419
Explore at:
Dataset updated
Feb 10, 2025
Authors
Belay, Denekew Bitew; Mulat, Seniat; Birhan, Nigussie Adam; Chen, Ding-Geng
Description
frequency and percentage distribution of the dependent variables.
Data from: CoDEx-VFD: Controlled Disturbance Experiment - Variable Frequency...
gimi9.com
rdr.kuleuven.be
Cite
CoDEx-VFD: Controlled Disturbance Experiment - Variable Frequency Drive [Dataset]. https://gimi9.com/dataset/eu_doi-10-48804-n4h9hp/
Explore at:
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The CoDEx-VFD dataset provides time-series current measurements from a three-phase Variable Frequency Drive (VFD) system subjected to controlled electromagnetic disturbances (EMD). This dataset is designed for benchmarking and comparing anomaly detection algorithms in the context of electromagnetic compatibility (EMC). The data was collected under controlled laboratory conditions, with varying levels of disturbance severity and frequency, providing a valuable resource for researchers developing and evaluating methods for EMI detection and mitigation in electronic systems. The dataset comprises 100 CSV files, each representing a single measurement run with different anomaly scenarios. Measurements include two directly measured phase currents along with a binary label indicating the presence or absence of an injected disturbance at each time point. The sampling rate is 2.5 MHz, providing high temporal resolution for capturing transient EMI events. Key experimental parameters, including disturbance characteristics and equipment details, are documented in the accompanying README file.
Los Angeles, California, Earthquake Dataset
kaggle.com
zip
Updated Sep 17, 2025
Cite
Ahmed Mohamed Zaki (2025). Los Angeles, California, Earthquake Dataset [Dataset]. https://www.kaggle.com/datasets/ahmeduzaki/los-angeles-california-earthquake-dataset
Explore at:
Available download formats: zip (3002469 bytes)
Dataset updated
Sep 17, 2025
Authors
Ahmed Mohamed Zaki
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
California, Los Angeles
Description
Los Angeles, California, Earthquake Dataset with Feature-Engineered Variables

Dataset Description

This comprehensive earthquake dataset contains detailed records of seismic events in Southern California, specifically filtered to focus on a 100 km radius around Los Angeles from January 1, 2012, to September 1, 2024. The dataset was developed to support advanced machine learning and neural network algorithms for earthquake forecasting and prediction.

Key Characteristics:

Geographic Scope: 100 km radius around Los Angeles, California

Temporal Coverage: January 1, 2012 - September 1, 2024

File Size: 6.8 MB

Format: CSV file (LosAngeles_Earthquake_Dataset.csv)

Purpose: Machine learning-based earthquake prediction and forecasting

Data Source and Processing:

The dataset was compiled from the Southern California Earthquake Data Center (SCEDC) and underwent extensive preprocessing, including:
- Magnitude standardization to local magnitude (ML) scale
- Spatial filtering within 100 km radius of Los Angeles
- Feature engineering for enhanced predictive modeling
- Quality control to exclude inconsistent magnitude types

Columns Description

Primary Seismic Data:

Magnitude: Standardized local magnitude (ML) values for all seismic events

Depth: Depth of earthquake occurrence below surface level

Latitude/Longitude: Geographic coordinates of earthquake epicenters

DateTime: Timestamp of earthquake occurrence

Feature-Engineered Variables:

The dataset includes multiple engineered features designed to enhance predictive modeling capabilities:

Rolling Mean of Depth (d̄ᵢ): Moving average of earthquake depths, identified as the second most influential variable according to Information Gain analysis

Temporal Features: Time-based patterns and trends in seismic activity

Spatial Features: Geographic distribution patterns and clustering indicators

Magnitude Distribution Features: Statistical measures of magnitude patterns over time

Frequency Characteristics: Seismic frequency patterns and anomaly indicators

Seismic Activity Indicators: Patterns identifying potential precursory seismic behavior

Geographic Clustering Variables: Spatial relationship features between earthquake events

Temporal Sequence Features: Time-series patterns and dependencies

Target Variable:

Class: The maximum earthquake magnitude category that occurs within 30 days of each recorded event, classified into six distinct categories.
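
One way the quantity behind the target could be derived is sketched below. This is not the dataset authors' code: the events are made up, `forward_max_magnitude` is a hypothetical helper, and the window is assumed to include the event itself plus every later event up to 30 days after it. The thresholds that map this maximum onto the six categories are not given in this listing, so that step is left out.

```python
from datetime import datetime, timedelta


def forward_max_magnitude(events, horizon_days=30):
    """events: list of (datetime, magnitude) sorted by time.

    Returns, for each event, the largest magnitude observed from that
    event up to `horizon_days` later (the event itself is included).
    """
    horizon = timedelta(days=horizon_days)
    result = []
    for i, (t0, _) in enumerate(events):
        window = [m for t, m in events[i:] if t - t0 <= horizon]
        result.append(max(window))
    return result


# Made-up catalog entries, for illustration only
events = [
    (datetime(2020, 1, 1), 2.1),
    (datetime(2020, 1, 10), 3.4),
    (datetime(2020, 2, 5), 1.8),
    (datetime(2020, 3, 20), 4.0),
]
targets = forward_max_magnitude(events)
```

The loop is quadratic in the number of events, which keeps the sketch short; a sliding window would be the natural choice for a full catalog.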

Dataset Applications

Six-category classification of earthquake magnitude classes

Citation

DOI: Yavas, C. E., Chen, L., Kadlec, C., & Ji, Y. (2024). Los Angeles, California, Earthquake Dataset with Feature-Engineered Variables (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13738726
Frequency distributions of selected variables in HCC case and control...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 9, 2023
Cite
Hui-Feng Chen; Jian-Rong Mai; Jian-Xin Wan; Yan-fang Gao; Li-Na Lin; Song-Zi Wang; Yu-Xi Chen; Chen-Zi Zhang; Yu-Jing Zhang; Bin Xia; Kun Liao; Yu-Chun Lin; Zhong-Ning Lin (2023). Frequency distributions of selected variables in HCC case and control subjects. [Dataset]. http://doi.org/10.1371/journal.pone.0059574.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0059574.t001
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Hui-Feng Chen; Jian-Rong Mai; Jian-Xin Wan; Yan-fang Gao; Li-Na Lin; Song-Zi Wang; Yu-Xi Chen; Chen-Zi Zhang; Yu-Jing Zhang; Bin Xia; Kun Liao; Yu-Chun Lin; Zhong-Ning Lin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
a: P value for a two-sided χ² test. b: +/−: presence or absence of HBV infection.
Multi-LEX: a database of multi-word frequencies (English files)
data.europa.eu
data.niaid.nih.gov
unknown
Updated Oct 16, 2022
+ more versions
Cite
Zenodo (2022). Multi-LEX: a database of multi-word frequencies (English files) [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7214223?locale=cs
Explore at:
Available download formats: unknown (1405011)
Dataset updated
Oct 16, 2022
Dataset authored and provided by
Zenodo http://zenodo.org/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Written word frequency is a key variable used in many psycholinguistic studies and is central in explaining visual word recognition. Indeed, methodological advances on single word frequency estimates have helped to uncover novel language-related cognitive processes, fostering new ideas and studies. In an attempt to support and promote research on a related emerging topic, visual multi-word recognition, we extracted from the exhaustive Google Ngram datasets a selection of millions of multi-word sequences and computed their associated frequency estimate. Such sequences are presented with Part-of-Speech information for each individual word. An online behavioral investigation making use of the French 4-gram lexicon in a grammatical decision task was carried out. The results show an item-level frequency effect of word sequences. Moreover, the proposed datasets were found useful during the stimulus selection phase, allowing more precise control of the multi-word characteristics.
CRU CY4.08: Climatic Research Unit year-by-year variation of selected...
catalogue.ceda.ac.uk
Updated Jul 31, 2024
+ more versions
Cite
Ian C Harris; Philip D. Jones; Tim Osborn (2024). CRU CY4.08: Climatic Research Unit year-by-year variation of selected climate variables by country version 4.08 (Jan. 1901 - Dec. 2023) [Dataset]. https://catalogue.ceda.ac.uk/uuid/3b7f475a30a642e9af5323cef748bb00
Dataset updated
Jul 31, 2024
Dataset provided by
Centre for Environmental Data Analysis http://www.ceda.ac.uk/
Authors
Ian C Harris; Philip D. Jones; Tim Osborn
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Time period covered
Jan 1, 1901 - Dec 31, 2023
Description
The Climatic Research Unit (CRU) Country (CY) version 4.08 dataset consists of country averages of ten climate variables at monthly, seasonal and annual frequency: cloud cover, diurnal temperature range, frost day frequency, precipitation, daily mean temperature, monthly average daily maximum and minimum temperature, vapour pressure, potential evapotranspiration and wet day frequency. This version uses the updated set of country definitions; please see the appropriate Release Notes.

This dataset was produced in 2024 by CRU at the University of East Anglia and extends the CRU CY4.07 data to include 2023. The data are available as text files with the extension '.per' and can be opened by most text editors.

Spatial averages are calculated using area-weighted means. CRU CY4.08 is derived directly from the CRU time series (TS) 4.08 dataset. CRU CY version 4.08 spans the period 1901-2023 for 292 countries.

To understand the CRU CY4.08 dataset, it is important to understand the construction and limitations of the underlying dataset, CRU TS4.08. It is therefore recommended that all users read the Harris et al., 2020 paper and the CRU TS4.08 release notes listed in the online documentation on this record.

CRU CY data are available for download to all CEDA users.
Radio Frequency Interference Measurements of Industrial Machinery
catalog.data.gov
Updated Jul 9, 2025
Cite
National Institute of Standards and Technology (2025). Radio Frequency Interference Measurements of Industrial Machinery [Dataset]. https://catalog.data.gov/dataset/radio-frequency-interference-measurements-of-industrial-machinery
Dataset updated
Jul 9, 2025
Dataset provided by
National Institute of Standards and Technology http://www.nist.gov/
Description
The 2.4 GHz ISM band is shared by Wi-Fi, Bluetooth, Wireless HART, ISA100.11a, and several other industrial wireless systems. Our dataset contains comprehensive electromagnetic interference (EMI) measurements from machinery taken in various industrial environments. The measurements were taken at two frequencies: 900 MHz and 2.4 GHz. This dataset may be useful for understanding EMI emitters in factories and can be instrumental in developing interference mitigation strategies, aiding in RF band selection and enterprise frequency planning, improving wireless technology, and informing communications standardization activities such as the IEEE 3388 industrial wireless performance evaluation standard.

The interference measurements were taken in the following types of industrial environments:
1) Infrared Curing Machine: Curing process using infrared radiation producing EMI across the 2.4 GHz band.
2) Crane with an Unshielded VFD: Overhead gantry crane operating at 900 MHz with an unshielded variable frequency drive (VFD) causing broadband interference.
3) Microwave Dryer: Two independent sets of measurements of microwave oven baking machines used for a ceramic drying process. Multiple magnetrons are used, with a power output of 1100 watts each.
4) Unidentified Interference: General recording of the 2400 MHz band capturing both wireless network traffic and an unidentified broadband RFI emitter, possibly caused by an unshielded VFD.

NIST Disclaimer: Certain commercial equipment, instruments, or materials are identified in this publication in order to describe the experimental procedures and data adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.
Sea ice edge and type daily gridded data from 1978 to present derived from...
cds.climate.copernicus.eu
netcdf-4
Updated Oct 21, 2025
Cite
ECMWF (2025). Sea ice edge and type daily gridded data from 1978 to present derived from satellite observations [Dataset]. http://doi.org/10.24381/cds.29c46d83
Available download formats: netcdf-4
Unique identifier
https://doi.org/10.24381/cds.29c46d83
Dataset updated
Oct 21, 2025
Dataset provided by
European Centre for Medium-Range Weather Forecasts http://ecmwf.int/
Authors
ECMWF
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Oct 25, 1978 - Sep 24, 2025
Description
This dataset provides daily gridded data of sea ice edge and sea ice type derived from brightness temperatures measured by satellite passive microwave radiometers. Sea ice is an important component of our climate system and a sensitive indicator of climate change. Its presence or its retreat has a strong impact on air-sea interactions, the Earth’s energy budget as well as marine ecosystems. It is recognized by the Global Climate Observing System as an Essential Climate Variable. Sea ice edge and type are some of the parameters used to characterise sea ice. Other parameters include sea ice concentration and sea ice thickness, also available in the Climate Data Store. Sea ice edge and type are defined as follows:

Sea ice edge classifies the sea surface into open water, open ice, and closed ice depending on the amount of sea ice present in each grid cell. This variable is provided for both the Northern and Southern Hemispheres. Note that a sea ice concentration threshold of 30% is used to distinguish between open water and open ice, which differs from the 15% threshold commonly used for other sea ice products such as sea ice extent.

Sea ice type classifies ice-covered areas into two categories based on the age of the sea ice: multiyear ice versus seasonal first-year ice. This variable is currently only available for the Northern Hemisphere and limited to the extended boreal winter months (October through April). Sea ice type classification during summer is difficult due to the effect of melting at the ice surface which disturbs the passive microwave signature.

Both sea ice products are based on measurements from the series of Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave/Imager (SSM/I), and Special Sensor Microwave Imager/Sounder (SSMIS) sensors and share the same algorithm baseline. However, sea ice edge makes use of two lower frequencies near 19 GHz and 37 GHz and a higher frequency near 90 GHz whereas sea ice type only uses the two lower frequencies. This dataset combines Climate Data Records (CDRs), which are intended to have sufficient length, consistency, and continuity to assess climate variability and change, and Interim Climate Data Records (ICDRs), which provide regular temporal extensions to the CDRs and where consistency with the CDRs is expected but not extensively checked. For this dataset, both the CDR and ICDR parts of each product were generated using the same software and algorithms. The CDRs of sea ice edge and type currently extend from 25 October 1978 to 31 December 2020 whereas the corresponding ICDRs extend from January 2021 to present (with a 16-day latency behind real time). All data from the current release of the datasets (version 3.0) are Level-4 products, in which data gaps are filled by temporal and spatial interpolation. For product limitations and known issues, please consult the Product User Guide. This dataset is produced on behalf of Copernicus Climate Change Service (C3S), with heritage from the operational products generated by EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI SAF).
CRU CY3.21: Climatic Research Unit (CRU) Year-by-Year Variation of Selected...
catalogue.ceda.ac.uk
data-search.nerc.ac.uk
Updated Sep 25, 2013
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ian C Harris (2013). CRU CY3.21: Climatic Research Unit (CRU) Year-by-Year Variation of Selected Climate Variables by CountrY (CY) version 3.21 (Jan. 1901 - Dec. 2012) [Dataset]. https://catalogue.ceda.ac.uk/uuid/8482b5af7dded1f1b94a4a9ac4ce8a26
Dataset updated
Sep 25, 2013
Dataset provided by
NCAS British Atmospheric Data Centre (NCAS BADC)
Authors
Ian C Harris
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Time period covered
Jan 1, 1901 - Dec 31, 2012
Description
The CRU CY3.21 dataset consists of country averages at monthly, seasonal and annual frequency for ten climate variables in 289 countries for the period Jan. 1901 to Dec. 2012. It was produced in 2013 by the Climatic Research Unit (CRU) at the University of East Anglia. Spatial averages are calculated using area-weighted means. Variables include cloud cover (cld), diurnal temperature range (dtr), frost day frequency (frs), precipitation (pre), daily mean temperature (tmp), monthly average daily maximum (tmx) and minimum (tmn) temperature, vapour pressure (vap), potential evapotranspiration (pet) and wet day frequency (wet).

CRU CY3.21 is derived directly from the CRU TS3.21 dataset. Version numbering is matched between the two datasets. The data are available as text files with the extension '.per' and can be opened by most text editors.

To understand the CRU CY3.21 dataset, it is important to understand the construction and limitations of the underlying dataset, CRU TS3.21. It is therefore recommended that all users read the paper by Harris et al. (2014).

CRU CY data are available for download to all CEDA users.
Lake variables - Chlorophyll from Erken
researchdata.se
demo.researchdata.se
Updated Sep 18, 2025
Cite
Erken Laboratory (2025). Lake variables - Chlorophyll from Erken [Dataset]. https://researchdata.se/en/catalogue/dataset/sites-w1di9z4rhvxirqr7fxm02trb
Dataset updated
Sep 18, 2025
Dataset provided by
Uppsala University
Authors
Erken Laboratory
Time period covered
Apr 16, 2019 - Nov 13, 2024
Description
A high-frequency record of chlorophyll fluorescence measurements collected during both under-ice and open-water periods using sonde technology. The dataset includes values adjusted using laboratory measurements.

Erken Laboratory (2025). Lake variables - Chlorophyll from Erken, 2019-04-17–2024-11-13 [Data set]. Swedish Infrastructure for Ecosystem Science (SITES). https://hdl.handle.net/11676.1/w1di9Z4rHVXirQR7FxM02TRB
