98 datasets found
  1. Dataset for The effects of a number line intervention on calculation skills

    • researchdata.edu.au
    • figshare.mq.edu.au
    Updated May 18, 2023
    Cite
    Saskia Kohnen; Rebecca Bull; Carola Ruiz Hornblas (2023). Dataset for The effects of a number line intervention on calculation skills [Dataset]. http://doi.org/10.25949/22799717.V1
    Dataset updated
    May 18, 2023
    Dataset provided by
    Macquarie University
    Authors
    Saskia Kohnen; Rebecca Bull; Carola Ruiz Hornblas
    Description

    Study information

    The sample included in this dataset represents five children who participated in a number line intervention study. Originally six children were included in the study, but one of them fulfilled the criterion for exclusion after missing several consecutive sessions. Thus, their data is not included in the dataset.

    All participants were attending Year 1 of primary school at an independent school in New South Wales, Australia. To be eligible to participate, children had to present with low mathematics achievement, performing at or below the 25th percentile on the Maths Problem Solving and/or Numerical Operations subtests from the Wechsler Individual Achievement Test III (WIAT III A & NZ, Wechsler, 2016). Children were excluded from participating if, as reported by their parents, they had any other diagnosed disorder, such as attention deficit hyperactivity disorder, autism spectrum disorder, intellectual disability, developmental language disorder, cerebral palsy or uncorrected sensory disorders.

    The study followed a multiple baseline case series design, with a baseline phase, a treatment phase, and a post-treatment phase. The baseline phase varied between two and three measurement points, the treatment phase varied between four and seven measurement points, and all participants had one post-treatment measurement point.

    The measurement points were distributed across participants as follows:

    Participant 1 – 3 baseline, 6 treatment, 1 post-treatment

    Participant 3 – 2 baseline, 7 treatment, 1 post-treatment

    Participant 5 – 2 baseline, 5 treatment, 1 post-treatment

    Participant 6 – 3 baseline, 4 treatment, 1 post-treatment

    Participant 7 – 2 baseline, 5 treatment, 1 post-treatment

    In each session across all three phases children were assessed in their performance on a number line estimation task, a single-digit computation task, a multi-digit computation task, a dot comparison task and a number comparison task. Furthermore, during the treatment phase, all children completed the intervention task after these assessments. The order of the assessment tasks varied randomly between sessions.


    Measures

    Number Line Estimation. Children completed a computerised bounded number line task (0-100). The number line is presented in the middle of the screen, and the target number is presented above the start point of the number line to avoid signalling the midpoint (Dackermann et al., 2018). Target numbers included two non-overlapping sets (trained and untrained) of 30 items each. Untrained items were assessed in all phases of the study. Trained items were assessed independently of the intervention during the baseline and post-treatment phases, and performance on the intervention is used to index performance on the trained set during the treatment phase. Within each set, numbers were equally distributed throughout the number range, with three items within each ten (0-10, 11-20, 21-30, etc.). Target numbers were presented in random order. Participants did not receive performance-based feedback. Accuracy is indexed by percent absolute error: PAE = [|estimated number − target number| / scale of the number line] × 100.
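    For concreteness, a minimal sketch of this PAE computation in Python (the function and example values are illustrative, not taken from the dataset):

        # Percent absolute error (PAE) on a bounded 0-100 number line.
        def pae(estimate: float, target: float, scale: float = 100.0) -> float:
            """PAE = |estimate - target| / scale * 100."""
            return abs(estimate - target) / scale * 100.0

        # Trial-level PAEs averaged into a session-level score.
        estimates, targets = [47, 12, 88], [50, 10, 90]
        trial_pae = [pae(e, t) for e, t in zip(estimates, targets)]
        print(trial_pae, sum(trial_pae) / len(trial_pae))  # [3.0, 2.0, 2.0] 2.33...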


    Single-Digit Computation. The task included ten additions with single-digit addends (1-9) and single-digit results (2-9). The order was counterbalanced so that half of the additions presented the lowest addend first (e.g., 3 + 5) and half presented the highest addend first (e.g., 6 + 3). The task also included ten subtractions with single-digit minuends (3-9), subtrahends (1-6) and differences (1-6). Items were presented horizontally on the screen, accompanied by a sound, and participants were required to give a verbal response. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.


    Multi-digit computational estimation. The task included eight additions and eight subtractions presented with double-digit numbers and three response options. None of the response options represented the correct result; participants were asked to select the option closest to the correct result. In half of the items the calculation involved two double-digit numbers, and in the other half one double-digit and one single-digit number. The distance between the correct response option and the exact result of the calculation was two for half of the trials and three for the other half. The calculation was presented vertically on the screen with the three options shown below, and remained on the screen until participants responded by clicking on one of the options. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.


    Dot Comparison and Number Comparison. Both tasks included the same 20 items, which were presented twice, counterbalancing left and right presentation. Magnitudes to be compared were between 5 and 99, with four items for each of the following ratios: .91, .83, .77, .71, .67. Both quantities were presented horizontally side by side, and participants were instructed to press one of two keys (F or J), as quickly as possible, to indicate the larger one. Items were presented in random order and participants did not receive performance-based feedback. In the non-symbolic comparison task (dot comparison) the two sets of dots remained on the screen for a maximum of two seconds (to prevent counting); overall area and convex hull for both sets of dots were kept constant following Guillaume et al. (2020). In the symbolic comparison task (Arabic numbers), the numbers remained on the screen until a response was given. Performance on both tasks was indexed by accuracy.


    The Number Line Intervention

    During the intervention sessions, participants estimated the position of 30 Arabic numbers on a 0-100 bounded number line. As a form of feedback, within each item, the participant’s estimate remained visible and the correct position of the target number appeared on the number line. When the estimate’s PAE was lower than 2.5, a message appeared on the screen that read “Excellent job”; when PAE was between 2.5 and 5, the message read “Well done, so close!”; and when PAE was higher than 5, the message read “Good try!” Numbers were presented in random order.
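    The feedback rule reduces to a simple threshold function; a minimal sketch in Python (behaviour at exactly 2.5 and 5 is an assumption, since the description leaves the boundaries open):

        def feedback(pae: float) -> str:
            # Thresholds follow the description: < 2.5, 2.5-5, > 5.
            if pae < 2.5:
                return "Excellent job"
            elif pae <= 5:
                return "Well done, so close!"
            return "Good try!"

        print(feedback(1.0))  # Excellent job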


    Variables in the dataset

    Age = age in ‘years, months’ at the start of the study

    Sex = female/male/non-binary or third gender/prefer not to say (as reported by parents)

    Math_Problem_Solving_raw = Raw score on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Math_Problem_Solving_Percentile = Percentile equivalent on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Num_Ops_Raw = Raw score on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Num_Ops_Percentile = Percentile equivalent on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).


    The remaining variables refer to participants’ performance on the study tasks. Each variable name is composed of three parts. The first refers to the phase and session: for example, Base1 refers to the first measurement point of the baseline phase, Treat1 to the first measurement point of the treatment phase, and post1 to the first measurement point of the post-treatment phase.


    The second part of the variable name refers to the task, as follows:

    DC = dot comparison

    SDC = single-digit computation

    NLE_UT = number line estimation (untrained set)

    NLE_T= number line estimation (trained set)

    CE = multidigit computational estimation

    NC = number comparison

    The final part of the variable name refers to the type of measure being used (i.e., acc = total correct responses and pae = percent absolute error).


    Thus, variable Base2_NC_acc corresponds to accuracy on the number comparison task during the second measurement point of the baseline phase and Treat3_NLE_UT_pae refers to the percent absolute error on the untrained set of the number line task during the third session of the Treatment phase.
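    A hypothetical parser for this naming convention (the helper is illustrative and not part of the dataset):

        TASKS = {"DC", "SDC", "NLE_UT", "NLE_T", "CE", "NC"}

        def parse_variable(name: str):
            # e.g. "Treat3_NLE_UT_pae" -> ("Treat3", "NLE_UT", "pae")
            phase_session, rest = name.split("_", 1)
            task, measure = rest.rsplit("_", 1)
            assert task in TASKS and measure in {"acc", "pae"}
            return phase_session, task, measure

        print(parse_variable("Base2_NC_acc"))       # ('Base2', 'NC', 'acc')
        print(parse_variable("Treat3_NLE_UT_pae"))  # ('Treat3', 'NLE_UT', 'pae')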





  2. Fused Image dataset for convolutional neural Network-based crack Detection (FIND)

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Apr 20, 2023
    Cite
    Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6383043
    Dataset updated
    Apr 20, 2023
    Dataset provided by
    Carlos Canchila
    Wei Song
    Shanglian Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

    The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.
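    For orientation only: the fusion used to build FIND is defined in [2,3], but because the intensity and range images are spatially co-registered, combining them can be sketched as channel stacking (an assumption made for illustration, not the authors' method):

        import numpy as np

        # Hypothetical co-registered 256x256 patches (the dataset's patch size).
        intensity = np.random.rand(256, 256).astype(np.float32)  # raw intensity
        range_img = np.random.rand(256, 256).astype(np.float32)  # raw range (elevation)

        # Channel stacking is one simple fusion strategy; see [2,3] for the
        # cross-domain fusion actually used for the fused image data.
        fused = np.stack([intensity, range_img], axis=-1)  # shape (256, 256, 2)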

    If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

    In addition, an image dataset for crack classification has also been published at [6].

    References:

    [1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

    [2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

    [3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

    [4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

    [5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

    [6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

  3. Dataset of books called A gyrokinetic calculation of transmission & reflection of the fast wave in the ion cyclotron range of frequencies

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called A gyrokinetic calculation of transmission & reflection of the fast wave in the ion cyclotron range of frequencies [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=A+gyrokinetic+calculation+of+transmission+%26+reflection+of+the+fast+wave+in+the+ion+cyclotron+range+of+frequencies
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered to the book titled A gyrokinetic calculation of transmission & reflection of the fast wave in the ion cyclotron range of frequencies. It features 7 columns including author, publication date, language, and book publisher.

  4. Control Measure Dataset

    • datasets.ai
    • catalog.data.gov
    Updated Sep 21, 2024
    Cite
    U.S. Environmental Protection Agency (2024). Control Measure Dataset [Dataset]. https://datasets.ai/datasets/control-measure-dataset
    Dataset updated
    Sep 21, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Authors
    U.S. Environmental Protection Agency
    Description

    The EPA Control Measure Dataset is a collection of documents describing air pollution control measures available to regulated facilities for the control and abatement of air pollution emissions from a range of regulated source types, whether applied directly through technical measures or indirectly through economic or other measures.

  5. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects the identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    File format: R workspace file, “Simulated_Dataset.RData”.

    Metadata (including data dictionary):

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code abstract: We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    “CWVS_LMC.txt”: This code is delivered to the user as a .txt file containing R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the code can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered as a .txt file containing R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Required R packages:

    • For running “CWVS_LMC.txt”: msm (sampling from the truncated normal distribution), mnormt (sampling from the multivariate normal distribution), BayesLogit (sampling from the Polya-Gamma distribution)
    • For running “Results_Summary.txt”: plotrix (plotting the posterior means and credible intervals)

    Instructions for use and reproducibility:

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”

    Data: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This also allows the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, Oxford, UK, 1-30, (2019).

  6. Collection of example datasets used for the book - R Programming - Statistical Data Analysis in Research

    • figshare.com
    txt
    Updated Dec 4, 2023
    Cite
    Kingsley Okoye; Samira Hosseini (2023). Collection of example datasets used for the book - R Programming - Statistical Data Analysis in Research [Dataset]. http://doi.org/10.6084/m9.figshare.24728073.v1
    Available download formats: txt
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    figshare
    Authors
    Kingsley Okoye; Samira Hosseini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is open-source software and an object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provide a wide range of functions for programming and analysing data. Unlike many existing statistical software packages, R has the added benefit of allowing users to write more efficient code using command-line scripting and vectors. It has several built-in functions and libraries that are extensible, and it allows users to define their own (customised) functions for how they expect the program to behave while handling the data, which can also be stored in the simple object system.

    For all intents and purposes, this book serves as both a textbook and a manual for R statistics, particularly in academic research, data analytics, and computer programming, targeted to help inform and guide the work of R users and statisticians. It provides information about the different types of statistical data analysis and methods, and the best scenarios for using each in R. It gives a hands-on, step-by-step practical guide to identifying and conducting the different parametric and non-parametric procedures. This includes a description of the conditions or assumptions necessary for performing the various statistical methods or tests, and how to understand their results. The book also covers the different data formats and sources, and how to test the reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R, particularly for research purposes, with examples: from how to import and store datasets in R as objects, how to code and call the methods or functions for manipulating the datasets or objects, factorisation, and vectorisation, to better reasoning, interpretation, and storage of the results for future use, and graphical visualisations and representations. In this way, it brings statistics and computer programming together for research.

  7. Data from: Half interpercentile range (half of the difference between the 16th and 84th percentiles) of wave-current bottom shear stress in the Middle Atlantic Bight for May, 2010 - May, 2011 (MAB_hIPR.SHP)

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Half interpercentile range (half of the difference between the 16th and 84th percentiles) of wave-current bottom shear stress in the Middle Atlantic Bight for May, 2010 - May, 2011 (MAB_hIPR.SHP) [Dataset]. https://catalog.data.gov/dataset/half-interpercentile-range-half-of-the-difference-between-the-16th-and-84th-percentiles-of
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
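    As a minimal illustration of the statistical descriptors named here (the median and 95th percentile included in the database, plus the half interpercentile range from the dataset title), assuming an hourly bottom shear stress series in a NumPy array:

        import numpy as np

        # Hypothetical hourly bottom shear stress (Pa) for one grid cell.
        stress = np.abs(np.random.randn(365 * 24))

        median = np.percentile(stress, 50)   # median stress
        p95 = np.percentile(stress, 95)      # 95th percentile
        half_ipr = (np.percentile(stress, 84) - np.percentile(stress, 16)) / 2
        print(median, p95, half_ipr)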

  8. Consecutive Bates Range - Gap Finder

    • kaggle.com
    Updated Sep 15, 2023
    Cite
    Patrick Zelazko (2023). Consecutive Bates Range - Gap Finder [Dataset]. https://www.kaggle.com/datasets/patrickzel/consecutive-bates-range
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Patrick Zelazko
    License

    Public Domain Dedication (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Here's a sample Production Bates Range for a Gap Analysis exercise via Python. It's a CSV with one column containing a range of numbers following the convention "D0000001, D0000002, .... D0099999."

    This script can be run against a variable/column on a document production index to identify document sequence gaps, which can be helpful to determine missing documents in a set or to diagnose a technical issue during data processing or exchange phases.

    More broadly, this code can be updated to apply over any sequential data range (dates, student ID, serial number, item number, etc.), to show any gaps or available digits.
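    A hedged sketch of such a gap check in Python (the file and column names are assumptions, following the "D0000001" convention above):

        import csv

        # "bates_range.csv" and "BatesNumber" are hypothetical names.
        with open("bates_range.csv", newline="") as f:
            bates = [row["BatesNumber"] for row in csv.DictReader(f)]

        # Strip the "D" prefix, then report missing values in the sequence.
        numbers = sorted(int(b[1:]) for b in bates)
        missing = sorted(set(range(numbers[0], numbers[-1] + 1)) - set(numbers))
        print([f"D{n:07d}" for n in missing])  # e.g. ['D0000042']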

  9. Data from: Traffic and Log Data Captured During a Cyber Defense Exercise

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2020
    Cite
    Jan Vykopal (2020). Traffic and Log Data Captured During a Cyber Defense Exercise [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3746128
    Dataset updated
    Jun 12, 2020
    Dataset provided by
    Stanislav Špaček
    Jan Vykopal
    Daniel Tovarňák
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was acquired during Cyber Czech – a hands-on cyber defense exercise (Red Team/Blue Team) held in March 2019 at Masaryk University, Brno, Czech Republic. Network traffic flows and a high variety of event logs were captured in an exercise network deployed in the KYPO Cyber Range Platform.

    Contents

    The dataset covers two distinct time intervals, which correspond to the official schedule of the exercise. The timestamps provided below are in the ISO 8601 date format.

    Day 1, March 19, 2019

    Start: 2019-03-19T11:00:00.000000+01:00

    End: 2019-03-19T18:00:00.000000+01:00

    Day 2, March 20, 2019

    Start: 2019-03-20T08:00:00.000000+01:00

    End: 2019-03-20T15:30:00.000000+01:00

    The captured and collected data were normalized into three distinct event types and they are stored as structured JSON. The data are sorted by a timestamp, which represents the time they were observed. Each event type includes a raw payload ready for further processing and analysis. The description of the respective event types and the corresponding data files follows.

    cz.muni.csirt.IpfixEntry.tgz – an archive of IPFIX traffic flows enriched with an additional payload of parsed application protocols in raw JSON.

    cz.muni.csirt.SyslogEntry.tgz – an archive of Linux Syslog entries with the payload of corresponding text-based log messages.

    cz.muni.csirt.WinlogEntry.tgz – an archive of Windows Event Log entries with the payload of original events in raw XML.

    Each archive listed above includes a directory of the same name with the following four files, ready to be processed.

    data.json.gz – the actual data entries in a single gzipped JSON file.

    dictionary.yml – data dictionary for the entries.

    schema.ddl – data schema for Apache Spark analytics engine.

    schema.jsch – JSON schema for the entries.
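    A hedged sketch of streaming one of the data.json.gz files listed above; whether entries are newline-delimited JSON should be verified against the bundled schema.jsch and dictionary.yml, and the field names here are hypothetical:

        import gzip
        import json

        # Stream events from one archive's data file (path from the listing above).
        with gzip.open("cz.muni.csirt.SyslogEntry/data.json.gz", "rt") as f:
            for line in f:
                entry = json.loads(line)  # assumes one JSON entry per line
                # Entries are sorted by observation timestamp per the description;
                # "timestamp" and "payload" are hypothetical field names.
                print(entry.get("timestamp"), entry.get("payload"))
                break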

    Finally, the exercise network topology is described in a machine-readable NetJSON format and is part of an archive of auxiliary files – auxiliary-material.tgz – which includes the following.

    global-gateway-config.json – the network configuration of the global gateway in the NetJSON format.

    global-gateway-routing.json – the routing configuration of the global gateway in the NetJSON format.

    redteam-attack-schedule.{csv,odt} – the schedule of the Red Team attacks in CSV and ODT format. Source for Table 2.

    redteam-reserved-ip-ranges.{csv,odt} – the list of IP segments reserved for the Red Team in CSV and ODT format. Source for Table 1.

    topology.{json,pdf,png} – the topology of the complete Cyber Czech exercise network in the NetJSON, PDF and PNG format.

    topology-small.{pdf,png} – simplified topology in the PDF and PNG format. Source for Figure 1.

  10. National Residential Efficiency Measures Database (REMDB)

    • data.openei.org
    • s.cnmilf.com
    data, website
    Updated Sep 29, 2023
    Cite
    Nathan Moore; Noel Merket; Scott Horowitz; Micah Webb; Dave Roberts; Brennan Less (2023). National Residential Efficiency Measures Database (REMDB) [Dataset]. https://data.openei.org/submissions/8336
    Available download formats: data, website
    Dataset updated
    Sep 29, 2023
    Dataset provided by
    United States Department of Energy (http://energy.gov/)
    National Renewable Energy Lab - NREL
    Open Energy Data Initiative (OEDI)
    Authors
    Nathan Moore; Noel Merket; Scott Horowitz; Micah Webb; Dave Roberts; Brennan Less
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project provides a national unified database of residential building retrofit measures and the associated retail prices an end-user might experience. These data are accessible to software programs that evaluate the most cost-effective retrofit measures for improving the energy efficiency of residential buildings, and they are used in the consumer-facing website https://remdb.nrel.gov/

    This publicly accessible, centralized database of retrofit measures offers the following benefits:

    • Provides information in a standardized format
    • Improves the technical consistency and accuracy of the results of software programs
    • Enables experts and stakeholders to view the retrofit information and provide comments to improve data quality
    • Supports building science R&D
    • Enhances transparency

    This database provides full price estimates for many different retrofit measures. For each measure, the database provides a range of prices, as the data for a measure can vary widely across regions, houses, and contractors. Climate, construction, home features, local economy, maturity of a market, and geographic location are some of the factors that may affect the actual price of these measures.

    This database is not intended to provide specific cost estimates for a specific project. The cost estimates do not include any rebates or tax incentives that may be available for the measures. Rather, it is meant to help determine which measures may be more cost-effective. The National Renewable Energy Laboratory (NREL) makes every effort to ensure accuracy of the data; however, NREL does not assume any legal liability or responsibility for the accuracy or completeness of the information.

  11. Dataset for the paper "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023"

    • zenodo.org
    Updated Mar 26, 2025
    Cite
    Yide Qian; Yide Qian (2025). Dataset for the paper "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023 " [Dataset]. http://doi.org/10.5281/zenodo.15022854
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yide Qian; Yide Qian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Pine Island Glacier
    Description

    Dataset and codes for "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023 "

    • Description of the data and file structure

    The MATLAB codes and related datasets are used for generating the figures for the paper "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023".

    Files and variables

    File 1: Data_and_Code.zip

    Directory: Main_function

    Description: Includes MATLAB scripts and functions. Each script includes a description that guides the user on how to use it and where to find the dataset used for processing.

    MATLAB Main Scripts: include all the steps to process the data and to output figures and videos.

    Script_1_Ice_velocity_process_flow.m

    Script_2_strain_rate_process_flow.m

    Script_3_DROT_grounding_line_extraction.m

    Script_4_Read_ICESat2_h5_files.m

    Script_5_Extraction_results.m

    MATLAB functions: files of MATLAB functions that support the main scripts:

    1_Ice_velocity_code: MATLAB functions for ice velocity post-processing, including outlier removal, filtering, correction for atmospheric and tidal effects, inverse-weighted averaging, and error estimation.

    2_strain_rate: MATLAB functions for strain rate calculation.

    3_DROT_extract_grounding_line_code: MATLAB functions for converting the range-offset results output by GAMMA into differential vertical displacement, and for using the result to extract the grounding line.

    4_Extract_data_from_2D_result: MATLAB functions for extracting profiles from 2D data.

    5_NeRD_Damage_detection: modified code from Izeboud et al. 2023. When applying this code, please also cite Izeboud et al. 2023 (https://www.sciencedirect.com/science/article/pii/S0034425722004655).

    6_Figure_plotting_code: MATLAB functions for the figures in the paper and supporting information.

    Directory: data_and_result

    Description: Includes directories that store the results output from MATLAB. Users only need to modify the paths in the MATLAB scripts to their own paths.

    1_origin: Sample data ("PS-20180323-20180329", "PS-20180329-20180404", "PS-20180404-20180410") output from the GAMMA software in GeoTIFF format, which can be used to calculate DROT and velocity. Includes displacement, theta, phi, and ccp.

    2_maskccpN: Remove outliers by ccp < 0.05 and change displacement to velocity (m/day).

    3_rockpoint: Extract velocities at non-moving region

    4_constant_detrend: removed orbit error

    5_Tidal_correction: remove atmospheric and tidal induced error

    6_rockpoint: Extract non-aggregated velocities at non-moving region

    6_vx_vy_v: transform velocities from va/vr to vx/vy

    7_rockpoint: Extract aggregated velocities at non-moving region

    7_vx_vy_v_aggregate_and_error_estimate: inverse-weighted average of the three ice velocity maps and calculation of the error maps

    8_strain_rate: strain rate calculated from the aggregated ice velocity

    9_compare: stores the results before and after tidal correction and aggregation

    10_Block_result: time-series results extracted from the 2D data

    11_MALAB_output_png_result: stores .png files and time-series results

    12_DROT: Differential Range Offset Tracking results

    13_ICESat_2: ICESat-2 .h5 and .mat files can be put here (only the samples from tracks 0965 and 1094 are included)

    14_MODIS_images: you can store MODIS images here

    shp: grounding line, rock region, ice front, and other shape files.

    File 2: PIG_front_1947_2023.zip

    Includes ice front position shapefiles from 1947 to 2023, used for plotting Figure 1 in the paper.

    File 3: PIG_DROT_GL_2016_2021.zip

    Includes grounding line position shapefiles from 2016 to 2021, used for plotting Figure 1 in the paper.

    Data was derived from the following sources: the links can be found in the MATLAB scripts or in the paper's "Open Research" section.

  12. City and County of Denver: Range Points

    • data.wu.ac.at
    application/acad, csv +3
    Updated Oct 7, 2018
    Cite
    City and County of Denver (2018). City and County of Denver: Range Points [Dataset]. https://data.wu.ac.at/schema/data_opencolorado_org/MzA2NTQ1ZWMtMTZjNy00ZGUzLWIxODctN2Y1ODUxNjI4MzJm
    Available download formats: kmz (1289991.0), zip (1072194.0), xml (12990.0), csv (2532681.0), application/acad (448532.0), zip (1083119.0)
    Dataset updated
    Oct 7, 2018
    Dataset provided by
    City and County of Denver
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Denver
    Description

    This dataset is a point feature representing range points within the City and County of Denver. Range points are termini for range lines, which serve as offsets to right-of-way lines and block lines. Range points are typically located below surface streets.

    Disclaimer

    ACCESS CONSTRAINTS:
    None.

    USE CONSTRAINTS: The City and County of Denver is not responsible and shall not be liable to the user for damages of any kind arising out of the use of data or information provided by the City and County of Denver, including the installation of the data or information, its use, or the results obtained from its use.

    ANY DATA OR INFORMATION PROVIDED BY THE City and County of Denver IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Data or information provided by the City and County of Denver shall be used and relied upon only at the user's sole risk, and the user agrees to indemnify and hold harmless the City and County of Denver, its officials, officers and employees from any liability arising out of the use of the data/information provided.

    NOT FOR ENGINEERING PURPOSES

  13. E-Commerce Healthcare Orders Dataset

    • kaggle.com
    Updated Sep 4, 2021
    Cite
    Adish Golechha (2021). E-Commerce Healthcare Orders Dataset [Dataset]. https://www.kaggle.com/datasets/adishgolechha/ecommerce-healthcare-orders-dataset/code
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 4, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Adish Golechha
    Description

    Context

    XYZ Pvt Ltd is an e-commerce company dealing in a wide range of healthy products, combined with the power of artificial intelligence. Recently, it has started facing an issue of high return rates throughout India. (A return order is when the order is in transit but the customer refuses to accept it, citing various reasons.)

    Content

    The dataset has 1600 orders, with details ranging from city and state (for geographical analysis) to dates (for time-series analysis); each product's category, name, cost and ID are also given for more detailed analysis.

    If there are columns you would like me to add please let me know in the comments.

    The latest data has been cleaned.

    Inspiration

    Study the dataset to figure out the return-rate patterns amongst the customers. Every column has been carefully added so you can analyse which ones may or may not directly influence the return rates.

  14. CFRAM Model Nodes - Mid-Range Future Scenario - Dataset - data.gov.ie

    • data.gov.ie
    Updated Mar 15, 2025
    Cite
    data.gov.ie (2025). CFRAM Model Nodes - Mid-Range Future Scenario - Dataset - data.gov.ie [Dataset]. https://data.gov.ie/dataset/cfram-model-nodes-mrfs
    Dataset updated
    Mar 15, 2025
    Dataset provided by
    data.gov.ie
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Abstract: This data shows the model nodes, indicating water level only and/or flow and water levels along the centre-line of rivers that have been modelled to generate the CFRAM flood maps. The nodes estimate maximum design event flood flows and maximum flood levels. Flood event probabilities are referred to in terms of a percentage Annual Exceedance Probability, or ‘AEP’. This represents the probability of an event of this, or greater, severity occurring in any given year. These probabilities may also be expressed as odds (e.g. 100 to 1) of the event occurring in any given year. They are also commonly referred to in terms of a return period (e.g. the 100-year flood), although this period is not the length of time that will elapse between two such events occurring, as, although unlikely, two very severe events may occur within a short space of time.

    The following sets out the range of flood event probabilities for which fluvial and coastal flood maps are typically developed, expressed in terms of Annual Exceedance Probability (AEP), with their parallels under other forms of expression:

    10% (High Probability) AEP, which can also be expressed as the 10-year return period and as 10:1 odds of occurrence in any given year.

    1% (Medium Probability - Fluvial/River Flood Maps) AEP, which can also be expressed as the 100-year return period and as 100:1 odds of occurrence in any given year.

    0.5% (Medium Probability - Coastal Flood Maps) AEP, which can also be expressed as the 200-year return period and as 200:1 odds of occurrence in any given year.

    0.1% (Low Probability) AEP, which can also be expressed as the 1000-year return period and as 1000:1 odds of occurrence in any given year.

    The Mid-Range Future Scenario extents were generated taking into account the potential effects of climate change, using an increase in rainfall of 20% and a sea level rise of 500mm (20 inches). Data has been produced for the 'Areas of Further Assessment' (AFAs), as required by the EU 'Floods' Directive [2007/60/EC] and designated under the Preliminary Flood Risk Assessment, and also for other reaches between the AFAs and down to the sea that are referred to as 'Medium Priority Watercourses' (MPWs). River reaches that have been modelled are indicated by the CFRAM Modelled River Centrelines dataset. Flooding from other reaches of river may occur, but has not been mapped, and so areas that are not shown as being within a flood extent may still be at risk of flooding from unmodelled rivers (as well as from other sources). The purpose of the Flood Maps is not to designate individual properties at risk of flooding; they are community-based maps.

    Lineage: Fluvial and coastal flood map data is developed using hydrodynamic modelling, based on calculated design river flows and extreme sea levels, surveyed channel cross-sections, in-bank / bank-side / coastal structures, Digital Terrain Models, and other relevant datasets (e.g. land use, data on past floods for model calibration, etc.). The process may vary for particular areas or maps. Technical Hydrology and Hydraulics Reports set out full technical details on the derivation of the flood maps. For fluvial flood levels, calibration and verification of the models make use of the best available data, including hydrometric records, photographs, videos, press articles and anecdotal information. Subject to the availability of suitable calibration data, models are verified in so far as possible to target vertical water level accuracies of approximately +/-0.2m for areas within the AFAs, and approximately +/-0.4m along the MPWs. For coastal flood levels, the accuracy of the predicted annual exceedance probability (AEP) of combined tide and surge levels depends on the accuracy of the various components used in deriving these levels, i.e. the accuracy of the tidal and surge model, the accuracy of the statistical data, and the accuracy of the conversion from marine datum to land levelling datum. The output of the water level modelling, combined with the extreme value analysis undertaken as detailed above, is generally within +/-0.2m for confidence limits of 95% at the 0.1% AEP. Higher probability (lower return period) events are expected to have tighter confidence limits.

    v101 (March 2025): The section of map near Oranmore, Galway was updated following a map review process; see https://www.floodinfo.ie/map-review/ for further information. Map Review Code: MR019.

    v102 (July 2025): The section of map near Claregalway was updated following a map review process; see https://www.floodinfo.ie/map-review/ for further information. Map Review Code: MR057.

    Purpose: The data has been developed to comply with the requirements of the European Communities (Assessment and Management of Flood Risks) Regulations 2010 to 2015 (the “Regulations”) (implementing Directive 2007/60/EC) for the purposes of establishing a framework for the assessment and management of flood risks, aiming at the reduction of adverse consequences for human health, the environment, cultural heritage and economic activity associated with floods.
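    A worked illustration (not part of the dataset) of the relationship between AEP, return period and odds set out above, using return period T = 1/AEP:

        # AEP <-> return period <-> odds, as in the list above.
        for aep_percent in (10, 1, 0.5, 0.1):
            aep = aep_percent / 100.0
            t = 1.0 / aep  # return period in years
            print(f"{aep_percent}% AEP = {t:.0f}-year return period "
                  f"= {t:.0f}:1 odds in any given year")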

  15. Human Vital Sign Dataset

    • kaggle.com
    Updated Jul 19, 2024
    Cite
    DatasetEngineer (2024). Human Vital Sign Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/8992827
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DatasetEngineer
    License

    Public Domain Dedication (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview The Human Vital Signs Dataset is a comprehensive collection of key physiological parameters recorded from patients. This dataset is designed to support research in medical diagnostics, patient monitoring, and predictive analytics. It includes both original attributes and derived features to provide a holistic view of patient health.

    Attributes

    Patient ID
    Description: A unique identifier assigned to each patient. Type: Integer. Example: 1, 2, 3, ...

    Heart Rate
    Description: The number of heartbeats per minute. Type: Integer. Range: 60-100 bpm (for this dataset). Example: 72, 85, 90

    Respiratory Rate
    Description: The number of breaths taken per minute. Type: Integer. Range: 12-20 breaths per minute (for this dataset). Example: 16, 18, 15

    Timestamp
    Description: The exact time at which the vital signs were recorded. Type: Datetime. Format: YYYY-MM-DD HH:MM. Example: 2023-07-19 10:15:30

    Body Temperature
    Description: The body temperature measured in degrees Celsius. Type: Float. Range: 36.0-37.5°C (for this dataset). Example: 36.7, 37.0, 36.5

    Oxygen Saturation
    Description: The percentage of oxygen-bound hemoglobin in the blood. Type: Float. Range: 95-100% (for this dataset). Example: 98.5, 97.2, 99.1

    Systolic Blood Pressure
    Description: The pressure in the arteries when the heart beats (systolic pressure). Type: Integer. Range: 110-140 mmHg (for this dataset). Example: 120, 130, 115

    Diastolic Blood Pressure
    Description: The pressure in the arteries when the heart rests between beats (diastolic pressure). Type: Integer. Range: 70-90 mmHg (for this dataset). Example: 80, 75, 85

    Age
    Description: The age of the patient. Type: Integer. Range: 18-90 years (for this dataset). Example: 25, 45, 60

    Gender
    Description: The gender of the patient. Type: Categorical. Categories: Male, Female. Example: Male, Female

    Weight (kg)
    Description: The weight of the patient in kilograms. Type: Float. Range: 50-100 kg (for this dataset). Example: 70.5, 80.3, 65.2

    Height (m)
    Description: The height of the patient in meters. Type: Float. Range: 1.5-2.0 m (for this dataset). Example: 1.75, 1.68, 1.82

    Derived Features

    Derived_HRV (Heart Rate Variability)
    Description: A measure of the variation in time between heartbeats. Type: Float. Formula: HRV = (standard deviation of heart rate over a period) / (mean heart rate over the same period). Example: 0.10, 0.12, 0.08

    Derived_Pulse_Pressure (Pulse Pressure)
    Description: The difference between systolic and diastolic blood pressure. Type: Integer. Formula: PP = systolic blood pressure − diastolic blood pressure. Example: 40, 45, 30

    Derived_BMI (Body Mass Index)
    Description: A measure of body fat based on weight and height. Type: Float. Formula: BMI = weight (kg) / (height (m))². Example: 22.8, 25.4, 20.3

    Derived_MAP (Mean Arterial Pressure)
    Description: The average blood pressure in an individual during a single cardiac cycle. Type: Float. Formula: MAP = diastolic blood pressure + (1/3) × (systolic blood pressure − diastolic blood pressure). Example: 93.3, 100.0, 88.7

    Target Feature

    Risk Category
    Description: Classification of patients into "High Risk" or "Low Risk" based on their vital signs. Type: Categorical. Categories: High Risk, Low Risk. Example: High Risk, Low Risk

    Criteria for High Risk (any of the following conditions):

    Heart Rate: > 90 bpm or < 60 bpm
    Respiratory Rate: > 20 breaths per minute or < 12 breaths per minute
    Body Temperature: > 37.5°C or < 36.0°C
    Oxygen Saturation: < 95%
    Systolic Blood Pressure: > 140 mmHg or < 110 mmHg
    Diastolic Blood Pressure: > 90 mmHg or < 70 mmHg
    BMI: > 30 or < 18.5

    Low Risk: none of the above conditions.

    This dataset, with a total of 200,000 samples, provides a robust foundation for various machine learning and statistical analysis tasks aimed at understanding and predicting patient health outcomes based on vital signs. The inclusion of both original attributes and derived features enhances the richness and utility of the dataset.
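    A minimal sketch of the derived features and the risk rule exactly as defined above (function names and inputs are illustrative, not from the dataset):

        def derived_features(sbp, dbp, weight_kg, height_m, hr_series):
            # Derived_HRV: SD of heart rate over a period / mean over that period.
            mean_hr = sum(hr_series) / len(hr_series)
            sd_hr = (sum((h - mean_hr) ** 2 for h in hr_series) / len(hr_series)) ** 0.5
            hrv = sd_hr / mean_hr
            pp = sbp - dbp                   # Derived_Pulse_Pressure
            bmi = weight_kg / height_m ** 2  # Derived_BMI
            map_ = dbp + (sbp - dbp) / 3     # Derived_MAP
            return hrv, pp, bmi, map_

        def risk_category(hr, rr, temp, spo2, sbp, dbp, bmi):
            # High Risk if any threshold from the criteria above is met.
            high = (hr > 90 or hr < 60 or rr > 20 or rr < 12
                    or temp > 37.5 or temp < 36.0 or spo2 < 95
                    or sbp > 140 or sbp < 110 or dbp > 90 or dbp < 70
                    or bmi > 30 or bmi < 18.5)
            return "High Risk" if high else "Low Risk"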

  16. Graded Incremental Test Data (Cycling, Running, Kayaking, Rowing): an open access dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 19, 2024
    Cite
    Crampton, David (2024). Graded Incremental Test Data (Cycling, Running, Kayaking, Rowing): an open access dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6325734
    Dataset updated
    Mar 19, 2024
    Dataset provided by
    Donne, Bernard
    Campbell, Garry
    Fleming, Neil
    Mahony, Nick
    Crampton, David
    Ward, Tomás
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Section 1: Introduction

    Brief overview of dataset contents:

    The current database contains anonymised data collected during exercise testing services (cycling, rowing, kayaking and running) performed on male and female participants, provided by the Human Performance Laboratory, School of Medicine, Trinity College Dublin, Dublin 2, Ireland.

    835 graded incremental exercise test files (285 cycling, 266 rowing / kayaking, 284 running)

    Description file with each row representing a test file - COLUMNS: file name (AXXX), sport (cycling, running, rowing or kayaking)

    Anthropometric data of participants by sport (age, gender, height, body mass, BMI, skinfold thickness,% body fat, lean body mass and haematological data; namely, haemoglobin concentration (Hb), haematocrit (Hct), red blood cell (RBC) count and white blood cell (WBC) count )

    Test data (HR, VO2 and lactate data) at rest and across a range of exercise intensities

    Derived physiological indices quantifying each individual’s endurance profile

    Following a request from an athlete seeking assessment by phone or e-mail, the test protocol, risks, benefits, and test and medical requirements were explained verbally or by return e-mail. Subsequently, an appointment for an exercise assessment was arranged following the regulatory reflection period (7 days). After this regulatory period, each participant’s verbal consent was obtained pre-test; for participants under 18 years of age, parent/guardian consent was obtained in writing. Ethics approval was obtained from the Faculty of Health Sciences ethics committee and all testing procedures were performed in compliance with Declaration of Helsinki guidelines.

    All consenting participants were required to attend the laboratory on one occasion in a rested, carbohydrate-loaded and well-hydrated state; male participants were required to be clean-shaven in the facial region. All participants underwent a pre-test medical examination, including assessment of resting blood pressure, pulmonary function testing and a haematological (Coulter Counter Act Diff, Beckmann Coulter, CA, US) review performed by a qualified medical doctor prior to exercise testing. Any person presenting with cardiac abnormalities, respiratory difficulties, symptoms of cold or influenza, musculoskeletal injury that could impair performance, diabetes, hypertension, metabolic disorders, or any other contra-indicatory symptoms was excluded. In addition, participants completed a medical questionnaire detailing training history, previous personal and family health abnormalities, recent illness or injury, menstrual status for female participants, as well as details of recent travel, current vaccination status, and current medications, supplements and allergies. Barefoot height in metres (Holtain, Crymych, UK), body mass in kilograms on counter-balanced scales (Seca, Hamburg, Germany) and skinfold thickness in millimetres using a Harpenden skinfold caliper (Bath International, West Sussex, UK) were recorded pre-exercise.

    Section 2: Testing protocols

    2.1: Cycling

A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on an electromagnetically braked cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). Participants initially identified a cycling position in which they were most comfortable by adjusting saddle height, saddle fore-aft position relative to the crank axis, saddle-to-handlebar distance and handlebar height. Participants’ feet were secured to the ergometer using their own cycling shoes with cleats and accompanying pedals. The protocol commenced with a 15-min warm-up at a workload of 120 Watt (W), followed by a 10-min rest. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a workload of 100 or 120 W for female and male participants, respectively, and subsequently increasing by 20, 30 or 40 W every 3 min depending on gender and current competition category. During assessment, participants maintained a constant self-selected cadence chosen during their warm-up (a permitted window of 5 rev.min-1 within an absolute range of 75 to 95 rev.min-1), and the test was terminated when a participant was no longer able to maintain a constant cadence.
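For concreteness, a minimal sketch of the workload schedule this protocol implies; the function and its defaults are illustrative, and the increment actually applied to a given competition category is not tabulated here.

```python
def cycling_gxt_steps(sex: str, increment_w: int, n_steps: int = 10) -> list[int]:
    """Sequence of 3-min workload steps for the cycling GxT.

    The active phase starts at 100 W (female) or 120 W (male) and
    rises by 20, 30 or 40 W per step depending on gender and
    competition category, as described above.
    """
    start_w = 120 if sex == "male" else 100
    return [start_w + i * increment_w for i in range(n_steps)]

# Example: a male participant tested with 30 W increments.
print(cycling_gxt_steps("male", 30))  # [120, 150, 180, ..., 390]
```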

Heart rate (HR) data were recorded continuously by radio-telemetry using a Cosmed HR monitor (Cosmed, Rome, Italy). During the test, blood samples were collected from the middle finger of the right hand at the end of the second minute of each 3-min interval. The fingertip was cleaned to remove any sweat or blood and lanced using a long-point sterile lancet (Braun, Melsungen, Germany). The blood sample was collected into a heparinised capillary tube (Brand, Wertheim, Germany) by holding the tube horizontal to the droplet and allowing transfer by capillary action. Subsequently, a 25 μL aliquot of whole blood was drawn from the capillary tube using a YSI syringepet (YSI, OH, USA) and added into the chamber of a YSI 1500 Sport lactate analyser (YSI, OH, USA) for determination of non-lysed [Lac] in mmol.L-1. The lactate analyser was calibrated to the manufacturer’s requirements (± 0.05 mmol.L-1) before each test using a standard solution (YSI, OH, USA) of known concentration (5 mmol.L-1), and analyser linearity was confirmed using either a 15 or 30 mmol.L-1 standard solution (YSI, OH, USA).

    Gas exchange variables including respiration rate (Rf in breaths.min-1), minute ventilation (VE in L.min-1), oxygen consumption (VO2 in L.min-1 and in mL.kg-1.min-1) and carbon dioxide production (VCO2 in L.min-1), were measured on a breath-by-breath basis throughout the test, using a cardiopulmonary exercise testing unit (CPET) and an associated software package (Cosmed, Rome, Italy). Participants wore a face mask (Hans Rudolf, KA, USA) which was connected to the CPET unit. The metabolic unit was calibrated prior to each test using ambient air and an alpha certified gas mixture containing 16% O2, 5% CO2 and 79% N2 (Cosmed, Rome, Italy). Volume calibration was performed using a 3L gas calibration syringe (Cosmed, Rome, Italy). Barometric pressure recorded by the CPET was confirmed by recording barometric pressure using a laboratory grade barometer.

Following testing, mean HR and mean VO2 data at rest and during each exercise increment were computed and tabulated over the final minute of each 3-min interval. A graphical plot of [Lac], mean VO2 and mean HR versus cycling workload was constructed and analysed to quantify physiological endurance indices (see Data Analysis section). Data for VO2 peak in L.min-1 (absolute) and in mL.kg-1.min-1 (relative) and VE peak in L.min-1 were reported as the peak values recorded over any 10 consecutive breaths during the last minute of the final exercise increment.
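A minimal sketch of this reduction step in Python; the file name and the column headers (time_s, hr_bpm, vo2_l_min, ve_l_min) are assumptions for illustration, not the dataset's actual headers, and the 10-breath peak is read here as the highest 10-breath rolling mean.

```python
import pandas as pd

# Breath-by-breath CPET export for one test (file and column names
# are assumptions, not the dataset's actual headers).
breaths = pd.read_csv("A001_breaths.csv")

STEP_S = 180  # each rest/exercise increment lasts 3 min

# Mean HR and VO2 over the final minute of each 3-min interval.
breaths["step"] = (breaths["time_s"] // STEP_S).astype(int)
final_min = breaths[breaths["time_s"] % STEP_S >= 120]
step_means = final_min.groupby("step")[["hr_bpm", "vo2_l_min"]].mean()

# Peak VO2 and VE: highest 10-consecutive-breath rolling mean within
# the last minute of the final increment.
last = final_min[final_min["step"] == final_min["step"].max()]
vo2_peak = last["vo2_l_min"].rolling(10).mean().max()
ve_peak = last["ve_l_min"].rolling(10).mean().max()
```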

    2.2: Running protocol

A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a motorised treadmill (Powerjog, Birmingham, UK). The running protocol, performed at a gradient of 0%, commenced with a 15-min warm-up at a velocity (km.h-1) lower than the participant’s reported typical weekly long-run (>60 min) on-road training velocity, followed by a 10-min rest / dynamic stretching phase. From a safety perspective, during all running GxTs participants wore a suspended lightweight safety harness to minimise any potential falls risk. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal running velocity lower than the participant’s reported typical weekly long-run (>60 min) on-road training velocity, which was subsequently increased by ≥ 1 km.h-1 every 3 min depending on gender and current competition category. The test was terminated when a participant was no longer able to maintain the imposed treadmill velocity.

    Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.

    2.3: Rowing / kayaking protocol

A discontinuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a Concept 2C rowing ergometer (Concept, VA, US) in rowers or a Dansprint kayak ergometer (Dansprint, Hvidovre, Denmark) in flat-water kayakers. The protocol commenced with a 15-min low-intensity warm-up at a workload (W) dependent on gender, sport and competition category, followed by a 10-min rest. For rowing, the flywheel damping (120, 125 or 130 W) was set dependent on gender and competition category. For kayaking, the bungee cord tension was adjusted by individual participants to suit their requirements. A discontinuous protocol of 3-min exercise at a targeted load followed by a 1-min rest phase, to facilitate stationary earlobe capillary blood sample collection and resetting of the ergometer display (Dansprint ergometer), was used. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal load (80 to 120 W for rowing, 50 to 90 W for kayaking) and subsequently increased by 20, 30 or 40 W every 3 min depending on gender, sport and current competition category. The test was terminated when a participant was no longer able to maintain the targeted workload.

    Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.

    3.1: Data analysis

Constructed graphical plots (HR, VO2 and [Lac] versus load / velocity) were analysed to quantify the following: load / velocity at the lactate threshold (TLac), HR at TLac, [Lac] at TLac, % of VO2 peak at TLac, % of HRmax at TLac, load / velocity and HR at a nominal [Lac] of 2 mmol.L-1, and load / velocity, VO2 and [Lac] at a nominal HR of
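The nominal-[Lac] indices can be recovered from the tabulated end-of-stage data by interpolating along the lactate curve; a minimal sketch, with invented example values:

```python
import numpy as np

# End-of-stage data for one illustrative cycling test.
load_w = np.array([100, 130, 160, 190, 220, 250])
lactate = np.array([1.1, 1.3, 1.8, 2.6, 4.1, 6.9])  # mmol.L-1
hr_bpm = np.array([112, 126, 141, 155, 168, 179])

# Load and HR at a nominal [Lac] of 2 mmol.L-1, by linear
# interpolation along the rising portion of the curve.
load_at_2 = np.interp(2.0, lactate, load_w)
hr_at_2 = np.interp(2.0, lactate, hr_bpm)
print(f"{load_at_2:.0f} W and {hr_at_2:.0f} beats.min-1 at 2 mmol.L-1")
```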

  17. P

    Countix Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Jan 11, 2021
    Cite
    Debidatta Dwibedi; Yusuf Aytar; Jonathan Tompson; Pierre Sermanet; Andrew Zisserman (2021). Countix Dataset [Dataset]. https://paperswithcode.com/dataset/countix
    Explore at:
    Dataset updated
    Jan 11, 2021
    Authors
    Debidatta Dwibedi; Yusuf Aytar; Jonathan Tompson; Pierre Sermanet; Andrew Zisserman
    Description

Countix is a real-world dataset of repetition videos collected in the wild (i.e., YouTube), covering a wide range of semantic settings and significant challenges such as camera and object motion, a diverse set of periods and counts, and changes in the speed of repeated actions. Countix includes videos of repeated actions in workout activities (squats, pull-ups, battle-rope training, arm exercises), dance moves (pirouetting, fist pumping), playing instruments (playing the ukulele), repeated tool use (hammering objects, chainsaw cutting wood, slicing an onion), artistic performances (hula hooping, juggling a soccer ball), sports (playing ping pong and tennis) and many others. Figure 6 of the source paper illustrates examples from the dataset as well as the distribution of repetition counts and period lengths.

  18. d

    Data from: Haploids adapt faster than diploids across a range of...

    • datadryad.org
    • data.niaid.nih.gov
    • +2more
    zip
    Updated Dec 7, 2010
    + more versions
    Cite
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto (2010). Haploids adapt faster than diploids across a range of environments [Dataset]. http://doi.org/10.5061/dryad.8048
    Explore at:
Available download formats: zip
    Dataset updated
    Dec 7, 2010
    Dataset provided by
    Dryad
    Authors
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto
    Time period covered
    2010
    Description

Raw data to calculate rate of adaptation: raw dataset for rate of adaptation calculations (Figure 1) and related statistics (dataall.csv)

R code to analyze raw data for rate of adaptation (Competition Analysis.R)

Raw data to calculate effective population sizes (datacount.csv)

R code to analyze effective population sizes; produces Figure 2 (Cell Count Ne.R)

R code to determine our best estimate of the dominance coefficient in each environment; produces Figures 3, S4 and S5 (what is h.R). Note: the competition and effective population size R code must be run first in the same session.

  19. m

    USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven

    • app.mobito.io
    Cite
    USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven [Dataset]. https://app.mobito.io/data-product/usa-enriched-geospatial-framework-dataset
    Explore at:
    Area covered
    United States
    Description

Our dataset provides detailed and precise insights into the business, commercial, and industrial aspects of any given area in the USA, including Point of Interest (POI) data and foot traffic. The dataset is divided into 150m x 150m areas (geohash 7) and has over 50 variables.

- Use it for different applications: our combined dataset, which includes POI and foot traffic data, can be employed for various purposes. Different data teams use it to guide retailers and FMCG brands in site selection, fuel marketing intelligence, analyze trade areas, and assess company risk. Our dataset has also proven to be useful for real estate investment.

- Get reliable data: our datasets have been processed, enriched, and tested so your data team can use them more quickly and accurately.

- Ideal for training ML models: the high quality of our geographic information layers results from more than seven years of work dedicated to the deep understanding and modeling of geospatial Big Data. Among the features that distinguish this dataset is the use of anonymized and user-compliant mobile device GPS location, enriched with other alternative and public data.

- Easy to use: our dataset is user-friendly and can be easily integrated into your current models. Also, we can deliver your data in different formats, like .csv, according to your analysis requirements.

- Get personalized guidance: in addition to providing reliable datasets, we advise your analysts on their correct implementation. Our data scientists can guide your internal team on the optimal algorithms and models to get the most out of the information we provide (without compromising the security of your internal data).

Answer questions like:

- What places does my target user visit in a particular area? Which are the best areas to place a new POS?

- What is the average yearly income of users in a particular area?

- What is the influx of visits that my competition receives?

- What is the volume of traffic surrounding my current POS?

This dataset is useful for getting insights from industries like:

- Retail & FMCG
- Banking, Finance, and Investment
- Car Dealerships
- Real Estate
- Convenience Stores
- Pharma and medical laboratories
- Restaurant chains and franchises
- Clothing chains and franchises

Our dataset includes more than 50 variables, such as:

- Number of pedestrians seen in the area.
- Number of vehicles seen in the area.
- Average speed of movement of the vehicles seen in the area.
- Points of Interest (POIs), in number and type, seen in the area (supermarkets, pharmacies, recreational locations, restaurants, offices, hotels, parking lots, wholesalers, financial services, pet services, shopping malls, among others).
- Average yearly income range (anonymized and aggregated) of the devices seen in the area.

Notes to better understand this dataset:

- POI confidence means the average confidence of POIs in the area. In this case, POIs are any kind of location, such as a restaurant, a hotel, or a library.
- Category confidences, for example "food_drinks_tobacco_retail_confidence", indicate how confident we are in the existence of food/drink/tobacco retail locations in the area.
- We added predictions for The Home Depot and Lowe's Home Improvement stores in the dataset sample. These predictions were the result of a machine-learning model trained with the data. Knowing where the current stores are, we can find the most similar areas for new stores to open.

How efficient is a geohash? Geohash is a faster, cost-effective geofencing option that reduces input data load and provides actionable information. Its benefits include faster querying, reduced cost, minimal configuration, and ease of use. A geohash ranges from 1 to 12 characters. The dataset can be split into variable-size geohashes, with the default being geohash7 (150m x 150m).
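Geohash-7 cells are straightforward to reproduce; a minimal sketch, assuming the open-source pygeohash package (an assumption for illustration, not part of the product):

```python
import pygeohash as pgh

# Encode a coordinate to the 7-character geohash used by this
# dataset (~150 m x 150 m cells); the coordinate is illustrative.
cell = pgh.encode(40.7484, -73.9857, precision=7)
print(cell)  # a 7-character cell id covering that point

# Decode back to the cell's approximate centre, e.g. to join the
# dataset's geohash7 keys to map coordinates.
lat, lon = pgh.decode(cell)
```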

  20. Z

    REHAB24-6: A multi-modal dataset of physical rehabilitation exercises

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 28, 2024
    Cite
    Sedmidubsky, Jan (2024). REHAB24-6: A multi-modal dataset of physical rehabilitation exercises [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13305825
    Explore at:
    Dataset updated
    Aug 28, 2024
    Dataset provided by
    Budikova, Petra
    Černek, Andrej
    Sedmidubsky, Jan
    Katzer, Lukáš
    Procházka, Michal
    Jánošová, Miriama
    License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

To enable the evaluation of human pose estimation (HPE) models and the development of exercise feedback systems, we produced a new rehabilitation dataset (REHAB24-6). The main focus is on a diverse range of exercises, views, body heights, lighting conditions, and exercise mistakes. With publicly available RGB videos, skeleton sequences, repetition segmentation, and exercise correctness labels, this dataset offers the most comprehensive testbed for exercise-correctness-related tasks.

    Contents

    65 recordings (184,825 frames, 30 FPS):

    RGB videos from two cameras (videos.zip, horizontal = Camera17, vertical = Camera18);

3D and 2D projected positions of 41 motion capture markers (<2/3>d_markers.zip, marker labels in marker_names.txt);

    3D and 2D projected positions of 26 skeleton joints (<2/3>d_joints.zip, joint labels in joint_names.txt);

Annotation of 1,072 exercise repetitions (Segmentation.csv, indexed based only on the 30 FPS data, described in Segmentation.txt; see the parsing sketch after this list):

    Temporal segmentation (start/end frame, most between 2–5 seconds);

    Binary correctness label (around 90 from each category in each exercise, except Ex3 with around 50);

    Exercise direction (around 90 from each direction in each exercise);

    Lighting conditions label.
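A minimal sketch of how the segmentation file can be consumed; the column names used here are assumptions for illustration, since the real header layout is the one documented in Segmentation.txt.

```python
import pandas as pd

# Column names (exercise, start_frame, end_frame, correctness) are
# assumptions; consult Segmentation.txt for the actual headers.
seg = pd.read_csv("Segmentation.csv")

# Repetition durations in seconds (frame indices refer to 30 FPS data).
seg["duration_s"] = (seg["end_frame"] - seg["start_frame"]) / 30.0

# Correct vs. incorrect repetition counts per exercise.
print(seg.groupby(["exercise", "correctness"]).size())
```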

    Recording Conditions

    Our laboratory setup included 18 synchronized sensors (2 RGB video cameras, 16 ultra-wide motion capture cameras) spread around an 8.2 × 7 m room. The RGB cameras were located in the corners of the room, one in a horizontal position (hor.), providing a larger field of view (FoV), and one in a vertical (ver.), resulting in a narrower FoV. Both types of cameras were synchronized with a sampling frequency of 30 frames per second (FPS).

    The subjects wore motion capture body suits with 41 markers attached to them, which were detected by optical cameras. The OptiTrack Motive 2.3.0 software inferred the 3D positions of the markers in virtual centimeters and converted them into a skeleton with 26 joints, forming our human pose 3D ground truth (GT).

    To acquire a 2D version of the ground truth in pixel coordinates, we applied a projection of the virtual coordinates into the camera using the simplified pinhole model. We estimated the parameters for this projection as follows. First, the virtual position of the cameras was estimated using measuring tape and knowledge of the virtual origin. Then, the orientation of the cameras was optimized by matching the virtual marker positions with their position in the videos.
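A minimal sketch of this projection step under the simplified pinhole model; the intrinsic and extrinsic values below are placeholders, not the study’s calibrated parameters.

```python
import numpy as np

def project_points(pts_world, R, t, f_px, cx, cy):
    """Project Nx3 world-space positions to pixel coordinates with a
    simplified pinhole model (no lens distortion). R (3x3) is the
    camera orientation, t (3,) the camera position in world space;
    f_px, cx and cy are placeholder intrinsics.
    """
    cam = (pts_world - t) @ R.T        # world -> camera coordinates
    u = f_px * cam[:, 0] / cam[:, 2] + cx
    v = f_px * cam[:, 1] / cam[:, 2] + cy
    return np.stack([u, v], axis=1)

# Illustrative call: identity orientation, camera 3 m behind the origin.
pts = np.array([[0.0, 1.2, 0.0], [0.5, 0.9, 0.2]])
pix = project_points(pts, np.eye(3), np.array([0.0, 0.0, -3.0]),
                     f_px=1400.0, cx=960.0, cy=540.0)
```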

    We also simulated changes in lighting conditions: a few videos were shot in the natural evening light, which resulted in worse visibility, while the rest were under artificial lighting.

    Exercises

    10 subjects participated in our recording and consented to release the data publicly: 6 males and 4 females of different ages (from 25 to 50) and fitness levels. A physiotherapist instructed the subjects on how to perform the exercises so that at least five repetitions were done in what he deemed the correct way and five more incorrectly. The participants had a certain degree of freedom, e.g., in which leg they used in Ex4 and Ex5. Similarly, the physiotherapist suggested different exercise mistakes for each subject.

    Ex1 = Arm abduction: sideway raising of the straightened right arm;

    Ex2 = Arm VW: fluent transition of arms between V (arms straight up) and W (elbows down, hands up) shape;

    Ex3 = Push-ups: push-ups with hands on a table;

    Ex4 = Leg abduction: sideway raising of the straightened leg;

    Ex5 = Leg lunge: pushing a knee of the back leg down while keeping a right angle on the front knee;

    Ex6 = Squats.

    Every exercise was also executed in two directions, resulting in different views of the subject depending on the camera. Facing the horizontal camera resulted in a front view for that camera and a profile from the other. Facing the wall between the cameras shows the subject from half-profile in both cameras. A rare direction, only used for push-ups due to the use of the table, was facing the vertical camera, with the views being reversed compared to the first orientation.

    Citation

    Cite the related conference paper:

Černek, A., Sedmidubsky, J., Budikova, P.: REHAB24-6: Physical Therapy Dataset for Analyzing Pose Estimation Methods. 17th International Conference on Similarity Search and Applications (SISAP), Springer, 14 pages, 2024.

    License

This dataset is for academic or non-profit organization non-commercial research use only. By using it you agree to appropriately reference the paper above in any publication making use of it. For commercial purposes contact us at info@visioncraft.ai

Cite
Saskia Kohnen; Rebecca Bull; Carola Ruiz Hornblas (2023). Dataset for The effects of a number line intervention on calculation skills [Dataset]. http://doi.org/10.25949/22799717.V1

Dataset for The effects of a number line intervention on calculation skills

Description



Measures



Single-Digit Computation. The task included ten additions with single-digit addends (1-9) and single-digit results (2-9). The order was counterbalanced so that half of the additions presented the lowest addend first (e.g., 3 + 5) and half presented the highest addend first (e.g., 6 + 3). The task also included ten subtractions with single-digit minuends (3-9), subtrahends (1-6) and differences (1-6). The items were presented horizontally on the screen accompanied by a sound, and participants were required to give a verbal response. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.


Multi-digit computational estimation. The task included eight additions and eight subtractions presented with double-digit numbers and three response options. None of the response options represented the correct result; participants were asked to select the option closest to the correct result. In half of the items the calculation involved two double-digit numbers, and in the other half one double-digit and one single-digit number. The distance between the correct response option and the exact result of the calculation was two for half of the trials and three for the other half. The calculation was presented vertically on the screen with the three options shown below it, and remained on the screen until participants responded by clicking on one of the options. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.


Dot Comparison and Number Comparison. Both tasks included the same 20 items, which were presented twice, counterbalancing left and right presentation. Magnitudes to be compared were between 5 and 99, with four items for each of the following ratios: .91, .83, .77, .71, .67. The two quantities were presented horizontally side by side, and participants were instructed to press one of two keys (F or J), as quickly as possible, to indicate the larger one. Items were presented in random order and participants did not receive performance-based feedback. In the non-symbolic comparison task (dot comparison) the two sets of dots remained on the screen for a maximum of two seconds (to prevent counting); overall area and convex hull for both sets of dots were kept constant following Guillaume et al. (2020). In the symbolic comparison task (Arabic numbers), the numbers remained on the screen until a response was given. Performance on both tasks was indexed by accuracy.


The Number Line Intervention

During the intervention sessions, participants estimated the position of 30 Arabic numbers on a 0-100 bounded number line. As a form of feedback, within each item, the participants’ estimate remained visible and the correct position of the target number appeared on the number line. When the estimate’s PAE was lower than 2.5, a message appeared on the screen that read “Excellent job”; when PAE was between 2.5 and 5, the message read “Well done, so close!”; and when PAE was higher than 5, the message read “Good try!”. Numbers were presented in random order.
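The feedback rule maps directly onto PAE bands; a minimal sketch (the function name is illustrative):

```python
def feedback_message(pae: float) -> str:
    """Return the on-screen feedback for a given percent absolute
    error (PAE), following the bands described above."""
    if pae < 2.5:
        return "Excellent job"
    elif pae <= 5:
        return "Well done, so close!"
    else:
        return "Good try!"

print(feedback_message(1.8))  # Excellent job
print(feedback_message(4.0))  # Well done, so close!
print(feedback_message(7.5))  # Good try!
```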


Variables in the dataset

Age = age in ‘years, months’ at the start of the study

Sex = female/male/non-binary or third gender/prefer not to say (as reported by parents)

Math_Problem_Solving_raw = Raw score on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

Math_Problem_Solving_Percentile = Percentile equivalent on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

Num_Ops_Raw = Raw score on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

Num_Ops_Percentile = Percentile equivalent on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).


The remaining variables refer to participants’ performance on the study tasks. Each variable name is composed of three sections. The first refers to the phase and session. For example, Base1 refers to the first measurement point of the baseline phase, Treat1 to the first measurement point of the treatment phase, and post1 to the first measurement point of the post-treatment phase.


The second part of the variable name refers to the task, as follows:

DC = dot comparison

SDC = single-digit computation

NLE_UT = number line estimation (untrained set)

NLE_T= number line estimation (trained set)

CE = multidigit computational estimation

NC = number comparison

The final part of the variable name refers to the type of measure being used (i.e., acc = total correct responses and pae = percent absolute error).


Thus, variable Base2_NC_acc corresponds to accuracy on the number comparison task during the second measurement point of the baseline phase and Treat3_NLE_UT_pae refers to the percent absolute error on the untrained set of the number line task during the third session of the Treatment phase.
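A minimal sketch of how these variable names can be unpacked programmatically (the helper name is illustrative):

```python
import re

def parse_variable(name: str) -> dict:
    """Split a variable name such as 'Treat3_NLE_UT_pae' into its
    phase, session, task and measure components."""
    head, *middle, measure = name.split("_")
    phase, session = re.match(r"([A-Za-z]+)(\d+)", head).groups()
    return {
        "phase": phase,            # Base / Treat / post
        "session": int(session),
        "task": "_".join(middle),  # e.g. NC or NLE_UT
        "measure": measure,        # acc or pae
    }

print(parse_variable("Base2_NC_acc"))
# {'phase': 'Base', 'session': 2, 'task': 'NC', 'measure': 'acc'}
print(parse_variable("Treat3_NLE_UT_pae"))
# {'phase': 'Treat', 'session': 3, 'task': 'NLE_UT', 'measure': 'pae'}
```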




