23 datasets found
  1. Exploratory Data Analytics and Descriptive Statistics

    • paper.erudition.co.in
    html
    Updated Jun 1, 2021
    Cite
    Einetic (2021). Exploratory Data Analytics and Descriptive Statistics [Dataset]. https://paper.erudition.co.in/makaut/bachelor-in-business-administration-2020-2021/5/data-analytics-skills-for-managers
    Explore at:
    Available download formats: html
    Dataset updated
    Jun 1, 2021
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question paper solutions for the chapter Exploratory Data Analytics and Descriptive Statistics of Data Analytics Skills for Managers, 5th Semester, Bachelor in Business Administration, 2020-2021

  2. Black Friday Sales EDA

    • kaggle.com
    Updated Oct 29, 2022
    + more versions
    Cite
    Rushikesh Konapure (2022). Black Friday Sales EDA [Dataset]. https://www.kaggle.com/datasets/rishikeshkonapure/black-friday-sales-eda
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 29, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rushikesh Konapure
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset History

    A retail company “ABC Private Limited” wants to understand the customer purchase behaviour (specifically, purchase amount) against various products of different categories. They have shared purchase summaries of various customers for selected high-volume products from last month. The data set also contains customer demographics (age, gender, marital status, city type, stay in the current city), product details (productid and product category) and Total purchase amount from last month.

    Now, they want to build a model to predict the purchase amount of customers against various products which will help them to create a personalized offer for customers against different products.

    Tasks to perform

    The Purchase column is the target variable. Perform univariate analysis and bivariate analysis with respect to Purchase.

    "Masked" in a column description means the column has already been converted from categorical values to numerical values.

    The points below are only meant to get you started with the dataset; you do not have to follow them in the same sequence.

    DATA PREPROCESSING

    • Check the basic statistics of the dataset

    • Check for missing values in the data

    • Check for unique values in data

    • Perform EDA

    • Purchase Distribution

    • Check for outliers

    • Analysis by Gender, Marital Status, occupation, occupation vs purchase, purchase by city, purchase by age group, etc

    • Drop unnecessary fields

    • Convert categorical data into integers using the map function (e.g. the 'Gender' column)

    • Missing value treatment

    • Rename columns

    • Fill nan values

    • Map range variables into integers (e.g. the 'Age' column)
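The preprocessing steps listed above can be sketched with pandas roughly as follows. This is only an illustration on a tiny made-up sample: the column names, the age labels, and the choice of 0 as a fill value are assumptions, not details confirmed by the dataset description.

```python
import pandas as pd

# Tiny made-up sample; column names are assumptions modelled on the description.
df = pd.DataFrame({
    "Gender": ["F", "M", "M", "F"],
    "Age": ["0-17", "26-35", "55+", "26-35"],
    "Product_Category_2": [None, 6.0, 14.0, None],
    "Purchase": [8370, 15200, 1422, 7969],
})

# Check basic statistics, missing values and unique values.
summary = df["Purchase"].describe()
missing_counts = df.isna().sum()
unique_counts = df.nunique()

# Convert a categorical column to integers with Series.map.
df["Gender"] = df["Gender"].map({"F": 0, "M": 1})

# Map range labels to ordered integers (the label set is an assumption).
age_order = ["0-17", "18-25", "26-35", "36-45", "46-50", "51-55", "55+"]
df["Age"] = df["Age"].map({label: i for i, label in enumerate(age_order)})

# Fill NaN values; 0 is one possible placeholder, not the only choice.
df["Product_Category_2"] = df["Product_Category_2"].fillna(0).astype(int)

# Flag outliers in the target with the 1.5 * IQR rule.
q1, q3 = df["Purchase"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = (df["Purchase"] < q1 - 1.5 * iqr) | (df["Purchase"] > q3 + 1.5 * iqr)
```

Mapping through an explicit dictionary keeps the encoding visible and reproducible; any label missing from the dictionary becomes NaN, which makes unexpected categories easy to spot.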

    Data Visualisation

    • Visualize individual columns
    • Age vs Purchased
    • Occupation vs Purchased
    • Productcategory1 vs Purchased
    • Productcategory2 vs Purchased
    • Productcategory3 vs Purchased
    • City category pie chart
    • Check for other possible plots

    All the Best!!

  3. Data from: Supplementary Material for "Sonification for Exploratory Data...

    • search.datacite.org
    • pub.uni-bielefeld.de
    Updated Feb 5, 2019
    Cite
    Thomas Hermann (2019). Supplementary Material for "Sonification for Exploratory Data Analysis" [Dataset]. http://doi.org/10.4119/unibi/2920448
    Explore at:
    Dataset updated
    Feb 5, 2019
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Bielefeld University
    Authors
    Thomas Hermann
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    Sonification for Exploratory Data Analysis

    Chapter 8: Sonification Models

    In Chapter 8 of the thesis, 6 sonification models are presented to give some examples for the framework of Model-Based Sonification, developed in Chapter 7. Sonification models determine the rendering of the sonification and possible interactions. The "model in mind" helps the user to interpret the sound with respect to the data.

    8.1 Data Sonograms

    Data Sonograms use spherical expanding shock waves to excite linear oscillators which are represented by point masses in model space.

    • Table 8.2, page 87: Sound examples for Data Sonograms
      File:
        Iris dataset: started in plot (a) at S0 (b) at S1 (c) at S2
        10d noisy circle dataset: started in plot (c) at S0 (mean) (d) at S1 (edge)
        10d Gaussian: plot (d) started at S0
        3 clusters: Example 1
        3 clusters: invisible columns used as output variables: Example 2
      Description: Data Sonogram sound examples for synthetic datasets and the Iris dataset
      Duration: about 5 s

    8.2 Particle Trajectory Sonification Model

    This sonification model explores features of a data distribution by computing the trajectories of test particles which are injected into model space and move according to Newton's laws of motion in a potential given by the dataset.

    • Sound example: page 93, PTSM-Ex-1 Audification of 1 particle in the potential of phi(x).
    • Sound example: page 93, PTSM-Ex-2 Audification of a sequence of 15 particles in the potential of a dataset with 2 clusters.
    • Sound example: page 94, PTSM-Ex-3 Audification of 25 particles simultaneously in a potential of a dataset with 2 clusters.
    • Sound example: page 94, PTSM-Ex-4 Audification of 25 particles simultaneously in a potential of a dataset with 1 cluster.
    • Sound example: page 95, PTSM-Ex-5 sigma-step sequence for a mixture of three Gaussian clusters
    • Sound example: page 95, PTSM-Ex-6 sigma-step sequence for a Gaussian cluster
    • Sound example: page 96, PTSM-Iris-1 Sonification for the Iris Dataset with 20 particles per step.
    • Sound example: page 96, PTSM-Iris-2 Sonification for the Iris Dataset with 3 particles per step.
    • Sound example: page 96, PTSM-Tetra-1 Sonification for a 4d tetrahedron clusters dataset.

    8.3 Markov chain Monte Carlo Sonification

    The McMC Sonification Model defines an exploratory process in the domain of a given density p such that the acoustic representation summarizes features of p, particularly concerning the modes of p, by sound.

    • Sound example: page 105, MCMC-Ex-1 McMC Sonification, stabilization of amplitudes.
    • Sound example: page 106, MCMC-Ex-2 Trajectory Audification for 100 McMC steps in 3 cluster dataset
    • McMC Sonification for Cluster Analysis, dataset with three clusters, page 107
        Stream 1: MCMC-Ex-3.1
        Stream 2: MCMC-Ex-3.2
        Stream 3: MCMC-Ex-3.3
        Mix: MCMC-Ex-3.4
    • McMC Sonification for Cluster Analysis, dataset with three clusters, T = 0.002 s, page 107
        Stream 1: MCMC-Ex-4.1
        Stream 2: MCMC-Ex-4.2
        Stream 3: MCMC-Ex-4.3
        Mix: MCMC-Ex-4.4
    • McMC Sonification for Cluster Analysis, density with 6 modes, T = 0.008 s, page 107
        Stream 1: MCMC-Ex-5.1
        Stream 2: MCMC-Ex-5.2
        Stream 3: MCMC-Ex-5.3
        Mix: MCMC-Ex-5.4
    • McMC Sonification for the Iris dataset, page 108
        MCMC-Ex-6.1, MCMC-Ex-6.2, MCMC-Ex-6.3, MCMC-Ex-6.4, MCMC-Ex-6.5, MCMC-Ex-6.6, MCMC-Ex-6.7, MCMC-Ex-6.8

    8.4 Principal Curve Sonification

    Principal Curve Sonification represents data by synthesizing the soundscape while a virtual listener moves along the principal curve of the dataset through the model space.

    • Noisy Spiral dataset, PCS-Ex-1.1, page 113
    • Noisy Spiral dataset with variance modulation, PCS-Ex-1.2, page 114
    • 9d tetrahedron cluster dataset (10 clusters), PCS-Ex-2, page 114
    • Iris dataset, class label used as pitch of auditory grains, PCS-Ex-3, page 114

    8.5 Data Crystallization Sonification Model

    • Table 8.6, page 122: Sound examples for Crystallization Sonification for 5d Gaussian distribution
      File: DCS started at center, in tail, from far outside
      Description: DCS for dataset sampled from N{0, I_5} excited at different locations
      Duration: 1.4 s
    • Mixture of 2 Gaussians, page 122
        DCS started at point A: DCS-Ex1A
        DCS started at point B: DCS-Ex1B
    • Table 8.7, page 124: Sound examples for DCS on variation of the harmonics factor
      File: h_omega = 1, 2, 3, 4, 5, 6
      Description: DCS for a mixture of two Gaussians with varying harmonics factor
      Duration: 1.4 s
    • Table 8.8, page 124: Sound examples for DCS on variation of the energy decay time
      File: tau_(1/2) = 0.001, 0.005, 0.01, 0.05, 0.1, 0.2
      Description: DCS for a mixture of two Gaussians varying the energy decay time tau_(1/2)
      Duration: 1.4 s
    • Table 8.9, page 125: Sound examples for DCS on variation of the sonification time
      File: T = 0.2, 0.5, 1, 2, 4, 8
      Description: DCS for a mixture of two Gaussians on varying the duration T
      Duration: 0.2 s -- 8 s
    • Table 8.10, page 125: Sound examples for DCS on variation of model space dimension
      File: selected columns of the dataset: (x0) (x0,x1) (x0,...,x2) (x0,...,x3) (x0,...,x4) (x0,...,x5)
      Description: DCS for a mixture of two Gaussians varying the dimension
      Duration: 1.4 s
    • Table 8.11, page 126: Sound examples for DCS for different excitation locations
      File: starting point: C0, C1, C2
      Description: DCS for a mixture of three Gaussians in 10d space with different rank(S) = {2,4,8}
      Duration: 1.9 s
    • Table 8.12, page 126: Sound examples for DCS for the mixture of a 2d distribution and a 5d cluster
      File: condensation nucleus in (x0,x1)-plane at: (-6,0)=C1, (-3,0)=C2, (0,0)=C0
      Description: DCS for a mixture of a uniform 2d and a 5d Gaussian
      Duration: 2.16 s
    • Table 8.13, page 127: Sound examples for DCS for the cancer dataset
      File: condensation nucleus in (x0,x1)-plane at: benign 1, benign 2, malignant 1, malignant 2
      Description: DCS for a mixture of a uniform 2d and a 5d Gaussian
      Duration: 2.16 s

    8.6 Growing Neural Gas Sonification

    • Table 8.14, page 133: Sound examples for GNGS Probing
      File:
        Cluster C0 (2d): a, b, c
        Cluster C1 (4d): a, b, c
        Cluster C2 (8d): a, b, c
      Description: GNGS for a mixture of 3 Gaussians in 10d space
      Duration: 1 s
    • Table 8.15, page 134: Sound examples for GNGS for the noisy spiral dataset
      File:
        (a) GNG with 3 neurons: 1, 2
        (b) GNG with 20 neurons: end, middle, inner end
        (c) GNG with 45 neurons: outer end, middle, close to inner end, at inner end
        (d) GNG with 150 neurons: outer end, in the middle, inner end
        (e) GNG with 20 neurons: outer end, in the middle, inner end
        (f) GNG with 45 neurons: outer end, in the middle, inner end
      Description: GNG probing sonification for 2d noisy spiral dataset
      Duration: 1 s
    • Table 8.16, page 136: Sound examples for GNG Process Monitoring Sonification for different data distributions
      File:
        Noisy spiral with 1 rotation: sound
        Noisy spiral with 2 rotations: sound
        Gaussian in 5d: sound
        Mixture of 5d and 2d distributions: sound
      Description: GNG process sonification examples
      Duration: 5 s

    Chapter 9: Extensions

    In this chapter, two extensions for Parameter Mapping

  4. ERA5 Reanalysis Monthly Means

    • data.ucar.edu
    • rda.ucar.edu
    grib
    Updated Aug 4, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    European Centre for Medium-Range Weather Forecasts (2024). ERA5 Reanalysis Monthly Means [Dataset]. http://doi.org/10.5065/D63B5XW1
    Explore at:
    Available download formats: grib
    Dataset updated
    Aug 4, 2024
    Dataset provided by
    Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory
    Authors
    European Centre for Medium-Range Weather Forecasts
    Time period covered
    Jan 1, 2008 - Dec 31, 2017
    Description

    Please note: use ds633.1 to access RDA maintained ERA-5 Monthly Mean data; see ERA5 Reanalysis (Monthly Mean 0.25 Degree Latitude-Longitude Grid), RDA dataset ds633.1. This dataset is no longer being updated, and web access has been removed.

    After many years of research and technical preparation, the production of a new ECMWF climate reanalysis to replace ERA-Interim is in progress. ERA5 is the fifth generation of ECMWF atmospheric reanalyses of the global climate, which started with the FGGE reanalyses produced in the 1980s, followed by ERA-15, ERA-40 and most recently ERA-Interim. ERA5 will cover the period January 1950 to near real time, though the first segment of data to be released will span the period 2010-2016. ERA5 is produced using high-resolution forecasts (HRES) at 31 kilometer resolution (one fourth the spatial resolution of the operational model) and a 62 kilometer resolution ten member 4D-Var ensemble of data assimilation (EDA) in CY41r2 of ECMWF's Integrated Forecast System (IFS) with 137 hybrid sigma-pressure (model) levels in the vertical, up to a top level of 0.01 hPa. Atmospheric data on these levels are interpolated to 37 pressure levels (the same levels as in ERA-Interim). Surface or single level data are also available, containing 2D parameters such as precipitation, 2 meter temperature, top of atmosphere radiation and vertical integrals over the entire atmosphere. The IFS is coupled to a soil model, the parameters of which are also designated as surface parameters, and an ocean wave model. Generally, the data is available at an hourly frequency and consists of analyses and short (18 hour) forecasts, initialized twice daily from analyses at 06 and 18 UTC. Most analyses parameters are also available from the forecasts. There are a number of forecast parameters, e.g. mean rates and accumulations, that are not available from the analyses. Together, the hourly analysis and twice daily forecast parameters form the basis of the monthly...

  5. Data from: Evaluating the Use of Uncertainty Visualisations for Imputations...

    • osf.io
    Updated Aug 26, 2024
    Cite
    Abhraneel Sarma (2024). Evaluating the Use of Uncertainty Visualisations for Imputations of Data Missing At Random in Scatterplots [Dataset]. https://osf.io/q4y5r
    Explore at:
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Abhraneel Sarma
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains supplementary materials for the paper, Evaluating the Use of Uncertainty Visualisations for Imputations of Data Missing At Random in Scatterplots

    Abstract: Most real-world datasets contain missing values, yet most exploratory data analysis (EDA) systems only support visualising data points with complete cases. This omission may potentially lead the user to biased analyses and insights. Imputation techniques can help estimate the value of a missing data point, but introduce additional uncertainty. In this work, we investigate the effects of visualising imputed values in charts using different types of uncertainty visualisation techniques: no imputation, mean, 95% confidence intervals, probability density plots, gradient intervals, and hypothetical outcome plots. We focus on scatterplots, a commonly used chart type, and conduct a crowdsourced study with 202 participants. We measure users’ bias and precision in performing two tasks (estimating average and detecting trend) and their self-reported confidence in performing these tasks. Our results suggest that, when estimating averages, uncertainty representations may reduce bias but at the cost of decreasing precision. When estimating trend, only hypothetical outcome plots may lead to a small probability of reducing bias while increasing precision. Participants in every uncertainty representation were less certain about their response when compared to the baseline. The findings point towards potential trade-offs in using uncertainty encodings for datasets with a large number of missing values.
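To make two of the representations named in the abstract concrete (a mean estimate and a 95% confidence interval), here is a minimal sketch. It is not the study's imputation method: it assumes simple mean imputation over the observed values and a normal-approximation interval for the mean, and the function name and sample values are hypothetical.

```python
import math
import statistics

def mean_imputation_with_ci(values, z=1.96):
    """Return (mean, lower, upper) computed from the observed values.

    The interval is a normal-approximation 95% confidence interval for the
    mean, one simple way to attach uncertainty to an imputed value.
    """
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    sem = statistics.stdev(observed) / math.sqrt(len(observed))
    return mean, mean - z * sem, mean + z * sem

# Made-up y-values of a scatterplot; None marks a missing value.
y = [2.0, None, 4.0, 6.0, None, 8.0]
mean, lower, upper = mean_imputation_with_ci(y)

# Replace each missing value with the mean; (lower, upper) could be drawn
# as an error bar around every imputed point.
imputed = [mean if v is None else v for v in y]
```

A chart that shows only `imputed` corresponds to the "mean" condition, while drawing `lower` and `upper` around the imputed points corresponds to an interval-style encoding.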

  6. Iterative Imputation of Jane St train.csv

    • kaggle.com
    Updated Nov 29, 2020
    Cite
    tpmeli (2020). Iterative Imputation of Jane St train.csv [Dataset]. https://www.kaggle.com/tpmeli/iterative-imputation-of-jane-st-traincsv/code
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    tpmeli
    Description

    I will be sharing all of my missing data exploration here:

    https://www.kaggle.com/tpmeli/missing-data-exploration-mean-iterative-more

  7. Data from: Exploratory investigation of historical decorative laminates by...

    • zenodo.org
    Updated Apr 25, 2023
    Cite
    An Jacquemain; Klara Retko; Lea Legan; Polonca Ropret; Friederike Waentig; Vincent Cattersel (2023). Exploratory investigation of historical decorative laminates by means of vibrational spectroscopic techniques [Dataset]. http://doi.org/10.5281/zenodo.7862015
    Explore at:
    Dataset updated
    Apr 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    An Jacquemain; Klara Retko; Lea Legan; Polonca Ropret; Friederike Waentig; Vincent Cattersel
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains the data used for the publication entitled "Exploratory investigation of historical decorative laminates by means of vibrational spectroscopic techniques".

  8. Data from: Exploratory Twitter hashtag analysis of movie premieres in the...

    • portalcientificovalencia.univeuropea.com
    Updated 2024
    + more versions
    Cite
    Yeste, Víctor (2024). Exploratory Twitter hashtag analysis of movie premieres in the USA [Dataset]. https://portalcientificovalencia.univeuropea.com/documentos/67321ed1aea56d4af0485dad
    Explore at:
    Dataset updated
    2024
    Authors
    Yeste, Víctor
    Area covered
    United States
    Description

    This work is an exploratory, quantitative, non-experimental study with an inductive inference type and a longitudinal follow-up. It analyzes movie data and tweets published by users using the official Twitter hashtags of movie premieres the week before, the same week, and the week after each release date.

    The scope of the study is the collection of movies released in February 2022 in the USA, and the object of the study includes them and the tweets that refer to the film in the 3 closest weeks to their premiere dates. The tweets collected were classified by the week they were published, so they are classified by a time dimension called timepoint. The week before the release date has been designated as timepoint 1, the week of the release date is timepoint 2, and the week immediately afterward is timepoint 3. Another dimension that has been considered is whether the movie has domestic production or not: if one of the countries of origin is the United States, the movie is designated as domestic.

    The chosen variables are organized in two data tables, one for the movies and one for the collected tweets.

    Variables related to the movies:

    • id: Internal id of the movie
    • name: Title of the movie
    • hashtag: Official hashtag of the movie
    • countries: List of countries of the movie, separated by a semicolon
    • mpaa: Film ratings system by the Motion Picture Association of America. It is a completely voluntary rating system and ratings have no legal standing. The current ratings include G (general audiences), PG (parental guidance suggested), PG-13 (parents strongly cautioned), R (restricted, under 17 requires accompanying parent or adult guardian) and NC-17 (no one 17 and under admitted) (Film Ratings - Motion Picture Association, n.d.)
    • genres: List of genres of the movie, e.g., Action or Thriller, separated by a semicolon
    • release_date: Release date of the movie in the format YYYY-MM-DD
    • opening_grosses: Amount of USA dollars that the movie obtained on the opening date (the first week after the release date)
    • opening_theaters: Number of USA theaters that released the movie on the opening date (the first week after the release date)
    • rating_avg: Average rating of the movie

    Variables related to the tweets:

    • id: Internal id of the tweet
    • status_id: Twitter id of the tweet
    • movie_id: Internal id of the movie
    • timepoint: Week number related to the movie premiere that the tweet was published on. “1” is the week before the movie release, “2” is the week after the movie release, and “3” is the second week after the movie release.
    • author_id: Twitter id of the author of the tweet
    • created_at: Date and time of the tweet, with format “YYYY-MM-DD HH:MM:SS”
    • quote_count: Number of the tweet’s quotes
    • reply_count: Number of the tweet’s replies
    • retweet_count: Number of the tweet’s retweets
    • like_count: Number of the tweet’s likes
    • sentiment: Sentiment analysis of the tweet’s content with a range from -1 (negative) to 1 (positive)

    This dataset has contributed to the elaboration of the book chapters:

    • Yeste, Víctor; Calduch-Losa, Ángeles (2022). Genre classification of movie releases in the USA: Exploring data with Twitter hashtags. In Narrativas emergentes para la comunicación digital (pp. 1012-1044). Dykinson, S. L.
    • Yeste, Víctor; Calduch-Losa, Ángeles (2022). Exploratory Twitter hashtag analysis of movie premieres in the USA. In Desafíos audiovisuales de la tecnología y los contenidos en la cultura digital (pp. 169-187). McGraw-Hill Interamericana de España S.L.
    • Yeste, Víctor; Calduch-Losa, Ángeles (2022). ANOVA to study movie premieres in the USA and online conversation on Twitter. The case of rating average using data from official Twitter hashtags. In El mapa y la brújula. Navegando por las metodologías de investigación en comunicación (pp. 151-168). Editorial Fragua.
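The timepoint variable described above could be derived from a tweet's date roughly as follows. The description does not state the exact week boundaries, so this sketch assumes 7-day windows counted from the release date (timepoint 2 being the 7 days starting on the release date); the function name is hypothetical.

```python
from datetime import date

def timepoint(release_date, tweet_date):
    """Return 1, 2 or 3 for the week relative to release, or None if outside.

    Assumed windows: days -7..-1 -> 1, days 0..6 -> 2, days 7..13 -> 3.
    """
    offset = (tweet_date - release_date).days
    if -7 <= offset < 0:
        return 1  # week before the release date
    if 0 <= offset < 7:
        return 2  # week starting on the release date
    if 7 <= offset < 14:
        return 3  # the following week
    return None  # outside the three weeks covered by the study

release = date(2022, 2, 4)  # made-up release date in February 2022
example = timepoint(release, date(2022, 2, 1))
```

With the `created_at` format given above, a tweet's date could be obtained via `datetime.strptime(value, "%Y-%m-%d %H:%M:%S").date()` before calling the function.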

  9. Data from: Wrist-worn sensor validation for heart rate variability and...

    • data.mendeley.com
    • data.niaid.nih.gov
    • +1 more
    Updated Jun 21, 2023
    + more versions
    Cite
    Simone Costantini (2023). Wrist-worn sensor validation for heart rate variability and electrodermal activity detection in a stressful driving environment [Dataset]. http://doi.org/10.17632/npnv4tsbg7.1
    Explore at:
    Dataset updated
    Jun 21, 2023
    Authors
    Simone Costantini
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset helps assess the accuracy of the Empatica 4 (E4) wristband for the detection of heart rate variability (HRV) and electrodermal activity (EDA) metrics in stress-inducing conditions and growing-risk driving scenarios. HRV and EDA signals were recorded over six experimental conditions (i.e., Baseline, Video Clip, Scream, No Risk Driving, Low-Risk Driving, and High-Risk Driving) by means of two measurement systems: the E4 device and a gold standard system. The raw quality of the physiological signals was enhanced by means of robust semi-automatic reconstruction algorithms. HRV time-domain parameters showed high accuracy in motion-free experimental conditions, while HRV frequency-domain parameters reported sufficient accuracy in almost every experimental condition.

    Folder 01 contains both HRV and EDA parameters for every experimental condition, according to the Gold Standard measurement system and the Empatica 4 device, in two separate Excel files.

    Folder 02 contains supplementary material on the assessment of signal quality.

    Folder 03 contains the Bland-Altman plot for each HRV and EDA parameter and for each condition (one .png file per parameter), and an Excel file that summarizes the numerical outcomes of the Bland-Altman analyses.
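For readers unfamiliar with the Bland-Altman analysis mentioned above, the core numbers are the mean difference between the two measurement systems (the bias) and the limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. The sketch below uses made-up paired values, not data from this dataset, and the function name is hypothetical.

```python
import statistics

def bland_altman(reference, device):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of (device - reference); the limits of agreement are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    diffs = [d - r for r, d in zip(reference, device)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired readings for one parameter in one condition.
gold = [60.0, 72.0, 81.0, 90.0]  # gold standard system
e4 = [62.0, 71.0, 84.0, 92.0]    # wrist-worn device
bias, lower, upper = bland_altman(gold, e4)
```

A Bland-Altman plot then shows each pair's difference against the pair's mean, with horizontal lines at `bias`, `lower`, and `upper`.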

  10. Data from: Exploratory Research on the Impact of the Growing Oil Industry in...

    • gimi9.com
    • s.cnmilf.com
    • +3 more
    Cite
    Exploratory Research on the Impact of the Growing Oil Industry in North Dakota and Montana on Domestic Violence, Dating Violence, Sexual Assault, and Stalking, 2000-2015 [Dataset]. https://gimi9.com/dataset/data-gov_3b52792d42c345dc455bcde14b2a752051363cac
    Explore at:
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.

    This study used secondary analysis of data from several different sources to examine the impact of increased oil development on domestic violence, dating violence, sexual assault, and stalking (DVDVSAS) in the Bakken region of Montana and North Dakota. Distributed here is the code used for the secondary analysis data; the data are not available through other public means. Please refer to the User Guide distributed with this study for a list of instructions on how to obtain all other data used in this study.

    This collection contains a secondary analysis of the Uniform Crime Reports (UCR). UCR data serve as periodic nationwide assessments of reported crimes not available elsewhere in the criminal justice system. Each year, participating law enforcement agencies contribute reports to the FBI either directly or through their state reporting programs. Distributed here are the codes used to create the datasets and perform the secondary analysis. Please refer to the User Guide, distributed with this study, for more information.

    This collection contains a secondary analysis of the National Incident Based Reporting System (NIBRS), a component part of the Uniform Crime Reporting Program (UCR) and an incident-based reporting system for crimes known to the police. For each crime incident coming to the attention of law enforcement, a variety of data were collected about the incident. These data included the nature and types of specific offenses in the incident, characteristics of the victim(s) and offender(s), types and value of property stolen and recovered, and characteristics of persons arrested in connection with a crime incident. NIBRS collects data on each single incident and arrest within 22 offense categories, made up of 46 specific crimes called Group A offenses. In addition, there are 11 Group B offense categories for which only arrest data were reported. NIBRS data on different aspects of crime incidents such as offenses, victims, offenders, arrestees, etc., can be examined as different units of analysis. Distributed here are the codes used to create the datasets and perform the secondary analysis. Please refer to the User Guide, distributed with this study, for more information.

    The collection includes 17 SPSS syntax files. Qualitative data collected for this study are not available as part of the data collection at this time.

  11. Mean and standard deviation of SI by BMI group and maternal age.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Anderson Borovac-Pinheiro; Filipe Moraes Ribeiro; Sirlei Siani Morais; Rodolfo Carvalho Pacagnella (2023). Mean and standard deviation of SI by BMI group and maternal age. [Dataset]. http://doi.org/10.1371/journal.pone.0217907.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Anderson Borovac-Pinheiro; Filipe Moraes Ribeiro; Sirlei Siani Morais; Rodolfo Carvalho Pacagnella
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Mean and standard deviation of SI by BMI group and maternal age.

  12. Data from: The effects of exploratory behavior on physical activity in a...

    • zenodo.org
    • datadryad.org
    bin
    Updated Oct 21, 2022
    + more versions
    Cite
    Cairsty DePasquale (2022). The effects of exploratory behavior on physical activity in a common animal model of human disease, zebrafish (Danio rerio) [Dataset]. http://doi.org/10.5061/dryad.c2fqz61c8
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 21, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cairsty DePasquale
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Zebrafish (Danio rerio) are widely accepted as a multidisciplinary vertebrate model for neurobehavioral and clinical studies, and more recently have become established as a model for exercise physiology and behavior. Individual differences in activity level (e.g., exploration) have been characterized in zebrafish; however, how different levels of exploration correspond to differences in motivation to engage in swimming behavior has not yet been explored. We screened individual zebrafish in two tests of exploration: the open field and novel tank diving tests. The fish were then exposed to a tank in which they could choose to enter a compartment with a flow of water (as a means of testing voluntary motivation to exercise). After a 2-day habituation period, behavioral observations were conducted. We used correlative analyses to investigate the robustness of the different exploration tests. Due to the complexity of dependent behavioral variables, we used machine learning to determine the personality variables that were best at predicting swimming behavior. Our results show that, contrary to our predictions, the correlation between novel tank diving test variables and open field test variables was relatively weak. Novel tank diving variables were more correlated with themselves than open field variables were to each other. Males exhibited stronger relationships between behavioral variables than did females. In terms of swimming behavior, fish that spent more time in the swimming zone spent more time actively swimming; however, swimming behavior was inconsistent across the time of the study. All relationships between swimming variables and exploration tests were relatively weak, though novel tank diving test variables had stronger correlations. Machine learning showed that three novel tank diving variables (entries top/bottom, movement rate, average top entry duration) and one open field variable (proportion of time spent frozen) were the best predictors of swimming behavior, demonstrating that the novel tank diving test is a powerful tool to investigate exploration. Increased knowledge about how individual differences in exploration may play a role in swimming behavior in zebrafish is fundamental to their utility as a model of exercise physiology and behavior.

  13. ERA5 Reanalysis Model Level Data

    • data.ucar.edu
    • rda.ucar.edu
    • +2more
    netcdf
    Updated Mar 8, 2025
    + more versions
    Cite
    European Centre for Medium-Range Weather Forecasts (2025). ERA5 Reanalysis Model Level Data [Dataset]. http://doi.org/10.5065/XV5R-5344
    Explore at:
    Available download formats: netcdf
    Dataset updated
    Mar 8, 2025
    Dataset provided by
    Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory
    Authors
    European Centre for Medium-Range Weather Forecasts
    Time period covered
    Jan 1, 1979 - Dec 31, 2024
    Area covered
    Description

    After many years of research and technical preparation, the production of a new ECMWF climate reanalysis to replace ERA-Interim is in progress. ERA5 is the fifth generation of ECMWF atmospheric reanalyses of the global climate, which started with the FGGE reanalyses produced in the 1980s, followed by ERA-15, ERA-40 and most recently ERA-Interim. ERA5 will cover the period January 1950 to near real time. ERA5 is produced using high-resolution forecasts (HRES) at 31 kilometer resolution (one fourth the spatial resolution of the operational model) and a 62 kilometer resolution ten member 4D-Var ensemble of data assimilation (EDA) in CY41r2 of ECMWF's Integrated Forecast System (IFS) with 137 hybrid sigma-pressure (model) levels in the vertical, up to a top level of 0.01 hPa. Atmospheric data on these levels are interpolated to 37 pressure levels (the same levels as in ERA-Interim). Surface or single level data are also available, containing 2D parameters such as precipitation, 2 meter temperature, top of atmosphere radiation and vertical integrals over the entire atmosphere. The IFS is coupled to a soil model, the parameters of which are also designated as surface parameters, and an ocean wave model. Generally, the data is available at an hourly frequency and consists of analyses and short (12 hour) forecasts, initialized twice daily from analyses at 06 and 18 UTC. Most analyses parameters are also available from the forecasts. There are a number of forecast parameters, for example mean rates and accumulations, that are not available from the analyses. Improvements to ERA5, compared to ERA-Interim, include use of HadISST.2, reprocessed ECMWF climate data records (CDR), and implementation of RTTOV11 radiative transfer. Variational bias corrections have not only been applied to satellite radiances, but also ozone retrievals, aircraft observations, surface pressure, and radiosonde profiles. 
Please note: DECS is producing a CF 1.6 compliant netCDF-4/HDF5 version of ERA5...

  14. ERA5 Reanalysis (Monthly Mean 0.25 Degree Latitude-Longitude Grid)

    • oidc.rda.ucar.edu
    • data.ucar.edu
    • +1more
    Updated Nov 5, 2019
    + more versions
    Cite
    European Centre for Medium-Range Weather Forecasts (2019). ERA5 Reanalysis (Monthly Mean 0.25 Degree Latitude-Longitude Grid) [Dataset]. http://doi.org/10.5065/P8GT-0R61
    Explore at:
    Dataset updated
    Nov 5, 2019
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    European Centre for Medium-Range Weather Forecasts
    Time period covered
    Dec 31, 1978 - Dec 31, 2022
    Area covered
    Earth
    Description

    For RDA ERA5 monthly mean data prior to 1979, please see ds633.5: ERA5 monthly mean back extension 1950-1978 (Preliminary version) [https://rda.ucar.edu/datasets/ds633.5/].

    After many years of research and technical preparation, the production of a new ECMWF climate reanalysis to replace ERA-Interim is in progress. ERA5 is the fifth generation of ECMWF atmospheric reanalyses of the global climate, which started with the FGGE reanalyses produced in the 1980s, followed by ERA-15, ERA-40 and most recently ERA-Interim. ERA5 will cover the period January 1950 to near real time.

    ERA5 is produced using high-resolution forecasts (HRES) at 31 kilometer resolution (one fourth the spatial resolution of the operational model) and a 62 kilometer resolution ten member 4D-Var ensemble of data assimilation (EDA) in CY41r2 of ECMWF's Integrated Forecast System (IFS) with 137 hybrid sigma-pressure (model) levels in the vertical, up to a top level of 0.01 hPa. Atmospheric data on these levels are interpolated to 37 pressure levels (the same levels as in ERA-Interim). Surface or single level data are also available, containing 2D parameters such as precipitation, 2 meter temperature, top of atmosphere radiation and vertical integrals over the entire atmosphere. The IFS is coupled to a soil model, the parameters of which are also designated as surface parameters, and an ocean wave model. Generally, the data is available at an hourly frequency and consists of analyses and short (12 hour) forecasts, initialized twice daily from analyses at 06 and 18 UTC. Most analyses parameters are also available from the forecasts. There are a number of forecast parameters, e.g. mean rates and accumulations, that are not available from the analyses.

    Improvements to ERA5, compared to ERA-Interim, include use of HadISST.2, reprocessed ECMWF climate data records (CDR), and implementation of RTTOV11 radiative transfer. Variational bias corrections have not only been applied to satellite radiances, but also ozone retrievals, aircraft observations, surface pressure, and radiosonde profiles.

  15. What AB 2644 Means for Geothermal Exploratory Projects in California

    • data.wu.ac.at
    Updated Dec 29, 2015
    Cite
    (2015). What AB 2644 Means for Geothermal Exploratory Projects in California [Dataset]. https://data.wu.ac.at/odso/geothermaldata_org/ZWUxOGFiY2EtOTBkNi00NTVkLWFlYjMtMjk2NjA5MzYzNzlj
    Explore at:
    Dataset updated
    Dec 29, 2015
    Description

    No Publication Abstract is Available

  16. cylistic_trip_data

    • kaggle.com
    zip
    Updated Jan 31, 2022
    + more versions
    Cite
    Tracy Nguyen (2022). cylistic_trip_data [Dataset]. https://www.kaggle.com/trnguyen1510/cylistic-trip-data
    Explore at:
    Available download formats: zip (204750591 bytes)
    Dataset updated
    Jan 31, 2022
    Authors
    Tracy Nguyen
    Description

    Context

    Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study.

    You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

    In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs. Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.

    Content

    The datasets contain the previous 12 months of Cyclistic trip data. The datasets have a different name because Cyclistic is a fictional company. For the purposes of this case study, the datasets are appropriate and will enable you to answer business questions.

    Acknowledgements

    This data has been made available by Motivate International Inc. under this license. This is public data that you can use to explore how different customer types are using Cyclistic bikes. But note that data-privacy issues prohibit you from using riders’ personally identifiable information. This means that you won’t be able to connect pass purchases to credit card numbers to determine if casual riders live in the Cyclistic service area or if they have purchased multiple single passes.

    Inspiration

    Research question: How do annual members and casual riders use Cyclistic bikes differently?

  17. Proposal of process optimazation and human capital factors as means of value...

    • data.mendeley.com
    Updated Sep 30, 2019
    + more versions
    Cite
    Flavio Andrade (2019). Proposal of process optimazation and human capital factors as means of value generation in organizations [Dataset]. http://doi.org/10.17632/f3g6ythk5h.2
    Explore at:
    Dataset updated
    Sep 30, 2019
    Authors
    Flavio Andrade
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data come from an MSc survey study and represent the valuation of 19 variables intended to depict both the process optimization and the human capital factors that underpin an organizational strategy.

  18. Data from: Determinants of emotional distress in neonatal healthcare...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 24, 2022
    Cite
    Gagliardi Luigi (2022). Determinants of emotional distress in neonatal healthcare professionals: an exploratory analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7079092
    Explore at:
    Dataset updated
    Dec 24, 2022
    Dataset provided by
    Provenzi Livio
    Gagliardi Luigi
    Merusi Ilaria
    Ciotti Sabina
    Grumi Serena
    Nazzari Sarah
    Description

    This database includes the raw data linked with the paper “Determinants of emotional distress in neonatal healthcare professionals: an exploratory analysis”. This study is part of the “Staff and Parental Adjustment to COVID-19 Epidemics – Neonatal Experience in Tuscany” (SPACE-NET) multicenter project. In this paper, we report data on potential predictors of emotional distress in healthcare professionals who work in neonatal wards (NWs) and neonatal intensive care units (NICUs).

    Procedures - Healthcare professionals of seven level-3 and six level-2 neonatal units in Tuscany (Italy) were invited to complete an online survey. Emotional distress (i.e., anxiety, depression, psychosomatic, post-traumatic stress symptoms and emotional exhaustion), Behavioral Inhibition System (BIS) and Behavioral Approach System (BAS) sensitivity, coping strategies and safety culture were assessed through well-validated, self-reported questionnaires.

    Analytical plan - Differences in mean levels of personality, coping and safety between professionals from NICUs or NWs were determined by Student’s t tests. Forward stepwise multivariate regression analyses were performed to identify significant predictors of Emotional Distress for the total sample and separately for professionals from NWs and NICUs. Furthermore, we performed a two-step cluster analysis to exploratorily identify specific profiles of professionals in terms of personality, coping strategies and safety culture and their relationship with emotional distress.

    Findings in brief - Greater BIS/BAS sensitivity, avoidance coping strategies and a sub-dimension of safety culture (i.e., stress recognition) were all associated with greater risk of emotional distress, whereas job satisfaction emerged as a protective factor. Personnel from NWs and NICUs showed different associations between personality, coping and safety culture.

  19. KID-F (K-pop Idol Dataset - Female)

    • kaggle.com
    Updated Aug 5, 2022
    + more versions
    Cite
    Dongkyu Kim (2022). KID-F (K-pop Idol Dataset - Female) [Dataset]. https://www.kaggle.com/datasets/vkehfdl1/kidf-kpop-idol-dataset-female
    Explore at:
    Dataset updated
    Aug 5, 2022
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Dongkyu Kim
    Description

    Description

    K-pop Idol Dataset - Female (KID-F) is the first dataset of K-pop idol high quality face images. It consists of about 6,000 high quality face images at 512x512 resolution and identity labels for each image.

    We collected about 90,000 images of female K-pop idols and cropped the face from each image. We then selected the high quality face images, which yielded the roughly 6,000 images in this dataset.

    There are 300 test datasets for a benchmark. There are no duplicate images between the test and train sets. Some identities in the test images do not appear in the train images (that is, some test images show identities that are new to the trained model). Each test image has a degraded counterpart. You can use these degraded test images to test face super resolution performance.

    We also provide identity labels for each image.

    You can use this dataset for training face super resolution models.

    Agreement

    • The use of this software is RESTRICTED to non-commercial research and educational purposes.
    • All images of the KID-F dataset are obtained from the internet and are not the property of EDA (PCEO-AI-CLUB). EDA is not responsible for the content or the meaning of these images.
    • You agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes, any portion of the images and any portion of derived data.
    • You agree not to further copy, publish or distribute any portion of the KID-F dataset, except that copies of the dataset may be made for internal use at a single site within the same organization.
    • EDA reserves the right to terminate your access to the CelebA dataset at any time.

  20. ERA5 Reanalysis

    • oidc.rda.ucar.edu
    • data.ucar.edu
    • +1more
    Updated Sep 5, 2017
    + more versions
    Cite
    European Centre for Medium-Range Weather Forecasts (2017). ERA5 Reanalysis [Dataset]. http://doi.org/10.5065/D6X34W69
    Explore at:
    Dataset updated
    Sep 5, 2017
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    European Centre for Medium-Range Weather Forecasts
    Time period covered
    Jan 1, 2002 - Feb 1, 2019
    Area covered
    Description

    Please note: Use RDA dataset ds633.0 to access RDA-maintained ERA5 data; see ERA5 Reanalysis (0.25 Degree Latitude-Longitude Grid) [https://rda.ucar.edu/datasets/ds633.0]. This dataset is no longer being updated, and web access has been removed.

    After many years of research and technical preparation, the production of a new ECMWF climate reanalysis to replace ERA-Interim is in progress. ERA5 is the fifth generation of ECMWF atmospheric reanalyses of the global climate, which started with the FGGE reanalyses produced in the 1980s, followed by ERA-15, ERA-40 and most recently ERA-Interim. ERA5 will cover the period January 1950 to near real time, though the first segment of data to be released will span the period 2010-2016.

    ERA5 is produced using high-resolution forecasts (HRES) at 31 kilometer resolution (one fourth the spatial resolution of the operational model) and a 62 kilometer resolution ten member 4D-Var ensemble of data assimilation (EDA) in CY41r2 of ECMWF's Integrated Forecast System (IFS) with 137 hybrid sigma-pressure (model) levels in the vertical, up to a top level of 0.01 hPa. Atmospheric data on these levels are interpolated to 37 pressure levels (the same levels as in ERA-Interim). Surface or single level data are also available, containing 2D parameters such as precipitation, 2 meter temperature, top of atmosphere radiation and vertical integrals over the entire atmosphere. The IFS is coupled to a soil model, the parameters of which are also designated as surface parameters, and an ocean wave model. Generally, the data is available at an hourly frequency and consists of analyses and short (18 hour) forecasts, initialized twice daily from analyses at 06 and 18 UTC. Most analyses parameters are also available from the forecasts. There are a number of forecast parameters, e.g. mean rates and accumulations, that are not available from the analyses.

    Improvements to ERA5, compared to ERA-Interim, include use of HadISST.2, reprocessed ECMWF climate data records (CDR), and implementation of RTTOV11 radiative transfer. Variational bias corrections have not only been applied to satellite radiances, but also ozone retrievals, aircraft observations, surface pressure, and radiosonde profiles.

    NCAR's Data Support Section (DSS) is performing and supplying a grid transformed version of ERA5, in which variables originally represented as spectral coefficients or archived on a reduced Gaussian grid are transformed to a regular 1280 longitude by 640 latitude N320 Gaussian grid. In addition, DSS is also computing horizontal winds (u-component, v-component) from spectral vorticity and divergence where these are available. Finally, the data is reprocessed into single parameter time series.

    Please note: As of November 2017, DSS is also producing a CF 1.6 compliant netCDF-4/HDF5 version of ERA5 for CISL RDA at NCAR. The netCDF-4/HDF5 version is the de facto RDA ERA5 online data format. The GRIB1 data format is only available via NCAR's High Performance Storage System (HPSS). We encourage users to evaluate the netCDF-4/HDF5 version for their work, and to use the currently existing GRIB1 files as a reference and basis of comparison. To ease this transition, there is a one-to-one correspondence between the netCDF-4/HDF5 and GRIB1 files, with as much GRIB1 metadata as possible incorporated into the attributes of the netCDF-4/HDF5 counterpart.
