88 datasets found
  1. Supplementary material from "Visual comparison of two data sets: Do people...

    • figshare.com
    xlsx
    Updated Mar 14, 2017
    Cite
    Robin Kramer; Caitlin Telfer; Alice Towler (2017). Supplementary material from "Visual comparison of two data sets: Do people use the means and the variability?" [Dataset]. http://doi.org/10.6084/m9.figshare.4751095.v1
    Available download formats: xlsx
    Dataset updated
    Mar 14, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Robin Kramer; Caitlin Telfer; Alice Towler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In our everyday lives, we are required to make decisions based upon our statistical intuitions. Often, these involve the comparison of two groups, such as luxury versus family cars and their suitability. Research has shown that the mean difference affects judgements where two sets of data are compared, but the variability of the data has only a minor influence, if any at all. However, prior research has tended to present raw data as simple lists of values. Here, we investigated whether displaying data visually, in the form of parallel dot plots, would lead viewers to incorporate variability information. In Experiment 1, we asked a large sample of people to compare two fictional groups (children who drank ‘Brain Juice’ versus water) in a one-shot design, where only a single comparison was made. Our results confirmed that only the mean difference between the groups predicted subsequent judgements of how much they differed, in line with previous work using lists of numbers. In Experiment 2, we asked each participant to make multiple comparisons, with both the mean difference and the pooled standard deviation varying across data sets they were shown. Here, we found that both sources of information were correctly incorporated when making responses. Taken together, we suggest that increasing the salience of variability information, through manipulating this factor across items seen, encourages viewers to consider this in their judgements. Such findings may have useful applications for best practices when teaching difficult concepts like sampling variation.

  2. Means, standard deviations, and correlations for all variables in Study 2.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jan 10, 2019
    Cite
    Moradi, Saleh; Hayhurst, Jillian G.; Arahanga-Doyle, Hitaua; Scarf, Damian; Neha, Tia; Boyes, Mike; Cruwys, Tegan; Koni, Elizabeth; Hunter, John A. (2019). Means, standard deviations, and correlations for all variables in Study 2. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000157107
    Dataset updated
    Jan 10, 2019
    Authors
    Moradi, Saleh; Hayhurst, Jillian G.; Arahanga-Doyle, Hitaua; Scarf, Damian; Neha, Tia; Boyes, Mike; Cruwys, Tegan; Koni, Elizabeth; Hunter, John A.
    Description

    Means, standard deviations, and correlations for all variables in Study 2.

  3. Chapter 3 of the Working Group I Contribution to the IPCC Sixth Assessment...

    • data-search.nerc.ac.uk
    Updated May 16, 2024
    Cite
    (2024). Chapter 3 of the Working Group I Contribution to the IPCC Sixth Assessment Report - data for Figure 3.39 (v20220614) [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=AR6
    Dataset updated
    May 16, 2024
    Description

    Data for Figure 3.39 from Chapter 3 of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6). Figure 3.39 shows the observed and simulated Pacific Decadal Variability (PDV).

    How to cite this dataset:

    When citing this dataset, please include both the data citation below (under 'Citable as') and the following citation for the report component from which the figure originates: Eyring, V., N.P. Gillett, K.M. Achuta Rao, R. Barimalala, M. Barreiro Parrillo, N. Bellouin, C. Cassou, P.J. Durack, Y. Kosaka, S. McGregor, S. Min, O. Morgenstern, and Y. Sun, 2021: Human Influence on the Climate System. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 423–552, doi:10.1017/9781009157896.005.

    Figure subpanels:

    The figure has six panels. Files are not separated according to the panels.

    List of data provided:

    pdv.obs.nc contains
    - Observed SST anomalies associated with the PDV pattern
    - Observed PDV index time series (unfiltered)
    - Observed PDV index time series (low-pass filtered)
    - Taylor statistics of the observed PDV patterns
    - Statistical significance of the observed SST anomalies associated with the PDV pattern

    pdv.hist.cmip6.nc contains
    - Simulated SST anomalies associated with the PDV pattern
    - Simulated PDV index time series (unfiltered)
    - Simulated PDV index time series (low-pass filtered)
    - Taylor statistics of the simulated PDV patterns
    based on CMIP6 historical simulations.

    pdv.hist.cmip5.nc contains
    - Simulated SST anomalies associated with the PDV pattern
    - Simulated PDV index time series (unfiltered)
    - Simulated PDV index time series (low-pass filtered)
    - Taylor statistics of the simulated PDV patterns
    based on CMIP5 historical simulations.

    pdv.piControl.cmip6.nc contains
    - Simulated SST anomalies associated with the PDV pattern
    - Simulated PDV index time series (unfiltered)
    - Simulated PDV index time series (low-pass filtered)
    - Taylor statistics of the simulated PDV patterns
    based on CMIP6 piControl simulations.

    pdv.piControl.cmip5.nc contains
    - Simulated SST anomalies associated with the PDV pattern
    - Simulated PDV index time series (unfiltered)
    - Simulated PDV index time series (low-pass filtered)
    - Taylor statistics of the simulated PDV patterns
    based on CMIP5 piControl simulations.

    Data provided in relation to figure:

    Panel a:
    - ipo_pattern_obs_ref in pdv.obs.nc: shading
    - ipo_pattern_obs_signif (dataset = 1) in pdv.obs.nc: cross markers

    Panel b:
    - Multimodel ensemble mean of ipo_model_pattern in pdv.hist.cmip6.nc: shading, with their sign agreement for hatching

    Panel c:
    - tay_stats (stat = 0, 1) in pdv.obs.nc: black dots
    - tay_stats (stat = 0, 1) in pdv.hist.cmip6.nc: red crosses, and their multimodel ensemble mean for the red dot
    - tay_stats (stat = 0, 1) in pdv.hist.cmip5.nc: blue crosses, and their multimodel ensemble mean for the blue dot

    Panel d:
    - Lag-1 autocorrelation of tpi in pdv.obs.nc: black horizontal lines in left
      . ERSSTv5: dataset = 1
      . HadISST: dataset = 2
      . COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of tpi in pdv.piControl.cmip5.nc: blue open box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of tpi in pdv.piControl.cmip6.nc: red open box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of tpi in pdv.hist.cmip5.nc: blue filled box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of tpi in pdv.hist.cmip6.nc: red filled box-whisker in the left
    - Lag-10 autocorrelation of tpi_lp in pdv.obs.nc: black horizontal lines in right
      . ERSSTv5: dataset = 1
      . HadISST: dataset = 2
      . COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of tpi_lp in pdv.piControl.cmip5.nc: blue open box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of tpi_lp in pdv.piControl.cmip6.nc: red open box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of tpi_lp in pdv.hist.cmip5.nc: blue filled box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of tpi_lp in pdv.hist.cmip6.nc: red filled box-whisker in the right

    Panel e:
    - Standard deviation of tpi in pdv.obs.nc: black horizontal lines in left
      . ERSSTv5: dataset = 1
      . HadISST: dataset = 2
      . COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of standard deviation of tpi in pdv.piControl.cmip5.nc: blue open box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of tpi in pdv.piControl.cmip6.nc: red open box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of tpi in pdv.hist.cmip5.nc: blue filled box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of tpi in pdv.hist.cmip6.nc: red filled box-whisker in the left
    - Standard deviation of tpi_lp in pdv.obs.nc: black horizontal lines in right
      . ERSSTv5: dataset = 1
      . HadISST: dataset = 2
      . COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of standard deviation of tpi_lp in pdv.piControl.cmip5.nc: blue open box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of tpi_lp in pdv.piControl.cmip6.nc: red open box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of tpi_lp in pdv.hist.cmip5.nc: blue filled box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of tpi_lp in pdv.hist.cmip6.nc: red filled box-whisker in the right

    Panel f:
    - tpi_lp in pdv.obs.nc: black curves
      . ERSSTv5: dataset = 1
      . HadISST: dataset = 2
      . COBE-SST2: dataset = 3
    - tpi_lp in pdv.hist.cmip6.nc: 5th-95th percentiles in red shading, multimodel ensemble mean and its 5-95% confidence interval for red curves
    - tpi_lp in pdv.hist.cmip5.nc: 5th-95th percentiles in blue shading, multimodel ensemble mean for blue curve

    CMIP5 is the fifth phase of the Coupled Model Intercomparison Project. CMIP6 is the sixth phase of the Coupled Model Intercomparison Project. SST stands for Sea Surface Temperature.

    Notes on reproducing the figure from the provided data:

    Multimodel ensemble means and percentiles of historical simulations of CMIP5 and CMIP6 are calculated after weighting individual members with the inverse of the ensemble size of the same model. ensemble_assign in each file provides the model number to which each ensemble member belongs. This weighting does not apply to the sign agreement calculation. piControl simulations from CMIP5 and CMIP6 consist of a single member from each model, so the weighting is not applied. Multimodel ensemble means of the pattern correlation in Taylor statistics in (c) and the autocorrelation of the index in (d) are calculated via Fisher z-transformation and back transformation.

    Sources of additional information:

    The following weblinks are provided in the Related Documents section of this catalogue record:
    - Link to the report component containing the figure (Chapter 3)
    - Link to the Supplementary Material for Chapter 3, which contains details on the input data used in Table 3.SM.1
    - Link to the code for the figure, archived on Zenodo
    - Link to the figure on the IPCC AR6 website
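
    The member weighting and the Fisher z averaging mentioned in the notes on reproducing the figure could be sketched roughly as below. This is only an illustration in plain Python, not the figure code archived on Zenodo: the function names are hypothetical, ensemble_assign is represented as a simple list of model numbers, and all numeric values are made up.

```python
import math
from collections import Counter


def member_weights(ensemble_assign):
    """Weight each member by 1 / (number of members from the same model)."""
    counts = Counter(ensemble_assign)
    return [1.0 / counts[model] for model in ensemble_assign]


def weighted_mean(values, weights):
    """Weighted average of per-member values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fisher_mean(correlations, weights):
    """Average correlations via Fisher z-transformation and back transformation."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(weighted_mean(z_values, weights))


# Toy example with made-up numbers: model 1 has two members, model 2 has one.
ensemble_assign = [1, 1, 2]
weights = member_weights(ensemble_assign)

std_of_index = [0.8, 1.0, 0.6]          # hypothetical per-member statistic
mean_std = weighted_mean(std_of_index, weights)

pattern_corr = [0.9, 0.7, 0.5]          # hypothetical pattern correlations
mean_corr = fisher_mean(pattern_corr, weights)
```

    With these weights each model contributes equally to the mean regardless of how many members it supplied, which appears to be the intent of the inverse ensemble-size weighting.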

  4. Datasets from an interlaboratory comparison to characterize a multi-modal...

    • data.nist.gov
    • datasets.ai
    Updated Jul 12, 2021
    Cite
    Kurt D. Benkstein (2021). Datasets from an interlaboratory comparison to characterize a multi-modal polydisperse sub-micrometer bead dispersion [Dataset]. http://doi.org/10.18434/mds2-2352
    Dataset updated
    Jul 12, 2021
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Kurt D. Benkstein
    License

    https://www.nist.gov/open/license

    Description

    These four data files contain datasets from an interlaboratory comparison that characterized a polydisperse five-population bead dispersion in water. A more detailed version of this description is available in the ReadMe file (PdP-ILC_datasets_ReadMe_v1.txt), which also includes definitions of abbreviations used in the data files. Paired samples were evaluated, so the datasets are organized as pairs associated with a randomly assigned laboratory number. The datasets are organized in the files by instrument type: PTA (particle tracking analysis), RMM (resonant mass measurement), ESZ (electrical sensing zone), and OTH (other techniques not covered in the three largest groups, including holographic particle characterization, laser diffraction, flow imaging, and flow cytometry). In the OTH group, the specific instrument type for each dataset is noted. Each instrument type (PTA, RMM, ESZ, OTH) has a dedicated file. Included in the data files for each dataset are: (1) the cumulative particle number concentration (PNC, (1/mL)); (2) the concentration distribution density (CDD, (1/mL·nm)) based upon five bins centered at each particle population peak diameter; (3) the CDD in higher resolution, varied-width bins. The lower-diameter bin edge (µm) is given for (2) and (3). Additionally, the PTA, RMM, and ESZ files each contain unweighted mean cumulative particle number concentrations and concentration distribution densities calculated from all datasets reporting values. The associated standard deviations and standard errors of the mean are also given. In the OTH file, the means and standard deviations were calculated using only data from one of the sub-groups (holographic particle characterization) that had n = 3 paired datasets. Where necessary, datasets not using the common bin resolutions are noted (PTA, OTH groups). 
The data contained here are presented and discussed in a manuscript to be submitted to the Journal of Pharmaceutical Sciences and presented as part of that scientific record.

  5. Means and standard deviations for SWB in Studies 1 and 2.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 18, 2024
    Cite
    Hildreth, John Angus D.; Anderson, Cameron (2024). Means and standard deviations for SWB in Studies 1 and 2. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001383168
    Dataset updated
    Sep 18, 2024
    Authors
    Hildreth, John Angus D.; Anderson, Cameron
    Description

    Means and standard deviations for SWB in Studies 1 and 2.

  6. Customer Satisfaction Scores and Behavior Data

    • kaggle.com
    zip
    Updated Apr 6, 2025
    Cite
    Salahuddin Ahmed (2025). Customer Satisfaction Scores and Behavior Data [Dataset]. https://www.kaggle.com/datasets/salahuddinahmedshuvo/customer-satisfaction-scores-and-behavior-data/discussion
    Available download formats: zip (2456 bytes)
    Dataset updated
    Apr 6, 2025
    Authors
    Salahuddin Ahmed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains customer satisfaction scores collected from a survey, alongside key demographic and behavioral data. It includes variables such as customer age, gender, location, purchase history, support contact status, loyalty level, and satisfaction factors. The dataset is designed to help analyze customer satisfaction, identify trends, and develop insights that can drive business decisions.

    File Information:

    File Name: customer_satisfaction_data.csv (or your specific file name)

    File Type: CSV (or the actual file format you are using)

    Number of Rows: 120

    Number of Columns: 10

    Column Names:

    Customer_ID – Unique identifier for each customer (e.g., 81-237-4704)

    Group – The group to which the customer belongs (A or B)

    Satisfaction_Score – Customer's satisfaction score on a scale of 1-10

    Age – Age of the customer

    Gender – Gender of the customer (Male, Female)

    Location – Customer's location (e.g., Phoenix.AZ, Los Angeles.CA)

    Purchase_History – Whether the customer has made a purchase (Yes or No)

    Support_Contacted – Whether the customer has contacted support (Yes or No)

    Loyalty_Level – Customer's loyalty level (Low, Medium, High)

    Satisfaction_Factor – Primary factor contributing to customer satisfaction (e.g., Price, Product Quality)

    Statistical Analyses:

    Descriptive Statistics:

    Calculate mean, median, mode, standard deviation, and range for key numerical variables (e.g., Satisfaction Score, Age).

    Summarize categorical variables (e.g., Gender, Loyalty Level, Purchase History) with frequency distributions and percentages.

    Two-Sample t-Test (Independent t-test):

    Compare the mean satisfaction scores between two independent groups (e.g., Group A vs. Group B) to determine if there is a significant difference in their average satisfaction scores.

    Paired t-Test:

    If there are two related measurements (e.g., satisfaction scores before and after a certain event), you can compare the means using a paired t-test.

    One-Way ANOVA (Analysis of Variance):

    Test if there are significant differences in mean satisfaction scores across more than two groups (e.g., comparing the mean satisfaction score across different Loyalty Levels).

    Chi-Square Test for Independence:

    Examine the relationship between two categorical variables (e.g., Gender vs. Purchase History or Loyalty Level vs. Support Contacted) to determine if there’s a significant association.

    Mann-Whitney U Test:

    For non-normally distributed data, use this test to compare satisfaction scores between two independent groups (e.g., Group A vs. Group B) to see if their distributions differ significantly.

    Kruskal-Wallis Test:

    Similar to ANOVA, but used for non-normally distributed data. This test can compare the median satisfaction scores across multiple groups (e.g., comparing satisfaction scores across Loyalty Levels or Satisfaction Factors).

    Spearman’s Rank Correlation:

    Test for a monotonic relationship between two ordinal or continuous variables (e.g., Age vs. Satisfaction Score or Satisfaction Score vs. Loyalty Level).

    Regression Analysis:

    Linear Regression: Model the relationship between a continuous dependent variable (e.g., Satisfaction Score) and independent variables (e.g., Age, Gender, Loyalty Level).

    Logistic Regression: If analyzing binary outcomes (e.g., Purchase History or Support Contacted), you could model the probability of an outcome based on predictors.

    Factor Analysis:

    To identify underlying patterns or groups in customer behavior or satisfaction factors, you can apply Factor Analysis to reduce the dimensionality of the dataset and group similar variables.

    Cluster Analysis:

    Use K-Means Clustering or Hierarchical Clustering to group customers based on similarity in their satisfaction scores and other features (e.g., Loyalty Level, Purchase History).

    Confidence Intervals:

    Calculate confidence intervals for the mean of satisfaction scores or any other metric to estimate the range in which the true population mean might lie.
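
    As a rough illustration of the descriptive statistics and the two-sample comparison listed above, the sketch below uses only the Python standard library. The scores are made-up values standing in for Satisfaction_Score in Group A and Group B, not values from the dataset, and the helper names describe and welch_t are hypothetical. It uses the Welch (unequal-variance) form of the independent t-test, which is one reasonable choice among several.

```python
import math
import statistics

# Made-up illustrative scores, NOT taken from the dataset.
group_a = [7, 8, 6, 9, 7, 8]
group_b = [5, 6, 7, 5, 6, 4]


def describe(scores):
    """Basic descriptive statistics for one numerical variable."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),      # sample standard deviation
        "range": max(scores) - min(scores),
    }


def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t_stat, dof


summary_a = describe(group_a)
summary_b = describe(group_b)
t_stat, dof = welch_t(group_a, group_b)
```

    Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; a library routine such as scipy.stats.ttest_ind with equal_var=False would normally be used for that step.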

  7. BNSClim meteorological part (version 2)

    • data.europa.eu
    Cite
    BNSClim meteorological part (version 2) [Dataset]. https://data.europa.eu/88u/dataset/de-dkrz-wdcc-iso3607582
    Description

    This is the Baltic and North Sea Climatology (BNSC) for the Baltic Sea and the North Sea in the range 47° N to 66° N and 15° W to 30° E. It is the follow-up project to the knsc climatology. The climatology was first made available to the public in March 2018 by ICDC and is published here in a slightly revised version 2. It contains the monthly averages of mean air pressure at sea level, and of air temperature and dew point temperature at 2 meter height. It is available on a 1° x 1° grid for the period from 1950 to 2015.

    For the calculation of the mean values, all available quality-controlled data of the DWD (German Meteorological Service) from ship observations and buoy measurements during this period were taken into account. Additional dew point values were calculated from relative humidity and air temperature if available. Climatologies were calculated for the WMO standard periods 1951-1980, 1961-1990, 1971-2000 and 1981-2010 (monthly mean values). As a prerequisite for the calculation of the 30-year climatology, at least 25 out of 30 (five-sixths) valid monthly means had to be present in the respective grid box. For the long-term climatology from 1950 to 2015, at least four-fifths valid monthly means had to be available.

    Two methods were used (in combination) to calculate the monthly averages, to account for the small number of measurements per grid box and their uneven spatial and temporal distribution:

    1. For parameters with a detectable annual cycle in the data (air temperature, dew point temperature), a 2nd order polynomial was fitted to the data to reduce the variation within a month and reduce the uncertainty of the calculated averages. In addition, for the mean value of air temperature, the daily temperature cycle was removed from the data. In the case of air pressure, which has no annual cycle, version 2 allowed no data gaps longer than 14 days per month and grid box for the calculation of a monthly mean and standard deviation. This method differs from knsc and BNSC version 1, where mean and standard deviation were calculated from 6-day window means.

    2. If the number of observations fell below a certain threshold, which was 20 observations per grid box and month for the air temperature as well as for the dew point temperature, and 500 per box and month for the air pressure, data from the adjacent boxes was used for the calculation. The neighbouring boxes were used in two steps (the nearest 8 boxes, and if the number was still below the threshold, the next surrounding 16 boxes) to calculate the mean value of the center box. Thus, the spatial resolution of the parameters is reduced at certain points and, instead of 1° x 1°, if neighboring values are taken into account, data from an area of 5° x 5° can also be considered, which are then averaged into a grid box value. This was especially used for air pressure, where the 24 values of the neighboring boxes were included in the averaging for most grid boxes.

    The mean value, the number of measurements, the standard deviation and the number of grid boxes used to calculate the mean values are available as parameters in the products. The calculated monthly and annual means were allocated to the centers of the grid boxes: Latitudes: 47.5, 48.5,... Longitudes: -14.5, -13.5,... In order to remove any existing values over land, a land-sea mask was used, which is also provided in 1° x 1° resolution.

    In this version 2 of the BNSC, a slightly different database was used than for the knsc, which resulted in small changes (less than 1 K) in the means and standard deviations of the 2-meter air temperature and dew point temperature. The changes in mean sea level pressure values and the associated standard deviations are in the range of a few hPa, compared to the knsc. The parameter names and units have been adjusted to meet the CF 1.6 standard.
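
    The two rules described above (the minimum number of valid monthly means for a 30-year climatology, and the stepwise use of neighbouring grid boxes when observations are sparse) might be sketched as follows. This is a simplified illustration, not the BNSC processing code: the function names and the grid layout are hypothetical, and it assumes raw observations are simply pooled across boxes, which the description does not spell out.

```python
def climatology_mean(monthly_means, min_valid=25):
    """Average up to 30 monthly means for one calendar month and grid box.

    None marks a missing month. Returns None when fewer than min_valid
    values are present (25 of 30 for the WMO standard periods).
    """
    valid = [value for value in monthly_means if value is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def pooled_observations(grid, row, col, threshold):
    """Pool observations from the box itself, then 3x3, then 5x5 boxes.

    grid[r][c] is a list of observations for one month. Expansion stops as
    soon as the pooled count reaches the threshold (e.g. 20 for temperature).
    Returns the pooled values and the neighbourhood radius that was used.
    """
    pooled, radius = [], 0
    for radius in (0, 1, 2):
        pooled = []
        for r in range(row - radius, row + radius + 1):
            for c in range(col - radius, col + radius + 1):
                if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                    pooled.extend(grid[r][c])
        if len(pooled) >= threshold:
            break
    return pooled, radius


# Toy 5 x 5 grid with three made-up observations per box.
toy_grid = [[[10.0, 11.0, 12.0] for _ in range(5)] for _ in range(5)]
values_t, radius_t = pooled_observations(toy_grid, 2, 2, threshold=20)
values_p, radius_p = pooled_observations(toy_grid, 2, 2, threshold=500)
```

    A radius of 1 corresponds to the nearest 8 boxes and a radius of 2 to the additional 16 surrounding boxes, so the 5 x 5 case covers the 24 neighbours mentioned for air pressure.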

  8. Study 1 and 2 means and standard deviations.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 2, 2024
    Cite
    Drummond, Tehya M. LePage; Van Cappellen, Patty (2024). Study 1 and 2 means and standard deviations. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001340568
    Dataset updated
    Dec 2, 2024
    Authors
    Drummond, Tehya M. LePage; Van Cappellen, Patty
    Description

    Religions, as cultural systems, influence how people view and attune to their body. This research explores whether individual differences in various dimensions of religiosity are associated with interoceptive sensibility (IS), i.e., one’s perceived ability to detect and interpret bodily signals. In Study 1, Christians, Muslims, and Hindus (N = 1570) reported their religiosity and completed the Multidimensional Assessment of Interoceptive Awareness, a well-validated measure of IS. Results show that religious identity moderates the relationship between the centrality of religion in one’s life and IS such that the association is positive and medium for Christians, large for Muslims and Hindus. In addition, the medium positive correlation between frequency of religious practice and IS was similar across religious groups. Study 2 (N = 450) extended these results by measuring additional dimensions of religiosity and spirituality as well as investigating religious-related beliefs about the body, both positive (e.g., My body is holy) and negative (e.g., My body is sinful). Associations between religiosity and IS are replicated and found for spirituality as well. Interestingly, mediation analyses reveal that belief in the body as holy partially explains the association between religiosity and IS, but belief in the body as sinful suppresses such association. We discuss how religion, as a cultural factor, may influence beliefs about the body and bodily awareness, with implications for emotion regulation and mental health.

  9. Sport and leisure facilities

    • data.opendatascience.eu
    Updated Jan 2, 2021
    Cite
    (2021). Sport and leisure facilities [Dataset]. https://data.opendatascience.eu/geonetwork/srv/search?type=dataset
    Dataset updated
    Jan 2, 2021
    Description

    Overview: 142: Areas used for sports, leisure and recreation purposes.

    Traceability (lineage): This dataset was produced with a machine learning framework with several input datasets, specified in detail in Witjes et al., 2022 (in review, preprint available at https://doi.org/10.21203/rs.3.rs-561383/v3 ).

    Scientific methodology: The single-class probability layers were generated with a spatiotemporal ensemble machine learning framework detailed in Witjes et al., 2022 (in review, preprint available at https://doi.org/10.21203/rs.3.rs-561383/v3 ). The single-class uncertainty layers were calculated by taking the standard deviation of the three single-class probabilities predicted by the three components of the ensemble. The HCL (hard class) layers represent the class with the highest probability as predicted by the ensemble.

    Usability: The HCL layers have a decreasing average accuracy (weighted F1-score) at each subsequent level in the CLC hierarchy. These metrics are 0.83 at level 1 (5 classes), 0.63 at level 2 (14 classes), and 0.49 at level 3 (43 classes). This means that the hard-class maps are more reliable when aggregating classes to a higher level in the hierarchy (e.g. 'Discontinuous Urban Fabric' and 'Continuous Urban Fabric' to 'Urban Fabric'). Some single-class probabilities may more closely represent actual patterns for some classes that were overshadowed by unequal sample point distributions. Users are encouraged to set their own thresholds when postprocessing these datasets to optimize the accuracy for their specific use case.

    Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.

    Data validation approaches: The LULC classification was validated through spatial 5-fold cross-validation as detailed in the accompanying publication.

    Completeness: The dataset has chunks of empty predictions in regions with complex coast lines (e.g. the Zeeland province in the Netherlands and the Mar da Palha bay area in Portugal). These are artifacts that will be avoided in subsequent versions of the LULC product.

    Consistency: The accuracy of the predictions was compared per year and per 30 km x 30 km tile across Europe to derive temporal and spatial consistency by calculating the standard deviation. The standard deviation of annual weighted F1-score was 0.135, while the standard deviation of weighted F1-score per tile was 0.150. This means the dataset is more consistent through time than through space: predictions are notably less accurate along the Mediterranean coast. The accompanying publication contains additional information and visualisations.

    Positional accuracy: The raster layers have a resolution of 30 m, identical to that of the Landsat data cube used as input features for the machine learning framework that predicted it.

    Temporal accuracy: The dataset contains predictions and uncertainty layers for each year between 2000 and 2019.

    Thematic accuracy: The maps reproduce the Corine Land Cover classification system, a hierarchical legend that consists of 5 classes at the highest level, 14 classes at the second level, and 44 classes at the third level. Class 523: Oceans was omitted due to computational constraints.
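
    For a single pixel, the uncertainty and hard-class layers described above could be derived along the following lines. This is a minimal sketch under stated assumptions, not the published framework: it assumes the ensemble probability is the plain mean of the three component predictions and that the population standard deviation is meant, neither of which the description confirms. The function name, the "other" class label and the probabilities are made up.

```python
import statistics


def hard_class_and_uncertainty(member_probs):
    """Combine per-class probabilities from three ensemble components.

    member_probs is a list of three dicts mapping class label -> probability
    for one pixel. Returns the hard class, the mean probability per class
    and the per-class standard deviation across components.
    """
    classes = list(member_probs[0].keys())
    mean_prob = {
        c: statistics.mean([m[c] for m in member_probs]) for c in classes
    }
    uncertainty = {
        c: statistics.pstdev([m[c] for m in member_probs]) for c in classes
    }
    # Hard class: the label with the highest (assumed mean) probability.
    hard_class = max(mean_prob, key=mean_prob.get)
    return hard_class, mean_prob, uncertainty


# Made-up probabilities for class 142 and a placeholder "other" class.
example = [
    {"142": 0.6, "other": 0.4},
    {"142": 0.7, "other": 0.3},
    {"142": 0.8, "other": 0.2},
]
hard_class, mean_prob, uncertainty = hard_class_and_uncertainty(example)
```

    A user-defined threshold on mean_prob, as the usability note suggests, could replace the simple argmax when a class is known to be under-represented.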

  10. Data from: Enlarging Applicability Domain of Quantitative Structure–Activity...

    • acs.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Shifa Zhong; Dylan R Lambeth; Thomas K Igou; Yongsheng Chen (2023). Enlarging Applicability Domain of Quantitative Structure–Activity Relationship Models through Uncertainty-Based Active Learning [Dataset]. http://doi.org/10.1021/acsestengg.1c00434.s002
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    Shifa Zhong; Dylan R Lambeth; Thomas K Igou; Yongsheng Chen
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The first step to develop a quantitative structure–activity relationship (QSAR) model is to identify a set of chemicals with known activities/properties, which can be either collected from the published studies or measured experimentally. A key challenge in this process is how to determine which chemicals are used to train a QSAR model, and, of those chemicals, which should be prioritized in experimental trials to ensure that the obtained models have large applicability domains (ADs). In this study, we employ uncertainty-based active learning (AC) to address this challenge. We use the Gaussian process (GP) to develop QSAR models for three public datasets, Koc, solubility, and k•OH, each with a number of chemicals represented by molecular descriptors, in which the GP can offer prediction uncertainty (by means of standard deviation) for the model’s prediction. The training chemicals of each dataset are selected in two different ways: (1) random splitting (RS) and (2) uncertainty-based AC. Uncertainty-based AC iteratively identifies chemicals with the highest uncertainty and selects them for model training. We demonstrate that the chemicals selected by AC are more diverse than those selected by RS and that AC-based QSAR models have better generalizability than those derived from RS. We then use these two types of models to predict the properties of chemicals in the REACH dataset (>300,000 chemicals) and assess their ADs using five different AD determination methods. We demonstrate that the AD of AC-based QSAR models for all AD methods is significantly larger than those of RS-based models (up to 24 times larger). This study provides a novel method to enlarge the AD of QSAR models, which can guide model development and improve the property prediction reliability for more REACH dataset chemicals while minimizing the development cost and time.
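
    The selection loop behind uncertainty-based active learning can be illustrated with a toy Gaussian process. The sketch below is not the authors' implementation: it uses a single scalar "descriptor" per chemical instead of a vector of molecular descriptors, a fixed RBF kernel with unit length scale, and hypothetical function names. It relies on the fact that, for fixed kernel hyperparameters, the GP predictive standard deviation depends only on the inputs already selected, so no property values are needed to rank candidates.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar descriptors."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def posterior_std(train_x, x_new, noise=1e-6):
    """GP predictive standard deviation at x_new given training inputs."""
    gram = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    k_star = [rbf(a, x_new) for a in train_x]
    alpha = solve(gram, k_star)
    variance = rbf(x_new, x_new) - sum(k * a for k, a in zip(k_star, alpha))
    return math.sqrt(max(variance, 0.0))


def active_learning(pool, initial, n_queries):
    """Repeatedly add the pool item with the largest predictive std."""
    selected = list(initial)
    remaining = [x for x in pool if x not in selected]
    for _ in range(n_queries):
        best = max(remaining, key=lambda x: posterior_std(selected, x))
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up scalar descriptors for five hypothetical chemicals.
pool = [0.0, 0.5, 1.0, 3.0, 5.0]
chosen = active_learning(pool, initial=[0.0], n_queries=2)
```

    In this toy pool the loop picks the candidates farthest from what has already been selected, which is consistent with the description's observation that chemicals chosen this way are more diverse than a random split.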

  11. Means and standard deviations of maximum a posteriori (MAP) estimates of (α,...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 18, 2017
    Cite
    Black, Andrew J.; Walker, James N.; Ross, Joshua V. (2017). Means and standard deviations of maximum a posteriori (MAP) estimates of (α, β, γ, R*, r). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001810716
    Explore at:
    Dataset updated
    Oct 18, 2017
    Authors
    Black, Andrew J.; Walker, James N.; Ross, Joshua V.
    Description

    The means and standard deviations of the 50 MAP estimates for each parameter, based upon data with 400 infected households, are shown in the form mean(standard deviation) for the BPA and DA-MCMC methods. The last row shows the difference in the mean and standard deviation between the two methods.

  12. Data from: Improved Accuracy and Reliability in Untargeted Analysis with...

    • figshare.com
    • acs.figshare.com
    xlsx
    Updated Apr 4, 2025
    Cite
    Guillaume Laurent Erny; Julia Nowak; Michał Woźniakiewicz (2025). Improved Accuracy and Reliability in Untargeted Analysis with LC-ESI-QTOF/MS1 by Ensemble Averaging [Dataset]. http://doi.org/10.1021/acs.analchem.4c06078.s002
    Explore at:
    xlsx Available download formats
    Dataset updated
    Apr 4, 2025
    Dataset provided by
    ACS Publications
    Authors
    Guillaume Laurent Erny; Julia Nowak; Michał Woźniakiewicz
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Untargeted liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) is a powerful tool for comprehensive chemical analysis. Such techniques allow the detection and quantification of thousands of compounds in a sample. However, the complexity and variability in the data can introduce significant errors, impacting the reliability of the results. This study investigates ensemble averaging to mitigate these errors and improve signal-to-noise (S/N) ratios, feature detection, and data quality. In this work, 256 LC-qTOF/MS1 data sets from the analysis of Morning Glory seeds were averaged to generate merged data sets. The numbers of the pooled data sets in the merged files were varied, and the number of features, the S/N ratio, the accuracy and precision of the accurate masses, relative intensities, and migration time were examined. It was proved that ensemble averaging allows an increase in the S/N up to a factor of 10, and the relative standard deviation of the accurate masses and retention time decreased by a factor of 10. Moreover, the average number of features mined per data set increased from 1192 ± 129 with the original data set to 4408 when all data sets were averaged into one. Using known target compounds, ensemble averaging benefits on quantitative analysis were investigated. The measured and theoretical relative intensities between the [M+1]+H+, [M+2]+H+, and [M+3]+H+ and [M]+H+ isotopes of known alkaloids were used. The standard deviation decreased by up to a factor of 10, and the absolute error between theoretical and experimental relative intensities was below 3%, making the theoretical isotopic pattern a valid criterion for confirming a putative molecular formula. Using a targeted approach to recover quantitative data from the original data sets from information in the merged data sets provides an accurate quantitative means. 
Peak lists from the merged data sets and quantitative information from the original data sets were fused to obtain a robust clustering approach that allows recognizing features (adducts, isotopes, and fragments) generated by a common chemical in the ionization chamber. Two hundred and four clusters were obtained, characterized by two or more features with migration times that differ by less than 0.05 min and with similar response patterns.
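
    The effect of ensemble averaging on noise can be illustrated with a small simulation. This is a sketch under stated assumptions, not the study's processing pipeline: it assumes independent, zero-mean Gaussian noise of equal size in every data set, and the function name `averaged_noise_std` is hypothetical. Under those idealized assumptions the noise standard deviation falls as 1/sqrt(N), so averaging 256 data sets would give an ideal 16-fold reduction; the description above reports gains of up to a factor of 10 on real data.

```python
import random
import statistics


def averaged_noise_std(n_datasets, n_points=1000, noise_sd=1.0, seed=0):
    """Standard deviation of the noise left after averaging n_datasets traces.

    Each point of each simulated trace carries independent Gaussian noise.
    """
    rng = random.Random(seed)
    averaged = [
        statistics.fmean(rng.gauss(0.0, noise_sd) for _ in range(n_datasets))
        for _ in range(n_points)
    ]
    return statistics.pstdev(averaged)


for n in (1, 16, 256):
    residual = averaged_noise_std(n)
    # For a fixed signal height, the S/N gain is the inverse of the residual noise.
    print(n, round(residual, 4), round(1.0 / residual, 1))
```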

  13. GPM PR on TRMM Precipitation Statistics, at Surface and Fixed Heights 1 day...

    • datasets.ai
    • cmr.earthdata.nasa.gov
    • +4more
    21, 33, 34
    Updated Nov 30, 2022
    + more versions
    Cite
    National Aeronautics and Space Administration (2022). GPM PR on TRMM Precipitation Statistics, at Surface and Fixed Heights 1 day 0.25x0.25 degree V07 (GPM_3PRD) at GES DISC [Dataset]. https://datasets.ai/datasets/gpm-pr-on-trmm-precipitation-statistics-at-surface-and-fixed-heights-1-day-0-25x0-25-degre-fbfec
    Explore at:
    21, 34, 33 Available download formats
    Dataset updated
    Nov 30, 2022
    Dataset provided by
    NASA http://nasa.gov/
    Authors
    National Aeronautics and Space Administration
    Description

    This is a new (GPM-formatted) TRMM product. There is no equivalent in the old TRMM suite of products.

    Version 07 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 07.

    This is the GPM-like formatted TRMM Precipitation Radar (PR) daily gridded data, first released with the "V8" TRMM reprocessing. The daily radar grid data is new for TRMM nomenclature and is introduced for consistency with the GPM Dual-frequency Precipitation Radar (DPR). The closest ancestor was 3A25, which provided monthly radar statistics.

    This product consists of daily statistics of the PR measurements at (0.25x0.25) degrees horizontal resolution.

    The objective of the algorithm is to calculate various daily statistics from the level 2 PR output products. Four types of statistics are calculated: 1. Probabilities of occurrence (count values) 2. Means and standard deviations. In all cases, the statistics are conditioned on the presence of rain or some other quantity, such as the presence of stratiform rain or the presence of a bright band. For example, to compute the unconditioned mean rain rate, the conditional mean must be multiplied by the probability of rain, which, in turn, is calculated from the ratio of rain counts to the total number of observations in the box of interest.
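
    The conversion from a conditional to an unconditioned mean described above is a single multiplication. The sketch below illustrates the arithmetic only; the function and argument names are hypothetical and are not variable names from the GPM_3PRD files.

```python
def unconditional_mean(conditional_mean, rain_count, total_count):
    """Unconditioned mean = conditional mean x probability of rain.

    The probability of rain is the ratio of rain counts to the total
    number of observations in the grid box.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= rain_count <= total_count:
        raise ValueError("rain_count must lie between 0 and total_count")
    probability_of_rain = rain_count / total_count
    return conditional_mean * probability_of_rain


# Illustrative numbers: a conditional mean of 4.0 with rain in 25 of 100 observations.
print(unconditional_mean(4.0, 25, 100))
```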

    The grids are in the Planetary Grid 2 structure matching the Dual-frequency PR on the core GPM observatory that covers 67S to 67N degrees of latitudes. Areas beyond the ±40 degrees of latitudes are padded with empty grid cells.

  14. Means (and standard deviations) of demographic and clinical data for two...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Tiziana Zalla; Elena Daprati; Anca-Maria Sav; Pauline Chaste; Daniele Nico; Marion Leboyer (2023). Means (and standard deviations) of demographic and clinical data for two groups (Asperger and Comparison). [Dataset]. http://doi.org/10.1371/journal.pone.0013370.t001
    Explore at:
    xls Available download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Tiziana Zalla; Elena Daprati; Anca-Maria Sav; Pauline Chaste; Daniele Nico; Marion Leboyer
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1 Mean values (SD); normal scores > 9 (max score = 16). 2 Max score = 60.

  15. Data from: Low-frequency oscillations in the magnetic nozzle of a Helicon...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Sep 13, 2024
    + more versions
    Cite
    Davide Maddaloni; Borja Bayón-Buján; Jaume Navarro-Cavallé; Filippo Terragni; Mario Merino (2024). Data from: Low-frequency oscillations in the magnetic nozzle of a Helicon Plasma Thruster [Dataset]. http://doi.org/10.5281/zenodo.13758358
    Explore at:
    bin Available download formats
    Dataset updated
    Sep 13, 2024
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Davide Maddaloni; Borja Bayón-Buján; Jaume Navarro-Cavallé; Filippo Terragni; Mario Merino
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    - Data from: Low-frequency oscillations in the magnetic nozzle of a Helicon Plasma Thruster

    - Authors: Davide Maddaloni, Borja Bayón-Buján, Jaume Navarro-Cavallé, Mario Merino, Filippo Terragni

    - Contact email: dmaddalo@ing.uc3m.es

    - Date: 2024-09-13

    - Version: 1.0.0

    - License: This dataset is made available under the Creative Commons Attribution 4.0 International license

    Abstract

    This dataset contains the postprocessed experimental data used in:

    Davide Maddaloni, Borja Bayón-Buján, Jaume Navarro-Cavallé, Mario Merino, Filippo Terragni, "Low-frequency oscillations in the magnetic nozzle of a Helicon Plasma Thruster", Plasma Sources Science and Technology (currently submitted).

    Dataset description

    The experimental data is gathered by means of three distinct floating Langmuir Probes (LPs).

    For the time-resolved analysis, the original floating potential data is postprocessed according to the description provided in the corresponding journal article (Section 3). For the time-averaged analysis, sweeping of the LPs is performed and the I-V curves are postprocessed according to the routine illustrated in Lobbia et al.

    Please refer to the related article for further details regarding any of the parameters and/or configurations.

    Data files

    The data files are in standard Matlab .mat format. A recent version of Matlab is recommended.

    For the time-resolved results, data is subdivided according to the spatial position inspected and the xenon injected mass flow rate. The nomenclature of the files is the following: "Output_[injected mass flow rate]_[axial position]_[angular position]". Currently, all the arrays collect frequencies up to 200 kHz. In a future update, frequencies up to 1 MHz will be included.
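
    The naming scheme above lends itself to a small parser for batch processing. The sketch below is an illustration only: the exact text used for each field is not given in the description, so the tokens in the example file name are placeholders, the function name `parse_output_name` is hypothetical, and the parser assumes that no field itself contains an underscore.

```python
def parse_output_name(file_name):
    """Split 'Output_[mass flow rate]_[axial position]_[angular position]'.

    The fields are returned as raw strings because their units and
    formatting are not specified in the dataset description.
    """
    stem = file_name[:-4] if file_name.endswith(".mat") else file_name
    parts = stem.split("_")
    if len(parts) != 4 or parts[0] != "Output":
        raise ValueError(f"unexpected file name: {file_name!r}")
    _, mass_flow_rate, axial_position, angular_position = parts
    return {
        "mass_flow_rate": mass_flow_rate,
        "axial_position": axial_position,
        "angular_position": angular_position,
    }


# Placeholder tokens, not real file names from the dataset.
print(parse_output_name("Output_FLOW_AXIAL_ANGLE.mat"))
```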

    Each file consists of several subfields, as follows:

    • Fmsc: gridmat containing the frequencies (in Hz)
      • az: for the azimuthal direction
      • ax: for the parallel direction
    • Kmsc: gridmat containing the wavenumbers (in rad)
      • az: for the azimuthal direction
      • ax: for the parallel direction
    • SSmsc: matrix containing the results of the Two-Power Spectral Density (PSD2P) technique. Each element collects the value of the real-valued mean squared coherence for each corresponding frequency and wavenumber
      • az: for the azimuthal direction
      • ax: for the parallel direction
    • stats: general statistics
      • az: for the azimuthal direction
        • meanphase: mean of the phase difference between the two probe signals (in rad)
        • stdphase: standard deviation of the phase difference between the two probe signals (in rad)
        • fRFTbin: frequency array (in Hz)
        • coherence: real-valued mean squared coherence for the two signals
      • ax: for the axial direction
        • meanphase: mean of the phase difference between the two probe signals (in rad)
        • stdphase: standard deviation of the phase difference between the two probe signals (in rad)
        • fRFTbin: frequency array (in Hz)
        • coherence: real-valued mean squared coherence for the two signals
      • psd: averaged out Power Spectral Densities (PSDs) for each LP (refer to the journal article for the corresponding nomenclatures of the probes)
        • for probe 1 (in V^2/Hz)
        • for probe 2 (in V^2/Hz)
        • for probe 3 (in V^2/Hz)

    The results of the time-resolved analysis, including plasma potential and plasma density for the two mass flow rates inspected, will be added in a future update.

    Citation

    Works using this dataset or any part of it in any form shall cite it as follows.

    The preferred means of citation is to reference the publication associated with this dataset, as soon as it is available.

    Optionally, the dataset may be cited directly by referencing the corresponding DOI: 10.5281/zenodo.13758358.

    Acknowledgments

    This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (project ERC-STG ZARATHUSTRA, grant agreement No 950466). Additionally, F. Terragni was also supported by the FEDER / Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación (grant agreement No PID2020-112796RB-C22), while B. Bayón-Buján enjoyed a grant from the Consejería de Educación, Universidades, Ciencia y Portavocía of the Community of Madrid (grant PEJ-2021-AI/TIC-23158).

  16. Chapter 3 of the Working Group I Contribution to the IPCC Sixth Assessment...

    • data-search.nerc.ac.uk
    Updated Apr 24, 2024
    + more versions
    Cite
    (2024). Chapter 3 of the Working Group I Contribution to the IPCC Sixth Assessment Report - data for Figure 3.40 (v20220614) [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=modes%20of%20variability
    Explore at:
    Dataset updated
    Apr 24, 2024
    Description

    Data for Figure 3.40 from Chapter 3 of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6). Figure 3.40 shows the observed and simulated Atlantic Multidecadal Variability (AMV).

    How to cite this dataset

    When citing this dataset, please include both the data citation below (under 'Citable as') and the following citation for the report component from which the figure originates: Eyring, V., N.P. Gillett, K.M. Achuta Rao, R. Barimalala, M. Barreiro Parrillo, N. Bellouin, C. Cassou, P.J. Durack, Y. Kosaka, S. McGregor, S. Min, O. Morgenstern, and Y. Sun, 2021: Human Influence on the Climate System. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 423–552, doi:10.1017/9781009157896.005.

    Figure subpanels

    The figure has six panels. Files are not separated according to the panels.

    List of data provided

    amv.obs.nc contains
    - Observed SST anomalies associated with the AMV pattern
    - Observed AMV index time series (unfiltered)
    - Observed AMV index time series (low-pass filtered)
    - Taylor statistics of the observed AMV patterns

    amv.hist.cmip6.nc contains
    - Statistical significance of the observed SST anomalies associated with the AMV pattern
    - Simulated SST anomalies associated with the AMV pattern
    - Simulated AMV index time series (unfiltered)
    - Simulated AMV index time series (low-pass filtered)
    - Taylor statistics of the simulated AMV patterns based on CMIP6 historical simulations.

    amv.hist.cmip5.nc contains
    - Simulated SST anomalies associated with the AMV pattern
    - Simulated AMV index time series (unfiltered)
    - Simulated AMV index time series (low-pass filtered)
    - Taylor statistics of the simulated AMV patterns based on CMIP5 historical simulations.

    amv.piControl.cmip6.nc contains
    - Simulated SST anomalies associated with the AMV pattern
    - Simulated AMV index time series (unfiltered)
    - Simulated AMV index time series (low-pass filtered)
    - Taylor statistics of the simulated AMV patterns based on CMIP6 piControl simulations.

    amv.piControl.cmip5.nc contains
    - Simulated SST anomalies associated with the AMV pattern
    - Simulated AMV index time series (unfiltered)
    - Simulated AMV index time series (low-pass filtered)
    - Taylor statistics of the simulated AMV patterns based on CMIP5 piControl simulations.

    Data provided in relation to figure

    Panel a:
    - amv_pattern_obs_ref in amv.obs.nc: shading
    - amv_pattern_obs_signif (dataset = 1) in amv.obs.nc: cross markers

    Panel b:
    - Multimodel ensemble mean of amv_pattern in amv.hist.cmip6.nc: shading, with their sign agreement for hatching

    Panel c:
    - tay_stats (stat = 0, 1) in amv.obs.nc: black dots
    - tay_stats (stat = 0, 1) in amv.hist.cmip6.nc: red crosses, and their multimodel ensemble mean for the red dot
    - tay_stats (stat = 0, 1) in amv.hist.cmip5.nc: blue crosses, and their multimodel ensemble mean for the blue dot

    Panel d:
    - Lag-1 autocorrelation of amv_timeseries_raw in amv.obs.nc: black horizontal lines in left
      • ERSSTv5: dataset = 1
      • HadISST: dataset = 2
      • COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of amv_timeseries_raw in amv.piControl.cmip5.nc: blue open box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of amv_timeseries_raw in amv.piControl.cmip6.nc: red open box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of amv_timeseries_raw in amv.hist.cmip5.nc: blue filled box-whisker in the left
    - Multimodel ensemble mean and percentiles of lag-1 autocorrelation of amv_timeseries_raw in amv.hist.cmip6.nc: red filled box-whisker in the left
    - Lag-10 autocorrelation of amv_timeseries in amv.obs.nc: black horizontal lines in right
      • ERSSTv5: dataset = 1
      • HadISST: dataset = 2
      • COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of amv_timeseries in amv.piControl.cmip5.nc: blue open box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of amv_timeseries in amv.piControl.cmip6.nc: red open box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of amv_timeseries in amv.hist.cmip5.nc: blue filled box-whisker in the right
    - Multimodel ensemble mean and percentiles of lag-10 autocorrelation of amv_timeseries in amv.hist.cmip6.nc: red filled box-whisker in the right

    Panel e:
    - Standard deviation of amv_timeseries_raw in amv.obs.nc: black horizontal lines in left
      • ERSSTv5: dataset = 1
      • HadISST: dataset = 2
      • COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries_raw in amv.piControl.cmip5.nc: blue open box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries_raw in amv.piControl.cmip6.nc: red open box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries_raw in amv.hist.cmip5.nc: blue filled box-whisker in the left
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries_raw in amv.hist.cmip6.nc: red filled box-whisker in the left
    - Standard deviation of amv_timeseries in amv.obs.nc: black horizontal lines in right
      • ERSSTv5: dataset = 1
      • HadISST: dataset = 2
      • COBE-SST2: dataset = 3
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries in amv.piControl.cmip5.nc: blue open box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries in amv.piControl.cmip6.nc: red open box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries in amv.hist.cmip5.nc: blue filled box-whisker in the right
    - Multimodel ensemble mean and percentiles of standard deviation of amv_timeseries in amv.hist.cmip6.nc: red filled box-whisker in the right

    Panel f:
    - amv_timeseries in amv.obs.nc: black curves
      • ERSSTv5: dataset = 1
      • HadISST: dataset = 2
      • COBE-SST2: dataset = 3
    - amv_timeseries in amv.hist.cmip6.nc: 5th-95th percentiles in red shading, multimodel ensemble mean and its 5-95% confidence interval for red curves
    - amv_timeseries in amv.hist.cmip5.nc: 5th-95th percentiles in blue shading, multimodel ensemble mean for blue curve

    CMIP5 is the fifth phase of the Coupled Model Intercomparison Project. CMIP6 is the sixth phase of the Coupled Model Intercomparison Project. SST stands for Sea Surface Temperature.

    Notes on reproducing the figure from the provided data

    Multimodel ensemble means and percentiles of historical simulations of CMIP5 and CMIP6 are calculated after weighting individual members with the inverse of the ensemble size of the same model. ensemble_assign in each file provides the model number to which each ensemble member belongs. This weighting does not apply to the sign agreement calculation. piControl simulations from CMIP5 and CMIP6 consist of a single member from each model, so the weighting is not applied. Multimodel ensemble means of the pattern correlation in Taylor statistics in (c) and the autocorrelation of the index in (d) are calculated via Fisher z-transformation and back transformation.

    Sources of additional information

    The following weblinks are provided in the Related Documents section of this catalogue record:
    - Link to the report component containing the figure (Chapter 3)
    - Link to the Supplementary Material for Chapter 3, which contains details on the input data used in Table 3.SM.1
    - Link to the code for the figure, archived on Zenodo
    - Link to the figure on the IPCC AR6 website
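
    The two averaging rules in the reproduction notes (weighting each member by the inverse of its model's ensemble size, and averaging correlations through the Fisher z-transformation) can be sketched as follows. This is an illustration under assumptions, not the archived figure code: the function names are hypothetical, the inputs are plain Python lists standing in for values read from the NetCDF files, and passing the member weights into the correlation average is one possible reading of how the two rules combine.

```python
import math
from collections import Counter


def member_weights(ensemble_assign):
    """Weight each member by 1 / (number of members from the same model)."""
    sizes = Counter(ensemble_assign)
    return [1.0 / sizes[model] for model in ensemble_assign]


def weighted_mean(values, weights):
    """Weighted average of one value per ensemble member."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fisher_mean_correlation(correlations, weights=None):
    """Average correlations in Fisher z-space, then transform back."""
    if weights is None:
        weights = [1.0] * len(correlations)
    z_values = [math.atanh(r) for r in correlations]   # z = atanh(r)
    return math.tanh(weighted_mean(z_values, weights))  # r = tanh(mean z)


# Toy example: model 1 contributes two members, model 2 contributes one.
assign = [1, 1, 2]
weights = member_weights(assign)
print(weights)
print(weighted_mean([2.0, 4.0, 10.0], weights))
print(fisher_mean_correlation([0.2, 0.4, 0.9], weights))
```

    With these weights every model contributes equally to the multimodel mean regardless of how many members it supplied.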

  17. Processed Synthetic Real-World Data for tristate modelling

    • data.europa.eu
    • data.niaid.nih.gov
    unknown
    Updated Jul 3, 2025
    Cite
    Zenodo (2025). Processed Synthetic Real-World Data for tristate modelling [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7410184?locale=da
    Explore at:
    unknown(10979914) Available download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo http://zenodo.org/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This model learning dataset is created out of the Raw Synthetic RWD raw dataset, including some of the original attributes. It is distributed in JOBLIB files, where .joblib files contain the vectors and _ids.joblib contain the ID of the person from which each vector is extracted. This is useful when the vectors need to be mapped to metadata about the people found in the original raw dataset. Note that corresponds to , or , depending on the dataset. The split places roughly 60% of the people in the training dataset and 20% in each of the validation and testing datasets. The input attributes are the age, the short-term averages and the trends of the current week’s BMI, steps walked, calories burned, sleep quality, mood and water consumption, as well as the previous week’s short-term average and trend of the answer to the health self-assessment question. The outcome to be predicted is a tristate quantized version of the health self-assessment answer to be given in the current week. The dataset is normalized based on the training set. The means and standard deviations used can be found in the train_statistics.joblib file. Finally, the output_descriptions.joblib file contains descriptions of the outcomes to be predicted (not actually needed, since they are included here).
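
    Normalizing "based on the training set" means that the means and standard deviations are computed once from the training vectors and then reused unchanged for the validation and testing vectors. The sketch below illustrates that idea with plain lists. It does not read the .joblib files, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the description does not say which estimator was used.

```python
import statistics


def fit_statistics(train_rows):
    """Per-attribute mean and standard deviation from the training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population SD (assumption)
    return means, stds


def normalize(rows, means, stds):
    """Apply z-score normalization using previously fitted statistics."""
    if any(s == 0 for s in stds):
        raise ValueError("an attribute has zero spread in the training set")
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


train = [[1.0, 10.0], [3.0, 30.0]]
validation = [[5.0, 20.0]]

means, stds = fit_statistics(train)
print(normalize(train, means, stds))
print(normalize(validation, means, stds))  # same statistics, no refitting
```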

  18. Means and standard deviations for session 1 and session 2, mean differences...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Feb 23, 2018
    Cite
    Rebelo-Gonçalves, Ricardo; Gonçalves, Rui S.; Duarte, João P.; Figueiredo, António J.; Baptista, Rafael C.; Martinho, Diogo; Valente-dos-Santos, João; Severino, Vítor; Coelho-e-Silva, Manuel J.; Luz, Leonardo G. O.; Ahmed, Alexis; Vaz, Vasco; Tessitore, Antonio (2018). Means and standard deviations for session 1 and session 2, mean differences between time-moments including 95% confidence intervals, paired t-test and effect size (n = 22). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000630920
    Explore at:
    Dataset updated
    Feb 23, 2018
    Authors
    Rebelo-Gonçalves, Ricardo; Gonçalves, Rui S.; Duarte, João P.; Figueiredo, António J.; Baptista, Rafael C.; Martinho, Diogo; Valente-dos-Santos, João; Severino, Vítor; Coelho-e-Silva, Manuel J.; Luz, Leonardo G. O.; Ahmed, Alexis; Vaz, Vasco; Tessitore, Antonio
    Description

    Means and standard deviations for session 1 and session 2, mean differences between time-moments including 95% confidence intervals, paired t-test and effect size (n = 22).

  19. Gender, Age, and Emotion Detection from Voice

    • kaggle.com
    zip
    Updated May 29, 2021
    Cite
    Rohit Zaman (2021). Gender, Age, and Emotion Detection from Voice [Dataset]. https://www.kaggle.com/rohitzaman/gender-age-and-emotion-detection-from-voice
    Explore at:
    zip(967820 bytes) Available download formats
    Dataset updated
    May 29, 2021
    Authors
    Rohit Zaman
    Description

    Context

    Our target was to predict gender, age and emotion from audio. We found labeled audio datasets on Mozilla and RAVDESS. Using the R programming language, we extracted 20 statistical features and then added the labels to form these datasets. Audio files were collected from "Mozilla Common Voice" and “Ryerson AudioVisual Database of Emotional Speech and Song (RAVDESS)”.

    Content

    The datasets contain 20 feature columns and 1 column denoting the label. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:
    1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
    2) sd - The standard deviation of frequency is a statistical measure that describes a dataset’s dispersion relative to its mean and is calculated as the variance’s square root.
    3) median - The median frequency (in kHz) is the middle value of the list of numbers sorted in ascending or descending order.
    4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. This means that about 25 percent of the data set numbers are below Q1, and about 75 percent are above Q1.
    5) Q75 - The third quartile (in kHz), referred to as Q3, is the central point between the median and the highest values of the distribution.
    6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles.
    7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
    8) kurt - The kurtosis is a statistical measure that determines how much the tails of a distribution differ from the tails of a normal distribution. It is in effect a measure of the outliers present in the data distribution.
    9) sp.ent - The spectral entropy is a measure of signal irregularity that sums up the normalized signal’s spectral power.
    10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. Spectral flatness is usually measured in decibels and offers a way to quantify how tone-like, as opposed to noise-like, a sound is.
    11) mode - The mode frequency is the most frequently observed value in a data set.
    12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum’s center of mass is located.
    13) meanfun - The meanfun is the average of the fundamental frequency measured across the acoustic signal.
    14) minfun - The minfun is the minimum fundamental frequency measured across the acoustic signal.
    15) maxfun - The maxfun is the maximum fundamental frequency measured across the acoustic signal.
    16) meandom - The meandom is the average of the dominant frequency measured across the acoustic signal.
    17) mindom - The mindom is the minimum of the dominant frequency measured across the acoustic signal.
    18) maxdom - The maxdom is the maximum of the dominant frequency measured across the acoustic signal.
    19) dfrange - The dfrange is the range of the dominant frequency measured across the acoustic signal.
    20) modindx - The modindx is the modulation index, which expresses the degree of frequency modulation numerically as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.
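
    Several of the features listed above can be computed directly from a power spectrum. The sketch below is written in Python rather than R and is not the extraction code used for the dataset, which is not shown in the description. It assumes the spectrum is given as parallel lists of frequencies (in kHz) and strictly positive power values, treats the normalized power as a distribution over frequency, and uses the common definition of spectral flatness as the geometric mean of the power divided by its arithmetic mean. The function name `spectral_summary` is hypothetical.

```python
import math


def spectral_summary(freqs_khz, power):
    """Power-weighted summary statistics of a spectrum (power must be > 0)."""
    total = sum(power)
    weights = [p / total for p in power]          # normalized spectrum

    mean = sum(f * w for f, w in zip(freqs_khz, weights))
    sd = math.sqrt(sum(w * (f - mean) ** 2 for f, w in zip(freqs_khz, weights)))

    def quantile(q):
        # First frequency at which the cumulative normalized power reaches q.
        cumulative = 0.0
        for f, w in zip(freqs_khz, weights):
            cumulative += w
            if cumulative >= q:
                return f
        return freqs_khz[-1]

    q25, median, q75 = quantile(0.25), quantile(0.5), quantile(0.75)

    geometric_mean = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic_mean = total / len(power)

    return {
        "meanfreq": mean,
        "sd": sd,
        "median": median,
        "Q25": q25,
        "Q75": q75,
        "IQR": q75 - q25,
        "sfm": geometric_mean / arithmetic_mean,
    }


flat = spectral_summary([1.0, 2.0, 3.0, 4.0, 5.0], [2.0] * 5)
peaked = spectral_summary([1.0, 2.0, 3.0, 4.0, 5.0], [1e-6, 1e-6, 1.0, 1e-6, 1e-6])
print(flat)
print(peaked["sfm"])
```

    A flat, noise-like spectrum gives a flatness close to 1, while a spectrum dominated by a single frequency gives a value close to 0.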

    Acknowledgements

    Gender and Age Audio Data Source: https://commonvoice.mozilla.org/en Emotion Audio Data Source: https://smartlaboratory.org/ravdess/

  20. Means and standard deviations of variables as a function of crime type,...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1more
    Updated Apr 23, 2013
    + more versions
    Cite
    Bastian, Brock; Haslam, Nick; Denson, Thomas F. (2013). Means and standard deviations of variables as a function of crime type, Study 2. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001720292
    Explore at:
    Dataset updated
    Apr 23, 2013
    Authors
    Bastian, Brock; Haslam, Nick; Denson, Thomas F.
    Description

    NOTE: Within rows, values with different superscripts (a, b) are significantly (p<.05) different from each other controlling for familywise error (Scheffé’s test).

