40 datasets found
  1. Supplementary material from "Visual comparison of two data sets: Do people...

    • figshare.com
    xlsx
    Updated Mar 14, 2017
    Cite
    Robin Kramer; Caitlin Telfer; Alice Towler (2017). Supplementary material from "Visual comparison of two data sets: Do people use the means and the variability?" [Dataset]. http://doi.org/10.6084/m9.figshare.4751095.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Mar 14, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Robin Kramer; Caitlin Telfer; Alice Towler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In our everyday lives, we are required to make decisions based upon our statistical intuitions. Often, these involve the comparison of two groups, such as luxury versus family cars and their suitability. Research has shown that the mean difference affects judgements where two sets of data are compared, but the variability of the data has only a minor influence, if any at all. However, prior research has tended to present raw data as simple lists of values. Here, we investigated whether displaying data visually, in the form of parallel dot plots, would lead viewers to incorporate variability information. In Experiment 1, we asked a large sample of people to compare two fictional groups (children who drank ‘Brain Juice’ versus water) in a one-shot design, where only a single comparison was made. Our results confirmed that only the mean difference between the groups predicted subsequent judgements of how much they differed, in line with previous work using lists of numbers. In Experiment 2, we asked each participant to make multiple comparisons, with both the mean difference and the pooled standard deviation varying across data sets they were shown. Here, we found that both sources of information were correctly incorporated when making responses. Taken together, we suggest that increasing the salience of variability information, through manipulating this factor across items seen, encourages viewers to consider this in their judgements. Such findings may have useful applications for best practices when teaching difficult concepts like sampling variation.
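
    For reference, the pooled standard deviation varied in Experiment 2 is conventionally defined as below for two groups of sizes n_1 and n_2 with sample standard deviations s_1 and s_2. This is the textbook definition and is an assumption here; the description does not state which formula the authors used.

```latex
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```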

  2. Sea Surface Temperature (SST) Standard Deviation of Long-term Mean,...

    • catalog.data.gov
    • data.ioos.us
    • +1 more
    Updated Jan 27, 2025
    Cite
    National Center for Ecological Analysis and Synthesis (NCEAS) (Point of Contact) (2025). Sea Surface Temperature (SST) Standard Deviation of Long-term Mean, 2000-2013 - Hawaii [Dataset]. https://catalog.data.gov/dataset/sea-surface-temperature-sst-standard-deviation-of-long-term-mean-2000-2013-hawaii
    Explore at:
    Dataset updated
    Jan 27, 2025
    Dataset provided by
    National Center for Ecological Analysis and Synthesis (NCEAS) (Point of Contact)
    Area covered
    Hawaii
    Description

    Sea surface temperature (SST) plays an important role in a number of ecological processes and can vary over a wide range of time scales, from daily to decadal changes. SST influences primary production, species migration patterns, and coral health. If temperatures are anomalously warm for extended periods of time, drastic changes in the surrounding ecosystem can result, including harmful effects such as coral bleaching. This layer represents the standard deviation of SST (degrees Celsius) of the weekly time series from 2000-2013. Three SST datasets were combined to provide continuous coverage from 1985-2013. The concatenation applies bias adjustment derived from linear regression to the overlap periods of datasets, with the final representation matching the 0.05-degree (~5-km) near real-time SST product. First, a weekly composite, gap-filled SST dataset from the NOAA Pathfinder v5.2 SST 1/24-degree (~4-km), daily dataset (a NOAA Climate Data Record) for each location was produced following Heron et al. (2010) for January 1985 to December 2012. Next, weekly composite SST data from the NOAA/NESDIS/STAR Blended SST 0.1-degree (~11-km), daily dataset was produced for February 2009 to October 2013. Finally, a weekly composite SST dataset from the NOAA/NESDIS/STAR Blended SST 0.05-degree (~5-km), daily dataset was produced for March 2012 to December 2013. The standard deviation of the long-term mean SST was calculated by taking the standard deviation over all weekly data from 2000-2013 for each pixel.
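
    The per-pixel calculation described in the last sentence can be sketched as follows. This is a minimal illustration, not the producers' actual processing code: the array layout (time, row, column), the function name, the toy values, and the choice of the population standard deviation (`ddof=0`) are all assumptions.

```python
import numpy as np

def sst_std_per_pixel(weekly_sst: np.ndarray) -> np.ndarray:
    """Standard deviation over the time axis (axis 0) for each pixel.

    NaN entries (gaps) are ignored. Uses NumPy's default ddof=0, which is
    an assumption: the source does not say which estimator was used.
    """
    return np.nanstd(weekly_sst, axis=0)

# Hypothetical toy stack: 4 weekly composites on a 1 x 2 grid (degrees Celsius).
weekly = np.array([
    [[25.0, 26.0]],
    [[27.0, 26.0]],
    [[25.0, 26.0]],
    [[27.0, np.nan]],  # one missing value in the second pixel
])

sst_std = sst_std_per_pixel(weekly)
print(sst_std)
```

    Reducing along the time axis keeps the spatial grid intact, so the output can be written back as a single raster layer.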

  3. AVISO Level 4 Absolute Dynamic Topography for Climate Model Comparison...

    • catalog.data.gov
    • data.nasa.gov
    • +1 more
    Updated May 28, 2025
    Cite
    AVISO;NASA/JPL/PODAAC (2025). AVISO Level 4 Absolute Dynamic Topography for Climate Model Comparison Standard Error [Dataset]. https://catalog.data.gov/dataset/aviso-level-4-absolute-dynamic-topography-for-climate-model-comparison-standard-error
    Explore at:
    Dataset updated
    May 28, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    These data are the standard error calculated from the AVISO Level 4 Absolute Dynamic Topography for Climate Model Comparison Number of Observations data set (in PO.DAAC Drive at https://podaac-tools.jpl.nasa.gov/drive/files/allData/aviso/L4/abs_dynamic_topo). This data set is not meant to be used alone, but with the absolute dynamic topography data. These data were generated to help support the CMIP5 (Coupled Model Intercomparison Project Phase 5) portion of PCMDI (Program for Climate Model Diagnosis and Intercomparison). The dynamic topography is derived from sea surface height measured by several satellites (Envisat, TOPEX/Poseidon, Jason-1, and OSTM/Jason-2) and referenced to the geoid. These data were provided by AVISO (French space agency data provider) and are based on a similar dynamic topography data set that AVISO already produces (http://www.aviso.oceanobs.com/index.php?id=1271).

  4. Datasets from an interlaboratory comparison to characterize a multi-modal...

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). Datasets from an interlaboratory comparison to characterize a multi-modal polydisperse sub-micrometer bead dispersion [Dataset]. https://catalog.data.gov/dataset/datasets-from-an-interlaboratory-comparison-to-characterize-a-multi-modal-polydisperse-sub
    Explore at:
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    These four data files contain datasets from an interlaboratory comparison that characterized a polydisperse five-population bead dispersion in water. A more detailed version of this description is available in the ReadMe file (PdP-ILC_datasets_ReadMe_v1.txt), which also includes definitions of abbreviations used in the data files. Paired samples were evaluated, so the datasets are organized as pairs associated with a randomly assigned laboratory number. The datasets are organized in the files by instrument type: PTA (particle tracking analysis), RMM (resonant mass measurement), ESZ (electrical sensing zone), and OTH (other techniques not covered in the three largest groups, including holographic particle characterization, laser diffraction, flow imaging, and flow cytometry). In the OTH group, the specific instrument type for each dataset is noted. Each instrument type (PTA, RMM, ESZ, OTH) has a dedicated file. Included in the data files for each dataset are: (1) the cumulative particle number concentration (PNC, (1/mL)); (2) the concentration distribution density (CDD, (1/mL·nm)) based upon five bins centered at each particle population peak diameter; (3) the CDD in higher resolution, varied-width bins. The lower-diameter bin edge (µm) is given for (2) and (3). Additionally, the PTA, RMM, and ESZ files each contain unweighted mean cumulative particle number concentrations and concentration distribution densities calculated from all datasets reporting values. The associated standard deviations and standard errors of the mean are also given. In the OTH file, the means and standard deviations were calculated using only data from one of the sub-groups (holographic particle characterization) that had n = 3 paired datasets. Where necessary, datasets not using the common bin resolutions are noted (PTA, OTH groups). 
    The data contained here are presented and discussed in a manuscript to be submitted to the Journal of Pharmaceutical Sciences and presented as part of that scientific record.

  5. ‘Datasets from an interlaboratory comparison to characterize a multi-modal...

    • analyst-2.ai
    Updated Jan 27, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Datasets from an interlaboratory comparison to characterize a multi-modal polydisperse sub-micrometer bead dispersion’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-datasets-from-an-interlaboratory-comparison-to-characterize-a-multi-modal-polydisperse-sub-micrometer-bead-dispersion-1603/c07ee221/?iid=004-407&v=presentation
    Explore at:
    Dataset updated
    Jan 27, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Datasets from an interlaboratory comparison to characterize a multi-modal polydisperse sub-micrometer bead dispersion’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/7f7e5222-e579-486e-b5d7-c02d511d1964 on 27 January 2022.

    --- Dataset description provided by original source is as follows ---

    These four data files contain datasets from an interlaboratory comparison that characterized a polydisperse five-population bead dispersion in water. A more detailed version of this description is available in the ReadMe file (PdP-ILC_datasets_ReadMe_v1.txt), which also includes definitions of abbreviations used in the data files. Paired samples were evaluated, so the datasets are organized as pairs associated with a randomly assigned laboratory number. The datasets are organized in the files by instrument type: PTA (particle tracking analysis), RMM (resonant mass measurement), ESZ (electrical sensing zone), and OTH (other techniques not covered in the three largest groups, including holographic particle characterization, laser diffraction, flow imaging, and flow cytometry). In the OTH group, the specific instrument type for each dataset is noted. Each instrument type (PTA, RMM, ESZ, OTH) has a dedicated file. Included in the data files for each dataset are: (1) the cumulative particle number concentration (PNC, (1/mL)); (2) the concentration distribution density (CDD, (1/mL·nm)) based upon five bins centered at each particle population peak diameter; (3) the CDD in higher resolution, varied-width bins. The lower-diameter bin edge (µm) is given for (2) and (3). Additionally, the PTA, RMM, and ESZ files each contain unweighted mean cumulative particle number concentrations and concentration distribution densities calculated from all datasets reporting values. The associated standard deviations and standard errors of the mean are also given. In the OTH file, the means and standard deviations were calculated using only data from one of the sub-groups (holographic particle characterization) that had n = 3 paired datasets. Where necessary, datasets not using the common bin resolutions are noted (PTA, OTH groups). 
    The data contained here are presented and discussed in a manuscript to be submitted to the Journal of Pharmaceutical Sciences and presented as part of that scientific record.

    --- Original source retains full ownership of the source dataset ---

  6. Benchmark Multi-Omics Datasets for Methods Comparison

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Nov 14, 2021
    Cite
    Gabriel Odom; Lily Wang (2021). Benchmark Multi-Omics Datasets for Methods Comparison [Dataset]. http://doi.org/10.5281/zenodo.5683002
    Explore at:
    Available download formats: bin, zip
    Dataset updated
    Nov 14, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gabriel Odom; Lily Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathway Multi-Omics Simulated Data

    These are synthetic variations of the TCGA COADREAD data set (original data available at http://linkedomics.org/data_download/TCGA-COADREAD/). This data set is used as a comprehensive benchmark data set to compare multi-omics tools in the manuscript "pathwayMultiomics: An R package for efficient integrative analysis of multi-omics datasets with matched or un-matched samples".

    There are 100 sets (stored as 100 sub-folders, the first 50 in "pt1" and the second 50 in "pt2") of random modifications to centred and scaled copy number, gene expression, and proteomics data saved as compressed data files for the R programming language. These data sets are stored in subfolders labelled "sim001", "sim002", ..., "sim100". Each folder contains the following contents: (1) "indicatorMatricesXXX_ls.RDS" is a list of simple triplet matrices showing which genes (in which pathways) and which samples received the synthetic treatment (where XXX is the simulation run label: 001, 002, ...), (2) "CNV_partitionA_deltaB.RDS" is the synthetically modified copy number variation data (where A represents the proportion of genes in each gene set to receive the synthetic treatment [partition 1 is 20%, 2 is 40%, 3 is 60% and 4 is 80%] and B is the signal strength in units of standard deviations), (3) "RNAseq_partitionA_deltaB.RDS" is the synthetically modified gene expression data (same parameter legend as CNV), and (4) "Prot_partitionA_deltaB.RDS" is the synthetically modified protein expression data (same parameter legend as CNV).

    Supplemental Files

    The file "cluster_pathway_collection_20201117.gmt" is the collection of gene sets used for the simulation study in Gene Matrix Transpose format. Scripts to create and analyze these data sets are available at: https://github.com/TransBioInfoLab/pathwayMultiomics_manuscript_supplement

  7. Standard Errors for Difference-in-Difference Regression (replication data)

    • journaldata.zbw.eu
    zip
    Updated Dec 11, 2024
    Cite
    Bruce Hansen (2024). Standard Errors for Difference-in-Difference Regression (replication data) [Dataset]. http://doi.org/10.15456/jae.2024337.1643252147
    Explore at:
    Available download formats: zip (1269092)
    Dataset updated
    Dec 11, 2024
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Bruce Hansen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All replication code for the above paper. This includes general-purpose R and Stata code, all simulation code, all empirical data sets, and R and Stata code to replicate the empirical results.

  8. Data from: SeaWIFS K490 Standard Deviation

    • researchdata.edu.au
    • devweb.dga.links.com.au
    • +1 more
    Updated Jul 31, 2008
    Cite
    Australian Ocean Data Network (2008). SeaWIFS K490 Standard Deviation [Dataset]. https://researchdata.edu.au/seawifs-k490-standard-deviation/687001
    Explore at:
    Dataset updated
    Jul 31, 2008
    Dataset provided by
    Australian Ocean Data Network
    Description

    This data set contains the standard deviation of SeaWIFS K490 generated from the climatology monthly means; the monthly climatologies represent the mean values for each month across the whole dataset time series. K490 indicates the turbidity of the water column: how visible light in the blue-green region of the spectrum penetrates the water column. It is directly related to the presence of scattering particles in the water column. The data are received as monthly composites, with a 4 km resolution, and are constrained to the region between 90E and 180E, and 10N to 60S. The data were sourced from http://oceancolor.gsfc.nasa.gov/SeaWiFS/. This dataset is a contribution to the CERF Marine Biodiversity Hub.

  9. High School Heights Dataset

    • kaggle.com
    Updated Aug 11, 2022
    Cite
    Yashmeet Singh (2022). High School Heights Dataset [Dataset]. https://www.kaggle.com/datasets/yashmeetsingh/high-school-heights-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 11, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yashmeet Singh
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    High School Heights Dataset

    You will find three datasets containing heights of the high school students.

    All heights are in inches.

    The data is simulated. The heights are generated from a normal distribution with different sets of mean and standard deviation for boys and girls.

    Height Statistics (inches)    Boys    Girls
    Mean                          67      62
    Standard Deviation            2.9     2.2

    There are 500 measurements for each gender.

    Here are the datasets:

    • hs_heights.csv: contains a single column with heights for all boys and girls. There's no way to tell which of the values are for boys and which ones are for girls.

    • hs_heights_pair.csv: has two columns. The first column has boys' heights. The second column contains girls' heights.

    • hs_heights_flag.csv: has two columns. The first column has the flag is_girl. The second column contains a girl's height if the flag is 1. Otherwise, it contains a boy's height.

    To see how I generated this dataset, check this out: https://github.com/ysk125103/datascience101/tree/main/datasets/high_school_heights
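
    The generation process described above can be approximated with the sketch below. It is an independent illustration, not the author's script (which lives at the link above): the seed, variable names, and in-memory layouts are assumptions; only the means, standard deviations, sample size, and the three file layouts come from the description.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, chosen for repeatability
N = 500  # measurements per gender, as stated in the description

# Heights in inches, drawn from the normal distributions in the table above.
boys = rng.normal(loc=67.0, scale=2.9, size=N)
girls = rng.normal(loc=62.0, scale=2.2, size=N)

# Layout of hs_heights_pair.csv: column 1 boys, column 2 girls.
pair = np.column_stack([boys, girls])

# Layout of hs_heights_flag.csv: is_girl flag, then the height.
is_girl = np.concatenate([np.zeros(N, dtype=int), np.ones(N, dtype=int)])
heights = np.concatenate([boys, girls])
flag = np.column_stack([is_girl, heights])

# Layout of hs_heights.csv: one unlabeled column; shuffled so that
# row order gives no hint about gender.
combined = rng.permutation(heights)

print(boys.mean(), boys.std(ddof=1))
print(girls.mean(), girls.std(ddof=1))
```

    With 500 draws per group, the sample means and standard deviations should land close to, but not exactly on, the table values.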

  10. SPI, KUH, and MCM derived lidar T datasets

    • data.niaid.nih.gov
    Updated Jan 21, 2020
    Cite
    A.A. Kutepov (2020). SPI, KUH, and MCM derived lidar T datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1306735
    Explore at:
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    J. Hoeffner
    A. Feofilov
    X. Chu
    J.M. Russell
    X. Lu
    E.C.M. Dawkins
    M. Mlynczak
    L. Rezac
    A.A. Kutepov
    D. Janches
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Spī Kōh
    Description

    These datasets are used within Dawkins et al., 'Validation of SABER v2.0 Operational Temperature Data with Ground-based Lidars in the Mesosphere-Lower Thermosphere Region (75-105 km).' The datasets correspond to the derived Spitsbergen, Kühlungsborn, and McMurdo temperature (T, units: K) datasets presented within the paper. Each text file contains header information necessary for interpreting the data, with each following a standard format.

    For each season, the dataset comprises 31 rows and 5 columns. All rows correspond to altitudes of 75-105 km, at 1 km increments.

    Column 1 = lidar mean for season [Units=K]
    Column 2 = lidar standard deviation of mean for season [Units=K]
    Column 3 = combined SABER and lidar error for season [Units=K]
    Column 4 = standard deviation of individual (SABER-lidar) colocated pairs [Units=K]
    Column 5 = T difference of seasonal means (SABER-lidar) [Units=K]

    A dataset presenting Figure 12 is also included (n rows, 2 columns). Column 1: all SABER T data for all altitudes. Column 2: corresponding lidar T data. The n rows correspond to the number of SABER-lidar T data pairs. All units: K.

    DJF = December-January-February.

    MAM = March-April-May.

    JJA = June-July-August.

    SON = September-October-November.

  11. Performance (mean ± standard deviation) comparison among all competing...

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Liye Wang; Chong-Yaw Wee; Heung-Il Suk; Xiaoying Tang; Dinggang Shen (2023). Performance (mean ± standard deviation) comparison among all competing methods. [Dataset]. http://doi.org/10.1371/journal.pone.0117295.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Liye Wang; Chong-Yaw Wee; Heung-Il Suk; Xiaoying Tang; Dinggang Shen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance (mean ± standard deviation) comparison among all competing methods. The prefix ‘S’ denotes the use of a single-kernel SVR. (CC: Correlation Coefficient; RMSE: Root Mean Square Error)

  12. Replication data for: A Bootstrap Method for Conducting Statistical...

    • datasearch.gesis.org
    • dataverse.harvard.edu
    • +1 more
    Updated Jan 22, 2020
    Cite
    Harden, Jeffrey (2020). Replication data for: A Bootstrap Method for Conducting Statistical Inference with Clustered Data [Dataset]. https://datasearch.gesis.org/dataset/httpsdataverse.unc.eduoai--hdl1902.2911599
    Explore at:
    Dataset updated
    Jan 22, 2020
    Dataset provided by
    Odum Institute Dataverse Network
    Authors
    Harden, Jeffrey
    Description

    State politics researchers often analyze data with observations grouped into clusters. This structure commonly produces unmodeled correlation within clusters, leading to downward bias in the standard errors of regression coefficients. Estimating robust cluster standard errors (RCSE) is a common approach to correcting this bias. However, despite their frequent use, recent work indicates that RCSE can also be biased downward. Here I show evidence of that bias and offer a potential solution. Through Monte Carlo simulation of an Ordinary Least Squares (OLS) regression model, I compare conventional standard error (OLS-SE) and RCSE performance to that of a bootstrap method that resamples clusters of observations (BCSE). I show that both OLS-SE and RCSE are biased downward, with OLS-SE being the most biased. In contrast, BCSE are not biased and consistently outperform the other two methods. I conclude with three replications from recent work and offer recommendations to researchers.
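
    The idea of resampling whole clusters rather than individual observations can be sketched as follows. This is a generic pairs cluster bootstrap written for illustration, not the author's replication code: the function names, the number of replicates, the seeds, and the simulated data are all assumptions.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cluster_bootstrap_se(X, y, clusters, n_boot=500, seed=0):
    """Bootstrap standard errors that resample clusters with replacement."""
    rng = np.random.default_rng(seed)
    ids = np.unique(clusters)
    rows_by_cluster = [np.flatnonzero(clusters == c) for c in ids]
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Draw as many clusters as in the sample; keep all rows of each draw.
        picked = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_cluster[i] for i in picked])
        draws[b] = ols(X[rows], y[rows])
    # Spread of the coefficient estimates across bootstrap replicates.
    return draws.std(axis=0, ddof=1)

# Hypothetical data: 30 clusters of 10 observations with a cluster-level
# regressor and a shared cluster-level error component.
rng = np.random.default_rng(1)
G, m = 30, 10
clusters = np.repeat(np.arange(G), m)
x = np.repeat(rng.normal(size=G), m)
u = np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(G * m), x])

bcse = cluster_bootstrap_se(X, y, clusters)
print(bcse)
```

    Keeping every row of a drawn cluster together preserves the within-cluster correlation in each replicate, which is what lets the bootstrap spread reflect it.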

  13. NZ Seabed Geomorphology - BTM - Standard deviation

    • hub.arcgis.com
    • doc-marine-data-deptconservation.hub.arcgis.com
    Updated Sep 1, 2022
    Cite
    DOC_admin (2022). NZ Seabed Geomorphology - BTM - Standard deviation [Dataset]. https://hub.arcgis.com/documents/18c8fb8623ba4ba0b5bed43f5dc5ffac
    Explore at:
    Dataset updated
    Sep 1, 2022
    Dataset authored and provided by
    DOC_admin
    Area covered
    New Zealand
    Description

    BTM Standard deviation: this mosaic dataset is part of a series of seafloor terrain datasets aimed at providing a consistent baseline to help users characterize Aotearoa New Zealand seafloor habitats consistently. This series has been developed using the tools provided within the Benthic Terrain Model (BTM [v3.0]) across different multibeam echo-sounder (MBES) datasets. The series includes derived outputs from 50 MBES survey sets conducted between 1999 and 2020 from throughout the New Zealand marine environment (where available), covering an area of approximately 52,000 km². Consistency and compatibility of the benthic terrain datasets have been achieved by utilising a common projected coordinate system (WGS84 Web Mercator), a common resolution (10 m), and a standard classification dictionary (also utilised by previous BTM studies in NZ). However, we advise caution when comparing the classification between different survey areas.

    Derived BTM outputs include the Bathymetric Position Index (BPI); Surface Derivative; Rugosity; Depth Statistics; Terrain Classification. A standardised digital surface model, and derived hillshade and aspect datasets, have also been made available. The index of the original MBES survey surface models used in this analysis can be accessed from https://data.linz.govt.nz/layer/95574-nz-bathymetric-surface-model-index/

    The full report and description of available output datasets are available at: https://www.doc.govt.nz/globalassets/documents/science-and-technical/drds367entire.pdf

  14. A monthly air temperature and precipitation gridded dataset on 0.025°...

    • doi.pangaea.de
    html, tsv
    Updated Nov 5, 2018
    Cite
    Fahu Chen; Hong Zhao; Wei Huang; Xian Wu; Yaowei Xie; Song Feng (2018). A monthly air temperature and precipitation gridded dataset on 0.025° spatial resolution in China during 1951-2011 [Dataset]. http://doi.org/10.1594/PANGAEA.895742
    Explore at:
    Available download formats: tsv, html
    Dataset updated
    Nov 5, 2018
    Dataset provided by
    PANGAEA
    Authors
    Fahu Chen; Hong Zhao; Wei Huang; Xian Wu; Yaowei Xie; Song Feng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    File name, File size, File format, File content, Uniform resource locator/link to file
    Description

    Monthly air temperature from 1153 stations and precipitation from 1202 stations in China and neighboring countries were collected to construct a monthly climate dataset for China at 0.025° resolution (approximately 2.5 km), named the LZU0025 dataset and designed by Lanzhou University (LZU), using a partial thin plate smoothing method embedded in the ANUSPLIN software. The accuracy of LZU0025 was evaluated in three ways:

    1) Diagnostic statistics from the surface fitting model for the period 1951-2011. Results show a low mean square root of generalized cross validation (RTGCV) for the monthly air temperature surface (1.1 °C) and for the monthly precipitation surface (2 mm^(1/2)), where the square root of precipitation was interpolated. This indicates exact surface fitting models.

    2) Error statistics based on data from 265 withheld stations for the period 1951-2011. Results show that predicted values closely tracked true values, with a mean absolute error (MAE) of 0.6 °C and 4 mm and a standard deviation of mean error (STD) of 1.3 °C and 5 mm; monthly STDs changed consistently with RTGCV.

    3) Comparisons with other datasets in two ways. The first compared three indices (the standard deviation, mean, and time trend) derived from all datasets against a reference dataset released by the China Meteorological Administration (CMA) in Taylor diagrams; the second compared LZU0025 with the Camp Tibet dataset in a remote mountainous area. The Taylor diagrams showed that the standard deviation derived from LZU had a higher correlation with that derived from CMA (Pearson correlation R=0.76 for air temperature and R=0.96 for precipitation). The standard deviation for this index derived from LZU was closer to that derived from CMA, and the centered normalized root-mean-square difference for this index between LZU and CMA was lower. The same superior performance of LZU was found when comparing the mean and time trend indices derived from LZU with those derived from other datasets. LZU0025 had a high correlation with the Camp dataset for air temperature, despite an insignificant correlation for precipitation at a few stations.

    Based on the above analyses, LZU0025 was concluded to be a reliable dataset.

  15. Normalized Difference Water Index (NDWI) - Seasonal Mean - Switzerland

    • geonetwork.swissdatacube.org
    doi
    Updated Sep 17, 2019
    Cite
    Université de Genève (2019). Normalized Difference Water Index (NDWI) - Seasonal Mean - Switzerland [Dataset]. https://geonetwork.swissdatacube.org/geonetwork/srv/api/records/af697b57-cf0a-4dc3-aa46-b394cf9f8c72
    Explore at:
    Available download formats: doi
    Dataset updated
    Sep 17, 2019
    Dataset authored and provided by
    Université de Genève
    Time period covered
    Mar 28, 1984 - Jun 15, 2019
    Description

    This dataset is a seasonal time-series of Landsat Analysis Ready Data (ARD)-derived Normalized Difference Water Index (NDWI) computed from Landsat 5 Thematic Mapper (TM) and Landsat 8 Operational Land Imager (OLI). To ensure a consistent dataset, Landsat 7 has not been used because the Scan Line Corrector (SLC) failure creates gaps in the data. NDWI quantifies plant water content by measuring the difference between Near-Infrared (NIR) and Short Wave Infrared (SWIR) (or Green) channels using this generic formula: (NIR - SWIR) / (NIR + SWIR). For Landsat sensors, this corresponds to the following bands: Landsat 5, NDWI = (Band 4 - Band 2) / (Band 4 + Band 2); Landsat 8, NDWI = (Band 5 - Band 3) / (Band 5 + Band 3). NDWI values range from -1 to +1. NDWI is a good proxy for plant water stress and is therefore useful for drought monitoring and early warning. NDWI is sometimes also referred to as the Normalized Difference Moisture Index (NDMI). Standard Deviation is provided in a separate dataset for each time step.

    Spring: March-April-May (_MAM)
    Summer: June-July-August (_JJA)
    Autumn: September-October-November (_SON)
    Winter: December-January-February (_DJF)

    Data format: GeoTiff. This dataset has been generated with the Swiss Data Cube (http://www.swissdatacube.ch).
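
    The generic formula quoted above can be sketched in a few lines. This is a minimal illustration, not the Swiss Data Cube implementation: the function name, the reflectance values, and the choice to return NaN where the denominator is zero are assumptions.

```python
import numpy as np

def ndwi(nir, swir):
    """Compute (NIR - SWIR) / (NIR + SWIR) element-wise.

    Pixels where NIR + SWIR equals zero are returned as NaN instead of
    raising a division warning (an assumed convention).
    """
    nir = np.asarray(nir, dtype=float)
    swir = np.asarray(swir, dtype=float)
    denom = nir + swir
    out = np.full(denom.shape, np.nan)
    np.divide(nir - swir, denom, out=out, where=denom != 0)
    return out

# Hypothetical surface reflectance values for three pixels.
nir_band = np.array([0.4, 0.1, 0.0])
swir_band = np.array([0.2, 0.3, 0.0])

index = ndwi(nir_band, swir_band)
print(index)
```

    For non-negative reflectances the result stays within -1 to +1, matching the range stated in the description.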

  16. Dataset: A Labeled Dataset for Osteoporosis Screening Based on...

    • zenodo.org
    csv
    Updated May 13, 2025
    Cite
    D. A. Carvalho Dionísio; Fernandes Felipe (2025). Dataset: A Labeled Dataset for Osteoporosis Screening Based on Electromagnetic Attenuation [Dataset]. http://doi.org/10.5281/zenodo.14259374
    Available download formats: csv
    Dataset updated
    May 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    D. A. Carvalho Dionísio; D. A. Carvalho Dionísio; Fernandes Felipe; Fernandes Felipe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    README

    Dataset name: osseus_dataset.csv

    Version: 1.0

    Dataset period: 07/01/2021 - 09/31/2023

    Dataset Characteristics: Multivalued

    Number of Instances: 669

    Number of Attributes: 31

    Missing Values: yes

    Area(s): Health and technology

    Sources:

    • Electronic Patient Record (EPR) - University Hospital Onofre Lopes of Federal University of Rio Grande do Norte (HUOL/UFRN), Brazil;

    • OSSEUS (Osteoporosis screening based on electromagnetic waves); and,

    • DXA (Dual-energy x-ray absorptiometry).

    Description: The dataset “osseus_dataset.csv” (Table 1) contains elementary data related to risk factors and examinations performed by individuals in Rio Grande do Norte, Brazil, to investigate bone mineral density. Data were collected using the EPR of HUOL/UFRN, DXA, and OSSEUS, a low-cost device based on electromagnetic waves, which measures the attenuation of the signal when crossing the medial phalanx of the middle finger (PINHEIRO et al., 2021, ALBUQUERQUE et al., 2022).

    Table 1: Description of Dataset Features.

    Electronic Patient Record (EPR)

    | Attribute | Description | Data type | Value |
    |---|---|---|---|
    | id | Unique identifier for a person (anonymous). | Categorical | Person unique identifier. |
    | gender | It informs the person's gender. | Categorical | female; male |
    | age | It informs the person's age. | Numerical | Integer value for age |
    | weight | Informs the value referring to the person's weight, in kilograms (kg). | Numerical | Integer value for weight |
    | height | Informs the value relating to the person's height, in centimeters (cm). | Numerical | Integer value for height |
    | ethnicity | Informs the person's ethnicity. | Categorical | black; brown; white |
    | target | Describe the person's diagnosis or medical report. | Categorical | normal; osteoporosis; low bone mineral density |
    | alcohol | It informs whether the person consumes alcoholic beverages. | Categorical | yes; no |
    | smoking | Informs whether the person is a smoker. | Categorical | yes; no |
    | activity | It informs whether the person practices physical activities. | Categorical | yes; no |
    | milk | It informs whether the person consumes dairy drinks. | Categorical | yes; no |
    | calcium | It informs whether the person uses calcium. | Categorical | yes; no |
    | vitamin_d | It informs whether the person uses Vitamin D. | Categorical | yes; no |
    | fall | It informs whether the person has a history of falling. | Categorical | yes; no |
    | parents_osteoporosis | It informs whether the person has a family history of osteoporosis. | Categorical | yes; no |
    | parents_curved | It informs whether the person has a family history of "parents curved." | Categorical | yes; no |
    | corticosteroids | It informs whether the person uses corticosteroid-type medications for three months or longer. | Categorical | yes; no |
    | arthritis | Informs if the person has arthritis. | Categorical | yes; no |
    | diseases | Informs if the person has comorbidities. | Categorical | yes; no |
    | menopause | Informs if the person has menopause. | Categorical | yes; no |
    | testosterone | It informs whether the person uses testosterone. | Categorical | yes; no |

    OSSEUS (Osteoporosis screening based on electromagnetic waves)

    | Attribute | Description | Data type | Value |
    |---|---|---|---|
    | medial_length | Length of the medial phalanx in mm. | Numerical | Integer value for length. |
    | medial_height | Height of the medial phalanx in mm. | Numerical | Integer value for height. |
    | medial_width | Width of the medial phalanx in mm. | Numerical | Integer value for width. |
    | calibration | Osseus signal strength with no obstacle between the antennas. | Numerical | Float value for calibration. |
    | attenuation | Osseus signal strength with obstacles between antennas. | Numerical | Float value for attenuation. |

    DXA (Dual-energy x-ray absorptiometry)

    | Attribute | Description | Data type | Value |
    |---|---|---|---|
    | spine_deviation | Reports the spinal standard deviation score that represents the difference between bone density and the expected value. | Numerical | Float value for deviation. |
    | femur_deviation | Reports the femur standard deviation score that represents the difference between bone density and the expected value. | Numerical | Float value for deviation. |
    | body_deviation | Reports the full body standard deviation score that represents the difference between bone density and the expected value. | Numerical | Float value for deviation. |
    | forearm_deviation | Reports the forearm standard deviation score that represents the difference between bone density and the expected value. | Numerical | Float value for deviation. |
    | worst_deviation | Reports the worst standard deviation score among all deviations obtained from the record. | Numerical | Float value for deviation. |
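
    To make the relationship between the DXA attributes concrete, here is a small sketch of how a "worst" score could be derived from the four site-specific deviation columns. It assumes that "worst" means the lowest (most negative) score and that missing values appear as empty CSV fields; neither point is stated in the README. The two sample rows are invented for illustration and do not come from osseus_dataset.csv.

```python
import csv
import io

DEVIATION_COLUMNS = [
    "spine_deviation",
    "femur_deviation",
    "body_deviation",
    "forearm_deviation",
]


def worst_deviation(row):
    """Return the lowest available deviation score, or None if all are missing."""
    scores = [
        float(row[column])
        for column in DEVIATION_COLUMNS
        if row.get(column) not in ("", None)  # skip missing values
    ]
    return min(scores) if scores else None


# Hypothetical rows using the column names from Table 1.
sample = io.StringIO(
    "id,spine_deviation,femur_deviation,body_deviation,forearm_deviation\n"
    "p1,-1.2,-2.6,,-0.4\n"
    "p2,,,,\n"
)
rows = list(csv.DictReader(sample))
worst = {row["id"]: worst_deviation(row) for row in rows}
print(worst)
```

    Comparing such a derived value against the published worst_deviation column would be one way to check whether the assumption about "worst" holds for the real file.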

    REFERENCES
    Albuquerque, G. et al. A method based on non-ionizing microwave radiation for ancillary diagnosis of osteoporosis: a pilot study. BioMedical Eng. OnLine 21, 70, https://doi.org/10.1186/s12938-022-01038-y (2022).

    Pinheiro, B. d. M. et al. The influence of antenna gain and beamwidth used in osseus in the screening process for osteoporosis. Sci. Reports 11, 19148, https://doi.org/10.1038/s41598-021-98204-4 (2021).

  17. Supporting data for 'DFENS: Diffusion chronometry using Finite Elements and...

    • data-search.nerc.ac.uk
    html
    Updated Feb 5, 2021
    British Geological Survey (2021). Supporting data for 'DFENS: Diffusion chronometry using Finite Elements and Nested Sampling' (NERC Grant NE/L002507/1) [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/api/records/ba70f4a5-fb5a-303c-e054-002128a47908
    Available download formats: html
    Dataset updated
    Feb 5, 2021
    Dataset authored and provided by
    British Geological Survey (https://www.bgs.ac.uk/)
    License

    http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations

    Time period covered
    Oct 1, 2014 - Jan 26, 2021
    Description

    This is supporting data for the manuscript entitled 'DFENS: Diffusion chronometry using Finite Elements and Nested Sampling' by E. J. F. Mutch, J. Maclennan, O. Shorttle, J. F. Rudge and D. Neave. Preprint here: https://doi.org/10.1002/essoar.10503709.1

    Data Set S1. ds01.csv: Electron probe microanalysis (EPMA) profile data of olivine crystals used in this study. Standard deviations are averaged values of standard deviations from counting statistics and repeat measurements of secondary standards.

    Data Set S2. ds02.csv: Plagioclase compositional profiles used in this study, including SIMS, EPMA and step scan data. Standard deviations for EPMA analyses are averaged values of standard deviations from counting statistics and repeat measurements of secondary standards. Standard deviations for SIMS and step scan analyses are based on analytical precision of secondary standards.

    Data Set S3. ds03.csv: Angles between the EPMA profile and the main olivine crystallographic axes measured by electron backscatter diffraction (EBSD). 'angle100X' is the angle between the [100] crystallographic axis and the x direction of the EBSD map, 'angle100Y' is the angle between the [100] crystallographic axis and the y direction of the EBSD map, and 'angle100Z' is the angle between the [100] crystallographic axis and the z direction in the EBSD map, etc. 'angle100P' is the angle between the EPMA profile and the [100] crystallographic axis, 'angle010P' is the angle between the EPMA profile and the [010] crystallographic axis, and 'angle001P' is the angle between the EPMA profile and the [001] crystallographic axis. All angles are in degrees.

    Data Set S4. ds04.csv: Median timescales and 1 sigma errors from the olivine crystals of this study. The +1 sigma (days) is the quantile value calculated at 0.841 (i.e. 0.5 + (0.6826 / 2)). The -1 sigma (days) is therefore the quantile calculated at approximately 0.159 (i.e. 1 - 0.841). The 2 sigma values are calculated in the same way, using 0.5 + (0.95 / 2). The value quoted as the +1 sigma (error) is the difference between the upper 1 sigma quantile and the median. Likewise, the -1 sigma (error) is the difference between the median and the lower 1 sigma quantile.

    Data Set S5. ds05.xlsx: Median timescales and 1 sigma errors from the plagioclase crystals of this study. Results from each of the parameterisations of the Mg-in-plagioclase diffusion data are included: Faak et al. (2013), Van Orman et al. (2014) and a combined expression.

    Data Set S6. ds06.xlsx: Spreadsheet containing the regression parameters and covariance matrices used in this study and in Mutch et al. (2019). Additional versions of the olivine regressions where the ln fO2 is expressed in Pa have been made for completeness. We recommend using the versions where ln fO2 is expressed in its native form (bars).
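
    The quantile-based error definition described for Data Set S4 can be sketched as follows. This is a minimal illustration in plain Python, not the DFENS code: the helper names, the linear-interpolation quantile rule, and the evenly spaced sample values are all assumptions made for the example.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def median_with_errors(samples, coverage=0.6826):
    """Return (median, plus_error, minus_error) for the given central coverage."""
    ordered = sorted(samples)
    median = quantile(ordered, 0.5)
    upper = quantile(ordered, 0.5 + coverage / 2)  # about 0.841 for 1 sigma
    lower = quantile(ordered, 0.5 - coverage / 2)  # about 0.159 for 1 sigma
    return median, upper - median, median - lower


# Hypothetical posterior timescales of 1..1001 days.
samples = list(range(1, 1002))
median, plus_error, minus_error = median_with_errors(samples)
print(median, plus_error, minus_error)
```

    Passing `coverage=0.95` gives the 2 sigma equivalent, with the upper quantile at 0.5 + (0.95 / 2) = 0.975.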

  18. ECMWF ERA5.1: 10 ensemble member surface level analysis parameter data for...

    • data-search.nerc.ac.uk
    • catalogue.ceda.ac.uk
    Updated Dec 8, 2023
    (2023). ECMWF ERA5.1: 10 ensemble member surface level analysis parameter data for 2000-2006 [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=ensemble%20runs
    Dataset updated
    Dec 8, 2023
    Description

    This dataset contains ERA5.1 surface level analysis parameter data for the period 2000-2006 from 10 member ensemble runs. ERA5.1 is the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis project re-run for 2000-2006 to improve upon the cold bias in the lower stratosphere seen in ERA5 (see technical memorandum 859 in the linked documentation section for further details). Ensemble means and spreads are calculated from this 10-member ensemble, run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables, please see the Copernicus Data Store (CDS) data tool, linked to from this record. Note that ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and thus was calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble mean and ensemble spread data. The main ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-interim re-analysis projects. An initial release of ERA5 data, ERA5t, is also available up to 5 days behind the present. A limited selection of data from these runs is also available via CEDA, whilst full access is available via the Copernicus Data Store.
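
    The distinction between dividing by N and by N-1 can be seen in a short sketch using Python's standard `statistics` module. The ten member values below are invented for illustration (imagine 2 m temperatures in kelvin at one grid point); they are not ERA5.1 data.

```python
import statistics

# Hypothetical values from the 10 ensemble members (including the control).
members = [271.2, 271.9, 270.8, 272.4, 271.5, 271.1, 272.0, 271.7, 270.9, 271.6]

ensemble_mean = statistics.fmean(members)

# Ensemble spread as described in the record: population standard
# deviation, i.e. the sum of squared deviations divided by N = 10.
ensemble_spread = statistics.pstdev(members)

# Sample standard deviation for comparison: divides by N - 1 = 9.
sample_std = statistics.stdev(members)

print(ensemble_mean, ensemble_spread, sample_std)
```

    With only 10 members the two estimates differ by a factor of sqrt(10 / 9), roughly 5 percent, so it matters which convention a spread field uses when comparing against other products.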

  19. Example Groundwater-Level Datasets and Benchmarking Results for the...

    • catalog.data.gov
    • data.usgs.gov
    Updated Oct 13, 2024
    U.S. Geological Survey (2024). Example Groundwater-Level Datasets and Benchmarking Results for the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) Software Package [Dataset]. https://catalog.data.gov/dataset/example-groundwater-level-datasets-and-benchmarking-results-for-the-automated-regional-cor
    Dataset updated
    Oct 13, 2024
    Dataset provided by
    U.S. Geological Survey
    Description

    This data release provides two example groundwater-level datasets used to benchmark the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) software package (Levy and others, 2024). The first dataset contains groundwater-level records and site metadata for wells located on Long Island, New York (NY) and some surrounding mainland sites in New York and Connecticut. The second dataset contains groundwater-level records and site metadata for wells located in the southeastern San Joaquin Valley of the Central Valley, California (CA). For ease of exposition these are referred to as NY and CA datasets, respectively. Both datasets are formatted with column headers that can be read by the ARCHI software package within the R computing environment. These datasets were used to benchmark the imputation accuracy of three ARCHI model settings (OLS, ridge, and MOVE.1) against the widely used imputation program missForest (Stekhoven and Bühlmann, 2012). The ARCHI program was used to process the NY and CA datasets on monthly and annual timesteps, respectively, filter out sites with insufficient data for imputation, and create 200 test datasets from each of the example datasets with 5 percent of observations removed at random (herein, referred to as "holdouts"). Imputation accuracy for test datasets was assessed using normalized root mean square error (NRMSE), which is the root mean square error divided by the standard deviation of the observed holdout values. ARCHI produces prediction intervals (PIs) using a non-parametric bootstrapping routine, which were assessed by computing a coverage rate (CR) defined as the proportion of holdout observations falling within the estimated PI. 

    The multiple regression models included with the ARCHI package (OLS and ridge) were further tested on all test datasets at eleven different levels of the p_per_n input parameter, which limits the maximum ratio of regression model predictors (p) per observations (n) as a decimal fraction greater than zero and less than or equal to one.

    This data release contains ten tables formatted as tab-delimited text files. The “CA_data.txt” and “NY_data.txt” tables contain 243,094 and 89,997 depth-to-groundwater measurement values (value, in feet below land surface) indexed by site identifier (site_no) and measurement date (date) for CA and NY datasets, respectively. The “CA_sites.txt” and “NY_sites.txt” tables contain site metadata for the 4,380 and 476 unique sites included in the CA and NY datasets, respectively. The “CA_NRMSE.txt” and “NY_NRMSE.txt” tables contain NRMSE values computed by imputing 200 test datasets with 5 percent random holdouts to assess imputation accuracy for three different ARCHI model settings and missForest using CA and NY datasets, respectively. The “CA_CR.txt” and “NY_CR.txt” tables contain CR values used to evaluate non-parametric PIs generated by bootstrapping regressions with three different ARCHI model settings using the CA and NY test datasets, respectively. The “CA_p_per_n.txt” and “NY_p_per_n.txt” tables contain mean NRMSE values computed for 200 test datasets with 5 percent random holdouts at 11 different levels of p_per_n for OLS and ridge models compared to training error for the same models on the entire CA and NY datasets, respectively.

    References Cited

    Levy, Z.F., Stagnitta, T.J., and Glas, R.L., 2024, ARCHI: Automated Regional Correlation Analysis for Hydrologic Record Imputation, v1.0.0: U.S. Geological Survey software release, https://doi.org/10.5066/P1VVHWKE.

    Stekhoven, D.J., and Bühlmann, P., 2012, MissForest - non-parametric missing value imputation for mixed-type data: Bioinformatics 28(1), 112-118. https://doi.org/10.1093/bioinformatics/btr597.
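
    The two accuracy measures defined in this description, NRMSE and coverage rate, can be sketched in a few lines of Python. This is an illustration only and not code from the ARCHI package, which runs in R. The function names and sample numbers are hypothetical, and two details are assumptions because the description does not specify them: the sample (N-1) standard deviation is used as the normalizer, and interval bounds are treated as inclusive.

```python
import math
import statistics


def nrmse(observed, imputed):
    """Root mean square error divided by the standard deviation of the observed holdouts."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, imputed)]
    rmse = math.sqrt(sum(squared_errors) / len(observed))
    # Assumption: sample standard deviation (divides by N - 1).
    return rmse / statistics.stdev(observed)


def coverage_rate(observed, lower, upper):
    """Proportion of holdout observations inside their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical holdout depths to groundwater (feet) with imputed values
# and prediction interval bounds.
observed = [10.0, 12.0, 14.0, 16.0]
imputed = [11.0, 12.0, 13.0, 16.0]
pi_lower = [9.0, 12.5, 12.0, 15.0]
pi_upper = [11.0, 13.5, 15.0, 17.0]

print(nrmse(observed, imputed))
print(coverage_rate(observed, pi_lower, pi_upper))
```

    An NRMSE below 1 means the imputation error is smaller than the natural variability of the held-out values, and a coverage rate close to the nominal interval level suggests the bootstrapped intervals are well calibrated.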

  20. Means and standard deviations of BMI for each polymorphism, and main effects...

    • plos.figshare.com
    • figshare.com
    xls
    Updated May 31, 2023
    Chunhui Chen; Wen Chen; Chuansheng Chen; Robert Moyzis; Qinghua He; Xuemei Lei; Jin Li; Yunxin Wang; Bin Liu; Daiming Xiu; Bi Zhu; Qi Dong (2023). Means and standard deviations of BMI for each polymorphism, and main effects and post hoc comparisons of SNPs that showed significant main effects and were used in subsequent multiple regression analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0058717.t001
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Chunhui Chen; Wen Chen; Chuansheng Chen; Robert Moyzis; Qinghua He; Xuemei Lei; Jin Li; Yunxin Wang; Bin Liu; Daiming Xiu; Bi Zhu; Qi Dong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: Empty cells mean no such genotypes were found in our sample. Maj: Major allele; Het: Heterozygote; Min: Minor allele. (a) Results (p values) of post hoc comparisons. mh = Maj versus Het, mm = Maj versus Min, hm = Het versus Min. (b) Post hoc comparison was not run because there were only 2 groups for this locus.
