100+ datasets found
  1. Confidence Interval Examples

    • figshare.com
    application/cdfv2
    Updated Jun 28, 2016
    Cite
    Emily Rollinson (2016). Confidence Interval Examples [Dataset]. http://doi.org/10.6084/m9.figshare.3466364.v2
    Explore at:
    Available download formats
    application/cdfv2
    Dataset updated
    Jun 28, 2016
    Dataset provided by
    Figshare
    http://figshare.com/
    Authors
    Emily Rollinson
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Examples demonstrating how confidence intervals change depending on the level of confidence (90% versus 95% versus 99%) and on the size of the sample (CI for n=20 versus n=10 versus n=2). Developed for BIO211 (Statistics and Data Analysis: A Conceptual Approach) at Stony Brook University in Fall 2015.
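
    The two effects named in this description can be sketched numerically. The snippet below is only an illustration, not part of the dataset: it assumes a known population standard deviation and a normal sampling distribution (a z-interval), and the mean and sigma values are hypothetical. For very small samples such as n=2 a t-interval would be wider still.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence):
    """Two-sided z-interval for a mean, assuming sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal quantile
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width


def width(interval):
    """Length of an interval given as (lower, upper)."""
    return interval[1] - interval[0]


# Hypothetical sample mean and population standard deviation.
MEAN, SIGMA = 50.0, 10.0

for n in (20, 10, 2):
    for confidence in (0.90, 0.95, 0.99):
        lower, upper = z_interval(MEAN, SIGMA, n, confidence)
        print(f"n={n:2d}  {confidence:.0%} CI: ({lower:.2f}, {upper:.2f})")
```

    The interval widens as the confidence level rises and as the sample size falls.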

  2. Winkler Interval score metric

    • kaggle.com
    Updated Dec 7, 2023
    Cite
    Carl McBride Ellis (2023). Winkler Interval score metric [Dataset]. https://www.kaggle.com/datasets/carlmcbrideellis/winkler-interval-score-metric
    Explore at:
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    Kaggle
    http://kaggle.com/
    Authors
    Carl McBride Ellis
    License

    Apache License, v2.0
    https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Model performance evaluation: The Mean Winkler Interval score (MWIS)

    We can assess the overall performance of a regression model that produces prediction intervals by using the mean Winkler Interval score [1,2,3] which, for an individual interval, is given by:

    Formula image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4051350%2Fe3bd94c6047815c0304b3851fc325a7c%2FWinkler_Interval_Score.png?generation=1700042360776825&alt=media

    \[
    W_\alpha(l, u; y) =
    \begin{cases}
    (u - l) + \frac{2}{\alpha}(l - y), & y < l \\
    (u - l), & l \le y \le u \\
    (u - l) + \frac{2}{\alpha}(y - u), & y > u
    \end{cases}
    \]

    where \(y\) is the true value, \(u\) is the upper bound of the prediction interval, \(l\) is the lower bound of the prediction interval, and \(\alpha\) is (1-coverage). For example, for 90% coverage, \(\alpha = 0.1\). Note that the Winkler Interval score constitutes a proper scoring rule [2,3].

    Python code: Usage example

    Attach this dataset to a notebook, then:

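    import sys

    # Make the attached dataset's module importable inside the Kaggle notebook.
    sys.path.append('/kaggle/input/winkler-interval-score-metric/')
    import MWIS_metric

    help(MWIS_metric.score)

    # `predictions` (with "y_true", "lower" and "upper" columns) and `alpha`
    # (1 - coverage) are expected to be defined earlier in your notebook.
    MWIS, coverage = MWIS_metric.score(predictions["y_true"], predictions["lower"], predictions["upper"], alpha)
    print("Local MWI score   ", round(MWIS, 3))
    print("Predictions coverage  ", round(coverage * 100, 1), "%")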
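
    For readers without access to the Kaggle module, the score can also be sketched in plain Python. This is a minimal illustration of the standard Winkler interval score, not the code shipped in `MWIS_metric`. The function names `winkler_score` and `mean_winkler` and the sample values are hypothetical.

```python
def winkler_score(y, lower, upper, alpha):
    """Winkler interval score for one observation and one interval."""
    width = upper - lower
    if y < lower:
        # Penalty proportional to how far y falls below the interval.
        return width + (2.0 / alpha) * (lower - y)
    if y > upper:
        # Penalty proportional to how far y falls above the interval.
        return width + (2.0 / alpha) * (y - upper)
    return width


def mean_winkler(y_true, lowers, uppers, alpha):
    """Return (mean Winkler score, empirical coverage) over all observations."""
    triples = list(zip(y_true, lowers, uppers))
    scores = [winkler_score(y, lo, hi, alpha) for y, lo, hi in triples]
    covered = [lo <= y <= hi for y, lo, hi in triples]
    return sum(scores) / len(scores), sum(covered) / len(covered)


# Hypothetical observations and 90% prediction intervals (alpha = 0.1).
y_true = [10.0, 12.0, 20.0]
lowers = [8.0, 13.0, 15.0]
uppers = [12.0, 16.0, 18.0]

mwis, coverage = mean_winkler(y_true, lowers, uppers, alpha=0.1)
print("Mean Winkler interval score:", round(mwis, 3))
print("Coverage:", round(coverage * 100, 1), "%")
```

    Narrow intervals score well only if they also contain the true value. Each miss adds a penalty scaled by 2/α.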
    
  3. Melodic Intervals Size Statistics for the most commonly occurring intervals....

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Shui' er Han; Janani Sundararajan; Daniel Liu Bowling; Jessica Lake; Dale Purves (2023). Melodic Intervals Size Statistics for the most commonly occurring intervals. (Independent – samples t-tests). [Dataset]. http://doi.org/10.1371/journal.pone.0020160.t001
    Explore at:
    Available download formats
    xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS
    http://plos.org/
    Authors
    Shui' er Han; Janani Sundararajan; Daniel Liu Bowling; Jessica Lake; Dale Purves
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics for the comparisons of the most commonly occurring melodic interval sizes in tone and non-tone language music databases; n1 and n2 refer to the sample sizes of tone and non-tone language music databases. (All comparisons were made with the two-tailed independent samples t-test, α-level adjusted using the Bonferroni method).
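
    The Bonferroni adjustment mentioned here divides the α-level by the number of comparisons. The sketch below only illustrates that arithmetic. The p-values and the number of comparisons are hypothetical and are not taken from the dataset.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under the Bonferroni method."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that falls below the Bonferroni-adjusted threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from four t-tests.
p_values = [0.001, 0.02, 0.04, 0.3]
print("Adjusted alpha:", bonferroni_alpha(0.05, len(p_values)))
print("Significant:", significant_after_bonferroni(p_values))
```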

  4. Example drills from the low-volume high-intensity interval training...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 28, 2016
    Cite
    George, Keith P.; Batterham, Alan M.; Weston, Matthew; Weston, Kathryn L.; Bock, Susan; Azevedo, Liane B. (2016). Example drills from the low-volume high-intensity interval training sessions. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001550082
    Explore at:
    Dataset updated
    Sep 28, 2016
    Authors
    George, Keith P.; Batterham, Alan M.; Weston, Matthew; Weston, Kathryn L.; Bock, Susan; Azevedo, Liane B.
    Description

    Example drills from the low-volume high-intensity interval training sessions.

  5. Example Web Traffic

    • kaggle.com
    zip
    Updated Jan 6, 2018
    Cite
    Kleber Bernardo (2018). Example Web Traffic [Dataset]. https://www.kaggle.com/kleberbernardo/example-web-traffic-with-lstm
    Explore at:
    Available download formats
    zip (196843 bytes)
    Dataset updated
    Jan 6, 2018
    Authors
    Kleber Bernardo
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)
    https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Kleber Bernardo

    Released under CC BY-NC-SA 4.0

    Contents

  6. Data from: Confidence and Prediction in Linear Mixed Models: Do Not...

    • tandf.figshare.com
    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Bernard G. Francq; Dan Lin; Walter Hoyer (2023). Confidence and Prediction in Linear Mixed Models: Do Not Concatenate the Random Effects. Application in an Assay Qualification Study [Dataset]. http://doi.org/10.6084/m9.figshare.12410729.v2
    Explore at:
    Available download formats
    txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Bernard G. Francq; Dan Lin; Walter Hoyer
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract–In the pharmaceutical industry, all analytical methods must be shown to deliver unbiased and precise results. In an assay qualification or validation study, the trueness, accuracy, and intermediate precision are usually assessed by comparing the measured concentrations to their nominal levels. Trueness is assessed by using Confidence Intervals (CIs) of mean measured concentration, accuracy by Prediction Intervals (PIs) for a future measured concentration, and the intermediate precision by the total variance. ICH and USP guidelines alike request that all relevant sources of variability must be studied, for example, the effect of different technicians, the day-to-day variability or the use of multiple reagent lots. Those different random effects must be modeled as crossed, nested, or a combination of both; in practice, however, they are often concatenated to simplify the model. This article compares this simplified approach to a mixed model with the actual design. Our simulation study shows an under-estimation of the intermediate precision and, therefore, a substantial reduction of the CI and PI. The power for accuracy or trueness is consequently over-estimated when designing a new study. Two real datasets from assay validation studies during vaccine development are used to illustrate the impact of such concatenation of random variables.

  7. Example of different blood sampling time interval schedules for juvenile...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jan 3, 2025
    Cite
    Jerome, Jacob; Giesy, Katherine C.; McDonald, M. Danielle; Macdonald, Catherine; D’Alessandro, Evan; Wester, Julia (2025). Example of different blood sampling time interval schedules for juvenile nurse sharks. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001498968
    Explore at:
    Dataset updated
    Jan 3, 2025
    Authors
    Jerome, Jacob; Giesy, Katherine C.; McDonald, M. Danielle; Macdonald, Catherine; D’Alessandro, Evan; Wester, Julia
    Description

    Example of different blood sampling time interval schedules for juvenile nurse sharks.

  8. (Figure 4f) Magnetic susceptibility measured on distinct samples at 5 cm...

    • doi.pangaea.de
    • search.dataone.org
    html, tsv
    Updated Apr 12, 2013
    Cite
    Cristiano Mazur Chiessi; Oscar E Romero; Tilo von Dobeneck; Sebastian Razik (2013). (Figure 4f) Magnetic susceptibility measured on distinct samples at 5 cm interval of sediment core GeoB6211-2 [Dataset]. http://doi.org/10.1594/PANGAEA.805099
    Explore at:
    Available download formats
    html, tsv
    Dataset updated
    Apr 12, 2013
    Dataset provided by
    PANGAEA
    Authors
    Cristiano Mazur Chiessi; Oscar E Romero; Tilo von Dobeneck; Sebastian Razik
    License

    Attribution 3.0 (CC BY 3.0)
    https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Time period covered
    Dec 12, 1999
    Area covered
    Variables measured
    AGE, DEPTH, sediment/rock, Susceptibility, specific, carbonate-free
    Description

    This dataset is about: (Figure 4f) Magnetic susceptibility measured on distinct samples at 5 cm interval of sediment core GeoB6211-2. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.805136 for more information.

  9. Wind Generation Time Interval Exploration Tool

    • catalog.data.gov
    • data.cnra.ca.gov
    • +2more
    Updated Jul 24, 2025
    Cite
    California Energy Commission (2025). Wind Generation Time Interval Exploration Tool [Dataset]. https://catalog.data.gov/dataset/wind-generation-time-interval-exploration-tool-28d5a
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    California Energy Commission
    http://www.energy.ca.gov/
    Description

    This is the Wind Generation Interactive Query Tool created by the CEC. The visualization tool interactively displays wind generation over different time intervals in three-dimensional space. The viewer can look across the state to understand generation patterns of regions with concentrations of wind power plants. The tool aids in understanding high and low periods of generation. Operation of the electric grid requires that generation and demand are balanced in each period.

    The height and color of columns at wind generation areas are scaled and shaded to represent capacity factors (CFs) of the areas in a specific time interval. Capacity factor is the ratio of the energy produced to the amount of energy that could ideally have been produced in the same period using the rated nameplate capacity. Due to natural variations in wind speeds, higher factors tend to be seen over short time periods, with lower factors over longer periods. The capacity used is the reported nameplate capacity from the Quarterly Fuel and Energy Report, CEC-1304A. CFs are based on wind plants in service in the wind generation areas.

    Renewable energy resources like wind facilities vary in size and geographic distribution within each state. Resource planning, land use constraints, climate zones, and weather patterns limit availability of these resources and where they can be developed. National, state, and local policies also set limits on energy generation and use. An example of resource planning in California is the Desert Renewable Energy Conservation Plan.

    By exploring the visualization, a viewer can gain a three-dimensional understanding of temporal variation in generation CFs, along with how the wind generation areas compare to one another. The viewer can observe that areas peak in generation in different periods. The large range in CFs is also visible.

  10. (Table 4) Isotopic composition of pore fluid samples from gas-hydrate...

    • search.dataone.org
    • doi.pangaea.de
    Updated Nov 2, 2025
    Cite
    Miriam Kastner; Keith A Kvenvolden; Thomas D Lorenson (2025). (Table 4) Isotopic composition of pore fluid samples from gas-hydrate interval at ODP Site 146-892 [Dataset]. http://doi.org/10.1594/PANGAEA.713910
    Explore at:
    Dataset updated
    Nov 2, 2025
    Dataset provided by
    PANGAEA Data Publisher for Earth and Environmental Science
    Authors
    Miriam Kastner; Keith A Kvenvolden; Thomas D Lorenson
    Time period covered
    Nov 6, 1992 - Nov 16, 1992
    Area covered
    Description

    This dataset is about: (Table 4) Isotopic composition of pore fluid samples from gas-hydrate interval at ODP Site 146-892. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.713919 for more information.

  11. Data from: New Source Rock Data for the Niobrara and Sage Breaks intervals...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 25, 2025
    Cite
    U.S. Geological Survey (2025). New Source Rock Data for the Niobrara and Sage Breaks intervals of the lower Cody Shale in the Wyoming part of the Bighorn Basin [Dataset]. https://catalog.data.gov/dataset/new-source-rock-data-for-the-niobrara-and-sage-breaks-intervals-of-the-lower-cody-shale-in
    Explore at:
    Dataset updated
    Nov 25, 2025
    Dataset provided by
    United States Geological Survey
    http://www.usgs.gov/
    Area covered
    Bighorn Basin, Wyoming
    Description

    In 2019 the U.S. Geological Survey (USGS) quantitatively assessed the potential for undiscovered, technically recoverable continuous (unconventional) oil and gas resources in the Niobrara interval of the Cody Shale in the Bighorn Basin Province (Finn and others, 2019). Leading up to the assessment, in 2017, the USGS collected samples from the Niobrara and underlying Sage Breaks intervals (Finn, 2019) to better characterize the source rock potential of the Niobrara interval. Eighty-two samples from 31 wells were collected from the well cuttings collection stored at the USGS Core Research Center in Lakewood, Colorado. The selected wells are located near the outcrop belt along the shallow margins of the basin to obtain samples that were not subjected to the effects of deep burial and subsequent organic carbon loss due to thermal maturation as described by Daly and Edman (1987) (fig. 1). Sixty samples are from the Niobrara interval, and 22 from the Sage Breaks interval (fig. 2).

  12. Appendix H. Examples of panels with barnacle mimics from Connecticut and...

    • datasetcatalog.nlm.nih.gov
    • wiley.figshare.com
    Updated Aug 9, 2016
    Cite
    Freestone, Amy L.; Osman, Richard W. (2016). Appendix H. Examples of panels with barnacle mimics from Connecticut and Belize at the early sampling interval. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001589747
    Explore at:
    Dataset updated
    Aug 9, 2016
    Authors
    Freestone, Amy L.; Osman, Richard W.
    Area covered
    Belize, Connecticut
    Description

    Examples of panels with barnacle mimics from Connecticut and Belize at the early sampling interval.

  13. Wind Generation Time Interval Exploration Data

    • data.cnra.ca.gov
    • data.ca.gov
    • +3more
    Updated Jan 19, 2024
    Cite
    California Energy Commission (2024). Wind Generation Time Interval Exploration Data [Dataset]. https://data.cnra.ca.gov/dataset/wind-generation-time-interval-exploration-data
    Explore at:
    Available download formats
    arcgis geoservices rest api, gdb, geojson, zip, csv, kml, html, txt, xlsx, gpkg
    Dataset updated
    Jan 19, 2024
    Dataset authored and provided by
    California Energy Commission
    http://www.energy.ca.gov/
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the data set behind the Wind Generation Interactive Query Tool created by the CEC. The visualization tool interactively displays wind generation over different time intervals in three-dimensional space. The viewer can look across the state to understand generation patterns of regions with concentrations of wind power plants. The tool aids in understanding high and low periods of generation. Operation of the electric grid requires that generation and demand are balanced in each period.



    The height and color of columns at wind generation areas are scaled and shaded to represent capacity factors (CFs) of the areas in a specific time interval. Capacity factor is the ratio of the energy produced to the amount of energy that could ideally have been produced in the same period using the rated nameplate capacity. Due to natural variations in wind speeds, higher factors tend to be seen over short time periods, with lower factors over longer periods. The capacity used is the reported nameplate capacity from the Quarterly Fuel and Energy Report, CEC-1304A. CFs are based on wind plants in service in the wind generation areas.
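
    The capacity factor definition above can be written as a short calculation. This is a sketch under stated assumptions, not the CEC's own method: the function name `capacity_factor`, the nameplate capacity, and the hourly energy figures are all hypothetical.

```python
def capacity_factor(energy_mwh, nameplate_mw, hours):
    """Energy produced divided by the energy possible at nameplate capacity."""
    if nameplate_mw <= 0 or hours <= 0:
        raise ValueError("nameplate_mw and hours must be positive")
    return energy_mwh / (nameplate_mw * hours)


# Hypothetical hourly output (MWh) of a wind area with 100 MW nameplate capacity.
nameplate_mw = 100.0
hourly_energy_mwh = [80.0, 95.0, 20.0, 5.0]

# Capacity factor for each one-hour interval.
hourly_cfs = [capacity_factor(e, nameplate_mw, 1.0) for e in hourly_energy_mwh]

# Capacity factor over the whole four-hour period.
period_cf = capacity_factor(sum(hourly_energy_mwh), nameplate_mw, len(hourly_energy_mwh))

print("Hourly CFs:", hourly_cfs)
print("Period CF:", period_cf)
```

    In this made-up series the highest hourly factor exceeds the factor for the whole period, which is the short-versus-long interval pattern the description refers to.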

    Renewable energy resources like wind facilities vary in size and geographic distribution within each state. Resource planning, land use constraints, climate zones, and weather patterns limit availability of these resources and where they can be developed. National, state, and local policies also set limits on energy generation and use. An example of resource planning in California is the Desert Renewable Energy Conservation Plan.

    By exploring the visualization, a viewer can gain a three-dimensional understanding of temporal variation in generation CFs, along with how the wind generation areas compare to one another. The viewer can observe that areas peak in generation in different periods. The large range in CFs is also visible.



  14. Data from: Transformation of measurement uncertainties into low-dimensional...

    • resodate.org
    • data.niaid.nih.gov
    • +1more
    Updated Jan 1, 2021
    Cite
    Antonios Alexiadis; Scott Ferson; Eann A. Patterson (2021). Data from: Transformation of measurement uncertainties into low-dimensional feature vector space [Dataset]. http://doi.org/10.5061/DRYAD.6HDR7SQX2
    Explore at:
    Dataset updated
    Jan 1, 2021
    Dataset provided by
    Dryad
    Authors
    Antonios Alexiadis; Scott Ferson; Eann A. Patterson
    Description

    Advances in technology allow the acquisition of data with high spatial and temporal resolution. These datasets are usually accompanied by estimates of the measurement uncertainty, which may be spatially or temporally varying and should be taken into consideration when making decisions based on the data. At the same time, various transformations are commonly implemented to reduce the dimensionality of the datasets for post-processing, or to extract significant features. However, the corresponding uncertainty is not usually represented in the low-dimensional or feature vector space. A method is proposed that maps the measurement uncertainty into the equivalent low-dimensional space with the aid of approximate Bayesian computation, resulting in a distribution that can be used to make statistical inferences. The method involves no assumptions about the probability distribution of the measurement error and is independent of the feature extraction process as demonstrated in three examples. In the first two examples Chebyshev polynomials were used to analyse structural displacements and soil moisture measurements; while in the third, principal component analysis was used to decompose global ocean temperature data. The uses of the method range from supporting decision making in model validation or confirmation, model updating or calibration and tracking changes in condition, such as the characterisation of the El Niño Southern Oscillation.

  15. Average and confidence intervals for λ, the percentage of runs with λ

    • figshare.com
    • plos.figshare.com
    xls
    Updated Dec 2, 2015
    Cite
    Benjamin H. Letcher; Keith H. Nislow; Jason A. Coombs; Matthew J. O'Donnell; Todd L. Dubreuil (2015). Average and confidence intervals for λ, the percentage of runs with λ [Dataset]. http://doi.org/10.1371/journal.pone.0001139.t003
    Explore at:
    Available download formats
    xls
    Dataset updated
    Dec 2, 2015
    Dataset provided by
    PLOS ONE
    Authors
    Benjamin H. Letcher; Keith H. Nislow; Jason A. Coombs; Matthew J. O'Donnell; Todd L. Dubreuil
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Also shown is the number and proportion of immigrants required to ‘rescue’ the system from extinction (λ = 1).

  16. (Table 2) Trace element composition of samples from the Cretaceous interval...

    • doi.pangaea.de
    • search.dataone.org
    html, tsv
    Updated 1984
    Cite
    Roberet B Halley; B J Pierson; Wolfgang Schlager (1984). (Table 2) Trace element composition of samples from the Cretaceous interval at DSDP Hole 77-536 [Dataset]. http://doi.org/10.1594/PANGAEA.809139
    Explore at:
    Available download formats
    tsv, html
    Dataset updated
    1984
    Dataset provided by
    PANGAEA
    Authors
    Roberet B Halley; B J Pierson; Wolfgang Schlager
    License

    Attribution 3.0 (CC BY 3.0)
    https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Time period covered
    Jan 9, 1981
    Area covered
    Variables measured
    Iron, Sodium, Calcium, Magnesium, Manganese, Rock type, Strontium, Sample code/label
    Description

    Duplicate samples were run as cross checks on analytical procedure.

  17. (Figure 4f) Magnetic susceptibility measured on distinct samples at 5 cm...

    • service.tib.eu
    Updated Nov 30, 2024
    Cite
    (2024). (Figure 4f) Magnetic susceptibility measured on distinct samples at 5 cm interval of sediment core GeoB6211-2 [Dataset]. https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-805099
    Explore at:
    Dataset updated
    Nov 30, 2024
    License

    Attribution 3.0 (CC BY 3.0)
    https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    DOI retrieved: 2013

  18. Data from: (Table 1) Number of counted pollen samples, intervals and time...

    • doi.pangaea.de
    • search.dataone.org
    html, tsv
    Updated 2003
    Cite
    Jun Tian; Xiangjun Sun; Yunli Luo; Fei Huang; Pinxian Wang (2003). (Table 1) Number of counted pollen samples, intervals and time resolution between samples of ODP Site 184-1144 [Dataset]. http://doi.org/10.1594/PANGAEA.738717
    Explore at:
    Available download formats
    tsv, html
    Dataset updated
    2003
    Dataset provided by
    PANGAEA
    Authors
    Jun Tian; Xiangjun Sun; Yunli Luo; Fei Huang; Pinxian Wang
    License

    Attribution 3.0 (CC BY 3.0)
    https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Time period covered
    Mar 13, 1999 - Mar 18, 1999
    Area covered
    Variables measured
    Sample amount, Depth, top/min, Sample spacing, Time resolution, Age, maximum/old, Depth, bottom/max, Age, minimum/young, DEPTH, sediment/rock
    Description

    This dataset is about: (Table 1) Number of counted pollen samples, intervals and time resolution between samples of ODP Site 184-1144. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.738719 for more information.

  19. The banksia plot: a method for visually comparing point estimates and...

    • researchdata.edu.au
    • datasetcatalog.nlm.nih.gov
    • +1more
    Updated Apr 16, 2024
    Cite
    Simon Turner; Joanne McKenzie; Emily Karahalios; Elizabeth Korevaar (2024). The banksia plot: a method for visually comparing point estimates and confidence intervals across datasets [Dataset]. http://doi.org/10.26180/25286407.V2
    Explore at:
    Dataset updated
    Apr 16, 2024
    Dataset provided by
    Monash University
    Authors
    Simon Turner; Joanne McKenzie; Emily Karahalios; Elizabeth Korevaar
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Companion data for the creation of a banksia plot:

    Background:

    In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.

    Methods:

    The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. Firstly, the point estimates from reference analyses are centred to zero, followed by scaling confidence intervals to span a range of one. The point estimates and confidence intervals from matching comparator analyses are then adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the difference in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) data sets with widely varying characteristics, while the second example assesses data extraction accuracy comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs from the accompanying manuscripts.

    Results:

    In the banksia plot of statistical method comparison, it was clear that there was no difference, on average, in point estimates and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data it was clear that both the point estimates and confidence intervals were all very similar among data extractors and original data.

    Conclusions:

    The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.

    This collection of files allows the user to create the images used in the companion paper and amend this code to create their own banksia plots using either Stata version 17 or R version 4.3.1
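
    The centring and scaling step described under Methods can be illustrated with a few lines of Python. This is a sketch of that transformation only, not the authors' Stata or R code. The function name `centre_and_scale` and the numeric estimates are hypothetical.

```python
def centre_and_scale(reference, comparator):
    """Centre the reference estimate at zero and scale its CI to width one.

    Each argument is a (point_estimate, ci_lower, ci_upper) tuple. The same
    shift and scale are applied to the comparator analysis.
    """
    ref_est, ref_lo, ref_hi = reference
    scale = ref_hi - ref_lo
    if scale <= 0:
        raise ValueError("reference CI must have positive width")

    def transform(value):
        return (value - ref_est) / scale

    return (
        tuple(transform(v) for v in reference),
        tuple(transform(v) for v in comparator),
    )


# Hypothetical results from two analyses of the same dataset.
reference = (2.0, 1.0, 3.0)    # estimate, lower, upper
comparator = (2.5, 1.0, 4.0)

ref_scaled, comp_scaled = centre_and_scale(reference, comparator)
print("Reference: ", ref_scaled)
print("Comparator:", comp_scaled)
```

    After the transformation the reference estimate sits at zero with a CI of width one, so the comparator's offset and relative CI width can be read directly and compared across datasets with different outcome scales.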

  20. 2020 American Community Survey: B17001F | POVERTY STATUS IN THE PAST 12...

    • data.census.gov
    Cite
    ACS, 2020 American Community Survey: B17001F | POVERTY STATUS IN THE PAST 12 MONTHS BY SEX BY AGE (SOME OTHER RACE ALONE) (ACS 5-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT5Y2020.B17001F?q=B17001F&g=050XX00US48321
    Explore at:
    Dataset provided by
    United States Census Bureau
    http://census.gov/
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication
    https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2020
    Description

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2020, the 2020 Census provides the official counts of the population and housing units for the nation, states, counties, cities, and towns. For 2016 to 2019, the Population Estimates Program provides estimates of the population for the nation, states, counties, cities, and towns and intercensal housing unit estimates for the nation, states, and counties.

    Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Source: U.S. Census Bureau, 2016-2020 American Community Survey 5-Year Estimates.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

    The Hispanic origin and race codes were updated in 2020. For more information on the Hispanic origin and race code changes, please visit the American Community Survey Technical Documentation website.

    The 2016-2020 American Community Survey (ACS) data generally reflect the September 2018 Office of Management and Budget (OMB) delineations of metropolitan and micropolitan statistical areas. In certain instances, the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB delineation lists due to differences in the effective dates of the geographic entities.

    Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    Explanation of Symbols:

    • -: The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution.
    • N: The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
    • (X): The estimate or margin of error is not applicable or not available.
    • median-: The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
    • median+: The median falls in the highest interval of an open-ended distribution (for example "250,000+").
    • **: The margin of error could not be computed because there were an insufficient number of sample observations.
    • ***: The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
    • *****: A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
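
    The margin-of-error interpretation in this description can be sketched as follows. The estimate and margin of error are hypothetical values, not figures from table B17001F. The factor 1.645 used to recover a standard error is an assumption here (the conventional normal-approximation value for a 90 percent level) and should be checked against the ACS Technical Documentation.

```python
Z_90 = 1.645  # assumed normal-approximation factor for a 90 percent level


def moe_bounds(estimate, moe):
    """Lower and upper confidence bounds: estimate minus/plus margin of error."""
    return estimate - moe, estimate + moe


def standard_error_from_moe(moe, z=Z_90):
    """Approximate standard error implied by a margin of error at level z."""
    return moe / z


# Hypothetical ACS-style estimate with its 90 percent margin of error.
estimate, moe = 1200.0, 150.0

lower, upper = moe_bounds(estimate, moe)
print(f"90 percent interval: ({lower:.0f}, {upper:.0f})")
print(f"Approximate standard error: {standard_error_from_moe(moe):.2f}")
```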

Cite
Emily Rollinson (2016). Confidence Interval Examples [Dataset]. http://doi.org/10.6084/m9.figshare.3466364.v2
Confidence Interval Examples

Explore at:
62 scholarly articles cite this dataset (View in Google Scholar)