100+ datasets found
  1. Data from: A Statistical Inference Course Based on p-Values

    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Ryan Martin (2023). A Statistical Inference Course Based on p-Values [Dataset]. http://doi.org/10.6084/m9.figshare.3494549.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Ryan Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introductory statistical inference texts and courses treat the point estimation, hypothesis testing, and interval estimation problems separately, with primary emphasis on large-sample approximations. Here, I present an alternative approach to teaching this course, built around p-values, emphasizing provably valid inference for all sample sizes. Details about computation and marginalization are also provided, with several illustrative examples, along with a course outline. Supplementary materials for this article are available online.

  2. INTERVAL

    • healthdatagateway.org
    unknown
    Cite
    INTERVAL [Dataset]. https://healthdatagateway.org/dataset/201. INTERVAL must be acknowledged in all publications using these data; further details will be issued through the Data Access Committee.
    Explore at:
    Available download formats: unknown
    Dataset authored and provided by
    INTERVAL (acknowledgement required in all publications using these data; further details will be issued through the Data Access Committee)
    License

    http://www.donorhealth-btru.nihr.ac.uk/wp-content/uploads/2020/04/Data-Access-Policy-v1.0-14Apr2020.pdf

    Description

    In over 100 years of blood donation practice, INTERVAL is the first randomised controlled trial to assess the impact of varying the frequency of blood donation on donor health and the blood supply. It provided policy-makers with evidence that collecting blood more frequently than current intervals can be implemented over two years without impacting on donor health, allowing better management of the supply to the NHS of units of blood with in-demand blood groups. INTERVAL was designed to deliver a multi-purpose strategy: an initial purpose related to blood donation research aiming to improve NHS Blood and Transplant’s core services and a longer-term purpose related to the creation of a comprehensive resource that will enable detailed studies of health-related questions.

    Approximately 50,000 generally healthy blood donors were recruited between June 2012 and June 2014 from 25 NHS Blood Donation centres across England: approximately equal numbers of men and women, aged 18-80, ~93% of white ancestry. All participants completed brief online questionnaires at baseline and gave blood samples for research purposes. Participants were randomised to giving blood every 8/10/12 weeks (men) or 12/14/16 weeks (women) over a 2-year period. ~30,000 participants returned after 2 years, completed a brief online questionnaire, and gave further blood samples for research purposes.

    The baseline questionnaire includes brief lifestyle information (smoking, alcohol consumption, etc), iron-related questions (e.g., red meat consumption), self-reported height and weight, etc. The SF-36 questionnaire was completed online at baseline and 2-years, with a 6-monthly SF-12 questionnaire between baseline and 2-years.

    All participants have had the Affymetrix Axiom UK Biobank genotyping array assayed and then imputed to 1000G+UK10K combined reference panel (80M variants in total). 4,000 participants have 50X whole-exome sequencing and 12,000 participants have 15X whole-genome sequencing. Whole-blood RNA sequencing has commenced in ~5,000 participants.

    The dataset also contains data on clinical chemistry biomarkers, blood cell traits, >200 lipoproteins, metabolomics (Metabolon HD4), lipidomics, and proteomics (SomaLogic, Olink), either cohort-wide or in large sub-sets of the cohort.

  3. Additional file 2 of Comparison of a time-varying covariate model and a...

    • springernature.figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Kristen Campbell; Elizabeth Juarez-Colunga; Gary Grunwald; James Cooper; Scott Davis; Jane Gralla (2023). Additional file 2 of Comparison of a time-varying covariate model and a joint model of time-to-event outcomes in the presence of measurement error and interval censoring: application to kidney transplantation [Dataset]. http://doi.org/10.6084/m9.figshare.8331338.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kristen Campbell; Elizabeth Juarez-Colunga; Gary Grunwald; James Cooper; Scott Davis; Jane Gralla
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    • example-1-JM.R: Code to fit M1
    • longitudinal-data.csv: simulated TAC data for M1
    • survival-data.csv: simulated dnDSA data for M1
    • model-1-JM.txt: JAGS model, called by example-1-JM.R
    • example-1-NLMIXED.SAS: Code to fit M1 with PROC NLMIXED, uses same simulated data
    • example-4-TVC.R: Code to fit M4
    • longitudinal data tvc.csv: simulated TAC data for M4 (carried-forward values of TAC)
    • model-4-TVC.txt: JAGS model, called by example-4-TVC.R
    (ZIP 199 kb)

  4. Wind Generation Time Interval Exploration Data

    • catalog.data.gov
    • data.cnra.ca.gov
    • +8more
    Updated Jul 24, 2025
    + more versions
    Cite
    California Energy Commission (2025). Wind Generation Time Interval Exploration Data [Dataset]. https://catalog.data.gov/dataset/wind-generation-time-interval-exploration-data-403dc
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    California Energy Commission
    Description

    This is the data set behind the Wind Generation Interactive Query Tool created by the CEC. The visualization tool interactively displays wind generation over different time intervals in three-dimensional space. The viewer can look across the state to understand generation patterns of regions with concentrations of wind power plants. The tool aids in understanding high and low periods of generation. Operation of the electric grid requires that generation and demand are balanced in each period. The height and color of columns at wind generation areas are scaled and shaded to represent capacity factors (CFs) of the areas in a specific time interval. Capacity factor is the ratio of the energy produced to the amount of energy that could ideally have been produced in the same period using the rated nameplate capacity. Due to natural variations in wind speeds, higher factors tend to be seen over short time periods, with lower factors over longer periods. The capacity used is the reported nameplate capacity from the Quarterly Fuel and Energy Report, CEC-1304A. CFs are based on wind plants in service in the wind generation areas.

    Renewable energy resources like wind facilities vary in size and geographic distribution within each state. Resource planning, land use constraints, climate zones, and weather patterns limit availability of these resources and where they can be developed. National, state, and local policies also set limits on energy generation and use. An example of resource planning in California is the Desert Renewable Energy Conservation Plan. By exploring the visualization, a viewer can gain a three-dimensional understanding of temporal variation in generation CFs, along with how the wind generation areas compare to one another. The viewer can observe that areas peak in generation in different periods. The large range in CFs is also visible.
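    The capacity-factor definition in the description reduces to one line of arithmetic. A minimal sketch (the function name and the figures below are illustrative, not part of the CEC dataset):

    ```python
    def capacity_factor(energy_mwh, nameplate_mw, hours):
        """Ratio of energy actually produced to the energy that could
        ideally have been produced at rated nameplate capacity."""
        return energy_mwh / (nameplate_mw * hours)

    # A hypothetical 100 MW wind area producing 26,280 MWh over a
    # 30-day (720 h) interval:
    cf = capacity_factor(26_280, 100, 720)
    print(round(cf, 3))  # 0.365
    ```

    As the description notes, shorter intervals tend to produce higher factors: a windy afternoon can run near nameplate, while a full year averages in many calm periods.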

  5. The Percentile Bootstrap For Calculating The 95%Ci For The Median -...

    • explore.openaire.eu
    Updated Apr 17, 2018
    Cite
    J. Goedhart (2018). The Percentile Bootstrap For Calculating The 95%Ci For The Median - Animation (With R-Script And Example Data) [Dataset]. http://doi.org/10.5281/zenodo.1219874
    Explore at:
    Dataset updated
    Apr 17, 2018
    Authors
    J. Goedhart
    Description

    R Scripts and example data to perform a percentile bootstrap to determine the 95% confidence interval for the median. More background is described in this blog: http://thenode.biologists.com/a-better-bar/education/
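    The dataset's own R script and example data are linked from the record above. As a language-neutral illustration of the same percentile-bootstrap recipe (resample with replacement, take each resample's median, read off the 2.5th and 97.5th percentiles), here is a sketch in Python with made-up data:

    ```python
    import random
    import statistics

    def bootstrap_median_ci(data, n_boot=9999, alpha=0.05, seed=42):
        """Percentile bootstrap CI for the median: resample with
        replacement, compute the median of each resample, and report
        the alpha/2 and 1-alpha/2 percentiles of those medians."""
        rng = random.Random(seed)
        medians = sorted(
            statistics.median(rng.choices(data, k=len(data)))
            for _ in range(n_boot)
        )
        lo = medians[int((alpha / 2) * n_boot)]
        hi = medians[int((1 - alpha / 2) * n_boot)]
        return lo, hi

    # Illustrative data, not from the linked record:
    sample = [4.1, 5.3, 2.2, 6.8, 3.9, 5.0, 4.4, 7.1, 3.2, 5.6]
    low, high = bootstrap_median_ci(sample)
    print(low, high)
    ```

    The percentile method needs no distributional assumption, which is why it is a common choice for the median, whose sampling distribution has no simple closed form.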

  6. 2023 American Community Survey: S2602 | Characteristics of the Group...

    • data.census.gov
    Updated Sep 28, 2019
    + more versions
    Cite
    ACS (2019). 2023 American Community Survey: S2602 | Characteristics of the Group Quarters Population by Group Quarters Type (3 Types) (ACS 1-Year Estimates Subject Tables) [Dataset]. https://data.census.gov/cedsci/table?q=S2602
    Explore at:
    Dataset updated
    Sep 28, 2019
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2023
    Description

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and the group quarters population for states and counties.

    Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Source: U.S. Census Bureau, 2023 American Community Survey 1-Year Estimates. ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

    Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data. Occupation titles and their 4-digit codes are based on the 2018 Standard Occupational Classification. Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    Explanation of Symbols:
    "-" The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
    "N" The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
    "(X)" The estimate or margin of error is not applicable or not available.
    "median-" The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
    "median+" The median falls in the highest interval of an open-ended distribution (for example "250,000+").
    "**" The margin of error could not be computed because there were an insufficient number of sample observations.
    "***" The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
    "*****" A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
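    The margin-of-error arithmetic the Census describes is mechanical. A short sketch (the estimate and MOE values are invented for illustration; the 1.960/1.645 rescaling is the standard normal-quantile conversion from a published 90% MOE to an approximate 95% MOE):

    ```python
    def confidence_bounds(estimate, moe_90):
        """90% confidence bounds: the estimate minus/plus the
        published 90 percent margin of error."""
        return estimate - moe_90, estimate + moe_90

    def moe_to_95(moe_90):
        """Rescale a 90% MOE to an approximate 95% MOE using the
        ratio of normal critical values (1.960 / 1.645)."""
        return moe_90 * 1.960 / 1.645

    low, high = confidence_bounds(12_345, 678)
    print(low, high)  # 11667 13023
    print(round(moe_to_95(678), 1))  # 807.8
    ```

    Note that an ACS MOE describes sampling variability only; as the text above says, nonsampling error is not represented.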

  7. Data from: Long-term site responses to season and interval of underburns on...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +8more
    Updated Apr 21, 2025
    + more versions
    Cite
    U.S. Forest Service (2025). Long-term site responses to season and interval of underburns on the Georgia Piedmont [Dataset]. https://catalog.data.gov/dataset/long-term-site-responses-to-season-and-interval-of-underburns-on-the-georgia-piedmont-c0cdb
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
    Area covered
    Georgia
    Description

    Between 1987 and 1988, twenty-four approximately 2-acre plots were established in Jones County, Georgia on the Hitchiti Experimental Forest which is also known as the Brender Demonstration Forest. These plots have not burned since prior to 1939. Treatments were applied to track site changes over time from five short return interval underburn treatments. These treatments, replicated 4 times, were comprised of: biennial dormant season headfires, triennial dormant season headfires, triennial dormant season backfires, triennial growing season headfires, growing season headfires every 6 years, and unburned controls. Triennial dormant season treatments were eventually combined. Variables tracked over time include the impact of fire on overstory pine growth, midstory (in this study midstory includes understory plants > 4.5 feet high) structure and composition, seedling (in this study including all woody plants < 4.5 feet high and all other plants regardless of height) species dominance, percent cover, pine seedling establishment and mortality, and forest floor consumption. Several thousand overstory and midstory trees were tagged, GPS coordinates recorded and their survival and growth followed over time. Vegetation was measured in nested circular plots and on line transects. Live and dead overstory trees on two 0.2 acre (1/5 ac) subplots per treatment plot were tallied annually by species with diameter at breast height and height, measured and pest damage/mortality by pathogen, lightning and wind damage recorded. Basal area was calculated periodically. Some overstory pines were bored to determine age (typically after death). Midstory live and dead trees were tallied annually on six 0.02 acre subplots per treatment plot. 
Seedlings were tallied on eighteen 0.001 acre (MA = milacre) subplots per treatment plot by species/species group, and percent of the subplot area in vines, herbs, moss, live woody material, dead plant material, and void of plant material (exposed mineral soil). Six 33 feet line transects per treatment plot were divided into 6 inch segments and dominant seedling species/species group tallied annually. Over 150 species/species groups were identified and tracked over time. Weights of likely available live fuel were determined by species/species group prior to each burn, as were weights of likely available dead fuel for various categories/size classes. Paired postburn samples were collected to determine consumption of various fuel categories. Overstory and midstory pine crown scorch, foliage consumption, and hardwood mortality were tallied within two weeks following each burn. Other vegetation datasets include pine seedling establishment and survival over time on the 18 MA subplots per treatment plot. Red cockaded woodpecker (RCW) related information was collected annually by Region 8 (Southern Region) of the USFS and is available from them. Live and dead fuel moisture data were sampled prior to every burn and can include preburn moisture content grab samples, 10-hour fuel stick readings, and random lumber probe readings. Fire behavior records of headfires and backfires can include rate of spread, flame length, flame angle, flame zone depth, short distance spotting, slopovers, burnout time, and percent of plot burned. The study plan called for observations of fire residence time as well, but such observations were rarely recorded. Weather data include on-plot hand-held instrument observations of surface wind velocity, ambient temperature and relative humidity (RH). On-site data collected can include precipitation, ambient temperature, RH and wind traces from recording gauges. 
    Keetch-Byram Drought Index (KBDI) calculations and National Fire Danger Rating System (NFDRS) predictions, along with other weather observations taken at two nearby Georgia Forestry Commission weather stations, were also included.

  8. Data from: How to estimate the minimum power of the test and bound values...

    • scielo.figshare.com
    png
    Updated Jun 1, 2023
    Cite
    KLEIN IVANDRO; MATSUOKA MARCELO TOMIO; GUZATTO MATHEUS PEREIRA (2023). How to estimate the minimum power of the test and bound values for the confidence interval of Data Snooping procedure [Dataset]. http://doi.org/10.6084/m9.figshare.14327649.v1
    Explore at:
    Available download formats: png
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    KLEIN IVANDRO; MATSUOKA MARCELO TOMIO; GUZATTO MATHEUS PEREIRA
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Snooping (DS) is the best-established method to identify gross errors (outliers) in geodetic data analysis, with a given probability. The power of the test is the probability of DS correctly identifying a gross error, while the confidence interval is the probability of DS not rejecting an observation uncontaminated by gross error. In practice, the power of the test is always unknown. Thus, the objective of this paper is to present a theoretical review of how to determine the minimum power of the test, and bound values for the confidence interval of the DS procedure in an n-dimensional scenario, i.e., considering all observations involved. Along with the theoretical review, a numerical example involving a simulated leveling network is presented. The results obtained in the experiments agreed with the previously calculated theoretical values, i.e., the revised methodology showed satisfactory performance in practice. The example also shows the importance of the revised methodology in the planning stage (or pre-analysis) of geodetic networks.

  9. 2023 American Community Survey: B17020 | Poverty Status in the Past 12...

    • data.census.gov
    Updated Oct 18, 2023
    + more versions
    Cite
    ACS (2023). 2023 American Community Survey: B17020 | Poverty Status in the Past 12 Months by Age (ACS 1-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/cedsci/table?q=B17020
    Explore at:
    Dataset updated
    Oct 18, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2023
    Description

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and the group quarters population for states and counties.

    Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Source: U.S. Census Bureau, 2023 American Community Survey 1-Year Estimates. ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

    Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data. Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    Explanation of Symbols:
    "-" The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
    "N" The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
    "(X)" The estimate or margin of error is not applicable or not available.
    "median-" The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
    "median+" The median falls in the highest interval of an open-ended distribution (for example "250,000+").
    "**" The margin of error could not be computed because there were an insufficient number of sample observations.
    "***" The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
    "*****" A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.

  10. Data from: Confidence and Prediction in Linear Mixed Models: Do Not...

    • tandf.figshare.com
    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Bernard G. Francq; Dan Lin; Walter Hoyer (2023). Confidence and Prediction in Linear Mixed Models: Do Not Concatenate the Random Effects. Application in an Assay Qualification Study [Dataset]. http://doi.org/10.6084/m9.figshare.12410729.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Bernard G. Francq; Dan Lin; Walter Hoyer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: In the pharmaceutical industry, all analytical methods must be shown to deliver unbiased and precise results. In an assay qualification or validation study, the trueness, accuracy, and intermediate precision are usually assessed by comparing the measured concentrations to their nominal levels. Trueness is assessed by using Confidence Intervals (CIs) of the mean measured concentration, accuracy by Prediction Intervals (PIs) for a future measured concentration, and intermediate precision by the total variance. ICH and USP guidelines alike request that all relevant sources of variability be studied, for example, the effect of different technicians, the day-to-day variability, or the use of multiple reagent lots. Those different random effects must be modeled as crossed, nested, or a combination of both, although concatenating them to simplify the model is common practice. This article compares this simplified approach to a mixed model with the actual design. Our simulation study shows an under-estimation of the intermediate precision and, therefore, a substantial reduction of the CI and PI. The power for accuracy or trueness is consequently over-estimated when designing a new study. Two real datasets from an assay validation study during vaccine development are used to illustrate the impact of such concatenation of random variables.
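    The CI-versus-PI distinction the abstract relies on shows up already in the basic normal-theory formulas: a CI covers the unknown mean, while a PI must also absorb the variance of one future observation and is therefore always wider. This sketch is generic, not the paper's mixed-model method; the concentration values and the t critical value for 9 degrees of freedom are illustrative:

    ```python
    import math
    import statistics

    def ci_and_pi(data, t_crit):
        """Normal-theory 95% intervals: CI half-width uses s/sqrt(n)
        (uncertainty in the mean); PI half-width uses s*sqrt(1 + 1/n)
        (adds the spread of a single future observation)."""
        n = len(data)
        mean = statistics.mean(data)
        s = statistics.stdev(data)
        ci_half = t_crit * s / math.sqrt(n)
        pi_half = t_crit * s * math.sqrt(1 + 1 / n)
        return (mean - ci_half, mean + ci_half), (mean - pi_half, mean + pi_half)

    # Made-up measured concentrations for one nominal level:
    conc = [98.2, 101.5, 99.7, 100.8, 99.1, 100.2, 98.9, 101.0, 99.5, 100.4]
    ci, pi = ci_and_pi(conc, t_crit=2.262)  # two-sided 95% t value, 9 df
    print(ci, pi)  # the PI is strictly wider than the CI
    ```

    Under-estimating the total variance (as the abstract says concatenated random effects do) shrinks both half-widths, which is exactly why the resulting CIs and PIs become too narrow.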

  11. Data from: Soilsamples

    • ckan.mobidatalab.eu
    kml, wfs, wms
    Updated Nov 3, 2023
    Cite
    Open Data Vlaanderen (2023). Soilsamples [Dataset]. https://ckan.mobidatalab.eu/id/dataset/soilsamples
    Explore at:
    Available download formats: kml, wms, wfs
    Dataset updated
    Nov 3, 2023
    Dataset provided by
    Open Data Vlaanderen
    Description

    A soil sample is a sample of the soil that is taken for further analysis in the field or in a laboratory. A sample is always taken at a certain depth (from/to). A specific technique can optionally be described (for example, disturbed or undisturbed), as well as the conditions of the sampling (for example, atmospheric conditions, etc.). The results of analyses performed on the sample are saved as observations that are linked to the sample. A sample can optionally be linked to one or more assignments, and attachments can also be linked to a sample (for example, analysis results or reports). A soil sample is either a single sample or a mixed sample (this is its type). A single sample is always linked to one soil location or one depth interval. A soil location or depth interval can have 0 or more single samples. A mixed sample can be linked to one soil site, one soil location, or one soil depth interval. These can have 0 or more mixed samples.

  12. ACS Travel Time To Work Variables - Tract

    • hub.arcgis.com
    • hub.scag.ca.gov
    Updated Feb 3, 2022
    + more versions
    Cite
    rdpgisadmin (2022). ACS Travel Time To Work Variables - Tract [Dataset]. https://hub.arcgis.com/datasets/3341ca03b6044fc6bc474765f6f1eac7
    Explore at:
    Dataset updated
    Feb 3, 2022
    Dataset authored and provided by
    rdpgisadmin
    Description

    This layer shows workers' place of residence by commute length. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of commuters whose commute is 90 minutes or more. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

    Current Vintage: 2015-2019
    ACS Table(s): B08303
    Data downloaded from: Census Bureau's API for American Community Survey
    Date of API call: December 10, 2020
    National Figures: data.census.gov
    The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.

    Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates and is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2010 AWATER (Area Water) boundaries offered by TIGER. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records: all US states, Washington D.C., and Puerto Rico. Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99). Percentages and derived counts, and associated margins of error, are calculated values (identifiable by the "_calc_" stub in the field name) and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Negative values (e.g., -4444...) have been set to null, with the exception of -5555..., which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
    • The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error, and thus the margin of error. A statistical test is not appropriate.
    • Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest or upper interval of an open-ended distribution.
    • The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
    • The estimate is controlled. A statistical test for sampling variability is not appropriate.
    • The data for this geographic area cannot be displayed because the number of sample cases is too small.
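    The 90 percent margin-of-error convention in the Census data note can be sketched in a few lines. This is an illustrative sketch, not part of the service: the value 1.645 is the critical value the ACS uses for 90 percent margins of error, and the numbers are invented.

    ```python
    # Sketch: converting an ACS estimate and its published 90% margin of
    # error (MOE) into confidence bounds and an implied standard error.
    # The numbers below are invented for illustration.
    Z_90 = 1.645  # critical value used by the ACS for 90% MOEs

    def acs_bounds(estimate, moe):
        """Lower and upper 90% confidence bounds: estimate -/+ MOE."""
        return estimate - moe, estimate + moe

    def acs_standard_error(moe):
        """Standard error implied by a 90% MOE."""
        return moe / Z_90

    low, high = acs_bounds(estimate=12500, moe=340)
    print(low, high)                          # -> 12160 12840
    print(round(acs_standard_error(340), 1))  # -> 206.7
    ```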

  13. New Source Rock Data for the Niobrara and Sage Breaks intervals of the lower...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). New Source Rock Data for the Niobrara and Sage Breaks intervals of the lower Cody Shale in the Wyoming part of the Bighorn Basin [Dataset]. https://catalog.data.gov/dataset/new-source-rock-data-for-the-niobrara-and-sage-breaks-intervals-of-the-lower-cody-shale-in
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Wyoming, Bighorn Basin
    Description

    In 2019 the U.S. Geological Survey (USGS) quantitatively assessed the potential for undiscovered, technically recoverable continuous (unconventional) oil and gas resources in the Niobrara interval of the Cody Shale in the Bighorn Basin Province (Finn and others, 2019). Leading up to the assessment, in 2017, the USGS collected samples from the Niobrara and underlying Sage Breaks intervals (Finn, 2019) to better characterize the source rock potential of the Niobrara interval. Eighty-two samples from 31 wells were collected from the well cuttings collection stored at the USGS Core Research Center in Lakewood, Colorado. The selected wells are located near the outcrop belt along the shallow margins of the basin, to obtain samples that were not subjected to the effects of deep burial and subsequent organic carbon loss due to thermal maturation, as described by Daly and Edman (1987) (fig. 1). Sixty samples are from the Niobrara interval and 22 are from the Sage Breaks interval (fig. 2).

  14. Data for: Determination of fuel utilisation and recirculated gas composition...

    • data.mendeley.com
    • explore.openaire.eu
    Updated Nov 22, 2021
    Cite
    Pauli Koski (2021). Data for: Determination of fuel utilisation and recirculated gas composition in dead-ended PEMFC systems [Dataset]. http://doi.org/10.17632/zdz65vcjzc.1
    Explore at:
    Dataset updated
    Nov 22, 2021
    Authors
    Pauli Koski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Measurement data records from the operation of a 1 kW PEMFC short stack system (697175 samples, 50 ms interval) with three different purge cycles. The file also includes gas chromatograph data (92 samples, 375 s interval). Data are stored in MATLAB MAT-file format.

    The original article describing the measurements is available at: https://doi.org/10.1016/j.ijhydene.2020.04.252
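    A hedged sketch for working with such a MAT-file follows. The file path is hypothetical and the stored variable names must be discovered by inspecting the loaded dict; only the interval arithmetic uses figures stated in the description.

    ```python
    # Sketch: inspecting the MAT-file described above with SciPy. The path
    # is hypothetical; the interval arithmetic uses the stated sample
    # counts and intervals.
    from scipy.io import loadmat

    SAMPLE_INTERVAL_S = 0.05  # 50 ms measurement interval
    GC_INTERVAL_S = 375.0     # gas chromatograph sampling interval

    def list_variables(mat_path):
        """Return the MATLAB variable names stored in a MAT-file."""
        data = loadmat(mat_path)
        return [k for k in data if not k.startswith("__")]

    # Approximate time span implied by the stated sample counts
    stack_hours = 697175 * SAMPLE_INTERVAL_S / 3600
    gc_hours = 92 * GC_INTERVAL_S / 3600
    print(f"stack: ~{stack_hours:.1f} h, GC: ~{gc_hours:.1f} h")
    ```

    The two instruments span a consistent roughly 9.6 to 9.7 hours, which is a quick sanity check on the stated counts and intervals.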

  15. Preventive to Predictive Maintenance

    • kaggle.com
    Updated Jun 13, 2024
    Cite
    Prognostics @ HSE (2024). Preventive to Predictive Maintenance [Dataset]. http://doi.org/10.34740/kaggle/dsv/8684322
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Prognostics @ HSE
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context: This data set originates from a practice-relevant degradation process that is representative of Prognostics and Health Management (PHM) applications. The observed degradation process is the clogging of filters when separating solid particles from gas. A test bench is used for this purpose, which performs automated life testing of filter media by loading them. For testing, dust complying with ISO standard 12103-1 and with a known particle size distribution is employed. The employed filter media are made of randomly oriented non-woven fibre material. Further data sets are generated for various practice-relevant data situations that do not correspond to the ideal conditions of full data coverage; these data sets are uploaded to Kaggle by the user "Prognostics @ HSE" in a continuous process. To avoid carryover between two data sets, a different configuration of the filter tests is used for each uploaded practice-relevant data situation, for example by selecting a different filter medium.

    Detailed specification: For more information about the general operation and the components used, see the provided description file Preventive to Predicitve Maintenance dataset.pdf

    Given data situation: The data set Preventive to Predictive Maintenance is about the transition from a preventive maintenance strategy to a predictive maintenance strategy for a replaceable part, in this case a filter. To aid the realisation of predictive maintenance, life cycles have already been recorded from the application studied. However, the preventive maintenance in place so far causes parts to be replaced after a fixed period of time, regardless of the condition of the degrading part. As a result, the end of life is not known for most records, and they are thus right-censored. The training data are therefore recorded runs of the filter up to a periodic replacement interval. When specifying the interval length for preventive maintenance, a trade-off has to be made between wasted life and the frequency of unplanned downtimes that occur when a life is particularly short. The interval here is chosen so that, on average, failure is observed in the shortest 10% of the filter lives in the training data; the other lives are censored. Filter failure occurs when the differential pressure across the filter exceeds 600 Pa. The maintenance interval length depends on the amount of dust fed in per unit time, which is constant within a test run. For example, at twice the dust feed, the maintenance interval is half as long. The same relationship applies to the respective censoring time, which scales inversely with the particle feed. The variations between lifetimes are therefore primarily due to the type of dust, the flow rate, and manufacturing tolerances. The filter medium CC 600 G was used exclusively for the measurement samples included in this data set.

    Task: The objective of the data set is to precisely predict the remaining useful life (RUL) of the filter for the given test data, so that a transition to predictive maintenance is made possible. For this purpose, the dataset contains training and test data, each consisting of 50 life tests. The test data contain randomly right-censored run-to-failure measurements and the respective RUL as ground truth for the prediction task. The main challenge is how to make the most of the right-censored life data within the training data. Due to the detailed description of the setup and the various physical filter models described in the literature, it is possible to support the data-driven models by integrating physical knowledge or physical models, in the sense of theory-guided data science or informed machine learning (various names are common).
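    The failure definition and censoring scheme described above can be sketched as follows. Only the 600 Pa threshold comes from the description; the run data, helper name, and sampling granularity are invented for illustration.

    ```python
    # Sketch: labelling life tests for survival-style RUL modelling. A run
    # that reaches the 600 Pa differential-pressure threshold is an observed
    # failure; a run cut short by preventive replacement is right-censored.
    # The readings below are invented.
    FAILURE_DP_PA = 600.0

    def label_run(dp_readings):
        """Return (duration_index, event_observed) for one life test."""
        for i, dp in enumerate(dp_readings):
            if dp >= FAILURE_DP_PA:
                return i, True               # failure observed at sample i
        return len(dp_readings) - 1, False   # censored at last recorded sample

    runs = [
        [200.0, 350.0, 520.0, 610.0],  # crosses 600 Pa -> observed failure
        [180.0, 300.0, 450.0, 560.0],  # replaced before failure -> censored
    ]
    print([label_run(r) for r in runs])  # -> [(3, True), (3, False)]
    ```

    Pairs of (duration, event) in this form are the standard input to survival-analysis and censoring-aware RUL methods.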

    Acknowledgement: Thanks go to Marc Hönig (Scientific Employee), Marcel Braig (Scientific Employee) and Christopher Rein (Research Assistant) for contributing to the recording of these life tests.

    Data set Creator: Hochschule Esslingen - University of Applied Sciences Research Department Reliability Engineering and Prognostics and Health Management Robert-Bosch-Straße 1 73037 Göppingen Germany

    Dataset Citation: Hagmeyer, S., Mauthe, F., & Zeiler, P. (2021). Creation of Publicly Available Data Sets for Prognostics and Diagnostics Addressing Data Scenarios Relevant to Industrial Applications. International Journal of Prognostics and Health Management, Volume 12, Issue 2, DOI: 10.36001/ijphm.2021.v12i2.3087

  16. GNPS - Postmortem interval prediction using metabolomics - Skin samples...

    • data.niaid.nih.gov
    Updated Sep 13, 2019
    Cite
    Pieter Dorrestein (2019). GNPS - Postmortem interval prediction using metabolomics - Skin samples during decomposition [Dataset]. https://data.niaid.nih.gov/resources?id=msv000084322
    Explore at:
    Dataset updated
    Sep 13, 2019
    Dataset authored and provided by
    Pieter Dorrestein
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Metabolomics
    Description

    LC-MS/MS data were collected from skin samples taken across four seasons during the decomposition of a total of 36 donor bodies, to generate per-sample untargeted metabolomics profiles.

  17. Counts of Pneumonia reported in UNITED STATES OF AMERICA: 1912-1951

    • data.niaid.nih.gov
    Updated Jun 3, 2024
    Cite
    Burke, Donald (2024). Counts of Pneumonia reported in UNITED STATES OF AMERICA: 1912-1951 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452320
    Explore at:
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    Cross, Anne
    Van Panhuis, Willem
    Burke, Donald
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically national or international health authorities such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources; for restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source; no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability, and has formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and a specific country (e.g. the United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which the case counts were extracted. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", "US measles cases reported by WHO", or "US measles cases that originated abroad". Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

    • Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete due to incompleteness of source documents), so users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
    • Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-interval". Cumulative case count time series consist of overlapping case count intervals that start on the same date but end on different dates; for example, each interval in a cumulative count series can start on January 1st but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and have identical length (day, week, month, year). Given the different nature of these two types of case count data, this is indicated with an attribute for each count value, named "PartOfCumulativeCountSeries".
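    The two recommended pre-processing steps can be sketched with pandas. The column names follow Project Tycho conventions (the description itself names "PartOfCumulativeCountSeries"); the dates and counts are invented, so verify the exact schema against the downloaded file.

    ```python
    # Sketch of the two recommended pre-processing steps using pandas.
    # Column names follow Project Tycho conventions; values are invented.
    import pandas as pd

    df = pd.DataFrame({
        "PeriodStartDate": ["1947-01-05", "1947-01-12", "1947-01-26"],
        "CountValue": [12, 9, 15],
        "PartOfCumulativeCountSeries": [0, 0, 0],
    })
    df["PeriodStartDate"] = pd.to_datetime(df["PeriodStartDate"])

    # Separate cumulative from non-cumulative: keep fixed-interval series only
    fixed = df[df["PartOfCumulativeCountSeries"] == 0]

    # Analyze missing data: reinsert omitted time intervals (weekly here) as
    # NaN, since Tycho omits intervals for which no count was reported
    full_index = pd.date_range("1947-01-05", "1947-01-26", freq="7D")
    weekly = fixed.set_index("PeriodStartDate")["CountValue"].reindex(full_index)
    print(int(weekly.isna().sum()))  # -> 1 (the week of 1947-01-19 is missing)
    ```

    Reindexing makes the gap explicit rather than silently absent, which matters for any time-series model fitted to the counts.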

  18. Counts of Murine typhus reported in UNITED STATES OF AMERICA: 1945-1961

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 3, 2024
    Cite
    Burke, Donald (2024). Counts of Murine typhus reported in UNITED STATES OF AMERICA: 1945-1961 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452349
    Explore at:
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    Cross, Anne
    Van Panhuis, Willem
    Burke, Donald
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically national or international health authorities such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources; for restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source; no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability, and has formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and a specific country (e.g. the United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which the case counts were extracted. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", "US measles cases reported by WHO", or "US measles cases that originated abroad". Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

    • Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete due to incompleteness of source documents), so users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
    • Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-interval". Cumulative case count time series consist of overlapping case count intervals that start on the same date but end on different dates; for example, each interval in a cumulative count series can start on January 1st but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and have identical length (day, week, month, year). Given the different nature of these two types of case count data, this is indicated with an attribute for each count value, named "PartOfCumulativeCountSeries".

  19. ACS Household Size Variables - Boundaries

    • hub.arcgis.com
    • vaccine-confidence-program-cdcvax.hub.arcgis.com
    Updated Nov 17, 2020
    Cite
    Esri (2020). ACS Household Size Variables - Boundaries [Dataset]. https://hub.arcgis.com/maps/388cebd5976e49faa77af91a5d73dfee
    Explore at:
    Dataset updated
    Nov 17, 2020
    Dataset authored and provided by
    Esri (http://esri.com/)
    Area covered
    Description

    This layer shows household size by tenure (owner or renter). This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the average household size. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

    Current Vintage: 2019-2023
    ACS Table(s): B25009, B25010, B19019
    Data downloaded from: Census Bureau's API for American Community Survey
    Date of API call: December 12, 2024
    National Figures: data.census.gov
    The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.

    Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates and is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases. Boundaries come from the US Census TIGER geodatabases, specifically the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers that are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles; these are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records: all US states, Washington D.C., and Puerto Rico. Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99). Percentages and derived counts, and associated margins of error, are calculated values (identifiable by the "_calc_" stub in the field name) and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Negative values (e.g., -4444...) have been set to null, with the exception of -5555..., which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
    • The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error, and thus the margin of error. A statistical test is not appropriate.
    • Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest or upper interval of an open-ended distribution.
    • The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
    • The estimate is controlled. A statistical test for sampling variability is not appropriate.
    • The data for this geographic area cannot be displayed because the number of sample cases is too small.
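    For the calculated "_calc_" percentage fields mentioned above, the margin of error of a derived proportion p = X/Y can be sketched following the ACS handbook's approximation. This is an illustrative sketch with invented numbers, not the service's actual implementation.

    ```python
    # Sketch: approximate MOE of a derived proportion p = X / Y, following
    # the ACS handbook formula. When the term under the square root is
    # negative, the ratio form (plus sign) is used instead. Invented numbers.
    from math import sqrt

    def moe_of_proportion(x, moe_x, y, moe_y):
        p = x / y
        radicand = moe_x**2 - (p**2) * moe_y**2
        if radicand < 0:                      # fall back to the ratio formula
            radicand = moe_x**2 + (p**2) * moe_y**2
        return sqrt(radicand) / y

    moe = moe_of_proportion(x=300, moe_x=50, y=1200, moe_y=80)
    print(round(100 * moe, 2))  # MOE of the percentage, in points -> 3.82
    ```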

  20. Counts of Dysentery reported in UNITED STATES OF AMERICA: 1942-1948

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 3, 2024
    Cite
    Cross, Anne (2024). Counts of Dysentery reported in UNITED STATES OF AMERICA: 1942-1948 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452249
    Explore at:
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    Cross, Anne
    Van Panhuis, Willem
    Burke, Donald
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically national or international health authorities such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources; for restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source; no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability, and has formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and a specific country (e.g. the United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which the case counts were extracted. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", "US measles cases reported by WHO", or "US measles cases that originated abroad". Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

    • Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete due to incompleteness of source documents), so users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
    • Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-interval". Cumulative case count time series consist of overlapping case count intervals that start on the same date but end on different dates; for example, each interval in a cumulative count series can start on January 1st but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and have identical length (day, week, month, year). Given the different nature of these two types of case count data, this is indicated with an attribute for each count value, named "PartOfCumulativeCountSeries".
