Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Examples demonstrating how confidence intervals change depending on the level of confidence (90% versus 95% versus 99%) and on the size of the sample (CI for n=20 versus n=10 versus n=2). Developed for BIO211 (Statistics and Data Analysis: A Conceptual Approach) at Stony Brook University in Fall 2015.
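A minimal Python sketch (not from the BIO211 materials; the simulated data and parameters are illustrative assumptions) reproduces both effects with t-based intervals:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
for n in (20, 10, 2):
    sample = rng.normal(loc=50, scale=10, size=n)  # simulated sample of size n
    mean, sem = sample.mean(), stats.sem(sample)
    for conf in (0.90, 0.95, 0.99):
        # the t-based CI widens as confidence rises and as n shrinks
        lo, hi = stats.t.interval(conf, df=n - 1, loc=mean, scale=sem)
        print(f"n={n:2d}, {conf:.0%} CI: ({lo:7.2f}, {hi:7.2f}), width {hi - lo:7.2f}")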
This table shows overall ATCEMS response interval performance for entire fiscal years. Data in the table is broken out by incident response priority and service area (City of Austin or Travis County).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introductory statistical inference texts and courses treat the point estimation, hypothesis testing, and interval estimation problems separately, with primary emphasis on large-sample approximations. Here, I present an alternative approach to teaching this course, built around p-values, emphasizing provably valid inference for all sample sizes. Details about computation and marginalization are also provided, with several illustrative examples, along with a course outline. Supplementary materials for this article are available online.
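As a hedged illustration of the contrast (not an example from the article): an exact binomial p-value is valid at every sample size, while the usual normal approximation is only asymptotic:

from math import sqrt
from scipy import stats

n, k, p0 = 12, 10, 0.5  # small sample: 10 successes in 12 trials
exact = stats.binomtest(k, n, p0, alternative="greater").pvalue
z = (k - n * p0) / sqrt(n * p0 * (1 - p0))  # large-sample z statistic
approx = stats.norm.sf(z)
print(f"exact p-value:        {exact:.4f}")
print(f"normal approximation: {approx:.4f}")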
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises comprehensive information from ranked matches played in the game League of Legends, spanning the time frame between January 12, 2023, and May 18, 2023. The matches cover a wide range of skill levels, specifically from the Iron tier to the Diamond tier.
The dataset is structured based on time intervals, presenting game data at various percentages of elapsed game time, including 20%, 40%, 60%, 80%, and 100%. For each interval, detailed match statistics, player performance metrics, objective control, gold distribution, and other vital in-game information are provided.
This collection of data not only offers insights into how matches evolve and strategies change over different phases of the game but also enables the exploration of player behavior and decision-making as matches progress. Researchers and analysts in the field of esports and game analytics will find this dataset valuable for studying trends, developing predictive models, and gaining a deeper understanding of the dynamics within ranked League of Legends matches across different skill tiers.
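A hedged loading sketch; the file and column names below are assumptions, since the actual schema is not given here:

import pandas as pd

df = pd.read_csv("lol_ranked_matches.csv")  # hypothetical filename
# assumed columns: match_id, interval_pct (20/40/60/80/100), total_gold
print(df.groupby("interval_pct")["total_gold"].mean())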
This table contains data describing ATCEMS performance in delivering patients with time-sensitive conditions (aka “Alert Patients”) to receiving facilities in a timely manner. The call-to-door interval begins when the first 911 call for an incident is answered in the Communications Center, and ends when the patient is recorded in CAD as arriving at a receiving facility.
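A hedged sketch of computing that interval from CAD timestamps; the file and column names are assumptions:

import pandas as pd

cad = pd.read_csv("atcems_alert_patients.csv",  # hypothetical filename
                  parse_dates=["call_answered", "facility_arrival"])
# call-to-door: first 911 call answered -> recorded arrival at facility
cad["call_to_door_min"] = (cad["facility_arrival"] - cad["call_answered"]).dt.total_seconds() / 60
print(cad["call_to_door_min"].describe())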
This dataset was created by Md Mahmud Ferdous.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data set behind the Wind Generation Interactive Query Tool created by the CEC. The visualization tool interactively displays wind generation over different time intervals in three-dimensional space. The viewer can look across the state to understand generation patterns of regions with concentrations of wind power plants. The tool aids in understanding high and low periods of generation. Operation of the electric grid requires that generation and demand are balanced in each period.
Renewable energy resources like wind facilities vary in size and geographic distribution within each state. Resource planning, land use constraints, climate zones, and weather patterns limit availability of these resources and where they can be developed. National, state, and local policies also set limits on energy generation and use. An example of resource planning in California is the Desert Renewable Energy Conservation Plan.
By exploring the visualization, a viewer can gain a three-dimensional understanding of temporal variation in generation capacity factors (CFs), along with how the wind generation areas compare to one another. The viewer can observe that areas peak in generation in different periods. The large range in CFs is also visible.
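For context, a capacity factor is actual generation divided by the maximum possible generation over the same period; a small worked example with assumed numbers:

nameplate_mw = 100.0     # plant capacity (assumed)
energy_mwh = 262_800.0   # generation over one year (assumed)
hours = 365 * 24
print(f"capacity factor: {energy_mwh / (nameplate_mw * hours):.1%}")  # 30.0%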
Python scripts and a Python+Qt graphical user interface for calculating Feldman-Cousins confidence intervals for low-count Poisson processes in the presence of a known background, and for Gaussian processes with a physical lower limit of 0.
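A simplified sketch of the Feldman-Cousins construction for the Poisson case (not the repository's code; the grid bounds and step size are arbitrary choices):

import numpy as np
from scipy.stats import poisson

def fc_interval(n_obs, b, cl=0.90, mu_max=15.0, n_max=50, step=0.005):
    """Scan candidate signals mu; keep those whose acceptance region contains n_obs."""
    lo = hi = None
    for mu in np.arange(0.0, mu_max + step, step):
        ns = np.arange(n_max + 1)
        p = poisson.pmf(ns, mu + b)
        mu_best = np.maximum(ns - b, 0.0)     # best-fit signal for each n
        r = p / poisson.pmf(ns, mu_best + b)  # likelihood-ratio ordering
        accepted, total = set(), 0.0
        for n in np.argsort(-r):              # add n values in decreasing r
            accepted.add(int(n))
            total += p[n]
            if total >= cl:
                break
        if n_obs in accepted:
            lo = mu if lo is None else lo
            hi = mu
    return lo, hi

print(fc_interval(n_obs=3, b=3.0))  # compare against Feldman & Cousins (1998)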
CC0 1.0 (Public Domain): https://creativecommons.org/publicdomain/zero/1.0/
This dataset presents a high-resolution tracking log of the International Space Station (ISS), captured every 10 seconds over a continuous 24-hour period on June 7, 2025. It contains 8,641 data points, each representing the ISS’s exact location and motion as it orbits the Earth approximately every 90 minutes. Each record includes a timestamp, latitude, longitude, altitude (in kilometers), orbital speed (in km/h), the hemisphere in which the station was located, and the geographical region or body of water it was passing over. The data has been enriched with geolocation insights to help identify where the ISS was positioned above the Earth. This dataset is ideal for those interested in space exploration, orbital mechanics, geospatial analysis, educational demonstrations, or real-time data visualization. Whether you're a student, data scientist, or space enthusiast, this rich time-series dataset offers a valuable glimpse into the motion of one of humanity’s most iconic space assets.
Key Highlights:
- 8,641 entries captured at 10-second intervals (1 full day)
- Tracks latitude, longitude, altitude, and speed of the ISS
- Includes hemisphere and region metadata for context
- Suitable for geospatial visualization, orbital simulation, and data science
- Based on publicly available ISS tracking sources
- Released under CC0 (Public Domain) for unrestricted use
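A hedged geospatial sketch; the file and column names are assumptions. It recovers the station's ground speed from consecutive fixes with the haversine formula:

import numpy as np
import pandas as pd

df = pd.read_csv("iss_tracking.csv")  # hypothetical filename
lat = np.radians(df["latitude"].to_numpy())
lon = np.radians(df["longitude"].to_numpy())
dlat, dlon = np.diff(lat), np.diff(lon)
a = np.sin(dlat / 2) ** 2 + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2
dist_km = 2 * 6371.0 * np.arcsin(np.sqrt(a))  # great-circle distance per 10 s fix
print(f"median ground speed: {np.median(dist_km) * 360:.0f} km/h")  # 10 s -> 1 h is x360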
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.
The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. First, the point estimates from the reference analyses are centred at zero and their confidence intervals scaled to span a range of one. The point estimates and confidence intervals from matching comparator analyses are then adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the differences in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) data sets with widely varying characteristics, while the second assesses data extraction accuracy, comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs in the accompanying manuscripts.
In the banksia plot of statistical method comparison, it was clear that there was no difference, on average, in point estimates and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data it was clear that both the point estimates and confidence intervals were all very similar among data extractors and original data.
The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.
This collection of files allows the user to create the images used in the companion paper and to amend this code to create their own banksia plots using either Stata version 17 or R version 4.3.1.
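The companion code is in Stata and R; as a language-neutral illustration of the centring and scaling step, here is a hedged Python sketch with made-up numbers:

import pandas as pd

df = pd.DataFrame({
    "dataset": ["A", "A", "B", "B"],
    "analysis": ["reference", "comparator"] * 2,
    "estimate": [1.20, 1.35, -0.40, -0.10],
    "ci_low": [0.80, 0.95, -0.90, -0.70],
    "ci_high": [1.60, 1.75, 0.10, 0.50],
})
out = []
for name, g in df.groupby("dataset"):
    ref = g.loc[g["analysis"] == "reference"].iloc[0]
    shift = ref["estimate"]                 # centre the reference estimate at zero
    scale = ref["ci_high"] - ref["ci_low"]  # scale its CI to span one
    g = g.copy()
    for col in ("estimate", "ci_low", "ci_high"):
        g[col] = (g[col] - shift) / scale   # same shift and scale for the comparator
    out.append(g)
print(pd.concat(out))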
Data sets (R-wave to R-wave interval) used in the study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 2 rows and is filtered to the book Survival analysis with interval-censored data: a practical approach with R, SAS and WinBUGS. It features 7 columns, including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The described database was created using data obtained from the California Independent System Operator (CAISO) and the National Renewable Energy Laboratory (NREL). All data were collected at five-minute intervals, then cleaned and modified to create a database comprising three time series: solar energy production, wind energy production, and electricity demand. The database contains 12 columns: date, season (1: Winter, 2: Spring, 3: Summer, 4: Autumn), day of the week (0: Monday, ..., 6: Sunday), DHI (W/m2), DNI (W/m2), GHI (W/m2), wind speed (m/s), humidity (%), temperature (degrees), solar energy production (MW), wind energy production (MW), and electricity demand (MW).
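A hedged loading sketch; the file and column names are assumptions about how the 12 columns might be labelled:

import pandas as pd

db = pd.read_csv("caiso_nrel_5min.csv", parse_dates=["date"])  # hypothetical filename
# resample the five-minute series to hourly means for the three targets
hourly = db.set_index("date")[["solar_mw", "wind_mw", "demand_mw"]].resample("1h").mean()
print(hourly.head())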
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Includes accelerometer data collected with an ActiGraph to assess usual sedentary, moderate, vigorous, and very vigorous activity at baseline, 6 weeks, and 10 weeks. Includes relative reinforcing value (RRV) data showing how participants rated how much they would want to perform both physical and sedentary activities on a scale of 1-10 at baseline, week 6, and week 10. Includes data on the breakpoint, or Pmax, of the RRV, which was the last schedule of reinforcement (i.e. 4, 8, 16, …) completed for the behavior (exercise or sedentary). For both Pmax and RRV score, greater scores indicated a greater reinforcing value, with scores exceeding 1.0 indicating increased exercise reinforcement. Includes questionnaire data regarding preference and tolerance for exercise intensity using the Preference for and Tolerance of the Intensity of Exercise Questionnaire (PRETIE-Q) and positive and negative outcome expectancy of exercise using the Outcome Expectancy Scale (OES). Includes data on height, weight, and BMI. Includes demographic data such as gender and race/ethnicity.
Resources in this dataset:
Resource Title: Actigraph activity data. File Name: AGData.csv. Resource Description: Includes data from the Actigraph accelerometer for each participant at baseline, 6 weeks, and 10 weeks.
Resource Title: RRV Data. File Name: RRVData.csv. Resource Description: Includes data from the RRV task at baseline, 6 weeks, and 10 weeks, OES survey data, PRETIE-Q survey data, and demographic data (gender, weight, height, race, ethnicity, and age).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the rapid development of data acquisition and storage, massive datasets with large sample sizes are emerging increasingly often, making more advanced statistical tools urgently needed. To accommodate such big volumes in the analysis, a variety of methods have been proposed for complete or right-censored survival data. However, existing big data methodology has not attended to interval-censored outcomes, which are ubiquitous in cross-sectional or periodic follow-up studies. In this work, we propose an easily implemented divide-and-combine approach for analyzing massive interval-censored survival data under the additive hazards model. We establish the asymptotic properties of the proposed estimator, including consistency and asymptotic normality. In addition, the divide-and-combine estimator is shown to be asymptotically equivalent to the full-data-based estimator obtained from analyzing all data together. Simulation studies suggest that, relative to the full-data-based approach, the proposed divide-and-combine approach has a desirable advantage in terms of computation time, making it more applicable to large-scale data analysis. An application to a set of interval-censored data also demonstrates the practical utility of the proposed method.
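A toy divide-and-combine sketch (using a simple mean with inverse-variance weighting, not the paper's additive-hazards estimator) shows the pattern:

import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=1_000_000)

blocks = np.array_split(data, 20)                       # divide
ests = np.array([b.mean() for b in blocks])             # estimate per block
variances = np.array([b.var(ddof=1) / len(b) for b in blocks])
w = 1.0 / variances
combined = np.sum(w * ests) / np.sum(w)                 # combine
print(f"divide-and-combine: {combined:.5f}")
print(f"full data:          {data.mean():.5f}")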
This file contains data on the number and types of vehicles that entered at each entry point on the tolled section of the Thruway, along with their exit points.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For 5 published data sets, the 95% confidence intervals on α and ρ were calculated with the two models, Dirac and exponential.
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The Cardiac Arrhythmia Suppression Trial (CAST) was a landmark NHLBI-sponsored study designed to test the hypothesis that the suppression of asymptomatic or mildly symptomatic ventricular premature complexes (PVCs) in survivors of myocardial infarction (MI) would decrease the number of deaths from ventricular arrhythmias and improve survival. Enrollment required an acute MI within the preceding 2 years and 6 or more PVCs per hour during a pre-treatment (qualifying) long-term ECG (Holter) recording. Those subjects enrolled within 90 days of the index MI were required to have left ventricular ejection fractions less than or equal to 55%, while those enrolled after this 90 day window were required to have an ejection fraction less than or equal to 40%. CAST enrolled 3,549 patients in all.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
We can assess the overall performance of a regression model that produces prediction intervals by using the mean Winkler Interval score [1,2,3], which, for an individual interval, is given by:
\[ W_\alpha(l, u, y) = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{if } y < l \\ (u - l) & \text{if } l \le y \le u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{if } y > u \end{cases} \]
where \(y\) is the true value, \(u\) is the upper prediction interval, \(l\) is the lower prediction interval, and \(\alpha\) is (1 - coverage). For example, for 90% coverage, \(\alpha = 0.1\). Note that the Winkler Interval score constitutes a proper scoring rule [2,3].
Attach this dataset to a notebook, then:
import sys
sys.path.append('/kaggle/input/winkler-interval-score-metric/')  # make the metric module importable

import MWIS_metric
help(MWIS_metric.score)

# predictions holds columns "y_true", "lower", and "upper"; alpha = 1 - coverage
MWIS, coverage = MWIS_metric.score(predictions["y_true"], predictions["lower"], predictions["upper"], alpha)
print(f"Local MWI score: {MWIS:.3f}")
print(f"Predictions coverage: {coverage * 100:.1f}%")
These data show the results of four tests, one pretest and three posttests, and consist of three variables; each task was performed three times (three trials). The variables are: the movement times, i.e. the time it took to perform three different functional tasks; the duration of the maximal hand opening during one of these tasks; and the deviation of the grip force control in a task where a handle needed to be grasped with the correct amount of force.