100+ datasets found
  1. Data from: A Customizable Inquiry-Based Statistics Teaching Application for...

    • qubeshub.org
    Updated Apr 5, 2024
    Cite
    Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg (2024). A Customizable Inquiry-Based Statistics Teaching Application for Introductory Biology Students [Dataset]. https://qubeshub.org/publications/4651/?v=1
    Dataset updated
    Apr 5, 2024
    Dataset provided by
    QUBES
    Authors
    Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg
    Description

    Building strong quantitative skills prepares undergraduate biology students for successful careers in science and medicine. While math and statistics anxiety can negatively impact student learning within biology classrooms, instructors may reduce this anxiety by steadily building student competency in quantitative reasoning through instructional scaffolding, application-based approaches, and simple computer program interfaces. However, few statistical programs exist that meet all needs of an inclusive, inquiry-based laboratory course. These needs include an open-source program, a simple interface, little required background knowledge in statistics for student users, and customizability to minimize cognitive load, align with course learning outcomes, and create desirable difficulty. To address these needs, we used the Shiny package in R to develop a custom statistical analysis application. Our “BioStats” app provides students with scaffolded learning experiences in applied statistics that promote student agency and are customizable by the instructor. It introduces students to the strengths of the R interface, while eliminating the need for complex coding in the R programming language. It also prioritizes practical implementation of statistical analyses over learning statistical theory. To our knowledge, this is the first statistics teaching tool that presents students with basic statistics initially and more complex analyses as they advance, and that includes an option to learn R statistical coding. The BioStats app interface yields a simplified introduction to applied statistics that is adaptable to many biology laboratory courses.

    Primary Image: Singing Junco. A sketch of a junco singing on a pine tree branch, created by the lead author of this paper.

  2. Statistical Data Analysis using R

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Samuel Barsanelli Costa (2023). Statistical Data Analysis using R [Dataset]. http://doi.org/10.6084/m9.figshare.5501035.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Samuel Barsanelli Costa
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    R scripts containing statistical data analysis for streamflow and sediment data, including Flow Duration Curves, Double Mass Analysis, Nonlinear Regression Analysis for Suspended Sediment Rating Curves, and Stationarity Tests, along with several plots.
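
As an illustration of one technique named above, a flow duration curve can be sketched in a few lines. This is not the dataset's R code, which is not reproduced on this page; it is a minimal Python sketch that assumes the common Weibull plotting position, P = rank / (n + 1), and uses made-up flow values. The function name `flow_duration_curve` is hypothetical.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in percent, flow) pairs, highest flow first."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Weibull plotting position (an assumption): P = rank / (n + 1)
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


# Hypothetical daily streamflow values (m^3/s), for illustration only
daily_flows = [12.0, 3.5, 7.2, 20.1, 5.0, 9.8, 4.1, 15.3, 6.6]
fdc = flow_duration_curve(daily_flows)

for prob, q in fdc:
    print(f"{prob:5.1f}%  {q:6.1f}")
```

Each pair gives the percentage of time a flow is equalled or exceeded; plotting flow against that percentage gives the curve.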

  3. Students Data Analysis

    • kaggle.com
    zip
    Updated Jul 20, 2022
    Cite
    MOMONO (2022). Students Data Analysis [Dataset]. https://www.kaggle.com/datasets/erqizhou/students-data-analysis
    Available download formats: zip (2174 bytes)
    Dataset updated
    Jul 20, 2022
    Authors
    MOMONO
    Description

    A small excerpt from a real dataset, with minor changes to protect students' private information. Permission to share has been given.

    Goals

    You are going to help teachers using only the data:
    1. Prediction: Identify what makes a strong student who can successfully apply to graduate school, whether abroad or not.
    2. Application: Help those whose graduate school applications fail with advice on job searching.

    Tips

    1. Educational data may have subtle structure; hierarchies and heterogeneity are probably involved. Simple regressions are unlikely to be enough. Also keep an eye on collinearity among indicators collected by teachers who have already forgotten their statistics.
    2. Not all students are free to choose whether to apply to graduate school, and some were born with privileges.
    3. Some students have been trying (or are planning to try) to get into graduate school for years; you are responsible for giving advice that is accurate for their circumstances.

    About the Data

    Some of the original structure has been deleted or censored. What remains:

    Basic data:
    - ID
    - class: categorical. Students were initially divided into 2 classes, and teachers suspect that students in different classes may perform significantly differently.
    - gender
    - race: categorical and censored
    - GPA: real numbers, float

    Some teachers assume that scores in math courses perfectly represent a student's likelihood of success:
    - Algebra: real numbers, Advanced Algebra
    - ......

    Some assume that students' backgrounds significantly affect their choices and likelihood of success. These fields are all censored:
    - from1: students' home locations
    - from2: a probably bad indicator of preference for mathematics
    - from3: how students applied to this university (undergraduate)
    - from4: a probably bad indicator of family background; 0 indicates more wealth, 4 more poverty

    The final indicator y:
    - 0: the graduate school application failed; the student may apply again or search for a job in the future
    - 1: success, inland
    - 2: success, abroad

  4. Basic statistical analysis of Team Work Survey

    • figshare.com
    pdf
    Updated Jan 20, 2016
    Cite
    Rajiv vaid Basaiawmoit; Eszter, Somos; Ervin, Szalai; Kata, Szabo; Taru, Deva (2016). Basic statistical analysis of Team Work Survey [Dataset]. http://doi.org/10.6084/m9.figshare.1494849.v1
    Available download formats: pdf
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Rajiv vaid Basaiawmoit; Eszter, Somos; Ervin, Szalai; Kata, Szabo; Taru, Deva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aggregated responses to the 9-point post-course teamwork questionnaire that was sent to the students in this gamification pilot study. The dataset is divided into two groups: game-based teams and instructor-section based teams. A Mann-Whitney U test was performed on the dataset.
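
For readers unfamiliar with the test mentioned above, the U statistic can be sketched as follows. This is a minimal pure-Python illustration, not the authors' analysis: the function name `mann_whitney_u` and the two samples are hypothetical, ties are handled with midranks, and no p-value is computed.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    combined = sorted(
        [(v, "a") for v in sample_a] + [(v, "b") for v in sample_b],
        key=lambda pair: pair[0],
    )
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, combined) if group == "a")
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical 9-point questionnaire scores for the two groups
game_based = [7, 8, 9, 6, 8]
instructor_based = [5, 6, 7, 4, 6]
u_game, u_instructor = mann_whitney_u(game_based, instructor_based)
print(u_game, u_instructor)
```

The smaller of the two U values is usually the one compared against a critical value or converted to a p-value.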

  5. Data from: Statistical analysis plan for the Balanced Solution versus Saline...

    • datasetcatalog.nlm.nih.gov
    Updated Mar 25, 2021
    Cite
    Lacerda, Fabio Holanda; Neto, Ary Serpa; Lovato, Wilson José; Grion, Cintia Magalhães Carvalho; Cavalcanti, Alexandre Biasi; Júnior, Lúcio Couto Oliveira; Damiani, Lucas Petri; Amêndola, Cristina Prata; Maia, Israel Silva; Zampieri, Fernando Godinho; de Freitas, Flávio Geraldo Rezende; Veiga, Viviane Cordeiro; Figueiredo, Rodrigo Cruvinel; Guedes, Marco Antonio Vieira; Miranda, Tamiris Abait; da Rocha Paranhos, Jorge Luiz; Biondi, Rodrigo Santos; de Azevedo Lúcio, Eraldo; Lisboa, Thiago Costa; Machado, Flavia Ribeiro (2021). Statistical analysis plan for the Balanced Solution versus Saline in Intensive Care Study (BaSICS) [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000861642
    Dataset updated
    Mar 25, 2021
    Authors
    Lacerda, Fabio Holanda; Neto, Ary Serpa; Lovato, Wilson José; Grion, Cintia Magalhães Carvalho; Cavalcanti, Alexandre Biasi; Júnior, Lúcio Couto Oliveira; Damiani, Lucas Petri; Amêndola, Cristina Prata; Maia, Israel Silva; Zampieri, Fernando Godinho; de Freitas, Flávio Geraldo Rezende; Veiga, Viviane Cordeiro; Figueiredo, Rodrigo Cruvinel; Guedes, Marco Antonio Vieira; Miranda, Tamiris Abait; da Rocha Paranhos, Jorge Luiz; Biondi, Rodrigo Santos; de Azevedo Lúcio, Eraldo; Lisboa, Thiago Costa; Machado, Flavia Ribeiro
    Description

    Abstract Objective: To report the statistical analysis plan (first version) for the Balanced Solutions versus Saline in Intensive Care Study (BaSICS). Methods: BaSICS is a multicenter factorial randomized controlled trial that will assess the effects of Plasma-Lyte 148 versus 0.9% saline as the fluid of choice in critically ill patients, as well as the effects of a slow (333mL/h) versus rapid (999mL/h) infusion speed during fluid challenges, on important patient outcomes. The fluid type will be blinded for investigators, patients and the analyses. No blinding will be possible for the infusion speed for the investigators, but all analyses will be kept blinded during the analysis procedure. Results: BaSICS will have 90-day mortality as its primary endpoint, which will be tested using mixed-effects Cox proportional hazard models, considering sites as a random variable (frailty models) adjusted for age, organ dysfunction and admission type. Important secondary endpoints include renal replacement therapy up to 90 days, acute renal failure, organ dysfunction at days 3 and 7, and mechanical ventilation-free days within 28 days. Conclusion: This manuscript provides details on the first version of the statistical analysis plan for the BaSICS trial and will guide the study’s analysis when follow-up is finished.

  6. Data from: Units of Analysis: The Basics

    • search.dataone.org
    Updated Dec 28, 2023
    Cite
    Chuck Humphrey (2023). Units of Analysis: The Basics [Dataset]. http://doi.org/10.5683/SP3/DLBDT5
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Chuck Humphrey
    Description

    One of the first steps in a reference interview is determining what it is the user really wants or needs. In many cases, the question comes down to the unit of analysis: what is it that is being investigated or researched? This presentation will take us through the concept of the unit of analysis so that we can improve our reference service and make our lives easier as a result! Note: This presentation precedes Working with Complex Surveys: Canadian Travel Survey by Chuck Humphrey (14-Mar-2002).

  7. Graphical and statistical analysis

    • data.mendeley.com
    Updated May 22, 2023
    Cite
    Ashwathi Prakash (2023). Graphical and statistical analysis [Dataset]. http://doi.org/10.17632/49kc5g6z25.1
    Dataset updated
    May 22, 2023
    Authors
    Ashwathi Prakash
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Graphical analysis of the toxicity testing and the potency of millet extracts in reversing tachycardic and bradycardic conditions. The results show significant changes and are supported by the statistical analysis (correlation analysis) performed using the basic functions of Microsoft Excel.

  8. Hydrologic Statistics and Data Analysis (M1)

    • beta.hydroshare.org
    • hydroshare.org
    • +2 more
    zip
    Updated Sep 10, 2021
    Cite
    Irene Garousi-Nejad; Belize Lane (2021). Hydrologic Statistics and Data Analysis (M1) [Dataset]. https://beta.hydroshare.org/resource/bd0b38fc5d1e4d5c895dc484ceeb2c2a/
    Available download formats: zip (45.7 KB)
    Dataset updated
    Sep 10, 2021
    Dataset provided by
    HydroShare
    Authors
    Irene Garousi-Nejad; Belize Lane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This resource contains a Jupyter Notebook that is used to introduce hydrologic data analysis and conservation laws. This resource is part of a HydroLearn Physical Hydrology learning module available at https://edx.hydrolearn.org/courses/course-v1:Utah_State_University+CEE6400+2019_Fall/about

    In this activity, the student learns how to (1) calculate the residence time of water in land and rivers for the global hydrologic cycle; (2) quantify the relative and absolute uncertainties in components of the water balance; (3) navigate public websites and databases, extract key watershed attributes, and perform basic hydrologic data analysis for a watershed of interest; (4) assess, compare, and interpret hydrologic trends in the context of a specific watershed.

    Please note that in problems 3-8, the user is asked to use an R package (i.e., dataRetrieval) and select a U.S. Geological Survey (USGS) streamflow gage to retrieve streamflow data and then apply the hydrological data analysis to the watershed of interest. We acknowledge that the material relies on USGS data that are only available within the U.S. If running for other watersheds of interest outside the U.S. or wishing to work with other datasets, the user must take some further steps and develop codes to prepare the streamflow dataset. Once a streamflow time series dataset is obtained for an international catchment of interest, the user would need to read that file into the workspace before working through subsequent analyses.

  9. students_perfomance_data

    • kaggle.com
    zip
    Updated Oct 29, 2025
    Cite
    Suhana Lodhi (2025). students_perfomance_data [Dataset]. https://www.kaggle.com/datasets/suhanalodhi/students-perfomance-data
    Available download formats: zip (558 bytes)
    Dataset updated
    Oct 29, 2025
    Authors
    Suhana Lodhi
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The “Students Performance Data” dataset provides academic and demographic information of students. It includes their marks in Maths, Science, and English along with attendance and city details. This dataset is ideal for beginners learning data entry, analysis, and visualization using tools like Excel or Kaggle Notebooks.

  10. Google Analytics data of an E-commerce Company

    • kaggle.com
    zip
    Updated Oct 19, 2024
    Cite
    fehu.zone (2024). Google Analytics data of an E-commerce Company [Dataset]. https://www.kaggle.com/datasets/fehu94/google-analytics-data-of-an-e-commerce-company
    Available download formats: zip (3156 bytes)
    Dataset updated
    Oct 19, 2024
    Authors
    fehu.zone
    Description

    📊 Dataset Title: Daily Active Users Dataset

    📝 Description

    This dataset provides detailed insights into daily active users (DAU) of a platform or service, captured over a defined period of time. The dataset includes information such as the number of active users per day, allowing data analysts and business intelligence teams to track usage trends, monitor platform engagement, and identify patterns in user activity over time.

    The data is ideal for performing time series analysis, statistical analysis, and trend forecasting. You can utilize this dataset to measure the success of platform initiatives, evaluate user behavior, or predict future trends in engagement. It is also suitable for training machine learning models that focus on user activity prediction or anomaly detection.

    📂 Dataset Structure

    The dataset is structured in a simple and easy-to-use format, containing the following columns:

    • Date: The date on which the data was recorded, formatted as YYYYMMDD.
    • Number of Active Users: The number of users who were active on the platform on the corresponding date.

    Each row in the dataset represents a unique date and its corresponding number of active users. This allows for time-based analysis, such as calculating the moving average of active users, detecting seasonality, or spotting sudden spikes or drops in engagement.
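
A minimal sketch of the time-based analysis described here, assuming the two columns are named exactly `Date` and `Number of Active Users` (the real CSV header may differ) and using a few made-up rows in place of the actual file:

```python
import pandas as pd

# Hypothetical sample rows; column names follow the description above
# and may not match the actual CSV header exactly.
data = pd.DataFrame({
    "Date": [20240101, 20240102, 20240103, 20240104,
             20240105, 20240106, 20240107, 20240108],
    "Number of Active Users": [100, 120, 110, 130, 90, 80, 140, 170],
})

# Parse YYYYMMDD values into timestamps and index the frame by date
data["Date"] = pd.to_datetime(data["Date"].astype(str), format="%Y%m%d")
data = data.set_index("Date").sort_index()

# 7-day moving average; the first six rows are NaN because the window is incomplete
data["dau_7d_avg"] = data["Number of Active Users"].rolling(window=7).mean()
print(data)
```

Parsing the dates first matters because YYYYMMDD values are otherwise read as plain integers, which breaks resampling and weekday-based grouping.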

    🧐 Key Use Cases

    This dataset can be used for a wide range of purposes, including:

    1. Time Series Analysis: Analyze trends and seasonality of user engagement.
    2. Trend Detection: Discover peaks and valleys in user activity.
    3. Anomaly Detection: Use statistical methods or machine learning algorithms to detect anomalies in user behavior.
    4. Forecasting User Growth: Build forecasting models to predict future platform usage.
    5. Seasonality Insights: Identify patterns like increased activity on weekends or holidays.

    📈 Potential Analysis

    Here are some specific analyses you can perform using this dataset:

    • Moving Average and Smoothing: Calculate the moving average over a 7-day or 30-day period.
    • Correlation with External Factors: Correlate daily active users with other datasets.
    • Statistical Hypothesis Testing: Perform t-tests or ANOVA to determine significant differences in user activity.
    • Machine Learning for Prediction: Train machine learning models to predict user engagement.
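
The hypothesis-testing idea in the list above could be sketched as a Welch's t-test comparing weekday and weekend activity. The sketch below uses only the Python standard library and made-up counts; the function name `welch_t` is hypothetical, and it returns the test statistic and approximate degrees of freedom without a p-value. In practice, `scipy.stats.ttest_ind` with `equal_var=False` is a common way to get the p-value as well.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    se_sq = var_a / n_a + var_b / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical daily active user counts, for illustration only
weekday_users = [100, 120, 110, 130, 90]
weekend_users = [80, 140, 170, 150]

t_stat, dof = welch_t(weekday_users, weekend_users)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Welch's variant is used here because it does not assume the two groups have equal variance, which is rarely guaranteed for usage data.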

    🚀 Getting Started

    To get started with this dataset, you can load it into your preferred analysis tool. Here's how to do it using Python's pandas library:

    import pandas as pd
    
    # Load the dataset
    data = pd.read_csv('path_to_dataset.csv')
    
    # Display the first few rows
    print(data.head())
    
    # Basic statistics
    print(data.describe())
    
  11. Descriptive statistics and reliability tests.

    • plos.figshare.com
    xls
    Updated Jan 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih (2025). Descriptive statistics and reliability tests. [Dataset]. http://doi.org/10.1371/journal.pone.0312306.t002
    Available download formats: xls
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data is becoming increasingly ubiquitous today, and data literacy has emerged as an essential skill in the workplace. Therefore, it is necessary to equip high school students with data literacy skills in order to prepare them for further learning and future employment. In Indonesia, there is a growing shift towards integrating data literacy in the high school curriculum. As part of a pilot intervention project, academics from two leading universities organised data literacy boot camps for high school students across various cities in Indonesia. The boot camps aimed at increasing participants’ awareness of the power of analytical and exploration skills, which in turn, would contribute to creating independent and data-literate students. This paper explores student participants’ self-perception of their data literacy as a result of the skills acquired from the boot camps. Qualitative and quantitative data were collected through student surveys and a focus group discussion, and were used to analyse student perception post-intervention. The findings indicate that students became more aware of the usefulness of data literacy and its application in future studies and work after participating in the boot camp. Of the materials delivered at the boot camps, students found the greatest benefit in learning basic statistical concepts and applying them through the use of Microsoft Excel as a tool for basic data analysis. These findings provide valuable policy recommendations that educators and policymakers can use as guidelines for effective data literacy teaching in high schools.

  12. Basic Stand Alone Medicare Claims Public Use Files Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Basic Stand Alone Medicare Claims Public Use Files Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/basic-stand-alone-medicare-claims-public-use-files-data-package/
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Description

    This data package contains claims-based data about beneficiaries of Medicare program services, including Inpatient, Outpatient, Chronic Conditions, Skilled Nursing Facility, Home Health Agency, Hospice, Carrier, and Durable Medical Equipment (DME) data, as well as data related to Prescription Drug Events. Note that the values are estimates and counts derived from a random sample of fee-for-service Medicare claims.

  13. ODM Data Analysis—A tool for the automatic validation, monitoring and...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    mp4
    Updated May 31, 2023
    Cite
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas (2023). ODM Data Analysis—A tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data [Dataset]. http://doi.org/10.1371/journal.pone.0199242
    Available download formats: mp4
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: A required step for presenting results of clinical studies is the declaration of participants demographic and baseline characteristics as claimed by the FDAAA 801. The common workflow to accomplish this task is to export the clinical data from the used electronic data capture system and import it into statistical software like SAS software or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item. These expenditures may become an obstacle for small studies. Objective of this work is to design, implement and evaluate an open source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data. Methods: The system requires clinical data in the CDISC Operational Data Model format. After uploading the file, its syntax and data type conformity of the collected data is validated. The completeness of the study data is determined and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies have been used to evaluate the application’s performance and functionality. Results: The system is implemented as an open source web application (available at https://odmanalysis.uni-muenster.de) and also provided as Docker image which enables an easy distribution and installation on local systems. Study data is only stored in the application as long as the calculations are performed which is compliant with data protection endeavors. Analysis times are below half an hour, even for larger studies with over 6000 subjects. Discussion: Medical experts have ensured the usefulness of this application to grant an overview of their collected study data for monitoring purposes and to generate descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analysis of statisticians, but it can be used as a starting point for their examination and reporting.

  14. Basic statistical analysis of SNP markers in bread wheat.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Nov 2, 2016
    Cite
    Rasheed, Awais; Xia, Xianchun; Jing, Ruilian; Wen, Weie; Cao, Shuanghe; He, Zhonghu; Dong, Yan; Liu, Jindong; Yan, Jun; Geng, Hongwei; Fu, Luping; Xiao, Yonggui; Zhang, Yan; Zhang, Yong (2016). Basic statistical analysis of SNP markers in bread wheat. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001563205
    Dataset updated
    Nov 2, 2016
    Authors
    Rasheed, Awais; Xia, Xianchun; Jing, Ruilian; Wen, Weie; Cao, Shuanghe; He, Zhonghu; Dong, Yan; Liu, Jindong; Yan, Jun; Geng, Hongwei; Fu, Luping; Xiao, Yonggui; Zhang, Yan; Zhang, Yong
    Description

    Basic statistical analysis of SNP markers in bread wheat.

  15. Data from: Basic statistical considerations for physiology: The journal...

    • tandf.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Aaron R. Caldwell; Samuel N. Cheuvront (2023). Basic statistical considerations for physiology: The journal Temperature toolbox [Dataset]. http://doi.org/10.6084/m9.figshare.8320151.v2
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis: https://taylorandfrancis.com/
    Authors
    Aaron R. Caldwell; Samuel N. Cheuvront
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The average environmental and occupational physiologist may find statistics difficult to interpret and use since their formal training in statistics is limited. Unfortunately, poor statistical practices can generate erroneous or at least misleading results and distort the evidence in the scientific literature. These problems are exacerbated when statistics are used as a thoughtless ritual that is performed after the data are collected. The situation is worsened when statistics are then treated as strict judgements about the data (i.e., significant versus non-significant) without a thought given to how these statistics were calculated or their practical meaning. We propose that researchers should consider statistics at every step of the research process whether that be the designing of experiments, collecting data, analysing the data or disseminating the results. When statistics are considered as an integral part of the research process, from start to finish, several problematic practices can be mitigated. Further, proper practices in disseminating the results of a study can greatly improve the quality of the literature. Within this review, we have included a number of reminders and statistical questions researchers should answer throughout the scientific process. Rather than treat statistics as a strict rule-following procedure, we hope that readers will use this review to stimulate a discussion around their current practices and attempt to improve them. The code to reproduce all analyses and figures within the manuscript can be found at https://doi.org/10.17605/OSF.IO/BQGDH.

  16. databird-basic-math

    • huggingface.co
    Updated Oct 24, 2025
    Cite
    Rasmus Rasmussen (2025). databird-basic-math [Dataset]. https://huggingface.co/datasets/theprint/databird-basic-math
    Dataset updated
    Oct 24, 2025
    Authors
    Rasmus Rasmussen
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    DATABIRD: BASIC MATH

    Overview

    This data set, curated by ThePrint, encompasses a broad spectrum of mathematical concepts, from basic arithmetic operations such as addition and subtraction to more advanced topics like percentages, introductory calculus, graph interpretation, and statistical analysis. It is designed for educational purposes, providing a comprehensive resource for understanding various numerical methodologies.

    Key Features

    Format: JSON Contents:… See the full description on the dataset page: https://huggingface.co/datasets/theprint/databird-basic-math.

  17. Statistical Analysis Software Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 22, 2024
    + more versions
    Cite
    Dataintelo (2024). Statistical Analysis Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/statistical-analysis-software-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Sep 22, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Statistical Analysis Software Market Outlook



    The global market size for statistical analysis software was estimated at USD 11.3 billion in 2023 and is projected to reach USD 21.6 billion by 2032, growing at a compound annual growth rate (CAGR) of 7.5% during the forecast period. This substantial growth can be attributed to the increasing complexity of data in various industries and the rising need for advanced analytical tools to derive actionable insights.
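
The growth figures quoted above can be cross-checked with the standard CAGR formula, (end / start)^(1 / years) - 1. The sketch below assumes nine compounding periods between the 2023 estimate and the 2032 projection; the report does not state the exact period convention, so that count is an assumption.

```python
start_value = 11.3   # USD billion, 2023 estimate quoted above
end_value = 21.6     # USD billion, 2032 projection quoted above
years = 2032 - 2023  # assumes nine compounding periods

# CAGR implied by the two endpoint values
implied_cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR: {implied_cagr:.2%}")

# Value reached by compounding the quoted 7.5% rate over the same span
projected = start_value * (1 + 0.075) ** years
print(f"Projected 2032 value at 7.5%: USD {projected:.1f} billion")
```

Under that assumption the implied rate rounds to the quoted 7.5%, so the three figures in the paragraph are mutually consistent.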



    One of the primary growth factors for this market is the increasing demand for data-driven decision-making across various sectors. Organizations are increasingly recognizing the value of data analytics in enhancing operational efficiency, reducing costs, and identifying new business opportunities. The proliferation of big data and the advent of technologies such as artificial intelligence and machine learning are further fueling the demand for sophisticated statistical analysis software. Additionally, the growing adoption of cloud computing has significantly reduced the cost and complexity of deploying advanced analytics solutions, making them more accessible to organizations of all sizes.



    Another critical driver for the market is the increasing emphasis on regulatory compliance and risk management. Industries such as finance, healthcare, and manufacturing are subject to stringent regulatory requirements, necessitating the use of advanced analytics tools to ensure compliance and mitigate risks. For instance, in the healthcare sector, statistical analysis software is used for clinical trials, patient data management, and predictive analytics to enhance patient outcomes and ensure regulatory compliance. Similarly, in the financial sector, these tools are used for fraud detection, credit scoring, and risk assessment, thereby driving the demand for statistical analysis software.



    The rising trend of digital transformation across industries is also contributing to market growth. As organizations increasingly adopt digital technologies, the volume of data generated is growing exponentially. This data, when analyzed effectively, can provide valuable insights into customer behavior, market trends, and operational efficiencies. Consequently, there is a growing need for advanced statistical analysis software to analyze this data and derive actionable insights. Furthermore, the increasing integration of statistical analysis tools with other business intelligence and data visualization tools is enhancing their capabilities and driving their adoption across various sectors.



    From a regional perspective, North America currently holds the largest market share, driven by the presence of major technology companies and a high level of adoption of advanced analytics solutions. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period, owing to the increasing adoption of digital technologies and the growing emphasis on data-driven decision-making in countries such as China and India. The region's rapidly expanding IT infrastructure and increasing investments in advanced analytics solutions are further contributing to this growth.



    Component Analysis



    The statistical analysis software market can be segmented by component into software and services. The software segment encompasses the core statistical analysis tools and platforms used by organizations to analyze data and derive insights. This segment is expected to hold the largest market share, driven by the increasing adoption of data analytics solutions across various industries. The availability of a wide range of software solutions, from basic statistical tools to advanced analytics platforms, is catering to the diverse needs of organizations, further driving the growth of this segment.



    The services segment includes consulting, implementation, training, and support services provided by vendors to help organizations effectively deploy and utilize statistical analysis software. This segment is expected to witness significant growth during the forecast period, driven by the increasing complexity of data analytics projects and the need for specialized expertise. As organizations seek to maximize the value of their data analytics investments, the demand for professional services to support the implementation and optimization of statistical analysis solutions is growing. Furthermore, the increasing trend of outsourcing data analytics functions to third-party service providers is contributing to the growth of the services segment.



    Within the software segment, the market can be further categori

  18. f

    Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  19. r

    QoG Basic Dataset - Time-Series Data

    • researchdata.se
    • demo.researchdata.se
    Updated Aug 6, 2024
    Cite
    Stefan Dahlberg; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli (2024). QoG Basic Dataset - Time-Series Data [Dataset]. http://doi.org/10.18157/qogbasjan22
    Explore at:
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    University of Gothenburg
    Authors
    Stefan Dahlberg; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli
    Time period covered
    1946 - 2021
    Description

    The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. Overall 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.

    The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.

    QoG Basic Dataset, which consists of approximately the 300 most used variables from QoG Standard Dataset, is a selection of variables that cover the most important concepts related to Quality of Government.

    In the QoG Basic CS dataset, data from and around 2018 is included. Data from 2018 is prioritized; however, if no data is available for a country for 2018, data for 2019 is included. If no data exists for 2019, data for 2017 is included, and so on up to a maximum of +/- 3 years.
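
The fallback rule described above can be sketched in Python. This is an illustration only, not the QoG Institute's own code. The source states the order 2018, 2019, 2017 and "so on"; the continuation to 2020, 2016, 2021, 2015 is an assumption, and the names `candidate_years` and `pick_value` and the example series are hypothetical.

```python
from typing import Mapping, Optional, Tuple


def candidate_years(target: int = 2018, max_offset: int = 3) -> list[int]:
    """List years in fallback order: target, then +1, -1, +2, -2, ... (assumed pattern)."""
    years = [target]
    for offset in range(1, max_offset + 1):
        years.extend([target + offset, target - offset])
    return years


def pick_value(
    series: Mapping[int, Optional[float]],
    target: int = 2018,
    max_offset: int = 3,
) -> Optional[Tuple[int, float]]:
    """Return (year, value) for the first candidate year with data, or None."""
    for year in candidate_years(target, max_offset):
        value = series.get(year)
        if value is not None:
            return year, value
    return None


# Hypothetical series for one country and one variable (values are made up).
example = {2016: 0.41, 2017: 0.43, 2020: 0.47}
print(candidate_years())    # [2018, 2019, 2017, 2020, 2016, 2021, 2015]
print(pick_value(example))  # (2017, 0.43)
```

In the example, 2018 and 2019 are missing, so the 2017 observation is selected before the 2020 one, matching the order given in the description.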

    In the QoG Basic TS dataset, data from 1946 to 2021 is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).

    Purpose:

    The primary aim of QoG is to conduct and promote research on corruption. One aim of the QoG Institute is to make publicly available cross-national comparative data on QoG and its correlates.

    Historical countries are in most cases denoted with a to-date (e.g. Ethiopia (-1992)) and a from-date (e.g. Ethiopia (1993-)).

  20. f

    Separate tables and separate word documents containing basic numerical data,...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jun 7, 2024
    Cite
    Deng, Fei; Fu, Liyan; Moming, Abulimiti; Wang, Zhiying; Zheng, Xin; Wang, Hualin; Zhang, Tao; Wu, Xiaoli; Shen, Shu; Zhang, Yanfang; Qian, Jin; Hu, Sijing; Ni, Jun; Tang, Shuang (2024). Separate tables and separate word documents containing basic numerical data, statistical analysis and original pictures for Figs 1, 2B, 2C, 2D, 3A, 3B, 3E, 3F, 4B, 4C, 4D, 5A and 5B of this study. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001383426
    Explore at:
    Dataset updated
    Jun 7, 2024
    Authors
    Deng, Fei; Fu, Liyan; Moming, Abulimiti; Wang, Zhiying; Zheng, Xin; Wang, Hualin; Zhang, Tao; Wu, Xiaoli; Shen, Shu; Zhang, Yanfang; Qian, Jin; Hu, Sijing; Ni, Jun; Tang, Shuang
    Description

    Separate tables and separate word documents containing basic numerical data, statistical analysis and original pictures for Figs 1, 2B, 2C, 2D, 3A, 3B, 3E, 3F, 4B, 4C, 4D, 5A and 5B of this study.

Cite
Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg (2024). A Customizable Inquiry-Based Statistics Teaching Application for Introductory Biology Students [Dataset]. https://qubeshub.org/publications/4651/?v=1

Data from: A Customizable Inquiry-Based Statistics Teaching Application for Introductory Biology Students

Primary Image: Singing Junco. A sketch of a junco singing on a pine tree branch, created by the lead author of this paper.
