100+ datasets found
  1. physical data

    • kaggle.com
    zip
    Updated Dec 26, 2023
    Cite
    J H Lee (2023). physical data [Dataset]. https://www.kaggle.com/datasets/goen01/physical-data/code
    Explore at:
    Available download formats: zip (38313 bytes)
    Dataset updated
    Dec 26, 2023
    Authors
    J H Lee
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by J H Lee

    Released under MIT

    Contents

  2. Students Data Analysis

    • kaggle.com
    zip
    Updated Jul 20, 2022
    Cite
    MOMONO (2022). Students Data Analysis [Dataset]. https://www.kaggle.com/datasets/erqizhou/students-data-analysis
    Explore at:
    Available download formats: zip (2174 bytes)
    Dataset updated
    Jul 20, 2022
    Authors
    MOMONO
    Description

    A small excerpt from a real dataset, with minor changes to protect students' private information. Permission has been granted.

    Goals

    Using only the data, you are going to help teachers with two tasks:
    1. Prediction: identify what makes a strong student who can successfully apply to graduate school, whether abroad or not.
    2. Application: help those who fail to get into graduate school with job-search advice.

    Tips

    1. Educational data may have subtle structure; hierarchies and heterogeneity are probably involved, so simple regressions are unlikely to help much. Also keep an eye on collinearity among the indicators, which were collected by teachers who have already forgotten their statistics.
    2. Not all students are free to choose whether to apply to graduate school, and some were born with privileges.
    3. Some students have been trying (or planning to try) to get into graduate school for years; you are responsible for giving advice that is accurate for their circumstances.

    About the Data

    Some of the original structure has been deleted or censored. The remaining fields are as follows.

    Basic data:
    - ID
    - class: categorical; students were initially divided into 2 classes, and teachers suspect that students in different classes may perform significantly differently
    - gender
    - race: categorical and censored
    - GPA: real number (float)

    Some teachers assume that scores in mathematics courses represent a student's likelihood of success perfectly:
    - Algebra: real number, Advanced Algebra
    - ......

    Some assume that students' backgrounds significantly affect their choices and likelihood of success; these fields are all censored:
    - from1: students' home locations
    - from2: a probably poor indicator of preference for mathematics
    - from3: how the student applied to this university (undergraduate)
    - from4: a probably poor indicator of family background; 0 indicates more wealth, 4 more poverty

    The final indicator y:
    - 0: failed to get into graduate school; may apply again or look for a job in the future
    - 1: success, inland
    - 2: success, abroad
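
    The collinearity tip in this entry can be checked with variance inflation factors (VIFs). The sketch below is only an illustration under stated assumptions: it does not load the Kaggle file, the values are synthetic, and the columns are merely named after the GPA, Algebra, and from4 fields described here. A VIF well above about 10 is a common rule-of-thumb warning sign, not a hard threshold.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X, using ordinary least squares.

    Column j is regressed on all other columns (plus an intercept);
    VIF_j = 1 / (1 - R^2_j).
    """
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(float("inf") if r_squared >= 1.0 else 1.0 / (1.0 - r_squared))
    return vifs


# Synthetic stand-in data (assumption: NOT the real dataset).
rng = np.random.default_rng(0)
gpa = rng.normal(3.0, 0.4, size=200)
algebra = 20.0 * gpa + rng.normal(0.0, 1.0, size=200)  # deliberately collinear with GPA
from4 = rng.integers(0, 5, size=200).astype(float)     # unrelated 0-4 indicator

vifs = variance_inflation_factors(np.column_stack([gpa, algebra, from4]))
for name, vif in zip(["GPA", "Algebra", "from4"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

    In this synthetic setup the GPA and Algebra columns should show large VIFs while from4 stays near 1, which is the pattern to look for before fitting any regression on the real indicators.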

  3. Descriptive statistics and reliability tests.

    • plos.figshare.com
    xls
    Updated Jan 3, 2025
    Cite
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih (2025). Descriptive statistics and reliability tests. [Dataset]. http://doi.org/10.1371/journal.pone.0312306.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data is becoming increasingly ubiquitous today, and data literacy has emerged as an essential skill in the workplace. Therefore, it is necessary to equip high school students with data literacy skills in order to prepare them for further learning and future employment. In Indonesia, there is a growing shift towards integrating data literacy in the high school curriculum. As part of a pilot intervention project, academics from two leading universities organised data literacy boot camps for high school students across various cities in Indonesia. The boot camps aimed at increasing participants’ awareness of the power of analytical and exploration skills, which in turn would contribute to creating independent and data-literate students. This paper explores student participants’ self-perception of their data literacy as a result of the skills acquired from the boot camps. Qualitative and quantitative data were collected through student surveys and a focus group discussion, and were used to analyse student perception post-intervention. The findings indicate that students became more aware of the usefulness of data literacy and its application in future studies and work after participating in the boot camp. Of the materials delivered at the boot camps, students found the greatest benefit in learning basic statistical concepts and applying them through the use of Microsoft Excel as a tool for basic data analysis. These findings provide valuable policy recommendations that educators and policymakers can use as guidelines for effective data literacy teaching in high schools.

  4. Collection of example datasets used for the book - R Programming -...

    • figshare.com
    txt
    Updated Dec 4, 2023
    Cite
    Kingsley Okoye; Samira Hosseini (2023). Collection of example datasets used for the book - R Programming - Statistical Data Analysis in Research [Dataset]. http://doi.org/10.6084/m9.figshare.24728073.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Kingsley Okoye; Samira Hosseini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is open-source software and an object-oriented programming language, with a development environment (IDE) called RStudio, for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provide a wide range of functions for programming and analyzing data. Unlike many existing statistical software packages, R has the added benefit of allowing users to write more efficient code by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allow users to define their own (customized) functions for how they expect the program to behave while handling the data, which can also be stored in the simple object system. For all intents and purposes, this book serves as both textbook and manual for R statistics, particularly in academic research, data analytics, and computer programming, and is targeted to help inform and guide the work of R users or statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on, step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book.
    It is the first book to provide a comprehensive description and a step-by-step, practical, hands-on guide to carrying out the different types of statistical analysis in R, particularly for research purposes, with examples. It ranges from how to import and store datasets in R as objects, and how to code and call the methods or functions for manipulating the datasets or objects, through factorization and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. It thus brings together statistics and computer programming for research.

  5. Big Data Basic Platform Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated May 22, 2025
    Cite
    Archive Market Research (2025). Big Data Basic Platform Report [Dataset]. https://www.archivemarketresearch.com/reports/big-data-basic-platform-564496
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    May 22, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Big Data Basic Platform market is experiencing robust growth, projected to reach a market size of $150 billion by 2025, exhibiting a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033. This expansion is fueled by several key drivers, including the escalating volume and velocity of data generated across various industries, the increasing demand for real-time data analytics, and the growing adoption of cloud-based solutions for data storage and processing. Furthermore, advancements in technologies like artificial intelligence (AI) and machine learning (ML) are creating new opportunities for businesses to leverage big data for improved decision-making and enhanced operational efficiency. The market is segmented across various deployment models (cloud, on-premise, hybrid), industry verticals (finance, healthcare, retail, etc.), and functionalities (data ingestion, storage, processing, analytics). Key players in this competitive landscape include established technology giants like IBM, Microsoft, and AWS, alongside specialized big data solution providers such as Splunk and Cloudera. The market's growth trajectory is expected to remain strong throughout the forecast period, driven by ongoing digital transformation initiatives across enterprises globally. The significant market expansion reflects a confluence of factors. Businesses are increasingly recognizing the strategic value of big data for competitive advantage, leading to significant investments in platform infrastructure and skilled talent. Geographic expansion is also a notable driver, with developing economies witnessing accelerated adoption. However, challenges remain, including the complexities of data integration, security concerns related to sensitive data, and the need for skilled professionals capable of managing and interpreting large datasets. 
The market is witnessing increasing consolidation through mergers and acquisitions, as companies strive to broaden their service offerings and strengthen their market positions. The emergence of open-source technologies and the ongoing evolution of cloud computing architectures are further shaping the market's competitive dynamics, driving innovation and lowering the barrier to entry for new entrants. Future growth will likely depend on continued technological advancements, increasing data literacy, and the development of robust data governance frameworks.
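
    The growth figures quoted in this description can be turned into a year-by-year projection with the standard compound-growth formula, value = base × (1 + CAGR)^years. The sketch below is only arithmetic on the report's own numbers: reading $150 billion as the 2025 base and compounding it over eight annual periods to 2033 is an assumption, and the function name is illustrative.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound base_value forward by `years` annual periods at rate `cagr`."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base_value * (1.0 + cagr) ** years


# Figures taken from the description above; the base-year reading is an assumption.
base_2025_billion = 150.0
cagr = 0.18

projection = {
    year: project_market_size(base_2025_billion, cagr, year - 2025)
    for year in range(2025, 2034)
}

for year, value in projection.items():
    print(f"{year}: ${value:,.1f} billion")
```

    The output is only as reliable as the report's stated base and rate; it does not validate either figure.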

  6. S1 Data -

    • plos.figshare.com
    xls
    Updated Jan 3, 2025
    Cite
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih (2025). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0312306.s001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Charanjit Kaur; Pei P. Tan; Nurjannah Nurjannah; Ririn Yuniasih
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data is becoming increasingly ubiquitous today, and data literacy has emerged as an essential skill in the workplace. Therefore, it is necessary to equip high school students with data literacy skills in order to prepare them for further learning and future employment. In Indonesia, there is a growing shift towards integrating data literacy in the high school curriculum. As part of a pilot intervention project, academics from two leading universities organised data literacy boot camps for high school students across various cities in Indonesia. The boot camps aimed at increasing participants’ awareness of the power of analytical and exploration skills, which in turn would contribute to creating independent and data-literate students. This paper explores student participants’ self-perception of their data literacy as a result of the skills acquired from the boot camps. Qualitative and quantitative data were collected through student surveys and a focus group discussion, and were used to analyse student perception post-intervention. The findings indicate that students became more aware of the usefulness of data literacy and its application in future studies and work after participating in the boot camp. Of the materials delivered at the boot camps, students found the greatest benefit in learning basic statistical concepts and applying them through the use of Microsoft Excel as a tool for basic data analysis. These findings provide valuable policy recommendations that educators and policymakers can use as guidelines for effective data literacy teaching in high schools.

  7. Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    + more versions
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s004
    Explore at:
    Available download formats: docx
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  8. Big Data Basic Platform Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 11, 2025
    Cite
    Data Insights Market (2025). Big Data Basic Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/big-data-basic-platform-1372362
    Explore at:
    Available download formats: pdf, ppt, doc
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Big Data Basic Platform market is booming, projected to reach $150 billion by 2033 at a 15% CAGR. Discover key trends, drivers, restraints, and leading companies shaping this rapidly evolving sector. Learn more about cloud-based solutions, regional market shares, and future growth potential.

  9. Basic Data Analysis using Pandas & Matplotlib

    • kaggle.com
    zip
    Updated Dec 5, 2024
    Cite
    Babar Hussain AI (2024). Basic Data Analysis using Pandas & Matplotlib [Dataset]. https://www.kaggle.com/datasets/babarhussainai/basic-data-analysis-using-pandas-and-matplotlib/discussion
    Explore at:
    Available download formats: zip (138710 bytes)
    Dataset updated
    Dec 5, 2024
    Authors
    Babar Hussain AI
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Babar Hussain AI

    Released under Apache 2.0

    Contents

  10. Analysis of Air Temperature using CUAHSI HIS Web Services

    • search.dataone.org
    • hydroshare.org
    Updated Dec 5, 2021
    + more versions
    Cite
    Liza Brazil (2021). Analysis of Air Temperature using CUAHSI HIS Web Services [Dataset]. https://search.dataone.org/view/sha256%3Af0e49064a8c110ddfd3c3169685aa1e08fdb113b15f8e7f50c5ab62dcdadc3f4
    Explore at:
    Dataset updated
    Dec 5, 2021
    Dataset provided by
    Hydroshare
    Authors
    Liza Brazil
    Description

    This resource contains a Jupyter notebook that demonstrates how the CUAHSI JupyterHub platform can be used to perform basic hydrologic data analysis. Temperature data is collected via the CUAHSI Hydrologic Information System (HIS) using web services. These data are interrogated, organized using Python classes, and plotted in various ways to demonstrate common data analysis steps. To get started, click the Open with dropdown on the top right of the resource and select CUAHSI JupyterHub. To use CUAHSI JupyterHub, you will need a HydroShare account.

  11. Hydrologic Statistics and Data Analysis (M1)

    • beta.hydroshare.org
    • hydroshare.org
    • +2more
    zip
    Updated Sep 10, 2021
    Cite
    Irene Garousi-Nejad; Belize Lane (2021). Hydrologic Statistics and Data Analysis (M1) [Dataset]. https://beta.hydroshare.org/resource/bd0b38fc5d1e4d5c895dc484ceeb2c2a/
    Explore at:
    Available download formats: zip (45.7 KB)
    Dataset updated
    Sep 10, 2021
    Dataset provided by
    HydroShare
    Authors
    Irene Garousi-Nejad; Belize Lane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This resource contains a Jupyter Notebook that is used to introduce hydrologic data analysis and conservation laws. This resource is part of a HydroLearn Physical Hydrology learning module available at https://edx.hydrolearn.org/courses/course-v1:Utah_State_University+CEE6400+2019_Fall/about

    In this activity, the student learns how to (1) calculate the residence time of water in land and rivers for the global hydrologic cycle; (2) quantify the relative and absolute uncertainties in components of the water balance; (3) navigate public websites and databases, extract key watershed attributes, and perform basic hydrologic data analysis for a watershed of interest; (4) assess, compare, and interpret hydrologic trends in the context of a specific watershed.

    Please note that in problems 3-8, the user is asked to use an R package (i.e., dataRetrieval) and select a U.S. Geological Survey (USGS) streamflow gage to retrieve streamflow data and then apply the hydrological data analysis to the watershed of interest. We acknowledge that the material relies on USGS data that are only available within the U.S. Users who want to run it for watersheds outside the U.S., or to work with other datasets, must take some further steps and write code to prepare the streamflow dataset. Once a streamflow time series dataset is obtained for an international catchment of interest, the user would need to read that file into the workspace before working through subsequent analyses.
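
    The first two learning objectives listed for this module (residence time and relative versus absolute uncertainty) rest on two simple ratios: residence time = storage / flux, and relative uncertainty = absolute uncertainty / value. The sketch below is not the notebook's code; it is a minimal Python illustration, and the storage, flux, and uncertainty numbers are placeholders chosen for round arithmetic, not measured values.

```python
def residence_time(storage: float, flux: float) -> float:
    """Average residence time of water in a reservoir: storage divided by throughflow.

    Units must be consistent, e.g. km^3 and km^3/yr give years.
    """
    if flux <= 0:
        raise ValueError("flux must be positive")
    return storage / flux


def relative_uncertainty(absolute_uncertainty: float, value: float) -> float:
    """Relative uncertainty as a fraction of the measured value."""
    if value == 0:
        raise ValueError("value must be non-zero")
    return abs(absolute_uncertainty / value)


# Placeholder numbers (assumptions for illustration only).
river_storage_km3 = 2000.0
river_outflow_km3_per_yr = 40000.0

t_years = residence_time(river_storage_km3, river_outflow_km3_per_yr)
print(f"Residence time: {t_years:.3f} years (about {t_years * 365:.0f} days)")

# A water-balance component of 100 units known to within +/- 5 units.
rel = relative_uncertainty(5.0, 100.0)
print(f"Relative uncertainty: {rel:.1%}")
```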

  12. Data from: A Customizable Inquiry-Based Statistics Teaching Application for...

    • qubeshub.org
    Updated Apr 5, 2024
    Cite
    Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg (2024). A Customizable Inquiry-Based Statistics Teaching Application for Introductory Biology Students [Dataset]. https://qubeshub.org/publications/4651/?v=1
    Explore at:
    Dataset updated
    Apr 5, 2024
    Dataset provided by
    QUBES
    Authors
    Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg
    Description

    Building strong quantitative skills prepares undergraduate biology students for successful careers in science and medicine. While math and statistics anxiety can negatively impact student learning within biology classrooms, instructors may reduce this anxiety by steadily building student competency in quantitative reasoning through instructional scaffolding, application-based approaches, and simple computer program interfaces. However, few statistical programs exist that meet all needs of an inclusive, inquiry-based laboratory course. These needs include an open-source program, a simple interface, little required background knowledge in statistics for student users, and customizability to minimize cognitive load, align with course learning outcomes, and create desirable difficulty. To address these needs, we used the Shiny package in R to develop a custom statistical analysis application. Our “BioStats” app provides students with scaffolded learning experiences in applied statistics, promotes student agency, and is customizable by the instructor. It introduces students to the strengths of the R interface, while eliminating the need for complex coding in the R programming language. It also prioritizes practical implementation of statistical analyses over learning statistical theory. To our knowledge, this is the first statistics teaching tool that presents students with basic statistics initially and more complex analyses as they advance, and that includes an option to learn R statistical coding. The BioStats app interface yields a simplified introduction to applied statistics that is adaptable to many biology laboratory courses.

    Primary Image: Singing Junco. A sketch of a junco singing on a pine tree branch, created by the lead author of this paper.

  13. Data Sheet 1_Global trends of big data analytics in health research: a...

    • frontiersin.figshare.com
    docx
    Updated Jul 1, 2025
    Cite
    Li Yao; Yan Liu; Tingrui Wang; Chunyan Han; Qiaoxing Li; Qinqin Li; Xiaoli You; Tingting Ren; Yinhua Wang (2025). Data Sheet 1_Global trends of big data analytics in health research: a bibliometric study.docx [Dataset]. http://doi.org/10.3389/fmed.2025.1456286.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jul 1, 2025
    Dataset provided by
    Frontiers Media: http://www.frontiersin.org/
    Authors
    Li Yao; Yan Liu; Tingrui Wang; Chunyan Han; Qiaoxing Li; Qinqin Li; Xiaoli You; Tingting Ren; Yinhua Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The field of “Big Health,” which encompasses the integration of big data in healthcare, has seen rapid development in recent years. As big data technologies continue to transform healthcare, understanding emerging trends and key advancements within the field is essential.

    Methods: We retrieved and filtered articles and reviews related to big data analytics in health research from the Web of Science Core Collection, including SCI Expanded and SSCI, covering the period from 2009 to 2024. Bibliometric and co-citation analyses were conducted using VOSviewer and CiteSpace.

    Results: A total of 13,609 papers were analyzed, including 10,702 original research and 2,907 reviews. Co-occurrence word analysis identified six key research areas: (1) the application of big data analytics in health decision-making; (2) challenges in the technological management of health and medical big data; (3) integration of machine learning with health monitoring; (4) privacy and ethical issues in health and medical big data; (5) data integration in precision medicine; and (6) the use of big data in disease management and risk assessment. The co-word burst analysis results indicate that topics such as personalized medicine, decision support, and data protection experienced significant growth between 2015 and 2020. With the advancement of big data technologies, research hotspots have gradually expanded from basic data analysis to more complex application areas, such as the digital transformation of healthcare, digital health strategies, and smart health cities.

    Conclusion: This study highlights the growing impact of big data analytics in healthcare, emphasizing its role in decision-making, disease management, and precision medicine. As digital transformation in healthcare advances, addressing challenges in data integration, privacy, and machine learning integration will be crucial for maximizing the potential of big data technologies in improving health outcomes.

  14. Basic Stand Alone Medicare Claims Public Use Files Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Basic Stand Alone Medicare Claims Public Use Files Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/basic-stand-alone-medicare-claims-public-use-files-data-package/
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Description

    This data package contains claims-based data about beneficiaries of Medicare program services, including Inpatient, Outpatient, Chronic Conditions, Skilled Nursing Facility, Home Health Agency, Hospice, Carrier, and Durable Medical Equipment (DME) data, as well as data related to Prescription Drug Events. Note that the values are estimated and counted using a random sample of fee-for-service Medicare claims.

  15. Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Explore at:
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.

    Results: A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.

    Conclusions: The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.

  16. Statistical Analysis Software Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 22, 2024
    + more versions
    Cite
    Dataintelo (2024). Statistical Analysis Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/statistical-analysis-software-market
    Explore at:
    Available download formats: pptx, csv, pdf
    Dataset updated
    Sep 22, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Statistical Analysis Software Market Outlook



    The global market size for statistical analysis software was estimated at USD 11.3 billion in 2023 and is projected to reach USD 21.6 billion by 2032, growing at a compound annual growth rate (CAGR) of 7.5% during the forecast period. This substantial growth can be attributed to the increasing complexity of data in various industries and the rising need for advanced analytical tools to derive actionable insights.



    One of the primary growth factors for this market is the increasing demand for data-driven decision-making across various sectors. Organizations are increasingly recognizing the value of data analytics in enhancing operational efficiency, reducing costs, and identifying new business opportunities. The proliferation of big data and the advent of technologies such as artificial intelligence and machine learning are further fueling the demand for sophisticated statistical analysis software. Additionally, the growing adoption of cloud computing has significantly reduced the cost and complexity of deploying advanced analytics solutions, making them more accessible to organizations of all sizes.



    Another critical driver for the market is the increasing emphasis on regulatory compliance and risk management. Industries such as finance, healthcare, and manufacturing are subject to stringent regulatory requirements, necessitating the use of advanced analytics tools to ensure compliance and mitigate risks. For instance, in the healthcare sector, statistical analysis software is used for clinical trials, patient data management, and predictive analytics to enhance patient outcomes and ensure regulatory compliance. Similarly, in the financial sector, these tools are used for fraud detection, credit scoring, and risk assessment, thereby driving the demand for statistical analysis software.



    The rising trend of digital transformation across industries is also contributing to market growth. As organizations increasingly adopt digital technologies, the volume of data generated is growing exponentially. This data, when analyzed effectively, can provide valuable insights into customer behavior, market trends, and operational efficiencies. Consequently, there is a growing need for advanced statistical analysis software to analyze this data and derive actionable insights. Furthermore, the increasing integration of statistical analysis tools with other business intelligence and data visualization tools is enhancing their capabilities and driving their adoption across various sectors.



    From a regional perspective, North America currently holds the largest market share, driven by the presence of major technology companies and a high level of adoption of advanced analytics solutions. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period, owing to the increasing adoption of digital technologies and the growing emphasis on data-driven decision-making in countries such as China and India. The region's rapidly expanding IT infrastructure and increasing investments in advanced analytics solutions are further contributing to this growth.



    Component Analysis



    The statistical analysis software market can be segmented by component into software and services. The software segment encompasses the core statistical analysis tools and platforms used by organizations to analyze data and derive insights. This segment is expected to hold the largest market share, driven by the increasing adoption of data analytics solutions across various industries. The availability of a wide range of software solutions, from basic statistical tools to advanced analytics platforms, is catering to the diverse needs of organizations, further driving the growth of this segment.



    The services segment includes consulting, implementation, training, and support services provided by vendors to help organizations effectively deploy and utilize statistical analysis software. This segment is expected to witness significant growth during the forecast period, driven by the increasing complexity of data analytics projects and the need for specialized expertise. As organizations seek to maximize the value of their data analytics investments, the demand for professional services to support the implementation and optimization of statistical analysis solutions is growing. Furthermore, the increasing trend of outsourcing data analytics functions to third-party service providers is contributing to the growth of the services segment.



    Within the software segment, the market can be further categori

  17. Big Data Basic Platform Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Jun 26, 2025
    Cite
    Market Research Forecast (2025). Big Data Basic Platform Report [Dataset]. https://www.marketresearchforecast.com/reports/big-data-basic-platform-540366
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Big Data Basic Platform market is experiencing robust growth, driven by the exponential increase in data volume and the rising need for efficient data storage, processing, and analytics across diverse industries. The market, estimated at $50 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $150 billion by 2033. This expansion is fueled by several key factors, including the increasing adoption of cloud-based solutions offering scalability and cost-effectiveness, the growing demand for real-time analytics and insights, and the proliferation of advanced technologies such as artificial intelligence (AI) and machine learning (ML) that leverage big data platforms. Furthermore, stringent government regulations regarding data privacy and security are compelling organizations to invest in robust and secure big data platforms, further driving market growth. Major players like IBM, Dell, Splunk, and cloud providers such as AWS, Microsoft Azure, and Google Cloud Platform dominate the market, offering a range of solutions tailored to diverse business needs. However, the market is also witnessing the emergence of several niche players, particularly in specialized areas like data visualization and advanced analytics. Competitive pressures, technological advancements, and the need for continuous innovation are shaping the market landscape. While challenges exist, such as data security concerns, talent shortages, and the complexity of integrating big data solutions into existing IT infrastructure, the overall market outlook remains highly positive, with significant growth opportunities anticipated across various regions and segments throughout the forecast period.
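    The growth figures quoted above can be sanity-checked with the standard compound-growth formula. The sketch below is only an illustration: it assumes annual compounding over the eight years from 2025 to 2033, and the function names `project_value` and `implied_cagr` are hypothetical helpers, not part of the report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value at the annual rate cagr for the given number of years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value over the given years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the description (USD billions); the 8-year span is an assumption.
years = 2033 - 2025
projected_2033 = project_value(50.0, 0.15, years)
rate_for_150 = implied_cagr(50.0, 150.0, years)

print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
print(f"CAGR implied by $50B -> $150B: {rate_for_150:.2%}")
```

    Under these assumptions a 15% rate gives roughly $153 billion, and growth from $50 billion to exactly $150 billion corresponds to a rate just under 15%, so the quoted "approximately $150 billion" is consistent with the stated CAGR.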

  18. Video tutorial on data literacy training | gimi9.com

    • gimi9.com
    Updated Mar 23, 2025
    Cite
    (2025). Video tutorial on data literacy training | gimi9.com [Dataset]. https://gimi9.com/dataset/mekong_video-tutorial-on-data-literacy-training
    Dataset updated
    Mar 23, 2025
    Description

    This video series presents 11 lessons and an introduction to data literacy, organized by the Open Development Cambodia Organization (ODC) to provide video tutorials on data literacy and the use of data in data storytelling. The 12 videos cover the following sessions:

    * Introduction to the data literacy course
    * Lesson 1: Understanding data
    * Lesson 2: Explore data tables and data products
    * Lesson 3: Advanced Google Search
    * Lesson 4: Navigating data portals and validating data
    * Lesson 5: Common data format
    * Lesson 6: Data standard
    * Lesson 7: Data cleaning with Google Sheets
    * Lesson 8: Basic statistic
    * Lesson 9: Basic Data analysis using Google Sheets
    * Lesson 10: Data visualization
    * Lesson 11: Data Visualization with Flourish

  19. Easing into Excellent Excel Practices Learning Series / Série...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Cite
    Marcoux, Julie (2023). Easing into Excellent Excel Practices Learning Series / Série d'apprentissages en route vers des excellentes pratiques Excel [Dataset]. http://doi.org/10.5683/SP3/WZYO1F
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Marcoux, Julie
    Description

    With a step-by-step approach, learn to prepare Excel files, data worksheets, and individual data columns for data analysis; practice conditional formatting and creating pivot tables/charts; go over basic principles of Research Data Management as they might apply to an Excel project.

  20. CRIME STATISTICS DATA ANALYTICS

    • borealisdata.ca
    • dataverse.scholarsportal.info
    Updated Jan 17, 2019
    Cite
    Cheryl Kwong; Drew Anweiler; Mary Sarafraz (2019). CRIME STATISTICS DATA ANALYTICS [Dataset]. http://doi.org/10.5683/SP2/IE6NRY
    Dataset updated
    Jan 17, 2019
    Dataset provided by
    Borealis
    Authors
    Cheryl Kwong; Drew Anweiler; Mary Sarafraz
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Crime isn't a topic most people want to use mental energy to think about. We want to avoid harm, protect our loved ones, and hold on to what we claim is ours. So how do we remain vigilant without digging too deep into the filth that is crime? Data, of course. The focus of our study is to explore possible trends between crime and communities in the city of Calgary. Our purpose is to visualize Calgary criminal behaviour in order to help increase awareness for both citizens and law enforcement. Through the use of our visuals, individuals can make more informed decisions to improve the overall safety of their lives. Some of the main concerns of the study include: how crime rates increase with population, which areas in Calgary have the most crime, and whether crime adheres to time-sensitive patterns.

physical data

simple data for basic data analysis
