100+ datasets found

students_perfomance_data
kaggle.com
zip
Updated Oct 29, 2025
Cite
Suhana Lodhi (2025). students_perfomance_data [Dataset]. https://www.kaggle.com/datasets/suhanalodhi/students-perfomance-data
Explore at:
Available download formats: zip (558 bytes)
Dataset updated
Oct 29, 2025
Authors
Suhana Lodhi
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The “Students Performance Data” dataset provides academic and demographic information of students. It includes their marks in Maths, Science, and English along with attendance and city details. This dataset is ideal for beginners learning data entry, analysis, and visualization using tools like Excel or Kaggle Notebooks.
Students Data Analysis
kaggle.com
zip
Updated Jul 20, 2022
Cite
MOMONO (2022). Students Data Analysis [Dataset]. https://www.kaggle.com/datasets/erqizhou/students-data-analysis
Explore at:
Available download formats: zip (2174 bytes)
Dataset updated
Jul 20, 2022
Authors
MOMONO
Description
A small excerpt from a real dataset, with a few minor changes to protect students' private information. Permission has been given.

Goals

You are going to help teachers using only the data:
1. Prediction: identify what makes a brilliant student who can successfully apply to graduate school, whether abroad or not.
2. Application: help those whose graduate school application fails with advice on job searching.

Tips

Educational data may have subtle structure; hierarchies and heterogeneity are probably involved. Simple regressions are unlikely to be sufficient. Also, keep an eye on collinearity among some of the indicators, which were collected by teachers who have already forgotten their statistics.

Not all students are free to choose to apply for a graduate school, but some were born with privileges.

Some of the students have been trying (or planning to try) to apply to graduate school for years, so you are responsible for giving advice that accurately fits their circumstances.

About the Data

Some of the original structure has been deleted or censored. The fields that remain are as follows.

Basic data:
- ID
- class: categorical; students were initially divided into 2 classes, and teachers suspect that students in different classes may perform significantly differently
- gender
- race: categorical and censored
- GPA: real numbers, float

Some teachers assume that scores in mathematics courses perfectly represent a student's likelihood of success:
- Algebra: real numbers, Advanced Algebra
- ......

Some assume that students' backgrounds significantly affect their choices and likelihood of success. These fields are all censored:
- from1: students' home locations
- from2: a probably bad indicator of preference for mathematics
- from3: how students applied to this university (undergraduate)
- from4: a probably bad indicator of family background; 0 indicates more wealth, 4 more poverty

The final indicator y:
- 0: application failed; the student may apply again or look for a job in the future
- 1: success, inland
- 2: success, abroad
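Because y is a three-class outcome, a natural first check before any modelling is how the classes are balanced. The following is a minimal sketch of that check using only the coding of y described above; the function name `outcome_shares` and the sample values are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Outcome codes for y, as described in the dataset notes.
Y_LABELS = {0: "failed", 1: "success, inland", 2: "success, abroad"}


def outcome_shares(y_values):
    """Return the share of each outcome class in a list of y codes."""
    if not y_values:
        raise ValueError("y_values must not be empty")
    counts = Counter(y_values)
    n = len(y_values)
    return {label: counts.get(code, 0) / n for code, label in Y_LABELS.items()}


# Hypothetical codes for illustration only, not real records.
sample_y = [0, 1, 1, 2, 0, 1, 0, 0]
for label, share in outcome_shares(sample_y).items():
    print(f"{label}: {share:.1%}")
```

A strongly unbalanced split would be one reason to prefer per-class metrics over plain accuracy when evaluating a predictor of y.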
CSV file used in statistical analyses
data.csiro.au
researchdata.edu.au
Updated Oct 13, 2014
Cite
CSIRO (2014). CSV file used in statistical analyses [Dataset]. http://doi.org/10.4225/08/543B4B4CA92E6
Explore at:
Unique identifier
https://doi.org/10.4225/08/543B4B4CA92E6
Dataset updated
Oct 13, 2014
Dataset authored and provided by
CSIRO (http://www.csiro.au/)
License
https://research.csiro.au/dap/licences/csiro-data-licence/
Time period covered
Mar 14, 2008 - Jun 9, 2009
Dataset funded by
CSIRO (http://www.csiro.au/)
Description
A csv file containing the tidal frequencies used for statistical analyses in the paper "Estimating Freshwater Flows From Tidally-Affected Hydrographic Data" by Dan Pagendam and Don Percival.
Data from: PISA Data Analysis Manual: SPSS, Second Edition
catalog.data.gov
s.cnmilf.com
Updated Mar 30, 2021
Cite
U.S. Department of State (2021). PISA Data Analysis Manual: SPSS, Second Edition [Dataset]. https://catalog.data.gov/dataset/pisa-data-analysis-manual-spss-second-edition
Explore at:
Dataset updated
Mar 30, 2021
Dataset provided by
United States Department of State (http://state.gov/)
Description
The OECD Programme for International Student Assessment (PISA) surveys collected data on students’ performances in reading, mathematics and science, as well as contextual information on students’ background, home characteristics and school factors which could influence performance. This publication includes detailed information on how to analyse the PISA data, enabling researchers to both reproduce the initial results and to undertake further analyses. In addition to the inclusion of the necessary techniques, the manual also includes a detailed account of the PISA 2006 database and worked examples providing full syntax in SPSS.
Data for: Integrating open education practices with data analysis of open...
search.dataone.org
data.niaid.nih.gov
Updated Jul 27, 2024
Cite
Marja Bakermans (2024). Data for: Integrating open education practices with data analysis of open science in an undergraduate course [Dataset]. http://doi.org/10.5061/dryad.37pvmcvst
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.37pvmcvst
Dataset updated
Jul 27, 2024
Dataset provided by
Dryad Digital Repository
Authors
Marja Bakermans
Description
The open science movement produces vast quantities of openly published data connected to journal articles, creating an enormous resource for educators to engage students in current topics and analyses. However, educators face challenges using these materials to meet course objectives. I present a case study using open science (published articles and their corresponding datasets) and open educational practices in a capstone course. While engaging in current topics of conservation, students trace connections in the research process, learn statistical analyses, and recreate analyses using the programming language R. I assessed the presence of best practices in open articles and datasets, examined student selection in the open grading policy, surveyed students on their perceived learning gains, and conducted a thematic analysis on student reflections. First, articles and datasets met just over half of the assessed fairness practices, but this increased with the publication date. There was a...

Article and dataset fairness

To assess the utility of open articles and their datasets as an educational tool in an undergraduate academic setting, I measured the congruence of each pair to a set of best practices and guiding principles. I assessed ten guiding principles and best practices (Table 1), where each category was scored ‘1’ or ‘0’ based on whether it met that criteria, with a total possible score of ten.

Open grading policies

Students were allowed to specify the percentage weight for each assessment category in the course, including 1) six coding exercises (Exercises), 2) one lead exercise (Lead Exercise), 3) fourteen annotation assignments of readings (Annotations), 4) one final project (Final Project), 5) five discussion board posts and a statement of learning reflection (Discussion), and 6) attendance and participation (Participation). I examined if assessment categories (independent variable) were weighted (dependent variable) differently by students using an analysis of ...

Data for: Integrating open education practices with data analysis of open science in an undergraduate course

Author: Marja H Bakermans
Affiliation: Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA 01609 USA
ORCID: https://orcid.org/0000-0002-4879-7771
Institutional IRB approval: IRB-24–0314

Data and file overview

The full dataset file called OEPandOSdata (.xlsx extension) contains 8 files. Below are descriptions of the name and contents of each file. NA = not applicable or no data available

BestPracticesData.csv

Description: Data to assess the adherence of articles and datasets to open science best practices.

Column headers and descriptions:

Article: articles used in the study, numbered randomly

F1: Findable, Data are assigned a unique and persistent doi

F2: Findable, Metadata includes an identifier of data

F3: Findable, Data are registered in a searchable database

A1: ...
PISA 2003 Data Analysis Manual SPSS
catalog.data.gov
gimi9.com
Updated Mar 30, 2021
Cite
U.S. Department of State (2021). PISA 2003 Data Analysis Manual SPSS [Dataset]. https://catalog.data.gov/dataset/pisa-2003-data-analysis-manual-spss
Explore at:
Dataset updated
Mar 30, 2021
Dataset provided by
United States Department of State (http://state.gov/)
Description
This publication provides all the information required to understand the PISA 2003 educational performance database and perform analyses in accordance with the complex methodologies used to collect and process the data. It enables researchers to both reproduce the initial results and to undertake further analyses. The publication includes introductory chapters explaining the statistical theories and concepts required to analyse the PISA data, including full chapters on how to apply replicate weights and undertake analyses using plausible values; worked examples providing full syntax in SPSS®; and a comprehensive description of the OECD PISA 2003 international database. The PISA 2003 database includes micro-level data on student educational performance for 41 countries collected in 2003, together with students’ responses to the PISA 2003 questionnaires and the test questions. A similar manual is available for SAS users.
Number of interviews per participant.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 29, 2024
Cite
Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner (2024). Number of interviews per participant. [Dataset]. http://doi.org/10.1371/journal.pone.0295726.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0295726.t002
Dataset updated
May 29, 2024
Dataset provided by
PLOS (http://plos.org/)
Authors
Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Initial data analysis (IDA) is the part of the data pipeline that takes place between the end of data retrieval and the beginning of data analysis that addresses the research question. Systematic IDA and clear reporting of the IDA findings is an important step towards reproducible research. A general framework of IDA for observational studies includes data cleaning, data screening, and possible updates of pre-planned statistical analyses. Longitudinal studies, where participants are observed repeatedly over time, pose additional challenges, as they have special features that should be taken into account in the IDA steps before addressing the research question. We propose a systematic approach in longitudinal studies to examine data properties prior to conducting planned statistical analyses. In this paper we focus on the data screening element of IDA, assuming that the research aims are accompanied by an analysis plan, meta-data are well documented, and data cleaning has already been performed. IDA data screening comprises five types of explorations, covering the analysis of participation profiles over time, evaluation of missing data, presentation of univariate and multivariate descriptions, and the depiction of longitudinal aspects. Executing the IDA plan will result in an IDA report to inform data analysts about data properties and possible implications for the analysis plan—another element of the IDA framework. Our framework is illustrated focusing on hand grip strength outcome data from a data collection across several waves in a complex survey. We provide reproducible R code on a public repository, presenting a detailed data screening plan for the investigation of the average rate of age-associated decline of grip strength. With our checklist and reproducible R code we provide data analysts a framework to work with longitudinal data in an informed way, enhancing the reproducibility and validity of their work.
Walmart basic product details dataset
crawlfeeds.com
csv, zip
Updated Jul 28, 2024
Cite
Crawl Feeds (2024). Walmart basic product details dataset [Dataset]. https://crawlfeeds.com/datasets/walmart-basic-product-details-dataset
Explore at:
Available download formats: csv, zip
Dataset updated
Jul 28, 2024
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Get access to the Walmart Basic Product Details Dataset, which includes essential information on a wide range of products available at Walmart.

This comprehensive dataset features product names, categories, descriptions, prices, and more. Ideal for market analysis, competitive research, and e-commerce applications.

Download now to enhance your data-driven strategies and insights with detailed Walmart product information.

The dataset contains basic product details such as title, id, image, price, and description.

Record count: 2.5 million+
Big Data Basic Platform Report
archivemarketresearch.com
doc, pdf, ppt
Updated May 22, 2025
Cite
Archive Market Research (2025). Big Data Basic Platform Report [Dataset]. https://www.archivemarketresearch.com/reports/big-data-basic-platform-564496
Explore at:
Available download formats: doc, pdf, ppt
Dataset updated
May 22, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Big Data Basic Platform market is experiencing robust growth, projected to reach a market size of $150 billion by 2025, exhibiting a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033. This expansion is fueled by several key drivers, including the escalating volume and velocity of data generated across various industries, the increasing demand for real-time data analytics, and the growing adoption of cloud-based solutions for data storage and processing. Furthermore, advancements in technologies like artificial intelligence (AI) and machine learning (ML) are creating new opportunities for businesses to leverage big data for improved decision-making and enhanced operational efficiency. The market is segmented across various deployment models (cloud, on-premise, hybrid), industry verticals (finance, healthcare, retail, etc.), and functionalities (data ingestion, storage, processing, analytics). Key players in this competitive landscape include established technology giants like IBM, Microsoft, and AWS, alongside specialized big data solution providers such as Splunk and Cloudera. The market's growth trajectory is expected to remain strong throughout the forecast period, driven by ongoing digital transformation initiatives across enterprises globally. The significant market expansion reflects a confluence of factors. Businesses are increasingly recognizing the strategic value of big data for competitive advantage, leading to significant investments in platform infrastructure and skilled talent. Geographic expansion is also a notable driver, with developing economies witnessing accelerated adoption. However, challenges remain, including the complexities of data integration, security concerns related to sensitive data, and the need for skilled professionals capable of managing and interpreting large datasets. 
The market is witnessing increasing consolidation through mergers and acquisitions, as companies strive to broaden their service offerings and strengthen their market positions. The emergence of open-source technologies and the ongoing evolution of cloud computing architectures are further shaping the market's competitive dynamics, driving innovation and lowering the barrier to entry for new entrants. Future growth will likely depend on continued technological advancements, increasing data literacy, and the development of robust data governance frameworks.
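The growth figures quoted in this description follow the standard compound-growth relation, value after n years = base × (1 + CAGR)^n. As a rough worked illustration, and taking the stated figures at face value ($150 billion in 2025, 18% CAGR from 2025 to 2033), the sketch below compounds the base forward. The function names are hypothetical, and the 2033 value it prints is plain arithmetic on the quoted numbers, not a figure given in the report.

```python
def project_value(base, cagr, years):
    """Compound `base` forward by `years` periods at annual rate `cagr`."""
    return base * (1.0 + cagr) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` over `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0


base_2025 = 150.0      # USD billions, as stated in the description
rate = 0.18            # 18% CAGR, as stated in the description
years = 2033 - 2025    # length of the forecast period

value_2033 = project_value(base_2025, rate, years)
print(f"Implied 2033 market size: ${value_2033:.1f} billion")
print(f"Round-trip CAGR check: {implied_cagr(base_2025, value_2033, years):.2%}")
```

Running the same relation in reverse with `implied_cagr` is a quick way to compare headline numbers across reports that quote different base years or rates.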
Data from: Statistical Analysis of Individual Participant Data...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Oct 3, 2012
Cite
Altman, Douglas G.; Stewart, Gavin B.; Duley, Lelia; Stewart, Lesley A.; Simmonds, Mark C.; Askie, Lisa M. (2012). Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001127650
Explore at:
Dataset updated
Oct 3, 2012
Authors
Altman, Douglas G.; Stewart, Gavin B.; Duley, Lelia; Stewart, Lesley A.; Simmonds, Mark C.; Askie, Lisa M.
Description
Background: Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way.

Methods and Findings: We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using anti-platelets (Relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model.

Conclusions: For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials.
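The second stage of a two-stage analysis, combining per-trial summary measures, is often done by inverse-variance weighting on the log scale. The sketch below shows a fixed-effect version of that pooling as one common choice; it is an assumption for illustration and not necessarily the method the authors used. The function names and the per-trial inputs are hypothetical and are not data from the study.

```python
import math


def pool_fixed_effect(log_rrs, ses):
    """Inverse-variance fixed-effect pooling of per-trial log relative risks."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


def rr_with_ci(pooled, pooled_se, z=1.96):
    """Back-transform a pooled log RR to the RR scale with an approximate 95% CI."""
    return (
        math.exp(pooled),
        math.exp(pooled - z * pooled_se),
        math.exp(pooled + z * pooled_se),
    )


# Hypothetical per-trial summaries (relative risk, standard error of log RR).
trials = [(0.85, 0.10), (0.95, 0.08), (0.90, 0.12)]
log_rrs = [math.log(rr) for rr, _ in trials]
ses = [se for _, se in trials]

pooled, pooled_se = pool_fixed_effect(log_rrs, ses)
rr, lower, upper = rr_with_ci(pooled, pooled_se)
print(f"Pooled RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Trials with smaller standard errors receive larger weights, which is why large trials dominate the pooled estimate in this scheme.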
Python and R Basics for Environmental Data Sciences
hydroshare.org
zip
Updated Nov 1, 2020
Cite
Tao Wen (2020). Python and R Basics for Environmental Data Sciences [Dataset]. https://www.hydroshare.org/resource/114e5092ab684bd9beb9fc845a25a087
Explore at:
Available download formats: zip (282.7 MB)
Dataset updated
Nov 1, 2020
Dataset provided by
HydroShare
Authors
Tao Wen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This resource collects teaching materials that were originally created for the in-person course 'GEOSC/GEOG 497 – Data Mining in Environmental Sciences' at Penn State University (co-taught by Tao Wen, Susan Brantley, and Alan Taylor) and then refined/revised by Tao Wen to be used in the online teaching module 'Data Science in Earth and Environmental Sciences' hosted on the NSF-sponsored HydroLearn platform.

This resource includes both R Notebooks and Python Jupyter Notebooks to teach the basics of R and Python coding, data analysis and data visualization, as well as building machine learning models in both programming languages by using authentic research data and questions. All of these R/Python scripts can be executed either on the CUAHSI JupyterHub or on your local machine.

This resource is shared under the CC-BY license. Please contact the creator Tao Wen at Syracuse University (twen08@syr.edu) for any questions you have about this resource. If you identify any errors in the files, please contact the creator.
Hydrologic Statistics and Data Analysis (M1)
beta.hydroshare.org
hydroshare.org
zip
Updated Sep 10, 2021
Cite
Irene Garousi-Nejad; Belize Lane (2021). Hydrologic Statistics and Data Analysis (M1) [Dataset]. https://beta.hydroshare.org/resource/bd0b38fc5d1e4d5c895dc484ceeb2c2a/
Explore at:
Available download formats: zip (45.7 KB)
Dataset updated
Sep 10, 2021
Dataset provided by
HydroShare
Authors
Irene Garousi-Nejad; Belize Lane
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This resource contains a Jupyter Notebook that is used to introduce hydrologic data analysis and conservation laws. This resource is part of a HydroLearn Physical Hydrology learning module available at https://edx.hydrolearn.org/courses/course-v1:Utah_State_University+CEE6400+2019_Fall/about

In this activity, the student learns how to (1) calculate the residence time of water in land and rivers for the global hydrologic cycle; (2) quantify the relative and absolute uncertainties in components of the water balance; (3) navigate public websites and databases, extract key watershed attributes, and perform basic hydrologic data analysis for a watershed of interest; (4) assess, compare, and interpret hydrologic trends in the context of a specific watershed.

Please note that in problems 3-8, the user is asked to use an R package (i.e., dataRetrieval) and select a U.S. Geological Survey (USGS) streamflow gage to retrieve streamflow data and then apply the hydrological data analysis to the watershed of interest. We acknowledge that the material relies on USGS data that are only available within the U.S. If running for other watersheds of interest outside the U.S. or wishing to work with other datasets, the user must take some further steps and develop code to prepare the streamflow dataset. Once a streamflow time series dataset is obtained for an international catchment of interest, the user would need to read that file into the workspace before working through subsequent analyses.
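For readers unfamiliar with the first two activities listed above, residence time is commonly estimated as storage divided by flux, and for a quotient of independent quantities the relative uncertainties are commonly combined in quadrature. The sketch below illustrates both under those assumptions; it is not taken from the notebook, the function names are hypothetical, and the numbers are placeholders rather than measurements.

```python
import math


def residence_time(storage, flux):
    """Mean residence time = storage / flux (units follow the inputs)."""
    if flux <= 0:
        raise ValueError("flux must be positive")
    return storage / flux


def quotient_relative_uncertainty(rel_storage, rel_flux):
    """Relative uncertainty of storage / flux, assuming independent errors."""
    return math.sqrt(rel_storage ** 2 + rel_flux ** 2)


# Placeholder values for illustration only, not measured quantities.
storage_km3 = 2000.0          # stored volume
flux_km3_per_yr = 40000.0     # throughflow per year

t_years = residence_time(storage_km3, flux_km3_per_yr)
rel = quotient_relative_uncertainty(0.10, 0.05)   # 10% and 5% relative errors
abs_unc_years = t_years * rel                     # absolute uncertainty

print(f"Residence time: {t_years * 365:.1f} days")
print(f"Relative uncertainty: {rel:.1%}")
print(f"Absolute uncertainty: {abs_unc_years * 365:.1f} days")
```

The absolute uncertainty is simply the relative uncertainty scaled by the estimate, which is the distinction the second activity asks students to quantify.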
Data from: A Customizable Inquiry-Based Statistics Teaching Application for...
qubeshub.org
Updated Apr 5, 2024
Cite
Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg (2024). A Customizable Inquiry-Based Statistics Teaching Application for Introductory Biology Students [Dataset]. https://qubeshub.org/publications/4651/?v=1
Explore at:
Dataset updated
Apr 5, 2024
Dataset provided by
QUBES
Authors
Mikus Abolins-Abols*; Natalie Christian; Jeffery Masters; Rachel Pigg
Description
Building strong quantitative skills prepares undergraduate biology students for successful careers in science and medicine. While math and statistics anxiety can negatively impact student learning within biology classrooms, instructors may reduce this anxiety by steadily building student competency in quantitative reasoning through instructional scaffolding, application-based approaches, and simple computer program interfaces. However, few statistical programs exist that meet all needs of an inclusive, inquiry-based laboratory course. These needs include an open-source program, a simple interface, little required background knowledge in statistics for student users, and customizability to minimize cognitive load, align with course learning outcomes, and create desirable difficulty. To address these needs, we used the Shiny package in R to develop a custom statistical analysis application. Our “BioStats” app provides students with scaffolded learning experiences in applied statistics that promotes student agency and is customizable by the instructor. It introduces students to the strengths of the R interface, while eliminating the need for complex coding in the R programming language. It also prioritizes practical implementation of statistical analyses over learning statistical theory. To our knowledge, this is the first statistics teaching tool where students are presented basic statistics initially, more complex analyses as they advance, and includes an option to learn R statistical coding. The BioStats app interface yields a simplified introduction to applied statistics that is adaptable to many biology laboratory courses.

Primary Image: Singing Junco. A sketch of a junco singing on a pine tree branch, created by the lead author of this paper.
Basic Stand Alone Medicare Claims Public Use Files Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Basic Stand Alone Medicare Claims Public Use Files Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/basic-stand-alone-medicare-claims-public-use-files-data-package/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains claims-based data about beneficiaries of Medicare program services, including Inpatient, Outpatient, Chronic Conditions, Skilled Nursing Facility, Home Health Agency, Hospice, Carrier, and Durable Medical Equipment (DME) data, as well as data related to Prescription Drug Events. Note that the values are estimated and counted using a random sample of fee-for-service Medicare claims.
Exploratory Data Analysis Basics
kaggle.com
zip
Updated May 24, 2023
Cite
Lebelo Hailesilassie (2023). Exploratory Data Analysis Basics [Dataset]. https://www.kaggle.com/datasets/lebelohailesilassie/exploratory-data-analysis-basics
Explore at:
Available download formats: zip (105561 bytes)
Dataset updated
May 24, 2023
Authors
Lebelo Hailesilassie
Description
Dataset

This dataset was created by Lebelo Hailesilassie

Contents
Data from: Units of Analysis: The Basics
search.dataone.org
Updated Dec 28, 2023
Cite
Chuck Humphrey (2023). Units of Analysis: The Basics [Dataset]. http://doi.org/10.5683/SP3/DLBDT5
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/DLBDT5
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Chuck Humphrey
Description
One of the first steps in a reference interview is determining what is it the user really wants or needs. In many cases, the question comes down to the unit of analysis: what is it that is being investigated or researched? This presentation will take us through the concept of the unit of analysis so that we can improve our reference service — and make our lives easier as a result! Note: This presentation precedes Working with Complex Surveys: Canadian Travel Survey by Chuck Humphrey (14-Mar-2002).
PISA 2003 Data Analysis Manual SAS
s.cnmilf.com
gimi9.com
Updated Mar 30, 2021
Cite
U.S. Department of State (2021). PISA 2003 Data Analysis Manual SAS [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/pisa-2003-data-analysis-manual-sas
Explore at:
Dataset updated
Mar 30, 2021
Dataset provided by
U.S. Department of State
Description
This publication provides all the information required to understand the PISA 2003 educational performance database and perform analyses in accordance with the complex methodologies used to collect and process the data. It enables researchers to both reproduce the initial results and to undertake further analyses. The publication includes introductory chapters explaining the statistical theories and concepts required to analyse the PISA data, including full chapters on how to apply replicate weights and undertake analyses using plausible values; worked examples providing full syntax in SAS®; and a comprehensive description of the OECD PISA 2003 international database. The PISA 2003 database includes micro-level data on student educational performance for 41 countries collected in 2003, together with students’ responses to the PISA 2003 questionnaires and the test questions. A similar manual is available for SPSS users.
Big Data Basic Platform Report
datainsightsmarket.com
doc, pdf, ppt
Updated May 11, 2025
Cite
Data Insights Market (2025). Big Data Basic Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/big-data-basic-platform-1372362
Explore at:
Available download formats: pdf, ppt, doc
Dataset updated
May 11, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Big Data Basic Platform market is booming, projected to reach $150 billion by 2033 at a 15% CAGR. Discover key trends, drivers, restraints, and leading companies shaping this rapidly evolving sector. Learn more about cloud-based solutions, regional market shares, and future growth potential.
Analysis of Air Temperature using CUAHSI HIS Web Services
search.dataone.org
hydroshare.org
Updated Dec 5, 2021
Cite
Liza Brazil (2021). Analysis of Air Temperature using CUAHSI HIS Web Services [Dataset]. https://search.dataone.org/view/sha256%3Af0e49064a8c110ddfd3c3169685aa1e08fdb113b15f8e7f50c5ab62dcdadc3f4
Explore at:
Dataset updated
Dec 5, 2021
Dataset provided by
HydroShare
Authors
Liza Brazil
Description
This resource contains a Jupyter notebook that demonstrates how the CUAHSI JupyterHub platform can be used to perform basic hydrologic data analysis. Temperature data are collected via the CUAHSI Hydrologic Information System (HIS) using web services. These data are interrogated, organized using Python classes, and plotted in various ways to demonstrate common data analysis steps. To get started, click the Open with dropdown on the top right of the resource and select CUAHSI JupyterHub. To use CUAHSI JupyterHub, you will need a HydroShare account.
Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...
frontiersin.figshare.com
docx
Updated Mar 22, 2024
Cite
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s001
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/feduc.2024.1379910.s001
Dataset updated
Mar 22, 2024
Dataset provided by
Frontiers
Authors
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.


