100+ datasets found

Student Performance Dataset
kaggle.com
Updated Aug 27, 2025
Cite
Ghulam Muhammad Nabeel (2025). Student Performance Dataset [Dataset]. https://www.kaggle.com/datasets/nabeelqureshitiii/student-performance-dataset
Dataset updated
Aug 27, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ghulam Muhammad Nabeel
Description
📊 Student Performance Dataset (Synthetic, Realistic)

Overview

This dataset contains 1,000,000 rows of realistic student performance data, designed for beginners in Machine Learning to practice Linear Regression, model training, and evaluation techniques.

Each row represents one student with features like study hours, attendance, class participation, and final score.
The dataset has only six columns and is clean and structured to be beginner-friendly.

🔑 Columns Description

student_id → Unique identifier for each student.

weekly_self_study_hours → Average weekly self-study hours (0–40). Generated using a normal distribution centered around 15 hours.

attendance_percentage → Attendance percentage (50–100). Simulated with a normal distribution around 85%.

class_participation → Score between 0–10 indicating how actively the student participates in class. Generated from a normal distribution centered around 6.

total_score → Final performance score (0–100). Calculated as a function of study hours plus random noise, then clipped to 0–100. Of the input features, it correlates most strongly with study hours.

grade → Categorical label (A, B, C, D, F) derived from total_score.

📐 Data Generation Logic

Weekly Study Hours: Modeled using a normal distribution (mean ≈ 15, std ≈ 7), capped between 0 and 40 hours.

Scores: More study hours → higher score. Formula:

Random noise simulates differences in learning ability, motivation, etc.

Attendance & Participation: Independent but realistic variations added.

Grades: Assigned from scores using thresholds:

A: ≥ 85

B: ≥ 70

C: ≥ 55

D: ≥ 40

F: < 40
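The grade thresholds above can be sketched as a small function. This is a minimal illustration, not the author's generation code; the function name `assign_grade` is an assumption made for this example.

```python
def assign_grade(total_score: float) -> str:
    """Map a 0-100 score to a letter grade using the thresholds listed above."""
    if total_score >= 85:
        return "A"
    if total_score >= 70:
        return "B"
    if total_score >= 55:
        return "C"
    if total_score >= 40:
        return "D"
    return "F"


# Scores on and just below a threshold land in different grades.
print([assign_grade(s) for s in (92, 85, 84.9, 55, 39.9)])
```

Checking the thresholds from highest to lowest means each score receives the first (best) grade it qualifies for.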

🎯 How to Use This Dataset

Regression Tasks

Predict total_score from weekly_self_study_hours.

Train and evaluate Linear Regression models.

Extend to multiple regression using attendance_percentage and class_participation.

Classification Tasks

Predict grade (A–F) using study hours, attendance, and participation.

Model Evaluation Practice

Apply train-test split and cross-validation.

Evaluate with MAE, RMSE, R².

Compare simple vs. multiple regression.
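As a rough sketch of the evaluation practice described above, the following fits a one-feature linear regression by ordinary least squares and computes MAE, RMSE, and R² on held-out points. It uses only the standard library; the function names and the toy numbers are hypothetical and are not taken from the dataset.

```python
import math


def fit_simple_linear(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot  # assumes y_true is not constant
    return mae, rmse, r2


# Hypothetical toy values standing in for weekly_self_study_hours and total_score.
hours = [5, 10, 15, 20, 25, 30]
scores = [38, 52, 61, 74, 83, 97]

# Simple train-test split: even positions train, odd positions test.
train_idx, test_idx = [0, 2, 4], [1, 3, 5]
slope, intercept = fit_simple_linear(
    [hours[i] for i in train_idx], [scores[i] for i in train_idx]
)
predictions = [slope * hours[i] + intercept for i in test_idx]
mae, rmse, r2 = regression_metrics([scores[i] for i in test_idx], predictions)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

With the real data, a library such as scikit-learn would normally replace the hand-written fit; the formulas for the three metrics stay the same.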

✅ This dataset is intentionally kept simple, so that new ML learners can clearly see the relationship between input features (study, attendance, participation) and output (score/grade).
1-km monthly mean temperature dataset for china (1901-2023)
data.tpdc.ac.cn
zip
Updated Jul 18, 2024
Cite
Shouzhang PENG (2024). 1-km monthly mean temperature dataset for china (1901-2023) [Dataset]. http://doi.org/10.11888/Meteoro.tpdc.270961
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.11888/Meteoro.tpdc.270961
Dataset updated
Jul 18, 2024
Dataset provided by
TPDC
Authors
Shouzhang PENG
Area covered

Description
This dataset provides monthly mean temperature data at 0.0083333 arc degree (~1 km) resolution for China from Jan 1901 to Dec 2023. The data are stored in NetCDF format (.nc files). The unit of the data is 0.1 ℃. The dataset was spatially downscaled from CRU TS v4.02 with WorldClim datasets using the Delta downscaling method. The dataset was evaluated against 496 national weather stations across China, and the evaluation indicated that the downscaled dataset is reliable for investigations related to climate change across China. The dataset covers the main land area of China, including the Hong Kong, Macao, and Taiwan regions, and excluding islands and reefs in the South China Sea. WGS84 is the recommended coordinate system.
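Because values are stored in units of 0.1 ℃, a stored value must be divided by 10 to obtain degrees Celsius. The sketch below shows that conversion and a rough check of the "~1 km" cell size. The helper names are hypothetical, and the figure of about 111.32 km per degree is an approximation (the east-west extent of a cell also shrinks with latitude).

```python
KM_PER_DEGREE = 111.32  # approximate length of one degree of arc on the Earth's surface


def stored_to_celsius(stored_value: int) -> float:
    """Convert a stored value in units of 0.1 degrees C to degrees C."""
    return stored_value / 10


def cell_size_km(resolution_deg: float = 0.0083333) -> float:
    """Approximate north-south size of one grid cell in kilometres."""
    return resolution_deg * KM_PER_DEGREE


print(stored_to_celsius(253))    # a stored 253 means 25.3 degrees C
print(round(cell_size_km(), 3))  # a little under 1 km
```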
1971-2000 mean annual precipitation data set for Louisiana StreamStats
catalog.data.gov
data.usgs.gov
Updated Sep 13, 2025
Cite
U.S. Geological Survey (2025). 1971-2000 mean annual precipitation data set for Louisiana StreamStats [Dataset]. https://catalog.data.gov/dataset/1971-2000-mean-annual-precipitation-data-set-for-louisiana-streamstats
Explore at:
Dataset updated
Sep 13, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Louisiana
Description
These data represent mean annual precipitation in the Louisiana StreamStats study area for the period of 1971-2000.
Mean house prices for administrative geographies: HPSSA dataset 12
ons.gov.uk
cy.ons.gov.uk
xls
Updated Sep 20, 2023
+ more versions
Cite
Office for National Statistics (2023). Mean house prices for administrative geographies: HPSSA dataset 12 [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/meanhousepricefornationalandsubnationalgeographiesquarterlyrollingyearhpssadataset12
Explore at:
Available download formats: xls
Dataset updated
Sep 20, 2023
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
Mean price paid for residential property in England and Wales, by property type and administrative geographies. Annual data.
EIGHT COLOR ASTEROID SURVEY MEAN DATA V1.0
data.nasa.gov
s.cnmilf.com
+1 more
Updated Mar 31, 2025
+ more versions
Cite
nasa.gov (2025). EIGHT COLOR ASTEROID SURVEY MEAN DATA V1.0 [Dataset]. https://data.nasa.gov/dataset/eight-color-asteroid-survey-mean-data-v1-0-1079b
Explore at:
Dataset updated
Mar 31, 2025
Dataset provided by
NASA (http://nasa.gov/)
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
The eight color asteroid survey provides reflection spectra for minor planets using eight filter passbands. This dataset includes mean data averaged for each of 589 minor planets. The primary data for these minor planets, the response curves for the filters, and the values determined for standard stars are included in other related datasets. The wavelength range covered is 0.33 to 1.04 micrometers.
House Price Regression Dataset
kaggle.com
zip
Updated Sep 6, 2024
Cite
Prokshitha Polemoni (2024). House Price Regression Dataset [Dataset]. https://www.kaggle.com/datasets/prokshitha/home-value-insights
Explore at:
Available download formats: zip (27045 bytes)
Dataset updated
Sep 6, 2024
Authors
Prokshitha Polemoni
Description
Home Value Insights: A Beginner's Regression Dataset

This dataset is designed for beginners to practice regression problems, particularly in the context of predicting house prices. It contains 1000 rows, with each row representing a house and various attributes that influence its price. The dataset is well-suited for learning basic to intermediate-level regression modeling techniques.

Features:

Square_Footage: The size of the house in square feet. Larger homes typically have higher prices.

Num_Bedrooms: The number of bedrooms in the house. More bedrooms generally increase the value of a home.

Num_Bathrooms: The number of bathrooms in the house. Houses with more bathrooms are typically priced higher.

Year_Built: The year the house was built. Older houses may be priced lower due to wear and tear.

Lot_Size: The size of the lot the house is built on, measured in acres. Larger lots tend to add value to a property.

Garage_Size: The number of cars that can fit in the garage. Houses with larger garages are usually more expensive.

Neighborhood_Quality: A rating of the neighborhood’s quality on a scale of 1-10, where 10 indicates a high-quality neighborhood. Better neighborhoods usually command higher prices.

House_Price (Target Variable): The price of the house, which is the dependent variable you aim to predict.

Potential Uses:

Beginner Regression Projects: This dataset can be used to practice building regression models such as Linear Regression, Decision Trees, or Random Forests. The target variable (house price) is continuous, making this an ideal problem for supervised learning techniques.

Feature Engineering Practice: Learners can create new features by combining existing ones, such as the price per square foot or age of the house, providing an opportunity to experiment with feature transformations.

Exploratory Data Analysis (EDA): You can explore how different features (e.g., square footage, number of bedrooms) correlate with the target variable, making it a great dataset for learning about data visualization and summary statistics.

Model Evaluation: The dataset allows for various model evaluation techniques such as cross-validation, R-squared, and Mean Absolute Error (MAE). These metrics can be used to compare the effectiveness of different models.
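The feature engineering idea mentioned above (price per square foot, age of the house) could look like the following sketch. The column names come from the feature list; the derived names `Price_Per_SqFt` and `House_Age`, the reference year of 2024, and the sample row are assumptions made for illustration only.

```python
def add_derived_features(house: dict, reference_year: int = 2024) -> dict:
    """Return a copy of a house record with two derived features added."""
    enriched = dict(house)  # leave the input record unmodified
    enriched["Price_Per_SqFt"] = house["House_Price"] / house["Square_Footage"]
    enriched["House_Age"] = reference_year - house["Year_Built"]
    return enriched


# Hypothetical row, not taken from the dataset.
sample = {"Square_Footage": 2000, "Year_Built": 1994, "House_Price": 400_000.0}
print(add_derived_features(sample))
```

Note that `Price_Per_SqFt` is computed from the target variable, so it is useful for exploration but should not be fed to a model that predicts `House_Price`.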

Versatility:

The dataset is highly versatile for a range of machine learning tasks. You can apply simple linear models to predict house prices based on one or two features, or use more complex models like Random Forest or Gradient Boosting Machines to understand interactions between variables.

It can also be used for dimensionality reduction techniques like PCA or to practice handling categorical variables (e.g., neighborhood quality) through encoding techniques like one-hot encoding.

This dataset is ideal for anyone wanting to gain practical experience in building regression models while working with real-world features.
Mean Annual Precipitation in West-Central Nevada using the...
catalog.data.gov
data.usgs.gov
+3 more
Updated Nov 27, 2025
+ more versions
Cite
U.S. Geological Survey (2025). Mean Annual Precipitation in West-Central Nevada using the Precipitation-Zone Method [Dataset]. https://catalog.data.gov/dataset/mean-annual-precipitation-in-west-central-nevada-using-the-precipitation-zone-method
Explore at:
Dataset updated
Nov 27, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Nevada
Description
This data set contains 1971-2000 mean annual precipitation estimates for west-central Nevada. This is a raster data set developed using the precipitation-zone method, which uses elevation-based regression equations to estimate mean annual precipitation for defined precipitation zones (Lopes and Medina, 2007). This data set is based on the 30-meter National Elevation Dataset. Reference cited: Lopes, T.J., and Medina, R.L., 2007, Precipitation Zones of West-Central Nevada: Journal of Nevada Water Resources Association, v. 4, no. 2, p. 21.
Income Distribution by Quintile: Mean Household Income in Park City, UT
neilsberg.com
csv, json
Updated Jan 11, 2024
+ more versions
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Park City, UT [Dataset]. https://www.neilsberg.com/research/datasets/94ddc441-7479-11ee-949f-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Park City, Utah
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Park City, UT, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (20% of households with the lowest income) is 22,989, while the mean income for the highest quintile (20% of households with the highest income) is 725,204. This indicates that the top earners earn about 32 times as much as the lowest earners.

Top 5%: The mean household income for the wealthiest population (top 5%) is 1,289,541, which is 177.82% of the mean for the highest quintile and 5609.38% of the mean for the lowest quintile.
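The figures in these key observations can be re-derived from the three quoted means. The sketch below is only an arithmetic check (the function name and dictionary keys are hypothetical). It distinguishes "X% of" from "X% higher than": a ratio of about 1.7782 means the top-5% mean is 177.82% of the highest-quintile mean, which is 77.82% higher.

```python
def ratio_summary(lowest: float, highest: float, top5: float) -> dict:
    """Derive the comparison figures from three mean household incomes."""
    return {
        "highest_vs_lowest_multiple": highest / lowest,
        "top5_pct_of_highest": 100 * top5 / highest,
        "top5_pct_higher_than_highest": 100 * (top5 / highest - 1),
        "top5_pct_of_lowest": 100 * top5 / lowest,
    }


# Means quoted for Park City, UT (2022 inflation-adjusted dollars).
park_city = ratio_summary(lowest=22_989, highest=725_204, top5=1_289_541)
for name, value in park_city.items():
    print(f"{name}: {value:.2f}")
```

The same function applies unchanged to the other city-level income datasets in this listing.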

Figure: Mean household income by quintiles in Park City, UT (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/park-city-ut-mean-household-income-by-quintiles.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column lists the income levels (as listed above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Park City median household income.
West Africa Mean Annual Precipitation (CHIRP dataset)
data.wu.ac.at
data.europa.eu
wms
Updated Nov 29, 2016
Cite
JRC DataCatalogue (2016). West Africa Mean Annual Precipitation (CHIRP dataset) [Dataset]. https://data.wu.ac.at/odso/drdsi_jrc_ec_europa_eu/YWJlOWQyZWUtY2YwZC00NjYyLTllYjktOGIzNjhjN2I3MTY3
Explore at:
Available download formats: wms
Dataset updated
Nov 29, 2016
Dataset provided by
JRC DataCatalogue
Description
Mean Annual Precipitation [mm/year] across West Africa using the Climate Hazards Group Infrared Precipitation with Station data (CHIRP) dataset.
Regional weather in Hong Kong – the latest 1-minute mean air temperature |...
data.gov.hk
Updated Dec 23, 2022
+ more versions
Cite
data.gov.hk (2022). Regional weather in Hong Kong – the latest 1-minute mean air temperature | DATA.GOV.HK [Dataset]. https://data.gov.hk/en-data/dataset/hk-hko-rss-latest-one-minute-mean-air-temp
Explore at:
Dataset updated
Dec 23, 2022
Dataset provided by
data.gov.hk
Area covered
Hong Kong
Description
Provides regional weather in Hong Kong: the latest 1-minute mean air temperature (the data provided are provisional). Multiple file formats are available for dataset download via the API.
Data from: BOREAS AFM-06 Mean Temperature Profile Data
data.nasa.gov
data.globalchange.gov
+5 more
Updated Apr 1, 2025
+ more versions
Cite
nasa.gov (2025). BOREAS AFM-06 Mean Temperature Profile Data [Dataset]. https://data.nasa.gov/dataset/boreas-afm-06-mean-temperature-profile-data-85e49
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
The BOREAS AFM-06 team from the National Oceanic and Atmospheric Administration Environment Technology Laboratory (NOAA/ETL) operated a 915 MHz wind/Radio Acoustic Sounding System (RASS) profiler system in the Southern Study Area (SSA) near the Old Jack Pine (OJP) tower from 21-May-1994 to 20-Sep-1994. The data set provides temperature profiles at 15 heights, containing the variables of virtual temperature, vertical velocity, the speed of sound, and w-bar.
The StreamCat Dataset: Accumulated Attributes for NHDPlusV2 (Version 2.1)...
catalog.data.gov
gimi9.com
Updated Feb 4, 2025
Cite
U.S. Environmental Protection Agency, Office of Research and Development (ORD), Center for Public Health and Environmental Assessment (CPHEA), Pacific Ecological Systems Division (PESD), (2025). The StreamCat Dataset: Accumulated Attributes for NHDPlusV2 (Version 2.1) Catchments for the Conterminous United States: Reference Stream Temperature Predictions [Dataset]. https://catalog.data.gov/dataset/the-streamcat-dataset-accumulated-attributes-for-nhdplusv2-version-2-1-catchments-for-the--8d7d3
Explore at:
Dataset updated
Feb 4, 2025
Dataset provided by
U.S. Environmental Protection Agency, Office of Research and Development (ORD), Center for Public Health and Environmental Assessment (CPHEA), Pacific Ecological Systems Division (PESD)
Area covered
Contiguous United States, United States
Description
This dataset represents predictions made to individual, local NHDPlusV2 stream segments. Attributes were calculated for every local NHDPlusV2 stream segment. (See Supplementary Info for Glossary of Terms). These predictions were made to provide estimates of reference-condition stream temperatures in support of the 2008-2009 and 2013-2014 (forthcoming) National Rivers and Streams Assessments. These predictions were based on a set of published models (Hill et al. 2013; http://www.journals.uchicago.edu/doi/abs/10.1899/12-009.1). From Hill et al. (2013): "We modeled 3 ecologically important elements of the thermal regime: mean summer, mean winter, and mean annual stream temperature. These models used a set of least-disturbed USGS stations and sites to model stream temperatures from a set of landscape metrics. To build reference-condition models, we used daily mean ST data obtained from several thousand US Geological Survey temperature sites distributed across the conterminous USA and iteratively modeled ST with Random Forests to identify sites in reference condition. These data are summarized to produce local stream segment-level metrics as a continuous data type.
earthquake dataset
kaggle.com
zip
Updated Jan 1, 2025
Cite
baki turhan (2025). earthquake dataset [Dataset]. https://www.kaggle.com/datasets/bakiturhan/earthquake-dataset
Explore at:
Available download formats: zip (441683 bytes)
Dataset updated
Jan 1, 2025
Authors
baki turhan
License
CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Context

This data set is taken from the USGS (U.S. Geological Survey). The USGS serves the Nation as an independent fact-finding agency that collects, monitors, analyzes, and provides scientific understanding about natural resource and natural hazard conditions, issues, and problems. The value of the USGS to the Nation rests on its ability to carry out studies on a national scale and to sustain long-term monitoring and assessment of natural resources and hazards. For additional information, visit the link.

https://www.usgs.gov/

Content

This dataset contains earthquake data with a magnitude of 4.5+ and an "alert" warning level, recorded between 1976 and 2025. Below is an explanation of the columns included in the dataset:

time: The timestamp indicating when the earthquake or event occurred, including the date and time in UTC format.

latitude: The geographical latitude of the earthquake's epicenter, measured in degrees.

longitude: The geographical longitude of the earthquake's epicenter, measured in degrees.

depth: The depth at which the earthquake occurred, typically measured in kilometers below the Earth's surface.

mag: The magnitude of the earthquake, representing the energy released by the seismic event. For example, a value of 8.6 indicates a very large earthquake.

magType: The type of magnitude measurement used, such as "mww" (Moment Magnitude Scale), which is a common scale for large earthquakes.

nst: The number of stations reporting the earthquake, indicating how many seismic stations detected the event.

gap: The azimuthal gap, i.e. the largest angle (in degrees) between azimuthally adjacent seismic stations as seen from the epicenter. A smaller gap typically indicates better station coverage and a more reliable location.

dmin: The minimum distance between the earthquake's epicenter and the nearest seismic station, measured in degrees.

rms: The root-mean-square (RMS) travel-time residual, in seconds, which measures how well the computed location fits the observed arrival times.

net: The network identifier for the seismic station or data source that reported the earthquake.

id: A unique identifier for the earthquake event.

updated: The timestamp indicating when the earthquake data was last updated or reviewed.

place: The location or region where the earthquake occurred, often including the name of the area or nearby landmarks.

type: The type of event, such as "volcanic eruption" or "earthquake."

horizontalError: The error associated with the latitude and longitude coordinates of the epicenter, typically measured in kilometers.

depthError: The error associated with the depth measurement of the earthquake, typically measured in kilometers.

magError: The error associated with the magnitude measurement of the earthquake, representing the uncertainty in the reported magnitude.

magNst: The number of stations that contributed to the magnitude estimation.

status: The status of the earthquake event, such as "reviewed" or "automatic," indicating whether the data has been verified.

locationSource: The source of the location data for the earthquake, such as the seismic network or organization that provided the coordinates.

magSource: The source of the magnitude data, such as the network or organization that calculated the magnitude.

Alert: The alert level issued for the earthquake, such as "yellow," indicating the severity of the event and the potential for impact or danger.
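As a rough sketch of how these columns might be read, the following parses a few CSV rows with the standard library and ranks events by magnitude. The column subset, the helper names, and every sample value are hypothetical; the real file may order or name columns differently.

```python
import csv
import io

# Hypothetical sample rows using a subset of the columns described above.
SAMPLE_CSV = "\n".join([
    "time,latitude,longitude,depth,mag,magType,place,type",
    "2001-01-01T00:00:00Z,10.0,120.0,35.0,5.1,mb,hypothetical place A,earthquake",
    "2002-02-02T00:00:00Z,-5.5,150.2,10.0,7.2,mww,hypothetical place B,earthquake",
    "2003-03-03T00:00:00Z,40.1,30.3,12.5,4.6,mb,hypothetical place C,earthquake",
])

NUMERIC_COLUMNS = ("latitude", "longitude", "depth", "mag")


def load_events(csv_text: str) -> list:
    """Parse CSV text into dicts, converting the numeric columns to float."""
    events = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in NUMERIC_COLUMNS:
            row[column] = float(row[column])
        events.append(row)
    return events


def strongest(events: list, min_mag: float) -> list:
    """Return events with mag >= min_mag, largest magnitude first."""
    selected = [e for e in events if e["mag"] >= min_mag]
    return sorted(selected, key=lambda e: e["mag"], reverse=True)


for event in strongest(load_events(SAMPLE_CSV), min_mag=5.0):
    print(event["time"], event["mag"], event["magType"], event["place"])
```

Columns such as nst, gap, or the error fields can be empty in catalog exports, so a fuller loader would need to handle missing values before converting them.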

Acknowledgements

Real-time feeds (spreadsheet format): courtesy of the U.S. Geological Survey

Credit: U.S. Geological Survey

Department of the Interior/USGS

https://www.usgs.gov/information-policies-and-instructions/copyrights-and-credits
Income Distribution by Quintile: Mean Household Income in Key West, FL
neilsberg.com
csv, json
Updated Jan 11, 2024
+ more versions
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Key West, FL [Dataset]. https://www.neilsberg.com/research/datasets/94b01938-7479-11ee-949f-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Florida, Key West
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Key West, FL, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (20% of households with the lowest income) is 20,685, while the mean income for the highest quintile (20% of households with the highest income) is 351,156. This indicates that the top earners earn about 17 times as much as the lowest earners.

Top 5%: The mean household income for the wealthiest population (top 5%) is 730,255, which is 207.96% of the mean for the highest quintile and 3530.36% of the mean for the lowest quintile.

Figure: Mean household income by quintiles in Key West, FL (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/key-west-fl-mean-household-income-by-quintiles.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column lists the income levels (as listed above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Key West median household income.
Reconstructed Global Mean Sea Level 1900-2018
podaac.jpl.nasa.gov
cmr.earthdata.nasa.gov
+1 more
html
Updated Aug 14, 2020
+ more versions
Cite
PO.DAAC (2020). Reconstructed Global Mean Sea Level 1900-2018 [Dataset]. http://doi.org/10.5067/GMSLT-FJPL1
Explore at:
Available download formats: html
Unique identifier
https://doi.org/10.5067/GMSLT-FJPL1
Dataset updated
Aug 14, 2020
Dataset provided by
PO.DAAC
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Variables measured
SEA SURFACE HEIGHT
Description
This dataset contains reconstructed global-mean sea level evolution and the estimated contributing processes over 1900-2018. Reconstructed sea level is based on annual-mean tide-gauge observations and uses the virtual-station method to aggregate the individual observations into a global estimate. The contributing processes consist of thermosteric changes, glacier mass changes, mass changes of the Greenland and Antarctic Ice Sheets, and terrestrial water storage changes. The glacier, ice sheet, and terrestrial water storage changes are estimated by combining GRACE observations (2003-2018) with long-term estimates from in-situ observations and models. Steric estimates are based on in-situ temperature profiles. The lower and upper bounds represent the 5 and 95 percent confidence levels. The numbers are equal to the ones presented in Frederikse et al., "The causes of sea-level rise since 1900," Nature, 2020. This dataset was produced by the Heat and Ocean Mass from Gravity ESDR (HOMAGE) project, with funding from MeASUREs-2017. HOMAGE is combining satellite observations to create a set of ESDRs that provide a homogeneous basis for accurate and current quantification of the planetary sea level budget, ocean heat content, and large-scale ocean transport variations.
Income Distribution by Quintile: Mean Household Income in Lake City, MI
neilsberg.com
csv, json
Updated Jan 11, 2024
+ more versions
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Lake City, MI [Dataset]. https://www.neilsberg.com/research/datasets/94b36d77-7479-11ee-949f-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Lake City, Michigan
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Lake City, MI, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (20% of households with the lowest income) is 17,899, while the mean income for the highest quintile (20% of households with the highest income) is 161,779. This indicates that the top earners earn about 9 times as much as the lowest earners.

Top 5%: The mean household income for the wealthiest population (top 5%) is 235,068, which is 145.30% of the mean for the highest quintile and 1313.30% of the mean for the lowest quintile.

Figure: Mean household income by quintiles in Lake City, MI (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/lake-city-mi-mean-household-income-by-quintiles.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column lists the income levels (as listed above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Lake City median household income. You can refer to the same here
Monthly Mean Temperature Data for Major US Cities
kaggle.com
zip
Updated Mar 12, 2023
Cite
Garrick Hague (2023). Monthly Mean Temperature Data for Major US Cities [Dataset]. https://www.kaggle.com/datasets/garrickhague/temp-data-of-prominent-us-cities-from-1948-to-2022
Explore at:
Available download formats: zip (93354 bytes)
Dataset updated
Mar 12, 2023
Authors
Garrick Hague
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
The monthly mean temperature data in this dataset was obtained from the Climate Prediction Center (CPC) Global Land Surface Air Temperature Analysis and loaded into Python using xarray. The data was then filtered to the latitude and longitude coordinates of each city in the dataset. Each city was matched to its nearest grid point using nearest-neighbor selection (the 'select' method with the nearest point), so the temperature data may not be exactly at the city location. The source data is on a 0.5 x 0.5 degree grid across the globe.
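To illustrate why the selected point may not coincide with the city, here is a minimal pure-Python sketch of snapping a coordinate to the nearest cell of a 0.5 degree grid. It is not the author's code. The function name, the assumption that cell centres sit at 0.25, 0.75, ... degrees, and the 0 to 360 longitude convention are all assumptions made for illustration. In xarray, nearest-neighbor lookup is typically done with `sel(..., method="nearest")`.

```python
def nearest_grid_point(lat, lon, step=0.5, offset=0.25):
    """Snap (lat, lon) to the closest cell centre of a regular grid.

    Assumes cell centres at offset, offset + step, ... (an assumption,
    not something stated in the dataset description) and longitudes
    expressed in the range [0, 360).
    """
    def snap(value):
        return round((value - offset) / step) * step + offset

    # Keep latitude inside the outermost cell centres.
    snapped_lat = min(max(snap(lat), -90 + offset), 90 - offset)
    # Convert a negative (western) longitude to the 0..360 convention first.
    snapped_lon = snap(lon % 360)
    return snapped_lat, snapped_lon


# Illustrative coordinates, roughly New York City.
grid_lat, grid_lon = nearest_grid_point(40.71, -74.01)
print(grid_lat, grid_lon)
```

With a 0.5 degree grid, the snapped point can be up to 0.25 degrees away from the requested location in each direction.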

The temperature data provides a valuable resource for time series analysis, and if you are interested in obtaining temperature data for additional cities, please let me know. I will also be sharing the source code on GitHub for anyone who would like to reproduce the data or analysis.
Income Distribution by Quintile: Mean Household Income in Hope, New York
neilsberg.com
csv, json
Updated Jan 11, 2024
+ more versions
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Hope, New York [Dataset]. https://www.neilsberg.com/research/datasets/94a6beef-7479-11ee-949f-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Hope
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
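The inflation adjustment mentioned above generally amounts to rescaling a nominal amount by the ratio of two price-index values. The sketch below shows that arithmetic only. The function name and the index values are hypothetical placeholders, not actual R-CPI-U-RS figures, and Neilsberg's exact procedure is not described in the source.

```python
def adjust_for_inflation(amount, index_from, index_to):
    """Rescale a nominal amount from one price level to another."""
    return amount * index_to / index_from


# Hypothetical index values, chosen only to make the arithmetic visible.
nominal_income = 50_000
adjusted_income = adjust_for_inflation(nominal_income, index_from=400.0, index_to=430.0)
print(adjusted_income)
```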
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Hope, New York, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is 19,893, while the mean income of the highest quintile (the 20% of households with the highest income) is 147,661. The top earners therefore earn about 7 times as much as the lowest earners.

Top 5%: The mean household income of the wealthiest households (top 5%) is 211,499, which is 143.23% of the highest-quintile mean (that is, 43.23% higher) and 1063.18% of the lowest-quintile mean.

Figure: Mean household income by quintiles in Hope, New York (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/hope-ny-mean-household-income-by-quintiles.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column showcases the income levels (As mentioned above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Hope town median household income. You can refer to the same here
Income Distribution by Quintile: Mean Household Income in Central City, PA
neilsberg.com
csv, json
Updated Jan 11, 2024
+ more versions
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Central City, PA [Dataset]. https://www.neilsberg.com/research/datasets/9471051c-7479-11ee-949f-3860777c1fe6/
Explore at:
Available download formats: csv, json
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Pennsylvania, Central City
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Central City, PA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is 11,912, while the mean income of the highest quintile (the 20% of households with the highest income) is 122,542. The top earners therefore earn about 10 times as much as the lowest earners.

Top 5%: The mean household income of the wealthiest households (top 5%) is 163,453, which is 133.39% of the highest-quintile mean (that is, 33.39% higher) and 1372.17% of the lowest-quintile mean.

Figure: Mean household income by quintiles in Central City, PA (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/central-city-pa-mean-household-income-by-quintiles.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column showcases the income levels (As mentioned above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Central City median household income. You can refer to the same here
Kokoro Speech Dataset v1.1 Tiny
kaggle.com
zip
Updated May 14, 2021
+ more versions
Cite
Katsuya Iida (2021). Kokoro Speech Dataset v1.1 Tiny [Dataset]. https://www.kaggle.com/datasets/kaiida/kokoro-speech-dataset-v11-tiny
Explore at:
Available download formats: zip (48156884 bytes)
Dataset updated
May 14, 2021
Authors
Katsuya Iida
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Kokoro Speech Dataset

Kokoro Speech Dataset is a public domain Japanese speech dataset. It contains 34,958 short audio clips of a single speaker reading nine novels. The format of the metadata is similar to that of LJ Speech, so the dataset is compatible with modern speech synthesis systems.

The texts are from Aozora Bunko, which is in the public domain. The audio clips are from the LibriVox project, which is also in the public domain. Readings are estimated by MeCab and UniDic Lite from the kanji-kana mixture text. The readings are romanized in a format similar to the one used by Julius.

The audio clips were split and transcripts were aligned automatically by Voice100.

Sample data

Listen in your browser or download 100 randomly sampled clips.

File Format

Metadata is provided in metadata.csv. This file consists of one record per line, delimited by the pipe character (0x7c). The fields are:

ID: this is the name of the corresponding .wav file

Transcription: Kanji-kana mixture text spoken by the reader (UTF-8)

Reading: Romanized text spoken by the reader (UTF-8)

Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz.
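As a rough illustration of the format described above, the following sketch reads pipe-delimited records with Python's standard `csv` module. The function name, the dictionary keys, and the two sample records are made up for the example and are not taken from the actual metadata.csv.

```python
import csv
import io


def read_metadata(fileobj):
    """Parse pipe-delimited records into dicts with the three documented fields."""
    reader = csv.reader(fileobj, delimiter="|", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row!r}")
        clip_id, transcription, reading = row
        records.append(
            {"id": clip_id, "transcription": transcription, "reading": reading}
        )
    return records


# Hypothetical sample records, for illustration only.
SAMPLE = "sample-0001|こんにちは|konnichiwa\nsample-0002|ありがとう|arigatou\n"
records = read_metadata(io.StringIO(SAMPLE))
print(records[0]["id"], records[0]["reading"])
```

For a real file, something like `open("metadata.csv", encoding="utf-8", newline="")` could be passed in place of the `io.StringIO` object.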

Statistics

The dataset is provided in three sizes: large, small, and tiny. small and tiny do not share any clips. large contains all available clips, including those in small and tiny.

Large:
Total clips: 34958
Min duration: 3.007 secs
Max duration: 14.745 secs
Mean duration: 4.978 secs
Total duration: 48:20:24

Small:
Total clips: 8812
Min duration: 3.007 secs
Max duration: 14.431 secs
Mean duration: 4.951 secs
Total duration: 12:07:12

Tiny:
Total clips: 285
Min duration: 3.019 secs
Max duration: 9.462 secs
Mean duration: 4.871 secs
Total duration: 00:23:08
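As a quick sanity check on these statistics, the total duration should be close to the clip count multiplied by the mean duration. The sketch below tests that for each size; the helper names and the tolerance (which allows for the mean being rounded to three decimals) are choices made for this example.

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (total clips, mean duration in seconds, total duration) as quoted above.
STATS = {
    "large": (34958, 4.978, "48:20:24"),
    "small": (8812, 4.951, "12:07:12"),
    "tiny": (285, 4.871, "00:23:08"),
}


def is_consistent(clips, mean_secs, total):
    """Check that clips * mean matches the total within rounding error."""
    estimate = clips * mean_secs
    # The mean is rounded to 3 decimals and the total to whole seconds.
    tolerance = clips * 0.0005 + 1.0
    return abs(estimate - hms_to_seconds(total)) <= tolerance


results = {size: is_consistent(*row) for size, row in STATS.items()}
print(results)
```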

How to get the data

Because the dataset is large, audio files are not included in this repository, but the metadata is included.

To make .wav files of the dataset, run

$ bash download.sh

to download the metadata from the project page. Then run

$ pip3 install torchaudio
$ python3 extract.py --size tiny

This prints a shell script example to download MP3 audio files from archive.org and extract them if you haven't done it already.

After doing so, run the command again

$ python3 extract.py --size tiny

to get the files for tiny under the ./output directory.

You can pass another size name to the --size option to get the dataset of that size.

Pretrained Tacotron model

Audio Samples

Pretrained model

A pretrained Tacotron model trained on the Kokoro Speech Dataset and audio samples are available. The model was trained for 21K steps on small. According to the above repo, "Speech started to become intelligible around 20K steps" with the LJ Speech Dataset. The audio samples read the first few sentences of Gon Gitsune, which is not included in small.

Books

The dataset contains recordings from these books read by ekzemplaro

明暗 (Meian) 16:39:29 Online text

こころ (Kokoro) 08:46:41 Online text

田舎教師 (Inaka Kyoshi) 08:13:26 Online text

野分 (Nowaki) 04:40:49 Online text

草枕 (Kusamakura) 04:27:35 Online text

坊っちゃん (Botchan) 04:26:27 Online text

雁 (Gan) 03:41:31 Online text

ごん狐 (Gon gitsune) 00:15:42 Online text

[コーカサスの禿鷹 (Caucasus no Hagetaka)](https://l...

Cite

Ghulam Muhammad Nabeel (2025). Student Performance Dataset [Dataset]. https://www.kaggle.com/datasets/nabeelqureshitiii/student-performance-dataset

Student Performance Dataset

A generic data for ML Beginners

Explore at:

Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Aug 27, 2025

Dataset provided by

Kaggle http://kaggle.com/

Authors

Ghulam Muhammad Nabeel

Description

📊 Student Performance Dataset (Synthetic, Realistic)

Overview

This dataset contains 1000000 rows of realistic student performance data, designed for beginners in Machine Learning to practice Linear Regression, model training, and evaluation techniques.

Each row represents one student with features like study hours, attendance, class participation, and final score.
The dataset is small, clean, and structured to be beginner-friendly.

🔑 Columns Description

student_id → Unique identifier for each student.
weekly_self_study_hours → Average weekly self-study hours (0–40). Generated using a normal distribution centered around 15 hours.
attendance_percentage → Attendance percentage (50–100). Simulated with a normal distribution around 85%.
class_participation → Score between 0–10 indicating how actively the student participates in class. Generated from a normal distribution centered around 6.
total_score → Final performance score (0–100). Calculated as a function of study hours plus random noise, then clipped to 0–100. It correlates more strongly with study hours than with the other features.
grade → Categorical label (A, B, C, D, F) derived from total_score.

📐 Data Generation Logic

Weekly Study Hours: Modeled using a normal distribution (mean ≈ 15, std ≈ 7), capped between 0 and 40 hours.
Scores: More study hours → higher score. Formula:

Random noise simulates differences in learning ability, motivation, etc.

Attendance & Participation: Independent but realistic variations added.
Grades: Assigned from scores using thresholds:

A: ≥ 85
B: ≥ 70
C: ≥ 55
D: ≥ 40
F: < 40
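The two fully specified steps of the generation logic above (clipped normal study hours and threshold-based grades) can be sketched as follows. This is not the author's generator: the function names and the random seed are illustrative, and the score formula is left out because it is not given in the description.

```python
import random


def sample_study_hours(rng, mean=15.0, std=7.0, low=0.0, high=40.0):
    """Draw weekly study hours from a normal distribution, clipped to [low, high]."""
    return min(max(rng.gauss(mean, std), low), high)


def grade_from_score(score):
    """Map a 0-100 score to a letter grade using the listed thresholds."""
    for threshold, grade in ((85, "A"), (70, "B"), (55, "C"), (40, "D")):
        if score >= threshold:
            return grade
    return "F"


rng = random.Random(42)  # fixed seed so the sketch is repeatable
hours = [sample_study_hours(rng) for _ in range(1000)]
print(min(hours), max(hours))
print([grade_from_score(s) for s in (92, 85, 70, 55, 40, 39.9)])
```

Checking thresholds from highest to lowest means each score receives the best grade it qualifies for.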

🎯 How to Use This Dataset

Regression Tasks

Predict total_score from weekly_self_study_hours.
Train and evaluate Linear Regression models.
Extend to multiple regression using attendance_percentage and class_participation.

Classification Tasks

Predict grade (A–F) using study hours, attendance, and participation.

Model Evaluation Practice

Apply train-test split and cross-validation.
Evaluate with MAE, RMSE, R².
Compare simple vs. multiple regression.
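A minimal, dependency-free sketch of the regression workflow listed above: fit a simple linear regression on a training split, then report MAE, RMSE, and R² on the held-out split. The toy data uses a made-up linear relation with arbitrary coefficients, not the dataset's own formula, and the function names are illustrative.

```python
import math
import random


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(ys_true, ys_pred):
    """Return (MAE, RMSE, R^2) for a set of predictions."""
    n = len(ys_true)
    errors = [t - p for t, p in zip(ys_true, ys_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_y = sum(ys_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in ys_true)
    ss_res = sum(e * e for e in errors)
    return mae, rmse, 1 - ss_res / ss_tot


# Toy data: hypothetical relation score = 2 * hours + 10 + noise.
rng = random.Random(0)
hours = [rng.uniform(0, 40) for _ in range(200)]
scores = [2.0 * h + 10.0 + rng.gauss(0, 5) for h in hours]

# 80/20 train-test split (the samples are already in random order).
split = int(0.8 * len(hours))
slope, intercept = fit_simple_linear(hours[:split], scores[:split])
predictions = [slope * h + intercept for h in hours[split:]]
mae, rmse, r2 = evaluate(scores[split:], predictions)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

With the real CSV, the `hours` and `scores` lists would instead come from the weekly_self_study_hours and total_score columns.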

✅ This dataset is intentionally kept simple, so that new ML learners can clearly see the relationship between input features (study, attendance, participation) and output (score/grade).

