100+ datasets found

An example data set for exploration of Multiple Linear Regression
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). An example data set for exploration of Multiple Linear Regression [Dataset]. https://catalog.data.gov/dataset/an-example-data-set-for-exploration-of-multiple-linear-regression
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
This data set contains example data for exploration of the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database in order to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
Linear Regression E-commerce Dataset
kaggle.com
zip
Updated Sep 16, 2019
Cite
Saurabh Kolawale (2019). Linear Regression E-commerce Dataset [Dataset]. https://www.kaggle.com/datasets/kolawale/focusing-on-mobile-app-or-website
Available download formats: zip (44169 bytes)
Dataset updated
Sep 16, 2019
Authors
Saurabh Kolawale
Description
This dataset contains data on customers who buy clothes online. The store offers in-store style and clothing advice sessions. Customers come in to the store, have sessions with a personal stylist, and can then go home and order the clothes they want on either a mobile app or the website.

The company is trying to decide whether to focus their efforts on their mobile app experience or their website.
Data for multiple linear regression models for predicting microcystin...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Data for multiple linear regression models for predicting microcystin concentration action-level exceedances in selected lakes in Ohio [Dataset]. https://catalog.data.gov/dataset/data-for-multiple-linear-regression-models-for-predicting-microcystin-concentration-action
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Ohio
Description
Site-specific multiple linear regression models were developed for eight sites in Ohio (six in the Western Lake Erie Basin and two in northeast Ohio on inland reservoirs) to quickly predict action-level exceedances for a cyanotoxin, microcystin, in recreational and drinking waters used by the public. Real-time models include easily or continuously measured factors that do not require that a sample be collected. Real-time models are presented in two categories: (1) six models with continuous monitor data, and (2) three models with on-site measurements. Real-time models commonly included variables such as phycocyanin, pH, specific conductance, and streamflow or gage height. Many of the real-time factors were averages over time periods antecedent to the time the microcystin sample was collected, including water-quality data compiled from continuous monitors. Comprehensive models use a combination of discrete sample-based measurements and real-time factors. Comprehensive models were useful at some sites with lagged variables (< 2 weeks) for cyanobacterial toxin genes, dissolved nutrients, and (or) N to P ratios. Comprehensive models are presented in three categories: (1) three models with continuous monitor data and lagged comprehensive variables, (2) five models with no continuous monitor data and lagged comprehensive variables, and (3) one model with continuous monitor data and same-day comprehensive variables. Funding for this work was provided by the Ohio Water Development Authority and the U.S. Geological Survey Cooperative Water Program.
Student score (Suitable for linear regression)
kaggle.com
Updated Feb 5, 2024
Cite
mahsa sanaei (2024). Student score (Suitable for linear regression) [Dataset]. https://www.kaggle.com/datasets/snmahsa/student-score-suitable-for-linear-regression
Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 5, 2024
Dataset provided by
Kaggle
Authors
mahsa sanaei
Description
A simple dataset prepared for learning linear regression. It contains the scores of 61 students in two columns: the duration of the exam and the score.
1.01. Simple linear regression
kaggle.com
Updated Jan 18, 2021
Cite
Muhammad Abiodun SULAIMAN (2021). 1.01. Simple linear regression [Dataset]. https://www.kaggle.com/datasets/behordeun/101-simple-linear-regression
Explore at: Croissant
Dataset updated
Jan 18, 2021
Dataset provided by
Kaggle
Authors
Muhammad Abiodun SULAIMAN
Description
This dataset was created by Muhammad Abiodun SULAIMAN
Simple Linear Regression Dataset
kaggle.com
Updated Jan 20, 2024
Cite
Abdul Ali Nawrozie (2024). Simple Linear Regression Dataset [Dataset]. https://www.kaggle.com/datasets/abdulalinawrozie/simple-linear-regression-dataset
Explore at: Croissant
Dataset updated
Jan 20, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abdul Ali Nawrozie
Description
This dataset was created by Abdul Ali Nawrozie
Module M.4 Simple linear regression analysis
qubeshub.org
Updated Jun 26, 2023
Cite
Raisa Hernández-Pacheco; Alexandra Bland (2023). Module M.4 Simple linear regression analysis [Dataset]. http://doi.org/10.25334/M5DQ-AA91
Unique identifier
https://doi.org/10.25334/M5DQ-AA91
Dataset updated
Jun 26, 2023
Dataset provided by
QUBES
Authors
Raisa Hernández-Pacheco; Alexandra Bland
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Introduction to Primate Data Exploration and Linear Modeling with R was created with the goal of providing training to undergraduate biology students on data management and statistical analysis using authentic data of Cayo Santiago rhesus macaques. Module M.4 introduces simple linear regression analysis in R.
Panel dataset on Brazilian fuel demand
data.mendeley.com
Updated Oct 7, 2024
Cite
Sergio Prolo (2024). Panel dataset on Brazilian fuel demand [Dataset]. http://doi.org/10.17632/hzpwbp7j22.1
Unique identifier
https://doi.org/10.17632/hzpwbp7j22.1
Dataset updated
Oct 7, 2024
Authors
Sergio Prolo
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Summary: Fuel demand is shown to be influenced by fuel prices, people's income and motorization rates. We explore the effect of electric vehicle motorization rates on gasoline demand using this panel dataset.

Files:

dataset.csv - Panel dimensions are the Brazilian state (i) and year (t). The other columns are: gasoline sales per capita (ln_Sg_pc), prices of gasoline (ln_Pg) and ethanol (ln_Pe) and their lags, motorization rates of combustion vehicles (ln_Mi_c) and electric vehicles (ln_Mi_e), and GDP per capita (ln_gdp_pc). All variables are in natural logs, since we use them to calculate demand elasticities in a regression model.

adjacency.csv - The adjacency matrix used in interaction with electric vehicles' motorization rates to calculate spatial effects. At first, it follows a binary adjacency formula: for each pair of states i and j, the cell (i, j) is 0 if the states are not adjacent and 1 if they are. Then, each row is normalized to have sum equal to one.

regression.do - Series of Stata commands used to estimate the regression models of our study. dataset.csv must be imported for the commands to work; see the comment section.

dataset_predictions.xlsx - Based on the estimations from Stata, we use this Excel file to make average predictions by year and by state. By including years beyond the last panel sample, we also forecast the model into the future and evaluate the effects of different policies that influence gasoline prices (taxation) and EV motorization rates (electrification). This file is primarily used to create images, but can also be used to understand how the forecasting scenarios are set up.

Sources:

Fuel prices and sales: ANP (https://www.gov.br/anp/en/access-information/what-is-anp/what-is-anp)
State population, GDP and vehicle fleet: IBGE (https://www.ibge.gov.br/en/home-eng.html?lang=en-GB)
State EV fleet: Anfavea (https://anfavea.com.br/en/site/anuarios/)
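
The row normalization described for adjacency.csv can be sketched as follows. This is a minimal illustration in Python, not the authors' code (their estimation uses Stata). The function name row_normalize, the three-state example matrix, and the handling of rows with no neighbours are all assumptions made for the example.

```python
def row_normalize(adjacency):
    """Scale each row of a binary adjacency matrix so that it sums to one.

    Rows with no neighbours are left as all zeros. This is an assumption;
    the dataset description does not say how such rows are handled.
    """
    normalized = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([value / total for value in row])
    return normalized


# Hypothetical 3-state example: state 0 borders states 1 and 2,
# while states 1 and 2 border only state 0.
binary = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
weights = row_normalize(binary)
print(weights)
```

With row-normalized weights, multiplying the matrix by a column of values gives, for each state, the average of that value over its neighbours.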
linear-regression-synthetic-set-1000
huggingface.co
Updated Jun 21, 2025
Cite
Yuriy Serdyuk (2025). linear-regression-synthetic-set-1000 [Dataset]. https://huggingface.co/datasets/phoenyx08/linear-regression-synthetic-set-1000
Dataset updated
Jun 21, 2025
Authors
Yuriy Serdyuk
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Synthetic Linear Regression Dataset

This dataset consists of 1000 synthetic data points for training and evaluating simple linear regression models.

Usage

You can load this dataset manually using pandas:

import pandas as pd

df = pd.read_csv('synthetic_linear_data.csv')
print(df.head())
Student Performance (Multiple Linear Regression) Dataset
cubig.ai
Updated May 29, 2025
Cite
CUBIG (2025). Student Performance (Multiple Linear Regression) Dataset [Dataset]. https://cubig.ai/store/products/392/student-performance-multiple-linear-regression-dataset
Dataset updated
May 29, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Student Performance (Multiple Linear Regression) Dataset is designed to analyze the relationship between students’ learning habits and academic performance. Each sample includes key indicators related to learning, such as study hours, sleep duration, previous test scores, and the number of practice exams completed.

2) Data Utilization
(1) Characteristics of the Student Performance (Multiple Linear Regression) Dataset:
• The target variable, Hours Studied, quantitatively represents the amount of time a student has invested in studying. The dataset is structured to allow modeling and inference of learning behaviors based on correlations with other variables.

(2) Applications of the Student Performance (Multiple Linear Regression) Dataset:
• AI-Based Study Time Prediction Models: The dataset can be used to develop regression models that estimate a student’s expected study time based on inputs like academic performance, sleep habits, and engagement patterns.
• Behavioral Analysis and Personalized Learning Strategies: It can be applied to identify students with insufficient study time and design personalized study interventions based on academic and lifestyle patterns.
‘Simple Linear Regression - Placement data’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Simple Linear Regression - Placement data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-simple-linear-regression-placement-data-e596/latest
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Simple Linear Regression - Placement data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mayurdalvi/simple-linear-regression-placement-data on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

This package was built to understand simple linear regression. The contents of this dataset are easy to understand.

Content

Contains two columns:

CGPA: Aggregate CGPA received
Package: Total package (LPA)

--- Original source retains full ownership of the source dataset ---
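
As a rough illustration of what a simple linear regression on two such columns involves, the sketch below fits a least-squares line with the closed-form formulas. It is only a sketch: the function name is hypothetical, and the CGPA and package values are made up for the example rather than taken from the dataset.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the sum of cross-deviations divided by the sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up values for illustration only; these are NOT rows from the dataset.
cgpa = [6.0, 7.0, 8.0, 9.0]
package_lpa = [2.0, 3.0, 4.0, 5.0]

intercept, slope = fit_simple_linear_regression(cgpa, package_lpa)
print(f"package = {intercept:.2f} + {slope:.2f} * cgpa")
```

Because the made-up points lie exactly on a line, the fit recovers that line; real data would leave residuals around it.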
Simple Linear Regression
kaggle.com
Updated Sep 15, 2021
Cite
Hrishikesh_Dutta0078 (2021). Simple Linear Regression [Dataset]. https://www.kaggle.com/datasets/hrishikeshdutta0078/simple-linear-regression
Explore at: Croissant
Dataset updated
Sep 15, 2021
Dataset provided by
Kaggle
Authors
Hrishikesh_Dutta0078
Description
This dataset was created by Hrishikesh_Dutta0078
S1 Data -
plos.figshare.com
zip
Updated Dec 22, 2023
Cite
Lidiya Teshome; Haweni Adugna; Leul Deribe (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0295494.s003
Available download formats: zip
Unique identifier
https://doi.org/10.1371/journal.pone.0295494.s003
Dataset updated
Dec 22, 2023
Dataset provided by
PLOS ONE
Authors
Lidiya Teshome; Haweni Adugna; Leul Deribe
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Introduction

Intimate Partner Violence (IPV) is a worldwide public health problem and a major abuse of the human and legal rights of women. It affects the physical, sexual, and psychological well-being of victims; therefore, it requires complex and multifaceted interventions. Health providers are responsible for providing essential healthcare services for IPV victims. However, there is a lack of detailed information on whether or not health providers are ready to identify and manage IPV. Therefore, this study aimed to assess health providers’ readiness and associated factors in managing IPV in public health institutions at Hawassa, Ethiopia.

Method

An institution-based cross-sectional study was conducted with a simple random sample of 424 health providers. Data were collected with an anonymous questionnaire using the Physician Readiness to Manage Intimate Partner Violence Survey (PREMIS) tool. Linear regression analysis was used to examine relationships among variables. The strength of association was assessed using unstandardized β with 95% CI.

Results

The mean score of providers’ perceived readiness in managing IPV was 26.18 ± 6.69. Higher provider age and providers’ perceived knowledge had a positive association with perceived readiness in managing IPV, whereas not having had IPV training, absence of a protocol for IPV management, and provider attitude had a negative association with perceived readiness in managing IPV.

Conclusion and recommendation

This study revealed that health providers had limited perceived readiness to manage IPV. Providing training for providers and developing a protocol for IPV management have an important role in improving providers’ readiness to manage IPV.
The two‐sample linear regression model with interval‐censored covariates...
journaldata.zbw.eu
txt
Updated Dec 7, 2022
Cite
David Pacini (2022). The two‐sample linear regression model with interval‐censored covariates (replication data) [Dataset]. http://doi.org/10.15456/jae.2022327.0707557005
Available download formats: txt (4434)
Unique identifier
https://doi.org/10.15456/jae.2022327.0707557005
Dataset updated
Dec 7, 2022
Dataset provided by
ZBW - Leibniz Informationszentrum Wirtschaft
Authors
David Pacini
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
There are surveys that gather precise information on an outcome of interest, but measure continuous covariates by a discrete number of intervals, in which case the covariates are interval censored. For applications with a second independent dataset precisely measuring the covariates, but not the outcome, this paper introduces a semiparametrically efficient estimator for the coefficients in a linear regression model. The second sample serves to establish point identification. An empirical application investigating the relationship between income and body mass index illustrates the use of the estimator.
Suspended sediment and bedload data, simple linear regression models, loads,...
data.usgs.gov
s.cnmilf.com
Updated Mar 2, 2022
Cite
Joel Groten; Colin Livdahl; Stephen DeLong (2022). Suspended sediment and bedload data, simple linear regression models, loads, elevation data, and FaSTMECH models for Rice Creek, Minnesota, 2010-2019 [Dataset]. http://doi.org/10.5066/P9SJIY32
Unique identifier
https://doi.org/10.5066/P9SJIY32
Dataset updated
Mar 2, 2022
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Authors
Joel Groten; Colin Livdahl; Stephen DeLong
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Time period covered
2010 - 2019
Area covered
Rice Creek, Minnesota
Description
A series of simple linear regression models were developed for the U.S. Geological Survey (USGS) streamgage at Rice Creek below Highway 8 in Mounds View, Minnesota (USGS station number 05288580). The simple linear regression models were calibrated using streamflow data to estimate suspended-sediment (total, fines, and sands) and bedload. Data were collected during water years 2010, 2011, 2014, 2018, and 2019. The estimates from the simple linear regressions were used to calculate loads for water years 2010 through 2019. The calibrated simple linear regression models were used to improve understanding of sediment transport processes and increase accuracy of estimating sediment and loads for Rice Creek. Two multidimensional flow and models were developed with the International River Interface Cooperative (iRIC) software and Flow and Sediment Transport with Morphological Evolution of Channels (FaSTMECH) solver. These models were developed with elevation data from terrestrial laser sc ...
testingdatasetcards
huggingface.co
Updated Feb 2, 2024
Cite
Maria Murphy (2024). testingdatasetcards [Dataset]. https://huggingface.co/datasets/mariakmurphy55/testingdatasetcards
Explore at: Croissant
Dataset updated
Feb 2, 2024
Authors
Maria Murphy
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Dataset Card for Testingdatasetcards

Very Simple Multiple Linear Regression Dataset

Dataset Details

Dataset Description

Curated by: HUSSAIN NASIR KHAN (Kaggle)
Shared by: Maria Murphy
Language(s) (NLP): English
License: CC0: Public Domain

Uses

Intended for practice with linear regression.

Dataset Structure

Contains three columns (age, experience, income) and twenty observations.
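
For a dataset with this shape (two predictors and one response), a multiple linear regression can be sketched by solving the normal equations directly. The example below is a dependency-free illustration under stated assumptions: the function names are hypothetical, the (age, experience) rows and the income formula are made up, and none of the numbers come from the dataset. In practice a library routine such as numpy.linalg.lstsq would usually be preferred for numerical stability.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so row operations apply to both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_linear_regression(rows, y):
    """Return least-squares coefficients [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up (age, experience) rows; income is generated from a known formula
# so the recovered coefficients can be checked by eye.
rows = [(25, 1), (30, 4), (35, 12), (40, 10), (45, 20)]
income = [1000 + 10 * age + 50 * exp for age, exp in rows]

coef = fit_multiple_linear_regression(rows, income)
print([round(c, 6) for c in coef])
```

Since the made-up income is an exact linear function of the predictors, the fitted coefficients should match the generating values up to rounding error.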
Simple Linear Regression
kaggle.com
Updated Jul 2, 2022
Cite
Samratsingh Dikkhat (2022). Simple Linear Regression [Dataset]. https://www.kaggle.com/datasets/samratsinghdikkhat/simple-linear-regression/code
Explore at: Croissant
Dataset updated
Jul 2, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Samratsingh Dikkhat
Description
This dataset was created by Samratsingh Dikkhat
Data from: Creating predictive clothing size models for online customers
tandf.figshare.com
docx
Updated May 31, 2023
Cite
Allison Davidson; Ellen Gundlach (2023). Creating predictive clothing size models for online customers [Dataset]. http://doi.org/10.6084/m9.figshare.19330468.v1
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.19330468.v1
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Allison Davidson; Ellen Gundlach
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
A disadvantage of online clothes shopping is the inability to try on clothing to test the fit. A class project is discussed where students consult with the CEO of an online menswear clothing company to explore ways in which an online clothing customer can be assured of a superior fit by developing statistical models based on a shopper’s height and weight to predict measurements needed to create a suit that feels custom-made. The dataset is most amenable to use with students who have previously been exposed to simple linear regression, and can be used to explore multiple regression topics such as interaction terms, influential points, transformations, and polynomial predictors. Discussion points are included for more advanced topics such as canonical correlation, clustering, and dimension reduction.
Simple linear regression results for STS.
plos.figshare.com
xls
Updated Jun 8, 2023
Cite
Mark Collard; April Ruttle; Briggs Buchanan; Michael J. O’Brien (2023). Simple linear regression results for STS. [Dataset]. http://doi.org/10.1371/journal.pone.0040975.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0040975.t003
Dataset updated
Jun 8, 2023
Dataset provided by
PLOS ONE
Authors
Mark Collard; April Ruttle; Briggs Buchanan; Michael J. O’Brien
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Simple linear regression results for STS.
SPSS Data Set S1 Logistic Regression Model Data
figshare.com
bin
Updated Jan 19, 2016
Cite
Michelle Klailova; Phyllis Lee (2016). SPSS Data Set S1 Logistic Regression Model Data [Dataset]. http://doi.org/10.6084/m9.figshare.1051748.v2
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.1051748.v2
Dataset updated
Jan 19, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Michelle Klailova; Phyllis Lee
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Data set from the published PLOS ONE article entitled "Western Lowland Gorillas Signal Selectively Using Odor".
