U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This data set contains example data for exploring the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset has only two columns: Age and Premium. You can use it in machine learning to practice simple linear regression and prediction.
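As a minimal sketch of the simple linear regression this dataset is meant for, the closed-form least-squares fit can be written with NumPy. The column names Age and Premium come from the description above; the numeric values and the helper name fit_simple_linear are hypothetical and only illustrate the calculation.

```python
import numpy as np

# Hypothetical values for illustration only; they are not taken from the dataset.
age = np.array([25, 30, 35, 40, 45, 50], dtype=float)
premium = np.array([18000, 21000, 24500, 27000, 30500, 33000], dtype=float)


def fit_simple_linear(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope is the covariance of x and y divided by the variance of x.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = fit_simple_linear(age, premium)
predicted = intercept + slope * 38.0  # predicted premium at age 38
print(f"premium = {intercept:.1f} + {slope:.1f} * age; prediction at 38: {predicted:.1f}")
```

With the real file, the two arrays would be read from the Age and Premium columns instead of being typed in.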
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Abhishek Kumar
Released under Apache 2.0
This dataset has been created to demonstrate the use of a simple linear regression model. It includes two variables: an independent variable and a dependent variable. The data can be used for training, testing, and validating a simple linear regression model, making it ideal for educational purposes, tutorials, and basic predictive analysis projects. The dataset consists of 100 observations with no missing values, and it follows a linear relationship.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Introduction to Primate Data Exploration and Linear Modeling with R was created with the goal of providing training to undergraduate biology students on data management and statistical analysis using authentic data of Cayo Santiago rhesus macaques. Module M.4 introduces simple linear regression analysis in R.
https://creativecommons.org/publicdomain/zero/1.0/
Salary dataset in CSV format for simple linear regression. It has also been used in the Machine Learning A to Z course of my series.
This is a small dataset made for beginners. It can be used to predict an employee's salary from their years of experience, which makes it a basic example of simple linear regression that you can use to get familiar with the method.
The dataset gives each employee's experience and the corresponding salary.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Synthetic Linear Regression Dataset
This dataset consists of 1000 synthetic data points for training and evaluating simple linear regression models.
Usage
You can load this dataset manually using pandas:

```python
import pandas as pd

df = pd.read_csv('synthetic_linear_data.csv')
print(df.head())
```
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is designed to help you practice linear regression, a fundamental concept in machine learning and statistical analysis. The dataset contains a simulated linear relationship between the number of hours a student studies and the marks they obtain. It is an ideal resource for beginners who want to understand how linear regression works, or for educators looking to provide a simple yet effective example to their students.
This dataset was created by Ali Aydamirov
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We report the R code used to study the relationship between variables with a simple linear regression model in R (R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Accessed 24/09/2021).
https://creativecommons.org/publicdomain/zero/1.0/
Description: The table below gives the heights of fathers and their sons, based on a famous experiment by Karl Pearson around 1903. The number of cases is 1078. Random noise was added to the original data to produce heights to the nearest 0.1 inch.
Objective: Use this dataset to practice simple linear regression.
Columns: Father height, Son height
Source: Department of Statistics, University of California, Berkeley
Download TSV source file: Pearson.tsv
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Intimate partner violence (IPV) is a worldwide public health problem and a major abuse of women's human and legal rights. It affects the physical, sexual, and psychological well-being of victims and therefore requires complex and multifaceted interventions. Health providers are responsible for providing essential healthcare services to IPV victims. However, there is a lack of detailed information on whether health providers are ready to identify and manage IPV. Therefore, this study aimed to assess health providers' readiness and associated factors in managing IPV in public health institutions in Hawassa, Ethiopia.
Method: An institution-based cross-sectional study was conducted with a simple random sample of 424 health providers. Data were collected with an anonymous questionnaire using the Physician Readiness to Manage Intimate Partner Violence Survey (PREMIS) tool. Linear regression analysis was used to examine relationships among variables. The strength of association was assessed using unstandardized β with 95% CI.
Results: The mean score of providers' perceived readiness in managing IPV was 26.18 ± 6.69. Higher provider age and providers' perceived knowledge were positively associated with perceived readiness in managing IPV. In contrast, not having had IPV training, the absence of a protocol for IPV management, and provider attitude were negatively associated with perceived readiness in managing IPV.
Conclusion and recommendation: This study revealed that health providers had limited perceived readiness to manage IPV. Providing training for providers and developing a protocol for IPV management have an important role in improving providers' readiness to manage IPV.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Generated datasets of multiple distributions of ideals, used in the research on linear regression and machine learning algorithms for the thesis 'Predicting the performance of Buchberger's algorithm'. concatenated and concatenated_stats are the datasets with the ideals' exponents and the corresponding polynomial additions; these datasets were created specifically for RNNs. features_dataset contains statistics regarding the ideals, and polynomial_additions_dataset contains information regarding their polynomial additions, created for multiple linear regression models and simple neural networks.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Regressions and meta-regressions are widely used to estimate patterns and effect sizes in various disciplines. However, many biological and medical analyses use relatively low sample size (N), contributing to concerns on reproducibility. What is the minimum N to identify the most plausible data pattern using regressions? Statistical power analysis is often used to answer that question, but it has its own problems and logically should follow model selection to first identify the most plausible model. Here we make null, simple linear and quadratic data with different variances and effect sizes. We then sample and use information theoretic model selection to evaluate minimum N for regression models. We also evaluate the use of coefficient of determination (R2) for this purpose; it is widely used but not recommended. With very low variance, both false positives and false negatives occurred at N < 8, but data shape was always clearly identified at N ≥ 8. With high variance, accurate inference was stable at N ≥ 25. Those outcomes were consistent at different effect sizes. Akaike Information Criterion weights (AICc wi) were essential to clearly identify patterns (e.g., simple linear vs. null); R2 or adjusted R2 values were not useful. We conclude that a minimum N = 8 is informative given very little variance, but minimum N ≥ 25 is required for more variance. Alternative models are better compared using information theory indices such as AIC but not R2 or adjusted R2. Insufficient N and R2-based model selection apparently contribute to confusion and low reproducibility in various disciplines. To avoid those problems, we recommend that research based on regressions or meta-regressions use N ≥ 25.
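The comparison of null, simple linear, and quadratic models by AICc weights described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes ordinary least-squares fits with Gaussian errors, counts the error variance as one extra parameter, and uses simulated data. The function names aicc_least_squares and aicc_weights are hypothetical.

```python
import numpy as np


def aicc_least_squares(y, y_hat, n_coef):
    """AICc of a least-squares fit, assuming Gaussian errors.

    k counts the regression coefficients plus the error variance.
    """
    n = len(y)
    k = n_coef + 1
    rss = float(np.sum((y - y_hat) ** 2))
    aic = n * np.log(rss / n) + 2 * k
    # Small-sample correction; requires n > k + 1.
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def aicc_weights(x, y):
    """Return Akaike weights for null, linear, and quadratic polynomial fits."""
    scores = {}
    for name, degree in (("null", 0), ("linear", 1), ("quadratic", 2)):
        coefs = np.polyfit(x, y, degree)
        scores[name] = aicc_least_squares(y, np.polyval(coefs, x), degree + 1)
    best = min(scores.values())
    # Relative likelihood of each model given its AICc difference from the best.
    rel = {name: np.exp(-0.5 * (score - best)) for name, score in scores.items()}
    total = sum(rel.values())
    return {name: value / total for name, value in rel.items()}


# Simulated linear data with N = 25 (parameter values are arbitrary).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)
y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, size=x.size)
weights = aicc_weights(x, y)
print(weights)
```

The weights sum to 1, so each can be read as the relative support for that model within the candidate set.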
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This study aims to determine the effect of conflict on employee performance at Giant Pekanbaru. A sample of 90 people was used. Data were collected through questionnaires and analyzed at a significance level of 0.05 using a validity test, a reliability test with Cronbach's alpha, simple linear regression, a t-test, and the coefficient of determination, R Square (R2). The analysis was run in SPSS Version 16.0 and yielded the simple linear regression equation Y = 45.561 + 0.256X. The t-test gave t-count > t-table, or 2.250 > 1.987, so it can be concluded that conflict has a significant influence on performance. For the variable Y (performance), R Square (R2) was 0.597, meaning that the independent variable (conflict) accounts for 59.7% of the variation in the dependent variable (performance), while the remaining 40.3% is influenced by other variables not examined.
This dataset was created by Gaurav B R
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data and code associated with the study "Quantifying accuracy and precision from continuous response data in studies of spatial perception and crossmodal recalibration" by Patrick Bruns, Caroline Thun, and Brigitte Röder.
example_code.R contains analysis code that can be used to calculate error-based and regression-based localization performance metrics from single-subject response data, with a working example in R. It requires as inputs a numeric vector containing the stimulus location (true value) in each trial and a numeric vector containing the corresponding localization response (perceived value) in each trial.
example_data.csv contains the data used in the working example of the analysis code.
localization.csv contains extracted localization performance metrics from 188 subjects which were analyzed in the study to assess the agreement between error-based and regression-based measures of accuracy and precision. The subjects had all naively performed an azimuthal sound localization task (see related identifiers for the underlying raw data).
recalibration.csv contains extracted localization performance metrics from a subsample of 57 subjects in whom data from a second sound localization test, performed after exposure to audiovisual stimuli in which the visual stimulus was consistently presented 13.5° to the right of the sound source, were available. The file contains baseline performance (pre) and changes in performance after audiovisual exposure relative to baseline (delta) in each of the localization performance metrics.
Localization performance metrics were either derived from the single-trial localization errors (error-based approach) or from a linear regression of localization responses on the actual target locations (regression-based approach). The following localization performance metrics were included in the study:
bias: overall bias of localization responses to the left (negative values) or to the right (positive values), equivalent to constant error (CE) in error-based approaches and intercept in regression-based approaches
absolute constant error (aCE): absolute value of bias (or CE), indicates the amount of bias irrespective of direction
mean absolute constant error (maCE): mean of the aCE per target location, reflects over- or underestimation of peripheral target locations
variable error (VE): mean of the standard deviations (SD) of the single-trial localization errors at each target location
pooled variable error (pVE): SD of the single-trial localization errors pooled across trials from all target locations
absolute error (AE): mean of the absolute values of the single-trial localization errors, sensitive to both bias and variability of the localization responses
slope: slope of the regression model function, indicates an overestimation (values > 1) or underestimation (values < 1) of peripheral target locations
R2: coefficient of determination of the regression model, indicates the goodness of the fit of the localization responses to the regression line
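The metric definitions above can be sketched in Python as follows. This is not the example_code.R shipped with the dataset. It is an illustrative translation that assumes errors are computed as response minus stimulus and that standard deviations use the sample formula (ddof=1), neither of which is stated in the description. The function name localization_metrics and the trial values are hypothetical.

```python
import numpy as np


def localization_metrics(stimulus, response, ddof=1):
    """Compute error-based and regression-based localization metrics.

    Assumes error = response - stimulus and sample SDs (ddof=1);
    each target location needs at least two trials.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    response = np.asarray(response, dtype=float)
    error = response - stimulus
    locations = np.unique(stimulus)
    # Constant error and SD of the errors at each target location.
    ce_per_loc = np.array([error[stimulus == loc].mean() for loc in locations])
    sd_per_loc = np.array([error[stimulus == loc].std(ddof=ddof) for loc in locations])
    # Linear regression of responses on true target locations.
    slope, intercept = np.polyfit(stimulus, response, 1)
    fitted = intercept + slope * stimulus
    ss_res = np.sum((response - fitted) ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return {
        "CE": error.mean(),
        "aCE": abs(error.mean()),
        "maCE": np.abs(ce_per_loc).mean(),
        "VE": sd_per_loc.mean(),
        "pVE": error.std(ddof=ddof),
        "AE": np.abs(error).mean(),
        "slope": slope,
        "intercept": intercept,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical trials: three target locations (degrees), two trials each.
stimulus = [-20, -20, 0, 0, 20, 20]
response = [-23, -21, 1, 3, 25, 27]
metrics = localization_metrics(stimulus, response)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the target locations are symmetric around zero, so the regression intercept equals the constant error, and the slope above 1 reflects overestimation of the peripheral targets.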
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Samples of curves, or functional data, usually present phase variability in addition to amplitude variability. Existing functional regression methods do not handle phase variability in an efficient way. In this article we propose a functional regression method that incorporates phase synchronization as an intrinsic part of the model, and then attains better predictive power than ordinary linear regression in a simple and parsimonious way. The finite-sample properties of the estimators are studied by simulation. As an example of application, we analyze neuromotor data arising from a study of human lip movement. This article has supplementary materials online.
The estimates and standard errors were computed as follows:
Foot placement control sensitivities were estimated from linear regression over motion capture data, using methods identical to those in Wang and Srinivasan [1], Perry and Srinivasan [2], Joshi and Srinivasan [3], and Seethapathi and Srinivasan [4]. The estimates and standard errors were obtained from the linear regression function fitlm in MATLAB. The resting metabolic data involve only simple averages and are from Hanford and Srinivasan [5], Seethapathi and Srinivasan [6], and Brown, Seethapathi, and Srinivasan [7]. The estimates and standard errors were obtained by elementary formulas for mean and standard error. The exponential fit to the walking metabolic rate is performed using fminunc in MATLAB to minimize a mean squared error between an exponential a0 + a1*exp(-lambda t) and the data [7].
References [1] Wang, Yang, and Manoj Srinivasan. "Stepping in the direction of the fall: the next foot placement can be predicted f...
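The exponential fit a0 + a1*exp(-lambda t) described above can be sketched in Python. This is not the authors' MATLAB code and does not use fminunc. As a deliberately simpler substitute, it scans candidate lambda values and, for each one, solves for a0 and a1 by linear least squares, keeping the combination with the lowest mean squared error. The function name fit_exponential, the lambda grid, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_exponential(t, y, lambdas):
    """Fit y ~ a0 + a1 * exp(-lam * t) by scanning candidate lam values.

    For a fixed lam the model is linear in (a0, a1), so those two
    coefficients are found by ordinary least squares.
    Returns (mse, a0, a1, lam) for the best candidate.
    """
    best = None
    for lam in lambdas:
        design = np.column_stack([np.ones_like(t), np.exp(-lam * t)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        mse = float(np.mean((design @ coef - y) ** 2))
        if best is None or mse < best[0]:
            best = (mse, float(coef[0]), float(coef[1]), float(lam))
    return best


# Synthetic, noise-free data with arbitrary parameter values.
t = np.linspace(0.0, 10.0, 50)
y = 3.0 + 2.0 * np.exp(-0.5 * t)
mse, a0, a1, lam = fit_exponential(t, y, np.linspace(0.05, 2.0, 40))
print(f"a0={a0:.3f}, a1={a1:.3f}, lambda={lam:.3f}, mse={mse:.2e}")
```

A general-purpose unconstrained optimizer, as used in the original analysis, searches lambda continuously instead of over a fixed grid; the grid here only keeps the sketch dependency-free.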