100+ datasets found
  1. An example data set for exploration of Multiple Linear Regression

    • data.usgs.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +1more
    Updated Feb 24, 2024
    Cite
    William Farmer (2024). An example data set for exploration of Multiple Linear Regression [Dataset]. http://doi.org/10.5066/P9T5ZEXV
    Dataset updated
    Feb 24, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    William Farmer
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    1956 - 2016
    Description

    This data set contains example data for exploring the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.

  2. Data for multiple linear regression models for predicting microcystin...

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Data for multiple linear regression models for predicting microcystin concentration action-level exceedances in selected lakes in Ohio [Dataset]. https://catalog.data.gov/dataset/data-for-multiple-linear-regression-models-for-predicting-microcystin-concentration-action
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Ohio
    Description

    Site-specific multiple linear regression models were developed for eight sites in Ohio (six in the Western Lake Erie Basin and two in northeast Ohio on inland reservoirs) to quickly predict action-level exceedances for a cyanotoxin, microcystin, in recreational and drinking waters used by the public. Real-time models include easily- or continuously-measured factors that do not require that a sample be collected. Real-time models are presented in two categories: (1) six models with continuous monitor data, and (2) three models with on-site measurements. Real-time models commonly included variables such as phycocyanin, pH, specific conductance, and streamflow or gage height. Many of the real-time factors were averages over time periods antecedent to the time the microcystin sample was collected, including water-quality data compiled from continuous monitors. Comprehensive models use a combination of discrete sample-based measurements and real-time factors. Comprehensive models were useful at some sites with lagged variables (< 2 weeks) for cyanobacterial toxin genes, dissolved nutrients, and (or) N to P ratios. Comprehensive models are presented in three categories: (1) three models with continuous monitor data and lagged comprehensive variables, (2) five models with no continuous monitor data and lagged comprehensive variables, and (3) one model with continuous monitor data and same-day comprehensive variables. Funding for this work was provided by the Ohio Water Development Authority and the U.S. Geological Survey Cooperative Water Program.

  3. Linear Regression example Dataset

    • kaggle.com
    Updated Dec 20, 2021
    + more versions
    Cite
    Çağrı Karadeniz (2021). Linear Regression example Dataset [Dataset]. https://www.kaggle.com/datasets/arkaradeniz/linear-regression-example-dataset
    Dataset updated
    Dec 20, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Çağrı Karadeniz
    Description

    Dataset

    This dataset was created by Çağrı Karadeniz


  4. Panel dataset on Brazilian fuel demand

    • data.mendeley.com
    Updated Oct 7, 2024
    Cite
    Sergio Prolo (2024). Panel dataset on Brazilian fuel demand [Dataset]. http://doi.org/10.17632/hzpwbp7j22.1
    Dataset updated
    Oct 7, 2024
    Authors
    Sergio Prolo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary: Fuel demand is shown to be influenced by fuel prices, income, and motorization rates. We explore the effect of electric vehicle motorization rates on gasoline demand using this panel dataset.

    Files: dataset.csv - Panel dimensions are the Brazilian state (i) and year (t). The other columns are gasoline sales per capita (ln_Sg_pc), prices of gasoline (ln_Pg) and ethanol (ln_Pe) and their lags, motorization rates of combustion vehicles (ln_Mi_c) and electric vehicles (ln_Mi_e), and GDP per capita (ln_gdp_pc). All variables are in natural logs, since we use them to calculate demand elasticities in a regression model.

    adjacency.csv - The adjacency matrix used in interaction with electric vehicles' motorization rates to calculate spatial effects. At first, it follows a binary adjacency formula: for each pair of states i and j, the cell (i, j) is 0 if the states are not adjacent and 1 if they are. Then, each row is normalized to have sum equal to one.

    regression.do - Series of Stata commands used to estimate the regression models of our study. dataset.csv must be imported first; see the comment section.

    dataset_predictions.xlsx - Based on the estimates from Stata, we use this Excel file to make average predictions by year and by state. By including years beyond the last panel sample, we also forecast the model into the future and evaluate the effects of different policies that influence gasoline prices (taxation) and EV motorization rates (electrification). This file is primarily used to create images, but it can also be used to understand how the forecasting scenarios are set up.

    Sources: Fuel prices and sales: ANP (https://www.gov.br/anp/en/access-information/what-is-anp/what-is-anp); state population, GDP and vehicle fleet: IBGE (https://www.ibge.gov.br/en/home-eng.html?lang=en-GB); state EV fleet: Anfavea (https://anfavea.com.br/en/site/anuarios/).
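
The row-normalized adjacency matrix described for adjacency.csv can be sketched as follows. This is a minimal illustration, not the authors' code: the four-state layout, the `row_normalize` helper, and the made-up `ln_ev` values are all assumptions, and the handling of a state with no neighbors (left as an all-zero row) is a choice the source does not specify.

```python
def row_normalize(matrix):
    """Divide each row by its sum so that every non-empty row sums to one."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Assumption: a row with no neighbors is left as zeros.
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized

# Hypothetical binary adjacency: cell (i, j) is 1 if states i and j share a border.
binary = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
weights = row_normalize(binary)

# Spatial lag: the neighbor-weighted average of a variable (made-up values).
ln_ev = [0.2, 0.4, 0.6, 0.8]
spatial_lag = [sum(w * v for w, v in zip(row, ln_ev)) for row in weights]
```

After normalization, multiplying the matrix by a column of values gives each state's average over its neighbors, which is how such a matrix is typically used to build a spatial-effect regressor.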

  5. Stonybrook_AMS578_Multiple_Regression_Dataset

    • kaggle.com
    Updated Dec 20, 2020
    Cite
    Joseph Chan (2020). Stonybrook_AMS578_Multiple_Regression_Dataset [Dataset]. https://www.kaggle.com/josephchan524/stonybrook-ams578-multiple-regression-dataset/code
    Dataset updated
    Dec 20, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Joseph Chan
    Description

    Context

    This dataset is a multiple regression project from a graduate-level Applied Math Science course at Stony Brook (AMS578, Spring 2020).

    The class blackboard has a pdf file of a paper by Caspi et al. that reports a finding of a gene-environment interaction. This paper used multiple regression techniques as the methodology for its findings. You should read it for background, as it is the genesis of the models that you will be given. The data that you are analyzing is synthetic. That is, the TA used a model to generate the data. Your task is to find the model that the TA used for your data. For example, one possible model is

    The class blackboard also contains a paper by Risch et al. that uses a larger collection of data to assess the findings in Caspi et al. These researchers confirmed that Caspi et al. calculated their results correctly but that no other dataset had the relation reported in Caspi et al. That is, Caspi et al. seem to have reported a false positive (Type I error). The class blackboard contains a recent paper about the genetics of mental illness and a technical appendix giving the specifics. Together these papers are an example of the response of the research community to studying the genetics of mental illness, which is a notoriously difficult research area.

    Content

    One file contains the patient identifier and the dependent variable value. The second file contains the patient identifier and values of six environment variables called E1 to E6. The third file contains the patient identifier and the twenty independent indicator variables called G1 to G20. The records may not be in the same order in each file, and cases may be missing in one or more of the files. You can process the data with VLOOKUP or other data-merging software.
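
Because the three files may be ordered differently and may each be missing cases, they need to be joined on the patient identifier rather than row by row. The sketch below shows one way to do that with an inner join in plain Python. The tiny tables, the variable names, and the `merge_by_id` helper are hypothetical stand-ins, not the actual files or their column names.

```python
def merge_by_id(*tables):
    """Inner-join tables keyed by patient ID, keeping IDs present in every table."""
    common_ids = set(tables[0])
    for table in tables[1:]:
        common_ids &= set(table)
    merged = {}
    for patient_id in sorted(common_ids):
        row = {}
        for table in tables:
            row.update(table[patient_id])
        merged[patient_id] = row
    return merged

# Hypothetical miniature versions of the three files (ID -> columns).
response = {3: {"Y": 0.7}, 1: {"Y": 1.2}, 2: {"Y": -0.4}, 4: {"Y": 2.1}}
environment = {1: {"E1": 0.5}, 2: {"E1": 1.5}, 3: {"E1": 2.5}}
genes = {2: {"G1": 0}, 3: {"G1": 1}, 1: {"G1": 1}, 5: {"G1": 0}}

merged = merge_by_id(response, environment, genes)
```

Patients 4 and 5 are dropped here because each is missing from at least one table; whether to drop or impute such cases is an analysis decision the description leaves open.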

  6. Data from: Dataset of the paper “Variable selection for linear regression in...

    • investigacion.ubu.es
    Updated 2020
    Cite
    Pacheco Bonrostro, Joaquín; Casado Yusta, Silvia (2020). Dataset of the paper “Variable selection for linear regression in large databases: exact methods” Applied Intelligence, 51(6), 3736-3756 [Dataset]. https://investigacion.ubu.es/documentos/682afba74c44bf76b28811e1
    Dataset updated
    2020
    Authors
    Pacheco Bonrostro, Joaquín; Casado Yusta, Silvia
    Description

    The variable selection problem in the context of Linear Regression for large databases is analysed. The problem consists in selecting a small subset of independent variables that can perform the prediction task optimally. This problem has a wide range of applications. One important type of application is the design of composite indicators in various areas (sociology and economics, for example). Other important applications of variable selection in linear regression can be found in fields such as chemometrics, genetics, and climate prediction, among many others. For this problem, we propose a Branch & Bound method. This is an exact method and therefore guarantees optimal solutions. We also provide strategies that enable this method to be applied in very large databases (with hundreds of thousands of cases) in a moderate computation time. A series of computational experiments shows that our method performs well compared with well-known methods in the literature and with commercial software.

  7. Data from: Mining Distance-Based Outliers in Near Linear Time

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Apr 11, 2025
    + more versions
    Cite
    Dashlink (2025). Mining Distance-Based Outliers in Near Linear Time [Dataset]. https://catalog.data.gov/dataset/mining-distance-based-outliers-in-near-linear-time
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Full title: Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule

    Abstract: Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
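
The idea in the abstract (a nested loop over randomly ordered data, abandoning a candidate as soon as it cannot be an outlier) can be sketched as below. This is an illustrative reading, not the authors' implementation: scoring each point by the distance to its k-th nearest neighbor is an assumption, and `top_outliers` and its parameters are hypothetical names.

```python
import heapq
import math
import random

def top_outliers(points, k=2, n_outliers=1, seed=0):
    """Return (score, index) for the n_outliers points farthest from their k-th nearest neighbor."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # random order lets non-outliers be pruned early
    cutoff = 0.0  # score of the weakest outlier currently kept
    top = []      # min-heap of (score, index)
    for i in order:
        nearest = []  # negated distances: a max-heap of the k smallest seen so far
        pruned = False
        for j in order:
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # Pruning rule: the k-th nearest distance can only shrink as more
            # points are examined, so once it is below the cutoff, stop.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n_outliers:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n_outliers:
            cutoff = top[0][0]
    return sorted(top, reverse=True)

# Four clustered points and one distant point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
result = top_outliers(points, k=2, n_outliers=1)
```

The inner loop is still quadratic in the worst case; the saving comes from typical points finding k close neighbors after only a few comparisons, at which point they are dropped.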

  8. toy_lr

    • kaggle.com
    Updated Oct 9, 2022
    Cite
    David Dragon (2022). toy_lr [Dataset]. https://www.kaggle.com/datasets/daviddragon/toy-lr
    Dataset updated
    Oct 9, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    David Dragon
    Description

    A toy dataset for running linear regression! The dataset consists of inputs and targets. Inputs are of shape (1000, 10), where there are 1000 examples and 10 input features. Targets are of shape (1000,), one target per example. Submit learned weights and biases at https://forms.gle/R4gRgrSYcMTPXZUy9 to get a score! Template notebook to get started: https://www.kaggle.com/code/daviddragon/toy-lr-template/notebook
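
For data of the shape described (1000 examples, 10 features, one target each), weights and a bias can be learned with ordinary least squares. The sketch below uses synthetic stand-in data, since the actual toy_lr files are not reproduced here; `true_weights`, `true_bias`, and the noise level are assumptions made only so the example runs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in with the stated shapes: inputs (1000, 10), targets (1000,).
inputs = rng.normal(size=(1000, 10))
true_weights = rng.normal(size=10)   # assumed, for illustration only
true_bias = 0.5                      # assumed, for illustration only
targets = inputs @ true_weights + true_bias + 0.01 * rng.normal(size=1000)

# Append a column of ones so the bias is estimated together with the weights.
design = np.column_stack([inputs, np.ones(len(inputs))])
solution, *_ = np.linalg.lstsq(design, targets, rcond=None)
weights, bias = solution[:-1], solution[-1]
```

Gradient descent would reach the same solution on a problem this small; the closed-form least-squares solve is simply the shortest route.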

  9. Replication data for: Linear Models with Outliers: Choosing between...

    • datasearch.gesis.org
    • dataverse.harvard.edu
    • +1more
    Updated Jan 22, 2020
    Cite
    Harden, Jeffrey; Desmarais, Bruce (2020). Replication data for: Linear Models with Outliers: Choosing between Conditional-Mean and Conditional-Median Methods [Dataset]. https://datasearch.gesis.org/dataset/httpsdataverse.unc.eduoai--hdl1902.2911608
    Dataset updated
    Jan 22, 2020
    Dataset provided by
    Odum Institute Dataverse Network
    Authors
    Harden, Jeffrey; Desmarais, Bruce
    Description

    State politics researchers commonly employ ordinary least squares (OLS) regression or one of its variants to test linear hypotheses. However, OLS is easily influenced by outliers and thus can produce misleading results when the error term distribution has heavy tails. Here we demonstrate that median regression (MR), an alternative to OLS that conditions the median of the dependent variable (rather than the mean) on the independent variables, can be a solution to this problem. Then we propose and validate a hypothesis test that applied researchers can use to select between OLS and MR in a given sample of data. Finally, we present two examples from state politics research in which (1) the test selects MR over OLS and (2) differences in results between the two methods could lead to different substantive inferences. We conclude that MR and the test we propose can improve linear models in state politics research.

  10. SPSS Data Set S1 Logistic Regression Model Data

    • figshare.com
    bin
    Updated Jan 19, 2016
    Cite
    Michelle Klailova; Phyllis Lee (2016). SPSS Data Set S1 Logistic Regression Model Data [Dataset]. http://doi.org/10.6084/m9.figshare.1051748.v2
    Available download formats: bin
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Michelle Klailova; Phyllis Lee
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data set from the published PLOS ONE article entitled “Western Lowland Gorillas Signal Selectively Using Odor”.

  11. Dataset for: A comparison of approaches for simultaneous inference of fixed...

    • wiley.figshare.com
    txt
    Updated Jun 2, 2023
    Cite
    Signe Marie Jensen; Christian Ritz (2023). Dataset for: A comparison of approaches for simultaneous inference of fixed effects for multiple outcomes using linear mixed models [Dataset]. http://doi.org/10.6084/m9.figshare.5954836.v1
    Available download formats: txt
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Wiley (https://www.wiley.com/)
    Authors
    Signe Marie Jensen; Christian Ritz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Longitudinal studies with multiple outcomes often pose challenges for the statistical analysis. A joint model including all outcomes has the advantage of incorporating the simultaneous behavior but is often difficult to fit due to computational challenges. We consider two alternative approaches in order to quantify and assess the loss in efficiency as compared to joint modelling when evaluating fixed effects. The first approach is pairwise fitting of pseudo-likelihood functions for pairs of outcomes. The second approach recovers correlations between parameter estimates across multiple marginal linear mixed models. The methods are evaluated both in terms of a data example from a study on the effects of milk protein on health in young adolescents and in an extensive simulation study. We find that the two alternatives give similar results in settings where an exchangeability condition is met, but otherwise pairwise fitting shows a larger loss in efficiency than the marginal models approach. Using an alternative to the joint modelling strategy will lead to some but not necessarily a large loss of efficiency for small sample sizes.

  12. Linear Algebra Dataset for Students

    • kaggle.com
    Updated Apr 9, 2025
    Cite
    Aswina Vinod Kumar A (2025). Linear Algebra Dataset for Students [Dataset]. https://www.kaggle.com/datasets/aswinavinod/linear-algebra-dataset-for-students/versions/1
    Dataset updated
    Apr 9, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aswina Vinod Kumar A
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains a small set of linear algebra concepts, definitions, and example solutions that I’ve compiled from my personal learning. Topics include determinants, trace, eigenvalues, eigenvectors, and basic matrix properties. Each concept is explained with relevant formulas or proof.

  13. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart; that cell represents the slope for the temporal range 2004–2014. This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided.

    TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel.

    Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).

    TSMx R script:

        # import packages
        library(dplyr)
        library(readr)
        library(ggplot2)
        library(tibble)
        library(tidyr)
        library(forcats)
        library(Kendall)

        options(warn = -1) # disable warnings

        # read data (.csv file with "Year" and "Value" columns)
        data <- read_csv("EVI.csv")

        # prepare row/column names for output matrices
        years <- data %>% pull("Year")
        r.names <- years[-length(years)]
        c.names <- years[-1]
        years <- years[-length(years)]

        # initialize output matrices
        sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

        # function to return remaining years given a start year
        getRemain <- function(start.year) {
          years <- data %>% pull("Year")
          start.ind <- which(data[["Year"]] == start.year) + 1
          remain <- years[start.ind:length(years)]
          return(remain)
        }

        # function to subset data for a start/end year combination
        splitData <- function(end.year, start.year) {
          keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
          batch <- data[keep,]
          return(batch)
        }

        # function to fit linear regression and return slope direction
        fitReg <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(sign(slope))
        }

        # function to fit linear regression and return slope magnitude
        fitRegv2 <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(slope)
        }

        # function to implement Mann-Kendall (MK) trend test and return significance
        # the test is implemented only for n>=8
        getMann <- function(batch) {
          if (nrow(batch) >= 8) {
            mk <- MannKendall(batch[['Value']])
            pval <- mk[['sl']]
          } else {
            pval <- NA
          }
          return(pval)
        }

        # function to return slope direction for all combinations given a start year
        getSign <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          signs <- lapply(combs, fitReg)
          return(signs)
        }

        # function to return MK significance for all combinations given a start year
        getPval <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          pvals <- lapply(combs, getMann)
          return(pvals)
        }

        # function to return slope magnitude for all combinations given a start year
        getMagn <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          magns <- lapply(combs, fitRegv2)
          return(magns)
        }

        # retrieve slope direction, MK significance, and slope magnitude
        signs <- lapply(years, getSign)
        pvals <- lapply(years, getPval)
        magns <- lapply(years, getMagn)

        # fill-in output matrices
        dimension <- nrow(sign.matrix)
        for (i in 1:dimension) {
          sign.matrix[i, i:dimension] <- unlist(signs[i])
          pval.matrix[i, i:dimension] <- unlist(pvals[i])
          slope.matrix[i, i:dimension] <- unlist(magns[i])
        }

        sign.matrix <-...

  14. Data from: Solving linear regression without skewness of the residuals’...

    • tandf.figshare.com
    txt
    Updated Jun 5, 2023
    Cite
    Martin Ricker (2023). Solving linear regression without skewness of the residuals’ distribution [Dataset]. http://doi.org/10.6084/m9.figshare.8152901.v1
    Available download formats: txt
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Martin Ricker
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Linear ordinary least squares (OLS) regression assumes an unskewed distribution of the residuals for correct inference and prediction. A proof is given that for Manly’s exponential transformation of the dependent variable, there is always at least one solution for λ such that the skewness of the standardized residuals’ distribution is zero. Mathematica code and an illustrative example are provided. Generalized linear models are discussed briefly in comparison.
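
For orientation, Manly's exponential transformation of a dependent variable y with parameter λ is usually written in the form below. This is the commonly cited textbook form, given here on the assumption that the paper uses the same variant; the description itself does not spell it out.

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{e^{\lambda y} - 1}{\lambda}, & \lambda \neq 0, \\[4pt]
y, & \lambda = 0.
\end{cases}
```

The regression is then fitted to the transformed response, and λ is chosen so that the skewness of the resulting standardized residuals is zero.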

  15. Introduction to spatial statistics - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Nov 24, 2024
    + more versions
    Cite
    (2024). Introduction to spatial statistics - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/63f3ba40-a121-5fd4-bc46-ae5c7797c215
    Dataset updated
    Nov 24, 2024
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset constitutes an introduction to plotting and mapping and to the essential concepts of spatial data management and modeling. It also includes data ready for several examples of regression and classification algorithms (Multiple Linear Regression, Generalized Linear Models, CART and Random Forest) and for exploring classic interpolation methods (Inverse Distance Weighting and Kriging). Software: R 4.2.1. This research has led to the development of teaching materials and the improvement of teaching practices.

  16. DataSheet1_Repeated Measures Correlation.pdf

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Jonathan Z. Bakdash; Laura R. Marusich (2023). DataSheet1_Repeated Measures Correlation.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2017.00456.s001
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Jonathan Z. Bakdash; Laura R. Marusich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Repeated measures correlation (rmcorr) is a statistical technique for determining the common within-individual association for paired measures assessed on two or more occasions for multiple individuals. Simple regression/correlation is often applied to non-independent observations or aggregated data; this may produce biased, specious results due to violation of independence and/or differing patterns between-participants versus within-participants. Unlike simple regression/correlation, rmcorr does not violate the assumption of independence of observations. Also, rmcorr tends to have much greater statistical power because neither averaging nor aggregation is necessary for an intra-individual research question. Rmcorr estimates the common regression slope, the association shared among individuals. To make rmcorr accessible, we provide background information for its assumptions and equations, visualization, power, and tradeoffs with rmcorr compared to multilevel modeling. We introduce the R package (rmcorr) and demonstrate its use for inferential statistics and visualization with two example datasets. The examples are used to illustrate research questions at different levels of analysis, intra-individual, and inter-individual. Rmcorr is well-suited for research questions regarding the common linear association in paired repeated measures data. All results are fully reproducible.
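
The contrast between a pooled correlation and a within-individual one can be illustrated with a small sketch. This is not the rmcorr R package: it removes each subject's own mean from both measures and correlates what is left, which should give the same point estimate as a common-slope model with per-subject intercepts but omits the degrees of freedom, p-value, and confidence interval. The data and function names are made up for illustration.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def within_subject_correlation(subjects, xs, ys):
    """Correlate x and y after subtracting each subject's own means."""
    groups = defaultdict(list)
    for subject, x, y in zip(subjects, xs, ys):
        groups[subject].append((x, y))
    centered_x, centered_y = [], []
    for pairs in groups.values():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        centered_x.extend(x - mean_x for x, _ in pairs)
        centered_y.extend(y - mean_y for _, y in pairs)
    return pearson(centered_x, centered_y)

# Made-up data: 4 subjects, 3 occasions each. Subjects with higher x also have
# higher y overall, but within every subject y falls as x rises.
subjects = [s for s in range(4) for _ in range(3)]
xs = [3 * s + t for s in range(4) for t in range(3)]
ys = [6 * s - t for s in range(4) for t in range(3)]

pooled = pearson(xs, ys)                                 # ignores the grouping
within = within_subject_correlation(subjects, xs, ys)    # within-individual association
```

In this constructed example the pooled correlation is strongly positive while the within-subject one is negative, which is the kind of discrepancy the description warns about when simple correlation is applied to non-independent observations.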

  17. Simple download service (Atom) of the dataset: Linear entity at the origin...

    • data.europa.eu
    unknown
    Updated Jan 26, 2022
    Cite
    (2022). Simple download service (Atom) of the dataset: Linear entity at the origin of the risk of Loison-sub-Lens NPP [Dataset]. https://data.europa.eu/data/datasets/fr-120066022-srv-af5937d8-dd86-424a-a3c8-541b84ba8b68
    Available download formats: unknown
    Dataset updated
    Jan 26, 2022
    Description

    The origin of the risk characterises the real-world entity which, through its presence, represents a potential risk. This origin may be characterised by a name and, in some cases, a geographical object locating the actual entity causing the risk. The location of the entity and the knowledge of the hazardous phenomenon are used to define the risk pools, the risk-exposed areas that underpin the RPP. For NPPs, this entity may, for example, correspond to a river or a geologically unstable area.

  18. Simple download service (Atom) of the dataset: Linear coverings of the POS...

    • data.europa.eu
    + more versions
    Cite
    Simple download service (Atom) of the dataset: Linear coverings of the POS (doc. of 07.04.2000) of the municipality of Ménétréol-sous-Sancerre [Dataset]. https://data.europa.eu/data/datasets/fr-120066022-srv-ef8a5ed6-50a0-4b71-b7f8-71cf4d44783c
    Available download formats: inspire download service
    Description

    The cladding elements are entries in relation to a regulatory provision (way width, odds, names of neighbouring municipalities.) or geometrical surface, linear or point indicative elements, dressing the graphic documents of the PLU or the POS. They are necessary for the paper edition of the applicable graphic documents. This may be, for example, a hold of a detail plan, a frame, a cartridge, a reminder for a writing, a draw to draw a rating, an equipment identification label

  19. Linear Separation via Optimism - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). Linear Separation via Optimism - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/linear-separation-via-optimism
    Dataset updated
    Dec 16, 2024
    Description

    The dataset is a linearly separable dataset with margin γ > 0, where each example pair is described by a vector x(i) ∈ R^d and a label y(i) ∈ {±1}.
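
To make "linearly separable with margin γ" concrete: a direction w separates the data with margin γ if every example satisfies y(i) · ⟨w, x(i)⟩ / ‖w‖ ≥ γ. The sketch below computes that quantity for a candidate w on made-up two-dimensional examples; the data, the candidate vector, and the `margin` helper are assumptions for illustration and are not taken from the dataset.

```python
from math import sqrt

def margin(w, examples):
    """Smallest signed distance from any example to the hyperplane through the origin with normal w."""
    norm = sqrt(sum(c * c for c in w))
    return min(
        y * sum(wi * xi for wi, xi in zip(w, x)) / norm
        for x, y in examples
    )

# Made-up examples (x, y) with labels in {+1, -1}.
examples = [
    ((2.0, 1.0), 1),
    ((1.0, 3.0), 1),
    ((-1.0, -2.0), -1),
    ((-3.0, 0.0), -1),
]
w = (1.0, 1.0)           # candidate separating direction (assumed)
gamma = margin(w, examples)
```

A positive result means w classifies every example correctly, and its value is the margin achieved by that particular w; the margin of the dataset is the largest such value over all directions.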

  20. Simple download service (Atom) of the dataset: Linear entity at the origin...

    • data.europa.eu
    Updated Apr 1, 2019
    Cite
    (2019). Simple download service (Atom) of the dataset: Linear entity at the origin of the risk of Mazingarbe PPRN [Dataset]. https://data.europa.eu/data/datasets/fr-120066022-srv-dd4fe46a-c6c0-47ac-8c1c-c33ca06f1db5/embed
    Available download formats: inspire download service
    Dataset updated
    Apr 1, 2019
    Description

    The origin of the risk characterises the real-world entity which, through its presence, represents a potential risk. This origin may be characterised by a name and, in some cases, a geographical object locating the actual entity causing the risk. The location of the entity and the knowledge of the hazardous phenomenon are used to define the risk pools, the risk-exposed areas that underpin the RPP. For NPPs, this entity may, for example, correspond to a river or a geologically unstable area.
