This dataset was created by Veronica Zheng
Released under Other (specified in description)
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by R Durai Srinivasan
Released under MIT
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This dataset contains example data for exploration of the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
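As a toy illustration of the regression-based regionalization idea described above, the sketch below fits a multiple linear regression of a streamflow percentile on basin characteristics and predicts at an ungaged site. It is written in Python with entirely made-up values; the actual release uses R, 293 streamgages, and explanatory variables from GAGES-II.

```python
import numpy as np

# Hypothetical basin characteristics (explanatory variables) for 5 streamgages:
# drainage area (km^2) and mean annual precipitation (mm); values are made up.
X = np.array([
    [120.0,  900.0],
    [450.0, 1100.0],
    [ 80.0,  750.0],
    [300.0, 1300.0],
    [210.0,  980.0],
])
# Hypothetical response: 90th percentile of annual maximum streamflow (m^3/s)
y = np.array([35.0, 120.0, 20.0, 95.0, 60.0])

# Add an intercept column and solve the ordinary-least-squares problem
A = np.column_stack([np.ones(len(y)), X])
coef, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
intercept, b_area, b_precip = coef

# Regionalization step: predict at an ungaged site from its basin characteristics
y_hat = intercept + b_area * 250.0 + b_precip * 1000.0
print(round(float(y_hat), 1))
```

The point of the regionalization approach is exactly this last step: the fitted relation transfers an at-site statistic to sites with no streamflow record.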
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
File List
glmmeg.R: R code demonstrating how to fit a logistic regression model, with a random intercept term, to randomly generated overdispersed binomial data.
boot.glmm.R: R code for estimating P-values by applying the bootstrap to a GLMM likelihood ratio statistic.
Description
glmmeg.R is example R code showing how to fit a logistic regression model (with or without a random-effects term) and how to use diagnostic plots to check the fit. The code is run on randomly generated data, constructed so that overdispersion is evident. This code can be applied directly to your own analyses if you read into R a data.frame called “dataset” with columns labelled “success” and “failure” (the numbers of binomial successes and failures) and “species” (a label for the different rows in the dataset), where the goal is to test for the effect of some predictor variable called “location”. In other cases, just change the labels and formula as appropriate. boot.glmm.R extends glmmeg.R by using bootstrapping to calculate P-values in a way that provides better control of Type I error in small samples. It accepts data in the same form as that generated in glmmeg.R.
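To make the overdispersion setup concrete, here is a minimal sketch, written in Python rather than the archive's R, that simulates binomial data with a row-level random intercept (the source of the overdispersion) and then fits the fixed-effects logistic regression by Fisher scoring. The mixed-model fit and bootstrap themselves live in the R files described above; everything below (sample sizes, effect sizes) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate overdispersed binomial data: each row ("species") gets its own
# random intercept, inflating variance beyond what plain binomial allows.
n_rows, trials = 40, 20
location = rng.integers(0, 2, size=n_rows)       # binary predictor of interest
rand_int = rng.normal(0.0, 1.0, size=n_rows)     # row-level random effect
logit_p = -0.5 + 1.2 * location + rand_int       # made-up true coefficients
p = 1.0 / (1.0 + np.exp(-logit_p))
success = rng.binomial(trials, p)
failure = trials - success

# Fixed-effects logistic fit by iteratively reweighted least squares
X = np.column_stack([np.ones(n_rows), location])
beta = np.zeros(2)
for _ in range(50):
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    W = trials * mu * (1.0 - mu)                 # binomial working weights
    z = eta + (success - trials * mu) / W        # working response
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

print(beta)  # [intercept, location effect]
```

Standard errors from this fixed-effects fit would be too small here, which is exactly why the archive's code adds a random intercept and a bootstrap for the P-values.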
This R code regenerates the simulated data sets. (R 6 kb)
This data release contains one dataset and one model archive in support of the journal article "Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies" by Jennifer C. Murphy and Jeffrey G. Chanat. The model archive contains scripts (run in R) to reproduce the four machine learning models (logistic regression, linear and quadratic discriminant analysis, and k-nearest neighbors) trained and tested as part of the journal article. The dataset contains the estimated probabilities for each of these models when applied to a training and test dataset.
R code for conducting analyses described in Johnson, N.S., W.D. Swink, and T.O. Brenden. Field study suggests that sex determination in sea lamprey is directly influenced by larval growth rate. Proceedings of the Royal Society B.
data.csv: the raw data for fitting the Bayesian hierarchical logistic regression model.
Read me_Metadata...: the metadata describing the variables in the data.csv file.
Rscript.R: the R script for fitting the Bayesian hierarchical logistic regression model.
GNU AGPL v3: http://www.gnu.org/licenses/agpl-3.0.html
Data for Hilbe, J.M. 2015. Practical Guide to Logistic Regression (Chapman and Hall/CRC Press).
Version: 1.3
CRAN: https://CRAN.R-project.org/package=LOGIT (removed)
CRAN archive: https://cran.r-project.org/src/contrib/Archive/LOGIT (archived on 2018-05-10)
Mirror: GitHub
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data are genetic assignments to upstream or downstream of the Site C dam (bull trout, Arctic grayling, and rainbow trout). Columns are defined in the csv file. A file of R code to run the analysis is also included.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chi-square test and logistic regression model in the study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the contemporary context of a burgeoning energy crisis, the accurate and dependable prediction of Solar Radiation (SR) has emerged as an indispensable component within thermal systems to facilitate renewable energy generation. Machine Learning (ML) models have gained widespread recognition for their precision and computational efficiency in addressing SR prediction challenges. Consequently, this paper introduces an innovative SR prediction model, denoted as the Cheetah Optimizer-Random Forest (CO-RF) model. The CO component plays a pivotal role in selecting the most informative features for hourly SR forecasting, which subsequently serve as inputs to the RF model. The efficacy of the developed CO-RF model is rigorously assessed using two publicly available SR datasets. Evaluation metrics encompassing Mean Absolute Error (MAE), Mean Squared Error (MSE), and the coefficient of determination (R2) are employed to validate its performance. Quantitative analysis demonstrates that the CO-RF model surpasses competing techniques, including Logistic Regression (LR), Support Vector Machine (SVM), Artificial Neural Network (ANN), and standalone Random Forest (RF), in both the training and testing phases of SR prediction. The proposed CO-RF model outperforms the others, achieving a low MAE of 0.0365, MSE of 0.0074, and an R2 of 0.9251 on the first dataset, and an MAE of 0.0469, MSE of 0.0032, and R2 of 0.9868 on the second dataset, demonstrating significant error reduction.
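The evaluation metrics quoted above are standard; for reference, a small Python helper (not from the paper) computing MAE, MSE, and the coefficient of determination is:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Mean Absolute Error, Mean Squared Error, and R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(y_true - y_pred))
    mse = np.mean((y_true - y_pred) ** 2)
    ss_res = np.sum((y_true - y_pred) ** 2)       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, mse, r2

# Tiny worked example with made-up predictions
mae, mse, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(mae, mse, r2)  # approximately 0.15, 0.025, 0.98
```

Note that R2 compares the model against a constant-mean baseline, which is why it can approach 1 even when MAE and MSE are nonzero.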
Implementation of GWR in R
This repository contains code and files related to my project on Geographically Weighted Regression (GWR) in R. The dataset is from Badan Pusat Statistik (Statistics Indonesia).
Files
Dataset.xlsx: This file contains the dataset used in the analysis.
GWLR.R: This script implements Geographically Weighted Logistic Regression in R.
GWPR.r: This script implements Geographically Weighted Poisson Regression in R.
GWR.R: This script implements Geographically Weighted Regression in R.
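The scripts above are R; as a language-agnostic sketch of what GWR does, the Python toy below (synthetic data, a Gaussian kernel, and a fixed bandwidth h, all of which are assumptions for illustration) fits a separate weighted least-squares regression at each location, so the coefficient can vary over space:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))       # point locations (made up)
x = rng.normal(size=n)
beta_true = 0.5 + 0.2 * coords[:, 0]           # slope drifts from west to east
y = 1.0 + beta_true * x + rng.normal(0, 0.1, size=n)

def gwr_coef_at(i, h=2.0):
    """Local [intercept, slope] at observation i via kernel-weighted OLS."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = np.exp(-(d / h) ** 2 / 2.0)            # Gaussian distance-decay weights
    A = np.column_stack([np.ones(n), x])
    WA = w[:, None] * A
    return np.linalg.solve(A.T @ WA, A.T @ (w * y))

local = np.array([gwr_coef_at(i) for i in range(n)])
print(local[:3])
```

A full GWR implementation (as in the R scripts) additionally selects the bandwidth, e.g. by cross-validation or AICc, rather than fixing it.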
License
This project is licensed under the MIT License. See the LICENSE file for more details.
Contact
If you have any questions or suggestions, feel free to contact me.
Initial set of covariates for logistic regression.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The uploaded files are:
1) Excel file containing 6 sheets, in order: "Data Extraction" (summarized final data extractions from the three reviewers involved), "Comparison Data" (data related to the comparisons investigated), "Paper level data" (summaries at paper level), "Outcome Event Data" (number of events for every outcome investigated within a paper), and "Tuning Classification" (data on how the hyperparameters of the Machine Learning algorithms were tuned).
2) R script used for the analysis. (To read the data, save the "Comparison Data", "Paper level data", and "Outcome Event Data" Excel sheets as txt files. In the R script, srpap refers to the "Paper level data" sheet, srevents to the "Outcome Event Data" sheet, and srcompx to the "Comparison Data" sheet.)
3) Supplementary Material: including the search string, tables of data, and figures
4) PRISMA checklist items
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created by Gaurav B R
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1: Figure S1. Flow diagram of participants through the model development stages. T1D: type 1 diabetes; T2D: type 2 diabetes. Figure S2. ROC AUC plots obtained using the external validation dataset for seven prediction models. Legend: solid lines: black = Support Vector Machine, dark grey = Logistic Regression, light grey = Random Forest; dotted lines: black = Neural Network, dark grey = K-Nearest Neighbours, light grey = Gradient Boosting Machine. Figure S3. Correlation coefficient matrix and scatter plot of model predictions obtained from the external validation data.
The dependent variable was dichotomous, taking the value 1 for years in which a European Economic Area (EEA) country was experiencing an HIV outbreak, and 0 otherwise. The results include Odds Ratios (OR), Lower (L) and Upper (U) limits of the confidence interval (CI), P-values, the number of Observations (Obs) in each model, and the number of countries (C) from which data were obtained for at least one year.
† Per capita or per population.
†† PWI: Public Wealth Index = GDP per capita divided by the S80/S20 ratio.
Univariable logistic regression models.
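To illustrate how the odds ratios and confidence limits reported in such a table come out of a univariable logistic regression, here is a self-contained Python sketch on simulated data (not the study's data): the OR is the exponential of the fitted slope, and the 95% CI is a Wald interval from the inverse Fisher information.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical country-year data: one continuous predictor, binary outcome
n = 400
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(-1.0 + 0.8 * x)))        # made-up true model
y = rng.binomial(1, p)

# Univariable logistic regression fit by Newton-Raphson
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ beta))
    H = X.T @ ((mu * (1 - mu))[:, None] * X)   # Fisher information
    beta = beta + np.linalg.solve(H, X.T @ (y - mu))

# Odds ratio and Wald 95% CI from the asymptotic covariance
mu = 1 / (1 + np.exp(-X @ beta))
cov = np.linalg.inv(X.T @ ((mu * (1 - mu))[:, None] * X))
se = np.sqrt(np.diag(cov))
or_ = np.exp(beta[1])
lo, hi = np.exp(beta[1] - 1.96 * se[1]), np.exp(beta[1] + 1.96 * se[1])
print(f"OR={or_:.2f}  95% CI [{lo:.2f}, {hi:.2f}]")
```

An OR above 1 with a CI excluding 1 corresponds to a statistically significant positive association in such a table.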
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This R script is designed to build a high-performance logistic regression model for predicting whether a car is a bad buy (IsBadBuy) using the Carvana dataset. It improves prediction accuracy by:
Handling Missing Values – uses the median for numeric and the mode for categorical variables instead of replacing NULLs with zero.
Feature Engineering – adds a log transformation for VehBCost and dummy-encodes categorical variables for better model performance.
Model Training & Evaluation – runs a logistic regression model, calculates McFadden’s pseudo R² for model fit, and generates a hit rate score for accuracy.
Prediction & Submission – predicts IsBadBuy for the test set and creates a submission file (optimized_submission.csv) in the required Kaggle format.
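The steps above can be sketched as follows in Python (the original script is R; the tiny DataFrame below is a made-up stand-in for the Carvana data, keeping only two illustrative columns):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, accuracy_score

df = pd.DataFrame({
    "VehBCost": [7200.0, np.nan, 4500.0, 9800.0, 6100.0, 5300.0, np.nan, 8800.0],
    "Color":    ["RED", "BLUE", None, "RED", "BLUE", "RED", "BLUE", None],
    "IsBadBuy": [0, 1, 1, 0, 0, 1, 0, 0],
})

# 1) Missing values: median for numeric, mode for categorical
df["VehBCost"] = df["VehBCost"].fillna(df["VehBCost"].median())
df["Color"] = df["Color"].fillna(df["Color"].mode()[0])

# 2) Feature engineering: log transform + dummy encoding
df["LogVehBCost"] = np.log(df["VehBCost"])
X = pd.get_dummies(df[["LogVehBCost", "Color"]], drop_first=True)
y = df["IsBadBuy"]

# 3) Fit, then McFadden's pseudo R^2 = 1 - ll_model / ll_null, and hit rate
model = LogisticRegression(max_iter=1000).fit(X, y)
ll_model = -log_loss(y, model.predict_proba(X), normalize=False)
p_null = y.mean()                               # intercept-only model
ll_null = (y * np.log(p_null) + (1 - y) * np.log(1 - p_null)).sum()
mcfadden_r2 = 1.0 - ll_model / ll_null
hit_rate = accuracy_score(y, model.predict(X))  # in-sample hit rate
print(round(mcfadden_r2, 3), hit_rate)

# 4) Submission: predictions in the Kaggle two-column format
submission = pd.DataFrame({"RefId": df.index, "IsBadBuy": model.predict(X)})
```

In the real script the prediction step runs on a held-out test set and writes optimized_submission.csv; here everything is in-sample for brevity.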