Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is open-source software and an object-oriented programming language, with an integrated development environment (IDE) called RStudio, for computing statistics and producing graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provide a wide range of functions for programming and analyzing data. Unlike many existing statistical software packages, R has the added benefit of allowing users to write more efficient code by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible, and it allows users to define their own (customized) functions specifying how the program should behave while handling the data; these functions can also be stored in the simple object system.

For all intents and purposes, this book serves as both a textbook and a manual for R statistics, particularly in academic research, data analytics, and computer programming, targeted to help inform and guide the work of R users and statisticians. It provides information about the different types of statistical data analysis and methods, and the best scenarios for using each of them in R. It gives a hands-on, step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand their results. The book also covers the different data formats and sources, and how to test the reliability and validity of the available datasets. Different research experiments, case scenarios, and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R, particularly for research purposes, with examples ranging from how to import and store datasets in R as objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. Thus, the book represents a congruence of statistics and computer programming for research.
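To illustrate the vectorized, function-based style the book describes, here is a minimal R sketch; the data and the function name are our own illustration, not taken from the book:

    # Vectorized arithmetic: operations apply to whole vectors at once,
    # with no explicit loop.
    x <- c(2.1, 4.7, 3.3, 5.9)
    mean(x)    # built-in function
    x * 2      # element-wise multiplication

    # A user-defined (customized) function, stored like any other object:
    standardize <- function(v) (v - mean(v)) / sd(v)
    standardize(x)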
GNU General Public License 2.0: https://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html
Replication pack, FSE2018 submission #164
-----------------------------------------
**Working title:** Ecosystem-Level Factors Affecting the Survival of Open-Source Projects: A Case Study of the PyPI Ecosystem

**Note:** link to data artifacts is already included in the paper. Link to the code will be included in the Camera Ready version as well.

Content description
===================

- **ghd-0.1.0.zip** - the code archive. This code produces the dataset files described below.
- **settings.py** - settings template for the code archive.
- **dataset_minimal_Jan_2018.zip** - the minimally sufficient version of the dataset. This dataset only includes stats aggregated by the ecosystem (PyPI).
- **dataset_full_Jan_2018.tgz** - full version of the dataset, including project-level statistics. It is ~34 GB unpacked. This dataset still doesn't include PyPI packages themselves, which take around 2 TB.
- **build_model.r, helpers.r** - R files to process the survival data (`survival_data.csv` in **dataset_minimal_Jan_2018.zip**, `common.cache/survival_data.pypi_2008_2017-12_6.csv` in **dataset_full_Jan_2018.tgz**).
- **Interview protocol.pdf** - approximate protocol used for semistructured interviews.
- **LICENSE** - text of GPL v3, under which this dataset is published.
- **INSTALL.md** - replication guide (~2 pages).
Replication guide
=================

Step 0 - prerequisites
----------------------

- Unix-compatible OS (Linux or OS X)
- Python interpreter (2.7 was used; Python 3 compatibility is highly likely)
- R 3.4 or higher (3.4.4 was used; 3.2 is known to be incompatible)

Depending on the level of detail (see Step 2 for more details):

- up to 2 TB of disk space
- at least 16 GB of RAM (64 GB preferable)
- a few hours to a few months of processing time

Step 1 - software
-----------------

- unpack **ghd-0.1.0.zip**, or clone from GitLab:

      git clone https://gitlab.com/user2589/ghd.git
      git checkout 0.1.0

  `cd` into the extracted folder. All commands below assume it as the current directory.
- copy `settings.py` into the extracted folder. Edit the file:
  * set `DATASET_PATH` to some newly created folder path
  * add at least one GitHub API token to `SCRAPER_GITHUB_API_TOKENS`
- install Docker. For Ubuntu Linux, the command is `sudo apt-get install docker-compose`.
- install libarchive and headers: `sudo apt-get install libarchive-dev`.
- (optional) to replicate on NPM, install yajl: `sudo apt-get install yajl-tools`. Without this dependency, you might get an error on the next step, but it's safe to ignore.
- install Python libraries: `pip install --user -r requirements.txt`.
- disable all APIs except GitHub (Bitbucket and GitLab support were not yet implemented when this study was in progress): edit `scraper/__init__.py` and comment out everything except GitHub support in `PROVIDERS`.

Step 2 - obtaining the dataset
------------------------------

The ultimate goal of this step is to get the output of the Python function `common.utils.survival_data()` and save it into a CSV file:

    # copy and paste into a Python console
    from common import utils
    survival_data = utils.survival_data('pypi', '2008', smoothing=6)
    survival_data.to_csv('survival_data.csv')

Since full replication would take several months, here are some ways to speed up the process:

#### Option 2.a, difficulty level: easiest

Just use the precomputed data. Step 1 is not necessary under this scenario.

- extract **dataset_minimal_Jan_2018.zip**
- get `survival_data.csv` and go to the next step

#### Option 2.b, difficulty level: easy

Use precomputed longitudinal feature values to build the final table. The whole process will take 15-30 minutes.

- create a folder `
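(The archived guide is truncated here.) As an orientation for the R side of the replication, a minimal sketch of inspecting the precomputed `survival_data.csv` from Option 2.a; the column names in the commented model are hypothetical, and `build_model.r` in the archive is the authoritative analysis:

    # Sketch only: load and inspect the precomputed survival data.
    library(survival)

    surv_df <- read.csv("survival_data.csv")
    str(surv_df)  # check the actual column names first

    # Illustrative Cox model; `time`, `dead`, and `contributors` are
    # hypothetical column names, not the real schema.
    # fit <- coxph(Surv(time, dead) ~ contributors, data = surv_df)
    # summary(fit)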
Subscribers can retrieve export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This data set contains the ShinyFMBN app and the FoodMicrobionet database, which contains metataxonomic data for bacterial communities of foods and food environments. Learn more at https://www.sciencedirect.com/science/article/pii/S0168160522001684. The ShinyFMBN app allows you to access FoodMicrobionet 4.2, a repository of data on food microbiome studies. To run the app you need to install R and RStudio. Data are available in both R (.rds) and .xlsx formats (see below).
This compressed folder contains:

a. folder R_lists: two .rds files containing all data in FoodMicrobionet 4.1.2. FMBN.rds is in a format usable with ShinyFMBN 2.4 (see below), while FMBN_plus.rds contains all tables and fields and is best accessed using custom R scripts (see https://github.com/ep142/ for examples).

b. folder xlsx_files: all FoodMicrobionet tables in MS Excel format. These files may be useful because a given system's locale may affect how fields containing accented letters are handled during the import of text files.

c. folder shiny_FMBN_2_4_3: the app folder, the runShinyFMBN_2_4_3.R script (an R script to install all needed packages and run the app), and the app manual in .html format.

d. FMBNtablespecs_4_2.html: describes the table specifications.
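For example, the .rds files can be read directly in R; a minimal sketch, assuming the compressed folder has been extracted into the working directory:

    # readRDS() restores the R object (here, a list of tables) saved
    # in the .rds file.
    FMBN <- readRDS("R_lists/FMBN_plus.rds")
    names(FMBN)                # list the available tables
    str(FMBN, max.level = 1)   # one-line summary of each table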
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This data archive includes R code and data for reproducing the analyses and figures in Lafferty, "Metabarcoding is (usually) more cost effective than seining or qPCR for detecting tidewater gobies and other estuarine fishes."
To view the supplementary tables, open the Fig&TableSuppl.docx file. This file also includes the manuscript figures and tables and some explanatory text about how to generate them. To reproduce the figures, open Fig&TableCode.Rmd in RStudio and be sure the needed CSV files included in the Dryad repository are in the working directory. The data files include more information than used in the analyses and can be used for other purposes. The code is not software, nor is it intended as an R package, but it is annotated so others can understand and manipulate it. For each CSV file there is an associated metadata file that defines entries and columns and an information file that contains an abstract and ownership information. One of the data files required to reproduce the analyses (Schmelzle&Kinziger_occupancy.csv) was created from previously published data and was not produced by the author. Please cite it as: Schmelzle, Molly C., Kinziger, Andrew P. 2015. Data from: Using occupancy modeling to compare environmental DNA to traditional field methods for regional-scale monitoring of an endangered aquatic species. Dryad. 6rs23
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional supporting information includes data, an R script, and a QGIS file supporting the main text:
CSV (Data Set)
residual_abyssal_peridotites.csv: Compilations of residual abyssal peridotites (n = 1162) and depleted MORB-mantle (n = 1)
residual_abyssal_peridotites_coda_results.csv: Filtered data and results of PCA and k-means clustering (n = 267)
model_cpx.csv: Clinopyroxene compositions obtained by open-system melting model
test.csv: csv file for testing new data
R
abyssal_cpx_pca.Rproj: RStudio project file
coda.R: R script implemented in this study
test_your_data.R: R script to compare new data against abyssal and modeled clinopyroxenes
QGIS
residual_abyssal_peridotites.qgz: QGIS project using residual_abyssal_peridotites.csv and residual_abyssal_peridotites_coda_results.csv for Figure 1 and Figure S7
color_etopo1_ice_low_modified.tiff: ETOPO1 is a 1 arc-minute global relief model of Earth's surface that integrates land topography and ocean bathymetry from NOAA
We prepared an R script to compare your new clinopyroxene data with clinopyroxenes from abyssal peridotites. New data will be plotted using the principal components derived from the natural clinopyroxene database presented in this paper.
The procedure is as follows:
1. Add your clinopyroxene data (10 elements) and a label for each analysis in test.csv, replacing the entries below the second row. The label can be a sample name, lithology, locality, etc.
2. Open abyssal_cpx_pca.Rproj in RStudio (double click).
3. Open test_your_data.R (double click).
4. Run test_your_data.R: select all code with Cmd+A (Ctrl+A on Windows), then press Run or Cmd+Enter (Ctrl+Enter).
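For readers who want to see conceptually what the script does, here is a minimal sketch of projecting new analyses onto reference principal components in R. The object names are illustrative, and test_your_data.R remains the authoritative implementation (in particular, the actual script works in a compositional-data framework):

    # Illustrative only: `abyssal` and `new_data` are assumed to be data
    # frames of the 10 elements with matching column names.
    pca <- prcomp(abyssal, center = TRUE, scale. = TRUE)

    # Project the new analyses onto the reference principal components.
    scores_new <- predict(pca, newdata = new_data)

    # Plot the reference cloud and overlay the new analyses.
    plot(pca$x[, 1:2], pch = 16, col = "grey70", xlab = "PC1", ylab = "PC2")
    points(scores_new[, 1:2], pch = 17, col = "red")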
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides the artificial neural network architecture for a dual-fluid photovoltaic thermal (PV/T) collector which was experimentally tested in the outdoor environment of Malaysia. The system was set up and tested in three modes: (i) air mode, (ii) water mode, and (iii) simultaneous mode. In mode (i) air flows through the cooling channels, in mode (ii) water flows through the cooling channels, and in mode (iii) both air and water flow together.
To create this dataset, the following steps were carried out:
Step 1: Import the data
Step 2: Normalize the data
Step 3: Split the dataset into training and testing data
Step 4: Create the NN model in RStudio
The 'neuralnet' package in the R programming language was used. The code, written in RStudio, is provided in the attached file.
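As a minimal sketch of the four steps with the neuralnet package (the file, response, and predictor names below are illustrative, not the attached code):

    # Assumes the neuralnet package is installed.
    library(neuralnet)

    # Step 1: import the data (hypothetical file name)
    pvt <- read.csv("pvt_data.csv")

    # Step 2: min-max normalization of every column to [0, 1]
    norm <- as.data.frame(lapply(pvt, function(v) (v - min(v)) / (max(v) - min(v))))

    # Step 3: split into training and testing sets
    set.seed(42)
    idx   <- sample(nrow(norm), 0.7 * nrow(norm))
    train <- norm[idx, ]
    test  <- norm[-idx, ]

    # Step 4: fit the network (hypothetical response and predictors)
    nn <- neuralnet(outlet_temp ~ irradiance + ambient_temp + flow_rate,
                    data = train, hidden = c(5, 3), linear.output = TRUE)
    pred <- compute(nn, test[, c("irradiance", "ambient_temp", "flow_rate")])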
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research data supporting the publication:
Higgins SG, Nogiwa-Valdez AA, Stevens MM, Considerations for Implementing Electronic Laboratory Notebooks in an Academic Research Environment, Nature Protocols, 2021.
This repository contains the raw survey data of 172 current and historic electronic laboratory notebook (ELN) software packages.
Main files:
"ELN_Review_Higgins_2021_Survey.csv" = raw survey data in 'tidy' data format
"ELN_Review_Higgins_2021.Rmd" = an R Markdown File (R Notebook) that takes the survey data as input and produces summary statistics and plots. This file was written using R Studio as the IDE.
Derived files, generated from those above:
"ELN_Review_Higgins_2021.nb.html" = a self-contained HTML file that is automatically generated by R Studio, based on the markdown file. This can be opened in any web browser to allow manual inspection of the code and comments without the need for specialist software. Embedded within this file is also the original markdown script (i.e. a copy of the code in "ELN_Review_Higgins_2021.Rmd")
"ELN_Review_Higgins_2021_Lifetimes_Interactive_Figure1.html" = an HTML file generated by the script above via the plotly package. It contains an interactive version of the ELN survey data, allowing the user to hover over the timeline and explore the data.
"ELN_Review_Higgins_2021_Timeline.pdf" = static version of ELN timeline, used to generate figure in main manuscript.
"ELN_Review_Higgins_2021_Releases-Per-Year.pdf" = static version of number of new ELNs per year, used to generate figure in main manuscript.
This survey was generated from a mixture of primary and secondary sources (see references for secondary sources).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Column A: Binary classification based on the laboratory values in Column B (cut-off value = 0: values of 0 coded as 0, values > 0 coded as 1); Column B: Laboratory values; Column C: Randomized patient numbers; Columns E–H: Wavelengths with corresponding spectral data. (XLSX)
This dataset contains original quantitative datafiles, analysis data, a codebook, R scripts, syntax for replication, the original output from RStudio, and figures from a statistical program. The analyses can be found in Chapter 5 of my PhD dissertation, i.e., 'Political Factors Affecting the EU Legislative Decision-Making Speed'. The data supporting the findings of this study are accessible and replicable. Restrictions apply to the availability of these data, which were used under license for this study. The datafiles include:

File name of R script: Chapter 5 script.R
File name of syntax: Syntax for replication 5.0.docx
File name of the original output from RStudio: The original output 5.0.pdf
File name of codebook: Codebook 5.0.txt
File name of the analysis data: data5.0.xlsx
File name of the dataset: Original quantitative data for Chapter 5.xlsx
File name of the dataset: Codebook of policy responsiveness.pdf
File name of figures: Chapter 5 Figures.zip

Data analysis software: RStudio, with R version 4.1.0 (2021-05-18) -- "Camp Pontanezen", Copyright (C) 2021 The R Foundation for Statistical Computing, Platform: x86_64-apple-darwin17.0 (64-bit)
Vision and Change in Undergraduate Biology Education encouraged faculty to focus on core concepts and competencies in the undergraduate curriculum. We created a sophomore-level course, Biologists' Toolkit, to focus on the competencies of quantitative reasoning and scientific communication. We introduce students to the statistical analysis of data using the open-source statistical language and environment R, with RStudio, in the first two-thirds of the course. During this time the students learn to write basic commands to input data and conduct common statistical analyses. The students also learn to represent their data graphically using R. In a final project, we assign students unique data sets that require them to develop a hypothesis that can be explored with the data, analyze and graph the data, search literature related to their data set, and write a report that emulates a scientific paper. The final report includes publication-quality graphs and proper reporting of data and statistical results. At the end of the course, students reported greater confidence in their ability to read and make graphs, analyze data, and develop hypotheses. Although programming in R has a steep learning curve, we found that students who learned programming in R developed a robust strategy for data analysis, and they retained and successfully applied those skills in other courses during their junior and senior years.
This dataset contains original quantitative datafiles, analysis data, a codebook, R scripts, syntax for replication, the original output from RStudio, and figures from a statistical program. The analyses can be found in Chapter 2 of my PhD dissertation, i.e., 'Political Factors Affecting the EU Legislative Decision-Making Speed'. The data supporting the findings of this study are accessible and replicable. Restrictions apply to the availability of these data, which were used under license for this study. The datafiles include:

File name of R script: Chapter 2 script.R
File name of syntax: Syntax for replication 2.0.docx
File name of the original output from RStudio: The original output 2.0.pdf
File name of codebook: Codebook 2.0.txt
File name of the analysis data: data2.1.xlsx
File name of the dataset: Original quantitative data for Chapter 2.xlsx
File name of figures: Chapter 2 Figures.zip
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COFI database includes power-generation projects in Belt and Road Initiative (BRI) countries financed by Chinese corporations and banks that reached financial closure from 2000 to 2020. Types of financing include debt and equity investment, with the latter including greenfield foreign direct investments (FDI) and cross-border mergers and acquisitions (M&As). COFI is consolidated from nine source databases using both an automated join method in RStudio and manual joining by analysts. The database includes power plant characteristics data and investment detail data. It captures 430 power plants in 76 BRI countries, covering 220 equity investment transactions and 253 debt investment transactions made by Chinese investors. Key data points for financial transactions in COFI include the financial instrument (equity or debt), investor name, amount, and financial close year. Key technical characteristics tracked for projects in COFI include name, installed capacity, commissioning year, country, and primary fuel type. This project is a collaboration among the Boston University Global Development Policy Center, the Inter-American Dialogue, the China-Africa Research Initiative at Johns Hopkins University (CARI), and the World Resources Institute (WRI). The detailed methodology is given in the World Resources Institute publication "China Overseas Finance Inventory".
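As an illustration of the kind of automated join described, a minimal R sketch; the table and key names here are hypothetical, not the actual COFI schema:

    # Assumes the dplyr package is installed.
    library(dplyr)

    plants  <- read.csv("power_plant_characteristics.csv")  # hypothetical
    finance <- read.csv("investment_details.csv")           # hypothetical

    # Join investment details onto plant characteristics by a shared key,
    # then drop exact duplicates before manual review.
    cofi <- plants %>%
      left_join(finance, by = "project_id") %>%
      distinct()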
The data set was collected in Uppsala, Sweden between 2019 and 2021. Hives were established using varroa-resistant queens from Oslo, Norway (n = 3), Gotland, Sweden (n = 5), and Avignon, France (n = 4), with a varroa-susceptible population from Uppsala, Sweden (n = 5) as control. All hives were located at the SLU Lövsta research station (GPS coordinates: 59° 50' 2.544"N, 17° 48' 47.447"E). Varroa destructor mite reproductive success was measured on frames with adult honeybee workers exposed to, and excluded from, access to honeybee larvae. Excluders were added directly after brood capping, and frames were dissected nine days later. Cell caps were removed using a scalpel, with the pupae and mite families carefully removed from the cell using forceps and a fine paintbrush. Mite reproductive success was calculated by counting successful reproduction attempts, defined as a mite that produced one male and at least one female offspring. If a mite did not meet this requirement, it was considered a failed reproduction attempt and the reason for failure was documented. All data were analyzed in R version 4.0.1 using RStudio 1.3.959. A linear mixed-effect model was used with mite reproductive success as the response variable, population origin and excluder treatment as independent variables, and colony and year as random-effect variables, to compare treatments within each population as well as fecundity. Least-square means of the model were used to compare treatments between individual populations.
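A minimal sketch of such a model in R; the text does not name the packages, so lme4 and emmeans are assumed here, and the data frame and column names are illustrative:

    # Assumes lme4 and emmeans are installed; `mites` is a hypothetical
    # data frame with one row per observation.
    library(lme4)
    library(emmeans)

    m <- lmer(reproductive_success ~ population * treatment +
                (1 | colony) + (1 | year), data = mites)

    # Least-square means: compare excluder treatments within each population.
    emmeans(m, pairwise ~ treatment | population)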
Scaramella_et_al_2023_Data.tsv - Data set consisting of 34 rows and 21 columns. Colony demographics and designated treatment are listed. All data collected are count data and are explained in more detail in the read-me file. The R script used in the analysis is attached. It is split into two sections: the first is used for statistical analysis and the second for creating the plots used in the paper. The sections are marked by the titles SECTION 1 - ANALYSIS and SECTION 2 - PLOTS.
The output Scaramella_et_al_2023_Analysis_Code_log.txt and the plot file Rplots.pdf can be reproduced, provided that the script is in the same directory as the data files and the needed R packages are installed (see sessionInfo.txt), by running:

    Rscript Scaramella_et_al_2023_Analysis_Code.R > Scaramella_et_al_2023_Analysis_Code_log.txt
Scaramella_et_al_2023_Bar_Graph_Data.tsv - Data set consisting of 8 rows and 5 columns. Colony demographics and designated treatment are listed. All data are generated from the count data in Scaramella_et_al_2023_Data.tsv and are explained in more detail in the read-me file.
Scaramella_et_al_2023_Stacked_Bar_Graph_Data.tsv - Data set consisting of 102 rows and 8 columns. Colony demographics and designated treatment are listed. The data are Scaramella_et_al_2023_Data.tsv restructured to include the reason for failure as a column, and are explained in more detail in the read-me file.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This spreadsheet contains data from studies on e-books and English language learning. The data come from reliable sources indexed in the Scopus database and include details such as sample sizes, means, and standard deviations for both control and experimental groups. We also include the R code we ran in RStudio for the analysis.
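The description does not name the package, but a standardized-mean-difference meta-analysis of this kind is commonly run with metafor; a minimal sketch with illustrative file and column names:

    # Assumes the metafor package is installed; `ebook_studies.csv` is a
    # hypothetical export of the spreadsheet.
    library(metafor)

    d <- read.csv("ebook_studies.csv")

    # Standardized mean differences (Hedges' g) per study.
    es <- escalc(measure = "SMD",
                 m1i = mean_exp, sd1i = sd_exp, n1i = n_exp,
                 m2i = mean_ctl, sd2i = sd_ctl, n2i = n_ctl,
                 data = d)

    res <- rma(yi, vi, data = es)  # random-effects pooled estimate
    summary(res)
    forest(res)                    # forest plot of study effects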
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data files (.csv) used in the study of fruiting phenology patterns in Nyungwe National Park, Rwanda from 1996-2019. Datasets include climate variables (rain, irradiance, minimum and maximum temperatures, and ENSO index), fruiting phenology data, and GIS locations of study sites. Data are organized for use in statistical analyses using the R computational language.

Instructions for use in an R Project:

We strongly suggest creating an R Project file in RStudio to use the scripts and data contained in this repository. Data files should be stored in a folder named "data" in the same directory as the R Project file; this ensures that the R scripts that load the data access the correct directory. Script files should be stored in another folder in the same directory as the R Project file (suggested folder name: "scripts").
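With that layout, scripts can load data through paths relative to the project root; a minimal sketch (the file names here are illustrative):

    # Paths are resolved relative to the .Rproj file when the project is
    # open in RStudio.
    list.files("data")  # check which data files are present
    climate   <- read.csv("data/climate_variables.csv")    # hypothetical name
    phenology <- read.csv("data/fruiting_phenology.csv")   # hypothetical name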
Community engagement in planning is essential for effective and just climate adaptation. However, historically underserved communities are often difficult to reach through traditional means of soliciting public input. The Climate Adaptation Solutions Accelerator (CASA) through School-Community Hubs project identifies public schools as promising sites for building both community engagement and community capacity for climate adaptation. To serve in this role, schools need information about the intersecting threats climate change poses to the communities they serve. The Climate Hazard Dashboard for California Schools is a platform that maps the current and future risks associated with five climate hazards (wildfire, extreme heat days, extreme precipitation, flooding, and sea level rise) for the nearly 10,000 public schools serving Kindergarten through Grade 12 students in California. Each hazard is mapped and visualized at the school level, providing an accessible way fo...

Data for extreme heat and extreme precipitation were retrieved using API requests from the caladaptr package. The data retrieved to calculate extreme heat days were historical observed daily maximum temperature for 1961-2005 and projected daily maximum temperature for 2006-2064. The data retrieved to calculate extreme precipitation days were historical observed daily precipitation totals for 1961-2005 and projected daily precipitation totals for 2006-2064. Data for wildfire, flooding, and sea level rise were downloaded directly from their sources and stored on a remote server for use. All data were processed in RStudio using Quarto docs. Tabular data for extreme heat and precipitation first used the retrieved historical data to calculate a threshold value to classify an extreme event. The threshold was determined to be the 98th percentile value of observed historical data for California. For extreme heat, this is 98°F. For extreme precipitation, this is 0.73 inches. Then, projected dai...
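As a minimal sketch of that threshold logic (the object and column names are illustrative, not the project's actual code):

    # Classify a projected day as extreme when it exceeds the 98th
    # percentile of the observed historical record.
    threshold <- quantile(observed$tmax_f, probs = 0.98)  # ~98 degrees F for heat

    projected$extreme <- projected$tmax_f > threshold
    extreme_days_per_year <- tapply(projected$extreme, projected$year, sum)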
This README.txt file was generated on 2024-05-23 by Liane Chen, Charlie Curtin, Kristina Glass, and Hazel Vaquero. It is associated with the data archived for this project through Dryad. To view the data archive and download the datasets, please visit https://doi.org/10.5061/dryad.1jwstqk3g.
Recommended citation:
Curtin, Charles; Glass, Kristina; Chen, Liane; Vaquero, Hazel (Forthcoming 2024). Climate Hazards Data Integration and Visualization for the Climate Adaptations Solutions Accelerator through School-Community Hubs [Dataset]. Dryad. https://doi.org/10.5061/dryad.1jwstqk3g
GENERAL INFORMATION
1. Title of the Project: Climate Adaptation Solutions Accelerator through School Community Hubs (alias CASAschools)
2. Author Information
A. Principal Investigator Contact Information
Name: Liane Chen, Charlie Curtin, Kristina Glass, and Hazel Vaque...
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This data set contains Crown of Thorns Starfish (Acanthaster planci and Acanthaster cf. solaris) behavioural data collected at Lankanfushi Island in the Maldives and at Rib Reef on the Great Barrier Reef, Australia. The data is deposited here to accompany the Open Access publication from the Related Publications link below. Here, we include information on all individual starfish counted during surveys at different times of day (including at night) at both locations. Information provided includes the location, date, time, and depth at which each individual was found, as well as the maximum diameter of each individual and the behaviour it was exhibiting: whether the starfish was hidden or exposed, whether it was resting, moving, or feeding, and, for feeding individuals, their prey items. Also included are point-intercept coral cover data for each transect at each location, as well as the R script used to analyse the data in the aforementioned publication.
The dataset consists of the following files:
The full methodology will be available in the Open Access publication from the Related Publications link below.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This data release contains: (1) ASCII grids of predicted probability of elevated arsenic in groundwater for the Northwest and Central Minnesota regions, (2) input arsenic and predictive variable data used in model development and calculation of predictions, and (3) ASCII files used to predict the probability of elevated arsenic across the two study regions. The probability of elevated arsenic was predicted using Boosted Regression Tree (BRT) modeling methods with the gbm package in R version 3.4.2. The response variable was the presence or absence of arsenic >10 µg/L, the U.S. Environmental Protection Agency's maximum contaminant level for arsenic, in 3,283 wells located throughout both study regions (1,363 in the Northwest region and 1,920 in the Central region). The original database used to develop the BRT model consisted of 127 predictor variables, which included well characteristics, land use, soil properties, aquifer properties, depth to water table, and predicted nitrate ...
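A minimal sketch of a BRT model of this kind with the gbm package; the data frame and column names are illustrative, and the tuning values are placeholders, not the study's settings:

    # Assumes the gbm package is installed; `wells` is a hypothetical data
    # frame with a 0/1 response for arsenic > 10 ug/L plus predictors.
    library(gbm)

    brt <- gbm(arsenic_gt10 ~ ., data = wells,
               distribution = "bernoulli",
               n.trees = 5000, interaction.depth = 5,
               shrinkage = 0.01, bag.fraction = 0.5, cv.folds = 10)

    best <- gbm.perf(brt, method = "cv")  # optimal number of trees by CV
    p <- predict(brt, wells, n.trees = best, type = "response")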