Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is an open-source software and object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provides a wide range of functions for programming and analyzing of data. Unlike many of the existing statistical softwares, R has the added benefit of allowing the users to write more efficient codes by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allows the users to define their own (customized) functions on how they expect the program to behave while handling the data, which can also be stored in the simple object system.For all intents and purposes, this book serves as both textbook and manual for R statistics particularly in academic research, data analytics, and computer programming targeted to help inform and guide the work of the R users or statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R particularly for research purposes with examples. Ranging from how to import and store datasets in R as Objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. Thus, congruence of Statistics and Computer programming for Research.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
I have done Data Analysis Project on Movies Dataset using codes in Rstudio. I have used various data cleaning functions and used multiple data visualization codes to demonstrate various key findings on this projects. Dataset is related to a category #Entertainment.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The scripts in this folder weer used to combine all call statistic files per day into one file, resulting in nine files containing all call statistics per data. The script ‘merging_dataset.R’ was used to combine all days worth of call statistics and create subsets of two frequency ranges (18-32 and 32-96). The script ‘camera_data’ was used to combine all camera and observation data.
Facebook
TwitterThis dataset was created by Dishani Mishra
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Learn how to create professional data visualizations using R and ggplot2. A step-by-step guide for startup founders and analysts to build publication-quality charts.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
In a world of increasing crime, many organizations are interested in examining incident details to learn from and prevent future crime. Our client, based in Los Angeles County, was interested in this exact thing. They asked us to examine the data to answer several questions; among them, what was the rate of increase or decrease in crime from 2020 to 2023, and which ethnicity or group of people were targeted the most.
Our data was collected from Kaggle.com at the following link:
https://www.kaggle.com/datasets/nathaniellybrand/los-angeles-crime-dataset-2020-present
It was cleaned, examined for further errors, and the analysis performed using RStudio. The results of this analysis are in the attached PDF entitled: "crime_data_analysis_report." Please feel free to review the results as well as follow along with the dataset on your own machine.
Facebook
TwitterABSTRACT Meta-analysis is an adequate statistical technique to combine results from different studies, and its use has been growing in the medical field. Thus, not only knowing how to interpret meta-analysis, but also knowing how to perform one, is fundamental today. Therefore, the objective of this article is to present the basic concepts and serve as a guide for conducting a meta-analysis using R and RStudio software. For this, the reader has access to the basic commands in the R and RStudio software, necessary for conducting a meta-analysis. The advantage of R is that it is a free software. For a better understanding of the commands, two examples were presented in a practical way, in addition to revising some basic concepts of this statistical technique. It is assumed that the data necessary for the meta-analysis has already been collected, that is, the description of methodologies for systematic review is not a discussed subject. Finally, it is worth remembering that there are many other techniques used in meta-analyses that were not addressed in this work. However, with the two examples used, the article already enables the reader to proceed with good and robust meta-analyses. Level of Evidence V, Expert Opinion.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
• Automated parametric analysis workflow built using R Studio.
• Demonstrates core statistical analysis methods on numerical datasets.
• Includes step-by-step R scripts for performing t-tests, ANOVA, and summary statistics.
• Provides visual outputs such as boxplots and distribution plots for better interpretation.
• Designed for students, researchers, and data analysts learning statistical automation in R.
• Useful for understanding reproducible research workflows in data analysis.
• Dataset helps in teaching how to automate statistical pipelines using R programming.
Facebook
TwitterAttribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This is a repository for codes and datasets for the open-access paper in Linguistik Indonesia, the flagship journal for the Linguistic Society of Indonesia (Masyarakat Linguistik Indonesia [MLI]) (cf. the link in the references below).
Rajeg, G. P. W., Denistia, K., & Rajeg, I. M. (2018). Working with a linguistic corpus using R: An introductory note with Indonesian negating construction. Linguistik Indonesia, 36(1), 1–36. doi: 10.26499/li.v36i1.71
Cite (dark-pink button on the top-left) and select the citation style through the dropdown button (default style is Datacite option (right-hand side)Rmd file) used to write the paper and containing the R codes to generate the analyses in the paper.rds format so that all code-chunks in the R Markdown file can be run.csl files for the referencing and bibliography (with APA 6th style). Rproj). Double click on this file to open an RStudio session associated with the content of this repository. See here and here for details on Project-based workflow in RStudio.docx template file following the basic stylesheet for Linguistik Indonesiabookdown R package (Xie, 2018). Make sure this package is installed in R.
Facebook
TwitterStudents learn about the importance of good data management and begin to explore QGIS and RStudio for spatial analysis purposes. Students will explore National Land Cover Database raster data and made-up vector point data on both platforms.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study investigated predictive processing during both silent and oral reading, revealing a more pronounced predictability effect in the context of oral reading.
Facebook
TwitterLoad and view a real-world dataset in RStudio
• Calculate “Measure of Frequency” metrics
• Calculate “Measure of Central Tendency” metrics
• Calculate “Measure of Dispersion” metrics
• Use R’s in-built functions for additional data quality metrics
• Create a custom R function to calculate descriptive statistics on any given dataset
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Self Organizing Map Syntax and Data
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Project Name: Divvy Bikeshare Trip Data_Year2020 Date Range: April 2020 to December 2020. Analyst: Ajith Software: R Program, Microsoft Excel IDE: RStudio
The following are the basic system requirements, necessary for the project: Processor: Intel i3 or AMD Ryzen 3 and higher Internal RAM: 8 GB or higher Operating System: Windows 7 or above, MacOS
**Data Usage License: https://ride.divvybikes.com/data-license-agreement ** Introduction:
In this case, study we aim to utilize different data analysis techniques and tools, to understand the rental patterns of the divvy bike sharing company and understand the key business improvement suggestions. This case study is a mandatory project to be submitted to achieve the Google Data Analytics Certification. The data utilized in this case study was licensed based on the provided data usage license. The trips between April 2020 to December 2020 are used to analyse the data.
Scenario: Marketing team needs to design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ.
Objective: The main objective of this case study, is to understand the customer usage patterns and the breakdown of customers, based on their subscription status and the average durations of the rental bike usage.
Introduction to Data: The Data provided for this project, is adhered to the data usage license, laid down by the source company. The source data was provided in the CSV files and are month and quarter breakdowns. A total of 13 columns of data was provided in each csv file.
The following are the columns, which were initially observed across the datasets.
Ride_id Ride_type Start_station_name Start_station_id End_station_name End_station_id Usertype Start_time End_time Start_lat Start_lng End_lat End_lng
Documentation, Cleaning and Preparing Data for Analysis: The total size of the datasets, for the year 2020, is approximately 450 MB, which is tiring job, when you have to upload them to the SQL database and visualize using the BI tools. I wanted to improve my skills into R environment and this is the best opportunity and optimal to use R for the data analysis.
For more insights, installation procedures for R and RStudio, please refer to the following URL, for additional information.
R Projects Document: https://www.r-project.org/other-docs.html RStudio Download: https://www.rstudio.com/products/rstudio/ Installation Guide: https://www.youtube.com/watch?v=TFGYlKvQEQ4
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file compiles the different datasets used and analysis made in the paper "Visual Continuous Time Preferences". Both RStudio and Stata were used for the analysis. The first was used for descriptive statistics and graphs, the second for regressions. We join the datasets for both analysis.
"Analysis VCTP - RStudio.R" is the RStudio analysis. "Analysis VCTP - Stata.do" is the Stata analysis.
The RStudio datasets are: "data_Seville.xlsx" is the dataset of observations. "FormularioEng.xlsx" is the dataset of control variables.
The Stata datasets are: "data_Seville_Stata.dta" is the dataset of observations. "FormularioEng.dta" is the dataset of control variables
Additionally, the experimental instructions of the six experimental conditions are also available: "Hypothetical MPL-VCTP.pdf" is the instructions and task for hypothetical payment and MPL answered before VCTP. "Hypothetical VCTP-MPL.pdf" is the instructions and task for hypothetical payment and VCTP answered before MPL. "OneTenth MPL-VCTP.pdf" is the instructions and task for BRIS payment and MPL answered before VCTP. "OneTenth VCTP-MPL.pdf" is the instructions and task for BRIS payment and VCTP answered before MPL. "Real MPL-VCTP.pdf" is the instructions and task for real payment and VCTP answered before MPL. "Real VCTP-MPL.pdf" is the instructions and task for real payment and VCTP answered before MPL.
Facebook
Twitterhttps://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
In this study, we analyzed the efficiency of different adeno-associated viral (AAV) injections in transfecting neurons in the salamander Pleurodeles waltl. To query sources of variation in AAV injection outcomes, we analyzed metadata and outcomes for an additional 68 intraparenchymal injections performed 39 in post-metamorphic Pleurodeles salamanders. This dataset included both quantitative variables (age, weight, viral genomes (v.g.) injected, and a cell transduction score ranging from 0 to 4, see STAR Methods) and categorical variables (here referred to as qualitative variables: serotype, promoter, reporter, single vs. dual injection, manufacturer, and injection site). To assess the associations between these variables, we performed a Factor Analysis for Mixed Data (FAMD), a principal component method that is designed to determine significant sources of variability within datasets that contain both quantitative and qualitative data types. This dataset contains the injection outcomes and metadata for AAV injections administered in the salamander Pleurodeles waltl. The RactoMineR package (https://CRAN.R-project.org/package=FactoMineR) was used to determine significant contributions of a number of variables contributing to injection outcomes. Analysis of these data revealed that, among other factors, age co-varies with injection score. Therefore, we conclude that increased animal age decreases the efficacy of this tool.
Facebook
TwitterThis dataset contains original quantitative datafiles, analysis data, a codebook, R scripts, syntax for replication, the original output from R studio and figures from a statistical program. The analyses can be found in Chapter 5 of my PhD dissertation, i.e., ‘Political Factors Affecting the EU Legislative Decision-Making Speed’. The data supporting the findings of this study are accessible and replicable. Restrictions apply to the availability of these data, which were used under license for this study. The datafiles include: File name of R script: Chapter 5 script.R File name of syntax: Syntax for replication 5.0.docx File name of the original output from R studio: The original output 5.0.pdf File name of code book: Codebook 5.0.txt File name of the analysis data: data5.0.xlsx File name of the dataset: Original quantitative data for Chapter 5.xlsx File name of the dataset: Codebook of policy responsiveness.pdf File name of figures: Chapter 5 Figures.zip Data analysis software: R studio R version 4.1.0 (2021-05-18) -- "Camp Pontanezen" Copyright (C) 2021 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin17.0 (64-bit)
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
R/R-Studio software code for waste survey data analysis of Jos Plateau state, Nigeria. This details the software code used in R-Studio (RStudio 2023.06.2+561 "Mountain Hydrangea" ) and R (R version 4.3.1) for analyzing the study on waste characterization survey/waste audit in Jos, Plateau state, Nigeria. The code is provided in .R, .docx, and .pdf., it is accessible directly using RStudio for .docx and .pdf. It includes codes for generating plots used in paper publication. Note: that you require the dataset used in the study which is provided in steps to reproduce.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The zip file contains the data files and R analysis script used in the manuscript titled 'Attentional bias modification in virtual reality - a VR-based dot-probe task with 2D and 3D stimuli'Analysis_script.R is a script file that can be opened by the statistical software R (https://www.r-project.org/) and Rstudio (https://www.rstudio.com/). All analysis steps and codes are found within this file.All files under the Data_files folder are directly called by Analysis_script from R, therefore please ensure that the folder structure and file names remain the same.Folder dot_probe_raw_data_files and its subfolders contain *.xml files with attentional bias (reaction time) data from the participants, generated by the VR program.outcome_measures_and_demographic_data.xlsx contains participant demographic data and questionnaire measures, generated by the iTerapi platform. This data file has been cleaned to remove information irrelevant to the analysis (e.g. number of reminder emails sent etc.).lsas_pre_individual_items.xlsx contains participant responses to individual items of the LSAS-SR questionnaire, generated by the iTerapi platform.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here you find an example research data dataset for the automotive demonstrator within the "AEGIS - Advanced Big Data Value Chain for Public Safety and Personal Security" big data project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 732189. The time series data has been collected during trips conducted by three drivers driving the same vehicle in Austria.
The dataset contains 20Hz sampled CAN bus data from a passenger vehicle, e.g. WheelSpeed FL (speed of the front left wheel), SteerAngle (steering wheel angle), Role, Pitch, and accelerometer values per direction.
GPS data from the vehicle (see signals 'Latitude_Vehicle' and 'Longitude_Vehicle' in h5 group 'Math') and GPS data from the IMU device (see signals 'Latitude_IMU', 'Longitude_IMU' and 'Time_IMU' in h5 group 'Math') are included. However, as it had to be exported with single-precision, we lost some precision for those GPS values.
For data analysis we use R and R Studio (https://www.rstudio.com/) and the library h5.
e.g. check file with R code:
library(h5)
f <- h5file("file path/20181113_Driver1_Trip1.hdf")
summary(f["CAN/Yawrate1"][,])
summary(f["Math/Latitude_IMU"][,])
h5close(f)
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is an open-source software and object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provides a wide range of functions for programming and analyzing of data. Unlike many of the existing statistical softwares, R has the added benefit of allowing the users to write more efficient codes by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allows the users to define their own (customized) functions on how they expect the program to behave while handling the data, which can also be stored in the simple object system.For all intents and purposes, this book serves as both textbook and manual for R statistics particularly in academic research, data analytics, and computer programming targeted to help inform and guide the work of the R users or statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R particularly for research purposes with examples. Ranging from how to import and store datasets in R as Objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. Thus, congruence of Statistics and Computer programming for Research.