Political Analysis Using R: Example Code and Data, Plus Data for Practice...
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Cite
Monogan, Jamie (2023). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
Unique identifier
https://doi.org/10.7910/DVN/ARKOTI
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Monogan, Jamie
Description
Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.
Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...
frontiersin.figshare.com
docx
Updated Mar 22, 2024
Cite
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/feduc.2024.1379910.s001
Dataset updated
Mar 22, 2024
Dataset provided by
Frontiers
Authors
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.
Statistical Data Analysis using R
figshare.com
txt
Updated May 30, 2023
Cite
Statistical Data Analysis using R [Dataset]. https://figshare.com/articles/dataset/Statistical_Data_Analysis_using_R/5501035
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5501035.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
Samuel Barsanelli Costa
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The R scripts contain statistical data analysis for streamflow and sediment data, including flow duration curves, double mass analysis, nonlinear regression analysis for suspended sediment rating curves, and stationarity tests, and they produce several plots.
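As a rough illustration of one of the analyses named above, a flow duration curve can be sketched in a few lines of Python. This is not the dataset author's R code: the function name `flow_duration_curve`, the sample values, and the choice of the Weibull plotting position are all assumptions made for the example.

```python
def flow_duration_curve(flows):
    """Return (exceedance_percent, flow) pairs, highest flow first.

    Assumes the Weibull plotting position P = 100 * rank / (n + 1);
    other plotting positions are also in common use.
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


# Made-up daily streamflow values, for illustration only.
daily_flows = [12.0, 3.5, 8.1, 20.4, 5.0, 7.2, 15.3, 4.4, 9.9]
fdc = flow_duration_curve(daily_flows)

for pct, q in fdc:
    print(f"flow >= {q} is reached about {pct:.0f}% of the time")
```

Plotting the flow against the exceedance percentage, usually with a logarithmic flow axis, gives the familiar curve.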
Books series that contain Analyzing sensory data with R
workwithdata.com
Updated Mar 3, 2003
Cite
Work With Data (2003). Books series that contain Analyzing sensory data with R [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Analyzing+sensory+data+with+R&j=1&j0=books
Dataset updated
Mar 3, 2003
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book series, filtered to the book Analyzing sensory data with R. It features 10 columns, including authors, average publication date, book publishers, book series, and books. The preview is ordered by number of books (descending).
Data from: pmartR: Quality Control and Statistics for Mass...
acs.figshare.com
figshare.com
xlsx
Updated May 31, 2023
Cite
Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer (2023). pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data [Dataset]. http://doi.org/10.1021/acs.jproteome.8b00760.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jproteome.8b00760.s001
Dataset updated
May 31, 2023
Dataset provided by
ACS Publications
Authors
Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography–MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.
Data Set for "Analyzing Microbial Growth with R"
zenodo.org
csv
Updated Jan 24, 2020
Cite
Brian D. Connelly (2020). Data Set for "Analyzing Microbial Growth with R" [Dataset]. http://doi.org/10.5281/zenodo.1171129
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.1171129
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Brian D. Connelly
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sample data set used in "Analyzing Microbial Growth with R"
Data from: Sensory data analysis by example with R
workwithdata.com
Updated Jan 5, 2022
Cite
Work With Data (2022). Sensory data analysis by example with R [Dataset]. https://www.workwithdata.com/object/sensory-data-analysis-by-example-with-r-book-by-sebastien-le-0000
Dataset updated
Jan 5, 2022
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sensory data analysis by example with R is a book. It was written by Sébastien Lê and published by Chapman&Hall/CRC in 2014.
Hydroinformatics: Intro to Hydrologic Analysis in R (Bookdown and Code)
search.dataone.org
beta.hydroshare.org
Updated Dec 5, 2021
Cite
John P Gannon (2021). Hydroinformatics: Intro to Hydrologic Analysis in R (Bookdown and Code) [Dataset]. https://search.dataone.org/view/sha256%3A0a728bb4a6759737e777a3ad29355a61b252ad7c0a59b33dab345c789107a8c8
Dataset updated
Dec 5, 2021
Dataset provided by
Hydroshare
Authors
John P Gannon
Description
The linked bookdown contains the notes and most exercises for a course on data analysis techniques in hydrology using the programming language R. The material will be updated each time the course is taught. If new topics are added, the topics they replace will remain, in case they are useful to others.

I hope these materials can be a resource to those teaching themselves R for hydrologic analysis and/or for instructors who may want to use a lesson or two or the entire course. At the top of each chapter there is a link to a github repository. In each repository is the code that produces each chapter and a version where the code chunks within it are blank. These repositories are all template repositories, so you can easily copy them to your own github space by clicking Use This Template on the repo page.

In my class, I work through each document, live coding with students following along. Typically I ask students to watch as I code and explain the chunk, and then replicate it on their computer. Depending on the lesson, I will ask students to try some of the chunks before I show them the code as an in-class activity. Some chunks are explicitly designed for this purpose and are typically labeled a “challenge.”

Chapters called ACTIVITY are either homework or class-period-long in-class activities. The code chunks in these are therefore blank. If you would like a key for any of these, please just send me an email.

If you have questions, suggestions, or would like activity answer keys, etc. please email me at jpgannon at vt.edu

Finally, if you use this resource, please fill out the survey on the first page of the bookdown (https://forms.gle/6Zcntzvr1wZZUh6S7). This will help me get an idea of how people are using this resource, how I might improve it, and whether or not I should continue to update it.
Subsetting
paper.erudition.co.in
html
Updated Mar 17, 2025
Cite
Einetic (2025). Subsetting [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
Available download formats: html
Dataset updated
Mar 17, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions of chapter Subsetting of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024
Data for: A novel statistical model for analyzing data of a systematic...
data.mendeley.com
narcis.nl
Updated Oct 9, 2017
Cite
Antonius Schneider (2017). Data for: A novel statistical model for analyzing data of a systematic review generates optimal cut-off values for FENO measurement for asthma diagnosis [Dataset]. http://doi.org/10.17632/fndpn5bnps.1
Unique identifier
https://doi.org/10.17632/fndpn5bnps.1
Dataset updated
Oct 9, 2017
Authors
Antonius Schneider
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
Supplementary Information:

FENO_MultipleCO_basic.csv (Raw Study Data)
diagmeta.R (R functions for running the Multiple thresholds model)
example.R (R code for running the model with the data)
Scripts and data to run R-QWTREND models and produce results | gimi9.com
gimi9.com
Cite
Scripts and data to run R-QWTREND models and produce results | gimi9.com [Dataset]. https://www.gimi9.com/dataset/data-gov_scripts-and-data-to-run-r-qwtrend-models-and-produce-results/
Description
This child page contains a zipped folder with all items necessary to run the trend models and produce the results published in U.S. Geological Survey Scientific Investigations Report 2021–XXXX [Tatge, W.S., Nustad, R.A., and Galloway, J.M., 2021, Evaluation of Salinity and Nutrient Conditions in the Heart River Basin, North Dakota, 1970-2020: U.S. Geological Survey Scientific Investigations Report 2021-XXXX, XX p.].

To run the R-QWTREND program in R, six files are required, and each is included in this child page: prepQWdataV4.txt, runQWmodelV4XXUEP.txt, plotQWtrendV4XXUEP.txt, qwtrend2018v4.exe, salflibc.dll, and StartQWTrendV4.R (Vecchia and Nustad, 2020).

The folder contains: the six items required to run the R–QWTREND trend analysis tool; a readme.txt file; a flowtrendData.RData file; an allsiteinfo.table.csv file; a folder called "scripts"; and a folder called "waterqualitydata". The "scripts" folder contains the scripts that can be used to reproduce the results found in the USGS Scientific Investigations Report referenced above. The "waterqualitydata" folder contains machine-readable .csv files with the water-quality data used for the trend analysis at each site, named site_ions for major-ion constituents and site_nuts for nutrient constituents.

R–QWTREND is a software package for analyzing trends in stream-water quality. The package is a collection of functions written in R (R Development Core Team, 2019), an open source language and a general environment for statistical computing and graphics.

The following system requirements are necessary for using R–QWTREND:
• Windows 10 operating system
• R (version 3.4 or later; 64 bit recommended)
• RStudio (version 1.1.456 or later)

An accompanying report (Vecchia and Nustad, 2020) serves as the formal documentation for R–QWTREND.

Vecchia, A.V., and Nustad, R.A., 2020, Time-series model, statistical methods, and software documentation for R–QWTREND—An R package for analyzing trends in stream-water quality: U.S. Geological Survey Open-File Report 2020–1014, 51 p., https://doi.org/10.3133/ofr20201014
R Development Core Team, 2019, R—A language and environment for statistical computing: Vienna, Austria, R Foundation for Statistical Computing, accessed December 7, 2020, at https://www.r-project.org.
Data from: Optimized SMRT-UMI protocol produces highly accurate sequence...
data.niaid.nih.gov
zenodo.org
zip
Updated Dec 7, 2023
Cite
Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies [Dataset]. https://data.niaid.nih.gov/resources?id=dryad_w3r2280w0
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.w3r2280w0
Dataset updated
Dec 7, 2023
Dataset provided by
HIV Prevention Trials Network (http://www.hptn.org/)
National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
HIV Vaccine Trials Network (http://www.hvtn.org/)
PEPFAR
Authors
Dylan Westfall; Mullins James
License
https://spdx.org/licenses/CC0-1.0.html
Description
Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR, and the use of UMI allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.

Methods

This serves as an overview of the analysis performed on PacBio sequence data that is summarized in Analysis Flowchart.pdf and was used as primary data for the paper by Westfall et al., "Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies".

Five different PacBio sequencing datasets were used for this analysis: M027, M2199, M1567, M004, and M005.

For the datasets which were indexed (M027, M2199), CCS reads from PacBio sequencing files and the chunked_demux_config files were used as input for the chunked_demux pipeline. Each config file lists the different Index primers added during PCR to each sample. The pipeline produces one fastq file for each Index primer combination in the config. For example, in dataset M027 there were 3–4 samples using each Index combination. The fastq files from each demultiplexed read set were moved to the sUMI_dUMI_comparison pipeline fastq folder for further demultiplexing by sample and consensus generation with that pipeline. More information about the chunked_demux pipeline can be found in the README.md file on GitHub.

The demultiplexed read collections from the chunked_demux pipeline, or CCS read files from datasets which were not indexed (M1567, M004, M005), were each used as input for the sUMI_dUMI_comparison pipeline along with each dataset's config file. Each config file contains the primer sequences for each sample (including the sample ID block in the cDNA primer) and further demultiplexes the reads to prepare data tables summarizing all of the UMI sequences and counts for each family (tagged.tar.gz) as well as consensus sequences from each sUMI and rank 1 dUMI family (consensus.tar.gz). More information about the sUMI_dUMI_comparison pipeline can be found in the paper and the README.md file on GitHub.

The consensus.tar.gz and tagged.tar.gz files were moved from the sUMI_dUMI_comparison pipeline directory on the server to the Pipeline_Outputs folder in this analysis directory for each dataset and appended with the dataset name (e.g. consensus_M027.tar.gz). Also in this analysis directory is a Sample_Info_Table.csv containing information about how each of the samples was prepared, such as purification methods and number of PCRs. There are also three other folders: Sequence_Analysis, Indentifying_Recombinant_Reads, and Figures. Each has an .Rmd file with the same name inside, which is used to collect, summarize, and analyze the data. All of these collections of code were written and executed in RStudio to track notes and summarize results.

Sequence_Analysis.Rmd has instructions to decompress all of the consensus.tar.gz files, combine them, and create two fasta files, one with all sUMI and one with all dUMI sequences. Using these as input, two data tables were created that summarize all sequences and read counts for each sample that pass various criteria. These are used to help create Table 2 and as input for Indentifying_Recombinant_Reads.Rmd and Figures.Rmd. Next, two fasta files containing all of the rank 1 dUMI sequences and the matching sUMI sequences were created. These were used as input for the Python script compare_seqs.py, which identifies any matched sequences that are different between sUMI and dUMI read collections. This information was also used to help create Table 2. Finally, to populate the table with the number of sequences and bases in each sequence subset of interest, different sequence collections were saved and viewed in the Geneious program.

To investigate the cause of sequences where the sUMI and dUMI sequences do not match, tagged.tar.gz was decompressed, and for each family with discordant sUMI and dUMI sequences the reads from the UMI1_keeping directory were aligned using Geneious. Reads from dUMI families failing the 0.7 filter were also aligned in Geneious. The uncompressed tagged folder was then removed to save space. These read collections contain all of the reads in a UMI1 family and still include the UMI2 sequence. By examining the alignment, and specifically the UMI2 sequences, the site of the discordance and its cause were identified for each family as described in the paper. These alignments were saved as "Sequence Alignments.geneious". The counts of how many families were the result of PCR recombination were used in the body of the paper.

Using Identifying_Recombinant_Reads.Rmd, the dUMI_ranked.csv file from each sample was extracted from all of the tagged.tar.gz files, combined, and used as input to create a single dataset containing all UMI information from all samples. This file, dUMI_df.csv, was used as input for Figures.Rmd.

Figures.Rmd used dUMI_df.csv, sequence_counts.csv, and read_counts.csv as input to create draft figures and then individual datasets for each figure. These were copied into Prism software to create the final figures for the paper.
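To make the idea of a per-template consensus concrete, here is a minimal Python sketch that groups reads by UMI and takes a per-position majority vote. It is not the PORPIDpipeline or sUMI_dUMI_comparison code: the function name `consensus_by_umi`, the toy reads, and the assumption that the reads within a family are already aligned and of equal length are all introduced for illustration only.

```python
from collections import Counter, defaultdict


def consensus_by_umi(reads):
    """Build one consensus sequence per UMI family.

    reads: iterable of (umi, sequence) pairs. Sequences within a family
    are assumed to be pre-aligned and of equal length (a simplification).
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        # Majority base at each position across all reads in the family.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy data: one family of three reads (one carrying a simulated error)
# and one singleton family.
toy_reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACTTAC"),
    ("CCTA", "ACGAAC"),
]
consensus = consensus_by_umi(toy_reads)
print(consensus)
```

In the toy family, the single discordant base is outvoted by the two matching reads, which is the basic reason UMI families can remove point errors introduced during PCR and sequencing. The real pipeline applies additional filters described above, such as discarding error-derived UMIs and recombinant families.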
Subjects of An introduction to analysis of financial data with R
workwithdata.com
Updated Jul 1, 2024
Cite
Work With Data (2024). Subjects of An introduction to analysis of financial data with R [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=book&fop0=%3D&fval0=An+introduction+to+analysis+of+financial+data+with+R
Dataset updated
Jul 1, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book subjects, filtered to the book An introduction to analysis of financial data with R. It features 10 columns, including authors, average publication date, book publishers, book subject, and books. The preview is ordered by number of books (descending).
Longitudinal data analysis for the behavioral sciences using R
workwithdata.com
Updated Jan 10, 2022
Cite
Work With Data (2022). Longitudinal data analysis for the behavioral sciences using R [Dataset]. https://www.workwithdata.com/object/longitudinal-data-analysis-for-the-behavioral-sciences-using-r-book-by-jeffrey-d-long-1964
Dataset updated
Jan 10, 2022
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Longitudinal data analysis for the behavioral sciences using R is a book. It was written by Jeffrey D. Long and published by SAGE in 2012.
Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer...
paper.erudition.co.in
html
Updated Mar 17, 2025
Cite
Einetic (2025). Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer Application 2023-2024, MAKAUT | Erudition Paper [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
Available download formats: html
Dataset updated
Mar 17, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions of Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer Application 2023-2024, Maulana Abul Kalam Azad University of Technology
Data from: Accommodating the role of site memory in dynamic species...
data.niaid.nih.gov
datadryad.org
zip
Updated May 3, 2021
Cite
Graziella DiRenzo; David Miller; Blake Hossack; Brent Sigafus; Paige Howell; Erin Muths; Evan Grant (2021). Accommodating the role of site memory in dynamic species distribution models [Dataset]. http://doi.org/10.5061/dryad.vdncjsxs7
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.vdncjsxs7
Dataset updated
May 3, 2021
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Pennsylvania State University
Authors
Graziella DiRenzo; David Miller; Blake Hossack; Brent Sigafus; Paige Howell; Erin Muths; Evan Grant
License
https://spdx.org/licenses/CC0-1.0.html
Description
First-order dynamic occupancy models (FODOMs) are a class of state-space model in which the true state (occurrence) is observed imperfectly. An important assumption of FODOMs is that site dynamics only depend on the current state and that variations in dynamic processes are adequately captured with covariates or random effects. However, it is often difficult to understand and/or measure the covariates that generate ecological data, which are typically spatio-temporally correlated. Consequently, the non-independent error structure of correlated data causes underestimation of parameter uncertainty and poor ecological inference. Here, we extend the FODOM framework with a second-order Markov process to accommodate site memory when covariates are not available. Our modeling framework can be used to make reliable inference about site occupancy, colonization, extinction, turnover, and detection probabilities. We present a series of simulations to illustrate the data requirements and model performance. We then applied our modeling framework to 13 years of data from an amphibian community in southern Arizona, USA. In this analysis, we found residual temporal autocorrelation of population processes for most species, even after accounting for long-term drought dynamics. Our approach represents a valuable advance in obtaining inference on population dynamics, especially as they relate to metapopulations.
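As a rough illustration of what "site memory" means here, the following Python sketch simulates one site whose occupancy in a given year depends on its true state in the two preceding years (a second-order Markov process) and is observed imperfectly. It is not the authors' parameterization, which lives in the R files listed below: the function name `simulate_site`, every probability value, and the assumed initial states are hypothetical.

```python
import random


def simulate_site(n_years, occ_prob, p_detect, rng):
    """Simulate true occupancy z and observations y for one site.

    occ_prob maps (state at t-2, state at t-1) -> P(occupied at t),
    which is what makes the process second-order.
    """
    z = [1, 1]  # assumed true states for the first two years
    for _ in range(n_years - 2):
        z.append(1 if rng.random() < occ_prob[(z[-2], z[-1])] else 0)
    # Imperfect detection: occupied sites are detected with probability
    # p_detect; unoccupied sites are never detected (no false positives).
    y = [1 if (s == 1 and rng.random() < p_detect) else 0 for s in z]
    return z, y


# Hypothetical probabilities, chosen only to show the mechanics.
occ_prob = {(0, 0): 0.1, (1, 0): 0.4, (0, 1): 0.7, (1, 1): 0.9}
rng = random.Random(42)
z, y = simulate_site(n_years=13, occ_prob=occ_prob, p_detect=0.6, rng=rng)
print("true states: ", z)
print("observations:", y)
```

A first-order model would collapse the four entries of `occ_prob` into two, one per state at t-1. The difference between `(1, 0)` and `(0, 0)` is the memory effect: a site that was occupied two years ago is more likely to be recolonized than one that has been empty for two years.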

Methods

These files were written by: G. V. DiRenzo

If you have any questions, please email: grace.direnzo@gmail.com

This repository provides the code, data, and simulations to recreate all of the analysis, tables, and figures presented in the manuscript.

In this file, we direct the user to the location of files.

All methods can be found in the manuscript and associated supplements.

All file paths direct the user in navigating the files in this repo.

######## Objective & Table of contents

File objectives & Table of contents:

# 1. To navigate to files explaining how to simulate and analyze data using the main text parameterization
# 2. To navigate to files explaining how to simulate and analyze data using the alternative parameterization (hidden Markov model)
# 3. To navigate to files that created the parameter combinations for the simulation studies
# 4. To navigate to files used to run scenarios in the manuscript
# 4a. Scenario 1: data generated without site memory & without site heterogeneity
# 4b. Scenario 2: data generated with site memory & without site heterogeneity
# 4c. Scenario 3: data generated with site memory & with site heterogeneity
# 5. To navigate to files for general sample design guidelines
# 6. Parameter accuracy, precision, and bias under different parameter combinations
# 7. Model comparison under different scenarios
# 8. To specifically navigate to code that recreates manuscript:
# 8a. Figures
# 8b. Tables
# 9. To navigate to files for empirical analysis

### 1. Main text parameterization

To see model parameterization as written in the main text, please navigate to: /MemModel/OtherCode/MemoryMod_main.R

### 2. Alternative parameterization

To see alternative parameterization using a Hidden Markov Model, please navigate to: /MemModel/OtherCode/MemoryMod_HMM.R

### 3. Parameter Combinations

To see how parameter combinations were generated, please navigate to: /MemModel/ParameterCombinations/LHS_parameter_combos.R

To see stored parameter combinations for simulations, please navigate to: /MemModel/ParameterCombinations/parameter_combos_MemModel4.csv

### 4a. Scenario #1

To simulate data WITHOUT memory and analyze using:
- memory model &
- first-order dynamic occupancy model

Please navigate to: /MemModel/Simulations/withoutMem/Code/
MemoryMod_JobArray_withoutMem.R = code to simulate & analyze data
MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

All model output is stored in: /MemModel/Simulations/withoutMem/ModelOutput

### 4b. Scenario #2

To simulate data WITH memory and analyze using:
- memory model &
- first-order dynamic occupancy model

Please navigate to: /MemModel/Simulations/withMem/Code/
MemoryMod_JobArray_withMem.R = code to simulate & analyze data
MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

All model output is stored in: /MemModel/Simulations/withMem/ModelOutput

### 4c. Scenario #3

To simulate data WITH memory and WITH site heterogeneity, and analyze using:
- memory model &
- first-order dynamic occupancy model

Please navigate to: /MemModel/Simulations/Hetero/Code/
MemoryMod_JobArray_Hetero.R = code to simulate & analyze data
MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

All model output is stored in: /MemModel/Simulations/Hetero/ModelOutput

### 5. General sample design guidelines

To see methods for the general sample design guidelines, please navigate to: /MemModel/PostProcessingCode/Sampling_design_guidelines.R

### 6. Parameter accuracy, precision, and bias under different parameter combinations

To see methods for model performance under different parameter combinations, please navigate to: /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

### 7. Comparison of model performance

To see methods for model comparison, please navigate to: /MemModel/PostProcessingCode/ModelComparison.R

### 8a. Manuscript Figures

To create parts of Figure 1 of main text (case study):
- Fig 1D & 1E: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

To create Figure 2 of main text (Comparison across simulation scenarios):
- /MemModel/PostProcessingCode/ModelComparison.R

To create Figure S1, S2, & S3 use file:
- /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

To create Figure S4 & S5 use file:
- /MemModel/PostProcessingCode/ModelComparison.R

### 8b. Manuscript Tables

To create Table 1 of main text (General sampling recommendations):
- /MemModel/PostProcessingCode/Sampling_design_guidelines.R

To create Table S1:
- /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

To create Table S2:
- /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

To create Table S3:
- /MemModel/PostProcessingCode/ModelComparison.R

To create Table S4 & S5:
- /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

### 9. Empirical analysis

To recreate the empirical analysis of the case study, please navigate to: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R
A primer in biological data analysis and visualization using R
workwithdata.com
Updated Feb 13, 2024
Work With Data (2024). A primer in biological data analysis and visualization using R [Dataset]. https://www.workwithdata.com/object/a-primer-biological-data-analysis-visualization-using-r-book-by-gregg-hartvigsen-0000
Explore at:
Dataset updated
Feb 13, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A primer in biological data analysis and visualization using R is a book. It was written by Gregg Hartvigsen and published by Columbia University Press in 2014.
R Code for Systematic Review and Meta Analysis
data.mendeley.com
Updated May 22, 2020
Carmen Isensee (2020). R Code for Systematic Review and Meta Analysis [Dataset]. http://doi.org/10.17632/hympskpm3x.1
Explore at:
Unique identifier
https://doi.org/10.17632/hympskpm3x.1
Dataset updated
May 22, 2020
Authors
Carmen Isensee
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This project presents all codes related to the review paper "The relationship between organizational culture, sustainability, and digitalization in SMEs: A systematic review."
Data Analysis in R.
kaggle.com
zip
Updated Nov 14, 2019
Pierce Dsouza (2019). Data Analysis in R. [Dataset]. https://www.kaggle.com/piercerhymes/data-analysis-in-r
Explore at:
Available download formats: zip (527589 bytes)
Dataset updated
Nov 14, 2019
Authors
Pierce Dsouza
Description
Dataset

This dataset was created by Pierce Dsouza

Contents
Research data supporting 'Lithic Technological Change and Behavioral...
repository.cam.ac.uk
bin, docx, xlsx
Updated Sep 8, 2020
Carroll, Peyton (2020). Research data supporting 'Lithic Technological Change and Behavioral Responses to the Last Glacial Maximum Across Southwestern Europe' [Dataset]. http://doi.org/10.17863/CAM.56697
Explore at:
Available download formats: xlsx (56230 bytes), bin (6066 bytes), bin (46471 bytes), xlsx (542779 bytes), docx (347181 bytes)
Unique identifier
https://doi.org/10.17863/CAM.56697
Dataset updated
Sep 8, 2020
Dataset provided by
Apollo
University of Cambridge
Authors
Carroll, Peyton
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was used to collect and analyze data for the MPhil Thesis, "Lithic Technological Change and Behavioral Responses to the Last Glacial Maximum Across Southwestern Europe." This dataset contains the raw data collected from published literature, and the R code used to run correspondence analysis on the data and create graphical representations of the results. It also contains notes to aid in interpreting the dataset, and a list detailing how variables in the dataset were grouped for use in analysis.

The file "Diss Data.xlsx" contains the raw data collected from publications on Upper Paleolithic archaeological sites in France, Spain, and Italy. This data is the basis for all other files included in the repository. The document "Diss Data Notes.docx" contains detailed information about the raw data, and is useful for understanding its context. "Revised Variable Groups.docx" lists all of the variables from the raw data considered "tool types" and the major categories into which they were sorted for analysis. "Group Definitions.docx" provides the criteria considered to make the groups listed in the "Revised Variable Groups" document. "r_diss_data.xlsx" contains only the variables from the raw data that were considered for correspondence analysis carried out in RStudio.

The document "ca_barplot.R" contains the RStudio code written to perform correspondence analysis and percent composition analysis on the data from "R_Diss_Data.xlsx". This file also contains code for creating scatter plots and bar graphs displaying the results from the CA and Percent Comp tests. The RStudio packages used to carry out the analysis and to create graphical representations of the analysis results are listed under "Software/Usage Instructions." "climate_curve.R" contains the RStudio code used to create climate curves from NGRIP and GRIP data available open-access from the Niels Bohr Institute Center of Ice and Climate. The link to access this data is provided in "Related Resources" below.


Political Analysis Using R: Example Code and Data, Plus Data for Practice...

Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...

Statistical Data Analysis using R

Books series that contain Analyzing sensory data with R

Data from: pmartR: Quality Control and Statistics for Mass...

Data Set for "Analyzing Microbial Growth with R"

Data from: Sensory data analysis by example with R

Hydroinformatics: Intro to Hydrologic Analysis in R (Bookdown and Code)

Subsetting

Data for: A novel statistical model for analyzing data of a systematic...

Scripts and data to run R-QWTREND models and produce results | gimi9.com

Data from: Optimized SMRT-UMI protocol produces highly accurate sequence...

Subjects of An introduction to analysis of financial data with R

Longitudinal data analysis for the behavioral sciences using R

Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer...

Data from: Accommodating the role of site memory in dynamic species...

These files were written by: G. V. DiRenzo

If you have any questions, please email: grace.direnzo@gmail.com

