100+ datasets found
  1. Political Analysis Using R: Example Code and Data, Plus Data for Practice...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Monogan, Jamie (2023). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Monogan, Jamie
    Description

    Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

  2. Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    + more versions
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s001
    Available download formats: docx
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  3. Statistical Data Analysis using R

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Statistical Data Analysis using R [Dataset]. https://figshare.com/articles/dataset/Statistical_Data_Analysis_using_R/5501035
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Samuel Barsanelli Costa
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    R scripts containing statistical data analysis for streamflow and sediment data, including Flow Duration Curves, Double Mass Analysis, Nonlinear Regression Analysis for Suspended Sediment Rating Curves, and Stationarity Tests, along with several plots.
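
    As a rough illustration of the first technique named above, a flow duration curve ranks flows from highest to lowest and assigns each an exceedance probability. The sketch below is not taken from the dataset's R scripts: the function name, the made-up flow values, and the choice of the Weibull plotting position (rank / (n + 1)) are all assumptions made for the example.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in %, flow) pairs, highest flow first."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Weibull plotting position (an assumed choice): P = 100 * rank / (n + 1)
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


# Hypothetical daily streamflow values, for illustration only
daily_flows = [12.0, 3.5, 7.2, 20.1, 5.0, 9.8, 4.1, 15.3, 6.6]
fdc = flow_duration_curve(daily_flows)

for probability, flow in fdc:
    print(f"{probability:5.1f}%  {flow:6.1f}")
```

    Plotting flow against exceedance probability (often with a logarithmic flow axis) gives the familiar curve; other plotting positions would shift the probabilities slightly.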

  4. Books series that contain Analyzing sensory data with R

    • workwithdata.com
    Updated Mar 3, 2003
    Cite
    Work With Data (2003). Books series that contain Analyzing sensory data with R [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Analyzing+sensory+data+with+R&j=1&j0=books
    Dataset updated
    Mar 3, 2003
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series and is filtered to the book Analyzing sensory data with R. It features 10 columns, including authors, average publication date, book publishers, book series, and books. The preview is ordered by number of books (descending).

  5. Data from: pmartR: Quality Control and Statistics for Mass...

    • acs.figshare.com
    • figshare.com
    xlsx
    Updated May 31, 2023
    Cite
    Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer (2023). pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data [Dataset]. http://doi.org/10.1021/acs.jproteome.8b00760.s001
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography–MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.

  6. Data Set for "Analyzing Microbial Growth with R"

    • zenodo.org
    csv
    Updated Jan 24, 2020
    Cite
    Brian D. Connelly (2020). Data Set for "Analyzing Microbial Growth with R" [Dataset]. http://doi.org/10.5281/zenodo.1171129
    Available download formats: csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Brian D. Connelly
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample data set used in "Analyzing Microbial Growth with R"

  7. Data from: Sensory data analysis by example with R

    • workwithdata.com
    Updated Jan 5, 2022
    Cite
    Work With Data (2022). Sensory data analysis by example with R [Dataset]. https://www.workwithdata.com/object/sensory-data-analysis-by-example-with-r-book-by-sebastien-le-0000
    Dataset updated
    Jan 5, 2022
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sensory data analysis by example with R is a book. It was written by Sébastien Lê and published by Chapman&Hall/CRC in 2014.

  8. Hydroinformatics: Intro to Hydrologic Analysis in R (Bookdown and Code)

    • search.dataone.org
    • beta.hydroshare.org
    • +1more
    Updated Dec 5, 2021
    Cite
    John P Gannon (2021). Hydroinformatics: Intro to Hydrologic Analysis in R (Bookdown and Code) [Dataset]. https://search.dataone.org/view/sha256%3A0a728bb4a6759737e777a3ad29355a61b252ad7c0a59b33dab345c789107a8c8
    Dataset updated
    Dec 5, 2021
    Dataset provided by
    Hydroshare
    Authors
    John P Gannon
    Description

    The linked bookdown contains the notes and most exercises for a course on data analysis techniques in hydrology using the programming language R. The material will be updated each time the course is taught. If new topics are added, the topics they replace will remain, in case they are useful to others.

    I hope these materials can be a resource to those teaching themselves R for hydrologic analysis and/or for instructors who may want to use a lesson or two or the entire course. At the top of each chapter there is a link to a github repository. In each repository is the code that produces each chapter and a version where the code chunks within it are blank. These repositories are all template repositories, so you can easily copy them to your own github space by clicking Use This Template on the repo page.

    In my class, I work through each document, live coding with students following along. Typically I ask students to watch as I code and explain the chunk and then replicate it on their computer. Depending on the lesson, I will ask students to try some of the chunks before I show them the code as an in-class activity. Some chunks are explicitly designed for this purpose and are typically labeled a “challenge.”

    Chapters called ACTIVITY are either homework or class-period-long in-class activities. The code chunks in these are therefore blank. If you would like a key for any of these, please just send me an email.

    If you have questions, suggestions, or would like activity answer keys, etc. please email me at jpgannon at vt.edu

    Finally, if you use this resource, please fill out the survey on the first page of the bookdown (https://forms.gle/6Zcntzvr1wZZUh6S7). This will help me get an idea of how people are using this resource, how I might improve it, and whether or not I should continue to update it.

  9. Subsetting

    • paper.erudition.co.in
    html
    Updated Mar 17, 2025
    + more versions
    Cite
    Einetic (2025). Subsetting [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
    Available download formats: html
    Dataset updated
    Mar 17, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Subsetting of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024

  10. Data for: A novel statistical model for analyzing data of a systematic...

    • data.mendeley.com
    • narcis.nl
    Updated Oct 9, 2017
    Cite
    Antonius Schneider (2017). Data for: A novel statistical model for analyzing data of a systematic review generates optimal cut-off values for FENO measurement for asthma diagnosis [Dataset]. http://doi.org/10.17632/fndpn5bnps.1
    Dataset updated
    Oct 9, 2017
    Authors
    Antonius Schneider
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Supplementary Information:

    FENO_MultipleCO_basic.csv (Raw Study Data)
    diagmeta.R (R functions for running the Multiple thresholds model)
    example.R (R code for running the model with the data)

  11. Scripts and data to run R-QWTREND models and produce results | gimi9.com

    • gimi9.com
    + more versions
    Cite
    Scripts and data to run R-QWTREND models and produce results | gimi9.com [Dataset]. https://www.gimi9.com/dataset/data-gov_scripts-and-data-to-run-r-qwtrend-models-and-produce-results/
    Description

    This child page contains a zipped folder with all items necessary to run trend models and produce the results published in U.S. Geological Survey Scientific Investigations Report 2021–XXXX [Tatge, W.S., Nustad, R.A., and Galloway, J.M., 2021, Evaluation of Salinity and Nutrient Conditions in the Heart River Basin, North Dakota, 1970-2020: U.S. Geological Survey Scientific Investigations Report 2021-XXXX, XX p.]. To run the R-QWTREND program in R, six files are required, and each is included in this child page: prepQWdataV4.txt, runQWmodelV4XXUEP.txt, plotQWtrendV4XXUEP.txt, qwtrend2018v4.exe, salflibc.dll, and StartQWTrendV4.R (Vecchia and Nustad, 2020).

    The folder contains: the six items required to run the R–QWTREND trend analysis tool; a readme.txt file; a flowtrendData.RData file; an allsiteinfo.table.csv file; a folder called "scripts"; and a folder called "waterqualitydata". The "scripts" folder contains the scripts that can be used to reproduce the results found in the USGS Scientific Investigations Report referenced above. The "waterqualitydata" folder contains machine-readable .csv files with the water-quality data used for the trend analysis at each site, named site_ions (major ions) or site_nuts (nutrients).

    R–QWTREND is a software package for analyzing trends in stream-water quality. The package is a collection of functions written in R (R Development Core Team, 2019), an open source language and a general environment for statistical computing and graphics. The following system requirements are necessary for using R–QWTREND:

    • Windows 10 operating system
    • R (version 3.4 or later; 64 bit recommended)
    • RStudio (version 1.1.456 or later)

    An accompanying report (Vecchia and Nustad, 2020) serves as the formal documentation for R–QWTREND.

    Vecchia, A.V., and Nustad, R.A., 2020, Time-series model, statistical methods, and software documentation for R–QWTREND—An R package for analyzing trends in stream-water quality: U.S. Geological Survey Open-File Report 2020–1014, 51 p., https://doi.org/10.3133/ofr20201014

    R Development Core Team, 2019, R—A language and environment for statistical computing: Vienna, Austria, R Foundation for Statistical Computing, accessed December 7, 2020, at https://www.r-project.org.

  12. Data from: Optimized SMRT-UMI protocol produces highly accurate sequence...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated Dec 7, 2023
    + more versions
    Cite
    Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies [Dataset]. https://data.niaid.nih.gov/resources?id=dryad_w3r2280w0
    Available download formats: zip
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    HIV Prevention Trials Network (http://www.hptn.org/)
    National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
    HIV Vaccine Trials Network (http://www.hvtn.org/)
    PEPFAR
    Authors
    Dylan Westfall; Mullins James
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR and the use of UMI allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.

    Methods

    This serves as an overview of the analysis performed on PacBio sequence data that is summarized in Analysis Flowchart.pdf and was used as primary data for the paper by Westfall et al., "Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies".

    Five different PacBio sequencing datasets were used for this analysis: M027, M2199, M1567, M004, and M005.

    For the datasets which were indexed (M027, M2199), CCS reads from PacBio sequencing files and the chunked_demux_config files were used as input for the chunked_demux pipeline. Each config file lists the different Index primers added during PCR to each sample. The pipeline produces one fastq file for each Index primer combination in the config. For example, in dataset M027 there were 3–4 samples using each Index combination. The fastq files from each demultiplexed read set were moved to the sUMI_dUMI_comparison pipeline fastq folder for further demultiplexing by sample and consensus generation with that pipeline. More information about the chunked_demux pipeline can be found in the README.md file on GitHub.

    The demultiplexed read collections from the chunked_demux pipeline or CCS read files from datasets which were not indexed (M1567, M004, M005) were each used as input for the sUMI_dUMI_comparison pipeline along with each dataset's config file. Each config file contains the primer sequences for each sample (including the sample ID block in the cDNA primer) and further demultiplexes the reads to prepare data tables summarizing all of the UMI sequences and counts for each family (tagged.tar.gz) as well as consensus sequences from each sUMI and rank 1 dUMI family (consensus.tar.gz). More information about the sUMI_dUMI_comparison pipeline can be found in the paper and the README.md file on GitHub.

    The consensus.tar.gz and tagged.tar.gz files were moved from the sUMI_dUMI_comparison pipeline directory on the server to the Pipeline_Outputs folder in this analysis directory for each dataset and appended with the dataset name (e.g. consensus_M027.tar.gz). Also in this analysis directory is a Sample_Info_Table.csv containing information about how each of the samples was prepared, such as purification methods and number of PCRs. There are also three other folders: Sequence_Analysis, Indentifying_Recombinant_Reads, and Figures. Each has an .Rmd file with the same name inside which is used to collect, summarize, and analyze the data. All of these collections of code were written and executed in RStudio to track notes and summarize results.

    Sequence_Analysis.Rmd has instructions to decompress all of the consensus.tar.gz files, combine them, and create two fasta files, one with all sUMI and one with all dUMI sequences. Using these as input, two data tables were created that summarize all sequences and read counts for each sample that pass various criteria. These are used to help create Table 2 and as input for Indentifying_Recombinant_Reads.Rmd and Figures.Rmd. Next, two fasta files containing all of the rank 1 dUMI sequences and the matching sUMI sequences were created. These were used as input for the Python script compare_seqs.py, which identifies any matched sequences that are different between sUMI and dUMI read collections. This information was also used to help create Table 2. Finally, to populate the table with the number of sequences and bases in each sequence subset of interest, different sequence collections were saved and viewed in the Geneious program.

    To investigate the cause of sequences where the sUMI and dUMI sequences do not match, tagged.tar.gz was decompressed and for each family with discordant sUMI and dUMI sequences the reads from the UMI1_keeping directory were aligned using Geneious. Reads from dUMI families failing the 0.7 filter were also aligned in Geneious. The uncompressed tagged folder was then removed to save space. These read collections contain all of the reads in a UMI1 family and still include the UMI2 sequence. By examining the alignment and specifically the UMI2 sequences, the site of the discordance and its cause were identified for each family as described in the paper. These alignments were saved as "Sequence Alignments.geneious". The counts of how many families were the result of PCR recombination were used in the body of the paper.

    Using Identifying_Recombinant_Reads.Rmd, the dUMI_ranked.csv file from each sample was extracted from all of the tagged.tar.gz files, combined and used as input to create a single dataset containing all UMI information from all samples. This file, dUMI_df.csv, was used as input for Figures.Rmd.

    Figures.Rmd used dUMI_df.csv, sequence_counts.csv, and read_counts.csv as input to create draft figures and then individual datasets for each figure. These were copied into Prism software to create the final figures for the paper.

  13. Subjects of An introduction to analysis of financial data with R

    • workwithdata.com
    Updated Jul 1, 2024
    Cite
    Work With Data (2024). Subjects of An introduction to analysis of financial data with R [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=book&fop0=%3D&fval0=An+introduction+to+analysis+of+financial+data+with+R
    Dataset updated
    Jul 1, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects and is filtered to the book An introduction to analysis of financial data with R. It features 10 columns, including authors, average publication date, book publishers, book subject, and books. The preview is ordered by number of books (descending).

  14. Longitudinal data analysis for the behavioral sciences using R

    • workwithdata.com
    Updated Jan 10, 2022
    Cite
    Work With Data (2022). Longitudinal data analysis for the behavioral sciences using R [Dataset]. https://www.workwithdata.com/object/longitudinal-data-analysis-for-the-behavioral-sciences-using-r-book-by-jeffrey-d-long-1964
    Dataset updated
    Jan 10, 2022
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Longitudinal data analysis for the behavioral sciences using R is a book. It was written by Jeffrey D. Long and published by SAGE in 2012.

  15. Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer...

    • paper.erudition.co.in
    html
    Updated Mar 17, 2025
    Cite
    Einetic (2025). Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer Application 2023-2024, MAKAUT | Erudition Paper [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
    Available download formats: html
    Dataset updated
    Mar 17, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of Data Analysis with R (GE3B-07), 2nd Semester, Bachelor of Computer Application 2023-2024, Maulana Abul Kalam Azad University of Technology

  16. Data from: Accommodating the role of site memory in dynamic species...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated May 3, 2021
    Cite
    Graziella DiRenzo; David Miller; Blake Hossack; Brent Sigafus; Paige Howell; Erin Muths; Evan Grant (2021). Accommodating the role of site memory in dynamic species distribution models [Dataset]. http://doi.org/10.5061/dryad.vdncjsxs7
    Available download formats: zip
    Dataset updated
    May 3, 2021
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Pennsylvania State University
    Authors
    Graziella DiRenzo; David Miller; Blake Hossack; Brent Sigafus; Paige Howell; Erin Muths; Evan Grant
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    First-order dynamic occupancy models (FODOMs) are a class of state-space model in which the true state (occurrence) is observed imperfectly. An important assumption of FODOMs is that site dynamics only depend on the current state and that variations in dynamic processes are adequately captured with covariates or random effects. However, it is often difficult to understand and/or measure the covariates that generate ecological data, which are typically spatio-temporally correlated. Consequently, the non-independent error structure of correlated data causes underestimation of parameter uncertainty and poor ecological inference. Here, we extend the FODOM framework with a second-order Markov process to accommodate site memory when covariates are not available. Our modeling framework can be used to make reliable inference about site occupancy, colonization, extinction, turnover, and detection probabilities. We present a series of simulations to illustrate the data requirements and model performance. We then applied our modeling framework to 13 years of data from an amphibian community in southern Arizona, USA. In this analysis, we found residual temporal autocorrelation of population processes for most species, even after accounting for long-term drought dynamics. Our approach represents a valuable advance in obtaining inference on population dynamics, especially as they relate to metapopulations.
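
    To make the idea of site memory concrete, the sketch below simulates a single site whose occupancy in a given year depends on its state in the two preceding years (a second-order Markov process), with imperfect detection layered on top. It is only an illustration: the function name, the probability values, and the detection probability are hypothetical and are not the authors' parameterization.

```python
import random


def simulate_site(n_years, occ_prob, p_detect, init=(0, 0), seed=None):
    """Simulate one site's true occupancy and imperfect detections.

    occ_prob maps (state two years ago, state last year) to the
    probability that the site is occupied this year.
    """
    if n_years < 2:
        raise ValueError("n_years must be at least 2")
    rng = random.Random(seed)
    states = list(init)
    for _ in range(n_years - 2):
        p_occ = occ_prob[(states[-2], states[-1])]
        states.append(1 if rng.random() < p_occ else 0)
    # An occupied site is detected with probability p_detect;
    # an unoccupied site is never detected.
    detections = [1 if z == 1 and rng.random() < p_detect else 0 for z in states]
    return states, detections


# Hypothetical occupancy probabilities, chosen only for illustration
occ_prob = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.6, (1, 1): 0.9}
states, detections = simulate_site(13, occ_prob, p_detect=0.7, seed=42)
print(states)
print(detections)
```

    In a first-order model the occupancy probability would depend only on the most recent state, so the (0, 1) and (1, 1) entries would be equal, as would (0, 0) and (1, 0).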

    Methods

    These files were written by: G. V. DiRenzo

    If you have any questions, please email: grace.direnzo@gmail.com

    This repository provides the code, data, and simulations to recreate all of the analysis, tables, and figures presented in the manuscript.

    In this file, we direct the user to the location of files.

    All methods can be found in the manuscript and associated supplements.

    All file paths direct the user in navigating the files in this repo.

    ######## Objective & Table of contents

    File objectives & Table of contents:

    # 1. To navigate to files explaining how to simulate and analyze data using the main text parameterization
    # 2. To navigate to files explaining how to simulate and analyze data using the alternative parameterization (hidden Markov model)
    # 3. To navigate to files that created the parameter combinations for the simulation studies
    # 4. To navigate to files used to run scenarios in the manuscript
      # 4a. Scenario 1: data generated without site memory & without site heterogeneity
      # 4b. Scenario 2: data generated with site memory & without site heterogeneity
      # 4c. Scenario 3: data generated with site memory & with site heterogeneity
    # 5. To navigate to files for general sample design guidelines
    # 6. Parameter accuracy, precision, and bias under different parameter combinations
    # 7. Model comparison under different scenarios
    # 8. To specifically navigate to code that recreates manuscript:
      # 8a. Figures
      # 8b. Tables
    # 9. To navigate to files for empirical analysis
    
    ### 1. Main text parameterization

    To see model parameterization as written in the main text, please navigate to: /MemModel/OtherCode/MemoryMod_main.R

    ### 2. Alternative parameterization

    To see alternative parameterization using a Hidden Markov Model, please navigate to: /MemModel/OtherCode/MemoryMod_HMM.R

    ### 3. Parameter Combinations

    To see how parameter combinations were generated, please navigate to: /MemModel/ParameterCombinations/LHS_parameter_combos.R

    To see stored parameter combinations for simulations, please navigate to: /MemModel/ParameterCombinations/parameter_combos_MemModel4.csv
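
    The file name LHS_parameter_combos.R suggests that the combinations were drawn by Latin hypercube sampling, although this listing does not say so explicitly. Under that assumption, the sketch below shows the general technique: each parameter range is split into equally wide strata, one value is drawn per stratum, and the strata are shuffled independently for each parameter. The function name and the parameter bounds are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples parameter combinations by Latin hypercube sampling.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each of n_samples equal-width strata of [0, 1)
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


# Hypothetical bounds for two probabilities, for illustration only
combos = latin_hypercube(5, [(0.0, 1.0), (0.1, 0.9)], seed=1)
for combo in combos:
    print(combo)
```

    Compared with plain random sampling, this guarantees that every stratum of every parameter is represented exactly once across the combinations.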

    ### 4a. Scenario #1

    To simulate data WITHOUT memory and analyze using:
    - memory model
    - first-order dynamic occupancy model

    Please navigate to: /MemModel/Simulations/withoutMem/Code/
    - MemoryMod_JobArray_withoutMem.R = code to simulate & analyze data
    - MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
    - MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

    All model output is stored in: /MemModel/Simulations/withoutMem/ModelOutput

    ### 4b. Scenario #2

    To simulate data WITH memory and analyze using:
    - memory model &
    - first-order dynamic occupancy model

    Please navigate to: /MemModel/Simulations/withMem/Code/
    - MemoryMod_JobArray_withMem.R = code to simulate & analyze data
    - MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
    - MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

    All model output is stored in: /MemModel/Simulations/withMem/ModelOutput

    ### 4c. Scenario #3

    To simulate data WITH memory and WITH site heterogeneity, and analyze using:
    - memory model &
    - first-order dynamic occupancy model

    Please navigate to: /MemModel/Simulations/Hetero/Code/
    - MemoryMod_JobArray_Hetero.R = code to simulate & analyze data
    - MemoryMod_JA1.sh = file to run simulations 1-5000 on HPC
    - MemoryMod_JA2.sh = file to run simulations 5001-10000 on HPC

    All model output is stored in: /MemModel/Simulations/Hetero/ModelOutput
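Each scenario above splits its 10,000 simulations across two job-array files (1-5000 and 5001-10000). One way such a split could work is sketched below. This is a hypothetical illustration only, not the contents of MemoryMod_JA1.sh or MemoryMod_JA2.sh: the function name `sim_id`, the variable `BATCH_SIZE`, and the idea of passing a batch number plus a task index are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: map (batch number, task index within the batch)
# to a global simulation ID, assuming 5000 simulations per batch.
BATCH_SIZE=5000

sim_id() {
  # $1 = batch number (1 or 2), $2 = task index within the batch (1-5000)
  echo $(( ($1 - 1) * BATCH_SIZE + $2 ))
}

sim_id 1 1      # prints 1     (first simulation of batch 1)
sim_id 2 5000   # prints 10000 (last simulation of batch 2)
```

Under this assumption, each array task would pass its global ID to the R script so that the matching row of the stored parameter combinations can be selected.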

    ### 5. General sample design guidelines

    To see methods for the general sample design guidelines, please navigate to: /MemModel/PostProcessingCode/Sampling_design_guidelines.R

    ### 6. Parameter accuracy, precision, and bias under different parameter combinations

    To see methods for model performance under different parameter combinations, please navigate to: /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

    ### 7. Comparison of model performance

    To see methods for model comparison, please navigate to: /MemModel/PostProcessingCode/ModelComparison.R

    ### 8a. Manuscript Figures

    To create parts of Figure 1 of main text (case study), Fig 1D & 1E: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

    To create Figure 2 of main text (comparison across simulation scenarios): /MemModel/PostProcessingCode/ModelComparison.R

    To create Figures S1, S2, & S3, use file: /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

    To create Figures S4 & S5, use file: /MemModel/PostProcessingCode/ModelComparison.R

    ### 8b. Manuscript Tables

    To create Table 1 of main text (general sampling recommendations): /MemModel/PostProcessingCode/Sampling_design_guidelines.R

    To create Table S1: /MemModel/PostProcessingCode/Parameter_precison_accuracy_bias.R

    To create Table S2: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

    To create Table S3: /MemModel/PostProcessingCode/ModelComparison.R

    To create Tables S4 & S5: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

    ### 9. Empirical analysis

    To recreate the empirical analysis of the case study, please navigate to: /MemModel/EmpiricalAnalysis/Code/Analysis/AZ_CaseStudy.R

  17. w

    A primer in biological data analysis and visualization using R

    • workwithdata.com
    Updated Feb 13, 2024
    Cite
    Work With Data (2024). A primer in biological data analysis and visualization using R [Dataset]. https://www.workwithdata.com/object/a-primer-biological-data-analysis-visualization-using-r-book-by-gregg-hartvigsen-0000
    Explore at:
    Dataset updated
    Feb 13, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A primer in biological data analysis and visualization using R is a book. It was written by Gregg Hartvigsen and published by Columbia University Press in 2014.

  18. m

    R Code for Systematic Review and Meta Analysis

    • data.mendeley.com
    Updated May 22, 2020
    Cite
    Carmen Isensee (2020). R Code for Systematic Review and Meta Analysis [Dataset]. http://doi.org/10.17632/hympskpm3x.1
    Explore at:
    Dataset updated
    May 22, 2020
    Authors
    Carmen Isensee
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project presents all codes related to the review paper "The relationship between organizational culture, sustainability, and digitalization in SMEs: A systematic review."

  19. Data Analysis in R.

    • kaggle.com
    zip
    Updated Nov 14, 2019
    Cite
    Pierce Dsouza (2019). Data Analysis in R. [Dataset]. https://www.kaggle.com/piercerhymes/data-analysis-in-r
    Explore at:
    Available download formats: zip (527589 bytes)
    Dataset updated
    Nov 14, 2019
    Authors
    Pierce Dsouza
    Description

    Dataset

    This dataset was created by Pierce Dsouza

    Contents

  20. c

    Research data supporting 'Lithic Technological Change and Behavioral...

    • repository.cam.ac.uk
    bin, docx, xlsx
    Updated Sep 8, 2020
    Cite
    Carroll, Peyton (2020). Research data supporting 'Lithic Technological Change and Behavioral Responses to the Last Glacial Maximum Across Southwestern Europe' [Dataset]. http://doi.org/10.17863/CAM.56697
    Explore at:
    Available download formats: xlsx (56230 bytes), bin (6066 bytes), bin (46471 bytes), xlsx (542779 bytes), docx (347181 bytes)
    Dataset updated
    Sep 8, 2020
    Dataset provided by
    Apollo
    University of Cambridge
    Authors
    Carroll, Peyton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used to collect and analyze data for the MPhil Thesis, "Lithic Technological Change and Behavioral Responses to the Last Glacial Maximum Across Southwestern Europe." This dataset contains the raw data collected from published literature, and the R code used to run correspondence analysis on the data and create graphical representations of the results. It also contains notes to aid in interpreting the dataset, and a list detailing how variables in the dataset were grouped for use in analysis. The file "Diss Data.xlsx" contains the raw data collected from publications on Upper Paleolithic archaeological sites in France, Spain, and Italy. This data is the basis for all other files included in the repository. The document "Diss Data Notes.docx" contains detailed information about the raw data, and is useful for understanding its context. "Revised Variable Groups.docx" lists all of the variables from the raw data considered "tool types" and the major categories into which they were sorted for analysis. "Group Definitions.docx" provides the criteria considered to make the groups listed in the "Revised Variable Groups" document. "r_diss_data.xlsx" contains only the variables from the raw data that were considered for correspondence analysis carried out in RStudio. The document "ca_barplot.R" contains the RStudio code written to perform correspondence analysis and percent composition analysis on the data from "R_Diss_Data.xlsx". This file also contains code for creating scatter plots and bar graphs displaying the results from the CA and Percent Comp tests. The RStudio packages used to carry out the analysis and to create graphical representations of the analysis results are listed under "Software/Usage Instructions." "climate_curve.R" contains the RStudio code used to create climate curves from NGRIP and GRIP data available open-access from the Niels Bohr Institute Center of Ice and Climate. The link to access this data is provided in "Related Resources" below.
