6 datasets found
  1. Trend Detection and Forecasting

    • search.dataone.org
    • hydroshare.org
    • +1 more
    Updated Dec 5, 2021
    Cite
    Gabriela Garcia; Kateri Salk (2021). Trend Detection and Forecasting [Dataset]. https://search.dataone.org/view/sha256%3Acc6ce10bf4642cd85c69fc697a24b519ad086342c5da54012eb613d2f4f81e70
    Explore at:
    77 scholarly articles cite this dataset (View in Google Scholar)
    Dataset updated
    Dec 5, 2021
    Dataset provided by
    Hydroshare
    Authors
    Gabriela Garcia; Kateri Salk
    Description

    Trend Detection and Forecasting

    This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the second part of a two-part exercise focusing on time series analysis.

    Introduction

    Time series are a special class of dataset, where a response variable is tracked over time. Time series analysis is a powerful technique that can be used to understand the various temporal patterns in our data by decomposing data into different cyclic trends. Time series analysis can also be used to predict how levels of a variable will change in the future, taking into account what has happened in the past.
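
    A minimal sketch of the two ideas above (decomposition and forecasting) in base R, using the built-in `co2` series rather than the hydrologic data from the lesson:

    y <- ts(co2, frequency = 12)

    # Separate the series into seasonal, trend, and remainder components
    parts <- stl(y, s.window = "periodic")
    plot(parts)

    # Fit a seasonal ARIMA model and forecast 24 months ahead
    fit <- arima(y, order = c(1, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 12))
    predict(fit, n.ahead = 24)$pred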

    Learning Objectives

    1. Choose appropriate time series analyses for trend detection and forecasting
    2. Discuss the influence of seasonality on time series analysis
    3. Interpret and communicate results of time series analyses
  2. Storage and Transit Time Data and Code

    • zenodo.org
    zip
    Updated Nov 15, 2024
    + more versions
    Cite
    Andrew Felton; Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. http://doi.org/10.5281/zenodo.14171251
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Felton; Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Author: Andrew J. Felton
    Date: 11/15/2024

    This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably. Also note that this R project has been updated multiple times as the analysis was revised throughout the peer review process.

    # Data information

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/august_2024_lc/" contains the core datasets used in this study including global arrays summarizing five year (2016-2020) averages of mean (annual) and minimum (monthly) transit time, storage, canopy transpiration, and number of months of data able as both an array (.nc) or data table (.csv). These data were produced in python using the python scripts found in the "supporting_code" folder. The remaining files in the "data" and "data/supporting_data" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here. The "supporting_data"" folder also contains annual (2016-2020) MODIS land cover data used in the analysis and contains separate filters containing the original data (.hdf) and then the final process (filtered) data in .nc format. The resulting annual land cover distributions were used in the pre-processing of data in python.

    # Code information

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a role:

    "01_start.R": This script sets the working directory, loads in the tidyverse package (the remaining packages in this project are called using the `::` operator), and can run two other scripts: one that loads the customized functions (02_functions.R) and one for importing and processing the key dataset for this analysis (03_import_data.R).

    "02_functions.R": This script contains custom functions. Load this using the `source()` function in the 01_start.R script.

    "03_import_data.R": This script imports and processes the .csv transit data. It joins the mean (annual) transit time data with the minimum (monthly) transit data to generate one dataset for analysis: annual_turnover_2. Load this using the
    `source()` function in the 01_start.R script.
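
    A hedged sketch of the join described above; the file names, join keys, and column suffixes are placeholders, not the project's actual ones:

    library(dplyr)

    annual  <- readr::read_csv("data/turnover_from_python/updated/august_2024_lc/annual_mean.csv")   # placeholder
    monthly <- readr::read_csv("data/turnover_from_python/updated/august_2024_lc/monthly_min.csv")   # placeholder

    annual_turnover_2 <- annual %>%
      left_join(monthly, by = c("lon", "lat"),
                suffix = c("_annual_mean", "_monthly_min"))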

    "04_figures_tables.R": This is the main workhouse for figure/table production and supporting analyses. This script generates the key figures and summary statistics used in the study that then get saved in the "manuscript_figures" folder. Note that all maps were produced using Python code found in the "supporting_code"" folder. Also note that within the "manuscript_figures" folder there is an "extended_data" folder, which contains tables of the summary statistics (e.g., quartiles and sample sizes) behind figures containing box plots or depicting regression coefficients.

    "supporting_generate_data.R": This script processes supporting data used in the analysis, primarily the varying ground-based datasets of leaf water content.

    "supporting_process_land_cover.R": This takes annual MODIS land cover distributions and processes them through a multi-step filtering process so that they can be used in preprocessing of datasets in python.

  3. Data from: Generalizable EHR-R-REDCap pipeline for a national multi-institutional rare tumor patient registry

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +2 more
    zip
    Updated Jan 9, 2022
    Cite
    Sophia Shalhout; Farees Saqlain; Kayla Wright; Oladayo Akinyemi; David Miller (2022). Generalizable EHR-R-REDCap pipeline for a national multi-institutional rare tumor patient registry [Dataset]. http://doi.org/10.5061/dryad.rjdfn2zcm
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 9, 2022
    Dataset provided by
    Harvard Medical School
    Massachusetts General Hospital
    Authors
    Sophia Shalhout; Farees Saqlain; Kayla Wright; Oladayo Akinyemi; David Miller
    License

    CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description

    Objective: To develop a clinical informatics pipeline designed to capture large-scale structured EHR data for a national patient registry.

    Materials and Methods: The EHR-R-REDCap pipeline is implemented using R-statistical software to remap and import structured EHR data into the REDCap-based multi-institutional Merkel Cell Carcinoma (MCC) Patient Registry using an adaptable data dictionary.

    Results: Clinical laboratory data were extracted from EPIC Clarity across several participating institutions. Labs were transformed, remapped and imported into the MCC registry using the EHR labs abstraction (eLAB) pipeline. Forty-nine clinical tests encompassing 482,450 results were imported into the registry for 1,109 enrolled MCC patients. Data-quality assessment revealed highly accurate, valid labs. Univariate modeling was performed for labs at baseline on overall survival (N=176) using this clinical informatics pipeline.

    Conclusion: We demonstrate feasibility of the facile eLAB workflow. EHR data are successfully transformed and bulk-loaded/imported into a REDCap-based national registry to execute real-world data analysis and interoperability.

    Methods

    eLAB Development and Source Code (R statistical software):

    eLAB is written in R (version 4.0.3), and utilizes the following packages for processing: DescTools, REDCapR, reshape2, splitstackshape, readxl, survival, survminer, and tidyverse. Source code for eLAB can be downloaded directly (https://github.com/TheMillerLab/eLAB).

    eLAB reformats EHR data abstracted for an identified population of patients (e.g. medical record numbers (MRN)/name list) under an Institutional Review Board (IRB)-approved protocol. The MCCPR does not host MRNs/names and eLAB converts these to MCCPR assigned record identification numbers (record_id) before import for de-identification.

    Functions were written to remap EHR bulk lab data pulls/queries from several sources, including Clarity/Crystal reports or an institutional EDW such as the Research Patient Data Registry (RPDR) at MGB. The input, a csv/delimited file of labs for user-defined patients, may vary. Thus, users may need to adapt the initial data wrangling script based on the data input format. However, the downstream transformation, code-lab lookup tables, outcomes analysis, and LOINC remapping are standard for use with the provided REDCap Data Dictionary, DataDictionary_eLAB.csv. The available R-markdown (https://github.com/TheMillerLab/eLAB) provides suggestions and instructions on where or when upfront script modifications may be necessary to accommodate input variability.

    The eLAB pipeline takes several inputs. For example, the input for use with the ‘ehr_format(dt)’ single-line command is non-tabular data assigned as R object ‘dt’ with 4 columns: 1) Patient Name (MRN), 2) Collection Date, 3) Collection Time, and 4) Lab Results wherein several lab panels are in one data frame cell. A mock dataset in this ‘untidy-format’ is provided for demonstration purposes (https://github.com/TheMillerLab/eLAB).

    Bulk lab data pulls often result in subtypes of the same lab. For example, potassium labs are reported as “Potassium,” “Potassium-External,” “Potassium(POC),” “Potassium,whole-bld,” “Potassium-Level-External,” “Potassium,venous,” and “Potassium-whole-bld/plasma.” eLAB utilizes a key-value lookup table with ~300 lab subtypes for remapping labs to the Data Dictionary (DD) code. eLAB reformats/accepts only those lab units pre-defined by the registry DD. The lab lookup table is provided for direct use or may be re-configured/updated to meet end-user specifications. eLAB is designed to remap, transform, and filter/adjust value units of semi-structured/structured bulk laboratory values data pulls from the EHR to align with the pre-defined code of the DD.
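
    An illustrative sketch (not eLAB's actual code) of the key-value remapping this describes, using a small dplyr lookup join; the DD code "potassium" is a placeholder:

    library(dplyr)

    lookup <- tibble::tribble(
      ~raw_lab_name,          ~dd_code,
      "Potassium",            "potassium",
      "Potassium-External",   "potassium",
      "Potassium(POC)",       "potassium",
      "Potassium,whole-bld",  "potassium"
    )

    labs_raw <- tibble::tibble(
      record_id    = c(1, 1, 2),
      raw_lab_name = c("Potassium", "Potassium(POC)", "Potassium-External"),
      value        = c(4.1, 3.9, 4.4)
    )

    # inner_join keeps only lab subtypes defined in the lookup table,
    # mirroring the filter-and-remap behaviour described above
    labs_mapped <- inner_join(labs_raw, lookup, by = "raw_lab_name")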

    Data Dictionary (DD)

    EHR clinical laboratory data is captured in REDCap using the ‘Labs’ repeating instrument (Supplemental Figures 1-2). The DD is provided for use by researchers at REDCap-participating institutions and is optimized to accommodate the same lab-type captured more than once on the same day for the same patient. The instrument captures 35 clinical lab types. The DD serves several major purposes in the eLAB pipeline. First, it defines every lab type of interest and associated lab unit of interest with a set field/variable name. It also restricts/defines the type of data allowed for entry for each data field, such as a string or numerics. The DD is uploaded into REDCap by every participating site/collaborator and ensures each site collects and codes the data the same way. Automation pipelines, such as eLAB, are designed to remap/clean and reformat data/units utilizing key-value look-up tables that filter and select only the labs/units of interest. eLAB ensures the data pulled from the EHR contains the correct unit and format pre-configured by the DD. The use of the same DD at every participating site ensures that the data field code, format, and relationships in the database are uniform across each site to allow for the simple aggregation of the multi-site data. For example, since every site in the MCCPR uses the same DD, aggregation is efficient and different site csv files are simply combined.
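
    For orientation, a small sketch of inspecting the provided dictionary in R; it assumes the standard REDCap dictionary columns ("Variable / Field Name", "Field Type"):

    library(dplyr)

    dd <- readr::read_csv("DataDictionary_eLAB.csv")

    count(dd, `Field Type`)                    # how many fields of each REDCap type
    head(select(dd, `Variable / Field Name`))  # field names every site must share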

    Study Cohort

    This study was approved by the MGB IRB. A search of the EHR was performed to identify patients diagnosed with MCC between 1975-2021 (N=1,109) for inclusion in the MCCPR. Subjects diagnosed with primary cutaneous MCC between 2016-2019 (N=176) were included in the test cohort for exploratory studies of lab result associations with overall survival (OS) using eLAB.

    Statistical Analysis

    OS is defined as the time from date of MCC diagnosis to date of death. Data was censored at the date of the last follow-up visit if no death event occurred. Univariable Cox proportional hazard modeling was performed among all lab predictors. Due to the hypothesis-generating nature of the work, p-values were exploratory and Bonferroni corrections were not applied.
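
    A hedged sketch of one such univariable model with the survival package (already among eLAB's dependencies); the toy data and variable names are placeholders:

    library(survival)

    cohort_df <- data.frame(
      os_months = c(12, 30, 7, 48, 22, 15),
      os_event  = c(1, 0, 1, 0, 1, 0),    # 1 = death observed, 0 = censored at last follow-up
      lab_value = c(4.1, 3.8, 5.0, 4.3, 4.7, 3.9)
    )

    fit <- coxph(Surv(os_months, os_event) ~ lab_value, data = cohort_df)
    summary(fit)   # hazard ratio and exploratory p-value for this single lab predictor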

  4. The raw n-grams dataset for Rajeg et al.’s (2022) “The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture”

    • figshare.com
    txt
    Updated Sep 30, 2024
    Cite
    Gede Primahadi Wijaya Rajeg (2024). The raw n-grams dataset for Rajeg et al.’s (2022) “The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture” [Dataset]. http://doi.org/10.6084/m9.figshare.27138921.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Sep 30, 2024
    Dataset provided by
    figshare
    Authors
    Gede Primahadi Wijaya Rajeg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    How to cite
    Rajeg, Gede Primahadi Wijaya (2024). The raw n-grams dataset for Rajeg et al.’s (2022) “The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture”. figshare. Dataset. https://doi.org/10.6084/m9.figshare.27138921

    Overview
    A dataset of non-tabulated (raw) n-grams (from 2-grams up to 5-grams) derived from a corpus file in the Indonesian Leipzig Corpora Collection (ILCC), namely “ind_newscrawl_2016_1M-sentences.txt”, the latest addition to the ILCC when the project associated with the generation of these n-grams was started in 2018. These large datasets were generated using R via one of Monash University’s high-performance computing facilities, MonARCH. The datasets became the basis for the linguistic analyses in the following publication: Rajeg, Gede Primahadi Wijaya, Poppy Siahaan & Alice Gaby. 2022. The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture. Linguistik Indonesia 40(1). 1–24. https://doi.org/10.26499/li.v40i1.297.

    This repository also includes the R scripts used to create the n-grams. The key R package used to produce the n-grams (including the corpus tokenisation) is quanteda (Benoit et al. 2018), supported by the suite of R packages from the tidyverse (Wickham et al. 2019), tidytext (Silge & Robinson 2017), and corplingr (Rajeg 2021). Line 60 onwards in the file R-script-ngram-creation-2-4-grams.R shows how to search/filter and tabulate the n-gram frequency for a given time noun (i.e., tahun ‘year’ in the example).

    References
    Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach (First edition). O’Reilly.
    Benoit et al. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software, 3(30), 774. https://doi.org/10.21105/joss.00774. https://quanteda.io.
    Wickham et al. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686.
    Rajeg, G. P. W. (2021). corplingr: Tidy concordances, collocates, and wordlist. Open Science Framework (OSF). https://doi.org/10.17605/OSF.IO/X8CW4. https://github.com/gederajeg/corplingr/.
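
    A minimal sketch (not the repository's exact script) of how 2- to 5-grams can be generated with quanteda from a plain-text corpus file; it assumes one sentence per line in the input:

    library(quanteda)

    sentences <- readr::read_lines("ind_newscrawl_2016_1M-sentences.txt")

    toks   <- tokens(sentences, remove_punct = TRUE)
    ngrams <- tokens_ngrams(toks, n = 2:5, concatenator = " ")

    # e.g. inspect n-grams containing the time noun "tahun" ('year')
    all_ngrams <- unlist(ngrams, use.names = FALSE)
    head(all_ngrams[grepl("\\btahun\\b", all_ngrams)], 10)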

  5. Brisbane Library Checkout Data

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Jan 24, 2020
    Cite
    Nicholas Tierney; Nicholas Tierney (2020). Brisbane Library Checkout Data [Dataset]. http://doi.org/10.5281/zenodo.2437860
    Explore at:
    Available download formats: bin, application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nicholas Tierney; Nicholas Tierney
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This has been copied from the README.md file

    bris-lib-checkout

    This provides tidied up data from the Brisbane library checkouts

    Retrieving and cleaning the data

    The script for retrieving and cleaning the data is made available in scrape-library.R.

    The data

    • The data/ folder contains the tidy data
    • The data-raw/ folder contains the raw data

    data/

    This contains four tidied up dataframes:

    • tidy-brisbane-library-checkout.csv
    • metadata_branch.csv
    • metadata_heading.csv
    • metadata_item_type.csv

    tidy-brisbane-library-checkout.csv contains the following columns, with the metadata file metadata_heading containing the description of these columns.

    knitr::kable(readr::read_csv("data/metadata_heading.csv"))
    #> Parsed with column specification:
    #> cols(
    #> heading = col_character(),
    #> heading_explanation = col_character()
    #> )

    | heading          | heading_explanation                         |
    |------------------|---------------------------------------------|
    | Title            | Title of Item                               |
    | Author           | Author of Item                              |
    | Call Number      | Call Number of Item                         |
    | Item id          | Unique Item Identifier                      |
    | Item Type        | Type of Item (see next column)              |
    | Status           | Current Status of Item                      |
    | Language         | Published language of item (if not English) |
    | Age              | Suggested audience                          |
    | Checkout Library | Checkout branch                             |
    | Date             | Checkout date                               |

    We also added year, month, and day columns.

    The remaining data are all metadata files that contain meta information on the columns in the checkout data:

    library(tidyverse)
    #> ── Attaching packages ────────────── tidyverse 1.2.1 ──
    #> ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
    #> ✔ tibble 1.4.99.9006 ✔ dplyr 0.7.8
    #> ✔ tidyr 0.8.2 ✔ stringr 1.3.1
    #> ✔ readr 1.3.0 ✔ forcats 0.3.0
    #> ── Conflicts ───────────────── tidyverse_conflicts() ──
    #> ✖ dplyr::filter() masks stats::filter()
    #> ✖ dplyr::lag() masks stats::lag()
    knitr::kable(readr::read_csv("data/metadata_branch.csv"))
    #> Parsed with column specification:
    #> cols(
    #> branch_code = col_character(),
    #> branch_heading = col_character()
    #> )

    | branch_code | branch_heading          |
    |-------------|-------------------------|
    | ANN         | Annerley                |
    | ASH         | Ashgrove                |
    | BNO         | Banyo                   |
    | BRR         | BrackenRidge            |
    | BSQ         | Brisbane Square Library |
    | BUL         | Bulimba                 |
    | CDA         | Corinda                 |
    | CDE         | Chermside               |
    | CNL         | Carindale               |
    | CPL         | Coopers Plains          |
    | CRA         | Carina                  |
    | EPK         | Everton Park            |
    | FAI         | Fairfield               |
    | GCY         | Garden City             |
    | GNG         | Grange                  |
    | HAM         | Hamilton                |
    | HPK         | Holland Park            |
    | INA         | Inala                   |
    | IPY         | Indooroopilly           |
    | MBG         | Mt. Coot-tha            |
    | MIT         | Mitchelton              |
    | MTG         | Mt. Gravatt             |
    | MTO         | Mt. Ommaney             |
    | NDH         | Nundah                  |
    | NFM         | New Farm                |
    | SBK         | Sunnybank Hills         |
    | SCR         | Stones Corner           |
    | SGT         | Sandgate                |
    | VAN         | Mobile Library          |
    | TWG         | Toowong                 |
    | WND         | West End                |
    | WYN         | Wynnum                  |
    | ZIL         | Zillmere                |

    knitr::kable(readr::read_csv("data/metadata_item_type.csv"))
    #> Parsed with column specification:
    #> cols(
    #> item_type_code = col_character(),
    #> item_type_explanation = col_character()
    #> )

    | item_type_code | item_type_explanation                     |
    |----------------|-------------------------------------------|
    | AD-FICTION     | Adult Fiction                             |
    | AD-MAGS        | Adult Magazines                           |
    | AD-PBK         | Adult Paperback                           |
    | BIOGRAPHY      | Biography                                 |
    | BSQCDMUSIC     | Brisbane Square CD Music                  |
    | BSQCD-ROM      | Brisbane Square CD Rom                    |
    | BSQ-DVD        | Brisbane Square DVD                       |
    | CD-BOOK        | Compact Disc Book                         |
    | CD-MUSIC       | Compact Disc Music                        |
    | CD-ROM         | CD Rom                                    |
    | DVD            | DVD                                       |
    | DVD_R18+       | DVD Restricted - 18+                      |
    | FASTBACK       | Fastback                                  |
    | GAYLESBIAN     | Gay and Lesbian Collection                |
    | GRAPHICNOV     | Graphic Novel                             |
    | ILL            | InterLibrary Loan                         |
    | JU-FICTION     | Junior Fiction                            |
    | JU-MAGS        | Junior Magazines                          |
    | JU-PBK         | Junior Paperback                          |
    | KITS           | Kits                                      |
    | LARGEPRINT     | Large Print                               |
    | LGPRINTMAG     | Large Print Magazine                      |
    | LITERACY       | Literacy                                  |
    | LITERACYAV     | Literacy Audio Visual                     |
    | LOCSTUDIES     | Local Studies                             |
    | LOTE-BIO       | Languages Other than English Biography    |
    | LOTE-BOOK      | Languages Other than English Book         |
    | LOTE-CDMUS     | Languages Other than English CD Music     |
    | LOTE-DVD       | Languages Other than English DVD          |
    | LOTE-MAG       | Languages Other than English Magazine     |
    | LOTE-TB        | Languages Other than English Taped Book   |
    | MBG-DVD        | Mt Coot-tha Botanical Gardens DVD         |
    | MBG-MAG        | Mt Coot-tha Botanical Gardens Magazine    |
    | MBG-NF         | Mt Coot-tha Botanical Gardens Non Fiction |
    | MP3-BOOK       | MP3 Audio Book                            |
    | NONFIC-SET     | Non Fiction Set                           |
    | NONFICTION     | Non Fiction                               |
    | PICTURE-BK     | Picture Book                              |
    | PICTURE-NF     | Picture Book Non Fiction                  |
    | PLD-BOOK       | Public Libraries Division Book            |
    | YA-FICTION     | Young Adult Fiction                       |
    | YA-MAGS        | Young Adult Magazine                      |
    | YA-PBK         | Young Adult Paperback                     |

    Example usage

    Let’s explore the data

    bris_libs <- readr::read_csv("data/bris-lib-checkout.csv")
    #> Parsed with column specification:
    #> cols(
    #> title = col_character(),
    #> author = col_character(),
    #> call_number = col_character(),
    #> item_id = col_double(),
    #> item_type = col_character(),
    #> status = col_character(),
    #> language = col_character(),
    #> age = col_character(),
    #> library = col_character(),
    #> date = col_double(),
    #> datetime = col_datetime(format = ""),
    #> year = col_double(),
    #> month = col_double(),
    #> day = col_character()
    #> )
    #> Warning: 20 parsing failures.
    #> row col expected actual file
    #> 587795 item_id a double REFRESH 'data/bris-lib-checkout.csv'
    #> 590579 item_id a double REFRESH 'data/bris-lib-checkout.csv'
    #> 590597 item_id a double REFRESH 'data/bris-lib-checkout.csv'
    #> 595774 item_id a double REFRESH 'data/bris-lib-checkout.csv'
    #> 597567 item_id a double REFRESH 'data/bris-lib-checkout.csv'
    #> ...... ....... ........ ....... ............................
    #> See problems(...) for more details.

    We can count the number of titles, item types, suggested age, and the library given:

    library(dplyr)
    count(bris_libs, title, sort = TRUE)
    #> # A tibble: 121,046 x 2
    #> title n
    #>
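
    As a further illustrative step (a sketch, not from the README itself), the item-type counts can be joined to the metadata file to label each code:

    library(dplyr)

    item_types <- readr::read_csv("data/metadata_item_type.csv")

    bris_libs %>%
      count(item_type, sort = TRUE) %>%
      left_join(item_types, by = c("item_type" = "item_type_code"))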

    License

    This data is provided under a CC BY 4.0 license

    It has been downloaded from Brisbane library checkouts, and tidied up using the code in data-raw.

  6. Replication materials for Garg and Fetzer (2024) 'Political Expression of Academics on Social Media'

    • zenodo.org
    zip
    Updated Jun 7, 2024
    Cite
    Garg Prashant; Garg Prashant (2024). Replication materials for Garg and Fetzer (2024) 'Political Expression of Academics on Social Media' [Dataset]. http://doi.org/10.5281/zenodo.11522064
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 7, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Garg Prashant; Garg Prashant
    License

    Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0

    Time period covered
    2024
    Description


    # Replication Package for 'Political Expression of Academics on Social Media'
    A repository with replication material for the paper "Political Expression of Academics on Social Media" (2024) by Prashant Garg and Thiemo Fetzer

    ## Overview

    This replication package contains all necessary scripts and data to replicate the main figures and tables presented in the paper.

    ## Folder Structure

    ### 1. `1_scripts`

    This folder contains all scripts required to replicate the main figures and tables of the paper. The scripts are arranged in the order they should be run.

    - `0_init.Rmd`: An R Markdown file that installs and loads all packages necessary for the subsequent scripts.
    - `1_fig_1.Rmd`: Produces Figure 1 (Zipf's plots).
    - `2_fig_2_to_4.Rmd`: Produces Figures 2 to 4 (average levels of expression).
    - `3_fig_5_to_6.Rmd`: Produces Figures 5 to 6 (trends in expression).
    - `4_tab_1_to_3.Rmd`: Produces Tables 1 to 3 (descriptive tables).

    Expected run time for each script is under 2 minutes, requiring around 4 GB of RAM. Script `3_fig_5_to_6.Rmd` can take up to 3-4 minutes and requires up to 6 GB of RAM. For a first-time user, installing each package may take around 2 minutes, except 'tidyverse', which may take around 4 minutes.

    We have not provided a demo, since the actual dataset used for analysis is small enough, and the computations efficient enough, to be run on most systems.

    Each script starts with a layperson explanation to overview the functionality of the code and a pseudocode for a detailed procedure, followed by the actual code.

    ### 2. `2_data`

    This folder contains data used to replicate the main results. The data is called by the respective scripts automatically using relative paths.

    - `data_dictionary.txt`: Provides a description of all variables as they are coded in the various datasets, especially the main author by time level dataset called `repl_df.csv`.
    - Processed, aggregated measures at the individual author by time (year by month) level are provided, as the raw data containing raw tweets cannot be shared.

    ## Installation Instructions

    ### Prerequisites

    This project uses R and RStudio. Make sure you have the following installed:

    - [R](https://cran.r-project.org/) (version 4.0.0 or later)
    - [RStudio](https://www.rstudio.com/products/rstudio/download/)

    Once installed, to ensure the correct versions of the required packages are installed, use the following R markdown script '0_init.Rmd'. This script will install the `remotes` package (if not already installed) and then install the specified versions of the required packages.
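
    An illustrative chunk in the spirit of `0_init.Rmd`; the package names and version numbers below are placeholders, and `0_init.Rmd` holds the authoritative list:

    if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")

    # Pin specific package versions (placeholder versions shown)
    remotes::install_version("tidyverse", version = "2.0.0")
    remotes::install_version("data.table", version = "1.14.8")

    library(tidyverse)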

    ## Running the Scripts
    Open `0_init.Rmd` in RStudio and run all chunks to install and load the required packages.
    Run the remaining scripts (`1_fig_1.Rmd`, `2_fig_2_to_4.Rmd`, `3_fig_5_to_6.Rmd`, and `4_tab_1_to_3.Rmd`) in the order they are listed to reproduce the figures and tables from the paper.

    # Contact
    For any questions, feel free to contact Prashant Garg at prashant.garg@imperial.ac.uk.

    # License

    This project is licensed under the Apache License 2.0 - see the license.txt file for details.
