Trend Detection and Forecasting
This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the second part of a two-part exercise focusing on time series analysis.
Introduction
Time series are a special class of dataset in which a response variable is tracked over time. Time series analysis is a powerful technique for understanding temporal patterns in data by decomposing a series into underlying components, such as trend, seasonal, and irregular variation. Time series analysis can also be used to predict how levels of a variable will change in the future, taking into account what has happened in the past.
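To make these two ideas concrete, here is a minimal base-R sketch (using R's built-in monthly `co2` series rather than this lesson's hydrologic data): `stl()` decomposes a series into seasonal, trend, and remainder components, and a Holt-Winters fit produces a simple forecast.

```r
# Decompose the built-in monthly CO2 series into seasonal, trend,
# and remainder components
decomposition <- stl(co2, s.window = "periodic")
plot(decomposition)

# Forecast the next 12 months with a Holt-Winters model
fit <- HoltWinters(co2)
predict(fit, n.ahead = 12)
```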
Learning Objectives
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Author: Andrew J. Felton
Date: 11/15/2024
This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:
"Global estimates of the storage and transit time of water through vegetation"
Please note that 'turnover' and 'transit' are used interchangeably. Also note that this R project has been updated multiple times as the analysis was revised throughout the peer-review process.
# Data information
The data folder contains key data sets used for analysis. In particular:
"data/turnover_from_python/updated/august_2024_lc/" contains the core datasets used in this study including global arrays summarizing five year (2016-2020) averages of mean (annual) and minimum (monthly) transit time, storage, canopy transpiration, and number of months of data able as both an array (.nc) or data table (.csv). These data were produced in python using the python scripts found in the "supporting_code" folder. The remaining files in the "data" and "data/supporting_data" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here. The "supporting_data"" folder also contains annual (2016-2020) MODIS land cover data used in the analysis and contains separate filters containing the original data (.hdf) and then the final process (filtered) data in .nc format. The resulting annual land cover distributions were used in the pre-processing of data in python.
# Code information
Python scripts can be found in the "supporting_code" folder.
Each R script in this project has a role:
"01_start.R": This script sets the working directory, loads in the tidyverse package (the remaining packages in this project are called using the `::` operator), and can run two other scripts: one that loads the customized functions (02_functions.R) and one for importing and processing the key dataset for this analysis (03_import_data.R).
"02_functions.R": This script contains custom functions. Load this using the `source()` function in the 01_start.R script.
"03_import_data.R": This script imports and processes the .csv transit data. It joins the mean (annual) transit time data with the minimum (monthly) transit data to generate one dataset for analysis: annual_turnover_2. Load this using the
`source()` function in the 01_start.R script.
"04_figures_tables.R": This is the main workhouse for figure/table production and supporting analyses. This script generates the key figures and summary statistics used in the study that then get saved in the "manuscript_figures" folder. Note that all maps were produced using Python code found in the "supporting_code"" folder. Also note that within the "manuscript_figures" folder there is an "extended_data" folder, which contains tables of the summary statistics (e.g., quartiles and sample sizes) behind figures containing box plots or depicting regression coefficients.
"supporting_generate_data.R": This script processes supporting data used in the analysis, primarily the varying ground-based datasets of leaf water content.
"supporting_process_land_cover.R": This takes annual MODIS land cover distributions and processes them through a multi-step filtering process so that they can be used in preprocessing of datasets in python.
License: CC0 1.0, https://spdx.org/licenses/CC0-1.0.html
Objective: To develop a clinical informatics pipeline designed to capture large-scale structured EHR data for a national patient registry.
Materials and Methods: The EHR-R-REDCap pipeline is implemented using R-statistical software to remap and import structured EHR data into the REDCap-based multi-institutional Merkel Cell Carcinoma (MCC) Patient Registry using an adaptable data dictionary.
Results: Clinical laboratory data were extracted from EPIC Clarity across several participating institutions. Labs were transformed, remapped, and imported into the MCC registry using the EHR labs abstraction (eLAB) pipeline. Forty-nine clinical tests encompassing 482,450 results were imported into the registry for 1,109 enrolled MCC patients. Data-quality assessment revealed highly accurate, valid labs. Univariate modeling of baseline labs against overall survival (N=176) was performed using this clinical informatics pipeline.
Conclusion: We demonstrate the feasibility of the facile eLAB workflow. EHR data were successfully transformed and bulk-loaded/imported into a REDCap-based national registry, enabling real-world data analysis and interoperability.
Methods eLAB Development and Source Code (R statistical software):
eLAB is written in R (version 4.0.3), and utilizes the following packages for processing: DescTools, REDCapR, reshape2, splitstackshape, readxl, survival, survminer, and tidyverse. Source code for eLAB can be downloaded directly (https://github.com/TheMillerLab/eLAB).
eLAB reformats EHR data abstracted for an identified population of patients (e.g., a medical record number (MRN)/name list) under an Institutional Review Board (IRB)-approved protocol. The MCCPR does not host MRNs/names; eLAB converts these to MCCPR-assigned record identification numbers (record_id) before import, de-identifying the data.
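A minimal sketch of this de-identification step, assuming a simple lookup table; the object and column names below are hypothetical, not taken from the eLAB source code.

```r
library(dplyr)

# Hypothetical bulk lab pull keyed by MRN
labs_raw <- tibble(
  mrn   = c("12345", "12345", "67890"),
  lab   = c("Potassium", "Sodium", "Potassium"),
  value = c(4.1, 139, 3.8)
)

# Hypothetical key mapping each MRN to its MCCPR-assigned record_id
key <- tibble(
  mrn       = c("12345", "67890"),
  record_id = c(1, 2)
)

labs_deidentified <- labs_raw %>%
  left_join(key, by = "mrn") %>%  # attach the registry record_id
  select(-mrn)                    # drop the identifier before import
```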
Functions were written to remap EHR bulk lab data pulls/queries from several sources, including Clarity/Crystal reports or institutional EDWs such as the Research Patient Data Registry (RPDR) at MGB. The input, a csv/delimited file of labs for user-defined patients, may vary. Thus, users may need to adapt the initial data wrangling script based on the input format. However, the downstream transformation, code-lab lookup tables, outcomes analysis, and LOINC remapping are standard for use with the provided REDCap Data Dictionary, DataDictionary_eLAB.csv. The available R Markdown (https://github.com/TheMillerLab/eLAB) provides suggestions and instructions on where or when upfront script modifications may be necessary to accommodate input variability.
The eLAB pipeline takes several inputs. For example, the input for use with the ‘ehr_format(dt)’ single-line command is non-tabular data assigned as R object ‘dt’ with 4 columns: 1) Patient Name (MRN), 2) Collection Date, 3) Collection Time, and 4) Lab Results wherein several lab panels are in one data frame cell. A mock dataset in this ‘untidy-format’ is provided for demonstration purposes (https://github.com/TheMillerLab/eLAB).
Bulk lab data pulls often result in subtypes of the same lab. For example, potassium labs are reported as “Potassium,” “Potassium-External,” “Potassium(POC),” “Potassium,whole-bld,” “Potassium-Level-External,” “Potassium,venous,” and “Potassium-whole-bld/plasma.” eLAB utilizes a key-value lookup table with ~300 lab subtypes for remapping labs to the Data Dictionary (DD) code. eLAB reformats/accepts only those lab units pre-defined by the registry DD. The lab lookup table is provided for direct use or may be re-configured/updated to meet end-user specifications. eLAB is designed to remap, transform, and filter/adjust value units of semi-structured/structured bulk laboratory values data pulls from the EHR to align with the pre-defined code of the DD.
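The remapping pattern can be sketched as a join against the lookup table; the table contents and column names here are hypothetical stand-ins for the ~300-row lookup table shipped with eLAB.

```r
library(dplyr)

# Hypothetical excerpt of the key-value lookup table
lab_lookup <- tibble(
  ehr_name = c("Potassium", "Potassium-External", "Potassium(POC)",
               "Potassium,whole-bld", "Potassium,venous"),
  dd_code  = "potassium"   # code pre-defined by the registry DD
)

# Hypothetical bulk pull: the inner join remaps matching subtypes to the
# DD code and drops labs the DD does not define
labs_raw <- tibble(
  ehr_name = c("Potassium(POC)", "Potassium,venous", "Unmapped-Lab"),
  value    = c(4.1, 3.9, 7.2)
)

labs_remapped <- inner_join(labs_raw, lab_lookup, by = "ehr_name")
```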
Data Dictionary (DD)
EHR clinical laboratory data are captured in REDCap using the 'Labs' repeating instrument (Supplemental Figures 1-2). The DD is provided for use by researchers at REDCap-participating institutions and is optimized to accommodate the same lab type captured more than once on the same day for the same patient. The instrument captures 35 clinical lab types. The DD serves several major purposes in the eLAB pipeline. First, it defines every lab type of interest and its associated lab unit with a set field/variable name. It also restricts/defines the type of data allowed for entry in each data field, such as strings or numerics. The DD is uploaded into REDCap by every participating site/collaborator and ensures each site collects and codes the data the same way. Automation pipelines, such as eLAB, are designed to remap/clean and reformat data/units utilizing key-value lookup tables that filter and select only the labs/units of interest. eLAB ensures the data pulled from the EHR contain the correct unit and format pre-configured by the DD. The use of the same DD at every participating site ensures that the data field codes, formats, and relationships in the database are uniform across sites, allowing simple aggregation of the multi-site data. For example, since every site in the MCCPR uses the same DD, aggregation is efficient and csv files from different sites are simply combined.
Study Cohort
This study was approved by the MGB IRB. A search of the EHR was performed to identify patients diagnosed with MCC between 1975 and 2021 (N=1,109) for inclusion in the MCCPR. Subjects diagnosed with primary cutaneous MCC between 2016 and 2019 (N=176) were included in the test cohort for exploratory studies of lab result associations with overall survival (OS) using eLAB.
Statistical Analysis
OS was defined as the time from the date of MCC diagnosis to the date of death. Data were censored at the date of the last follow-up visit if no death event occurred. Univariable Cox proportional hazards modeling was performed for all lab predictors. Due to the hypothesis-generating nature of the work, p-values were exploratory and Bonferroni corrections were not applied.
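A minimal sketch of one such univariable model with the survival package (already listed among eLAB's dependencies); the data frame and column names are hypothetical.

```r
library(survival)

# Hypothetical baseline cohort:
#   os_months - time from MCC diagnosis to death or last follow-up
#   death     - 1 = death event, 0 = censored
cohort <- data.frame(
  os_months = c(12, 30, 45, 8, 60),
  death     = c(1, 0, 1, 1, 0),
  potassium = c(4.1, 3.9, 4.5, 5.2, 4.0)
)

fit <- coxph(Surv(os_months, death) ~ potassium, data = cohort)
summary(fit)  # hazard ratio and exploratory (uncorrected) p-value
```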
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
How to cite
Rajeg, Gede Primahadi Wijaya (2024). The raw n-grams dataset for Rajeg et al.'s (2022) "The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture". figshare. Dataset. https://doi.org/10.6084/m9.figshare.27138921

Overview
A dataset of non-tabulated (raw) n-grams (from 2-grams up to 5-grams) derived from a corpus file in the Indonesian Leipzig Corpora Collection (ILCC), namely "ind_newscrawl_2016_1M-sentences.txt", the latest addition to the ILCC when the project associated with the generation of these n-grams started in 2018. These large datasets were generated using R via one of Monash University's high-performance computing facilities, MonARCH. The datasets became the basis for the linguistic analyses in the following publication:
Rajeg, Gede Primahadi Wijaya, Poppy Siahaan & Alice Gaby. 2022. The Spatial Construal of TIME in Indonesian: Evidence from Language and Gesture. Linguistik Indonesia 40(1). 1–24. https://doi.org/10.26499/li.v40i1.297
This repository also includes the R scripts used to create the n-grams. The key R package for producing the n-grams (including the corpus tokenisation) is quanteda (Benoit et al. 2018), supported by the suite of R packages from the tidyverse (Wickham et al. 2019), tidytext (Silge & Robinson 2017), and corplingr (Rajeg 2021). Line 60 onwards in the file R-script-ngram-creation-2-4-grams.R shows how to search/filter and tabulate the n-gram frequency for a given time noun (i.e., tahun 'year' in the example).

References
Benoit, K., et al. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software 3(30), 774. https://doi.org/10.21105/joss.00774. https://quanteda.io
Rajeg, G. P. W. (2021). corplingr: Tidy concordances, collocates, and wordlist. Open Science Framework (OSF). https://doi.org/10.17605/OSF.IO/X8CW4. https://github.com/gederajeg/corplingr/
Silge, J. & Robinson, D. (2017). Text mining with R: A tidy approach (First edition). O'Reilly.
Wickham, H., et al. (2019). Welcome to the Tidyverse. Journal of Open Source Software 4(43), 1686. https://doi.org/10.21105/joss.01686
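A minimal sketch of the n-gram creation step with quanteda, assuming the corpus has been read in as a character vector of sentences; the toy sentences below are placeholders, and the exact arguments used in the repository's scripts may differ.

```r
library(quanteda)

# Placeholder corpus: one sentence per element
sentences <- c("saya pergi tahun depan", "tahun lalu hujan deras")

toks <- tokens(sentences, remove_punct = TRUE)
ngrams <- tokens_ngrams(toks, n = 2:5, concatenator = " ")

# Tabulate n-gram frequencies, then filter for a time noun such as 'tahun'
freq <- sort(table(unlist(as.list(ngrams))), decreasing = TRUE)
freq[grepl("\\btahun\\b", names(freq))]
```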
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This has been copied from the README.md file
bris-lib-checkout
This provides tidied up data from the Brisbane library checkouts
Retrieving and cleaning the data
The script for retrieving and cleaning the data is made available in scrape-library.R.
The data
data/
This contains four tidied up dataframes:
tidy-brisbane-library-checkout.csv contains the following columns, with the metadata file metadata_heading containing the description of these columns.
knitr::kable(readr::read_csv("data/metadata_heading.csv"))
#> Parsed with column specification:
#> cols(
#> heading = col_character(),
#> heading_explanation = col_character()
#> )
| heading | heading_explanation |
|---|---|
| Title | Title of Item |
| Author | Author of Item |
| Call Number | Call Number of Item |
| Item id | Unique Item Identifier |
| Item Type | Type of Item (see next column) |
| Status | Current Status of Item |
| Language | Published language of item (if not English) |
| Age | Suggested audience |
| Checkout Library | Checkout branch |
| Date | Checkout date |
We also added year, month, and day columns.
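A sketch of how such columns can be derived from the checkout datetime with lubridate; this is an illustration, not the repository's scrape-library.R code (note that `day` is stored as a character column in the CSV).

```r
library(dplyr)
library(lubridate)

checkouts <- tibble(datetime = as.POSIXct(c("2018-01-03 10:15:00",
                                            "2018-02-14 16:40:00")))

checkouts <- checkouts %>%
  mutate(
    year  = year(datetime),
    month = month(datetime),
    day   = as.character(wday(datetime, label = TRUE))  # e.g. "Wed"
  )
```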
The remaining data are all metadata files that contain meta information on the columns in the checkout data:
```r
library(tidyverse)
#> ── Attaching packages ────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 3.1.0           ✔ purrr   0.2.5
#> ✔ tibble  1.4.99.9006     ✔ dplyr   0.7.8
#> ✔ tidyr   0.8.2           ✔ stringr 1.3.1
#> ✔ readr   1.3.0           ✔ forcats 0.3.0
#> ── Conflicts ───────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
```
knitr::kable(readr::read_csv("data/metadata_branch.csv"))
#> Parsed with column specification:
#> cols(
#> branch_code = col_character(),
#> branch_heading = col_character()
#> )
| branch_code | branch_heading |
|---|---|
| ANN | Annerley |
| ASH | Ashgrove |
| BNO | Banyo |
| BRR | BrackenRidge |
| BSQ | Brisbane Square Library |
| BUL | Bulimba |
| CDA | Corinda |
| CDE | Chermside |
| CNL | Carindale |
| CPL | Coopers Plains |
| CRA | Carina |
| EPK | Everton Park |
| FAI | Fairfield |
| GCY | Garden City |
| GNG | Grange |
| HAM | Hamilton |
| HPK | Holland Park |
| INA | Inala |
| IPY | Indooroopilly |
| MBG | Mt. Coot-tha |
| MIT | Mitchelton |
| MTG | Mt. Gravatt |
| MTO | Mt. Ommaney |
| NDH | Nundah |
| NFM | New Farm |
| SBK | Sunnybank Hills |
| SCR | Stones Corner |
| SGT | Sandgate |
| VAN | Mobile Library |
| TWG | Toowong |
| WND | West End |
| WYN | Wynnum |
| ZIL | Zillmere |
knitr::kable(readr::read_csv("data/metadata_item_type.csv"))
#> Parsed with column specification:
#> cols(
#> item_type_code = col_character(),
#> item_type_explanation = col_character()
#> )
| item_type_code | item_type_explanation |
|---|---|
| AD-FICTION | Adult Fiction |
| AD-MAGS | Adult Magazines |
| AD-PBK | Adult Paperback |
| BIOGRAPHY | Biography |
| BSQCDMUSIC | Brisbane Square CD Music |
| BSQCD-ROM | Brisbane Square CD Rom |
| BSQ-DVD | Brisbane Square DVD |
| CD-BOOK | Compact Disc Book |
| CD-MUSIC | Compact Disc Music |
| CD-ROM | CD Rom |
| DVD | DVD |
| DVD_R18+ | DVD Restricted - 18+ |
| FASTBACK | Fastback |
| GAYLESBIAN | Gay and Lesbian Collection |
| GRAPHICNOV | Graphic Novel |
| ILL | InterLibrary Loan |
| JU-FICTION | Junior Fiction |
| JU-MAGS | Junior Magazines |
| JU-PBK | Junior Paperback |
| KITS | Kits |
| LARGEPRINT | Large Print |
| LGPRINTMAG | Large Print Magazine |
| LITERACY | Literacy |
| LITERACYAV | Literacy Audio Visual |
| LOCSTUDIES | Local Studies |
| LOTE-BIO | Languages Other than English Biography |
| LOTE-BOOK | Languages Other than English Book |
| LOTE-CDMUS | Languages Other than English CD Music |
| LOTE-DVD | Languages Other than English DVD |
| LOTE-MAG | Languages Other than English Magazine |
| LOTE-TB | Languages Other than English Taped Book |
| MBG-DVD | Mt Coot-tha Botanical Gardens DVD |
| MBG-MAG | Mt Coot-tha Botanical Gardens Magazine |
| MBG-NF | Mt Coot-tha Botanical Gardens Non Fiction |
| MP3-BOOK | MP3 Audio Book |
| NONFIC-SET | Non Fiction Set |
| NONFICTION | Non Fiction |
| PICTURE-BK | Picture Book |
| PICTURE-NF | Picture Book Non Fiction |
| PLD-BOOK | Public Libraries Division Book |
| YA-FICTION | Young Adult Fiction |
| YA-MAGS | Young Adult Magazine |
| YA-PBK | Young Adult Paperback |
Example usage
Let’s explore the data
```r
bris_libs <- readr::read_csv("data/bris-lib-checkout.csv")
#> Parsed with column specification:
#> cols(
#>   title = col_character(),
#>   author = col_character(),
#>   call_number = col_character(),
#>   item_id = col_double(),
#>   item_type = col_character(),
#>   status = col_character(),
#>   language = col_character(),
#>   age = col_character(),
#>   library = col_character(),
#>   date = col_double(),
#>   datetime = col_datetime(format = ""),
#>   year = col_double(),
#>   month = col_double(),
#>   day = col_character()
#> )
#> Warning: 20 parsing failures.
#>    row     col expected   actual file
#> 587795 item_id a double  REFRESH 'data/bris-lib-checkout.csv'
#> 590579 item_id a double  REFRESH 'data/bris-lib-checkout.csv'
#> 590597 item_id a double  REFRESH 'data/bris-lib-checkout.csv'
#> 595774 item_id a double  REFRESH 'data/bris-lib-checkout.csv'
#> 597567 item_id a double  REFRESH 'data/bris-lib-checkout.csv'
#> ...... ....... ........ ........ ............................
#> See problems(...) for more details.
```
We can count the number of titles, item types, suggested age, and the library given:
```r
library(dplyr)
count(bris_libs, title, sort = TRUE)
#> # A tibble: 121,046 x 2
#> title n
#>
```
License
This data is provided under a CC BY 4.0 license. It has been downloaded from Brisbane library checkouts and tidied up using the code in data-raw.
License: Apache License 2.0, http://www.apache.org/licenses/LICENSE-2.0
# Replication Package for 'Political Expression of Academics on Social Media'
A repository with replication material for the paper "Political Expression of Academics on Social Media" (2024) by Prashant Garg and Thiemo Fetzer
## Overview
This replication package contains all necessary scripts and data to replicate the main figures and tables presented in the paper.
## Folder Structure
### 1. `1_scripts`
This folder contains all scripts required to replicate the main figures and tables of the paper. The scripts are arranged in the order they should be run.
- `0_init.Rmd`: An R Markdown file that installs and loads all packages necessary for the subsequent scripts.
- `1_fig_1.Rmd`: Produces Figure 1 (Zipf's plots).
- `2_fig_2_to_4.Rmd`: Produces Figures 2 to 4 (average levels of expression).
- `3_fig_5_to_6.Rmd`: Produces Figures 5 to 6 (trends in expression).
- `4_tab_1_to_3.Rmd`: Produces Tables 1 to 3 (descriptive tables).
Expected run time for each script is under 2 minutes, requiring around 4GB of RAM; `3_fig_5_to_6.Rmd` can take 3-4 minutes and requires up to 6GB of RAM. Installing each package for the first time may take around 2 minutes, except 'tidyverse', which may take around 4 minutes.
We have not provided a demo, since the actual dataset used for analysis is small enough, and the computations efficient enough, to run on most systems.
Each script starts with a layperson explanation to overview the functionality of the code and a pseudocode for a detailed procedure, followed by the actual code.
### 2. `2_data`
This folder contains data used to replicate the main results. The data is called by the respective scripts automatically using relative paths.
- `data_dictionary.txt`: Provides a description of all variables as they are coded in the various datasets, especially the main author by time level dataset called `repl_df.csv`.
- Processed, aggregated measures at the individual author by time (year by month) level are provided, as the raw data containing raw tweets cannot be shared.
## Installation Instructions
### Prerequisites
This project uses R and RStudio. Make sure you have the following installed:
- [R](https://cran.r-project.org/) (version 4.0.0 or later)
- [RStudio](https://www.rstudio.com/products/rstudio/download/)
Once both are installed, run the R Markdown script `0_init.Rmd` to ensure the correct versions of the required packages are installed. This script will install the `remotes` package (if not already installed) and then install the specified versions of the required packages.
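The version-pinning pattern that `0_init.Rmd` uses can be sketched as follows; the package name and version number below are illustrative, not the versions actually pinned by the script.

```r
# Install remotes if missing, then pin a package version (illustrative)
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_version("dplyr", version = "1.1.4",
                         repos = "https://cran.r-project.org")
```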
## Running the Scripts
Open `0_init.Rmd` in RStudio and run all chunks to install and load the required packages.
Run the remaining scripts (`1_fig_1.Rmd`, `2_fig_2_to_4.Rmd`, `3_fig_5_to_6.Rmd`, and `4_tab_1_to_3.Rmd`) in the order they are listed to reproduce the figures and tables from the paper.
# Contact
For any questions, feel free to contact Prashant Garg at prashant.garg@imperial.ac.uk.
# License
This project is licensed under the Apache License 2.0 - see the license.txt file for details.