5 datasets found
  1. The banksia plot: a method for visually comparing point estimates and...

    • datasetcatalog.nlm.nih.gov
    • researchdata.edu.au
    • +1 more
    Updated Oct 15, 2024
    McKenzie, Joanne E.; Turner, Simon; Karahalios, Amalia; Korevaar, Elizabeth (2024). The banksia plot: a method for visually comparing point estimates and confidence intervals across datasets [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001315764
    Dataset updated
    Oct 15, 2024
    Authors
    McKenzie, Joanne E.; Turner, Simon; Karahalios, Amalia; Korevaar, Elizabeth
    Description

    Companion data for the creation of a banksia plot.

    Background: In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.

    Methods: The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. Firstly, the point estimates from reference analyses are centred to zero, followed by scaling confidence intervals to span a range of one. The point estimates and confidence intervals from matching comparator analyses are then adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the difference in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) data sets with widely varying characteristics, while the second example assesses data extraction accuracy comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs from the accompanying manuscripts.

    Results: In the banksia plot of statistical method comparison, it was clear that there was no difference, on average, in point estimates and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data it was clear that both the point estimates and confidence intervals were all very similar among data extractors and original data.

    Conclusions: The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.

    This collection of files allows the user to create the images used in the companion paper and amend this code to create their own banksia plots using either Stata version 17 or R version 4.3.1.
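    The centring and scaling step described under Methods can be illustrated with a minimal Python sketch. It assumes each analysis is summarised by a point estimate and a lower and upper confidence limit; the names `Estimate` and `centre_and_scale` are hypothetical and are not taken from the companion Stata or R files, whose exact formulae may differ.

```python
from typing import NamedTuple, Tuple


class Estimate(NamedTuple):
    """Point estimate with lower and upper confidence limits (hypothetical type)."""
    point: float
    lower: float
    upper: float


def centre_and_scale(reference: Estimate, comparator: Estimate) -> Tuple[Estimate, Estimate]:
    """Centre the reference point estimate at zero and scale its CI to width one.

    The comparator is shifted and scaled by the same amounts, so the relative
    difference in point estimates and CI widths is preserved.
    """
    shift = reference.point
    width = reference.upper - reference.lower
    if width <= 0:
        raise ValueError("reference confidence interval must have positive width")

    def transform(e: Estimate) -> Estimate:
        return Estimate(
            (e.point - shift) / width,
            (e.lower - shift) / width,
            (e.upper - shift) / width,
        )

    return transform(reference), transform(comparator)


# Illustrative numbers only (not from the dataset).
ref, comp = centre_and_scale(Estimate(10.0, 8.0, 12.0), Estimate(11.0, 8.0, 14.0))
print(ref)   # reference: point at 0, CI spanning a range of one
print(comp)  # comparator on the same centred and scaled axis
```

    After the transformation the reference interval always spans a range of one, so a comparator interval wider than one indicates a wider CI than the reference regardless of the original outcome scale.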

  2. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart; that cell represents the slope for the temporal range 2004–2014.

    This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel.

    Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).

    TSMx R script:

        # import packages
        library(dplyr)
        library(readr)
        library(ggplot2)
        library(tibble)
        library(tidyr)
        library(forcats)
        library(Kendall)

        options(warn = -1) # disable warnings

        # read data (.csv file with "Year" and "Value" columns)
        data <- read_csv("EVI.csv")

        # prepare row/column names for output matrices
        years <- data %>% pull("Year")
        r.names <- years[-length(years)]
        c.names <- years[-1]
        years <- years[-length(years)]

        # initialize output matrices
        sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

        # function to return remaining years given a start year
        getRemain <- function(start.year) {
          years <- data %>% pull("Year")
          start.ind <- which(data[["Year"]] == start.year) + 1
          remain <- years[start.ind:length(years)]
          return(remain)
        }

        # function to subset data for a start/end year combination
        splitData <- function(end.year, start.year) {
          keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
          batch <- data[keep,]
          return(batch)
        }

        # function to fit linear regression and return slope direction
        fitReg <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(sign(slope))
        }

        # function to fit linear regression and return slope magnitude
        fitRegv2 <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(slope)
        }

        # function to implement Mann-Kendall (MK) trend test and return significance
        # the test is implemented only for n>=8
        getMann <- function(batch) {
          if (nrow(batch) >= 8) {
            mk <- MannKendall(batch[['Value']])
            pval <- mk[['sl']]
          } else {
            pval <- NA
          }
          return(pval)
        }

        # function to return slope direction for all combinations given a start year
        getSign <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          signs <- lapply(combs, fitReg)
          return(signs)
        }

        # function to return MK significance for all combinations given a start year
        getPval <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          pvals <- lapply(combs, getMann)
          return(pvals)
        }

        # function to return slope magnitude for all combinations given a start year
        getMagn <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          magns <- lapply(combs, fitRegv2)
          return(magns)
        }

        # retrieve slope direction, MK significance, and slope magnitude
        signs <- lapply(years, getSign)
        pvals <- lapply(years, getPval)
        magns <- lapply(years, getMagn)

        # fill-in output matrices
        dimension <- nrow(sign.matrix)
        for (i in 1:dimension) {
          sign.matrix[i, i:dimension] <- unlist(signs[i])
          pval.matrix[i, i:dimension] <- unlist(pvals[i])
          slope.matrix[i, i:dimension] <- unlist(magns[i])
        }

        sign.matrix <-...
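    The core idea of this entry, a regression slope for every start/end year combination arranged in an upper-triangular matrix, can be sketched in plain Python. This is an illustrative sketch only and not a port of the TSMx script: the function names `ols_slope` and `slope_matrix` are hypothetical, the sample values are made up, and the Mann-Kendall test is omitted. It assumes years are sorted in ascending order with one value per year.

```python
from typing import List, Optional, Sequence


def ols_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_matrix(years: Sequence[float], values: Sequence[float]) -> List[List[Optional[float]]]:
    """Rows are start years (all but the last); columns are end years (all but the first).

    Cells below the diagonal stay None because the end year must follow the start year.
    """
    n = len(years)
    matrix: List[List[Optional[float]]] = [[None] * (n - 1) for _ in range(n - 1)]
    for i in range(n - 1):             # index of the start year
        for j in range(i + 1, n):      # index of the end year
            matrix[i][j - 1] = ols_slope(years[i:j + 1], values[i:j + 1])
    return matrix


def sign(value: float) -> int:
    """Direction of a slope: -1, 0 or 1."""
    return (value > 0) - (value < 0)


# Made-up sample data for illustration.
years = [2001, 2002, 2003, 2004]
values = [1.0, 2.0, 4.0, 3.0]
slopes = slope_matrix(years, values)
for start_year, row in zip(years[:-1], slopes):
    print(start_year, [None if s is None else sign(s) for s in row])
```

    As in the description, the top-right cell corresponds to the full temporal range (first start year to last end year), and each diagonal cell covers a two-year window.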

  3. Data from: Comparison methods.

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Zhe Zhang; Yuhao Chen; Huixue Wang; Qiming Fu; Jianping Chen; You Lu (2023). Comparison methods. [Dataset]. http://doi.org/10.1371/journal.pone.0286770.t002
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Zhe Zhang; Yuhao Chen; Huixue Wang; Qiming Fu; Jianping Chen; You Lu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A critical issue in intelligent building control is detecting energy consumption anomalies based on intelligent device status data. The building field is plagued by energy consumption anomalies caused by a number of factors, many of which are associated with one another in apparent temporal relationships. For the detection of abnormalities, most traditional detection methods rely solely on a single variable of energy consumption data and its time series changes. Therefore, they are unable to examine the correlation between the multiple characteristic factors that affect energy consumption anomalies and their relationship in time. The outcomes of anomaly detection are one-sided. To address the above problems, this paper proposes an anomaly detection method based on multivariate time series. Firstly, in order to extract the correlation between different feature variables affecting energy consumption, this paper introduces a graph convolutional network to build an anomaly detection framework. Secondly, as different feature variables have different influences on each other, the framework is enhanced by a graph attention mechanism so that time series features with higher influence on energy consumption are given more attention weights, resulting in better anomaly detection of building energy consumption. Finally, the effectiveness of this paper’s method and existing methods for detecting energy consumption anomalies in smart buildings are compared using standard data sets. The experimental results show that the model has better detection accuracy.
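    The idea of giving more influential feature variables larger attention weights can be illustrated with a generic attention-weighted aggregation. The sketch below is not the model from the paper: the dot-product scoring function, the function names and the toy vectors are all assumptions chosen to keep the example short.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax; the outputs are positive and sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(
    target: Sequence[float], neighbours: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Weight each neighbour by its similarity to the target and average them.

    Assumption: similarity is a plain dot product; a trained model would
    typically use a learned scoring function instead.
    """
    scores = [sum(a * b for a, b in zip(target, nb)) for nb in neighbours]
    weights = softmax(scores)
    aggregated = [
        sum(w * nb[k] for w, nb in zip(weights, neighbours))
        for k in range(len(target))
    ]
    return weights, aggregated


# Toy feature vectors for one target variable and two related variables.
weights, aggregated = attention_aggregate([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights)     # the neighbour more similar to the target gets the larger weight
print(aggregated)  # attention-weighted combination of the neighbour features
```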

  4. # Blocks? Graphs? Why Not Both? Designing and Evaluating a Hybrid...

    • zenodo.org
    zip
    Updated Mar 30, 2023
    Anonymous; Anonymous (2023). # Blocks? Graphs? Why Not Both? Designing and Evaluating a Hybrid Programming Environment for End-users: Replication Package [Dataset]. http://doi.org/10.5281/zenodo.7783404
    Dataset updated
    Mar 30, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Anonymous; Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blocks? Graphs? Why Not Both? Designing and Evaluating a Hybrid Programming Environment for End-users: Replication Package

    This repository contains supplementary materials for the paper "Blocks? Graphs? Why Not Both? Designing and Evaluating a Hybrid Programming Environment for End-users". We provide this data for transparency reasons and to support replications of our experiments.

    Note: This package is anonymized for peer review purposes. We will provide contact information for the authors at a later date. We also plan to add interactive versions of our tasks and tutorials in an updated version to allow readers easier exploration/experimentation.

    Summary of files contained in this package

    This package contains two parts:

    • The data-analysis/ folder contains the raw dataset we collected for our experiment in CSV format, as well as scripts we used for our analyses.

      • Column ID contains a unique 4-digit identifier for each participant that they were assigned throughout our study.
      • Column Group contains the group (Blocks/Graph) that participants were randomly assigned to.
      • Columns Task1Time and Task2Time contain the time participants spent to complete the two programming tasks of our study in minutes.
      • Columns Task1Success and Task2Success contain a boolean value indicating whether the participants successfully completed the given task. Note that participants had unlimited attempts until they timed out after a strict time limit of 30 minutes, so if a participant was unsuccessful the corresponding time value is 30.
      • Columns Task1Tests and Task2Tests contain the number of times a participant executed their code throughout a task, including their final submission if they were successful.
      • Columns LearnTask, ReadTask and WriteTask contain the scores that participants gave to the task editor component of their assigned programming environment. There are 3 scores for the categories "learnability", "readability" and "writability". Scores are on a 5-point scale from 1 (worst) to 5 (best).
      • Columns LearnTrig, ReadTrig and WriteTrig contain the scores that participants gave to the trigger editor component of their assigned programming environment. There are 3 scores for the categories "learnability", "readability" and "writability". Scores are on a 5-point scale from 1 (worst) to 5 (best).
      • Columns LearnComp, ReadComp and WriteComp contain the scores that participants gave to their assigned programming environment in direct comparison to the other alternative. There are 3 scores for the categories "learnability", "readability" and "writability". Unlike in the paper, where scores are on a scale from -2 to 2, the raw scores here are on a 5-point scale from 1 (strong preference for other environment) to 5 (strong preference for own environment).
      • The script survival.py was used to perform the survival analysis presented in the paper and generate the related figure.
      • The script batplot.py was used to generate the 3x3 grid of ratings used in a figure in the paper.
    • The materials/ folder contains the tutorials and task descriptions we presented to study participants. It also contains the exact wording of pre-screening and post-experimental survey questions.

      • The image pre-screening.png shows the three pre-screening questions we used to determine whether our participants could be included in our study.
      • The images tutorial1_instructions.png and tutorial1_sim.png contain the instructions and initial simulator state we provided to participants for the first programming tutorial. This tutorial did not provide starter code and was identical for both participant groups.
      • The images tutorial2_instructions.png and tutorial2_sim.png contain the instructions and initial simulator state we provided to participants for the second programming tutorial. This tutorial was identical for both participant groups and provided participants with starter code, which is shown in the images:
        • tutorial2_code_main.png for the main program in the left canvas
        • tutorial2_code_move.png for the definition of "Move box to the right".
      • The images tutorial3_instructions_blocks.png/tutorial3_instructions_graph.png and tutorial3_sim.png contain the instructions and initial simulator state we provided to participants for the third programming tutorial. This tutorial also provided participants with starter code, which is shown in the images:
        • tutorial3_code_main.png for the main program in the left canvas
        • tutorial3_code_pick.png for the definition of "Pick up box"
        • tutorial3_code_place.png for the definition of "Place box"
      • The images task1_instructions.png and task1_sim.png contain the instructions and initial simulator state we provided to participants for the first programming task. The task did not provide starter code and the instructions were identical for both participant groups.
      • The images task2_instructions.png and task2_sim.png contain the instructions and initial simulator state we provided to participants for the second programming task. The instructions were identical for both groups. This task also provided participants with starter code, which is shown in the images:
        • task2_code_main.png for the main program in the left canvas
        • task2_code_pick_prog.png for the definition of "Pick up block"
        • task2_code_load_trig_blocks.png/task2_code_load_trig_graph.png for the definition of the trigger "Ready to load machine"
        • task2_code_load_prog.png for the definition of "Load and activate machine"
        • task2_code_finished_trig_blocks.png/task2_code_finished_trig_graph.png for the definition of the trigger "Machine finished"
        • task2_code_finished_prog1.png for the definition of "Get block from machine"
        • task2_code_finished_prog2.png for the definition of "Place block in bin"
      • The image usability.png shows the usability questions we used to determine a participant's rating of their assigned programming environment. The questions were identical for both participant groups.
      • The images comprehension_blocks_1.png and comprehension_blocks_2.png show the program comprehension questions we used to determine whether participants in the Blocks group could understand more complex triggers.
      • The images comprehension_graph_1.png and comprehension_graph_2.png show the program comprehension questions we used to determine whether participants in the Graph group could understand more complex triggers.
      • The images comparison_blocks.png and comparison_graph.png show the images of triggers in the alternative environment that we showed to our participants before choosing their preferred environment. The questions were identical for both participant groups.
      • The image comparison.png shows the questions we used to determine a participant's preference between the two programming environment alternatives.
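    Two details of the column descriptions above lend themselves to a short illustration: unsuccessful participants have their task time recorded as the 30-minute limit (which a survival analysis would treat as censored), and the raw comparison scores use a 1 to 5 scale rather than the paper's -2 to 2. The Python sketch below is not the package's survival.py; the Kaplan-Meier estimator, the `raw - 3` rescaling and the sample rows are assumptions made for illustration, and only the column names come from the description.

```python
from typing import Dict, List, Tuple

TIME_LIMIT_MINUTES = 30  # strict per-task limit stated in the package description


def to_survival_record(time_minutes: float, success: bool) -> Tuple[float, bool]:
    """Return (duration, event_observed); unsuccessful attempts are censored at the limit."""
    return float(time_minutes), bool(success)


def kaplan_meier(records: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of the share of participants still working at each event time."""
    event_times = sorted({t for t, observed in records if observed})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d, _ in records if d >= t)
        events = sum(1 for d, observed in records if d == t and observed)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


def to_paper_scale(raw_score: int) -> int:
    """Map a raw 1..5 comparison score to -2..2 (assumed mapping: raw - 3)."""
    if not 1 <= raw_score <= 5:
        raise ValueError("raw comparison scores must be between 1 and 5")
    return raw_score - 3


# Hypothetical rows; real CSV values may need parsing (e.g. booleans stored as text).
rows: List[Dict] = [
    {"ID": "0001", "Group": "Blocks", "Task1Time": 10, "Task1Success": True, "LearnComp": 5},
    {"ID": "0002", "Group": "Graph", "Task1Time": 20, "Task1Success": True, "LearnComp": 3},
    {"ID": "0003", "Group": "Blocks", "Task1Time": 30, "Task1Success": False, "LearnComp": 1},
    {"ID": "0004", "Group": "Graph", "Task1Time": 30, "Task1Success": False, "LearnComp": 4},
]
records = [to_survival_record(r["Task1Time"], r["Task1Success"]) for r in rows]
print(kaplan_meier(records))
print([to_paper_scale(r["LearnComp"]) for r in rows])
```

    Treating the 30-minute rows as censored rather than as completions keeps timed-out participants in the at-risk set without counting them as successes.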
  5. Comparison of F1-score, Precision, and Recall of anomaly detection models.

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Zhe Zhang; Yuhao Chen; Huixue Wang; Qiming Fu; Jianping Chen; You Lu (2023). Comparison of F1-score, Precision, and Recall of anomaly detection models. [Dataset]. http://doi.org/10.1371/journal.pone.0286770.t003
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Zhe Zhang; Yuhao Chen; Huixue Wang; Qiming Fu; Jianping Chen; You Lu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of F1-score, Precision, and Recall of anomaly detection models.
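    For reference, the three metrics named in this table can be computed from confusion-matrix counts as in the following Python sketch. The function name and the example counts are hypothetical and are not taken from the dataset; returning 0.0 for an undefined ratio is a convention chosen here.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1-score from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts: 8 anomalies detected correctly, 2 false alarms, 4 missed anomalies.
print(precision_recall_f1(tp=8, fp=2, fn=4))
```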
