100+ datasets found
  1. Data from: Nonparametric Anomaly Detection on Time Series of Graphs

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of the change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, in particular two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.
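The two-stage idea in this description (reduce each network snapshot to a scalar summary, then look for a change point in the resulting series) can be illustrated with a minimal Python sketch. It uses edge density as the global summary and a plain CUSUM locator as a stand-in. It does not implement the authors' efficient-score statistic or the sieve bootstrap, and every function name and the toy data below are hypothetical.

```python
from statistics import mean


def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


def edge_density(adj):
    """Global summary: fraction of possible undirected edges that are present."""
    n = len(adj)
    present = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return present / (n * (n - 1) / 2)


def cusum_change_point(series):
    """Return the split index k that maximises |sum of (x_t - mean) for t < k|.

    This is a simple stand-in locator, not the method of the paper.
    """
    centre = mean(series)
    running, best_k, best_stat = 0.0, 0, 0.0
    for k, value in enumerate(series[:-1], start=1):
        running += value - centre
        if abs(running) > best_stat:
            best_k, best_stat = k, abs(running)
    return best_k


# Toy data (hypothetical): five sparse snapshots followed by five dense ones.
sparse = adjacency(4, [(0, 1), (1, 2), (2, 3)])
dense = adjacency(4, [(i, j) for i in range(4) for j in range(i + 1, 4)])
snapshots = [sparse] * 5 + [dense] * 5

density_series = [edge_density(a) for a in snapshots]
change_at = cusum_change_point(density_series)
print(density_series, change_at)
```

Here `change_at` is the index of the first snapshot of the new regime. A real analysis would also need a significance test for the located point, which is where the bootstrap described above comes in.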

  2. CAMELS-FR time series dynamic graphs

    • entrepot.recherche.data.gouv.fr
    text/markdown, zip
    Updated Sep 20, 2024
    Cite
    Olivier Delaigue; Benoît Génot; Guilherme Mendoza Guimarães (2024). CAMELS-FR time series dynamic graphs [Dataset]. http://doi.org/10.57745/HBQWP5
    Available download formats: text/markdown (2250), zip (297806091), zip (297833679)
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Olivier Delaigue; Benoît Génot; Guilherme Mendoza Guimarães
    License

    https://spdx.org/licenses/etalab-2.0.html

    Area covered
    France
    Description

    These dynamic graphs are derived from the "CAMELS-FR dataset". An HTML file is provided for each catchment, in which dynamic plots of hydroclimatic time series are displayed. The files are available in a few languages.

  3. Wikipedia time-series graph

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv
    Updated Apr 24, 2025
    Cite
    Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre (2025). Wikipedia time-series graph [Dataset]. http://doi.org/10.5281/zenodo.886484
    Available download formats: bin, csv
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Wikipedia temporal graph.

    The dataset is based on two Wikipedia SQL dumps: (1) English language articles and (2) user visit counts per page per hour (aka pagecounts). The original datasets are publicly available on the Wikimedia website.

    The static graph structure is extracted from English language Wikipedia articles. Redirects are removed. Before building the Wikipedia graph we introduce thresholds on the minimum number of visits per hour and the maximum in-degree. We remove the pages that have fewer than 500 visits per hour at least once during the specified period. In addition, we remove the nodes (pages) with in-degree higher than 8 000 to build a more meaningful initial graph. After cleaning, the graph contains 116 016 nodes (out of a total of 4 856 639 pages) and 6 573 475 edges. The graph can be imported in two ways: (1) using edges.csv and vertices.csv or (2) using the enwiki-20150403-graph.gt file, which can be opened with the open-source Python library Graph-Tool.

    The time-series data contains users' visit counts from 02:00, 23 September 2014 until 23:00, 30 April 2015. The total number of hours is 5278. The data is stored in two formats: CSV and H5. The CSV file contains data in the following format [page_id :: count_views :: layer], where layer represents an hour. In the H5 file, each layer corresponds to an hour as well.
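As a rough illustration of the [page_id :: count_views :: layer] layout, the Python sketch below checks the stated hour count and turns a few records into one dense series per page. It assumes that `layer` is a zero-based hour offset from the first timestamp and that missing hours mean zero views; neither is confirmed by the description. The records and helper names are hypothetical.

```python
from datetime import datetime, timedelta

START = datetime(2014, 9, 23, 2)   # 02:00, 23 September 2014
END = datetime(2015, 4, 30, 23)    # 23:00, 30 April 2015

# Number of hourly layers, counting both the first and the last hour.
N_HOURS = int((END - START) / timedelta(hours=1)) + 1


def layer_to_time(layer):
    """Map a layer index to a timestamp (assumes zero-based hourly layers)."""
    return START + timedelta(hours=layer)


def build_series(records, n_layers):
    """Turn (page_id, count_views, layer) records into dense per-page lists."""
    series = {}
    for page_id, views, layer in records:
        series.setdefault(page_id, [0] * n_layers)[layer] = views
    return series


# Hypothetical records in the (page_id, count_views, layer) order.
records = [(12, 650, 0), (12, 700, 1), (34, 900, 1)]
series = build_series(records, N_HOURS)

print(N_HOURS)                     # hour count implied by the two timestamps
print(layer_to_time(N_HOURS - 1))  # timestamp of the last layer
print(series[12][:3], series[34][:3])
```

The hour count derived from the two timestamps agrees with the 5278 hours stated in the description.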

  4. Comparison of classification results.

    • plos.figshare.com
    xls
    Updated Jun 6, 2024
    Cite
    Amjad Iqbal; Rashid Amin; Faisal S. Alsubaei; Abdulrahman Alzahrani (2024). Comparison of classification results. [Dataset]. http://doi.org/10.1371/journal.pone.0303890.t002
    Available download formats: xls
    Dataset updated
    Jun 6, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Amjad Iqbal; Rashid Amin; Faisal S. Alsubaei; Abdulrahman Alzahrani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Anomaly detection in time series data is essential for fraud detection and intrusion monitoring applications. However, it poses challenges due to data complexity and high dimensionality. Industrial applications struggle to process high-dimensional, complex data streams in real time despite existing solutions. This study introduces deep ensemble models to improve traditional time series analysis and anomaly detection methods. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks effectively handle variable-length sequences and capture long-term relationships. Convolutional Neural Networks (CNNs) are also investigated, especially for univariate or multivariate time series forecasting. The Transformer, an architecture based on Artificial Neural Networks (ANN), has demonstrated promising results in various applications, including time series prediction and anomaly detection. Graph Neural Networks (GNNs) identify time series anomalies by capturing temporal connections and interdependencies between periods, leveraging the underlying graph structure of time series data. A novel feature selection approach is proposed to address challenges posed by high-dimensional data, improving anomaly detection by selecting different or more critical features from the data. This approach outperforms previous techniques in several aspects. Overall, this research introduces state-of-the-art algorithms for anomaly detection in time series data, offering advancements in real-time processing and decision-making across various industrial sectors.

  5. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart; that cell represents the slope for the temporal range 2004–2014.

    This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel.

    Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).

    TSMx R script:

        # import packages
        library(dplyr)
        library(readr)
        library(ggplot2)
        library(tibble)
        library(tidyr)
        library(forcats)
        library(Kendall)

        options(warn = -1) # disable warnings

        # read data (.csv file with "Year" and "Value" columns)
        data <- read_csv("EVI.csv")

        # prepare row/column names for output matrices
        years <- data %>% pull("Year")
        r.names <- years[-length(years)]
        c.names <- years[-1]
        years <- years[-length(years)]

        # initialize output matrices
        sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
        slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

        # function to return remaining years given a start year
        getRemain <- function(start.year) {
          years <- data %>% pull("Year")
          start.ind <- which(data[["Year"]] == start.year) + 1
          remain <- years[start.ind:length(years)]
          return (remain)
        }

        # function to subset data for a start/end year combination
        splitData <- function(end.year, start.year) {
          keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
          batch <- data[keep,]
          return(batch)
        }

        # function to fit linear regression and return slope direction
        fitReg <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(sign(slope))
        }

        # function to fit linear regression and return slope magnitude
        fitRegv2 <- function(batch) {
          trend <- lm(Value ~ Year, data = batch)
          slope <- coefficients(trend)[[2]]
          return(slope)
        }

        # function to implement Mann-Kendall (MK) trend test and return significance
        # the test is implemented only for n>=8
        getMann <- function(batch) {
          if (nrow(batch) >= 8) {
            mk <- MannKendall(batch[['Value']])
            pval <- mk[['sl']]
          } else {
            pval <- NA
          }
          return(pval)
        }

        # function to return slope direction for all combinations given a start year
        getSign <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          signs <- lapply(combs, fitReg)
          return(signs)
        }

        # function to return MK significance for all combinations given a start year
        getPval <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          pvals <- lapply(combs, getMann)
          return(pvals)
        }

        # function to return slope magnitude for all combinations given a start year
        getMagn <- function(start.year) {
          remaining <- getRemain(start.year)
          combs <- lapply(remaining, splitData, start.year = start.year)
          magns <- lapply(combs, fitRegv2)
          return(magns)
        }

        # retrieve slope direction, MK significance, and slope magnitude
        signs <- lapply(years, getSign)
        pvals <- lapply(years, getPval)
        magns <- lapply(years, getMagn)

        # fill-in output matrices
        dimension <- nrow(sign.matrix)
        for (i in 1:dimension) {
          sign.matrix[i, i:dimension] <- unlist(signs[i])
          pval.matrix[i, i:dimension] <- unlist(pvals[i])
          slope.matrix[i, i:dimension] <- unlist(magns[i])
        }

        sign.matrix <-...
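For readers who do not use R, the core of the script (one regression slope per start/end year pair, stored in an upper-triangular matrix with start years as rows and end years as columns) can be sketched in Python. This is an illustrative re-expression and not the authors' code. It assumes the rows are already sorted by year, it omits the Mann-Kendall test, and the function names and sample values are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_matrix(years, values):
    """Upper-triangular matrix of slopes.

    Row i is the start year years[i]; column j is the end year years[j + 1].
    Cells below the diagonal stay None because the end year would not be
    later than the start year.
    """
    size = len(years) - 1
    matrix = [[None] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            # Slice covers every year from years[i] through years[j + 1].
            matrix[i][j] = ols_slope(years[i:j + 2], values[i:j + 2])
    return matrix


# Hypothetical sample data.
years = [2001, 2002, 2003, 2004, 2005]
values = [1.0, 3.0, 2.0, 5.0, 4.0]

slopes = slope_matrix(years, values)
signs = [[None if s is None else (s > 0) - (s < 0) for s in row] for row in slopes]

for row in slopes:
    print(row)
```

The top-right cell `slopes[0][-1]` covers the full range, matching the reading of the chart given in the description.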

  6. Replication Data for: visibility graphs algorithm in R language

    • dataverse.harvard.edu
    Updated Dec 23, 2020
    Cite
    Dirceu Melo (2020). Replication Data for: visibility graphs algorithm in R language [Dataset]. http://doi.org/10.7910/DVN/XMLHZD
    Dataset updated
    Dec 23, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Dirceu Melo
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Implementation of the visibility graph algorithm in the R language. The scripts generate visibility graphs from series built in RStudio or imported into RStudio; plot the series, the series histogram, and the degree distribution of the graphs generated from these series; determine the fit of the distribution curve in a log-log graph; and calculate the fundamental metrics of complex networks for the visibility graphs associated with each series.
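The mapping from a series to a visibility graph can be sketched briefly in Python. The sketch assumes the natural visibility criterion, in which two samples are linked when every sample between them lies strictly below the straight line joining them. The description does not say which variant the R scripts use, so this choice, the function names and the sample series are all assumptions.

```python
def natural_visibility_edges(series):
    """Return the edges (a, b), a < b, of the natural visibility graph.

    Samples a and b are connected if every intermediate sample c satisfies
    y_c < y_b + (y_a - y_b) * (b - c) / (b - a).
    """
    n = len(series)
    edges = []
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.append((a, b))
    return edges


def degree_sequence(n, edges):
    """Node degrees, the input to a degree-distribution plot."""
    degrees = [0] * n
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Hypothetical four-sample series.
series = [1, 3, 2, 4]
edges = natural_visibility_edges(series)
degrees = degree_sequence(len(series), edges)
print(edges)
print(degrees)
```

This brute-force version checks every pair and is quadratic or worse in the series length, which is fine for short series but slow for long ones.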

  7. HUN Mine Footprints Timeseries Graph v01

    • cloud.csiss.gmu.edu
    • researchdata.edu.au
    • +1more
    Updated Dec 14, 2019
    + more versions
    Cite
    Australia (2019). HUN Mine Footprints Timeseries Graph v01 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/11493517-df5f-49ed-84dc-23afdbe00c5e
    Dataset updated
    Dec 14, 2019
    Dataset provided by
    Australia
    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset contains time series figures (shown in the report) generated for baseline and crdp mine footprints, which represent the footprints used in the surface water modelling. The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01).

    Dataset History

    The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01). Timelines for all mines were assembled into the spreadsheet Mine_files_summary_Final.xlsx. The script MineFootprint_TimeSeries_Final.m reads the data from the spreadsheet and creates the time series figures in png format, which form the dataset.

    Dataset Citation

    Bioregional Assessment Programme (XXXX) HUN Mine Footprints Timeseries Graph v01. Bioregional Assessment Derived Dataset. Viewed 22 June 2018, http://data.bioregionalassessments.gov.au/dataset/11493517-df5f-49ed-84dc-23afdbe00c5e.

    Dataset Ancestors

  8. 1000 Empirical Time series

    • figshare.com
    • researchdata.edu.au
    png
    Updated May 30, 2023
    + more versions
    Cite
    Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
    Available download formats: png
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben Fulcher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.

    The results of the computation are in the hctsa file, HCTSA_Empirical1000.mat, for use in Matlab using v1.06 of hctsa.

    The same data is also provided in .csv format: hctsa_datamatrix.csv holds the results of feature computation, with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and the corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv), and the data of individual time series (each line a time series, for the time series described in hctsa_timeseries-info.csv) in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV('HCTSA_Empirical1000.mat',true,true); in hctsa.

    The input file, INP_Empirical1000.mat, is for use with hctsa, and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');

    Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package.

    See links in references for more comprehensive documentation for performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.

  9. Data underlying Ph.D. thesis: Large set of graphs and timeseries of supply...

    • data.4tu.nl
    zip
    Updated Jul 22, 2024
    Cite
    Isabelle van Schilt (2024). Data underlying Ph.D. thesis: Large set of graphs and timeseries of supply chain simulation model [Dataset]. http://doi.org/10.4121/adf4373c-7a9a-4d9c-a1ff-0f893d8d0b06.v1
    Available download formats: zip
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Isabelle van Schilt
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This data is part of the Ph.D. thesis of Isabelle M. van Schilt, Delft University of Technology.


    Data includes the time series data of the synthetic counterfeit PPE supply chain discrete event simulation model. This time series data is used for the paper of structural uncertainty and the quality diversity (QD) algorithm.


    The data also includes the decision variables dictionary for both papers. These are two dictionaries in .pkl format that include 40,000 randomly generated graphs with real-world port data for the case study. One dictionary is sorted on betweenness, and the other (QD) on the density of the network. In addition, a database example of 50,000 randomly generated graphs (without real-world data) is included.

  10. Data from: Climate Prediction Center (CPC) Global Precipitation Time Series

    • data.cnra.ca.gov
    • datadiscoverystudio.org
    • +1more
    html
    Updated Mar 1, 2023
    Cite
    National Oceanic and Atmospheric Administration (2023). Climate Prediction Center (CPC) Global Precipitation Time Series [Dataset]. https://data.cnra.ca.gov/dataset/climate-prediction-center-cpc-global-precipitation-time-series
    Available download formats: html
    Dataset updated
    Mar 1, 2023
    Dataset authored and provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    The global precipitation time series provides time series charts showing observations of daily precipitation as well as accumulated precipitation compared to normal accumulated amounts for various stations around the world. These charts are created for different scales of time (30, 90, 365 days). Each station has a graphic that contains two charts. The first chart in the graphic is a time series in the format of a line graph, representing accumulated precipitation for each day in the time series compared to the accumulated normal amount of precipitation. The second chart is a bar graph displaying actual daily precipitation. The total accumulation and surplus or deficit amounts are displayed as text on the charts representing the entire time scale, in both inches and millimeters. The graphics are updated daily and the graphics reflect the updated observations and accumulated precipitation amounts including the latest daily data available. The available graphics are rotated, meaning that only the most recently created graphics are available. Previously made graphics are not archived.
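The quantities shown on these charts (running accumulation, the normal accumulation, and the surplus or deficit in both units) follow from simple arithmetic, sketched below in Python. The daily values, the function name and the dictionary keys are hypothetical, and CPC's own processing may differ in detail.

```python
from itertools import accumulate

MM_PER_INCH = 25.4  # exact conversion factor


def accumulation_summary(daily_mm, normal_daily_mm):
    """Compare accumulated precipitation with the accumulated normal."""
    observed = list(accumulate(daily_mm))        # line-graph series
    normal = list(accumulate(normal_daily_mm))   # reference series
    departure_mm = observed[-1] - normal[-1]
    return {
        "observed_curve_mm": observed,
        "normal_curve_mm": normal,
        "observed_mm": observed[-1],
        "observed_in": observed[-1] / MM_PER_INCH,
        "departure_mm": departure_mm,
        "departure_in": departure_mm / MM_PER_INCH,
        "status": "surplus" if departure_mm >= 0 else "deficit",
    }


# Hypothetical four-day window: daily totals and daily normals in millimeters.
daily_mm = [0, 10, 0, 30]
normal_daily_mm = [5, 5, 5, 5]

summary = accumulation_summary(daily_mm, normal_daily_mm)
print(summary["observed_curve_mm"], summary["normal_curve_mm"])
print(summary["status"], summary["departure_mm"], round(summary["departure_in"], 2))
```

The same function applies to the 30-, 90- and 365-day windows mentioned above by passing a longer list of daily values.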

  11. Data from: Climate Prediction Center (CPC) Global Temperature Time Series

    • data.wu.ac.at
    • datadiscoverystudio.org
    html
    Updated Jan 29, 2016
    Cite
    Department of Commerce (2016). Climate Prediction Center (CPC) Global Temperature Time Series [Dataset]. https://data.wu.ac.at/odso/data_gov/MmIwZDk5NjgtM2RmOS00YmFmLTliMzgtZjk1ZDdmMzY4MGFj
    Available download formats: html
    Dataset updated
    Jan 29, 2016
    Dataset provided by
    Department of Commerce
    Description

    The global temperature time series provides time series charts using station based observations of daily temperature. These charts provide information about the observations compared to the derived daily normal temperature for various time scales (30, 90, 365 days). Each station has a graphic that contains three charts. The first chart in the graphic is a time series in the format of a line graph, representing the daily average temperatures compared to the expected daily normal temperatures. The second chart is a bar graph displaying daily departures from normal, including a line depicting the mean departure for the period. The third chart is a time series of the observed daily maximum and minimum temperatures. The graphics are updated daily and the graphics reflect the updated observations including the latest daily data available. The available graphics are rotated, meaning that only the most recently created graphics are available. Previously made graphics are not archived.

  12. Statistical Data Analysis using R

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Samuel Barsanelli Costa (2023). Statistical Data Analysis using R [Dataset]. http://doi.org/10.6084/m9.figshare.5501035.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Samuel Barsanelli Costa
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The R scripts contain statistical data analysis for streamflow and sediment data, including Flow Duration Curves, Double Mass Analysis, Nonlinear Regression Analysis for Suspended Sediment Rating Curves, and Stationarity Tests, and include several plots.

  13. Data from: Indicator from the graph Laplacian of stock market time series...

    • researchdata.ntu.edu.sg
    Updated Sep 23, 2024
    Cite
    DR-NTU (Data) (2024). Indicator from the graph Laplacian of stock market time series cross sections can precisely determine the durations of market crashes [Dataset]. http://doi.org/10.21979/N9/7YNZAQ
    Available download formats: application/x-compressed (4042980746), application/x-compressed (7263573043), application/x-compressed (327987), txt (4855)
    Dataset updated
    Sep 23, 2024
    Dataset provided by
    DR-NTU (Data)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2019 - Jun 30, 2022
    Dataset funded by
    Ministry of Education, Singapore
    Ministry of Education (MOE)
    Description

    This repository includes the processed ultrametric distance matrices data, MATLAB scripts and data holder files (in .mat format) used to generate the results and figures in the PLOS paper with the above title.

  14. Enron Email Time-Series Network

    • zenodo.org
    • explore.openaire.eu
    csv
    Updated Jan 24, 2020
    Cite
    Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst (2020). Enron Email Time-Series Network [Dataset]. http://doi.org/10.5281/zenodo.1342353
    Available download formats: csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We use the Enron email dataset to build a network of email addresses. It contains 614586 emails sent over the period from 6 January 1998 until 4 February 2004. During the pre-processing, we remove the periods of low activity and keep the emails from 1 January 1999 until 31 July 2002, which is 1448 days of email records in total. We also remove email addresses that sent fewer than three emails over that period. In total, the Enron email network contains 6 600 nodes and 50 897 edges.

    To build a graph G = (V, E), we use email addresses as nodes V. Every node v_i has an attribute which is a time-varying signal that corresponds to the number of emails sent from this address during a day. We draw an edge e_ij between two nodes i and j if there is at least one email exchange between the corresponding addresses.

    Column 'Count' in 'edges.csv' file is the number of 'From'->'To' email exchanges between the two addresses. This column can be used as an edge weight.

    The file 'nodes.csv' contains a dictionary that is a compressed representation of time-series. The format of the dictionary is Day->The Number Of Emails Sent By the Address During That Day. The total number of days is 1448.

    'id-email.csv' is a file containing the actual email addresses.
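The two structures described above (a sparse Day->count dictionary per node and a 'From'->'To' count per edge) can be handled as in the Python sketch below. It assumes day indices are zero-based and that days missing from a dictionary mean zero emails. The actual column layout of nodes.csv and edges.csv is not spelled out here, so parsing is left out, and all names and sample values are hypothetical.

```python
from collections import Counter

N_DAYS = 1448  # total number of days stated in the description


def expand_series(day_counts, n_days=N_DAYS):
    """Expand a sparse {day: emails_sent} dictionary into a dense daily signal."""
    dense = [0] * n_days
    for day, count in day_counts.items():
        dense[day] = count
    return dense


def count_exchanges(emails):
    """Count 'From'->'To' exchanges; each count can serve as an edge weight."""
    return Counter(emails)


def undirected_edges(exchange_counts):
    """Edges e_ij: pairs of addresses with at least one email exchange."""
    return {frozenset(pair) for pair in exchange_counts if pair[0] != pair[1]}


# Hypothetical sample: one node's compressed signal and a few (from, to) emails.
node_signal = {0: 3, 10: 1, 1447: 4}
emails = [(1, 2), (1, 2), (2, 1), (3, 1)]

dense = expand_series(node_signal)
weights = count_exchanges(emails)
edges = undirected_edges(weights)

print(len(dense), sum(dense))
print(weights[(1, 2)], len(edges))
```

Storing only the non-zero days keeps the node file small, since most addresses are silent on most days.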

  15. MANUAL FOR VISIBILITY GRAPHS MODELING USING R-STUDIO

    • dataverse.harvard.edu
    Updated Nov 14, 2021
    Cite
    Dirceu Melo (2021). MANUAL FOR VISIBILITY GRAPHS MODELING USING R-STUDIO [Dataset]. http://doi.org/10.7910/DVN/V1WQ7D
    Dataset updated
    Nov 14, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    Dirceu Melo
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This manual for visibility graph modeling using RStudio first presents basic notions that allow the mapping process to be understood, then shows the computational idea. Finally, it works with the R scripts inside RStudio, exploring pseudo-random series, Brownian motion series, periodic series, Fibonacci series and series of audio signals. It shows: 1) how to generate time series in RStudio and later turn them into visibility graphs; 2) how to import time series stored in a directory and turn them into visibility graphs; 3) how to visualize networks using three types of algorithms, followed by calculation and visualization of the main properties of complex networks.

    About the codes included: the 3 included codes generate visibility graphs of series generated by RStudio functions. The code also calculates some metrics for complex networks, generates the graph plot and its degree distribution, and shows the plot of the series and its histogram.

  16. Surface-Water-Quality Data and Time-Series Plots to Support Implementation...

    • datasets.ai
    • data.usgs.gov
    • +1more
    55
    Updated Sep 11, 2024
    + more versions
    Cite
    Department of the Interior (2024). Surface-Water-Quality Data and Time-Series Plots to Support Implementation of Site-Dependent Aluminum Criteria in Massachusetts, 2018–19 (ver. 1.1, February 2023) [Dataset]. https://datasets.ai/datasets/surface-water-quality-data-and-time-series-plots-to-support-implementation-of-site-depende
    Available download formats: 55
    Dataset updated
    Sep 11, 2024
    Dataset authored and provided by
    Department of the Interior
    Description

    This data release includes water-quality data collected at 38 sites in central and eastern Massachusetts from April 2018 through May 2019 by the U.S. Geological Survey to support the implementation of site-dependent aluminum criteria for Massachusetts waters. Samples of effluent and receiving surface waters were collected monthly at four wastewater-treatment facilities (WWTFs) and seven water-treatment facilities (WTFs) (see SWQ_data_and_instantaneous_CMC_CCC_values.txt). The measured properties and constituents include pH, hardness, and filtered (dissolved) organic carbon, which are required inputs to the U.S. Environmental Protection Agency's Aluminum Criteria Calculator version 2.0. Outputs from the Aluminum Criteria Calculator are also provided in that file; these outputs consist of acute (Criterion Maximum Concentration, CMC) and chronic (Criterion Continuous Concentration, CCC) instantaneous water-quality values for total recoverable aluminum, calculated for monthly samples at selected ambient sites near each of the 11 facilities. Quality-control data from blank, replicate, and spike samples are provided (see SWQ_QC_data.txt). In addition to data tables, the data release includes time-series graphs of the discrete water-quality data (see SWQ_plot_discrete_all.zip). For pH, time-series graphs also are provided showing pH from the discrete monthly water-quality samples as well as near-continuous pH measured at one surface-water site at each facility (see SWQ_plot_contin_discrete_pH.zip). The near-continuous pH data, along with all of the discrete water-quality data except the quality-control data, are also available online from the U.S. Geological Survey's National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/nwis).

  17. Time series plot (RKSI).xlsx

    • figshare.com
    xlsx
    Updated Sep 11, 2022
    Cite
    Yoonbae Chung (2022). Time series plot (RKSI).xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.21078223.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Sep 11, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Yoonbae Chung
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data for a time-series plot of model and observation data.

  18. HUN groundwater flow rate time series v01

    • data.gov.au
    • gimi9.com
    • +2 more
    zip
    Updated Apr 13, 2022
    Cite
    Bioregional Assessment Program (2022). HUN groundwater flow rate time series v01 [Dataset]. https://data.gov.au/data/dataset/57b928ac-9d9d-407a-87d8-8405f4a4b11a
    Explore at:
    Available download formats: zip (702289)
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    The dataset includes a script and data for generating flow rate time-series figures for HUN GW modelling. The flow rate data points represent historical pumping rates and estimates of future pumping rates used to represent the impacts of coal mining on groundwater levels and surface water - groundwater fluxes in the Hunter subregion.

    The script was written to generate time-series graphs of flow rates used in the HUN GW modelling for each mine in the Hunter subregion.

    Dataset History

    Historical mine water pumping rates and estimates of future flow rates were extracted from mining reports (groundwater modelling within mine Environmental Assessments) for each baseline and additional coal resource development modelled in the Hunter subregion. These flow rates are inputs to the groundwater model to represent the impacts of coal mining over time on groundwater (drawdowns and changes in surface water - groundwater fluxes).

    A script was written to generate time-series graphs for each mine represented in the groundwater model. The full set of mining reports from which data were extracted and the time-series graphs generated from these data are included in Herron et al. (2016).

    Herron NF, Frery E, Wilkins A, Crosbie RS, Peña-Arancibia JL, Zhang YQ, Viney NR, Rachakonda PK, Ramage A, Marvanek SP, Gresham MP and McVicar TR (2016) Observations analysis, statistical analysis and interpolation for the Hunter subregion. Product 2.1-2.2 for the Hunter subregion from the Northern Sydney Basin Bioregional Assessment. Department of the Environment, Bureau of Meteorology, CSIRO and Geoscience Australia, Australia. http://data.bioregionalassessments.gov.au/product/NSB/HUN/2.1-2.2.

    Dataset Citation

    Bioregional Assessment Programme (XXXX) HUN groundwater flow rate time series v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/57b928ac-9d9d-407a-87d8-8405f4a4b11a.

    Dataset Ancestors

  19. Commercial Tool Rental Data For 2016 and 2017

    • kaggle.com
    Updated Feb 17, 2019
    Cite
    David Maillie (2019). Commercial Tool Rental Data For 2016 and 2017 [Dataset]. https://www.kaggle.com/dmaillie/commercial-tool-rental-data-for-2016-and-2017/metadata
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 17, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    David Maillie
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by David Maillie

    Released under CC BY-SA 4.0

    Contents

  20. NYC Bike Sharing Network: Time-Series Enhanced Nodes and Edges Dataset

    • zenodo.org
    json
    Updated Sep 27, 2024
    Cite
    Constantin Urbainsky; Constantin Urbainsky (2024). NYC Bike Sharing Network: Time-Series Enhanced Nodes and Edges Dataset [Dataset]. http://doi.org/10.5281/zenodo.13846868
    Explore at:
    Available download formats: json
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Constantin Urbainsky; Constantin Urbainsky
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York
    Description

    This dataset presents a graph representation of the New York City Bike Sharing system, with nodes representing stations and edges representing trips between them. It stores dynamic properties as time series, which are updated from historical records (CSV files) and live data feeds (GBFS files) provided by the NYC Bike Sharing system.

    • Nodes:

      • Source: Data is collected from the New York City Bike Station Information API.
      • Attributes:
        • ID: Unique identifier for each station.
        • Name: Name of the station.
        • Capacity: Number of bikes the station can accommodate.
        • Short ID: A condensed identifier used internally.
      • Time Series Data:
        • Updated every 5 minutes from the Station Status API.
        • Captures changes in bike availability, recording values only when they differ from previous data points.
    • Edges:

      • Source: Compiled from trip data provided in CSV format specific to NYC Bike Sharing.
      • Attributes:
        • Trip Counter: Total number of trips recorded.
        • Bike Type Counter: Counts trips made with electric versus classic bikes.
        • Trip Type Counter: Separates trips made by members versus casual riders.
        • Active Trips Tracker: Tracks the number of active trips at any given moment.
      • Aggregation: Trip data between identical start and end points, in the same direction, are aggregated into a single edge, with time-series tracking the frequency of these trips.
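    The two storage rules described above (recording a station value only when it changes, and merging trips that share a start, an end, and a direction into one edge) can be sketched as follows. This is a minimal illustration, not the dataset's actual schema or code: the class name `ChangeOnlySeries`, the function `aggregate_trips`, the field names, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ChangeOnlySeries:
    """Time series that stores a point only when the value changes (hypothetical helper)."""
    points: list = field(default_factory=list)  # (timestamp, value) pairs

    def record(self, timestamp, value):
        # Skip the sample if it repeats the most recently stored value.
        if not self.points or self.points[-1][1] != value:
            self.points.append((timestamp, value))

    def value_at(self, timestamp):
        """Return the last stored value at or before timestamp, or None."""
        current = None
        for ts, value in self.points:
            if ts > timestamp:
                break
            current = value
        return current


def aggregate_trips(trips):
    """Merge trips with the same (start, end) pair, in that direction, into one edge."""
    edges = defaultdict(lambda: {"trip_count": 0, "start_times": []})
    for start_id, end_id, started_at in trips:
        edge = edges[(start_id, end_id)]
        edge["trip_count"] += 1
        edge["start_times"].append(started_at)  # per-edge time series of trips
    return dict(edges)


# Made-up polls of one station every 5 minutes: (minute, bikes available).
bikes = ChangeOnlySeries()
for minute, available in [(0, 7), (5, 7), (10, 6), (15, 6), (20, 8)]:
    bikes.record(minute, available)

# Made-up trips: (start station, end station, start minute).
trips = [("A", "B", 1), ("A", "B", 4), ("B", "A", 6)]
edges = aggregate_trips(trips)
```

    Storing only the changes keeps a frequently polled series compact, at the cost of a lookup (`value_at`) to recover the value at an arbitrary time. Keying edges by the ordered pair `(start, end)` keeps A to B and B to A as separate edges, which matches the "same direction" rule in the description.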