100+ datasets found

Data from: Nonparametric Anomaly Detection on Time Series of Graphs
tandf.figshare.com
zip
Updated May 31, 2023
Cite
Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.13180181.v3
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of the change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, in particular two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.
CAMELS-FR time series dynamic graphs
entrepot.recherche.data.gouv.fr
text/markdown, zip
Updated Sep 20, 2024
Cite
Olivier Delaigue; Benoît Génot; Guilherme Mendoza Guimarães (2024). CAMELS-FR time series dynamic graphs [Dataset]. http://doi.org/10.57745/HBQWP5
Available download formats: text/markdown (2250), zip (297806091), zip (297833679)
Unique identifier
https://doi.org/10.57745/HBQWP5
Dataset updated
Sep 20, 2024
Dataset provided by
Recherche Data Gouv
Authors
Olivier Delaigue; Benoît Génot; Guilherme Mendoza Guimarães
License
https://spdx.org/licenses/etalab-2.0.html
Area covered
France
Description
These dynamic graphs are derived from the "CAMELS-FR dataset". An HTML file is provided for each catchment, displaying dynamic plots of hydroclimatic time series. The files are available in a few languages.
Wikipedia time-series graph
zenodo.org
data.niaid.nih.gov
bin, csv
Updated Apr 24, 2025
Cite
Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre (2025). Wikipedia time-series graph [Dataset]. http://doi.org/10.5281/zenodo.886484
Available download formats: bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.886484
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Wikipedia temporal graph.

The dataset is based on two Wikipedia SQL dumps: (1) English language articles and (2) user visit counts per page per hour (aka pagecounts). The original datasets are publicly available on the Wikimedia website.

The static graph structure is extracted from English-language Wikipedia articles. Redirects are removed. Before building the Wikipedia graph we introduce thresholds on the minimum number of visits per hour and the maximum in-degree. We remove the pages that have fewer than 500 visits per hour at least once during the specified period. In addition, we remove the nodes (pages) with in-degree higher than 8 000 to build a more meaningful initial graph. After cleaning, the graph contains 116 016 nodes (out of a total of 4 856 639 pages) and 6 573 475 edges. The graph can be imported in two ways: (1) using edges.csv and vertices.csv or (2) using the enwiki-20150403-graph.gt file, which can be opened with the open-source Python library graph-tool.

The time-series data contains users' visit counts from 02:00, 23 September 2014 until 23:00, 30 April 2015. The total number of hours is 5278. The data is stored in two formats: CSV and H5. The CSV file contains data in the format [page_id :: count_views :: layer], where layer represents an hour. In the H5 file, each layer corresponds to an hour as well.
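As a rough illustration of the CSV layout described above, the following Python sketch groups (page_id, count_views, layer) records into one dense hourly series per page. The delimiter, the absence of a header row, and 0-based layer indices are assumptions made here, not details confirmed by the dataset description; the sample rows are made up.

```python
import csv
import io
from collections import defaultdict

TOTAL_HOURS = 5278  # number of hourly layers stated in the description


def load_page_series(handle, total_hours=TOTAL_HOURS):
    """Build a dense hourly visit series per page.

    Assumption: rows are comma-separated integers in the order
    page_id, count_views, layer, with no header and 0-based layers.
    """
    series = defaultdict(lambda: [0] * total_hours)
    for page_id, count_views, layer in csv.reader(handle):
        series[int(page_id)][int(layer)] = int(count_views)
    return dict(series)


# Tiny made-up sample (not real Wikipedia data), using 3 hours only.
sample = io.StringIO("12,510,0\n12,730,2\n99,600,1\n")
pages = load_page_series(sample, total_hours=3)
print(pages)
```

Hours with no record are filled with zero here, which is a design choice for the sketch; the real files may encode missing hours differently.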
Comparison of classification results.
plos.figshare.com
xls
Updated Jun 6, 2024
Cite
Amjad Iqbal; Rashid Amin; Faisal S. Alsubaei; Abdulrahman Alzahrani (2024). Comparison of classification results. [Dataset]. http://doi.org/10.1371/journal.pone.0303890.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0303890.t002
Dataset updated
Jun 6, 2024
Dataset provided by
PLOS ONE
Authors
Amjad Iqbal; Rashid Amin; Faisal S. Alsubaei; Abdulrahman Alzahrani
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Anomaly detection in time series data is essential for fraud detection and intrusion monitoring applications. However, it poses challenges due to data complexity and high dimensionality. Industrial applications struggle to process high-dimensional, complex data streams in real time despite existing solutions. This study introduces deep ensemble models to improve traditional time series analysis and anomaly detection methods. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks effectively handle variable-length sequences and capture long-term relationships. Convolutional Neural Networks (CNNs) are also investigated, especially for univariate or multivariate time series forecasting. The Transformer, an architecture based on Artificial Neural Networks (ANN), has demonstrated promising results in various applications, including time series prediction and anomaly detection. Graph Neural Networks (GNNs) identify time series anomalies by capturing temporal connections and interdependencies between periods, leveraging the underlying graph structure of time series data. A novel feature selection approach is proposed to address challenges posed by high-dimensional data, improving anomaly detection by selecting different or more critical features from the data. This approach outperforms previous techniques in several aspects. Overall, this research introduces state-of-the-art algorithms for anomaly detection in time series data, offering advancements in real-time processing and decision-making across various industrial sectors.
Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...
dataverse.harvard.edu
Updated Jul 8, 2024
Cite
Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/ZZDYM9
Dataset updated
Jul 8, 2024
Dataset provided by
Harvard Dataverse
Authors
Georgios Boumis; Brad Peter
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test.

The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart; that cell represents the slope for the temporal range 2004–2014.

This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel.

Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

[Figure: TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).]

TSMx R script:

# import packages
library(dplyr)
library(readr)
library(ggplot2)
library(tibble)
library(tidyr)
library(forcats)
library(Kendall)

options(warn = -1) # disable warnings

# read data (.csv file with "Year" and "Value" columns)
data <- read_csv("EVI.csv")

# prepare row/column names for output matrices
years <- data %>% pull("Year")
r.names <- years[-length(years)]
c.names <- years[-1]
years <- years[-length(years)]

# initialize output matrices
sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

# function to return remaining years given a start year
getRemain <- function(start.year) {
  years <- data %>% pull("Year")
  start.ind <- which(data[["Year"]] == start.year) + 1
  remain <- years[start.ind:length(years)]
  return(remain)
}

# function to subset data for a start/end year combination
splitData <- function(end.year, start.year) {
  keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
  batch <- data[keep,]
  return(batch)
}

# function to fit linear regression and return slope direction
fitReg <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(sign(slope))
}

# function to fit linear regression and return slope magnitude
fitRegv2 <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(slope)
}

# function to implement Mann-Kendall (MK) trend test and return significance
# the test is implemented only for n>=8
getMann <- function(batch) {
  if (nrow(batch) >= 8) {
    mk <- MannKendall(batch[['Value']])
    pval <- mk[['sl']]
  } else {
    pval <- NA
  }
  return(pval)
}

# function to return slope direction for all combinations given a start year
getSign <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  signs <- lapply(combs, fitReg)
  return(signs)
}

# function to return MK significance for all combinations given a start year
getPval <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  pvals <- lapply(combs, getMann)
  return(pvals)
}

# function to return slope magnitude for all combinations given a start year
getMagn <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  magns <- lapply(combs, fitRegv2)
  return(magns)
}

# retrieve slope direction, MK significance, and slope magnitude
signs <- lapply(years, getSign)
pvals <- lapply(years, getPval)
magns <- lapply(years, getMagn)

# fill-in output matrices
dimension <- nrow(sign.matrix)
for (i in 1:dimension) {
  sign.matrix[i, i:dimension] <- unlist(signs[i])
  pval.matrix[i, i:dimension] <- unlist(pvals[i])
  slope.matrix[i, i:dimension] <- unlist(magns[i])
}
sign.matrix <-...
Replication Data for: visibility graphs algorithm in R language
dataverse.harvard.edu
Updated Dec 23, 2020
Cite
Dirceu Melo (2020). Replication Data for: visibility graphs algorithm in R language [Dataset]. http://doi.org/10.7910/DVN/XMLHZD
Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/XMLHZD
Dataset updated
Dec 23, 2020
Dataset provided by
Harvard Dataverse
Authors
Dirceu Melo
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Implementation of the visibility graph algorithm in the R language. These scripts generate visibility graphs from series built in RStudio or imported into RStudio; plot the series, the series histogram, and the degree distribution of the graphs generated from these series; determine the fit of the distribution curve in a log-log graph; and calculate the fundamental metrics of complex networks for the visibility graphs associated with each series.
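For readers unfamiliar with the mapping, here is a minimal sketch of the natural visibility criterion as it is commonly defined in the literature: two samples are linked when every sample between them lies strictly below the straight line joining them. It is written in Python purely for illustration; the description does not say which visibility variant the R scripts implement, and the function names are hypothetical.

```python
def natural_visibility_edges(series):
    """Return edges (a, b), a < b, of the natural visibility graph.

    Naive O(n^2) check: samples a and b see each other when every
    intermediate sample c lies strictly below the line joining them.
    """
    n = len(series)
    edges = []
    for a in range(n - 1):
        for b in range(a + 1, n):
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.append((a, b))
    return edges


def degree_sequence(n, edges):
    """Count how many edges touch each of the n nodes."""
    degrees = [0] * n
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


values = [1, 3, 2, 4]  # made-up series
vg_edges = natural_visibility_edges(values)
vg_degrees = degree_sequence(len(values), vg_edges)
print(vg_edges)    # [(0, 1), (1, 2), (1, 3), (2, 3)]
print(vg_degrees)  # [1, 3, 2, 2]
```

The degree sequence is the input to the degree-distribution plots mentioned in the description; adjacent samples are always connected because there is nothing between them to block the view.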
HUN Mine Footprints Timeseries Graph v01
cloud.csiss.gmu.edu
researchdata.edu.au
Updated Dec 14, 2019
Cite
Australia (2019). HUN Mine Footprints Timeseries Graph v01 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/11493517-df5f-49ed-84dc-23afdbe00c5e
Dataset updated
Dec 14, 2019
Dataset provided by
Australia
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

This dataset contains time series figures (shown in the report) generated for baseline and CRDP mine footprints, which represent the footprints used in the surface water modelling. The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01).

Dataset History

The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01). Timelines for all mines were assembled into the spreadsheet Mine_files_summary_Final.xlsx. The script MineFootprint_TimeSeries_Final.m reads the data from the spreadsheet and creates the time series figures in PNG format, which form the dataset.

Dataset Citation

Bioregional Assessment Programme (XXXX) HUN Mine Footprints Timeseries Graph v01. Bioregional Assessment Derived Dataset. Viewed 22 June 2018, http://data.bioregionalassessments.gov.au/dataset/11493517-df5f-49ed-84dc-23afdbe00c5e.

Dataset Ancestors

Derived From HUN Groundwater footprint polygons v01

Derived From HUN mine time series tables v01

Derived From BILO Gridded Climate Data: Daily Climate Data for each year from 1900 to 2012

Derived From HUN Historical Landsat Images Mine Foot Prints v01

Derived From Historical Mining footprints DTIRIS HUN 20150707

Derived From HUN Mine footprints for timeseries

Derived From Climate model 0.05x0.05 cells and cell centroids

Derived From HUN Historical Landsat Derived Mine Foot Prints v01

Derived From HUN SW footprint shapefiles v01

Derived From Mean Annual Climate Data of Australia 1981 to 2012
1000 Empirical Time series
figshare.com
researchdata.edu.au
png
Updated May 30, 2023
Cite
Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.5436136.v10
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Ben Fulcher
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney. The results of the computation are in the hctsa file, HCTSA_Empirical1000.mat, for use in Matlab using v1.06 of hctsa. The same data is also provided in .csv format for the hctsa_datamatrix.csv (results of feature computation), with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv), and the data of individual time series (each line a time series, for time series described in hctsa_timeseries-info.csv) is in hctsa_timeseries-data.csv. These .csv files were produced by running >>OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa. The input file, INP_Empirical1000.mat, is for use with hctsa, and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat'); Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package. See links in references for more comprehensive documentation for performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.
Data underlying Ph.D. thesis: Large set of graphs and timeseries of supply...
data.4tu.nl
zip
Updated Jul 22, 2024
Cite
Isabelle van Schilt (2024). Data underlying Ph.D. thesis: Large set of graphs and timeseries of supply chain simulation model [Dataset]. http://doi.org/10.4121/adf4373c-7a9a-4d9c-a1ff-0f893d8d0b06.v1
Available download formats: zip
Unique identifier
https://doi.org/10.4121/adf4373c-7a9a-4d9c-a1ff-0f893d8d0b06.v1
Dataset updated
Jul 22, 2024
Dataset provided by
4TU.ResearchData
Authors
Isabelle van Schilt
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This data is part of the Ph.D. thesis of Isabelle M. van Schilt, Delft University of Technology.

Data includes the time series data of the synthetic counterfeit PPE supply chain discrete event simulation model. This time series data is used for the paper of structural uncertainty and the quality diversity (QD) algorithm.

Also, the data includes the decision variables dictionary for both papers. These are two dictionaries in .pkl format that include 40,000 randomly generated graphs with real-world port data for the case study. One dictionary is sorted on betweenness, and the other (QD) on the density of the network. In addition, an example database of 50,000 randomly generated graphs (without real-world data) is included.
Data from: Climate Prediction Center (CPC) Global Precipitation Time Series
data.cnra.ca.gov
datadiscoverystudio.org
html
Updated Mar 1, 2023
Cite
National Oceanic and Atmospheric Administration (2023). Climate Prediction Center (CPC) Global Precipitation Time Series [Dataset]. https://data.cnra.ca.gov/dataset/climate-prediction-center-cpc-global-precipitation-time-series
Available download formats: html
Dataset updated
Mar 1, 2023
Dataset authored and provided by
National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
Description
The global precipitation time series provides time series charts showing observations of daily precipitation as well as accumulated precipitation compared to normal accumulated amounts for various stations around the world. These charts are created for different scales of time (30, 90, 365 days). Each station has a graphic that contains two charts. The first chart in the graphic is a time series in the format of a line graph, representing accumulated precipitation for each day in the time series compared to the accumulated normal amount of precipitation. The second chart is a bar graph displaying actual daily precipitation. The total accumulation and surplus or deficit amounts are displayed as text on the charts representing the entire time scale, in both inches and millimeters. The graphics are updated daily and the graphics reflect the updated observations and accumulated precipitation amounts including the latest daily data available. The available graphics are rotated, meaning that only the most recently created graphics are available. Previously made graphics are not archived.
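The arithmetic behind the line chart described above (accumulated precipitation against the accumulated normal, with a total surplus or deficit in both units) can be sketched as follows. This Python sketch only illustrates the calculation; the function name and the sample values are hypothetical, and it is not a description of how CPC actually produces its graphics.

```python
from itertools import accumulate

MM_PER_INCH = 25.4


def accumulation_summary(daily_obs_mm, daily_normal_mm):
    """Running totals of observed and normal precipitation plus the final departure.

    A positive departure is a surplus, a negative one a deficit.
    """
    acc_obs = list(accumulate(daily_obs_mm))
    acc_normal = list(accumulate(daily_normal_mm))
    departure_mm = acc_obs[-1] - acc_normal[-1]
    return {
        "accumulated_obs_mm": acc_obs,
        "accumulated_normal_mm": acc_normal,
        "departure_mm": departure_mm,
        "departure_in": departure_mm / MM_PER_INCH,
    }


# Made-up four-day example, not real station data.
summary = accumulation_summary([0.0, 5.0, 12.5, 0.0], [2.0, 2.0, 2.0, 2.0])
print(summary["accumulated_obs_mm"])     # [0.0, 5.0, 17.5, 17.5]
print(summary["accumulated_normal_mm"])  # [2.0, 4.0, 6.0, 8.0]
print(summary["departure_mm"])           # 9.5
```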
Data from: Climate Prediction Center (CPC) Global Temperature Time Series
data.wu.ac.at
datadiscoverystudio.org
html
Updated Jan 29, 2016
Cite
Department of Commerce (2016). Climate Prediction Center (CPC) Global Temperature Time Series [Dataset]. https://data.wu.ac.at/odso/data_gov/MmIwZDk5NjgtM2RmOS00YmFmLTliMzgtZjk1ZDdmMzY4MGFj
Available download formats: html
Dataset updated
Jan 29, 2016
Dataset provided by
Department of Commerce
Description
The global temperature time series provides time series charts using station based observations of daily temperature. These charts provide information about the observations compared to the derived daily normal temperature for various time scales (30, 90, 365 days). Each station has a graphic that contains three charts. The first chart in the graphic is a time series in the format of a line graph, representing the daily average temperatures compared to the expected daily normal temperatures. The second chart is a bar graph displaying daily departures from normal, including a line depicting the mean departure for the period. The third chart is a time series of the observed daily maximum and minimum temperatures. The graphics are updated daily and the graphics reflect the updated observations including the latest daily data available. The available graphics are rotated, meaning that only the most recently created graphics are available. Previously made graphics are not archived.
Statistical Data Analysis using R
figshare.com
txt
Updated May 30, 2023
Cite
Samuel Barsanelli Costa (2023). Statistical Data Analysis using R [Dataset]. http://doi.org/10.6084/m9.figshare.5501035.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5501035.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Samuel Barsanelli Costa
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
The R scripts contain statistical data analysis for streamflow and sediment data, including Flow Duration Curves, Double Mass Analysis, Nonlinear Regression Analysis for Suspended Sediment Rating Curves, and Stationarity Tests, and include several plots.
Data from: Indicator from the graph Laplacian of stock market time series...
researchdata.ntu.edu.sg
Updated Sep 23, 2024
Cite
DR-NTU (Data) (2024). Indicator from the graph Laplacian of stock market time series cross sections can precisely determine the durations of market crashes [Dataset]. http://doi.org/10.21979/N9/7YNZAQ
Available download formats: application/x-compressed (4042980746), application/x-compressed (7263573043), application/x-compressed (327987), txt (4855)
Unique identifier
https://doi.org/10.21979/N9/7YNZAQ
Dataset updated
Sep 23, 2024
Dataset provided by
DR-NTU (Data)
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Time period covered
Jan 1, 2019 - Jun 30, 2022
Dataset funded by
Ministry of Education, Singapore
Ministry of Education (MOE)
Description
This repository includes the processed ultrametric distance matrix data, MATLAB scripts, and data holder files (in .mat format) used to generate the results and figures in the PLOS paper with the above title.
Enron Email Time-Series Network
zenodo.org
explore.openaire.eu
csv
Updated Jan 24, 2020
Cite
Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst (2020). Enron Email Time-Series Network [Dataset]. http://doi.org/10.5281/zenodo.1342353
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.1342353
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We use the Enron email dataset to build a network of email addresses. It contains 614586 emails sent over the period from 6 January 1998 until 4 February 2004. During the pre-processing, we remove the periods of low activity and keep the emails from 1 January 1999 until 31 July 2002, which is 1448 days of email records in total. Also, we remove email addresses that sent fewer than three emails over that period. In total, the Enron email network contains 6 600 nodes and 50 897 edges.

To build a graph G = (V, E), we use email addresses as nodes V. Every node v_i has an attribute which is a time-varying signal that corresponds to the number of emails sent from this address during a day. We draw an edge e_ij between two nodes i and j if there is at least one email exchange between the corresponding addresses.

Column 'Count' in 'edges.csv' file is the number of 'From'->'To' email exchanges between the two addresses. This column can be used as an edge weight.

The file 'nodes.csv' contains a dictionary that is a compressed representation of time-series. The format of the dictionary is Day->The Number Of Emails Sent By the Address During That Day. The total number of days is 1448.

'id-email.csv' is a file containing the actual email addresses.
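The two structures described above, a sparse Day -> count dictionary per node and a 'Count' column usable as an edge weight, can be handled roughly as in the Python sketch below. The 0-based day indices, the in-memory row format, and the choice to sum both directions into one undirected weight are assumptions of this sketch; the description does not specify how the dictionary is serialized in 'nodes.csv'.

```python
TOTAL_DAYS = 1448  # number of days stated in the description


def expand_activity(sparse_counts, total_days=TOTAL_DAYS):
    """Turn a sparse {day: emails_sent} mapping into a dense daily signal.

    Assumption: day keys are 0-based indices; days without an entry are zero.
    """
    dense = [0] * total_days
    for day, count in sparse_counts.items():
        dense[int(day)] = int(count)
    return dense


def weighted_adjacency(edge_rows):
    """Undirected weighted adjacency from (from_id, to_id, count) rows.

    Counts for both directions are summed into one weight (a choice made
    for this sketch); self-loops are skipped.
    """
    adjacency = {}
    for src, dst, count in edge_rows:
        if src == dst:
            continue
        adjacency.setdefault(src, {})
        adjacency.setdefault(dst, {})
        adjacency[src][dst] = adjacency[src].get(dst, 0) + count
        adjacency[dst][src] = adjacency[dst].get(src, 0) + count
    return adjacency


# Made-up values, not taken from the Enron files.
signal = expand_activity({0: 3, 4: 1}, total_days=5)
graph = weighted_adjacency([(0, 1, 4), (1, 0, 2), (1, 2, 1)])
print(signal)  # [3, 0, 0, 0, 1]
print(graph)   # {0: {1: 6}, 1: {0: 6, 2: 1}, 2: {1: 1}}
```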
MANUAL FOR VISIBILITY GRAPHS MODELING USING R-STUDIO
dataverse.harvard.edu
Updated Nov 14, 2021
Cite
Dirceu Melo (2021). MANUAL FOR VISIBILITY GRAPHS MODELING USING R-STUDIO [Dataset]. http://doi.org/10.7910/DVN/V1WQ7D
Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/V1WQ7D
Dataset updated
Nov 14, 2021
Dataset provided by
Harvard Dataverse
Authors
Dirceu Melo
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
In this MANUAL FOR VISIBILITY GRAPHS MODELING USING R-STUDIO, we first present basic notions that allow the understanding of the mapping process, then show the computational idea. Finally, we work with the R scripts inside RStudio, exploring pseudo-random series, Brownian motion series, periodic series, Fibonacci series, and series of audio signals. We show: 1) how to generate time series in RStudio and later turn them into visibility graphs; 2) how to import time series stored in a directory and turn them into visibility graphs; 3) how to visualize networks using three types of algorithms, followed by calculation and visualization of the main properties of complex networks. About the codes included: the 3 included codes generate visibility graphs of series generated by RStudio functions. The code also calculates some metrics for complex networks, generates the graph plot and its degree distribution, and shows the plot of the series and its histogram.
Surface-Water-Quality Data and Time-Series Plots to Support Implementation...
datasets.ai
data.usgs.gov
55
Updated Sep 11, 2024
Cite
Department of the Interior (2024). Surface-Water-Quality Data and Time-Series Plots to Support Implementation of Site-Dependent Aluminum Criteria in Massachusetts, 2018–19 (ver. 1.1, February 2023) [Dataset]. https://datasets.ai/datasets/surface-water-quality-data-and-time-series-plots-to-support-implementation-of-site-depende
55 available download formats
Dataset updated
Sep 11, 2024
Dataset authored and provided by
Department of the Interior
Description
This data release includes water-quality data collected at 38 sites in central and eastern Massachusetts from April 2018 through May 2019 by the U.S. Geological Survey to support the implementation of site-dependent aluminum criteria for Massachusetts waters. Samples of effluent and receiving surface waters were collected monthly at four wastewater-treatment facilities (WWTFs) and seven water-treatment facilities (WTFs) (see SWQ_data_and_instantaneous_CMC_CCC_values.txt). The measured properties and constituents include pH, hardness, and filtered (dissolved) organic carbon, which are required inputs to the U.S. Environmental Protection Agency's Aluminum Criteria Calculator version 2.0. Outputs from the Aluminum Criteria Calculator are also provided in that file; these outputs consist of acute (Criterion Maximum Concentration, CMC) and chronic (Criterion Continuous Concentration, CCC) instantaneous water-quality values for total recoverable aluminum, calculated for monthly samples at selected ambient sites near each of the 11 facilities. Quality-control data from blank, replicate, and spike samples are provided (see SWQ_QC_data.txt). In addition to data tables, the data release includes time-series graphs of the discrete water-quality data (see SWQ_plot_discrete_all.zip). For pH, time-series graphs also are provided showing pH from the discrete monthly water-quality samples as well as near-continuous pH measured at one surface-water site at each facility (see SWQ_plot_contin_discrete_pH.zip). The near-continuous pH data, along with all of the discrete water-quality data except the quality-control data, are also available online from the U.S. Geological Survey's National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/nwis).
Time series plot (RKSI).xlsx
figshare.com
xlsx
Updated Sep 11, 2022
Yoonbae Chung (2022). Time series plot (RKSI).xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.21078223.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.21078223.v1
Dataset updated
Sep 11, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yoonbae Chung
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
raw data for time series plot of model and observation data
HUN groundwater flow rate time series v01
data.gov.au
gimi9.com
+2 more
zip
Updated Apr 13, 2022
Bioregional Assessment Program (2022). HUN groundwater flow rate time series v01 [Dataset]. https://data.gov.au/data/dataset/57b928ac-9d9d-407a-87d8-8405f4a4b11a
Explore at:
Available download formats: zip (702289)
Dataset updated
Apr 13, 2022
Dataset authored and provided by
Bioregional Assessment Program
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

The dataset includes a script and data for generating flow rate time-series figures for HUN GW modelling. The flow rate data points represent historical pumping rates and estimates of future pumping rates used to represent the impacts of coal mining on groundwater levels and surface water - groundwater fluxes in the Hunter subregion.

The script was written to generate time-series graphs of flow rates used in the HUN GW modelling for each mine in the Hunter subregion.

Dataset History

Historical mine water pumping rates and estimates of future flow rates were extracted from mining reports (groundwater modelling within mine Environmental Assessments) for each baseline and additional coal resource development modelled in the Hunter subregion. These flow rates are inputs to the groundwater model to represent the impacts of coal mining over time on groundwater (drawdowns and changes in surface water - groundwater fluxes).

A script was written to generate time-series graphs for each mine represented in the groundwater model. The full set of mining reports from which data were extracted and the time-series graphs generated from these data are included in Herron et al. (2016).

Herron NF, Frery E, Wilkins A, Crosbie RS, Peña-Arancibia JL, Zhang YQ, Viney NR, Rachakonda PK, Ramage A, Marvanek SP, Gresham MP and McVicar TR (2016) Observations analysis, statistical analysis and interpolation for the Hunter subregion. Product 2.1-2.2 for the Hunter subregion from the Northern Sydney Basin Bioregional Assessment. Department of the Environment, Bureau of Meteorology, CSIRO and Geoscience Australia, Australia. http://data.bioregionalassessments.gov.au/product/NSB/HUN/2.1-2.2.

Dataset Citation

Bioregional Assessment Programme (XXXX) HUN groundwater flow rate time series v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/57b928ac-9d9d-407a-87d8-8405f4a4b11a.

Dataset Ancestors

Derived From HUN GW Model Mines raw data v01
Commercial Tool Rental Data For 2016 and 2017
kaggle.com
Updated Feb 17, 2019
David Maillie (2019). Commercial Tool Rental Data For 2016 and 2017 [Dataset]. https://www.kaggle.com/dmaillie/commercial-tool-rental-data-for-2016-and-2017/metadata
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 17, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
David Maillie
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset

This dataset was created by David Maillie

Released under CC BY-SA 4.0

Contents
NYC Bike Sharing Network: Time-Series Enhanced Nodes and Edges Dataset
zenodo.org
json
Updated Sep 27, 2024
Constantin Urbainsky (2024). NYC Bike Sharing Network: Time-Series Enhanced Nodes and Edges Dataset [Dataset]. http://doi.org/10.5281/zenodo.13846868
Explore at:
Available download formats: json
Unique identifier
https://doi.org/10.5281/zenodo.13846868
Dataset updated
Sep 27, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Constantin Urbainsky
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
New York
Description
This dataset presents a graph representation of the New York City Bike Sharing system, in which nodes represent stations and edges represent trips between those stations. The dataset is distinctive in integrating dynamic properties as time series data, which are updated using historical records (CSV files) and live data feeds (GBFS files) provided by the NYC Bike Sharing system.

Nodes:

Source: Data is collected from the New York City Bike Station Information API.

Attributes:

ID: Unique identifier for each station.

Name: Name of the station.

Capacity: Number of bikes the station can accommodate.

Short ID: A condensed identifier used internally.

Time Series Data:

Updated every 5 minutes from the Station Status API.

Captures changes in bike availability, recording values only when they differ from previous data points.
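
The change-only recording described above can be illustrated with a minimal Python sketch. It is not the dataset author's code: the function names (compress_changes, value_at), the (timestamp, value) tuple layout, and the sample polls are all assumptions made for illustration.

```python
from bisect import bisect_right


def compress_changes(samples):
    """Keep a (timestamp, value) sample only when its value differs from the last kept one."""
    kept = []
    for timestamp, value in samples:
        if not kept or kept[-1][1] != value:
            kept.append((timestamp, value))
    return kept


def value_at(series, timestamp):
    """Return the value in effect at `timestamp`, carrying the last recorded change forward."""
    times = [t for t, _ in series]
    index = bisect_right(times, timestamp)
    # No change recorded at or before `timestamp` means the value is unknown.
    return series[index - 1][1] if index else None


# Hypothetical 5-minute polls: (seconds since start, bikes available).
polls = [(0, 5), (300, 5), (600, 4), (900, 4), (1200, 5)]
changes = compress_changes(polls)
```

Because unchanged polls are dropped, a reader of such a series would recover the value at an arbitrary time by carrying the most recent change forward, which is what value_at does here.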

Edges:

Source: Compiled from trip data provided in CSV format specific to NYC Bike Sharing.

Attributes:

Trip Counter: Total number of trips recorded.

Bike Type Counter: Counts trips made with electric versus classic bikes.

Trip Type Counter: Separates trips made by members versus casual riders.

Active Trips Tracker: Tracks the number of active trips at any given moment.

Aggregation: Trip data between identical start and end points, in the same direction, are aggregated into a single edge, with time-series tracking the frequency of these trips.
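
The edge aggregation described above can be sketched roughly as follows. This is an illustrative assumption, not the dataset's actual schema: the trip field names (start, end, bike, rider) and the edge dictionary layout are hypothetical.

```python
from collections import Counter, defaultdict


def aggregate_trips(trips):
    """Group trips into directed edges keyed by (start station, end station)."""
    edges = defaultdict(
        lambda: {"trip_count": 0, "bike_type": Counter(), "rider_type": Counter()}
    )
    for trip in trips:
        # Direction matters: (A, B) and (B, A) are separate edges.
        edge = edges[(trip["start"], trip["end"])]
        edge["trip_count"] += 1
        edge["bike_type"][trip["bike"]] += 1
        edge["rider_type"][trip["rider"]] += 1
    return dict(edges)


# Hypothetical trip records; the field names are assumptions.
trips = [
    {"start": "A", "end": "B", "bike": "electric", "rider": "member"},
    {"start": "A", "end": "B", "bike": "classic", "rider": "casual"},
    {"start": "B", "end": "A", "bike": "classic", "rider": "member"},
]
edges = aggregate_trips(trips)
```

Keying on the ordered pair keeps trips in opposite directions on separate edges, matching the "in the same direction" condition in the description.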
