https://spdx.org/licenses/etalab-2.0.html
These dynamic graphs are derived from the "CAMELS-FR dataset". An HTML file is provided for each catchment, displaying dynamic plots of hydroclimatic time series. The files are available in several languages.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wikipedia temporal graph.
The dataset is based on two Wikipedia SQL dumps: (1) English language articles and (2) user visit counts per page per hour (aka pagecounts). The original datasets are publicly available on the Wikimedia website.
The static graph structure is extracted from English-language Wikipedia articles. Redirects are removed. Before building the Wikipedia graph we introduce thresholds on the minimum number of visits per hour and the maximum in-degree. We remove pages that have fewer than 500 visits per hour at least once during the specified period. In addition, we remove nodes (pages) with in-degree higher than 8 000 to build a more meaningful initial graph. After cleaning, the graph contains 116 016 nodes (out of 4 856 639 pages in total) and 6 573 475 edges. The graph can be imported in two ways: (1) using edges.csv and vertices.csv, or (2) using the enwiki-20150403-graph.gt file, which can be opened with the open-source Python library Graph-Tool.
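A minimal loading sketch for the two import options (the CSV column names are assumptions; adjust them to the actual headers shipped with the dataset):

```python
import pandas as pd
import networkx as nx

# (1) From the CSV pair: first two columns are assumed to be source/target ids.
edges = pd.read_csv("edges.csv")
vertices = pd.read_csv("vertices.csv")
G = nx.from_pandas_edgelist(edges, source=edges.columns[0], target=edges.columns[1],
                            create_using=nx.DiGraph)
print(G.number_of_nodes(), G.number_of_edges())

# (2) From the graph-tool binary (requires the graph-tool library):
# import graph_tool.all as gt
# g = gt.load_graph("enwiki-20150403-graph.gt")
# print(g.num_vertices(), g.num_edges())
```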
The time-series data contains users' visit counts from 02:00, 23 September 2014 until 23:00, 30 April 2015. The total number of hours is 5278. The data is stored in two formats: CSV and H5. The CSV file contains data in the format [page_id :: count_views :: layer], where the layer represents an hour. In the H5 file, each layer likewise corresponds to an hour.
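A minimal reading sketch (the file name and the "::" separator are assumptions based on the format description):

```python
import pandas as pd

# Adjust the file name and sep= if the actual file uses a different layout.
ts = pd.read_csv("pagecounts.csv", sep="::", engine="python",
                 names=["page_id", "count_views", "layer"], header=None)

# Pivot to a (page x hour) matrix: one row per page, one column per hourly layer.
views = ts.pivot_table(index="page_id", columns="layer",
                       values="count_views", fill_value=0)
print(views.shape)  # roughly (number of pages, 5278 hourly layers)
```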
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly, two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.
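As a simplified illustration of the overall pipeline only (a plain CUSUM scan on a single topological summary, not the paper's efficient-score statistic with sieve-bootstrap calibration):

```python
import numpy as np
import networkx as nx

def snapshot_summary(G):
    # One possible univariate characterization per snapshot; the paper uses
    # local and global topological summaries, this merely illustrates the idea.
    return nx.average_clustering(G)

def cusum_change_point(x):
    # Plain CUSUM scan over the summary series.
    x = np.asarray(x, dtype=float)
    n = len(x)
    stats = [abs(x[:k].mean() - x[k:].mean()) * np.sqrt(k * (n - k) / n)
             for k in range(1, n)]
    return int(np.argmax(stats)) + 1  # candidate change-point index

# Toy dynamic network: edge density jumps at t = 30.
snapshots = [nx.erdos_renyi_graph(50, 0.05 if t < 30 else 0.15, seed=t)
             for t in range(60)]
series = [snapshot_summary(G) for G in snapshots]
print("estimated change point near t =", cusum_change_point(series))
```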
Abstract
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement. This dataset contains time series figures (shown in the report) generated for baseline and CRDP mine footprints, which represent the footprints used in the surface water modelling. The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01).
Dataset History
The footprints are contained within a single shapefile (HUN Mine footprints for timeseries) and the timelines are contained within the spreadsheet (HUN mine time series tables v01). Timelines for all mines were assembled into the spreadsheet Mine_files_summary_Final.xlsx. The script MineFootprint_TimeSeries_Final.m reads the data from the spreadsheet and creates the time series figures in PNG format, which form the dataset.
Dataset Citation
Bioregional Assessment Programme (XXXX) HUN Mine Footprints Timeseries Graph v01. Bioregional Assessment Derived Dataset. Viewed 22 June 2018, http://data.bioregionalassessments.gov.au/dataset/11493517-df5f-49ed-84dc-23afdbe00c5e.
Dataset Ancestors
Derived From HUN Groundwater footprint polygons v01
Derived From HUN mine time series tables v01
Derived From BILO Gridded Climate Data: Daily Climate Data for each year from 1900 to 2012
Derived From HUN Historical Landsat Images Mine Foot Prints v01
Derived From Historical Mining footprints DTIRIS HUN 20150707
Derived From HUN Mine footprints for timeseries
Derived From Climate model 0.05x0.05 cells and cell centroids
Derived From HUN Historical Landsat Derived Mine Foot Prints v01
Derived From HUN SW footprint shapefiles v01
Derived From Mean Annual Climate Data of Australia 1981 to 2012
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Building TimeSeries (BTS) dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique Brick classes. Moreover, the metadata is standardized using the Brick schema. To get started, download the data and run the DIEF_inspect_raw.ipynb file. For more info, including data cards: https://github.com/cruiseresearchgroup/DIEF_BTS
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multivariate time series anomaly detection is a challenging problem because there can be a number of complex relationships between variables in multivariate time series. Although graph neural networks have been shown to be effective in capturing variable-variable relationships (i.e., relationships between two variables), they struggle to capture variable-group relationships (i.e., relationships between variables and groups of variables). To overcome this limitation, we propose a novel method called DHG-AD for multivariate time series anomaly detection. DHG-AD employs directed hypergraphs to model variable-group relationships within multivariate time series. For each time window, DHG-AD constructs two different directed hypergraphs to represent relationships between variables and groups of positively and negatively correlated variables, enabling the model to capture both types of relationships effectively. The directed hypergraph neural networks learn node representations from these hypergraphs, allowing comprehensive multivariate interaction modeling for anomaly detection. We show through experiments using various evaluation metrics that our proposed method achieves the best scores among the compared methods on two real-world datasets.
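As a rough, illustrative sketch of the grouping idea only (not the paper's directed-hypergraph construction or the neural network built on top of it), positively and negatively correlated variable groups for a single time window could be formed as follows:

```python
import numpy as np

def correlation_groups(window, thr=0.5):
    """For one time window (T x D array), form, per variable, the group of
    positively and negatively correlated variables. The threshold thr is an
    arbitrary illustrative choice."""
    corr = np.corrcoef(window.T)          # D x D correlation matrix
    np.fill_diagonal(corr, 0.0)
    pos = {i: np.where(corr[i] >= thr)[0].tolist() for i in range(corr.shape[0])}
    neg = {i: np.where(corr[i] <= -thr)[0].tolist() for i in range(corr.shape[0])}
    return pos, neg

rng = np.random.default_rng(0)
window = rng.normal(size=(100, 5))        # hypothetical multivariate window
pos_groups, neg_groups = correlation_groups(window)
print(pos_groups, neg_groups)
```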
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We use the Enron email dataset to build a network of email addresses. It contains 614586 emails sent over the period from 6 January 1998 until 4 February 2004. During pre-processing, we remove the periods of low activity and keep the emails from 1 January 1999 until 31 July 2002, which is 1448 days of email records in total. We also remove email addresses that sent fewer than three emails over that period. In total, the Enron email network contains 6 600 nodes and 50 897 edges.
To build a graph G = (V, E), we use email addresses as nodes V. Every node v_i has an attribute which is a time-varying signal corresponding to the number of emails sent from this address during a day. We draw an edge e_ij between two nodes i and j if there is at least one email exchange between the corresponding addresses.
The 'Count' column in the 'edges.csv' file gives the number of 'From'->'To' email exchanges between the two addresses. This column can be used as an edge weight.
The file 'nodes.csv' contains a dictionary that is a compressed representation of the time series: it maps each day to the number of emails sent by the address during that day. The total number of days is 1448.
'id-email.csv' is a file containing the actual email addresses.
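A minimal loading sketch for the file layout described above (the 'From'/'To' column names and the literal-dict encoding of the per-node time series are assumptions; adjust to the actual files):

```python
import ast
import pandas as pd
import networkx as nx

edges = pd.read_csv("edges.csv")
# 'Count' is documented above; 'From'/'To' column names are assumed.
G = nx.from_pandas_edgelist(edges, source="From", target="To",
                            edge_attr="Count", create_using=nx.DiGraph)

nodes = pd.read_csv("nodes.csv")
# The per-node time series is stored as a day -> email-count dictionary; here it
# is assumed to sit in the second column and to be parseable with literal_eval.
for _, row in nodes.iterrows():
    daily = ast.literal_eval(str(row.iloc[1]))   # {day: emails sent that day}
    if row.iloc[0] in G:
        G.nodes[row.iloc[0]]["daily_emails"] = daily

print(G.number_of_nodes(), G.number_of_edges())
```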
The global temperature time series provides time series charts using station-based observations of daily temperature. These charts compare the observations to the derived daily normal temperature over various time scales (30, 90, and 365 days). Each station has a graphic containing three charts. The first chart is a line-graph time series of the daily average temperatures compared to the expected daily normal temperatures. The second chart is a bar graph of daily departures from normal, including a line depicting the mean departure for the period. The third chart is a time series of the observed daily maximum and minimum temperatures. The graphics are updated daily and reflect the latest daily observations available. The available graphics are rotated, meaning that only the most recently created graphics are available; previously made graphics are not archived.
The global precipitation time series provides time series charts showing observations of daily precipitation as well as accumulated precipitation compared to normal accumulated amounts for various stations around the world. These charts are created for different time scales (30, 90, and 365 days). Each station has a graphic containing two charts. The first chart is a line-graph time series of accumulated precipitation for each day compared to the normal accumulated amount of precipitation. The second chart is a bar graph of actual daily precipitation. The total accumulation and surplus or deficit amounts over the entire time scale are displayed as text on the charts, in both inches and millimeters. The graphics are updated daily and reflect the latest daily observations and accumulated precipitation amounts available. The available graphics are rotated, meaning that only the most recently created graphics are available; previously made graphics are not archived.
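As a small illustrative sketch (with made-up values, not the actual station data), the accumulated-versus-normal and surplus/deficit quantities described above can be computed as:

```python
import pandas as pd

# Hypothetical daily station records over a 90-day time scale.
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=90, freq="D"),
    "precip_obs_mm": 2.0,        # observed daily precipitation
    "precip_normal_mm": 2.5,     # daily normal precipitation
})
df["accum_obs_mm"] = df["precip_obs_mm"].cumsum()
df["accum_normal_mm"] = df["precip_normal_mm"].cumsum()
surplus_deficit_mm = df["accum_obs_mm"].iloc[-1] - df["accum_normal_mm"].iloc[-1]
print(f"surplus/deficit over the period: {surplus_deficit_mm:.1f} mm "
      f"({surplus_deficit_mm / 25.4:.2f} in)")
```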
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A critical issue in intelligent building control is detecting energy consumption anomalies based on intelligent device status data. The building field is plagued by energy consumption anomalies caused by a number of factors, many of which are associated with one another in apparent temporal relationships. Most traditional detection methods rely solely on a single variable of energy consumption data and its time series changes. Therefore, they are unable to examine the correlation between the multiple characteristic factors that affect energy consumption anomalies and their relationship in time, and the outcomes of anomaly detection are one-sided. To address these problems, this paper proposes an anomaly detection method based on multivariate time series. Firstly, in order to extract the correlation between different feature variables affecting energy consumption, this paper introduces a graph convolutional network to build an anomaly detection framework. Secondly, as different feature variables influence each other to different degrees, the framework is enhanced by a graph attention mechanism so that time series features with a higher influence on energy consumption are given larger attention weights, resulting in better anomaly detection of building energy consumption. Finally, the effectiveness of this paper's method is compared with that of existing methods for detecting energy consumption anomalies in smart buildings using standard data sets. The experimental results show that the model has better detection accuracy.
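As a generic, hedged illustration of the attention-weighted graph aggregation the abstract refers to (not the paper's actual architecture), a single GAT-style layer over feature variables could be sketched as follows:

```python
import numpy as np

def graph_attention_layer(H, A, W, a, alpha=0.2):
    """One GAT-style aggregation step over node features H (N x F_in), an
    adjacency matrix A (N x N) with self-loops, weights W (F_in x F_out) and an
    attention vector a of length 2*F_out. Illustrative only."""
    Wh = H @ W                                       # N x F_out
    N = Wh.shape[0]
    e = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            s = np.concatenate([Wh[i], Wh[j]]) @ a   # a^T [Wh_i || Wh_j]
            e[i, j] = s if s > 0 else alpha * s      # LeakyReLU
    e = np.where(A > 0, e, -1e9)                     # attend only over edges
    attn = np.exp(e - e.max(axis=1, keepdims=True))
    attn = attn / attn.sum(axis=1, keepdims=True)    # softmax per node
    return attn @ Wh                                 # attention-weighted features

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))                          # 6 feature variables
A = np.eye(6) + (rng.random((6, 6)) > 0.5)           # toy adjacency with self-loops
out = graph_attention_layer(H, A, W=rng.normal(size=(4, 3)), a=rng.normal(size=6))
print(out.shape)                                     # (6, 3)
```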
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents a comprehensive graph representation of the New York City bike sharing system, structured with nodes representing stations and edges delineating trips between these stations. The dataset is distinctive in integrating dynamic properties as time series data, which are meticulously updated using historical records (CSV files) and live data feeds (GBFS files) provided by the NYC bike sharing system. A minimal construction sketch is shown below.
Nodes:
Edges:
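A minimal, hypothetical sketch of building such a station graph from a trip-history CSV (the file name and column names are assumptions, since historical Citi Bike CSV schemas vary):

```python
import pandas as pd
import networkx as nx

# Assumed columns: 'start station id', 'end station id', 'started_at'.
trips = pd.read_csv("citibike_trips.csv")
G = nx.MultiDiGraph()
for _, trip in trips.iterrows():
    # One edge per trip, carrying its start timestamp as a dynamic property.
    G.add_edge(trip["start station id"], trip["end station id"],
               started_at=trip["started_at"])
print(G.number_of_nodes(), "stations,", G.number_of_edges(), "trips")
```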
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
F1-score of the model when different components are removed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Version 4 important changes:
- Added a compressed zip file "Evaluating different time-steps.zip" including the evaluation of performance for the 10, 20, 30, 40, and 50 time-step alternatives.
- The "Generated Prediction Data of COVID-19's Daily Infections in Brazil.zip" compressed zip file includes all the files and folders in this dataset, except for the evaluation of time-step alternatives.
Dataset general description:
• This dataset reports 4195 recurrent neural network models, their settings, and their generated prediction CSV files, graphs, and metadata files for predicting COVID-19's daily infections in Brazil by training on limited raw data (30 and 40 time-steps). The code used was developed by the author and is located in the following online data repository: http://dx.doi.org/10.17632/yp4d95pk7n.2
Dataset content:
• Models, graphs, and CSV prediction files:
1. Deterministic mode (DM): includes 1194 generated model files (30 time-steps), with their 2835 generated graphs and 2835 prediction files. This mode likewise includes 1976 generated model files (40 time-steps), with their 7301 generated graphs and 7301 prediction files.
2. Non-deterministic mode (NDM): includes 20 generated model files (30 time-steps), with their 53 generated graphs and 53 prediction files.
3. Technical validation mode (TVM): includes 1001 generated model files (30 time-steps), and 3619 generated graphs and 3619 prediction files for 349 models (from a 358-model sample of the 1001 models; 9 models did not reach the accuracy threshold). Also includes all data for the control group, India.
4. One graph and one prediction file for each of DM and NDM, reporting evaluation up to 2020-07-11.
• Settings and metadata for the above 3 categories:
1. Settings used during the training sessions, in JSON files (the file count in the technical validation settings folder ignores the accuracy threshold, 5370 files, unlike the zip file with 3619 files).
2. Metadata: training and prediction setup and accuracy, in CSV files.
Raw data source used to train the models:
• The raw data [1] used for training the models is from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://github.com/CSSEGISandData/COVID-19 (accessed 2020-07-20)
• The models were trained on these versions of the raw data (both accessed 2020-07-08):
1. up to 2020-06-29: https://github.com/CSSEGISandData/COVID-19/raw/78d91b2dbc2a26eb2b2101fa499c6798aa22fca8/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
2. up to 2020-06-13: https://github.com/CSSEGISandData/COVID-19/raw/02ea750a263f6d8b8945fdd3253b35d3fd9b1bee/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
A minimal sketch of preparing 30 time-step training windows from the first of these files is shown below.
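The following sketch is not the author's code (see the repository linked above for that); it only illustrates deriving daily infections for Brazil from the linked cumulative CSV and slicing them into 30 time-step windows, assuming the standard JHU CSSE column layout (four metadata columns followed by one column per date):

```python
import numpy as np
import pandas as pd

url = ("https://github.com/CSSEGISandData/COVID-19/raw/"
       "78d91b2dbc2a26eb2b2101fa499c6798aa22fca8/csse_covid_19_data/"
       "csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
confirmed = pd.read_csv(url)
# Skip Province/State, Country/Region, Lat, Long; keep the date columns.
brazil = confirmed[confirmed["Country/Region"] == "Brazil"].iloc[0, 4:].astype(float)
daily = brazil.diff().dropna().to_numpy()        # cumulative -> daily infections

def make_windows(series, n_steps=30):
    # Each sample uses n_steps consecutive days to predict the following day.
    X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
    y = series[n_steps:]
    return X, y

X, y = make_windows(daily, n_steps=30)
print(X.shape, y.shape)
```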
References: [1] Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533-534. doi: 10.1016/S1473-3099(20)30120-1
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset was created by David Maillie
Released under CC BY-SA 4.0
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The compressed package (study code.zip) contains the code files implementing a paper under review ("Predicting short-term PM2.5 concentrations at fine temporal resolutions using a multi-branch temporal graph convolutional neural network").
Within study code.zip, main.py is the model code based on a multi-branch temporal graph convolutional neural network, tgcn.py implements the temporal graph convolutional network, utils.py contains functions for the graph convolution process, and input_data.py handles data processing.
The zip file (study data.zip) provides an example of air quality data including PM2.5 concentrations and some meteorological data. input_data.zip also contains an N by N adjacency matrix, which describes the spatial relationship between air quality monitoring stations.
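As a generic illustration of the graph-convolution propagation that such a model builds on (this is not the code in study code.zip), a single GCN-style layer using the N by N station adjacency matrix might look like:

```python
import numpy as np

def normalized_adjacency(A):
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt       # D^-1/2 (A + I) D^-1/2

def gcn_layer(X, A_norm, W):
    return np.maximum(A_norm @ X @ W, 0.0)       # ReLU(A_norm X W)

rng = np.random.default_rng(0)
A = (rng.random((10, 10)) > 0.7).astype(float)   # toy N x N station adjacency
A = np.maximum(A, A.T)                           # symmetrize
X = rng.normal(size=(10, 8))                     # e.g. PM2.5 + meteorological features
H = gcn_layer(X, normalized_adjacency(A), rng.normal(size=(8, 16)))
print(H.shape)                                   # (10, 16)
```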
https://www.law.cornell.edu/uscode/text/17/106
Graph data represents complex relationships across diverse domains, from social networks to healthcare and chemical sciences. However, real-world graph data often spans multiple modalities, including time-varying signals from sensors, semantic information from textual representations, and domain-specific encodings. This dissertation introduces innovative multimodal learning techniques for graph-based predictive modeling, addressing the intricate nature of these multidimensional data representations. The research systematically advances graph learning through innovative methodological approaches across three critical modalities. Initially, we establish robust graph-based methodological foundations through advanced techniques including prompt tuning for heterogeneous graphs and a comprehensive framework for imbalanced learning on graph data. We then extend these methods to time series analysis, demonstrating their practical utility through applications such as hierarchical spatio-temporal modeling for COVID-19 forecasting and graph-based density estimation for anomaly detection in unmanned aerial systems. Finally, we explore textual representations of graphs in the chemical domain, reformulating reaction yield prediction as an imbalanced regression problem to enhance performance in underrepresented high-yield regions critical to chemists.
https://creativecommons.org/publicdomain/zero/1.0/
This graph was created in Power BI and Tableau.
The Human Capital Index (HCI) database provides data at the country level for each of the components of the Human Capital Index as well as for the overall index, disaggregated by gender. The index measures the amount of human capital that a child born today can expect to attain by age 18, given the risks of poor health and poor education that prevail in the country where she lives. It is designed to highlight how improvements in current health and education outcomes shape the productivity of the next generation of workers, assuming that children born today experience over the next 18 years the educational opportunities and health risks that children in this age range currently face.
This page presents Greenland's climate context for the current climatology, 1991-2020, derived from observed, historical data. Information should be used to build a strong understanding of current climate conditions in order to appreciate future climate scenarios and projected change. You can visualize data for the current climatology through spatial variation, the seasonal cycle, or as a time series. Analysis is available for both annual and seasonal data. Data presentation defaults to national-scale aggregation, however sub-national data aggregations can be accessed by clicking within a country, on a sub-national unit. Other historical climatologies can be selected from the Time Period dropdown list.
The Building TimeSeries (BTS) dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique ontologies. Moreover, the metadata is standardised in the form of a knowledge graph using the Brick schema.
https://creativecommons.org/publicdomain/zero/1.0/
All headlines containing hurricane names Harvey, Irma, and Maria. Extracted from Mediacloud between 8/23/2017 and 10/01/2017.