100+ datasets found

Real-World Signed Graphs Annotated for Whole Graph Classification
zenodo.org
zip
Updated Jan 7, 2025
Cite
Noé Cécillon; Vincent Labatut; Richard Dufour; Nejat Arınık (2025). Real-World Signed Graphs Annotated for Whole Graph Classification [Dataset]. http://doi.org/10.5281/zenodo.13851362
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13851362
Dataset updated
Jan 7, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Noé Cécillon; Vincent Labatut; Richard Dufour; Nejat Arınık
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Warning: the ground truth is missing from some of these datasets. This was fixed in version 1.0.1, which you should use instead.

Description: this corpus was designed as an experimental benchmark for a task of signed graph classification. It is composed of three datasets derived from external sources and adapted to our needs:

SpaceOrigin Conversations [1]: set of conversational graphs, each one associated with either a situation of verbal abuse or a normal situation. These conversations model interactions happening in chatrooms hosted by an MMORPG. The graphs were originally unsigned: we attributed signs to the edges based on the polarity of the exchanged messages.

Correlation Clustering Instances [2]: set of graphs generated randomly as instances of the Correlation Clustering problem, which consists in partitioning signed graphs. These graphs are not associated with any class in the original paper. We proposed a class based on certain features of the space of optimal solutions explored in [2].

European Parliament Roll-Calls [3]: vote networks extracted from the activity of French Members of the European Parliament. The original data does not have any class associated with the networks: we proposed one based on the number of political factions identified in each network in [3].
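
As a rough illustration of what partitioning a signed graph means in the Correlation Clustering setting, the sketch below counts the edges that disagree with a given partition: negative edges inside a cluster and positive edges between clusters. The edge-list layout and the names `imbalance`, `edges`, and `partition` are assumptions made for this example and are not taken from the datasets or from [2].

```python
def imbalance(edges, partition):
    """Count edges that disagree with the partition.

    edges: iterable of (u, v, sign) with sign > 0 or sign < 0 (hypothetical layout).
    partition: dict mapping each vertex to a cluster id.
    """
    frustrated = 0
    for u, v, sign in edges:
        same_cluster = partition[u] == partition[v]
        # A negative edge inside a cluster, or a positive edge across clusters,
        # counts against the partition.
        if (sign < 0 and same_cluster) or (sign > 0 and not same_cluster):
            frustrated += 1
    return frustrated


# Toy signed triangle: a and b are allies, c opposes both.
edges = [("a", "b", +1), ("b", "c", -1), ("a", "c", -1)]
print(imbalance(edges, {"a": 0, "b": 0, "c": 1}))
print(imbalance(edges, {"a": 0, "b": 0, "c": 0}))
```

A partition with a lower count fits the signs better; an optimal solution minimizes this count.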

These data were used in [4] in order to train and assess various representation learning methods. The authors proposed Signed Graph2vec, a signed variant of Graph2vec, and WSGCN, a whole-graph variant of Signed Graph Convolutional Networks (SGCN), and used an aggregated version of Signed Network Embeddings (SiNE) as a baseline. The article provides more information regarding the properties of the datasets, and how they were constituted.

Software: the software used to train the representation learning methods and classifiers is publicly available online: SWGE.

References:

[1] Papegnies, É.; Labatut, V.; Dufour, R. & Linarès, G. Conversational Networks for Automatic Online Moderation. IEEE Transactions on Computational Social Systems, 2019, 6:38-55. DOI: 10.1109/TCSS.2018.2887240 ⟨hal-01999546⟩

[2] Arınık, N.; Figueiredo, R. & Labatut, V. Multiplicity and Diversity: Analyzing the Optimal Solution Space of the Correlation Clustering Problem on Complete Signed Graphs. Journal of Complex Networks, 2020, 8(6):cnaa025. DOI: 10.1093/comnet/cnaa025 ⟨hal-02994011⟩

[3] Arınık, N.; Figueiredo, R. & Labatut, V. Multiple partitioning of multiplex signed networks: Application to European parliament votes. Social Networks, 2020, 60:83-102. DOI: 10.1016/j.socnet.2019.02.001 ⟨hal-02082574⟩

[4] Cécillon, N.; Labatut, V.; Dufour, R. & Arınık, N. Whole-Graph Representation Learning For the Classification of Signed Networks. IEEE Access, 2024, 12:151303-151316. DOI: 10.1109/ACCESS.2024.3472474 ⟨hal-04712854⟩

Funding: part of this work was funded by a grant from the Provence-Alpes-Côte-d'Azur region (PACA, France) and the Nectar de Code company.

Citation: If you use this data or the associated source code, please cite article [4]:

@Article{Cecillon2024,
author = {Cécillon, Noé and Labatut, Vincent and Dufour, Richard and Arınık, Nejat},
title = {Whole-Graph Representation Learning For the Classification of Signed Networks},
journal = {IEEE Access},
year = {2024},
volume = {12},
pages = {151303-151316},
doi = {10.1109/ACCESS.2024.3472474},
}
Code: Generating Graphs based on Real-World Port Data
data.4tu.nl
zip
Updated Jul 22, 2024
Cite
Isabelle van Schilt (2024). Code: Generating Graphs based on Real-World Port Data [Dataset]. http://doi.org/10.4121/72e97df0-147c-4228-a1b4-8bb8e8461317.v1
Available download formats: zip
Unique identifier
https://doi.org/10.4121/72e97df0-147c-4228-a1b4-8bb8e8461317.v1
Dataset updated
Jul 22, 2024
Dataset provided by
4TU.ResearchData
Authors
Isabelle van Schilt
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This repository is part of the Ph.D. thesis of Isabelle M. van Schilt, Delft University of Technology.
This repository is used to generate a graph of open-source sea and airport data. For this, open-source data of the shipping schedules given by MSC, Maersk, HMM, and Evergreen is used. The data is collected from the websites of the shipping companies (see also https://github.com/EwoutH/shipping-data). The data is then processed to generate a graph of the shipping schedules, including the distributions of the shipping schedules. The graph is used to analyze the shipping schedules and to identify the most important ports in the network. Airport data is collected from the open-source OpenFlights database.
As a case study, we collect data on CN-HK to main ports in the USA, and mostly MSC data on South America to NL-BE.
This repository is used for developing various graphs on open-source data and automatically running it as a simulation model in the repository: complex_stylized_supply_chain_model_generator
Data from: Empowering Graph Neural Networks for Real-World Tasks
curate.nd.edu
pdf
Updated Nov 11, 2024
Cite
Zhichun Guo (2024). Empowering Graph Neural Networks for Real-World Tasks [Dataset]. http://doi.org/10.7274/25608504.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.7274/25608504.v1
Dataset updated
Nov 11, 2024
Dataset provided by
University of Notre Dame
Authors
Zhichun Guo
License
https://www.law.cornell.edu/uscode/text/17/106
Description
Numerous types of real-world data can be naturally represented as graphs, such as social networks, trading networks, and biological molecules. This highlights the need for effective graph representations to support various tasks. In recent years, graph neural networks (GNNs) have demonstrated remarkable success in extracting information from graphs and enabling graph-related tasks. However, they still face a series of challenges in solving real-world problems, including scarcity of labeled data, scalability issues, potential bias, etc. These challenges stem from both domain-specific issues and inherent limitations of GNNs. This thesis introduces various strategies to tackle these challenges and empower GNNs on real-world tasks.

For the domain-specific challenges, in this thesis, we especially focus on challenges in the chemistry domain, which plays a pivotal role in the drug discovery process. Considering the significant resources needed for labeling through wet lab experiments, the AI for chemistry domain struggles with the scarcity of labeled datasets. To address this, we present a comprehensive set of strategies that span model-based and data-based strategies alongside a hybrid method. These methods ingeniously utilize the diversity of data, models, and molecular representations to compensate for the lack of labels in individual datasets. For the inherent challenges, this thesis introduces strategies to overcome two main challenges: scalability and degree-based issues, especially in the context of link prediction tasks. Both challenges originate from the mechanism of GNNs, which involves the iterative aggregation of neighboring nodes' information to update each central node. For the scalability issue, our work not only preserves GNNs' prediction performance but also significantly boosts inference speed. Regarding degree bias, our work substantially improves the effectiveness of GNNs for underrepresented nodes with very little additional computational cost. These contributions not only address critical gaps in applying GNNs to specific domains but also lay the groundwork for future exploration in the broader field of graph-based real-world tasks.
United States - Population Growth for World
tradingeconomics.com
csv, excel, json, xml
Updated Jun 6, 2019
Cite
TRADING ECONOMICS (2019). United States - Population Growth for World [Dataset]. https://tradingeconomics.com/united-states/population-growth-for-world-fed-data.html
Available download formats: json, xml, csv, excel
Dataset updated
Jun 6, 2019
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 1976 - Dec 31, 2025
Area covered
United States
Description
United States - Population Growth for World was 0.89981 % Chg. at Annual Rate in January of 2023, according to the United States Federal Reserve. Historically, United States - Population Growth for World reached a record high of 2.13312 in January of 1971 and a record low of 0.82796 in January of 2021. Trading Economics provides the current actual value, a historical data chart and related indicators for United States - Population Growth for World - last updated from the United States Federal Reserve in August of 2025.
Real-World Graph Matching Dataset
zenodo.org
zip
Updated Jul 4, 2025
Cite
Binrui Shen (2025). Real-World Graph Matching Dataset [Dataset]. http://doi.org/10.5281/zenodo.15803966
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.15803966
Dataset updated
Jul 4, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Binrui Shen
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
World
Description
Attributed graphs are constructed from a public dataset (http://www.robots.ox.ac.uk/~vgg/research/affine/), which contains eight sets of pictures covering five common picture transformations: viewpoint changes, scale changes, image blur, JPEG compression, and illumination.

Nodes and their corresponding features are extracted by SIFT.

@misc{shen2024csgo,
title={CSGO: Constrained-Softassign Gradient Optimization For Large Graph Matching},
author={Binrui Shen and Qiang Niu and Shengxin Zhu},
year={2024},
eprint={2208.08233},
archivePrefix={arXiv},
primaryClass={math.CO},
url={https://arxiv.org/abs/2208.08233},
}
USA Percent of world population - data, chart | TheGlobalEconomy.com
theglobaleconomy.com
csv, excel, xml
Updated May 18, 2016
Cite
Globalen LLC (2016). USA Percent of world population - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/USA/population_share/
Available download formats: xml, csv, excel
Dataset updated
May 18, 2016
Dataset authored and provided by
Globalen LLC
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1960 - Dec 31, 2023
Area covered
United States
Description
The USA: Percent of world population: The latest value from 2023 is 4.2 percent, a decline from 4.21 percent in 2022. In comparison, the world average is 0.51 percent, based on data from 196 countries. Historically, the average for the USA from 1960 to 2023 is 4.93 percent. The minimum value, 4.2 percent, was reached in 2023 while the maximum of 6.04 percent was recorded in 1961.
World Coronavirus COVID-19 Deaths
tradingeconomics.com
csv, excel, json, xml
Updated Mar 9, 2020
Cite
TRADING ECONOMICS (2020). World Coronavirus COVID-19 Deaths [Dataset]. https://tradingeconomics.com/world/coronavirus-deaths
Available download formats: excel, csv, xml, json
Dataset updated
Mar 9, 2020
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 4, 2020 - May 17, 2023
Area covered
World
Description
The World Health Organization reported 6,932,591 Coronavirus Deaths since the epidemic began. In addition, countries reported 766,440,796 Coronavirus Cases. This dataset provides actual values, historical data, forecast, chart, statistics, economic calendar and news for World Coronavirus Deaths.
MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets
ieee-dataport.org
Updated Jan 26, 2025
Cite
Mohsen Koohi (2025). MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets [Dataset]. https://ieee-dataport.org/open-access/ms-biographs-trillion-scale-sequence-similarity-graph-datasets
Dataset updated
Jan 26, 2025
Authors
Mohsen Koohi
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
MS-BioGraphs are a family of sequence similarity graph datasets with up to 2.5 trillion edges. The graphs have weighted edges and are presented in compressed WebGraph format. The dataset includes symmetric and asymmetric graphs. The largest graph was created by matching sequences in the Metaclust dataset with 1.7 billion sequences. These real-world graph datasets are useful for measuring contributions in High-Performance Computing and High-Performance Graph Processing.
Precision, recall and F1-measure for TopNeighbors (TN), BestNeighbors (BN),...
plos.figshare.com
xls
Updated Jun 3, 2023
Cite
Rania Ibrahim; David F. Gleich (2023). Precision, recall and F1-measure for TopNeighbors (TN), BestNeighbors (BN), Hyperlocal (HL) and HG-CRD on mathoverflow-answers hypergraph and five randomly chosen classes from stackoverflow-answers hypergraph dataset (which are relative-time-span, type-conversion, binary-data, zos and mainframe). [Dataset]. http://doi.org/10.1371/journal.pone.0243485.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0243485.t006
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Rania Ibrahim; David F. Gleich
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Precision, recall and F1-measure for TopNeighbors (TN), BestNeighbors (BN), Hyperlocal (HL) and HG-CRD on mathoverflow-answers hypergraph and five randomly chosen classes from stackoverflow-answers hypergraph dataset (which are relative-time-span, type-conversion, binary-data, zos and mainframe).
China Percent of world population - data, chart | TheGlobalEconomy.com
theglobaleconomy.com
csv, excel, xml
Updated Apr 8, 2016
Cite
Globalen LLC (2016). China Percent of world population - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/China/population_share/
Available download formats: csv, excel, xml
Dataset updated
Apr 8, 2016
Dataset authored and provided by
Globalen LLC
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1960 - Dec 31, 2023
Area covered
China
Description
China: Percent of world population: The latest value from 2023 is 17.6 percent, a decline from 17.78 percent in 2022. In comparison, the world average is 0.51 percent, based on data from 196 countries. Historically, the average for China from 1960 to 2023 is 20.86 percent. The minimum value, 17.6 percent, was reached in 2023 while the maximum of 22.76 percent was recorded in 1974.
Description of the real-world dataset.
plos.figshare.com
xls
Updated Jun 27, 2023
Cite
Fadi K. Dib; Peter Rodgers (2023). Description of the real-world dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0287744.t010
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0287744.t010
Dataset updated
Jun 27, 2023
Dataset provided by
PLOS ONE
Authors
Fadi K. Dib; Peter Rodgers
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Graph drawing, involving the automatic layout of graphs, is vital for clear data visualization and interpretation but poses challenges due to the optimization of a multi-metric objective function, an area where current search-based methods seek improvement. In this paper, we investigate the performance of Jaya algorithm for automatic graph layout with straight lines. Jaya algorithm has not been previously used in the field of graph drawing. Unlike most population-based methods, Jaya algorithm is a parameter-less algorithm in that it requires no algorithm-specific control parameters and only population size and number of iterations need to be specified, which makes it easy for researchers to apply in the field. To improve Jaya algorithm’s performance, we applied Latin Hypercube Sampling to initialize the population of individuals so that they widely cover the search space. We developed a visualization tool that simplifies the integration of search methods, allowing for easy performance testing of algorithms on graphs with weighted aesthetic metrics. We benchmarked the Jaya algorithm and its enhanced version against Hill Climbing and Simulated Annealing, commonly used graph-drawing search algorithms which have a limited number of parameters, to demonstrate Jaya algorithm’s effectiveness in the field. We conducted experiments on synthetic datasets with varying numbers of nodes and edges using the Erdős–Rényi model and real-world graph datasets and evaluated the quality of the generated layouts, and the performance of the methods based on number of function evaluations. We also conducted a scalability experiment on Jaya algorithm to evaluate its ability to handle large-scale graphs. Our results showed that Jaya algorithm significantly outperforms Hill Climbing and Simulated Annealing in terms of the quality of the generated graph layouts and the speed at which the layouts were produced. 
Using improved population sampling generated better layouts compared to the original Jaya algorithm using the same number of function evaluations. Moreover, Jaya algorithm was able to draw layouts for graphs with 500 nodes in a reasonable time.
United States - Rest of the World; Currency; Asset, Transactions
tradingeconomics.com
csv, excel, json, xml
Updated Mar 5, 2020
Cite
TRADING ECONOMICS (2020). United States - Rest of the World; Currency; Asset, Transactions [Dataset]. https://tradingeconomics.com/united-states/rest-of-the-world-currency-asset-flow-mil-of-dollar-fed-data.html
Available download formats: excel, json, xml, csv
Dataset updated
Mar 5, 2020
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 1976 - Dec 31, 2025
Area covered
United States
Description
United States - Rest of the World; Currency; Asset, Transactions was 18408.00000 Mil. of $ in January of 2025, according to the United States Federal Reserve. Historically, United States - Rest of the World; Currency; Asset, Transactions reached a record high of 147444.00000 in July of 2020 and a record low of -38752.00000 in July of 2023. Trading Economics provides the current actual value, a historical data chart and related indicators for United States - Rest of the World; Currency; Asset, Transactions - last updated from the United States Federal Reserve in July of 2025.
Mammographs by country, around the world | TheGlobalEconomy.com
theglobaleconomy.com
csv, excel, xml
Updated Oct 11, 2023
Cite
Globalen LLC (2023). Mammographs by country, around the world | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/rankings/mammographs_per_million_people/
Available download formats: csv, xml, excel
Dataset updated
Oct 11, 2023
Dataset authored and provided by
Globalen LLC
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1980 - Dec 31, 2021
Area covered
World
Description
The average for 2019 based on 26 countries was 23.59 mammographs per million people. The highest value was in Greece: 66.78 mammographs per million people and the lowest value was in Poland: 10.11 mammographs per million people. The indicator is available from 1980 to 2021. Below is a chart for all countries where data are available.
Comparison between HG-CRD and MAPPR using undirected and directed graphs.
figshare.com
xls
Updated Jun 5, 2023
Cite
Rania Ibrahim; David F. Gleich (2023). Comparison between HG-CRD and MAPPR using undirected and directed graphs. [Dataset]. http://doi.org/10.1371/journal.pone.0243485.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0243485.t004
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Rania Ibrahim; David F. Gleich
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison between HG-CRD and MAPPR using undirected and directed graphs.
GEOGRAPHY TOOLKIT - 'TALLS' CHARTS AND GRAPHS HELPERS
sdgs.amerigeoss.org
library.ncge.org
Updated Jul 27, 2021
Cite
NCGE (2021). GEOGRAPHY TOOLKIT - 'TALLS' CHARTS AND GRAPHS HELPERS [Dataset]. https://sdgs.amerigeoss.org/documents/53be5b07485744138802846eb0d90173
Dataset updated
Jul 27, 2021
Dataset authored and provided by
NCGE
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Author: ANN WURST, educator, NGS TEACHER CONSULTANT
Grade/Audience: grade 1, grade 2, grade 3, grade 4, grade 5, grade 6, grade 7, grade 8, high school, ap human geography, post secondary, professional development
Resource type: warm_up
Subject topic(s): geographic thinking
Region: world
Standards: (19) Social studies skills. The student applies critical-thinking skills to organize and use information acquired through established research methodologies from a variety of valid sources, including technology. The student is expected to:

(A) analyze information by sequencing, categorizing, identifying cause-and-effect relationships, comparing, contrasting, finding the main idea, summarizing, making generalizations and predictions, and drawing inferences and conclusions;

(D) analyze and evaluate the validity of information, arguments, and counterarguments from primary and secondary sources for bias, propaganda, point of view, and frame of reference;

(E) evaluate government data using charts, tables, graphs, and maps.

Objectives: Students will keep a list of the toolkit 'helpers' in their notebook and use the elements to process/apply information in various formats such as short answers responses, tickets out the door, setting up writing samples for World Cultures, World Geo, AP Human Geography and other courses involving the study of geographic concepts.

Summary: Students can use these 'hooks' in their study of geography, can be applied in every unit where geography is studied. Helps further critical thinking skills. These specific helpers are for reading charts and graphs.
Data from: Lifelong Learning of Graph Neural Networks for Open-World Node...
zenodo.org
explore.openaire.eu
zip
Updated Sep 29, 2021
Cite
Lukas Galke; Benedikt Franke; Tobias Zielke; Ansgar Scherp (2021). Lifelong Learning of Graph Neural Networks for Open-World Node Classification [Dataset]. http://doi.org/10.5281/zenodo.3764770
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3764770
Dataset updated
Sep 29, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lukas Galke; Benedikt Franke; Tobias Zielke; Ansgar Scherp
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Three temporal graph datasets for node classification under distribution shift.

DBLP-Easy and DBLP-Hard are citation graph datasets. PharmaBio is a collaboration graph dataset.

Vertices are scientific publications, edges are either citations (DBLP) or at-least-one-common-author relationships (PharmaBio).

The task is to classify the vertices of the graph into the respective conference/journal venues (DBLP) or journal categories (PharmaBio). In the DBLP datasets, new classes may appear over time.

Each dataset follows the structure:

- adjlist.txt -- the graph structure encoded as adjacency lists: in each row, the first entry is the source vertex, the remaining entries are adjacent vertices

- X.npy -- numpy serialized format for node features indexed by node id corresponding to adjlist.txt

- y.npy -- numpy serialized format for node labels indexed by node id corresponding to adjlist.txt

- t.npy -- numpy serialized format for time steps indexed by node id corresponding to adjlist.txt
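
The file layout above could be read along the following lines. This is a minimal sketch, not the authors' loader: it assumes that the entries in adjlist.txt are whitespace-separated integer node ids, and the helper names `parse_adjlist` and `load_dataset` are hypothetical.

```python
from pathlib import Path


def parse_adjlist(lines):
    """Parse adjacency-list rows: first entry is the source vertex, the rest are neighbors.

    Assumes whitespace-separated integer ids (an assumption, not stated in the description).
    """
    adjacency = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank rows
        source, *neighbors = (int(field) for field in fields)
        adjacency[source] = neighbors
    return adjacency


def load_dataset(directory):
    """Load the graph plus the feature, label, and time-step arrays from one dataset folder."""
    import numpy as np  # imported here so parse_adjlist stays usable without NumPy

    directory = Path(directory)
    with open(directory / "adjlist.txt") as handle:
        adjacency = parse_adjlist(handle)
    features = np.load(directory / "X.npy")   # node features, indexed by node id
    labels = np.load(directory / "y.npy")     # node labels, indexed by node id
    timesteps = np.load(directory / "t.npy")  # time steps, indexed by node id
    return adjacency, features, labels, timesteps


print(parse_adjlist(["0 1 2", "1 0", "2"]))
```

Calling `load_dataset` on an unpacked dataset folder would return the four pieces together, so that row `i` of each array can be matched with vertex `i` of the adjacency structure.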

A paper describing our incremental training and evaluation framework is published in IJCNN 2021 (Pre-print on arXiv: https://arxiv.org/abs/2006.14422).

If you use these datasets in your research, please cite the corresponding paper:

@inproceedings{galke2021lifelong,
author={Galke, Lukas and Franke, Benedikt and Zielke, Tobias and Scherp, Ansgar},
booktitle={2021 International Joint Conference on Neural Networks (IJCNN)},
title={Lifelong Learning of Graph Neural Networks for Open-World Node Classification},
year={2021},
volume={},
number={},
pages={1-8},
doi={10.1109/IJCNN52387.2021.9533412}
}
Infant Mortality Rate for the Arab World
fred.stlouisfed.org
json
Updated Apr 16, 2025
Cite
(2025). Infant Mortality Rate for the Arab World [Dataset]. https://fred.stlouisfed.org/series/SPDYNIMRTINARB
Available download formats: json
Dataset updated
Apr 16, 2025
License
https://fred.stlouisfed.org/legal/#copyright-public-domain
Area covered
Arab world
Description
Graph and download economic data for Infant Mortality Rate for the Arab World (SPDYNIMRTINARB) from 1990 to 2023 about Arab World, mortality, infant, and rate.
Comparison between higher order CRD (HG-CRD) and motif-based approximate...
plos.figshare.com
xls
Updated Jun 4, 2023
Cite
Rania Ibrahim; David F. Gleich (2023). Comparison between higher order CRD (HG-CRD) and motif-based approximate personalized pageRank (MAPPR) on directed Email-EU graph. [Dataset]. http://doi.org/10.1371/journal.pone.0243485.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0243485.t005
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS ONE
Authors
Rania Ibrahim; David F. Gleich
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison between higher order CRD (HG-CRD) and motif-based approximate personalized pageRank (MAPPR) on directed Email-EU graph.
Johns Hopkins COVID-19 Case Tracker
data.world
csv, zip
Updated Aug 20, 2025
Cite
The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
Available download formats: zip, csv
Dataset updated
Aug 20, 2025
Authors
The Associated Press
Time period covered
Jan 22, 2020 - Mar 9, 2023
Description
Updates

Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

CDC Weekly case and death counts (national and state level)

CDC County level cases and deaths

HHS New hospital admissions

CDC NowCast COVID variant proportions (national and regional level)

April 9, 2020

The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.

April 20, 2020

Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.

April 29, 2020

The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.

September 1st, 2020

Johns Hopkins is now providing counts for the five New York City counties individually.

February 12, 2021

The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."

Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.

February 16, 2021

Johns Hopkins has reconciled Ohio's historical deaths data with the state.

Overview

The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

The AP is updating this dataset hourly at 45 minutes past the hour.

To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

Queries

Use AP's queries to filter the data or to join it to other datasets we've made available to help cover the coronavirus pandemic.

Filter cases by state here

Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac

Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true

Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.

Pull the 100 counties with the highest per-capita confirmed cases here

Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
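The hotspot queries above rely on a 7-day rolling average of new cases per capita. The following is a minimal sketch of that calculation in plain Python, assuming a list of daily new-case counts and a single population figure. The function names are hypothetical and this is not the SQL behind the linked queries.

```python
from collections import deque


def rolling_average(values, window=7):
    """Trailing mean over `window` values; None until the window is full."""
    out = []
    buf = deque(maxlen=window)
    total = 0
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # value about to be evicted by append()
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


def rolling_rate_per_100k(new_cases, population, window=7):
    """Rolling average of new cases, expressed per 100,000 residents."""
    return [
        None if avg is None else avg * 100_000 / population
        for avg in rolling_average(new_cases, window)
    ]


# Seven days at 7 new cases, then one day at 14, in an area of 100,000 people.
print(rolling_rate_per_100k([7] * 7 + [14], 100_000)[-2:])  # [7.0, 8.0]
```

Because the result is divided by population, a handful of cases in a very small county can produce a high rate, which is why the raw counts should be read alongside it.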

Interactive

The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

@(https://datawrapper.dwcdn.net/nRyaf/15/)

Interactive Embed Code

<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>

Caveats

This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.

In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.

In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".

This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.

Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.

The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

Johns Hopkins timeseries data:

- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
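To illustrate the inconsistency described above: when daily counts are derived from cumulative snapshots, a downward revision by the source shows up as a negative daily value. The sketch below uses hypothetical function names and made-up numbers, and only assumes a list of cumulative counts in date order.

```python
def daily_changes(cumulative):
    """Day-over-day differences of a cumulative series.

    The result has one fewer element than the input, because the
    first day has no previous value to compare against.
    """
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def downward_revisions(cumulative):
    """Indices in the cumulative series where the total dropped."""
    return [i + 1 for i, d in enumerate(daily_changes(cumulative)) if d < 0]


# Made-up snapshots: the total falls from 120 to 115 on the third day.
totals = [100, 120, 115, 140]
print(daily_changes(totals))       # [20, -5, 25]
print(downward_revisions(totals))  # [2]
```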

Attribution

This data should be credited to the Johns Hopkins University COVID-19 tracking project.

Top 20 countries in the World Giving Index 2019
statista.com
Updated Jul 11, 2025
Statista (2025). Top 20 countries in the World Giving Index 2019 [Dataset]. https://www.statista.com/statistics/283351/top-20-countries-world-giving-index/
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
This graph shows the top 20 countries as ranked by the World Giving Index in 2019. In that year, the United States was first with an index score of ** percent.

The 2019 score is the ten-year average from 2009 to 2018.

Noé Cécillon; Vincent Labatut; Richard Dufour; Nejat Arınık (2025). Real-World Signed Graphs Annotated for Whole Graph Classification [Dataset]. http://doi.org/10.5281/zenodo.13851362

Real-World Signed Graphs Annotated for Whole Graph Classification

Available download formats: zip

Unique identifier

https://doi.org/10.5281/zenodo.13851362

Dataset updated

Jan 7, 2025

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Noé Cécillon; Vincent Labatut; Richard Dufour; Nejat Arınık

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Warning: the ground truth is missing in some of these datasets. This was fixed in version 1.0.1, which you should use instead.

Description: this corpus was designed as an experimental benchmark for a task of signed graph classification. It is composed of three datasets derived from external sources and adapted to our needs:

SpaceOrigin Conversations [1]: set of conversational graphs, each one associated with either a situation of verbal abuse or a normal situation. These conversations model interactions happening in chatrooms hosted by an MMORPG. The graphs were originally unsigned: we attributed signs to the edges based on the polarity of the exchanged messages.
Correlation Clustering Instances [2]: set of graphs generated randomly as instances of the Correlation Clustering problem, which consists in partitioning signed graphs. These graphs are not associated with any class in the original paper. We proposed a class based on certain features of the space of optimal solutions explored in [2].
European Parliament Roll-Calls [3]: vote networks extracted from the activity of French Members of the European Parliament. The original data does not have any class associated with the networks: we proposed one based on the number of political factions identified in each network in [3].

These data were used in [4] to train and assess various representation learning methods. The authors proposed Signed Graph2vec, a signed variant of Graph2vec, and WSGCN, a whole-graph variant of Signed Graph Convolutional Networks (SGCN), and used an aggregated version of Signed Network Embeddings (SiNE) as a baseline. The article provides more information on the properties of the datasets and how they were constituted.

Software: the software used to train the representation learning methods and classifiers is publicly available online: SWGE.

References:

[1] Papegnies, É.; Labatut, V.; Dufour, R. & Linarès, G. Conversational Networks for Automatic Online Moderation. IEEE Transactions on Computational Social Systems, 2019, 6:38-55. DOI: 10.1109/TCSS.2018.2887240 ⟨hal-01999546⟩
[2] Arınık, N.; Figueiredo, R. & Labatut, V. Multiplicity and Diversity: Analyzing the Optimal Solution Space of the Correlation Clustering Problem on Complete Signed Graphs. Journal of Complex Networks, 2020, 8(6):cnaa025. DOI: 10.1093/comnet/cnaa025 ⟨hal-02994011⟩
[3] Arınık, N.; Figueiredo, R. & Labatut, V. Multiple partitioning of multiplex signed networks: Application to European parliament votes. Social Networks, 2020, 60:83-102. DOI: 10.1016/j.socnet.2019.02.001 ⟨hal-02082574⟩
[4] Cécillon, N.; Labatut, V.; Dufour, R. & Arınık, N. Whole-Graph Representation Learning For the Classification of Signed Networks. IEEE Access, 2024, 12:151303-151316. DOI: 10.1109/ACCESS.2024.3472474 ⟨hal-04712854⟩

Funding: part of this work was funded by a grant from the Provence-Alpes-Côte-d'Azur region (PACA, France) and the Nectar de Code company.

Citation: If you use this data or the associated source code, please cite article [4]:

@Article{Cecillon2024,
author = {Cécillon, Noé and Labatut, Vincent and Dufour, Richard and Arınık, Nejat},
title = {Whole-Graph Representation Learning For the Classification of Signed Networks},
journal = {IEEE Access},
year = {2024},
volume = {12},
pages = {151303-151316},
doi = {10.1109/ACCESS.2024.3472474},
}

Real-World Signed Graphs Annotated for Whole Graph Classification

Code: Generating Graphs based on Real-World Port Data

Data from: Empowering Graph Neural Networks for Real-World Tasks

United States - Population Growth for World

Real-World Graph Matching Dataset

USA Percent of world population - data, chart | TheGlobalEconomy.com

World Coronavirus COVID-19 Deaths

MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets

Precision, recall and F1-measure for TopNeighbors (TN), BestNeighbors (BN),...

China Percent of world population - data, chart | TheGlobalEconomy.com

Description of the real-world dataset.

United States - Rest of the World; Currency; Asset, Transactions

Mammographs by country, around the world | TheGlobalEconomy.com

Comparison between HG-CRD and MAPPR using undirected and directed graphs.

GEOGRAPHY TOOLKIT - 'TALLS' CHARTS AND GRAPHS HELPERS

Data from: Lifelong Learning of Graph Neural Networks for Open-World Node...

Infant Mortality Rate for the Arab World

Comparison between higher order CRD (HG-CRD) and motif-based approximate...

Johns Hopkins COVID-19 Case Tracker
