85 datasets found

Maine Beach Profiling Graph Data Table
maine.hub.arcgis.com
mgs-maine.opendata.arcgis.com
Updated Apr 3, 2023
Cite
State of Maine (2023). Maine Beach Profiling Graph Data Table [Dataset]. https://maine.hub.arcgis.com/maps/maine-beach-profiling-graph-data-table
Explore at:
Dataset updated
Apr 3, 2023
Dataset authored and provided by
State of Maine
Area covered
Pacific Ocean, South Pacific Ocean
Description
All data approved by the beach profiling administrator is included in this table. The data is formatted for production of the beach profile graphs.

Sample Graph Datasets in CSV Format

zenodo.org

csv

Updated Dec 9, 2024

Cite

Edwin Carreño (2024). Sample Graph Datasets in CSV Format [Dataset]. http://doi.org/10.5281/zenodo.14335015

Explore at:

Available download formats: csv

Unique identifier

https://doi.org/10.5281/zenodo.14335015

Dataset updated

Dec 9, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Edwin Carreño

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Sample Graph Datasets in CSV Format

Note: none of the datasets published here contain actual data; they are for testing purposes only.

Description

This data repository contains graph datasets, where each graph is represented by two CSV files: one for node information and another for edge details. To link the files to the same graph, their names include a common identifier based on the number of nodes. For example:

dataset_30_nodes_interactions.csv: contains 30 rows (nodes).
dataset_30_edges_interactions.csv: contains 47 rows (edges).
The common identifier dataset_30 refers to the same graph.

CSV nodes

Each dataset contains the following columns:

Name of the Column	Type	Description
UniProt ID	string	protein identification
label	string	protein label (type of node)
properties	string	a dictionary containing properties related to the protein.

CSV edges

Each dataset contains the following columns:

Name of the Column	Type	Description
Relationship ID	string	relationship identification
Source ID	string	identification of the source protein in the relationship
Target ID	string	identification of the target protein in the relationship
label	string	relationship label (type of relationship)
properties	string	a dictionary containing properties related to the relationship.
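A minimal sketch of how a node/edge file pair with the columns above might be loaded and cross-checked. The sample rows, the comma delimiter, and the assumption that the properties column holds a Python-style dictionary literal are illustrative guesses, not details confirmed by the repository.

```python
import ast
import csv
import io

# Hypothetical in-memory stand-ins for one nodes/edges pair; the rows are made up.
NODES_CSV = '''UniProt ID,label,properties
P1,protein,"{'name': 'A'}"
P2,protein,"{'name': 'B'}"
'''
EDGES_CSV = '''Relationship ID,Source ID,Target ID,label,properties
R1,P1,P2,interacts,"{'score': 0.9}"
'''


def load_rows(text):
    """Read CSV text into dicts and parse the 'properties' string into a dict."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        # Assumption: the dictionary is serialized as a Python literal.
        row["properties"] = ast.literal_eval(row["properties"])
    return rows


nodes = load_rows(NODES_CSV)
edges = load_rows(EDGES_CSV)

# Every edge endpoint should refer to a node listed in the nodes file.
node_ids = {node["UniProt ID"] for node in nodes}
dangling = [
    edge for edge in edges
    if edge["Source ID"] not in node_ids or edge["Target ID"] not in node_ids
]
print(len(nodes), len(edges), len(dangling))
```

If the real files serialize properties as JSON instead, json.loads would be the natural replacement for ast.literal_eval.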

Metadata

Graph	Number of Nodes	Number of Edges	Sparse graph
dataset_30*	30	47	Y
dataset_60*	60	181	Y
dataset_120*	120	689	Y
dataset_240*	240	2819	Y
dataset_300*	300	4658	Y
dataset_600*	600	18004	Y
dataset_1200*	1200	71785	Y
dataset_2400*	2400	288600	Y
dataset_3000*	3000	449727	Y
dataset_6000*	6000	1799413	Y
dataset_12000*	12000	7199863	Y
dataset_24000*	24000	28792361	Y
dataset_30000*	30000	44991744	Y

This repository includes two (2) additional tiny graph datasets for experimenting before dealing with the larger datasets.

CSV nodes (tiny graphs)

Each dataset contains the following columns:

Name of the Column	Type	Description
ID	string	node identification
label	string	node label (type of node)
properties	string	a dictionary containing properties related to the node.

CSV edges (tiny graphs)

Each dataset contains the following columns:

Name of the Column	Type	Description
ID	string	relationship identification
source	string	identification of the source node in the relationship
target	string	identification of the target node in the relationship
label	string	relationship label (type of relationship)
properties	string	a dictionary containing properties related to the relationship.

Metadata (tiny graphs)

Graph	Number of Nodes	Number of Edges	Sparse graph
dataset_dummy*	3	6	N
dataset_dummy2*	3	6	N
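The "Sparse graph" column can be related to edge density. The sketch below assumes the graphs are directed and contain no self-loops or parallel edges, which the listing does not state; under that assumption a 3-node graph with 6 edges is complete, which would be consistent with the N flag on the tiny graphs.

```python
def directed_density(num_nodes, num_edges):
    """Fraction of possible ordered node pairs (no self-loops) that are edges."""
    max_edges = num_nodes * (num_nodes - 1)
    return num_edges / max_edges


# (nodes, edges) pairs copied from the metadata tables above.
graphs = {
    "dataset_30": (30, 47),
    "dataset_30000": (30000, 44991744),
    "dataset_dummy": (3, 6),
}

densities = {name: directed_density(n, m) for name, (n, m) in graphs.items()}
for name, density in densities.items():
    print(f"{name}: {density:.4f}")
```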

Graph Database Landscape Analysis by Graph Database Platform and Services...

futuremarketinsights.com

pdf

Updated Apr 30, 2024

Cite

Graph Database Landscape Analysis by Graph Database Platform and Services from 2024 to 2034 [Dataset]. https://www.futuremarketinsights.com/reports/graph-database-market

Explore at:

Available download formats: pdf

Dataset updated

Apr 30, 2024

Dataset authored and provided by

Future Market Insights

License

https://www.futuremarketinsights.com/privacy-policy

Time period covered

2024 - 2034

Area covered

Worldwide

Description

The global graph database market is projected to grow at a CAGR of 19.4% through 2034. With growing data usage and rising data storage requirements, the market is expected to expand from US$ 3.17 billion in 2024 to US$ 18.68 billion in 2034. Technological advancements also support the industry's growth prospects.

Attributes	Key Insights
Estimated Industry Size in 2024	US$ 3.17 billion
Projected Industry Value in 2034	US$ 18.68 billion
Value-based CAGR from 2024 to 2034	19.4%

Growing Technology to Enlarge the Global Graph Database Market Size

Attributes	Values
Historical CAGR	16.5%
Valuation in 2019	US$ 1.46 billion
Valuation in 2023	US$ 2.69 billion
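The growth rates in the two tables above follow from the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. A short check using the reported valuations; the function name is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two valuations."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Forecast period: US$ 3.17 billion (2024) to US$ 18.68 billion (2034).
forecast = cagr(3.17, 18.68, 10)
# Historical period: US$ 1.46 billion (2019) to US$ 2.69 billion (2023).
historical = cagr(1.46, 2.69, 4)

print(f"{forecast:.1%}")    # 19.4%
print(f"{historical:.1%}")  # 16.5%
```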

Country-wise Analysis

Countries	Forecasted CAGR
Germany	8.9%
Japan	9.2%
The United States of America	13.5%
China	19.9%
Australia	22.9%

Category-wise Insights

Category	Solution- Graph Database Platform
Industry Share in 2024	63.3%
Segment Drivers	Seamless integration with traditional systems increases their usability, enhancing the demand for these platforms. Excellent scalability of these platforms elevates the usability standards. The wider adaptability drives the segment’s popularity, fueling the global graph database market size.

Category	Application- Fraud & Risk Analytics
Industry Share in 2024	24.3%
Segment Drivers	Due to the ability of graph databases to deliver accurate risk forecasting, they save a lot of time and money. With the rising financial transactions, the demand for fraud detection and potential risk identification is rising. Therefore, the increasing popularity of the segment drives the global graph database market size.

KG20C Scholarly Knowledge Graph

kaggle.com

zip

Updated Nov 4, 2021

Cite

T H N (2021). KG20C Scholarly Knowledge Graph [Dataset]. https://www.kaggle.com/tranhungnghiep/kg20c-scholarly-knowledge-graph

Explore at:

Available download formats: zip (851624 bytes)

Dataset updated

Nov 4, 2021

Authors

T H N

License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically

Description

Context

This knowledge graph is constructed to aid research in scholarly data analysis. It can serve as a standard benchmark dataset for several tasks, including knowledge graph embedding, link prediction, recommendation systems, and question answering about high quality papers from 20 top computer science conferences.

This has been introduced and used in the PhD thesis Multi-Relational Embedding for Knowledge Graph Representation and Analysis and TPDL'19 paper Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space.

Content

Construction protocol

Scholarly data

From the Microsoft Academic Graph dataset, we extracted high quality computer science papers published in top conferences between 1990 and 2010. The top conference list is based on the CORE ranking A* conferences. The data was cleaned by removing conferences with fewer than 300 publications and papers with fewer than 20 citations. The final list includes 20 top conferences: AAAI, AAMAS, ACL, CHI, COLT, DCC, EC, FOCS, ICCV, ICDE, ICDM, ICML, ICSE, IJCAI, NIPS, SIGGRAPH, SIGIR, SIGMOD, UAI, and WWW.

Knowledge graph

The scholarly dataset was converted to a knowledge graph by defining the entities and the relations, and constructing the triples. The knowledge graph can be seen as a labeled multi-digraph between scholarly entities, where the edge labels express the relationships between the nodes. We use 5 intrinsic entity types: Paper, Author, Affiliation, Venue, and Domain. We also use 5 intrinsic relation types between the entities: author_in_affiliation, author_write_paper, paper_in_domain, paper_cite_paper, and paper_in_venue.

Benchmark data splitting

The knowledge graph was split uniformly at random into the training, validation, and test sets. We made sure that all entities and relations in the validation and test sets also appear in the training set so that their embeddings can be learned. We also made sure that there is no data leakage and no redundant triples in these splits, so they constitute a challenging benchmark for link prediction similar to WN18RR and FB15K-237.

Data content

File format

All files are in tab-separated-values format, compatible with other popular benchmark datasets including WN18RR and FB15K-237. For example, train.txt includes "28674CFA author_in_affiliation 075CFC38", which denotes that the author with id 28674CFA works in the affiliation with id 075CFC38. The repo includes these files:
- all_entity_info.txt contains id name type of all entities
- all_relation_info.txt contains id of all relations
- train.txt contains training triples of the form entity_1_id relation_id entity_2_id
- valid.txt contains validation triples
- test.txt contains test triples
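A minimal sketch of reading triples in this format, using the example triple quoted above as an in-memory sample. It assumes one tab-separated triple per line with no header row; the function name is illustrative.

```python
import csv
import io
from collections import Counter

# The example triple from train.txt, with tabs as field separators.
SAMPLE = "28674CFA\tauthor_in_affiliation\t075CFC38\n"


def read_triples(handle):
    """Return (entity_1_id, relation_id, entity_2_id) tuples from a TSV stream."""
    return [tuple(row) for row in csv.reader(handle, delimiter="\t") if row]


triples = read_triples(io.StringIO(SAMPLE))
relation_counts = Counter(relation for _, relation, _ in triples)
print(triples[0], dict(relation_counts))
```

For the real files, the StringIO object would be replaced by an opened train.txt, valid.txt, or test.txt.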

Statistics

Data statistics of the KG20C knowledge graph:

Author	Paper	Conference	Domain	Affiliation
8,680	5,047	20	1,923	692

Entities	Relations	Training triples	Validation triples	Test triples
16,362	5	48,213	3,670	3,724

Acknowledgements

For the dataset and semantic query method, please cite:
- Hung Nghiep Tran and Atsuhiro Takasu. Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space. In Proceedings of International Conference on Theory and Practice of Digital Libraries (TPDL), 2019.

For the MEI knowledge graph embedding model, please cite:
- Hung Nghiep Tran and Atsuhiro Takasu. Multi-Partition Embedding Interaction with Block Term Format for Knowledge Graph Completion. In Proceedings of the European Conference on Artificial Intelligence (ECAI), 2020.

For the baseline results and extended semantic query method, please cite:
- Hung Nghiep Tran. Multi-Relational Embedding for Knowledge Graph Representation and Analysis. PhD Dissertation, The Graduate University for Advanced Studies, SOKENDAI, Japan, 2020.

For the Microsoft Academic Graph dataset, please cite:
- Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the International Conference on World Wide Web (WWW), 2015.

Inspiration

We include the baseline results for two tasks on the KG20C dataset: link prediction and semantic queries. Link prediction is a relational query task: given a relation and the head or tail entity, predict the corresponding tail or head entities. Semantic queries are human-friendly queries on the scholarly data. MRR is the mean reciprocal rank, and Hit@k is the fraction of queries whose correct answer is ranked in the top k.
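As a rough illustration of the two metrics, the sketch below computes MRR and Hit@k from a list of 1-based ranks of the correct entity. The ranks are made-up values, not taken from the KG20C results, and the function names are illustrative.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hit_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct entity for four test queries.
ranks = [1, 2, 4, 10]

print(mean_reciprocal_rank(ranks))  # (1 + 0.5 + 0.25 + 0.1) / 4
print(hit_at_k(ranks, 1), hit_at_k(ranks, 3), hit_at_k(ranks, 10))
```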

For more information, please refer to the citations.

Link prediction results

We report results for 4 methods: Random, which is a random guess that shows the task difficulty; Word2vec, a popular embedding method; and SimplE/CP and MEI, two recent knowledge graph embedding methods.

All models are in small size settings, equivalent to total embedding size 100 (50x2 for Word2vec and SimplE/CP, 10x10 for MEI).

Models	MRR	Hit@1	Hit@3	Hit@10
Random	0.001	< 5e-4	< 5e-4	< 5e-4
Word2vec (small)	0.068	0.011	0.070	0.177
SimplE/CP (small)	0.215	0.148	0.234	0.348
MEI (small)	0.230	0.157	0.258	0.368

Semantic queries results

The following results demonstrate semantic queries on knowledge graph embedding space, using the above MEI (small) model.

Queries	MRR	Hit@1	Hit@3	Hit@10
Who may work at this organization?	0.299	0.221	0.342	0.440
Where may this author work at?	0.626	0.562	0.669	0.731
Who may write this paper?	0.247	0.164	0.283	0.405
What papers may this author write?	0.273	0.182	0.324	0.430
Which papers may cite this paper?	0.116	0.033	0.120	0.290
Which papers may this paper cite?	0.193	0.097	0.225	0.404
Which papers may belong to this domain?	0.052	0.025	0.049	0.100
Which may be the domains of this paper?	0.189	0.114	0.206	0.333
Which papers may publish in this conference?	0.148	0.084	0.168	0.257
Which conferences may this paper publish in?	0.693	0.542	0.810	0.976

Data from: The OREGANO knowledge graph for computational drug repurposing
zenodo.org
data.niaid.nih.gov
bin, tsv
Updated Nov 13, 2023
Cite
Marina Boudin; Fleur Mougin; Gayo Diallo; Martin Drancé (2023). The OREGANO knowledge graph for computational drug repurposing [Dataset]. http://doi.org/10.5281/zenodo.10103842
Explore at:
Available download formats: bin, tsv
Unique identifier
https://doi.org/10.5281/zenodo.10103842
Dataset updated
Nov 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marina Boudin; Fleur Mougin; Gayo Diallo; Martin Drancé
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Oct 16, 2023
Description
The files here are data files from the OREGANO project, which consists of building a holistic knowledge graph on drugs, including natural compounds. Here is the list of files:

- OREGANO_V2.tsv: The triplet file used for link prediction. 3 columns: Subject; Predicate; Object
- oreganov2.1_metadata_complet.ttl : The OREGANO knowledge graph in turtle format with the names and cross-references of the various integrated entities.

The following files contain the cross-references of OREGANO entities according to their type. They are all organised as follows: the external sources are the titles of the columns and each line begins with the identifier of the entity in OREGANO :
- TARGET.tsv: Cross-reference table of the 22,096 targets.
- PHENOTYPES.tsv: Cross-reference table of the 11,605 phenotypes.
- DISEASES.tsv: Cross-reference table of the 18,333 diseases.
- PATHWAYS.tsv: Cross-reference table of the 2,129 pathways.
- GENES.tsv: Cross-reference table of the 35,794 genes.
- COMPOUND.tsv: Cross-reference table of the 90,868 compounds.
- INDICATIONS.tsv: Cross-reference table of the 2,714 indications.
- SIDE_EFFECT.tsv: Cross-reference table of the 6,060 side-effects.
- ACTIVITY.tsv: Names of the 78 activities.
- EFFECT.tsv: Names of the 171 effects.
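A minimal sketch of reading one of the cross-reference tables described above, where the header row names the external sources and each following line starts with the OREGANO identifier. The sample identifiers, source names, and the treatment of empty cells as "no cross-reference" are assumptions for illustration only.

```python
import csv
import io

# Hypothetical excerpt in the described layout; IDs and source names are made up.
SAMPLE = (
    "ID\tSOURCE_A\tSOURCE_B\n"
    "COMPOUND:1\tA001\t\n"
    "COMPOUND:2\t\tB042\n"
)


def read_cross_refs(handle):
    """Map each OREGANO identifier to {external source: external id}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sources = header[1:]  # first column holds the OREGANO identifier
    table = {}
    for row in reader:
        if not row:
            continue
        entity_id, values = row[0], row[1:]
        # Assumption: an empty cell means no cross-reference in that source.
        table[entity_id] = {s: v for s, v in zip(sources, values) if v}
    return table


cross_refs = read_cross_refs(io.StringIO(SAMPLE))
print(cross_refs)
```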
The OREGANO knowledge graph is composed of 11 types of nodes and 19 types of links. The current version of the graph contains 88,937 nodes and 824,231 links.
A SPARQL endpoint has been provided to enable users to retrieve and explore the knowledge graph: OREGANO SPARQL endpoint.

The integration files and the knowledge graph are available in the Integration folder of the OREGANO project's GitHub repository.
Key generic technology prediction in patent citation using graph neural...
dataone.org
data.niaid.nih.gov
Updated Jun 5, 2024
Cite
M. L. Ding (2024). Key generic technology prediction in patent citation using graph neural networks [Dataset]. http://doi.org/10.5061/dryad.nk98sf803
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.nk98sf803
Dataset updated
Jun 5, 2024
Dataset provided by
Dryad Digital Repository
Authors
M. L. Ding
Time period covered
Jan 11, 2024
Description
With the rapid advancement of the Fourth Industrial Revolution, international competition in technology and industry is intensifying. However, in the era of big data and large-scale science, making accurate judgments about the key areas of technology and innovative trends has become exceptionally difficult. This paper constructs a patent indicator evaluation system based on the dimensions of key and generic patent citation, integrates graph neural network modeling to predict key common technologies, and confirms the effectiveness of the method using the field of genetic engineering as an example. According to the LDA topic model, the main technical R&D directions in genetic engineering are genetic analysis and detection technologies, the application of microorganisms in industrial production, virology research involving vaccine development and immune responses, high-throughput sequencing and analysis technologies in genomics, targeted drug design and molecular therapeutic strategies...

These datasets were obtained from the Incopat patent database for cited patents (2013-2022) in the field of genetic engineering. Details for the datasets are provided in the README file. This directory contains the selection of the patent datasets.

1) Table of key generic indicators for nodes (partial 1).csv
This file consists of 10 indicators of patents: technical coverage, patent families, patent family citation, patent cooperation, enterprise-enterprise cooperation, industry-university-research cooperation, claims, citation frequency, layout countries, and layout countries.

2) Table of key generic indicators for nodes (partial 2).csv
This file consists of 10 indicators of patents: technical convergence, cited countries, inventors, citations, homologous countries/areas, degree centrality, closeness centrality, betweenness centrality, eigenvector centrality, and PageRank.

3) patent.content
The content file contains descriptions of the patents in the following format:

This README file was generated on 2023-11-25 by Mingli Ding.

GENERAL INFORMATION

Author Information
Investigators Contact Information
Name: Mingli Ding; Wangke Yu; Shuhua Wang
Institution: Jingdezhen Ceramic University
Address: Jingdezhen, Jiangxi, China
Email: mlding1@163.com

Date of data collection: 2013-2022

DATA & FILE OVERVIEW

File List:

A) Table of key generic indicators for nodes (partial 1).csv

B) Table of key generic indicators for nodes (partial 2).csv

C) patent.content

D) patent.cites

E) Graph neural network modeling highest accuracy for different dimensions.csv

F) Prediction effects of key generic technologies.csv

DATA-SPECIFIC INFORMATION FOR: Table of key generic indicators for nodes (partial 1).csv

Number of variables: 10

Number of cases/rows: 72489

Variable List:

technical coverage: number ...
CDC's PRAMS Online Data for Epidemiological Research (CPONDER)
dataverse.harvard.edu
data.niaid.nih.gov
Updated Nov 30, 2010
Cite
Harvard Dataverse (2010). CDC's PRAMS Online Data for Epidemiological Research (CPONDER) [Dataset]. http://doi.org/10.7910/DVN/1JPCH8
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/1JPCH8
Dataset updated
Nov 30, 2010
Dataset provided by
Harvard Dataverse
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This interactive tool allows users to generate tables and graphs on information relating to pregnancy and childbirth. All data comes from the CDC's PRAMS. Topics include breastfeeding, prenatal care, insurance coverage, and alcohol use during pregnancy.

Background: CPONDER is the interactive online data tool for the Centers for Disease Control and Prevention (CDC)'s Pregnancy Risk Assessment Monitoring System (PRAMS). PRAMS gathers state and national level data on a variety of topics related to pregnancy and childbirth. Examples of information include breastfeeding, alcohol use, multivitamin use, prenatal care, and contraception.

User Functionality: Users select choices from three drop-down menus to search for data. The menus are state, year, and topic. Users can then select the specific question from PRAMS they are interested in, and the data table or graph will appear. Users can then compare that question to another state or to another year to generate a new data table or graph.

Data Notes: The data source for CPONDER is PRAMS. The data covers every year between 2000 and 2008 and is available at the state and national level. However, states must have participated in PRAMS to be part of CPONDER. Not every state, and not every year for every state, is available.
anatomical-systems (v1.1) graph data
purl.humanatlas.io
application/n-quads +4
Updated Dec 12, 2024
Cite
HRA Digital Object Processor (2024). anatomical-systems (v1.1) graph data [Dataset]. https://purl.humanatlas.io/asct-b/anatomical-systems/v1.1
Explore at:
Available download formats: ttl, jsonld, rdf, application/n-quads, application/n-triples
Dataset updated
Dec 12, 2024
Dataset authored and provided by
HRA Digital Object Processor
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The graph representation of the Anatomical Structures, Cell Types, plus Biomarkers (ASCT+B) table for Anatomical Systems dataset.

Data from: NeMig - A Bilingual News Collection and Knowledge Graph about...

zenodo.org

Updated May 9, 2023

Cite

Andreea Iana; Mehwish Alam; Alexander Grote; Katharina Ludwig; Philipp Müller; Christof Weinhardt; Heiko Paulheim (2023). NeMig - A Bilingual News Collection and Knowledge Graph about Migration [Dataset]. http://doi.org/10.5281/zenodo.7442425

Explore at:

Unique identifier

https://doi.org/10.5281/zenodo.7442425

Dataset updated

May 9, 2023

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Description

NeMig consists of two knowledge graphs, one German and one English, constructed from news articles on the topic of migration collected from online media outlets in Germany and the US, respectively. NeMig contains rich textual and metadata information, sub-topics and sentiment annotations, as well as named entities extracted from the articles' content and metadata and linked to Wikidata. The graphs are expanded with up to two-hop neighbors from Wikidata of the initial set of linked entities.

NeMig comes in four flavors, for both the German and the English corpora:

Base NeMig: contains literals and entities from the corresponding annotated news corpus;
Entities NeMig: derived from the Base NeMig by removing all literal nodes, it contains only resource nodes;
Enriched Entities NeMig: derived from the Entities NeMig by enriching it with up to two-hop neighbors from Wikidata, it contains only resource nodes and Wikidata triples;
Complete NeMig: the combination of the Base and Enriched Entities NeMig, it contains both literals and resources.

Information about uploaded files:

(all files are b-zipped and in the N-Triples format.)

File	Description
nemig_${language}_${graph_type}-metadata.nt.bz2	Metadata about the dataset, described using void vocabulary.
nemig_${language}_${graph_type}-instances_types.nt.bz2	Class definitions of news and event instances.
nemig_${language}_${graph_type}-instances_labels.nt.bz2	Labels of instances.
nemig_${language}_${graph_type}-instances_related.nt.bz2	Relations between news instances based on one another.
nemig_${language}_${graph_type}-instances_metadata_literals.nt.bz2	Relations between news instances and metadata literals (e.g. URL, publishing date, modification date, sentiment label, political orientation of news outlets).
nemig_${language}_${graph_type}-instances_content_mapping.nt.bz2	Mapping of news instances to content instances (e.g. title, abstract, body).
nemig_${language}_${graph_type}-instances_topic_mapping.nt.bz2	Mapping of news instances to sub-topic instances.
nemig_${language}_${graph_type}-instances_content_literals.nt.bz2	Relations between content instances and corresponding literals (e.g. text of title, abstract, body).
nemig_${language}_${graph_type}-instances_metadata_resources.nt.bz2	Relations between news or sub-topic instances and entities extracted from metadata (i.e. publishers, authors, keywords).
nemig_${language}_${graph_type}-instances_event_mapping.nt.bz2	Mapping of news instances to event instances.
nemig_${language}_${graph_type}-event_resources.nt.bz2	Relations between event instances and entities extracted from the text of the news (i.e. actors, places, mentions).
nemig_${language}_${graph_type}-resources_provenance.nt.bz2	Provenance information about the entities extracted from the text of the news (e.g. title, abstract, body).
nemig_${language}_${graph_type}-wiki_resources.nt.bz2	Relations between Wikidata entities from news and their k-hop entity neighbors from Wikidata.
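A minimal sketch of building a file name from the template above and streaming triples out of a bzip2-compressed N-Triples file. The values "en" and "base" for language and graph type are hypothetical placeholders, the template is assumed to contain no space, and the line splitter is deliberately naive (a full N-Triples parser handles more cases). The demo file is generated locally so the example is self-contained.

```python
import bz2
import os
import tempfile


def nemig_filename(language, graph_type, part):
    """Fill the nemig_${language}_${graph_type}-<part>.nt.bz2 template."""
    return f"nemig_{language}_{graph_type}-{part}.nt.bz2"


def iter_ntriples(path):
    """Yield (subject, predicate, object) strings from a .nt.bz2 file."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Naive split: subject and predicate contain no spaces in N-Triples.
            subject, predicate, rest = line.split(" ", 2)
            obj = rest.rstrip()
            if obj.endswith("."):
                obj = obj[:-1].rstrip()  # drop the terminating " ."
            yield subject, predicate, obj


# Write a one-triple demo file; "en" and "base" are assumed values.
name = nemig_filename("en", "base", "instances_labels")
path = os.path.join(tempfile.mkdtemp(), name)
with bz2.open(path, mode="wt", encoding="utf-8") as handle:
    handle.write(
        '<http://example.org/n1> '
        '<http://www.w3.org/2000/01/rdf-schema#label> "A title" .\n'
    )

triples = list(iter_ntriples(path))
print(name, triples)
```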

OGBN-Products (Processed for PyG)
kaggle.com
Updated Feb 27, 2021
Cite
Redao da Taupl (2021). OGBN-Products (Processed for PyG) [Dataset]. https://www.kaggle.com/datasets/dataup1/ogbn-products/data
Explore at:
Dataset updated
Feb 27, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Redao da Taupl
Description
OGBN-Products

Webpage: https://ogb.stanford.edu/docs/nodeprop/#ogbn-products

Usage in Python

import os.path as osp

import pandas as pd
import datatable as dt
import torch
import torch_geometric as pyg
from ogb.io.read_graph_raw import read_nodesplitidx_split_hetero
from ogb.nodeproppred import PygNodePropPredDataset


class PygOgbnProducts(PygNodePropPredDataset):
    def __init__(self, meta_csv=None):
        root, name, transform = '/kaggle/input', 'ogbn-products', None
        if meta_csv is None:
            meta_csv = osp.join(root, name, 'ogbn-master.csv')
        master = pd.read_csv(meta_csv, index_col=0)
        meta_dict = master[name]
        meta_dict['dir_path'] = osp.join(root, name)
        super().__init__(name=name, root=root, transform=transform, meta_dict=meta_dict)

    def get_idx_split(self, split_type=None):
        if split_type is None:
            split_type = self.meta_info['split']
        path = osp.join(self.root, 'split', split_type)
        # Use the cached split file when it exists.
        if osp.isfile(osp.join(path, 'split_dict.pt')):
            return torch.load(osp.join(path, 'split_dict.pt'))
        if self.is_hetero:
            train_idx_dict, valid_idx_dict, test_idx_dict = read_nodesplitidx_split_hetero(path)
            for nodetype in train_idx_dict.keys():
                train_idx_dict[nodetype] = torch.from_numpy(train_idx_dict[nodetype]).to(torch.long)
                valid_idx_dict[nodetype] = torch.from_numpy(valid_idx_dict[nodetype]).to(torch.long)
                test_idx_dict[nodetype] = torch.from_numpy(test_idx_dict[nodetype]).to(torch.long)
            return {'train': train_idx_dict, 'valid': valid_idx_dict, 'test': test_idx_dict}
        else:
            # Read each split's node indices from CSV with datatable for speed.
            train_idx = dt.fread(osp.join(path, 'train.csv'), header=None).to_numpy().T[0]
            train_idx = torch.from_numpy(train_idx).to(torch.long)
            valid_idx = dt.fread(osp.join(path, 'valid.csv'), header=None).to_numpy().T[0]
            valid_idx = torch.from_numpy(valid_idx).to(torch.long)
            test_idx = dt.fread(osp.join(path, 'test.csv'), header=None).to_numpy().T[0]
            test_idx = torch.from_numpy(test_idx).to(torch.long)
            return {'train': train_idx, 'valid': valid_idx, 'test': test_idx}


dataset = PygOgbnProducts()
split_idx = dataset.get_idx_split()
train_idx, valid_idx, test_idx = split_idx['train'], split_idx['valid'], split_idx['test']
graph = dataset[0]  # PyG Graph object

Description

Graph: The ogbn-products dataset is an undirected and unweighted graph, representing an Amazon product co-purchasing network [1]. Nodes represent products sold in Amazon, and edges between two products indicate that the products are purchased together. The authors follow [2] to process node features and target categories. Specifically, node features are generated by extracting bag-of-words features from the product descriptions followed by a Principal Component Analysis to reduce the dimension to 100.

Prediction task: The task is to predict the category of a product in a multi-class classification setup, where the 47 top-level categories are used for target labels.

Dataset splitting: The authors consider a more challenging and realistic dataset splitting that differs from the one used in [2]. Instead of randomly assigning 90% of the nodes for training and 10% of the nodes for testing (without use of a validation set), they use the sales ranking (popularity) to split nodes into training/validation/test sets. Specifically, the authors sort the products according to their sales ranking and use the top 8% for training, the next top 2% for validation, and the rest for testing. This is a more challenging splitting procedure that closely matches the real-world application where labels are first assigned to important nodes in the network and ML models are subsequently used to make predictions on less important ones.
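The splitting rule can be sketched as follows. This is an illustration of the 8% / 2% / rest scheme on a list of node IDs already sorted by sales rank, not the OGB implementation; the function name and the integer-percent rounding are assumptions.

```python
def split_by_sales_rank(node_ids_by_rank, train_pct=8, valid_pct=2):
    """Split node IDs sorted by sales rank (best-selling first).

    The top train_pct percent go to training, the next valid_pct percent
    to validation, and everything else to testing.
    """
    n = len(node_ids_by_rank)
    n_train = n * train_pct // 100
    n_valid = n * valid_pct // 100
    return {
        "train": node_ids_by_rank[:n_train],
        "valid": node_ids_by_rank[n_train:n_train + n_valid],
        "test": node_ids_by_rank[n_train + n_valid:],
    }


# Hypothetical example: 100 node IDs, already ordered by sales rank.
split = split_by_sales_rank(list(range(100)))
print({name: len(ids) for name, ids in split.items()})
```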

Note 1: A very small number of self-connecting edges are repeated (see here); you may remove them if necessary.

Note 2: For undirected graphs, the loaded graphs will have the doubled number of edges because the bidirectional edges will be added automatically.
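To illustrate both notes, the sketch below turns an undirected edge list into a bidirectional one and drops repeated entries along the way. It is a plain-Python illustration on a made-up edge list, not the loader's actual code.

```python
def to_bidirectional(edges):
    """Add the reverse of every edge and drop exact duplicates.

    Each undirected edge (u, v) with u != v ends up as two directed edges,
    which is why the loaded edge count is doubled; a self-loop (u, u) is
    kept only once.
    """
    seen = set()
    result = []
    for u, v in edges:
        for pair in ((u, v), (v, u)):
            if pair not in seen:
                seen.add(pair)
                result.append(pair)
    return result


# Hypothetical edge list with a repeated self-connecting edge (2, 2).
edges = [(0, 1), (1, 2), (2, 2), (2, 2)]
bidirectional = to_bidirectional(edges)
print(bidirectional)
```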

Summary

Package	#Nodes	#Edges	Split Type	Task Type	Metric
ogb>=1.1.1	2,449,029	61,859,140	Sales rank	Multi-class classification	Accuracy

Open Graph Benchmark

Website: https://ogb.stanford.edu

The Open Graph Benchmark (OGB) [3] is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner.

References

[1] http://manikvarma.org/downloads/XC/XMLRepository.html
[2] Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh. Cluster-GCN: An efficient algorithm for training deep and large graph convolutional networks. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 257–266, 2019.
[3] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. Open graph benchmark: Datasets for machine learning on graphs. Advances in Neural Information Processing Systems, pp. 22118–22133, 2020.

License: Amazon License

By accessing the Amazon Customer Reviews Library ("Reviews Library"), you agree that the Reviews Library is an Amazon Service subject to the Amazon.com Conditions of Use (https://www.amazon.com/gp/help/customer/display.html/ref=footer_cou?ie=UTF8&nodeId=508088) and you agree to be bound by them, with the following additional conditions: In addition to the license rights granted under the Conditions of Use, Amazon or its content providers grant you a limited, non-exclusive, non-transferable, non-sublicensable, revocable license to access and use the Reviews Library for purposes of academic research. You may not resell, republish, or make any commercial use of the Reviews Library or its contents, including use of the Reviews Library for commercial research, such as research related to a funding or consultancy contract, internship, or other relationship in which the results are provided for a fee or delivered to a for-profit organization. You may not (a) link or associate content in the Reviews Library with any personal information (including Amazon customer accounts), or (b) attempt to determine the identity of the author of any content in the Reviews Library. If you violate any of the foregoing conditions, your license to access and use the Reviews Library will automatically terminate without prejudice to any of the other rights or remedies Amazon may have.

Disclaimer

I am NOT the author of this dataset. It was downloaded from its official website. I assume no responsibility or liability for the content in this dataset. For any questions, problems, or issues, please contact the original authors at their website or their GitHub repo.
United States - Assets: Other: Other Assets, Consolidated Table: Wednesday...
tradingeconomics.com
csv, excel, json, xml
Updated Feb 3, 2020
Cite
TRADING ECONOMICS (2020). United States - Assets: Other: Other Assets, Consolidated Table: Wednesday Level [Dataset]. https://tradingeconomics.com/united-states/assets-other-assets-fed-data.html
Explore at:
Available download formats
xml, csv, json, excel
Dataset updated
Feb 3, 2020
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 1976 - Dec 31, 2025
Area covered
United States
Description
United States - Assets: Other: Other Assets, Consolidated Table: Wednesday Level was 35274.00000 Mil. of $ in March 2025, according to the United States Federal Reserve. Historically, United States - Assets: Other: Other Assets, Consolidated Table: Wednesday Level reached a record high of 50550.00000 in August 2023 and a record low of 5958.00000 in November 2006. Trading Economics provides the current actual value, a historical data chart and related indicators for United States - Assets: Other: Other Assets, Consolidated Table: Wednesday Level, last updated from the United States Federal Reserve in March 2025.
Life tables and graphs for Bahry (2022) - Equilibrium conditions in the...
data.niaid.nih.gov
zenodo.org
Updated Sep 12, 2022
Cite
Bahry, David (2022). Life tables and graphs for Bahry (2022) - Equilibrium conditions in the evolution of senescence [MSc thesis, Carleton University] [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7069069
Explore at:
Dataset updated
Sep 12, 2022
Dataset authored and provided by
Bahry, David
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Life table data, and derived quantities, for Equilibrium Conditions in the Evolution of Senescence (Bahry, 2022, MSc thesis); adapted from the supplementary data of (Jones et al., 2014). Life table data for human (Japan 2009), human (Aché hunter-gatherer), fruit fly, Soay sheep, freshwater hydra, and desert tortoise.

Basic life table quantities: age interval (\(X\)); survival function (\(l_X\)); and age-specific interval fecundity (\(m_X\)). Derived quantities include interval average force of mortality; reproductive value; residual reproductive value; Hamilton's indicators of the age-specific forces of selection; and actual age-specific mortality vs. predicted age-specific mortality based on models treated in (Bahry, 2022).

In the original life tables of Jones et al. (2014), desert tortoises negatively senesce over the range of observed ages, but have a final observed cut-off age of 74; this causes reproductive value to fall artifactually to 0 as age approaches the cutoff. To get around this, I also used an extrapolated desert tortoise life table, assuming the age-74 mortality and fecundity rates remain constant until age 1000, then using the extrapolated life table to calculate reproductive value (and Hamilton's indicators) up to the cutoff age 74.
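The truncation effect described above can be illustrated with a minimal sketch. It assumes Fisher's standard definition of reproductive value, v_x = (e^{rx} / l_x) Σ_{y≥x} e^{-ry} l_y m_y, unit-length age intervals indexed from 0, and a user-supplied growth rate `r`; the function names `reproductive_value` and `extrapolate` are hypothetical, and the dataset's own calculations may differ in detail.

```python
import math


def reproductive_value(l, m, r=0.0):
    """Fisher's reproductive value v_x for each age index x.

    l: survival function l_x (assumed indexed by age, unit intervals)
    m: age-specific fecundity m_x
    r: intrinsic rate of increase (assumption; 0.0 means a stationary population)
    """
    n = len(l)
    v = [0.0] * n
    tail = 0.0  # running sum of e^{-r*y} * l_y * m_y over ages y >= x
    for x in range(n - 1, -1, -1):
        tail += math.exp(-r * x) * l[x] * m[x]
        v[x] = math.exp(r * x) * tail / l[x] if l[x] > 0 else 0.0
    return v


def extrapolate(l, m, max_age):
    """Extend l and m up to max_age, holding the last observed
    per-interval survival ratio and the last fecundity constant."""
    l, m = list(l), list(m)
    p = l[-1] / l[-2]  # last observed per-interval survival probability
    while len(l) <= max_age:
        l.append(l[-1] * p)
        m.append(m[-1])
    return l, m


# Toy life table (not real data): the truncated table understates v at the cutoff.
l_obs, m_obs = [1.0, 0.5, 0.25], [0.0, 1.0, 1.0]
v_truncated = reproductive_value(l_obs, m_obs)
l_ext, m_ext = extrapolate(l_obs, m_obs, max_age=4)
v_extended = reproductive_value(l_ext, m_ext)[: len(l_obs)]
```

With the toy table, the value at the last observed age rises once later ages are allowed to contribute, which is the artifact the extrapolated tortoise table is meant to remove.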

References

Bahry, D. (2022). Equilibrium Conditions in the Evolution of Senescence [Master's thesis, Carleton University].

Jones, O. R. et al. (2014). Diversity of ageing across the tree of life. Nature 505: 169–174. https://doi.org/10.1038/nature12789
Data Visualization in Social Work Research
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Cite
Rothwell, David; Esposito, Tonino; Wegner-Lohin (2023). Data Visualization in Social Work Research [Dataset]. http://doi.org/10.7910/DVN/I6IIXL
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/I6IIXL
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Rothwell, David; Esposito, Tonino; Wegner-Lohin
Time period covered
Jan 1, 2009 - Jan 1, 2012
Description
Research dissemination and knowledge translation are imperative in social work. Methodological developments in data visualization techniques have improved the ability to convey meaning and reduce erroneous conclusions. The purpose of this project is to examine: (1) How are empirical results presented visually in social work research?; (2) To what extent do top social work journals vary in the publication of data visualization techniques?; (3) What is the predominant type of analysis presented in tables and graphs?; (4) How can current data visualization methods be improved to increase understanding of social work research?

Method: A database was built from a systematic literature review of the four most recent issues of Social Work Research and 6 other highly ranked journals in social work based on the 2009 5-year impact factor (Thomson Reuters ISI Web of Knowledge). Overall, 294 articles were reviewed. Articles without any form of data visualization were not included in the final database. The number of articles reviewed by journal is: Child Abuse & Neglect (38), Child Maltreatment (30), American Journal of Community Psychology (31), Family Relations (36), Social Work (29), Children and Youth Services Review (112), and Social Work Research (18). Articles with any type of data visualization (table, graph, other) were included in the database and coded sequentially by two reviewers based on the type of visualization method and type of analyses presented (descriptive, bivariate, measurement, estimate, predicted value, other). Additional review was required from the entire research team for 68 articles. Codes were discussed until 100% agreement was reached. The final database includes 824 data visualization entries.
Dataset of development of business during the COVID-19 crisis
data.mendeley.com
narcis.nl
Updated Nov 9, 2020
Cite
Tatiana N. Litvinova (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
Explore at:
Unique identifier
https://doi.org/10.17632/9vvrd34f8t.1
Dataset updated
Nov 9, 2020
Authors
Tatiana N. Litvinova
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
To create the dataset, countries were selected from the top 10 by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) that are also represented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. Arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators across all countries in the sample were then found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic.

The data is collected in a single Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and with newer statistics on the COVID-19 pandemic. Because the values in the dataset are formulas rather than fixed numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that visualize the data.

The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented as a normal distribution of predicted values together with the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: various predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values observed during and after the second wave of the pandemic, to check the reliability of the earlier forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of the pandemic and COVID-19 crisis for international entrepreneurship.
Station-B Biological Knowledge Graph Data
zenodo.org
data.niaid.nih.gov
zip
Updated Sep 27, 2021
Cite
Prashant Vaidyanathan; Boyan Yordanov; Paul K. Grant; Colin Gravill; Neil Dalchau (2021). Station-B Biological Knowledge Graph Data [Dataset]. http://doi.org/10.5281/zenodo.5245860
Explore at:
Available download formats
zip
Unique identifier
https://doi.org/10.5281/zenodo.5245860
Dataset updated
Sep 27, 2021
Dataset provided by
Zenodo http://zenodo.org/
Authors
Prashant Vaidyanathan; Boyan Yordanov; Paul K. Grant; Colin Gravill; Neil Dalchau
Description
This dataset contains all the experimental data and metadata collected as part of the Station-B project at Microsoft Research Cambridge. The data has been structured using the Biological Knowledge Graph Schema and was stored in Azure Tables and Azure Blobs. This data includes two files:

blobs.zip: This zipped file primarily contains blobs that stored raw and processed fluorescence data from the Microplate Reader at the Station-B wet lab. This zip also contains bundles compatible with the Synthace Platform to enable lab automation with Liquid handling robots.

tables.zip: This zipped file contains all the data and metadata associated with the Assembly and Characterization experiments conducted at Station-B. Each CSV in this zipped file represents data stored in an Azure Table. The columns in each CSV are based on the Biological Knowledge Graph Schema.
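As a first look at an archive of per-table CSVs such as `tables.zip`, one might list each CSV together with its column names. The following is a minimal sketch using only the Python standard library; the helper name `csv_headers`, the local file path, and the UTF-8 encoding are assumptions and are not documented by the dataset.

```python
import csv
import io
import zipfile


def csv_headers(zip_source):
    """Map each CSV member of a zip archive to its header row.

    zip_source: a path or a binary file-like object.
    Assumes the CSVs are UTF-8 encoded (a BOM, if present, is dropped).
    """
    headers = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip non-CSV members
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                # An empty file yields an empty header list.
                headers[name] = next(csv.reader(text), [])
    return headers
```

For example, `csv_headers("tables.zip")` (assuming the archive has been downloaded to the working directory) would return a dictionary keyed by CSV file name, which can then be compared against the schema's column definitions.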
United States Stock Market Index Data
tradingeconomics.com
ar.tradingeconomics.com
+15more
csv, excel, json, xml
+ more versions
Cite
TRADING ECONOMICS, United States Stock Market Index Data [Dataset]. https://tradingeconomics.com/united-states/stock-market
Explore at:
Available download formats
excel, xml, json, csv
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 3, 1928 - Mar 27, 2025
Area covered
United States
Description
The main stock market index in the United States (US500) decreased 176 points or 2.99% since the beginning of 2025, according to trading on a contract for difference (CFD) that tracks this benchmark index from the United States. United States Stock Market Index: values, historical data, forecasts and news, updated in March 2025.
skeleton (v1.0) graph data
lod.humanatlas.io
application/n-quads +4
Updated Dec 12, 2024
+ more versions
Cite
HRA Digital Object Processor (2024). skeleton (v1.0) graph data [Dataset]. https://lod.humanatlas.io/asct-b/skeleton/v1.0/
Explore at:
Available download formats
rdf, application/n-triples, application/n-quads, jsonld, ttl
Dataset updated
Dec 12, 2024
Dataset authored and provided by
HRA Digital Object Processor
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The graph representation of the Anatomical Structures, Cell Types, plus Biomarkers (ASCT+B) table for the Skeleton dataset.
United States - Commercial and Industrial Loans, Domestically Chartered...
tradingeconomics.com
csv, excel, json, xml
Updated Apr 28, 2020
+ more versions
Cite
TRADING ECONOMICS (2020). United States - Commercial and Industrial Loans, Domestically Chartered Commercial Banks [Dataset]. https://tradingeconomics.com/united-states/commercial-and-industrial-loans-domestically-chartered-commercial-banks-bil-of-u-s-dollar-sa-fed-data.html
Explore at:
Available download formats
excel, xml, csv, json
Dataset updated
Apr 28, 2020
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 1976 - Dec 31, 2025
Area covered
United States
Description
United States - Commercial and Industrial Loans, Domestically Chartered Commercial Banks was 2098.26530 Bil. of U.S. $ in March 2022, according to the United States Federal Reserve. Historically, United States - Commercial and Industrial Loans, Domestically Chartered Commercial Banks reached a record high of 2515.61760 in May 2020 and a record low of 128.86610 in January 1973. Trading Economics provides the current actual value, a historical data chart and related indicators for United States - Commercial and Industrial Loans, Domestically Chartered Commercial Banks, last updated from the United States Federal Reserve in March 2025.
Model predictions for heterogeneous stream-reservoir graph networks with...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
+ more versions
Cite
U.S. Geological Survey (2024). Model predictions for heterogeneous stream-reservoir graph networks with data assimilation [Dataset]. https://catalog.data.gov/dataset/model-predictions-for-heterogeneous-stream-reservoir-graph-networks-with-data-assimilation
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Description
This data release provides the predictions from stream temperature models described in Chen et al. (2021). Briefly, various deep learning and process-guided deep learning models were built to test improved performance of stream temperature predictions below reservoirs in the Delaware River Basin. The spatial extent of predictions was restricted to streams above the Delaware River at Lordville, NY, and includes the West Branch of the Delaware River below Cannonsville Reservoir and the East Branch of the Delaware River below Pepacton Reservoir. Various model architectures, training schemes, and data assimilation methods were used to generate the table and figures in Chen et al. (2021), and the predictions of each model are captured in this release. For each model, there are test period predictions for 56 river reaches from 2006-10-01 through 2020-09-30. Model input and validation data can be found in Oliver et al. (2021).
The publication associated with this data release is Chen, S., Appling, A.P., Oliver, S.K., Corson-Dosch, H.R., Read, J.S., Sadler, J.M., Zwart, J.A., Jia, X., 2021, Heterogeneous stream-reservoir graph networks with data assimilation. International Conference on Data Mining (ICDM). DOI: https://doi.org/10.1109/ICDM51629.2021.00117.
NetVotes ENIC Dataset
zenodo.org
txt, zip
Updated Oct 1, 2024
Cite
Israel Mendonça; Vincent Labatut; Rosa Figueiredo (2024). NetVotes ENIC Dataset [Dataset]. http://doi.org/10.5281/zenodo.6815510
Explore at:
Available download formats
zip, txt
Unique identifier
https://doi.org/10.5281/zenodo.6815510
Dataset updated
Oct 1, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Israel Mendonça; Vincent Labatut; Rosa Figueiredo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description. The NetVote dataset contains the outputs of the NetVote program when applied to voting data coming from VoteWatch (http://www.votewatch.eu/).

These results were used in the following conference papers:

[1] I. Mendonça, R. Figueiredo, V. Labatut, and P. Michelon, “Relevance of Negative Links in Graph Partitioning: A Case Study Using Votes From the European Parliament,” in 2nd European Network Intelligence Conference, 2015, pp. 122–129. ⟨hal-01176090⟩ DOI: 10.1109/ENIC.2015.25

[2] I. Mendonça, R. Figueiredo, V. Labatut, and P. Michelon, “Informative Value of Negative Links for Graph Partitioning, with an application to European Parliament Votes,” in 6ème Conférence sur les modèles et l'analyse de réseaux : approches mathématiques et informatiques, 2015, p. 12p. ⟨hal-02055158⟩

Source code. The NetVote source code is available on GitHub: https://github.com/CompNet/NetVotes.

Citation. If you use our dataset or tool, please cite article [1] above.

@InProceedings{Mendonca2015,
author = {Mendonça, Israel and Figueiredo, Rosa and Labatut, Vincent and Michelon, Philippe},

title = {Relevance of Negative Links in Graph Partitioning: A Case Study Using Votes From the {E}uropean {P}arliament},
booktitle = {2\textsuperscript{nd} European Network Intelligence Conference ({ENIC})},
year = {2015},
pages = {122-129},
address = {Karlskrona, SE},
publisher = {IEEE Publishing},
doi = {10.1109/ENIC.2015.25},
}

-------------------------

Details. This archive contains the following folders:

`votewatch_data`: the raw data extracted from the VoteWatch website.

`VoteWatch Europe European Parliament, Council of the EU.csv`: list of the documents voted during the considered term, with some details such as the date and topic.

`votes_by_document`: this folder contains a collection of CSV files, each one describing the outcome of the vote session relatively to one specific document.

`intermediate_files`: this folder contains several CSV files:

`allvotes.csv`: concatenation of all vote outcomes for all documents and all MEPs. Can be considered a compact representation of the data contained in the folder `votes_by_document`.

`loyalty.csv`: same as `allvotes.csv`, but for loyalty (i.e. whether or not the MEP voted like the majority of the MEPs in their political group).

`MPs.csv`: list of the MEPs having voted at least once in the considered term, with their details.

`policies.csv`: list of the topics considered during the term.

`qtd_docs.csv`: list of the topics with the corresponding number of documents.

`parallel_ils_results`: contains the raw results of the ILS tool. This is an external algorithm able to estimate the optimal partition of the network nodes in terms of structural balance. It was applied to all the networks extracted by our scripts (from the VoteWatch data), and the produced files were placed here for postprocessing. Each subfolder corresponds to one topic-year pair.

`output_files`: contains the files produced by our scripts.

`agreement`: histograms representing the distributions of agreement and rebellion indices. Each subfolder corresponds to a specific topic.

`community_algorithms_csv`: Performances obtained by the partitioning algorithms (for both community detection and correlation clustering). Each subfolder corresponds to a specific topic.

`xxxx_cluster_information.csv`: table containing several variants of the imbalance measure, for the considered algorithms.

`community_algorithms_results`: Comparison of the partitions detected by the various algorithms considered, and distribution of the cluster/community sizes. Each subfolder corresponds to a specific topic.

`xxxx_cluster_comparison.csv`: table comparing the partitions detected by the community detection algorithms, in terms of Rand index and other measures.

`xxxx_ils_cluster_comparison.csv`: like `xxxx_cluster_comparison.csv`, except we compare the partition of community detection algorithms with that of the ILS.

`xxxx_yyyy_distribution.pdf`: histogram of the community (or cluster) sizes detected by algorithm `yyyy`.

`graphs`: the networks extracted from the vote data. Each subfolder corresponds to a specific topic.

`xxxx_complete_graph.graphml`: network in GraphML format, with all the information: nodes, edges, nodal attributes (including communities), weights, etc.

`xxxx_edges_Gephi.csv`: only the links, with their weights (i.e. vote similarity).

`xxxx_graph.g`: network in the g format (for ILS).

`xxxx_net_measures.csv`: table containing some stats on the network (number of links, etc.).

`xxxx_nodes_Gephi.csv`: list of nodes (i.e. MEPs), with details.

`plots`: synthesis plots from the paper.
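The comparison files listed above report partition similarity in terms of the Rand index, among other measures. As a minimal sketch of that measure (not the authors' implementation; the function name `rand_index` and the label-list input format are assumptions), the Rand index is the fraction of node pairs on which two partitions agree, that is, pairs placed together in both or apart in both:

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index between two partitions of the same nodes.

    labels_a, labels_b: sequences where position i holds the cluster
    label of node i in each partition (assumed input format).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one node: the partitions trivially agree
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

The measure depends only on which nodes are grouped together, so relabeling clusters leaves it unchanged. This pairwise version is quadratic in the number of nodes, which is acceptable for networks of a few hundred MEPs.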

-------------------------

License. These data are shared under a Creative Commons 0 license.

Contact. Vincent Labatut <vincent.labatut@univ-avignon.fr> & Rosa Figueiredo <rosa.figueiredo@univ-avignon.fr>

Package	#Nodes	#Edges	Split Type	Task Type	Metric
`ogb>=1.1.1`	2,449,029	61,859,140	Sales rank	Multi-class classification	Accuracy
