Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
A Wikipedia dataset containing 100,000 pages and 28.9 million links, collected with a breadth-first search (BFS) crawl. It includes complete page metadata, link relationships, and a network graph representation suitable for network analysis, graph algorithms, NLP research, and machine learning applications.
pages_export.csv: Complete page metadata, including:
- id: Unique page ID
- title: Page title
- language: Language code (en)
- content_length: Content length in characters
- word_count: Word count
- categories: JSON array of categories
- infobox: JSON object of infobox data
- created_at: Timestamp
- url: Full Wikipedia URL
Size: ~70 MB | Rows: 100,000
links_export.csv: Complete link graph with URLs:
- id: Unique link ID
- source_title: Source page title
- target_title: Target page title
- language: Language code
- position: Link position on page
- depth: Crawl depth where link was discovered
- created_at: Timestamp
- source_url: Full source page URL
- target_url: Full target page URL
Size: ~4.5 GB | Rows: 28,855,738
graph.json: Network graph in JSON format:
- nodes: Array of node objects with id field
- edges: Array of edge objects with source and target fields
Size: ~2.1 GB | Edges: 28,855,738
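The graph.json layout described above can be turned into an adjacency map with a few lines of Python. This is a minimal sketch, assuming the file has top-level "nodes" and "edges" keys shaped as listed; the sample graph is made up for illustration and is not taken from the dataset.

```python
import json


def build_adjacency(graph):
    """Map each node id to the list of node ids it links to."""
    adjacency = {node["id"]: [] for node in graph["nodes"]}
    for edge in graph["edges"]:
        # setdefault also covers edges whose source is missing from "nodes".
        adjacency.setdefault(edge["source"], []).append(edge["target"])
    return adjacency


# Tiny made-up sample in the described shape (not real data from the dataset).
sample = json.loads("""
{"nodes": [{"id": "A"}, {"id": "B"}, {"id": "C"}],
 "edges": [{"source": "A", "target": "B"},
           {"source": "A", "target": "C"},
           {"source": "B", "target": "C"}]}
""")

adjacency = build_adjacency(sample)
out_degree = {node: len(targets) for node, targets in adjacency.items()}
print(out_degree)
```

At ~2.1 GB, the real file may not fit comfortably in memory with a plain json.load; streaming links_export.csv row by row with the standard csv module is a likely alternative.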
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 540 images of well-known graphs from graph theory. It covers many types of graphs from various graph families and levels of complexity, along with the stories and open questions behind them.
Many tasks in the study of non-trivial graphs are computationally expensive, especially for higher-order graphs. Can state-of-the-art computer vision pipelines and algorithms extract insight from graph images? Such insight ranges from simple properties, like the number of vertices and edges, to harder questions, like clique sizes, graph radius, and paths.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The ARG Database is a large collection of labeled and unlabeled graphs created by the MIVIA Group. The aim of this collection is to provide the graph research community with a standard test ground for benchmarking graph matching algorithms.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
MS-BioGraphs are a family of sequence similarity graph datasets with up to 2.5 trillion edges. The graphs have weighted edges and are provided in compressed WebGraph format. The datasets include symmetric and asymmetric graphs. The largest graph was created by matching sequences in the Metaclust dataset, which has 1.7 billion sequences. These real-world graph datasets are useful for measuring contributions in High-Performance Computing and High-Performance Graph Processing.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Here you will find the History of Work resources as Linked Open Data. They let you look up HISCO and HISCAM scores for a large number of occupational titles in numerous languages.
Data can be queried (obtained) via the SPARQL endpoint or via the example queries. If the Linked Open Data format is new to you, you might enjoy the data stories on "History of Work as Linked Open Data" and the user question "Is there a list of female occupations?".
This version is dated Apr 2025 and is not backwards compatible with the previous version (Feb 2021). The major changes are:
- substantial simplification of the graph representation (from 81 to 12);
- use of sdo (https://schema.org/) rather than schema (http://schema.org);
- replacement of prov:wasDerivedFrom with sdo:isPartOf to link occupational titles to originating datasets;
- ETL files (used for conversion to Linked Data) now publicly available via https://github.com/rlzijdeman/rdf-hisco;
- fixes for issues with language tags;
- specification of language tags for English (e.g. @en-gb instead of @en);
- new preferred API: https://api.druid.datalegend.net/datasets/HistoryOfWork/historyOfWork-all-latest/sparql (the old API will be deprecated at some point: https://api.druid.datalegend.net/datasets/HistoryOfWork/historyOfWork-all-latest/services/historyOfWork-all-latest/sparql).
There are bound to be some issues. Please report them here.
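As a rough illustration of querying the preferred API listed above, the sketch below builds a SPARQL request in Python without sending it. It assumes the endpoint accepts a standard SPARQL 1.1 Protocol GET request with a `query` parameter and can return JSON results. The query is a generic exploratory one that makes no assumption about the dataset's vocabulary.

```python
import urllib.parse
import urllib.request

# Endpoint URL as given in the release notes for this version.
ENDPOINT = (
    "https://api.druid.datalegend.net/datasets/HistoryOfWork/"
    "historyOfWork-all-latest/sparql"
)

# Generic exploratory query: returns any five triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)

# To actually run the query (requires network access):
# import json
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
# for row in results["results"]["bindings"]:
#     print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Keeping request construction separate from the network call makes the query easy to inspect or reuse with another HTTP client.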
Figure 1. Part of model illustrating the basic relation between occupations, schema.org and HISCO.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca5521
Figure 2. Part of model illustrating the relation between occupation, provenance and HISCO auxiliary variables.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca551e
mmpr/graph dataset hosted on Hugging Face and contributed by the HF Datasets community
https://www.emergenresearch.com/privacy-policy
The global Graph Database market size reached USD 1.59 billion in 2020, and revenue is forecast to reach USD 11.25 billion in 2030, registering a CAGR of 21.9%. The Graph Database (GDB) industry report classifies the global market by share, trend, growth and on the basis of component, deployment, graph type...
https://creativecommons.org/publicdomain/zero/1.0/
This project is inspired by https://github.com/neo4j-graph-examples/twitter-v2.
Show data from your personal Twitter account
The Graph Your Network application inserts your Twitter activity into Neo4j.
https://neo4jsandbox.com/guides/twitter/img/twitter-data-model.svg
~10 MB of graph data (CSV)
43,325 nodes with the following labels:
- Hashtag
- Link
- Me
- Source
- Tweet
- User
57,896 relationships of the following types:
- AMPLIFIES
- CONTAINS
- FOLLOWS
- INTERACTS_WITH
- MENTIONS
- POSTS
- REPLY_TO
- RETWEETS
- RT_MENTIONS
- SIMILAR_TO
- TAGS
- USING
The dataset used in the paper is a Waxman random graph dataset, which includes graphs with features and edge features.
https://www.datainsightsmarket.com/privacy-policy
Discover the explosive growth of the graph technology market! Our in-depth analysis reveals key drivers, trends, and challenges impacting this dynamic sector, including leading players like Neo4j, Amazon AWS, and more. Explore market size projections, CAGR forecasts, and regional breakdowns to understand investment opportunities in graph databases and AI/ML.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset information
9 graphs of Autonomous Systems (AS) peering information inferred from Oregon
route-views between March 31 2001 and May 26 2001.
Dataset statistics are calculated for the graphs with the lowest (March 31 2001)
and highest (May 26 2001) number of nodes.
Dataset statistics for graph with lowest number of nodes (March 31 2001)
Nodes 10670
Edges 22002
Nodes in largest WCC 10670 (1.000)
Edges in largest WCC 22002 (1.000)
Nodes in largest SCC 10670 (1.000)
Edges in largest SCC 22002 (1.000)
Average clustering coefficient 0.4559
Number of triangles 17144
Fraction of closed triangles 0.009306
Diameter (longest shortest path) 9
90-percentile effective diameter 4.5
Dataset statistics for graph with highest number of nodes (May 26 2001)
Nodes 11174
Edges 23409
Nodes in largest WCC 11174 (1.000)
Edges in largest WCC 23409 (1.000)
Nodes in largest SCC 11174 (1.000)
Edges in largest SCC 23409 (1.000)
Average clustering coefficient 0.4532
Number of triangles 19894
Fraction of closed triangles 0.009636
Diameter (longest shortest path) 10
90-percentile effective diameter 4.4
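Several of the statistics listed above (nodes, edges, number of triangles, average clustering coefficient) can be computed directly from an edge list. The sketch below is one hedged way to do so in plain Python. It assumes a whitespace-separated edge list in which lines starting with "#" are comments, treats the graph as undirected, and counts nodes of degree below 2 as having clustering 0. These conventions may differ from the ones used to produce the published figures, and the sample edges are made up.

```python
from itertools import combinations


def parse_edge_list(text):
    """Parse whitespace-separated node pairs into an undirected adjacency map."""
    neighbors = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        u, v = line.split()[:2]
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def graph_stats(neighbors):
    """Return node count, edge count, triangle count and average clustering."""
    edges = sum(len(adj) for adj in neighbors.values()) // 2
    corner_triangles = 0  # each triangle is seen once per corner (3 times)
    clustering_sum = 0.0
    for adj in neighbors.values():
        # Count pairs of this node's neighbours that are themselves linked.
        links = sum(1 for a, b in combinations(adj, 2) if b in neighbors[a])
        corner_triangles += links
        k = len(adj)
        if k >= 2:
            clustering_sum += links / (k * (k - 1) / 2)
    return {
        "nodes": len(neighbors),
        "edges": edges,
        "triangles": corner_triangles // 3,
        "avg_clustering": clustering_sum / len(neighbors) if neighbors else 0.0,
    }


# Made-up sample: one triangle (1, 2, 3) plus a pendant node 4.
sample = """# FromNodeId ToNodeId
1 2
2 3
1 3
3 4
"""
stats = graph_stats(parse_edge_list(sample))
print(stats)
```

Checking every neighbour pair is quadratic in a node's degree, so this is fine for graphs of roughly this size but slow for very high-degree hubs.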
Source (citation)
J. Leskovec, J. Kleinberg and C. Faloutsos. Graphs over Time: Densification
Laws, Shrinking Diameters and Possible Explanations. ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD), 2005.
Files
File Description
* AS peering information inferred from Oregon route-views ...
oregon1_010331.txt.gz from March 31 2001
oregon1_010407.txt.gz from April 7 2001
oregon1_010414.txt.gz from April 14 2001
oregon1_010421.txt.gz from April 21 2001
oregon1_010428.txt.gz from April 28 2001
oregon1_010505.txt.gz from May 05 2001
oregon1_010512.txt.gz from May 12 2001
oregon1_010519.txt.gz from May 19 2001
oregon1_010526.txt.gz from May 26 2001
NOTE: for the UF Sparse Matrix Collection, the primary matrix in this problem
set (Problem.A) is the last matrix in the sequence, oregon1_010526, from May 26
2001.
The nodes are uniform across all graphs in the sequence in the UF collection.
That is, nodes do...
https://www.marketreportanalytics.com/privacy-policy
Discover the booming Knowledge Graph Technology market! This comprehensive analysis reveals key trends, growth drivers, and regional market shares from 2025-2033. Learn about market size, CAGR, and top players shaping this transformative technology.
zkchen/GOOD-Graph dataset hosted on Hugging Face and contributed by the HF Datasets community
https://univdatos.com/privacy-policy
The Global Graph Database Market was valued at USD 2,257.78 million in 2024 and is expected to grow at a strong CAGR of around 17.5% during 2025-2033.
According to our latest research, the global graph database for telecom networks market size is valued at USD 1.34 billion in 2024, reflecting a robust adoption rate across the telecom sector. The market is experiencing a strong upward trajectory with a CAGR of 22.7% from 2025 to 2033. By 2033, the market is projected to reach a substantial USD 10.15 billion, driven by the increasing complexity of telecom networks and the urgent need for advanced data management and analytics solutions. The primary growth factor is the surging demand for real-time network analytics and fraud detection capabilities, which are critical for telecom operators seeking operational efficiency and competitive advantage.
The rapid proliferation of connected devices, 5G rollouts, and the exponential growth of data traffic are fundamentally transforming the telecom industry landscape. Telecom networks are evolving into highly complex, dynamic ecosystems that generate vast amounts of interconnected data. Traditional relational databases are often inadequate for handling such intricate relationships and real-time analytics requirements. Graph database solutions are uniquely positioned to address these challenges by enabling telecom operators to model, analyze, and visualize complex network topologies, customer interactions, and transactional data with unparalleled speed and flexibility. This technological shift is a key growth driver, as telecom providers increasingly seek scalable, agile, and intelligent data management platforms to enhance customer experience, optimize network performance, and accelerate digital transformation initiatives.
Another significant growth factor for the graph database for telecom networks market is the escalating threat landscape, particularly in the domain of fraud detection and cybersecurity. Telecom operators are frequent targets of sophisticated fraud schemes, including SIM card cloning, subscription fraud, and network intrusion attempts. Graph databases excel at identifying hidden patterns, relationships, and anomalies within massive datasets, enabling telecom companies to detect and mitigate fraud in real time. The ability to perform advanced analytics on interconnected data sets is empowering telecom operators to proactively safeguard their networks, reduce financial losses, and comply with stringent regulatory requirements. As the complexity of cyber threats intensifies, the adoption of graph database solutions for security and fraud prevention is expected to surge, further fueling market growth.
The growing emphasis on customer-centricity and personalized service delivery is also propelling market expansion. Telecom operators are leveraging graph databases to gain a 360-degree view of customer journeys, preferences, and interactions across multiple touchpoints. This holistic understanding facilitates targeted marketing, churn prediction, and tailored service offerings, which are essential for customer retention and revenue growth in a highly competitive market. The convergence of telecom networks with emerging technologies such as artificial intelligence, machine learning, and the Internet of Things (IoT) is amplifying the need for graph-based analytics, as these technologies rely on real-time, context-aware insights derived from complex data relationships. As a result, the integration of graph databases into telecom network architectures is becoming a strategic imperative for industry leaders.
From a regional perspective, North America currently leads the global graph database for telecom networks market, accounting for the largest revenue share in 2024. The region’s dominance is attributed to the early adoption of advanced analytics technologies, robust digital infrastructure, and the presence of major telecom and technology companies. Asia Pacific is emerging as the fastest-growing region, driven by massive investments in 5G networks, expanding mobile subscriber base, and increasing focus on digital transformation across telecom operators. Europe is also witnessing significant adoption of graph database solutions, particularly in the context of regulatory compliance and network optimization. Meanwhile, Latin America and the Middle East & Africa are gradually catching up, supported by ongoing telecom sector modernization and rising demand for advanced data analytics. The global market outlook remains highly promising, with all regions poised to contribute to sustained growth over the forecast period.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The OpenAIRE Graph is exported as several files, so you can download the parts you are interested in.
- publication_[part].tar: metadata records about research literature (includes types of publications listed here)
- dataset_[part].tar: metadata records about research data (includes the subtypes listed here)
- software.tar: metadata records about research software (includes the subtypes listed here)
- otherresearchproduct_[part].tar: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed here)
- organization.tar: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.
- datasource.tar: metadata records about data sources whose content is available in the OpenAIRE Graph. They include institutional and thematic repositories, journals, aggregators, funders' databases.
- project.tar: metadata records about project grants.
- relation_[part].tar: metadata records about relations between entities in the graph.
- communities_infrastructures.tar: metadata records about research communities and research infrastructures
Each file is a tar archive containing gz files, each with one JSON object per line. Each JSON object complies with the schema available at http://doi.org/10.5281/zenodo.14608526. The documentation for the model is available at https://graph.openaire.eu/docs/data-model/
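The layout described above (a tar archive of gz files, one JSON object per line) can be read without unpacking anything to disk. The following is a minimal sketch using only the Python standard library. The member name and the "id" field in the demo are hypothetical and stand in for whatever the real archives contain.

```python
import gzip
import io
import json
import tarfile


def iter_records(tar_fileobj):
    """Yield one parsed JSON object per line from every .gz member of a tar."""
    with tarfile.open(fileobj=tar_fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    if line.strip():
                        yield json.loads(line)


def make_demo_archive():
    """Build a tiny in-memory archive in the same shape (made-up content)."""
    payload = gzip.compress(b'{"id": "r1"}\n{"id": "r2"}\n')
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="part-00000.json.gz")  # hypothetical name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


records = list(iter_records(make_demo_archive()))
print(records)
```

For a downloaded file, passing `open("software.tar", "rb")` in place of the demo buffer should work the same way. Because records are yielded one at a time, memory use stays low even for large archives.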
Learn more about the OpenAIRE Graph at https://graph.openaire.eu.
Discover the graph's content on OpenAIRE EXPLORE and our API for developers.
This deposition contains:
192,934,523 publications,
73,443,566 datasets,
596,316 software,
24,797,142 other research products,
141,568 datasources,
3,482,537 projects,
454,601 organizations,
34 communities,
7,241,517,003 relations
This dataset was created by Jam0222
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.
The dataset comprises two compressed (.xz) files:
1) openalexID_integer_id_hasDOI.parquet.xz: tabular data with three columns, where each row represents a record:
- openalex_id: A unique identifier from the Open Alex catalog.
- integer_id: An integer representing the new identifier (assigned by the authors).
- hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) citation_table.tsv.xz: an edge list of citations with two columns (no header) of integer values that represent the citing and cited integer_id, respectively.
Summary Features
- Total Nodes (Documents): 256,997,006
- Total Edges (citations): 2,148,871,058
- Documents with DOIs: 163,495,446
- Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
- Count of unique nodes in edgelist: 111,453,719 [updated Nov 13, 2025]
Note (Nov 13, 2025): An improved curation process will be applied to a future version of this dataset.
Note (Nov 13, 2025): The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
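The citation edge list (citation_table.tsv.xz) is described as a headerless, two-column file of citing and cited integer ids, so it can be streamed straight from the .xz file. The sketch below shows one possible way to tally citations per document and count unique nodes. It assumes tab-separated columns, as the .tsv extension suggests, and uses a small made-up in-memory sample rather than the real file.

```python
import io
import lzma
from collections import Counter


def count_citations(xz_fileobj):
    """Return (citations received per id, set of unique node ids, edge count)."""
    cited_counts = Counter()
    nodes = set()
    edges = 0
    with lzma.open(xz_fileobj, mode="rt", encoding="utf-8") as rows:
        for row in rows:
            if not row.strip():
                continue
            # Column order per the description: citing id, then cited id.
            citing, cited = (int(value) for value in row.split("\t"))
            cited_counts[cited] += 1
            nodes.update((citing, cited))
            edges += 1
    return cited_counts, nodes, edges


# Made-up sample: documents 1 and 3 cite 2, and document 1 cites 3.
demo = io.BytesIO(lzma.compress(b"1\t2\n3\t2\n1\t3\n"))
cited_counts, nodes, edges = count_citations(demo)
print(edges, len(nodes), cited_counts.most_common(1))
```

At the full scale reported above (over two billion edges), holding every id in a Python set and Counter would need a great deal of memory, so an array-based or out-of-core approach is likely more practical for the real file.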
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the database file for the Encyclopedia of Finite Graphs and the upcoming paper "Integer sequence discovery from small graphs". It contains a collection of invariants for all simple connected graphs up to order 10 and the integer sequences that can be derived from them.