Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We distribute here the datasets used in the tests for the paper:
«Beyond Macrobenchmarks: Microbenchmark-based Graph Database Evaluation.»
by Lissandrini, Matteo; Brugnara, Martin; and Velegrakis, Yannis.
In PVLDB, 12(4):390-403, 2018.
From the official webpage: https://graphbenchmark.com/
The original files were stored on Google Drive, which is going to be discontinued.
The datasets used in the tests are stored in GraphSON format for the versions of the engines supporting TinkerPop 3. Systems using TinkerPop 2 support GraphSON 1.0 instead. Our datasets can be easily converted to a newer or older version. For an example, see our Docker image.
The MiCo Dataset comes from the authors of GraMi.
For more details, you can read:
«GRAMI: Frequent Subgraph and Pattern Mining in a Single Large Graph»
by Mohammed Elseidy, Ehab Abdelhamid, Spiros Skiadopoulos, and Panos Kalnis.
In PVLDB, 7(7):517-528, 2014.
The Yeast Dataset has been converted from the version transformed into Pajek format by V. Batagelj. The original dataset comes from:
«Topological structure analysis of the protein-protein interaction network in budding yeast»
by Shiwei Sun, Lunjiang Ling, Nan Zhang, Guojie Li and Runsheng Chen.
In Nucleic Acids Research, 2003, Vol. 31, No. 9 2443-2450
Moreover, you can read about the details of our Freebase ExQ datasets, or you can use our Docker image to generate the LDBC synthetic dataset.
Name | Files | Size (bytes) | Graph Size (Nodes/Edges) |
---|---|---|---|
Yeast | yeast.json yeast.json.gz | 1.5M 180K | 2.3K / 7.1K |
MiCo | mico.json mico.json.gz | 84M 12M | 0.1M / 1.1M |
Frb-O | freebase_org.json freebase_org.json.gz | 584M 81M | 1.9M / 4.3M |
Frb-S | freebase_small.json freebase_small.json.gz | 87M 12M | 0.5M / 0.3M |
Frb-M | freebase_medium.json freebase_medium.json.gz | 816M 117M | 4M / 3.1M |
Frb-L | freebase_large.json freebase_large.json.gz | 6.3G 616M | 28.4M / 31.2M |
LDBC | ldbc.json ldbc.json.gz | 144M 13M | 0.18M / 1.5M |
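As a rough check of the node and edge counts in the table, the sketch below counts vertices and edges in a line-delimited GraphSON file. It assumes the TinkerPop 3 adjacency layout, with one JSON vertex per line and an "outE" object mapping each edge label to a list of outgoing edges; that layout is an assumption and should be verified against the actual files. The function names and the two-line sample are hypothetical and not taken from the datasets.

```python
import gzip
import io
import json


def count_graphson(lines):
    """Count vertices and edges in line-delimited GraphSON (one vertex per line)."""
    vertices = 0
    edges = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        vertex = json.loads(line)
        vertices += 1
        # Assumed layout: "outE" maps an edge label to a list of outgoing edges.
        for edge_list in vertex.get("outE", {}).values():
            edges += len(edge_list)
    return vertices, edges


def count_graphson_file(path):
    """Open a plain or gzip-compressed file and count its vertices and edges."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return count_graphson(handle)


# Tiny hand-written sample in the assumed layout (not taken from the datasets).
SAMPLE = io.StringIO(
    '{"id": 1, "label": "protein", "outE": {"interacts": [{"id": 10, "inV": 2}]}}\n'
    '{"id": 2, "label": "protein", "inE": {"interacts": [{"id": 10, "outV": 1}]}}\n'
)
SAMPLE_COUNTS = count_graphson(SAMPLE)
print(SAMPLE_COUNTS)
```

Counting only "outE" entries avoids counting each edge twice, since in this layout every edge also appears under "inE" on its target vertex. For a downloaded file, a call such as count_graphson_file("yeast.json.gz") would stream it without loading the whole graph into memory.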
https://www.marketresearchforecast.com/privacy-policy
The Graph Database Market size was valued at USD 1.9 billion in 2023 and is projected to reach USD 7.91 billion by 2032, exhibiting a CAGR of 22.6% during the forecast period. A graph database is a form of NoSQL database that stores and represents relationships as graphs. Rather than modeling data as relations, as most relational databases do, graph databases use nodes, edges, and properties. The primary types are property graphs, which permit attributes on nodes and edges, and RDF triplestores, which center on subject-predicate-object triples. Key features include fast relationship traversal, easy schema changes, and scalability. Familiar use cases are social media, recommendations, anomaly or fraud detection, and knowledge graphs, where relationships are complex and need deeper analysis. These databases are valuable where the connections between data items are as significant as the data themselves. Key drivers for this market are: Increasing Adoption of Cloud-based Managed Services to Drive Market Growth. Potential restraints include: Adverse Health Effect May Hamper Market Growth. Notable trends are: Growing Implementation of Touch-based and Voice-based Infotainment Systems to Increase Adoption of Intelligent Cars.
https://www.emergenresearch.com/privacy-policy
The global Graph Database market size reached USD 1.59 Billion in 2020 and revenue is forecasted to reach USD 11.25 Billion in 2030 registering a CAGR of 21.9%. Graph Database (GDB) industry report classifies global market by share, trend, growth and on the basis of component, deployment, graph type...
https://www.verifiedmarketresearch.com/privacy-policy/
Graph Database Market size was valued at USD 2.86 Billion in 2024 and is projected to reach USD 14.58 Billion by 2032, growing at a CAGR of 22.6% from 2026 to 2032.
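The growth rate quoted above can be sanity-checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below applies it to the two valuations in this entry, assuming the 2024 figure is the base year, which gives an eight-year span to 2032; the function name is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the entry above: USD 2.86 billion (2024) to USD 14.58 billion (2032).
implied = cagr(2.86, 14.58, 2032 - 2024)
print(f"{implied:.1%}")
```

Under that assumption the implied rate comes out at roughly 22.6%, in line with the figure the report states.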
Global Graph Database Market Drivers
The growth and development of the Graph Database Market is attributed to several main market drivers. These factors have a big impact on how graph databases are demanded and adopted in different sectors. Several of the major market forces are as follows:
Growth of Connected Data: Graph databases are excellent at expressing and querying relationships as businesses work with datasets that are more complex and interconnected. Graph databases are becoming more and more in demand as connected data gains significance across multiple industries.
Knowledge Graph Emergence: In fields like artificial intelligence, machine learning, and data analytics, knowledge graphs, which arrange information in a graph structure, are becoming more and more popular. Graph databases are a natural fit for creating and querying knowledge graphs, which is driving their widespread use.
Analytics and Machine Learning Advancements: Graph databases handle relationships and patterns in data effectively, enabling applications related to advanced analytics and machine learning. Graph databases are becoming more and more in demand when combined with analytics and machine learning as businesses want to extract more insights from their data.
Real-Time Data Processing: Graph databases can process data in real-time, which makes them appropriate for applications that need quick answers and insights. In situations like fraud detection, recommendation systems, and network analysis, this is especially helpful.
Increasing Need for Security and Fraud Detection: Graph databases are useful for security and fraud detection applications because they can identify patterns and anomalies in linked data. The growing need for graph databases in security solutions is a result of the ongoing evolution of cybersecurity threats.
https://www.promarketreports.com/privacy-policy
The size of the Graph Database Market was valued at USD 19942.01 million in 2023 and is projected to reach USD 64282.28 million by 2032, with an expected CAGR of 18.20% during the forecast period. A graph database is a type of NoSQL database designed to represent and store data in the form of graphs, consisting of nodes, edges, and properties. This database model is optimized for handling highly interconnected data, allowing relationships and networks to be represented with ease. The nodes in a graph database represent entities such as people, places, or events, while the edges represent the relationships or connections between these entities. Properties can be attached to both nodes and edges to store additional information, providing a rich structure for complex data sets. Unlike traditional relational databases, which use tables to organize data in rows and columns, graph databases use graph theory to model the relationships between data points, which enables more efficient querying and analysis, especially for large and complex data structures. This growth is attributed to factors such as increased data complexity, the need for real-time insights, and advancements in AI and ML. Graph databases provide efficient storage and analysis of highly interconnected data, making them valuable for fraud detection, social network analysis, and recommendation systems. Key players include Oracle Corporation, IBM Corporation, and Amazon Web Services, Inc.
Recent developments include:
June 2021: Neo4j released its most recent graph database version, 4.3. Graph data analysis, relationship asset indexes, new smart IO scheduling, and parallelized backup are some of the features included in that version.
April 2021: MarkLogic Corp. introduced the MarkLogic Data Hub Central low-code/no-code user interface. With the ease and agility of using the data infrastructure, MarkLogic's launch provides organizations with a clear roadmap for cloud modernization.
October 2020: Microsoft Corporation unveiled a brand-new artificial intelligence platform that can caption and describe photos. Azure Cognitive Services offers the system.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ARG Database is a huge collection of labeled and unlabeled graphs realized by the MIVIA Group.
The aim of this collection is to provide the graph research community with a standard test ground for the benchmarking of graph matching algorithms. The database is organized in two sections: labeled and unlabeled graphs.
Both labeled and unlabeled graphs have been randomly generated according to six different generation models, each involving different possible parameter settings. As a result, 168 diverse kinds of graphs are contained in the database. Each type of unlabeled graph is represented by thousands of pairs of graphs for which an isomorphism or a graph-subgraph isomorphism relation holds, for a total of 143,600 graphs. Furthermore, each type of labeled graph is represented by thousands of pairs of graphs holding a not trivial common subgraph, for a total of 166,000 graphs.
For more details follow this link: https://mivia.unisa.it/datasets/graph-database/arg-database/documentation/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The EURIO (EUropean Research Information Ontology) Knowledge Graph is the knowledge graph containing CORDIS data about research projects funded by the H2020 and FP7 framework programmes. The EURIO Knowledge Graph can be accessed via its SPARQL endpoint at this link: https://cordis.europa.eu/datalab/sparql-endpoint/en. This dataset provides both a database dump of the EURIO Knowledge Graph and subsets of the EURIO Knowledge Graph in the form of named graphs.
The schema defining the structure of the named graphs is the EURIO ontology, available at https://op.europa.eu/en/web/eu-vocabularies/eurio. All files are available in the following formats: RDF, TTL, N-Quads, JSON-LD, and N-Triples. For other formats (xlsx, csv, etc.), please refer to these links: https://data.europa.eu/data/datasets/cordish2020projects and https://data.europa.eu/data/datasets/cordisfp7projects
The file EURIO Knowledge Graph contains a database dump of all CORDIS data about research projects funded under the H2020 and FP7 framework programmes. The file Project contains all projects funded under the H2020 and FP7 framework programmes. The file Organisation contains all organisations funded under the H2020 and FP7 framework programmes.
Reference data (countries, funding schemes/types of action, etc.) can be found in this dataset: https://data.europa.eu/euodp/en/data/dataset/cordisref-data, while the EuroSciVoc taxonomy can be freely downloaded or browsed on the EuVocabularies website at this link: https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/euroscivoc
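For readers who want to query the knowledge graph rather than download the dump, the sketch below builds a SPARQL 1.1 Protocol GET request using only the Python standard library. The service URL here is a placeholder assumption, since the link given in the description is a documentation page and the actual query URL should be taken from there. The query is deliberately generic and does not assume any EURIO class or property names.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder service URL; replace it with the real query endpoint.
ENDPOINT = "https://example.org/sparql"

# Generic query that fetches five arbitrary triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
# To execute it: urllib.request.urlopen(request), then json.load() the response.
```

The request is only constructed here, not sent, so the snippet runs without network access. Whether the service honours the JSON results media type is also an assumption to check against its documentation.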
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description of data sets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Distance-1 coloring.
https://www.gnu.org/licenses/lgpl-3.0-standalone.html
Data sets and json files (describing the semantic header and dataset description) to build an Event Knowledge Graph (EKG) using OCED-PG as used in [1].
Provides input data for 6 datasets (BPIC14, BPIC15, BPIC16, BPIC17, BPIC19 and a simulated library example).
EKGs are built using OCED-PG, implemented in PromG v0.1.25. The source code can be found on GitHub.
To build EKGs using OCED-PG
[1] Swevels, A., Fahland, D., Montali, M.: Implementing Object-Centric Event Data Models in Event Knowledge Graphs (2023)
https://www.marketreportanalytics.com/privacy-policy
The Knowledge Graph Technology market is experiencing robust growth, driven by the increasing need for enhanced data organization, improved search capabilities, and the rise of artificial intelligence (AI) and machine learning (ML) applications. The market's expansion is fueled by several key factors, including the growing volume of unstructured data, the need for better data integration across disparate sources, and the demand for more intelligent and context-aware applications. Businesses across various sectors, including healthcare, finance, and e-commerce, are adopting knowledge graphs to enhance decision-making, improve customer experiences, and gain a competitive advantage. The market is witnessing significant advancements in graph database technologies, semantic technologies, and knowledge representation techniques, further accelerating its growth trajectory. While challenges such as data quality issues and the complexity of implementing and maintaining knowledge graphs exist, the substantial benefits are driving widespread adoption. We project a substantial increase in market size over the next decade, with particular growth anticipated in regions with advanced digital infrastructures and strong investments in AI and data analytics. The segmentation of the market by application (e.g., customer relationship management, fraud detection, supply chain optimization) and type (e.g., ontology-based, rule-based) reflects the diverse use cases driving adoption across different sectors. The forecast for Knowledge Graph Technology demonstrates continued, albeit potentially moderating, growth through 2033. While the initial years will likely see strong expansion driven by early adoption and technological advancements, the growth rate might stabilize as the market matures. 
However, continued innovation, particularly in areas like integrating knowledge graphs with emerging technologies such as the metaverse and Web3, and expansion into new applications within industries like personalized medicine and smart manufacturing, will ensure sustained, though potentially less rapid, growth. Geographical expansion, particularly into developing economies with increasing digitalization, presents a significant opportunity for market expansion. Competitive pressures among vendors will drive further innovation and potentially lead to consolidation within the market. Therefore, a thorough understanding of market segmentation, competitive dynamics, and technological advancements is crucial for stakeholders to navigate the evolving landscape and capitalize on emerging opportunities.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This database, derived from the AIDS Antiviral Screen Database of Active Compounds, is composed of 2000 chemical compounds, some of which are disconnected. These chemical compounds have been screened as active or inactive against HIV and they are split into three different sets:
Results on AIDS dataset.
No. | Method | Classification accuracy (%) |
---|---|---|
(1) | Riesen and Bunke (2008) | 97.3 |
(2) | Suard et al. (2002) | 98.5 |
(3) | Vishwanathan et al. (2010) | 98.5 |
(4) | Neuhaus and Bunke (2007) | 99.7 |
(5) | Riesen et al. (2007) | 98.2 |
(6) | Graph Laplacian kernel | 99.3 |
(7) | Gauzere et al. (2012) | 99.1 |
https://www.verifiedmarketresearch.com/privacy-policy/
Knowledge Graph Market size was valued at USD 7.19 Billion in 2024 and is expected to reach USD 4.1 Billion by 2032, growing at a CAGR of 18.1% from 2025 to 2032.
Knowledge Graph Market Drivers
Enhanced Data Integration and Analysis: Knowledge graphs excel at integrating and analyzing data from diverse sources, including structured, semi-structured, and unstructured data. This enables organizations to gain a holistic view of information and make more informed decisions.
Improved Search and Information Retrieval: Knowledge graphs provide a more semantic understanding of information, enabling more accurate and relevant search results. Instead of just keyword matching, knowledge graphs understand the relationships between entities and provide more contextually relevant information.
Personalized Experiences: Knowledge graphs can be used to personalize user experiences by understanding individual preferences, interests, and behaviors. This is crucial for applications like personalized recommendations, targeted advertising, and customer service.
AI and Machine Learning: Knowledge graphs are essential for powering AI and machine learning applications, such as chatbots, recommendation systems, and fraud detection. They provide a structured representation of knowledge that AI/ML models can easily understand and utilize.
Business Intelligence and Decision Making: Knowledge graphs can help businesses gain deeper insights into their customers, markets, and operations. They can be used to identify trends, predict future outcomes, and make more informed business decisions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains the graph representation of structures in the Materials Project (www.materialsproject.org) and target properties, including formation energy per atom, band gap, and, for a subset of 5830 structures, the shear moduli G_{VRH} and bulk moduli K_{VRH}. This data is part of our paper "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals".
Change log:
v5. Minor change of filename.
v4. Add dummy state variables.
v3. For the graph dictionaries, we modify the "node" key to "atom" and "distance" key to "bond" to match the latest MEGNet API.
v2. Minor change of description.
v1. Initial upload.
https://www.cognitivemarketresearch.com/privacy-policy
According to Cognitive Market Research, the global Graph Analytics market size will be USD 2522 million in 2024 and will expand at a compound annual growth rate (CAGR) of 34.0% from 2024 to 2031.
Market Dynamics of Graph Analytics Market
Key Drivers for Graph Analytics Market
Increasing Recognition of the Advantages of Graph Databases: One of the main drivers of the Graph Analytics market is the increasing recognition of the advantages of graph databases. Unlike traditional relational databases, graph databases excel at handling complex relationships and interconnected data, making them ideal for use cases such as fraud detection, recommendation engines, and social network analysis. Businesses are leveraging these capabilities to uncover insights and patterns that were previously difficult to detect. The rise of big data and the need for real-time analytics are further driving the adoption of graph databases, as they offer enhanced performance and scalability for large-scale data sets. Additionally, advancements in artificial intelligence and machine learning are amplifying the value of graph databases, enabling more sophisticated data modeling and predictive analytics.
Growing Uptake of Big Data Tools to Drive the Graph Analytics Market's Expansion in the Years Ahead.
Key Restraints for Graph Analytics Market
Limited Awareness and Understanding pose a serious threat to the Graph Analytics industry.
The market also faces significant difficulties related to data security and privacy.
Introduction of the Graph Analytics Market
The Graph Analytics Market is rapidly expanding, driven by the growing need for advanced data analysis techniques in various sectors. Graph analytics leverages graph structures to represent and analyze relationships and dependencies, providing deeper insights than traditional data analysis methods. Key factors propelling this market include the rise of big data, the increasing adoption of artificial intelligence and machine learning, and the demand for real-time data processing. Industries such as finance, healthcare, telecommunications, and retail are major contributors, utilizing graph analytics for fraud detection, personalized recommendations, network optimization, and more. Leading vendors are continually innovating to offer scalable, efficient solutions, incorporating advanced features like graph databases and visualization tools.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Abstract: The speed and accuracy of new scientific discoveries, be it by humans or artificial intelligence, depend on the quality of the underlying data and on the technology to connect, search and share the data efficiently. In recent years, we have seen the rise of graph databases and semi-formal data models such as knowledge graphs to facilitate software approaches to scientific discovery. These approaches extend work based on formalised models, such as the Semantic Web. In this paper, we present our developments to connect, search and share data about genome-scale knowledge networks (GSKN). We have developed a simple application ontology based on OWL/RDF with mappings to standard schemas. We are employing the ontology to power data access services like resolvable URIs, SPARQL endpoints, JSON-LD web APIs and Neo4j-based knowledge graphs. We demonstrate how the proposed ontology and graph databases considerably improve search and access to interoperable and reusable biological knowledge (i.e. the FAIRness data principles).
Frequent subgraph mining, i.e., the identification of relevant patterns in graph databases, is a well-known data mining problem with high practical relevance, since next to summarizing the data, the resulting patterns can also be used to define powerful domain-specific similarity functions for prediction. In recent years, significant progress has been made towards subgraph mining algorithms that scale to complex graphs by focusing on tree patterns and probabilistically allowing a small amount of incompleteness in the result. Nonetheless, the complexity of the pattern matching component used for deciding subtree isomorphism on arbitrary graphs has significantly limited the scalability of existing approaches. In this paper, we adapt sampling techniques from mathematical combinatorics to the problem of probabilistic subtree mining in arbitrary databases of many small to medium-size graphs or a single large graph. By restricting on tree patterns, we provide an algorithm that approximately counts or decides subtree isomorphism for arbitrary transaction graphs in sub-linear time with one-sided error. Our empirical evaluation on a range of benchmark graph datasets shows that the novel algorithm substantially outperforms state-of-the-art approaches both in the task of approximate counting of embeddings in single large graphs and in probabilistic frequent subtree mining in large databases of small to medium sized graphs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is based on the model developed with the Ph.D. students of the Communication and Information Sciences Ph.D. program at the University of Hawaii at Manoa, intended to help new students get relevant information. The model was first presented at the iConference 2023, in the paper "Community Design of a Knowledge Graph to Support Interdisciplinary Ph.D. Students" by Stanislava Gardasevic and Rich Gazan (available at: https://scholarspace.manoa.hawaii.edu/server/api/core/bitstreams/9eebcea7-06fd-4db3-b420-347883e6379e/content).
The database is created in Neo4j, and the .dump file can be imported to the cloud instance of this software. The dataset (.dump) contains publicly available data collected from multiple web locations and indexes of the sample of publications from the people in this domain. In addition, it contains my (first author's) personal graph demonstrating progress through a student's program in this degree, and activities they have done while in the program. This dataset was made possible with the huge help of my collaborator, Petar Popovic, who ingested the data in the database.
The model and dataset were developed while involving the end users in the design and are based on the actual information needs of a population. It is intended to allow researchers to investigate multigraph visualization of the data modeled by the said model.
The knowledge graph was evaluated with the CIS student population, and the study results show that it is very helpful for decision-making, information discovery, and identification of people in one's surroundings who might be good collaborators or information points. We provide the .json file containing the Neo4j Bloom perspective with styling and queries used in these evaluation sessions.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This repository is part of the Ph.D. thesis of Isabelle M. van Schilt, Delft University of Technology.
This repository is used to generate a graph of open-source sea and airport data. For this, open-source data of the shipping schedules given by MSC, Maersk, HMM, and Evergreen is used. The data is collected from the websites of the shipping companies (see also https://github.com/EwoutH/shipping-data). The data is then processed to generate a graph of the shipping schedules, including the distributions of the shipping schedules. The graph is used to analyze the shipping schedules and to identify the most important ports in the network. Airport data is collected from the open-source OpenFlights database.
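The graph-building step described above can be pictured with a small sketch: schedule legs become directed edges between ports, and ports are then ranked by how many distinct connections they have. Degree is used here only as one simple stand-in for importance; the description does not say which measure the repository applies. The port codes and legs below are hypothetical examples, not data from the repository.

```python
from collections import defaultdict

# Hypothetical schedule legs (origin port, destination port); not real schedule data.
LEGS = [
    ("CNSHA", "HKHKG"),
    ("HKHKG", "USLAX"),
    ("CNSHA", "USLAX"),
    ("USLAX", "CNSHA"),
    ("BRSSZ", "NLRTM"),
]


def build_graph(legs):
    """Return adjacency sets: port -> set of ports reachable by one leg."""
    graph = defaultdict(set)
    for origin, destination in legs:
        graph[origin].add(destination)
        graph[destination]  # make sure ports with no outgoing leg still appear
    return graph


def rank_by_degree(graph):
    """Rank ports by in-degree plus out-degree, highest first."""
    degree = {port: len(targets) for port, targets in graph.items()}
    for targets in graph.values():
        for port in targets:
            degree[port] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


RANKING = rank_by_degree(build_graph(LEGS))
print(RANKING)
```

Using sets means repeated sailings on the same leg count once; a weighted variant could count sailings per leg instead if frequency matters.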
As a case study, we collect data on CN-HK to main ports in the USA, and mostly MSC data on South America to NL-BE.
This repository is used for developing various graphs on open-source data and automatically running it as a simulation model in the repository: complex_stylized_supply_chain_model_generator
https://creativecommons.org/publicdomain/zero/1.0/
daily updated ERC database for seasonal ERC graphs
averaged ERC value for 14 PSA based on the historical and today's value