Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For K4 and Km-e graphs, a coloring of type (K4,Km-e;n) is an edge coloring of the complete graph Kn that contains no K4 subgraph in the first color (represented by the absence of an edge) and no Km-e subgraph in the second color (represented by the presence of an edge). Km-e denotes the complete graph Km with one edge removed. The Ramsey number R(K4,Km-e) is the smallest natural number n such that every edge coloring of the complete graph Kn contains a subgraph isomorphic to K4 in the first color or isomorphic to Km-e in the second color. Colorings of type (K4,Km-e;n) exist for n<R(K4,Km-e).
The dataset consists of:
a) 5 files containing all non-isomorphic graphs that are colorings of type (K4,K3-e;n) for 1<n<7,
b) 9 files containing all non-isomorphic graphs that are colorings of type (K4,K4-e;n) for 1<n<11.
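The two conditions above can be checked mechanically. The following is a minimal sketch (ours, not part of the dataset) of a verifier for colorings of type (K4,Km-e;n): the second color is given by the edge set of a graph G on n vertices, the first color by G's non-edges.

```python
from itertools import combinations

def is_coloring_type(n, edges, m):
    """Check that G = (range(n), edges) encodes a coloring of type (K4, Km-e; n)."""
    edges = {frozenset(e) for e in edges}
    # No K4 in the first color: every 4 vertices must span at least one edge of G
    # (i.e., G has no independent set of size 4).
    for quad in combinations(range(n), 4):
        if not any(frozenset(p) in edges for p in combinations(quad, 2)):
            return False
    # No Km-e in the second color: no m vertices may carry C(m,2)-1 or more edges.
    full = m * (m - 1) // 2
    for sub in combinations(range(n), m):
        count = sum(1 for p in combinations(sub, 2) if frozenset(p) in edges)
        if count >= full - 1:
            return False
    return True

# A 5-cycle works for (K4, K4-e; 5): its complement is triangle-free, so no
# independent 4-set exists, and no 4 vertices carry 5 of the 6 possible edges.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(is_coloring_type(5, c5, 4))  # True
```

The brute-force scan over all vertex subsets is only practical for the small n covered by this dataset; the enumeration of all non-isomorphic colorings is a much harder problem.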
These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. The objectives of our study were to analyze the semantic structure of input definitions, use this information to build triple structures of RDF graph data, upload our lexicon to a knowledge graph software, and perform SPARQL queries on the data. Upon completion of this study, SPARQL queries were shown to effectively retrieve graph triples of semantic significance. These data represent and characterize the lexicon of our input text, which is used to form graph triples. These data were collected in 2024 by passing text through multiple Python programs utilizing spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before data was processed by the Python programs, input definitions were first rewritten as natural language and formatted as tabular data. Passages were then tokenized and characterized by their part-of-speech, tag, dependency relation, dependency head, and lemma. Each word within the lexicon was tokenized. A stop-words list was utilized only to remove punctuation and symbols from the text, excluding hyphenated words (e.g., bowl-shaped), which remained as such. The tokens' lemmas were then aggregated and totaled to find their recurrences within the lexicon. This procedure was repeated for tokenizing noun chunks using the same glossary definitions.
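The lemma-aggregation step can be sketched as follows. In the actual pipeline the (lemma, part-of-speech) pairs would come from spaCy's transformer pipeline; here they are stubbed by hand (the sample tokens are invented for illustration):

```python
from collections import Counter

# Hypothetical (lemma, POS) pairs standing in for spaCy token output.
tokens = [
    ("basin", "NOUN"), ("bowl-shaped", "ADJ"), ("depression", "NOUN"),
    ("basin", "NOUN"), (",", "PUNCT"), ("landform", "NOUN"),
]

# The stop-words list removed only punctuation and symbols; hyphenated words
# such as 'bowl-shaped' are kept intact.
kept = [lemma for lemma, pos in tokens if pos not in {"PUNCT", "SYM"}]

# Aggregate lemma recurrences within the lexicon.
lemma_counts = Counter(kept)
print(lemma_counts.most_common(2))
```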
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistics of datasets used in the experiments.
CC0 1.0 Public Domain: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Anshuman_Tiwari2005
Released under CC0: Public Domain
DRAKO is a leader in providing Device Graph Data, focusing on understanding the relationships between consumer devices and identities. Our data allows businesses to create holistic profiles of users, track engagement across platforms, and measure the effectiveness of advertising efforts.
Device Graph Data is essential for accurate audience targeting, cross-device attribution, and understanding consumer journeys. By integrating data from multiple sources, we provide a unified view of user interactions, helping businesses make informed decisions.
Key Features: - Comprehensive device mapping to understand user behaviour across multiple platforms - Detailed Identity Graph Data for cross-device identification and engagement tracking - Integration with Connected TV Data for enhanced insights into video consumption habits - Mobile Attribution Data to measure the effectiveness of mobile campaigns - Customizable analytics to segment audiences based on device usage and demographics - Some ID types offered: AAID, IDFA, Unified ID 2.0, AFAI, MSAI, RIDA, AAID_CTV, IDFA_CTV
Use Cases: - Cross-device marketing strategies - Attribution modelling and campaign performance measurement - Audience segmentation and targeting - Enhanced insights for Connected TV advertising - Comprehensive consumer journey mapping
Data Compliance: All of our Device Graph Data is sourced responsibly and adheres to industry standards for data privacy and protection. We ensure that user identities are handled with care, providing insights without compromising individual privacy.
Data Quality: DRAKO employs robust validation techniques to ensure the accuracy and reliability of our Device Graph Data. Our quality assurance processes include continuous monitoring and updates to maintain data integrity and relevance.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For K6-e and Km-e graphs, a coloring of type (K6-e,Km-e;n) is an edge coloring of the complete graph Kn that contains no K6-e subgraph in the first color (absence of an edge) and no Km-e subgraph in the second color (presence of an edge). Km-e denotes the complete graph Km with one edge removed. The Ramsey number R(K6-e,Km-e) is the smallest natural number n such that every edge coloring of the complete graph Kn contains a subgraph isomorphic to K6-e in the first color or isomorphic to Km-e in the second color. Colorings of type (K6-e,Km-e;n) exist for n<R(K6-e,Km-e).
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Brayan Alejandro Valencia lopez
Released under Apache 2.0
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SeaLiT Knowledge Graphs is an RDF dataset of maritime history data that has been transcribed (and then transformed) from original archival sources in the context of the SeaLiT Project (Seafaring Lives in Transition, Mediterranean Maritime Labour and Shipping, 1850s-1920s). The underlying data model is the SeaLiT Ontology, an extension of the ISO standard CIDOC-CRM (ISO 21127:2014) for the modelling and integration of maritime history information.
The knowledge graphs integrate data from a total of 16 different types of archival sources:
More information about the archival sources is available through the SeaLiT website. Data exploration applications over these sources are also publicly available (SeaLiT Catalogues, SeaLiT ResearchSpace).
Data from these archival sources has been transcribed in tabular form and then curated by historians of SeaLiT using the FAST CAT system. The transcripts (records), together with the curated vocabulary terms and entity instances (ships, persons, locations, organizations), are then transformed to RDF using the SeaLiT Ontology as the target (domain) model. To this end, the corresponding schema mappings between the original schemata and the ontology were defined using the X3ML mapping definition language; these mappings were subsequently used for delivering the RDF datasets.
More information about the FAST CAT system and the data transcription, curation and transformation processes can be found in the following paper:
P. Fafalios, K. Petrakis, G. Samaritakis, K. Doerr, A. Kritsotaki, Y. Tzitzikas, M. Doerr, "FAST CAT: Collaborative Data Entry and Curation for Semantic Interoperability in Digital Humanities", ACM Journal on Computing and Cultural Heritage, 2021. https://doi.org/10.1145/3461460
The RDF dataset is provided as a set of TriG files per record per archival source. For each record, the dataset provides: i) one TriG file for the record's data (records.trig), ii) one TriG file for the record's (curated) vocabulary terms (vocabularies.trig), and iii) four TriG files for the record's (curated) entity instances (ships.trig, persons.trig, locations.trig, organizations.trig).
We also provide the RDFS files of the used ontologies (SeaLiT Ontology version 1.0, CIDOC-CRM version 7.1.1).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One table and 11 figures. Table 1 shows XLORE2 statistics. Figure 1 shows the framework of XLORE2. Figure 2 is an example of cross-lingual knowledge linking. Figure 3 presents the framework of cross-lingual knowledge linking. Figure 4 is an example of cross-lingual property matching (attribute matching). Figure 5 shows the framework of cross-lingual property matching. Figure 6 presents an example of mistakenly derived facts. Figure 7 is the framework of cross-lingual knowledge validation. Figure 8 shows an example of fine-grained type inference. Figure 9 depicts the framework of fine-grained type inference. Figure 10 is an illustration of XLink. Figure 11 shows the interface of XLORE2 and XLink.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hungary - Distribution of population by household types: Single person was 13.80% in December of 2024, according to the EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Hungary - Distribution of population by household types: Single person - last updated from the EUROSTAT in November of 2025. Historically, Hungary - Distribution of population by household types: Single person reached a record high of 14.50% in December of 2017 and a record low of 9.20% in December of 2010.
This dataset was created by Thida Khim
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Currently, in the field of chart datasets, most existing resources are in English, and there are almost no open-source Chinese chart datasets, which limits research and applications related to Chinese charts. This dataset draws on the construction method of the DVQA dataset to create a chart dataset focused on the Chinese environment. To ensure the authenticity and practicality of the dataset, we first referred to the authoritative website of the National Bureau of Statistics and selected 24 data label categories widely used in practical applications, totaling 262 specific labels. These label categories cover multiple important areas such as socio-economic development, demographics, and industrial development. In addition, to further enhance the diversity and practicality of the dataset, we set 10 different numerical dimensions. These numerical dimensions not only provide a rich range of values but also include multiple types of values, which can simulate various data distributions and changes that may be encountered in real application scenarios. The dataset carefully covers various types of Chinese bar charts that may be encountered in practical applications. Specifically, it not only includes conventional vertical and horizontal bar charts but also introduces more challenging stacked bar charts to test the performance of methods on charts of different complexities. In addition, to further increase diversity and practicality, diverse attribute labels are set for each chart type. These attribute labels include, but are not limited to, whether the chart has data labels and whether the text is rotated 45° or 90°. These details make the dataset more realistic for real-world application scenarios, while also placing higher demands on data extraction methods.
In addition to the charts themselves, the dataset also provides corresponding data tables and title text for each chart, which is crucial for understanding the content of a chart and verifying the accuracy of extracted results. The dataset uses Matplotlib, the most popular and widely used data visualization library in the Python programming language, to generate the chart images required for research. Matplotlib has become the preferred tool for data scientists and researchers in data visualization tasks due to its rich features, flexible configuration options, and excellent compatibility. With Matplotlib, every detail of a chart can be precisely controlled, from the drawing of data points to the annotation of coordinate axes, and from the addition of legends to the setting of titles, ensuring that the generated chart images not only meet the research needs but are also highly readable and visually attractive. The dataset consists of 58712 pairs of Chinese bar charts and corresponding data tables, divided into training, validation, and testing sets in a 7:2:1 ratio.
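A 7:2:1 split of 58712 items does not divide evenly, so some rounding rule is needed. A minimal sketch of how such an index split is typically computed (the exact rounding, shuffling, and seed used by the dataset authors are assumptions; here the remainder goes to the training set):

```python
import random

def split_indices(n_items, ratios=(7, 2, 1), seed=0):
    """Shuffle item indices and split them according to the given ratios."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    total = sum(ratios)
    n_val = n_items * ratios[1] // total    # floor of the 2/10 share
    n_test = n_items * ratios[2] // total   # floor of the 1/10 share
    n_train = n_items - n_val - n_test      # remainder absorbed by training
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(58712)
print(len(train), len(val), len(test))  # 41099 11742 5871
```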
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Graphs are a fundamental data structure, capable of representing complex association relationships in diverse domains. For large-scale graph processing, stream graphs have become efficient tools for processing dynamically evolving graph data. When processing stream graphs, subgraph counting is a key technique, which faces significant computational challenges due to its #P-complete nature. This work introduces StreamSC, a novel framework that efficiently estimates subgraph counting results on stream graphs through two key innovations: (i) it is the first learning-based framework to address the subgraph counting problem on stream graphs; and (ii) it addresses the challenges arising from dynamic changes of the data graph caused by the insertion or deletion of edges. Experiments on 5 real-world graphs show the superiority of StreamSC in accuracy and efficiency.
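StreamSC's learning-based estimator is not reproduced here. As a minimal illustration of the streaming setting it targets, the following sketch (class and names are ours, not from the paper) maintains an exact triangle count under edge insertions and deletions:

```python
from collections import defaultdict

class TriangleStream:
    """Running triangle count over an edge stream with inserts and deletes."""
    def __init__(self):
        self.adj = defaultdict(set)
        self.triangles = 0

    def insert(self, u, v):
        # Each common neighbour of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        # Remove the edge first, then subtract the triangles it was part of.
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self.triangles -= len(self.adj[u] & self.adj[v])

s = TriangleStream()
for e in [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]:
    s.insert(*e)
count_after_inserts = s.triangles  # triangles {0,1,2} and {0,2,3}
s.delete(0, 2)                     # removing (0,2) destroys both
print(count_after_inserts, s.triangles)
```

Exact maintenance like this costs time proportional to vertex degrees per update; for larger patterns than triangles the cost grows quickly, which is the motivation for learned estimators such as StreamSC.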
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Business process event data modeled as labeled property graphs
Data Format
-----------
The dataset comprises one labeled property graph in two different file formats.
#1) Neo4j .dump format
A Neo4j (https://neo4j.com) database dump that contains the entire graph and can be imported into a fresh Neo4j database instance using the following command (see also the Neo4j documentation: https://neo4j.com/docs/):
/bin/neo4j-admin.(bat|sh) load --database=graph.db --from=
The .dump was created with Neo4j v3.5.
#2) .graphml format
A .zip file containing a .graphml file of the entire graph
Data Schema
-----------
The graph is a labeled property graph over business process event data. Each graph uses the following concepts:
:Event nodes - each event node describes a discrete event, i.e., an atomic observation described by attribute "Activity" that occurred at the given "timestamp"
:Entity nodes - each entity node describes an entity (e.g., an object or a user), it has an EntityType and an identifier (attribute "ID")
:Log nodes - describes a collection of events that were recorded together, most graphs only contain one log node
:Class nodes - each class node describes a type of observation that has been recorded, e.g., the different types of activities that can be observed, :Class nodes group events into sets of identical observations
:CORR relationships - from :Event to :Entity nodes, describes whether an event is correlated to a specific entity; an event can be correlated to multiple entities
:DF relationships - "directly-followed by" between two :Event nodes describes which event is directly-followed by which other event; both events in a :DF relationship must be correlated to the same entity node. All :DF relationships form a directed acyclic graph.
:HAS relationship - from a :Log to an :Event node, describes which events had been recorded in which event log
:OBSERVES relationship - from an :Event to a :Class node, describes to which event class an event belongs, i.e., which activity was observed in the graph
:REL relationship - placeholder for any structural relationship between two :Entity nodes
The concepts are further defined in Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases. CoRR abs/2005.14552 (2020). https://arxiv.org/abs/2005.14552
Data Contents
-------------
neo4j-bpic15-2021-02-17 (.dump|.graphml.zip)
An integrated graph describing the raw event data of the entire BPI Challenge 2015 dataset.
van Dongen, B.F. (Boudewijn) (2015): BPI Challenge 2015. 4TU.ResearchData. Collection. https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1
This data is provided by five Dutch municipalities. The data contains all building permit applications over a period of approximately four years. There are many different activities present, denoted by both codes (attribute concept:name) and labels, both in Dutch (attribute taskNameNL) and in English (attribute taskNameEN). The cases in the log contain information on the main application as well as objection procedures in various stages. Furthermore, information is available about the resource that carried out the task and on the cost of the application (attribute SUMleges). The processes in the five municipalities should be identical, but may differ slightly; especially when changes are made to procedures, rules or regulations, the time at which these changes are pushed into the five municipalities may differ. Of course, over the four-year period, the underlying processes have changed. The municipalities have a number of questions, namely:
- What are the roles of the people involved in the various stages of the process, and how do these roles differ across municipalities?
- What are the possible points for improvement on the organizational structure for each of the municipalities?
- The employees of two of the five municipalities have physically moved into the same location recently. Did this lead to a change in the processes and, if so, what is different?
- Some of the procedures will be outsourced from 2018, i.e. they will be removed from the process and the applicant needs to have these activities performed by an external party before submitting the application. What will be the effect of this on the organizational structures in the five municipalities?
- Where are differences in throughput times between the municipalities, and how can these be explained?
- What are the differences in control flow between the municipalities?
There are five different log files available in this collection. Events are labeled with both a code and a Dutch and English label.
Each activity code consists of three parts: two digits, a variable number of characters, and then three digits. The first two digits as well as the characters indicate the subprocess the activity belongs to. For instance '01_HOOFD_xxx' indicates the main process and '01_BB_xxx' indicates the 'objections and complaints' ('Beroep en Bezwaar' in Dutch) subprocess. The last three digits hint at the order in which activities are executed, where the first digit often indicates a phase within a process. Each trace and each event contain several data attributes that can be used for various checks and predictions. Furthermore, some employees may have performed tasks for different municipalities, i.e. if the employee number is the same, it is safe to assume the same person is being identified.
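The code structure above lends itself to a simple parser. The following sketch (the concrete code '01_HOOFD_010' is a made-up example matching the documented pattern) splits an activity code into its subprocess and ordering parts:

```python
import re

# <2 digits>_<characters>_<3 digits>, per the documented activity-code format.
ACTIVITY_CODE = re.compile(r"^(\d{2})_([A-Za-z]+)_(\d{3})$")

def parse_activity_code(code):
    """Split an activity code into subprocess, phase, and order components."""
    m = ACTIVITY_CODE.match(code)
    if m is None:
        raise ValueError(f"not a valid activity code: {code!r}")
    prefix, subprocess, order = m.groups()
    return {
        "subprocess": f"{prefix}_{subprocess}",  # e.g. '01_HOOFD' = main process
        "phase": order[0],                       # first digit often marks a phase
        "order": order,
    }

print(parse_activity_code("01_HOOFD_010"))
```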
The data contains the following entities and their events
- Application - a building permit application handled in one of five Dutch municipalities
- Case_R - a user or worker involved in handling the application
- Responsible_actor - a user or worker designated as responsible actor for an activity
- monitoringResource - a user or worker designated as monitoring resource for an activity
The data contains 5 event log nodes as the data was integrated from 5 different event logs from 5 different systems.
Data Size
---------
BPIC15, nodes: 268851, relationships: 2620418
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sweden - Distribution of population by household types: Single person was 22.20% in December of 2024, according to the EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Sweden - Distribution of population by household types: Single person - last updated from the EUROSTAT in December of 2025. Historically, Sweden - Distribution of population by household types: Single person reached a record high of 24.10% in December of 2023 and a record low of 19.80% in December of 2012.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CLARA. This deposit is part of the CLARA project, which aims to empower teachers in the task of creating new educational resources, and in particular in handling the licenses of reused educational resources. The present deposit contains the RDF files created using an RDF mapping (RML) and a mapper (Morph-KGC). It also contains the JSON files used as input. The corresponding pipeline can be found on GitLab. The data used in that pipeline originate from X5GON, a European project aiming to generate and gather open educational resources.
Knowledge graph content
The present knowledge graph contains information about 45K educational resources (ERs) and 135K subjects (extracted from DBpedia). That information contains:
the author, its title and description, the license, a URL to the resource itself, the language of the ER, its mimetype, and finally which subjects it talks about and to what extent. That extent is given by two scores: a PageRank score and a cosine score. A particularity of the knowledge graph is its heavy use of RDF reification across large multi-valued properties; thus four versions of the knowledge graph exist, using standard reification, singleton property, named graphs, and RDF-star. The knowledge graph also contains categories originating from DBpedia; they help make precise the subjects that are also extracted from DBpedia. The KG.zip files contain five types of files:
- Authors_[X].nt - the authors' nodes, their type, and name.
- ER_[X].nt/nq/ttl - the ERs and their information using the respective RDF reification model.
- categories_skos_[X].ttl - the hierarchy of DBpedia categories.
- categories_labels.ttl - additional information about the categories.
- categories_article.ttl - the RDF triples that link the DBpedia subjects to the DBpedia categories.
JSON content
The original dataset was cut into multiple JSON files in order to make its processing easier. DBpedia categories were extracted as RDF and are not present in the JSON files. There are two types of files in the input-json.zip file:
- authors_[X].json - lists the authors' names.
- ER_[X].json - lists the ERs and their related information: their title, their description, their language (and language_detected; only the first one is used in the pipeline here), their license, their mimetype, the authors, the date of creation of the resource, a URL linking to the resource itself, and the subjects (named concepts) associated with the resource, with the corresponding scores.
If you do use this dataset, you can cite this repository:
Kieffer, M., Fakih, G., & Serrano Alvarado, P. (2023). CLARA Knowledge Graph of licensed educational resources [Data set]. Semantics, Leipzig, Germany. Zenodo. https://doi.org/10.5281/zenodo.8403142
Or the corresponding paper:
Kieffer, M., Fakih, G. & Serrano-Alvarado, P. (2023). Evaluating Reification with Multi-valued Properties in a Knowledge Graph of Licensed Educational Resources. Semantics, Leipzig, Germany.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
November 2020: Please check out the newer version of the OpenAIRE Research Graph dump available at https://doi.org/10.5281/zenodo.4201546. The newer version contains JSON files that are more compact and easier to process. Learn more about the OpenAIRE Research Graph at https://graph.openaire.eu.
The OpenAIRE Research Graph is exported as several dumps, so you can download the parts you are interested in.
Please go to http://develop.openaire.eu/graph-dumps.html for instructions on how to consume the dumps.
Libraries: this blog describes the openairegraph libraries, which can be used to perform analytics on this dataset.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
oag-cs, oag-eng, and oag-chem are new heterogeneous networks composed of subsets of the Open Academic Graph (OAG). Each of the datasets contains papers from three different subject domains: computer science, engineering, and chemistry. These datasets also contain four types of entities: papers, authors, institutions, and fields of study. Each paper is associated with a 768-dimensional feature vector generated from a pre-trained XLNet applied to the paper titles. The representation of each word in the title is weighted by each word's attention to obtain the title representation for each paper. Each paper node is labeled with its publication venue (journal or conference). We split the papers published up to 2016 as the training set, papers published in 2017 as the validation set, and papers published in 2018 and 2019 as the test set. The publication year of each paper is also included in these datasets, meaning those datasets can also be converted to use the publication year as class labels.
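The temporal split described above can be sketched as follows; the (paper_id, year) pairs are hypothetical stand-ins for OAG records:

```python
# Papers up to 2016 -> train, 2017 -> validation, 2018-2019 -> test.
papers = [("p1", 2014), ("p2", 2016), ("p3", 2017), ("p4", 2018), ("p5", 2019)]

train = [p for p, y in papers if y <= 2016]
val   = [p for p, y in papers if y == 2017]
test  = [p for p, y in papers if y >= 2018]
print(train, val, test)
```

Splitting by publication year rather than at random avoids temporal leakage: the model never sees papers from the future of its training data.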
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: As evaluation indices, cancer grading and subtyping have diverse clinical, pathological, and molecular characteristics with prognostic and therapeutic implications. Although researchers have begun to study cancer differentiation and subtype prediction, most relevant methods are based on traditional machine learning and rely on single-omics data. It is necessary to explore a deep learning algorithm that integrates multi-omics data to achieve classification prediction of cancer differentiation and subtypes.
Methods: This paper proposes a multi-omics data fusion algorithm based on a multi-view graph neural network (MVGNN) for predicting cancer differentiation and subtype classification. The model framework consists of a graph convolutional network (GCN) module for learning features from different omics data and an attention module for integrating multi-omics data. Three different types of omics data are used. For each type of omics data, feature selection is performed using methods such as the chi-square test and minimum redundancy maximum relevance (mRMR). Weighted patient similarity networks are constructed based on the selected omics features, and a GCN is trained using the omics features and corresponding similarity networks. Finally, an attention module integrates the different types of omics features and performs the final cancer classification prediction.
Results: To validate the cancer classification performance of the MVGNN model, we conducted experimental comparisons with traditional machine learning models and currently popular multi-omics integration methods using 5-fold cross-validation. Additionally, we performed comparative experiments on cancer differentiation and its subtypes based on single-omics, two-omics, and three-omics data.
Discussion: This paper proposed the MVGNN model, which performed well in cancer classification prediction based on multiple omics data.
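This is not the MVGNN implementation, but the attention-based fusion step can be illustrated in miniature: assume each omics view contributes a per-view GCN embedding plus a scalar relevance score that is softmax-normalised into fusion weights (both the scoring scheme and the numbers below are illustrative assumptions):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_views(view_features, view_scores):
    """Weighted sum of per-view feature vectors using attention weights."""
    weights = softmax(view_scores)
    dim = len(view_features[0])
    return [sum(w * f[i] for w, f in zip(weights, view_features))
            for i in range(dim)]

# Three omics views (e.g. mRNA, methylation, miRNA) with 2-d embeddings.
views = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused = fuse_views(views, view_scores=[2.0, 1.0, 0.5])
print(fused)
```

In the actual model the attention scores would themselves be learned from the view embeddings rather than supplied by hand.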
https://www.verifiedmarketresearch.com/privacy-policy/
The NoSQL Database Market was valued at USD 6.47 Billion in 2024 and is expected to reach USD 44.66 Billion by 2032, growing at a CAGR of 30.14% from 2026 to 2032.
Global NoSQL Database Market drivers:
- Exponential growth of Big Data and IoT: the explosion of Big Data and Internet of Things (IoT) applications is a primary catalyst for NoSQL adoption, requiring database solutions that, unlike rigid relational systems, can ingest and process colossal volumes of unstructured and semi-structured data from diverse sources like sensors, social media, and web logs.
- Increasing demand for real-time web and mobile applications: the surging demand for real-time web and mobile applications is significantly fueling the NoSQL market, as these modern applications require sub-millisecond latency and exceptionally high throughput to deliver a seamless user experience. NoSQL database types, particularly key-value stores and document databases, are architecturally optimized for rapid read/write operations and horizontal scaling.