100+ datasets found
  1. Dataset of non-isomorphic graphs of the coloring types (K4,Km-e;n), 2<m<5,...

    • mostwiedzy.pl
    zip
    Updated Dec 17, 2020
    Cite
    Robert Fidytek (2020). Dataset of non-isomorphic graphs of the coloring types (K4,Km-e;n), 2
    Available download formats: zip (3836)
    Dataset updated
    Dec 17, 2020
    Authors
    Robert Fidytek
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For the graphs K4 and Km-e, a coloring type (K4,Km-e;n) is an edge coloring of the complete graph Kn that contains neither a K4 subgraph in the first color (represented by non-edges of the graph) nor a Km-e subgraph in the second color (represented by edges of the graph). Km-e denotes the complete graph Km with one edge removed. The Ramsey number R(K4,Km-e) is the smallest natural number n such that every edge coloring of the complete graph Kn contains a subgraph isomorphic to K4 in the first color (non-edges of the graph) or a subgraph isomorphic to Km-e in the second color (edges of the graph). Coloring types (K4,Km-e;n) exist for n<R(K4,Km-e).

    The dataset consists of:
    a) 5 files containing all non-isomorphic graphs that are coloring types (K4,K3-e;n) for 1<n<7,
    b) 9 files containing all non-isomorphic graphs that are coloring types (K4,K4-e;n) for 1<n<11.
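
    The definition above can be illustrated with a small brute-force check. This is only a sketch: the function names and the edge-list input are assumptions made for the example, and the file format used in the dataset is not described here.

```python
from itertools import combinations


def build_adjacency(n, edges):
    """Adjacency sets for an undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def contains_km_minus(adj, n, m, missing):
    """True if some m vertices span at least C(m,2) - missing edges,
    i.e. the graph contains Km with `missing` edges removed as a subgraph."""
    needed = m * (m - 1) // 2 - missing
    for subset in combinations(range(n), m):
        present = sum(1 for u, v in combinations(subset, 2) if v in adj[u])
        if present >= needed:
            return True
    return False


def is_coloring_type(n, edges, m):
    """Check the (K4,Km-e;n) condition: no K4 among the non-edges
    (first color) and no Km-e among the edges (second color)."""
    adj = build_adjacency(n, edges)
    complement = {v: set(range(n)) - adj[v] - {v} for v in range(n)}
    no_k4_first = not contains_km_minus(complement, n, 4, 0)
    no_km_e_second = not contains_km_minus(adj, n, m, 1)
    return no_k4_first and no_km_e_second


# Hypothetical example: a perfect matching on 6 vertices, then the same
# edges with a seventh, isolated vertex added.
matching = [(0, 1), (2, 3), (4, 5)]
print(is_coloring_type(6, matching, 3))  # True
print(is_coloring_type(7, matching, 3))  # False
```

    The matching passes for n=6 and fails for n=7, which agrees with the range 1<n<7 stated for (K4,K3-e;n). The check enumerates all vertex subsets, so it is only practical for small graphs.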

  2. Data from: Grammar transformations of topographic feature type annotations...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 29, 2025
    Cite
    U.S. Geological Survey (2025). Grammar transformations of topographic feature type annotations of the U.S. to structured graph data. [Dataset]. https://catalog.data.gov/dataset/grammar-transformations-of-topographic-feature-type-annotations-of-the-u-s-to-structured-g
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    United States Geological Survey, http://www.usgs.gov/
    Area covered
    United States
    Description

    These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. The objectives of our study were to analyze the semantic structure of the input definitions, use this information to build triple structures of RDF graph data, upload our lexicon to knowledge graph software, and perform SPARQL queries on the data. Upon completion of this study, SPARQL queries were shown to effectively convey graph triples that displayed semantic significance. These data represent and characterize the lexicon of our input text which are used to form graph triples. The data were collected in 2024 by passing text through multiple Python programs utilizing spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before the data were processed by the Python programs, the input definitions were first rewritten as natural language and formatted as tabular data. Passages were then tokenized and characterized by their part-of-speech, tag, dependency relation, dependency head, and lemma. Each word within the lexicon was tokenized. A stop-words list was utilized only to remove punctuation and symbols from the text, excluding hyphenated words (e.g., bowl-shaped), which remained as such. The tokens’ lemmas were then aggregated and totaled to find their recurrences within the lexicon. This procedure was repeated for tokenizing noun chunks using the same glossary definitions.
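
    The aggregation step described above (drop punctuation and symbols, keep hyphenated words, total the lemmas) might look roughly like the following. The token records are hypothetical stand-ins for the output of an NLP pipeline, not rows from the dataset, and the authors' own programs may differ.

```python
from collections import Counter

# Hypothetical token records (text, part-of-speech, lemma); illustrative only.
tokens = [
    ("A", "DET", "a"),
    ("bowl-shaped", "ADJ", "bowl-shaped"),
    ("depressions", "NOUN", "depression"),
    (",", "PUNCT", ","),
    ("depression", "NOUN", "depression"),
    ("§", "SYM", "§"),
]

# Assumption: punctuation and symbols are identified by their POS label.
EXCLUDED_POS = {"PUNCT", "SYM"}


def count_lemmas(records):
    """Total lemma occurrences, skipping punctuation and symbol tokens."""
    return Counter(
        lemma for _text, pos, lemma in records if pos not in EXCLUDED_POS
    )


lemma_counts = count_lemmas(tokens)
print(lemma_counts.most_common())
```

    The hyphenated token is counted as a single lemma because filtering is done per token, not per character.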

  3. Statistics of datasets used in the experiments.

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Faezeh Faez; Negin Hashemi Dijujin; Mahdieh Soleymani Baghshah; Hamid R. Rabiee (2023). Statistics of datasets used in the experiments. [Dataset]. http://doi.org/10.1371/journal.pone.0277887.t002
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Faezeh Faez; Negin Hashemi Dijujin; Mahdieh Soleymani Baghshah; Hamid R. Rabiee
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics of datasets used in the experiments.

  4. Purchase Dataset

    • kaggle.com
    zip
    Updated May 9, 2025
    Cite
    Anshuman_Tiwari2005 (2025). Purchase Dataset [Dataset]. https://www.kaggle.com/datasets/anshumantiwari2005/purchase-dataset
    Available download formats: zip (2715 bytes)
    Dataset updated
    May 9, 2025
    Authors
    Anshuman_Tiwari2005
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Anshuman_Tiwari2005

    Released under CC0: Public Domain

  5. Device Graph Data | 10+ Identity Types | 1500M+ Global Devices| CCPA...

    • datarade.ai
    Updated Aug 21, 2024
    Cite
    DRAKO (2024). Device Graph Data | 10+ Identity Types | 1500M+ Global Devices| CCPA Compliant [Dataset]. https://datarade.ai/data-products/drako-device-graph-data-usa-canada-comprehensive-insi-drako
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Aug 21, 2024
    Dataset authored and provided by
    DRAKO
    Area covered
    Mozambique, Philippines, Cyprus, Bahamas, South Sudan, Eritrea, Brazil, Aruba, Lao People's Democratic Republic, Tonga
    Description

    DRAKO is a leader in providing Device Graph Data, focusing on understanding the relationships between consumer devices and identities. Our data allows businesses to create holistic profiles of users, track engagement across platforms, and measure the effectiveness of advertising efforts.

    Device Graph Data is essential for accurate audience targeting, cross-device attribution, and understanding consumer journeys. By integrating data from multiple sources, we provide a unified view of user interactions, helping businesses make informed decisions.

    Key Features:
    • Comprehensive device mapping to understand user behaviour across multiple platforms
    • Detailed Identity Graph Data for cross-device identification and engagement tracking
    • Integration with Connected TV Data for enhanced insights into video consumption habits
    • Mobile Attribution Data to measure the effectiveness of mobile campaigns
    • Customizable analytics to segment audiences based on device usage and demographics
    • Some ID types offered: AAID, idfa, Unified ID 2.0, AFAI, MSAI, RIDA, AAID_CTV, IDFA_CTV

    Use Cases:
    • Cross-device marketing strategies
    • Attribution modelling and campaign performance measurement
    • Audience segmentation and targeting
    • Enhanced insights for Connected TV advertising
    • Comprehensive consumer journey mapping

    Data Compliance: All of our Device Graph Data is sourced responsibly and adheres to industry standards for data privacy and protection. We ensure that user identities are handled with care, providing insights without compromising individual privacy.

    Data Quality: DRAKO employs robust validation techniques to ensure the accuracy and reliability of our Device Graph Data. Our quality assurance processes include continuous monitoring and updates to maintain data integrity and relevance.

  6. Dataset of non-isomorphic graphs being coloring types (K6-e,Km-e;n), 2<m<5,...

    • mostwiedzy.pl
    zip
    Updated Mar 17, 2021
    Cite
    Robert Fidytek (2021). Dataset of non-isomorphic graphs being coloring types (K6-e,Km-e;n), 2
    Available download formats: zip (29865232)
    Dataset updated
    Mar 17, 2021
    Authors
    Robert Fidytek
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For the graphs K6-e and Km-e, a coloring type (K6-e,Km-e;n) is an edge coloring of the complete graph Kn that contains neither a K6-e subgraph in the first color (non-edges of the graph) nor a Km-e subgraph in the second color (edges of the graph). Km-e denotes the complete graph Km with one edge removed. The Ramsey number R(K6-e,Km-e) is the smallest natural number n such that every edge coloring of the complete graph Kn contains a subgraph isomorphic to K6-e in the first color (non-edges of the graph) or a subgraph isomorphic to Km-e in the second color (edges of the graph). Coloring types (K6-e,Km-e;n) exist for n<R(K6-e,Km-e).

  7. zapatillas data

    • kaggle.com
    zip
    Updated Jul 13, 2025
    Cite
    Brayan Alejandro Valencia lopez (2025). zapatillas data [Dataset]. https://www.kaggle.com/datasets/alejandro2025/zapatillas-data
    Available download formats: zip (6938 bytes)
    Dataset updated
    Jul 13, 2025
    Authors
    Brayan Alejandro Valencia lopez
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Brayan Alejandro Valencia lopez

    Released under Apache 2.0

  8. SeaLiT Knowledge Graphs - Maritime History Data in RDF using a CIDOC-CRM...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jul 4, 2022
    Cite
    Athina Kritsotaki; Yannis Marketakis; Pavlos Fafalios (2022). SeaLiT Knowledge Graphs - Maritime History Data in RDF using a CIDOC-CRM extension (SeaLiT Ontology) [Dataset]. http://doi.org/10.5281/zenodo.6460841
    Available download formats: zip
    Dataset updated
    Jul 4, 2022
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Athina Kritsotaki; Yannis Marketakis; Pavlos Fafalios
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SeaLiT Knowledge Graphs is an RDF dataset of maritime history data that has been transcribed (and then transformed) from original archival sources in the context of the SeaLiT Project (Seafaring Lives in Transition, Mediterranean Maritime Labour and Shipping, 1850s-1920s). The underlying data model is the SeaLiT Ontology, an extension of the ISO standard CIDOC-CRM (ISO 21127:2014) for the modelling and integration of maritime history information.

    The knowledge graphs integrate data from a total of 16 different types of archival sources:

    • Crew Lists
      • Crew and displacement list (Roll)
      • Crew List (Ruoli di Equipaggio)
      • General Spanish Crew List
    • Registers / Lists
      • Students Register
      • Civil Register
      • Register of Maritime Personnel
      • Register of Maritime Workers (Matricole della gente di mare)
      • Sailors Register (Libro de registro de marineros)
      • Naval Ship Register List
      • Seagoing Personnel
      • Lists of ships
    • Censuses
      • Census La Ciotat
      • First National all-Russian Census of the Russian Empire
    • Payrolls
      • Payrolls of private archives and libraries in Greece
      • Payrolls of Russian Steam Navigation and Trading Company
    • Employment records
      • Shipyards of Messageries Maritimes, La Ciotat

    More information about the archival sources is available through the SeaLiT website. Data exploration applications over these sources are also publicly available (SeaLiT Catalogues, SeaLiT ResearchSpace).

    Data from these archival sources has been transcribed in tabular form and then curated by historians of SeaLiT using the FAST CAT system. The transcripts (records), together with the curated vocabulary terms and entity instances (ships, persons, locations, organizations), are then transformed to RDF using the SeaLiT Ontology as the target (domain) model. To this end, the schema mappings between the original schemata and the ontology were defined using the X3ML mapping definition language and were subsequently used to produce the RDF datasets.

    More information about the FAST CAT system and the data transcription, curation and transformation processes can be found in the following paper:

    P. Fafalios, K. Petrakis, G. Samaritakis, K. Doerr, A. Kritsotaki, Y. Tzitzikas, M. Doerr, "FAST CAT: Collaborative Data Entry and Curation for Semantic Interoperability in Digital Humanities", ACM Journal on Computing and Cultural Heritage, 2021. https://doi.org/10.1145/3461460 [pdf, bib]

    The RDF dataset is provided as a set of TriG files per record per archival source. For each record, the dataset provides: i) one TriG file for the record's data (records.trig), ii) one TriG file for the record's (curated) vocabulary terms (vocabularies.trig), and iii) four TriG files for the record's (curated) entity instances (ships.trig, persons.trig, locations.trig, organizations.trig).

    We also provide the RDFS files of the ontologies used (SeaLiT Ontology version 1.0, CIDOC-CRM version 7.1.1).

  9. Data from: XLORE2: Large-Scale Cross-Lingual Knowledge Graph Construction...

    • scidb.cn
    Updated Oct 15, 2020
    Cite
    Hailong Jin; Chengjiang Li; Jing Zhang; Lei Hou; Juanzi Li; Peng Zhang (2020). XLORE2: Large-Scale Cross-Lingual Knowledge Graph Construction and Application [Dataset]. http://doi.org/10.11922/sciencedb.j00104.00022
    Dataset updated
    Oct 15, 2020
    Dataset provided by
    Science Data Bank
    Authors
    Hailong Jin; Chengjiang Li; Jing Zhang; Lei Hou; Juanzi Li; Peng Zhang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    One table and 11 figures. Table 1 shows XLORE2 statistics. Figure 1 shows the framework of XLORE2. Figure 2 is an example of cross-lingual knowledge linking. Figure 3 presents the framework of cross-lingual knowledge linking. Figure 4 is an example of cross-lingual property matching (attribute matching). Figure 5 shows the framework of cross-lingual property matching. Figure 6 presents an example of mistakenly derived facts. Figure 7 is the framework of cross-lingual knowledge validation. Figure 8 shows an example of fine-grained type inference. Figure 9 depicts the framework of fine-grained type inference. Figure 10 is an illustration of XLink. Figure 11 shows the interface of XLORE2 and XLink.

  10. Hungary - Distribution of population by household types: Single person

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Sep 15, 2020
    Cite
    TRADING ECONOMICS (2020). Hungary - Distribution of population by household types: Single person [Dataset]. https://tradingeconomics.com/hungary/distribution-of-population-by-household-types-single-person-eurostat-data.html
    Available download formats: xml, json, excel, csv
    Dataset updated
    Sep 15, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Hungary
    Description

    Hungary - Distribution of population by household types: Single person was 13.80% in December of 2024, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Hungary - Distribution of population by household types: Single person, last updated from EUROSTAT in November of 2025. Historically, Hungary - Distribution of population by household types: Single person reached a record high of 14.50% in December of 2017 and a record low of 9.20% in December of 2010.

  11. GCN_model

    • kaggle.com
    zip
    Updated Jun 13, 2025
    Cite
    Thida Khim (2025). GCN_model [Dataset]. https://www.kaggle.com/datasets/thidakhim/gcn-model
    Available download formats: zip (78857239 bytes)
    Dataset updated
    Jun 13, 2025
    Authors
    Thida Khim
    Description

    Dataset

    This dataset was created by Thida Khim

  12. CBCD:A Chinese Bar Chart Dataset for Data Extraction

    • scidb.cn
    Updated Nov 14, 2025
    Cite
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan (2025). CBCD:A Chinese Bar Chart Dataset for Data Extraction [Dataset]. http://doi.org/10.57760/sciencedb.j00240.00052
    Dataset updated
    Nov 14, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Most existing chart datasets are in English, and there are almost no open-source Chinese chart datasets, which limits research and applications involving Chinese charts. This dataset draws on the construction method of the DVQA dataset to create a chart dataset focused on the Chinese environment. To ensure the authenticity and practicality of the dataset, we first referred to the authoritative website of the National Bureau of Statistics and selected 24 data label categories that are widely used in practical applications, totaling 262 specific labels. These label categories cover multiple important areas such as socio-economic, demographic, and industrial development. To further enhance the diversity and practicality of the dataset, 10 different numerical dimensions are set. These numerical dimensions not only provide a rich range of values, but also include multiple types of values, which can simulate the data distributions and changes that may be encountered in real application scenarios. The dataset includes several types of Chinese bar charts designed to cover the situations that may be encountered in practical applications. Specifically, it includes not only conventional vertical and horizontal bar charts, but also more challenging stacked bar charts, to test the performance of a method on charts of different complexities. In addition, diverse attribute labels are set for each chart type. These attribute labels include, but are not limited to, whether the chart has data labels and whether the text is rotated by 45°, 90°, etc. These details bring the dataset closer to real-world application scenarios, while also placing higher demands on data extraction methods.

    In addition to the charts themselves, the dataset provides the corresponding data table and title text for each chart, which is crucial for understanding the content of the chart and verifying the accuracy of the extracted results. The chart images were generated with Matplotlib, a widely used data visualization library for the Python programming language, chosen for its rich features, flexible configuration options, and compatibility. With Matplotlib, every detail of a chart can be precisely controlled, from the drawing of data points to the annotation of coordinate axes, and from the addition of legends to the setting of titles, so that the generated chart images meet the research needs and are readable. The dataset consists of 58712 pairs of Chinese bar charts and corresponding data tables, divided into training, validation, and testing sets in a 7:2:1 ratio.
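
    As a rough sketch of what a 7:2:1 division of 58712 chart and table pairs could look like, the snippet below shuffles index positions and cuts them by ratio. The seed, the rounding rule (floor, with the remainder going to the test set), and the function name are assumptions; the authors' actual procedure is not specified.

```python
import random


def split_indices(total, ratios=(7, 2, 1), seed=0):
    """Shuffle 0..total-1 and cut into train/validation/test by ratio."""
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    weight = sum(ratios)
    n_train = total * ratios[0] // weight  # floor of the training share
    n_val = total * ratios[1] // weight    # floor of the validation share
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]       # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(58712)
print(len(train), len(val), len(test))  # 41098 11742 5872
```

    Because 58712 is not divisible by 10, the three parts can only approximate the stated ratio.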

  13. Performance comparison by dataset and node size.

    • figshare.com
    xls
    Updated Oct 23, 2025
    Cite
    Zhen Xie; Wenzhe Hou; Feiyang Wu; Hao Xu (2025). Performance comparison by dataset and node size. [Dataset]. http://doi.org/10.1371/journal.pone.0334724.t002
    Available download formats: xls
    Dataset updated
    Oct 23, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Zhen Xie; Wenzhe Hou; Feiyang Wu; Hao Xu
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Graphs are a fundamental data structure capable of representing complex association relationships in diverse domains. For large-scale graph processing, stream graphs have become an efficient tool for processing dynamically evolving graph data. When processing stream graphs, subgraph counting is a key technique, but it faces significant computational challenges because the problem is #P-complete. This work introduces StreamSC, a novel framework that efficiently estimates subgraph counts on stream graphs through two key innovations: (i) it is the first learning-based framework to address the subgraph counting problem on stream graphs; and (ii) it addresses the challenges arising from dynamic changes of the data graph caused by the insertion or deletion of edges. Experiments on 5 real-world graphs show the superiority of StreamSC in accuracy and efficiency.
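
    To make the problem setting concrete, the sketch below maintains an exact triangle count while edges are inserted and deleted. It is not StreamSC and involves no learning; it is a simple exact baseline for the smallest non-trivial pattern, and the class and method names are chosen for this example.

```python
class TriangleCounter:
    """Exact triangle count for an undirected graph under edge updates."""

    def __init__(self):
        self.adj = {}       # vertex -> set of neighbours
        self.triangles = 0

    def insert(self, u, v):
        """Add edge (u, v); ignore self-loops and duplicate edges."""
        if u == v or v in self.adj.get(u, set()):
            return
        nu = self.adj.setdefault(u, set())
        nv = self.adj.setdefault(v, set())
        # Every common neighbour closes one new triangle.
        self.triangles += len(nu & nv)
        nu.add(v)
        nv.add(u)

    def delete(self, u, v):
        """Remove edge (u, v) if present."""
        if v not in self.adj.get(u, set()):
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbour loses one triangle.
        self.triangles -= len(self.adj[u] & self.adj[v])


counter = TriangleCounter()
for edge in [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4)]:
    counter.insert(*edge)
print(counter.triangles)  # 2
counter.delete(1, 3)
print(counter.triangles)  # 0
```

    Each update costs one neighbourhood intersection, which is cheap for triangles but does not extend efficiently to larger patterns; that gap is what estimation approaches target.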

  14. Event Graph of BPI Challenge 2015

    • data.4tu.nl
    zip
    Cite
    Dirk Fahland; Stefan Esser, Event Graph of BPI Challenge 2015 [Dataset]. http://doi.org/10.4121/14169569.v1
    Available download formats: zip
    Dataset provided by
    4TU.ResearchData
    Authors
    Dirk Fahland; Stefan Esser
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Business process event data modeled as labeled property graphs

    Data Format
    -----------

    The dataset comprises one labeled property graph in two different file formats.

    #1) Neo4j .dump format

    A neo4j (https://neo4j.com) database dump that contains the entire graph and can be imported into a fresh neo4j database instance using the following command, see also the neo4j documentation: https://neo4j.com/docs/

    /bin/neo4j-admin.(bat|sh) load --database=graph.db --from=

    The .dump was created with Neo4j v3.5.

    #2) .graphml format

    A .zip file containing a .graphml file of the entire graph


    Data Schema
    -----------

    The graph is a labeled property graph over business process event data. Each graph uses the following concepts

    :Event nodes - each event node describes a discrete event, i.e., an atomic observation described by attribute "Activity" that occurred at the given "timestamp"

    :Entity nodes - each entity node describes an entity (e.g., an object or a user), it has an EntityType and an identifier (attribute "ID")

    :Log nodes - describes a collection of events that were recorded together, most graphs only contain one log node

    :Class nodes - each class node describes a type of observation that has been recorded, e.g., the different types of activities that can be observed, :Class nodes group events into sets of identical observations

    :CORR relationships - from :Event to :Entity nodes, describes whether an event is correlated to a specific entity; an event can be correlated to multiple entities

    :DF relationships - "directly-followed by" between two :Event nodes describes which event is directly-followed by which other event; both events in a :DF relationship must be correlated to the same entity node. All :DF relationships form a directed acyclic graph.

    :HAS relationship - from a :Log to an :Event node, describes which events had been recorded in which event log

    :OBSERVES relationship - from an :Event to a :Class node, describes to which event class an event belongs, i.e., which activity was observed in the graph

    :REL relationship - placeholder for any structural relationship between two :Entity nodes

    The concepts are further defined in Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases. CoRR abs/2005.14552 (2020) https://arxiv.org/abs/2005.14552
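
    The :DF concept in the schema above can be sketched in a few lines: for each entity, order its correlated events by timestamp and link consecutive ones. The event tuples below are hypothetical, distinct timestamps per entity are assumed, and the authors' own construction may differ in detail.

```python
from collections import defaultdict

# Hypothetical events: (event id, timestamp, correlated entity ids).
events = [
    ("e1", 1, ["app1"]),
    ("e2", 2, ["app1", "user7"]),
    ("e3", 3, ["user7"]),
    ("e4", 4, ["app1"]),
]


def derive_df(events):
    """Return (source event, target event, entity) directly-follows triples."""
    per_entity = defaultdict(list)
    for event_id, timestamp, entities in events:
        for entity in entities:  # mirrors the :CORR relationships
            per_entity[entity].append((timestamp, event_id))

    df = []
    for entity, observed in per_entity.items():
        observed.sort()  # order the entity's events by timestamp
        for (_, src), (_, dst) in zip(observed, observed[1:]):
            df.append((src, dst, entity))
    return df


for triple in derive_df(events):
    print(triple)
```

    Since every derived edge points from an earlier to a later timestamp, the resulting relation cannot contain a cycle, in line with the acyclicity noted for :DF.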


    Data Contents
    -------------

    neo4j-bpic15-2021-02-17 (.dump|.graphml.zip)

    An integrated graph describing the raw event data of the entire BPI Challenge 2015 dataset.
    van Dongen, B.F. (Boudewijn) (2015): BPI Challenge 2015. 4TU.ResearchData. Collection. https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1

    This data is provided by five Dutch municipalities. The data contains all building permit applications over a period of approximately four years. There are many different activities present, denoted by both codes (attribute concept:name) and labels, both in Dutch (attribute taskNameNL) and in English (attribute taskNameEN). The cases in the log contain information on the main application as well as objection procedures in various stages. Furthermore, information is available about the resource that carried out the task and on the cost of the application (attribute SUMleges). The processes in the five municipalities should be identical, but may differ slightly. Especially when changes are made to procedures, rules or regulations the time at which these changes are pushed into the five municipalities may differ. Of course, over the four year period, the underlying processes have changed. The municipalities have a number of questions, namely: What are the roles of the people involved in the various stages of the process and how do these roles differ across municipalities? What are the possible points for improvement on the organizational structure for each of the municipalities? The employees of two of the five municipalities have physically moved into the same location recently. Did this lead to a change in the processes and if so, what is different? Some of the procedures will be outsourced from 2018, i.e. they will be removed from the process and the applicant needs to have these activities performed by an external party before submitting the application. What will be the effect of this on the organizational structures in the five municipalities? Where are differences in throughput times between the municipalities and how can these be explained? What are the differences in control flow between the municipalities? There are five different log files available in this collection. Events are labeled with both a code and a Dutch and English label. 
    Each activity code consists of three parts: two digits, a variable number of characters, and then three digits. The first two digits as well as the characters indicate the subprocess the activity belongs to. For instance, ‘01_HOOFD_xxx’ indicates the main process and ‘01_BB_xxx’ indicates the ‘objections and complaints’ (‘Beroep en Bezwaar’ in Dutch) subprocess. The last three digits hint at the order in which activities are executed, where the first digit often indicates a phase within a process. Each trace and each event contain several data attributes that can be used for various checks and predictions. Furthermore, some employees may have performed tasks for different municipalities, i.e. if the employee number is the same, it is safe to assume the same person is being identified.
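
    The three-part code structure described above could be parsed with a regular expression along these lines. The underscore separators are inferred from the two examples in the text, the letters-only middle part is an assumption, and "01_HOOFD_010" is a made-up code used only for illustration.

```python
import re

# Two digits, letters, three digits, separated by underscores (assumed).
ACTIVITY_CODE = re.compile(r"^(\d{2})_([A-Za-z]+)_(\d{3})$")


def parse_activity_code(code):
    """Split an activity code into subprocess, order and phase digit.

    Returns None when the code does not follow the assumed pattern.
    """
    match = ACTIVITY_CODE.match(code)
    if match is None:
        return None
    prefix, letters, order = match.groups()
    return {
        "subprocess": f"{prefix}_{letters}",  # e.g. 01_HOOFD or 01_BB
        "order": order,                        # three-digit ordering hint
        "phase_digit": order[0],               # often indicates the phase
    }


print(parse_activity_code("01_HOOFD_010"))
```

    Codes in the actual logs that deviate from this pattern would return None here and need separate handling.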

    The data contains the following entities and their events

    - Application - a building permit application handled in one of five Dutch municipalities
    - Case_R - a user or worker involved in handling the application
    - Responsible_actor - a user or worker designated as responsible actor for an activity
    - monitoringResource - a user or worker designated as monitoring resource for an activity

    The data contains 5 event log nodes as the data was integrated from 5 different event logs from 5 different systems.


    Data Size
    ---------

    BPIC15, nodes: 268851, relationships: 2620418

  15. Sweden - Distribution of population by household types: Single person

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Sep 15, 2020
    Cite
    TRADING ECONOMICS (2020). Sweden - Distribution of population by household types: Single person [Dataset]. https://tradingeconomics.com/sweden/distribution-of-population-by-household-types-single-person-eurostat-data.html
    Available download formats: csv, excel, xml, json
    Dataset updated
    Sep 15, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Sweden
    Description

    Sweden - Distribution of population by household types: Single person was 22.20% in December of 2024, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Sweden - Distribution of population by household types: Single person, last updated from EUROSTAT in December of 2025. Historically, Sweden - Distribution of population by household types: Single person reached a record high of 24.10% in December of 2023 and a record low of 19.80% in December of 2012.

  16. Data from: CLARA Knowledge Graph of licensed educational resources

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Oct 20, 2023
    Cite
    Kieffer, Manoé; Fakih, Ghinwa; Serrano Alvarado, Patricia (2023). CLARA Knowledge Graph of licensed educational resources [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_8403141
    Dataset updated
    Oct 20, 2023
    Dataset provided by
    LS2N, UMR6004, Nantes Université
    Authors
    Kieffer, Manoé; Fakih, Ghinwa; Serrano Alvarado, Patricia
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CLARA

    This deposit is part of the CLARA project, which aims to help teachers create new educational resources and, in particular, handle the licenses of the educational resources they reuse. The deposit contains the RDF files created using an RDF mapping (RML) and a mapper (Morph-KGC), as well as the JSON files used as input. The corresponding pipeline can be found on Gitlab. The data used in that pipeline originate from X5GON, a European project that aims to generate and gather open educational resources.

    Knowledge graph content

    The Knowledge Graph contains information about 45K Educational Resources (ERs) and 135K subjects (extracted from DBpedia). For each ER, that information covers:

    • the author,
    • its title and description,
    • the license,
    • a URL to the resource itself,
    • the language of the ER,
    • its mimetype,
    • which subjects it talks about, and to what extent. That extent is given by two scores: a PageRank score and a Cosinus score.

    A particularity of the knowledge graph is its heavy use of RDF reification across large multi-valued properties. Four versions of the knowledge graph therefore exist, using Standard reification, Singleton property, Named graphs, and RDF-star. The Knowledge Graph also contains categories originating from DBpedia, which help refine the subjects that are also extracted from DBpedia.

    The KG.zip files contain five types of files:

    • Authors_[X].nt: the authors' nodes, their type, and name.
    • ER_[X].nt/nq/ttl: the ERs and their information, using the respective RDF reification model.
    • categories_skos_[X].ttl: the hierarchy of DBpedia categories.
    • categories_labels.ttl: additional information about the categories.
    • categories_article.ttl: the RDF triples that link the DBpedia subjects to the DBpedia categories.

    JSON content

    The original dataset was cut into multiple JSON files to make it easier to process. DBpedia categories were extracted as RDF and are not present in the JSON files. There are two types of files in the input-json.zip file:

    • authors_[X].json: lists the authors' names.
    • ER_[X].json: lists the ERs and their related information.

    For each ER, that information contains:

    • the title,
    • the description,
    • the language (and language_detected; only the first one is used in the pipeline here),
    • the license,
    • the mimetype,
    • the authors,
    • the date of creation of the resource,
    • a URL linking to the resource itself,
    • the subjects (named concepts) associated with the resource, with the corresponding scores.
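    The description above names Standard reification as one of the four models used to attach scores to a statement. As a rough illustration only, the sketch below builds standard-reification triples as plain Python tuples. The `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms come from the RDF vocabulary; the `http://example.org/` namespace, the `hasSubject` property and the score property names are hypothetical and are not taken from the CLARA files.

```python
# RDF vocabulary namespace (standard) and a hypothetical example namespace.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # assumption: not the namespace used by CLARA


def reify(statement_id, subject, predicate, obj, annotations):
    """Return standard-reification triples describing one statement.

    The statement (subject, predicate, obj) is represented by a node of
    type rdf:Statement, and each annotation is attached to that node.
    """
    node = EX + statement_id
    triples = [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", subject),
        (node, RDF + "predicate", predicate),
        (node, RDF + "object", obj),
    ]
    for prop, value in annotations.items():
        triples.append((node, EX + prop, value))
    return triples


# Hypothetical resource, property and score values, for illustration only.
triples = reify(
    "stmt1",
    EX + "er/42",
    EX + "hasSubject",
    "http://dbpedia.org/resource/Graph_theory",
    {"pageRankScore": 0.12, "cosineScore": 0.87},
)
```

    With this model every annotated statement costs four triples plus one per score, which is one reason a deposit might compare it against the more compact alternatives listed above.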

    If you use this dataset, you can cite this repository:

    Kieffer, M., Fakih, G., & Serrano Alvarado, P. (2023). CLARA Knowledge Graph of licensed educational resources [Data set]. Semantics, Leipzig, Germany. Zenodo. https://doi.org/10.5281/zenodo.8403142

    Or the corresponding paper:

    Kieffer, M., Fakih, G. & Serrano-Alvarado, P. (2023). Evaluating Reification with Multi-valued Properties in a Knowledge Graph of Licensed Educational Resources. Semantics, Leipzig, Germany.

  17. Data from: OpenAIRE Research Graph Dump

    • zenodo.org
    • pub.uni-bielefeld.de
    • +1more
    application/gzip
    Updated Aug 17, 2023
    + more versions
    Cite
    Paolo Manghi; Claudio Atzori; Alessia Bardi; Jochen Schirrwagen; Harry Dimitropoulos; Sandro La Bruzzo; Ioannis Foufoulas; Aenne Löhden; Amelie Bäcker; Andrea Mannocci; Marek Horst; Miriam Baglioni; Andreas Czerniak; Katerina Iatropoulou; Argiro Kokogiannaki; Michele De Bonis; Michele Artini; Enrico Ottonello; Antonis Lempesis; Lars Holm Nielsen; Alexandros Ioannidis; Chiara Bigarella; Friedrich Summann (2023). OpenAIRE Research Graph Dump [Dataset]. http://doi.org/10.5281/zenodo.3516918
    Available download formats: application/gzip
    Dataset updated
    Aug 17, 2023
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Paolo Manghi; Claudio Atzori; Alessia Bardi; Jochen Schirrwagen; Harry Dimitropoulos; Sandro La Bruzzo; Ioannis Foufoulas; Aenne Löhden; Amelie Bäcker; Andrea Mannocci; Marek Horst; Miriam Baglioni; Andreas Czerniak; Katerina Iatropoulou; Argiro Kokogiannaki; Michele De Bonis; Michele Artini; Enrico Ottonello; Antonis Lempesis; Lars Holm Nielsen; Alexandros Ioannidis; Chiara Bigarella; Friedrich Summann
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    November 2020: Please check out the newer version of the OpenAIRE Research Graph dump available at https://doi.org/10.5281/zenodo.4201546. The newer version contains JSON files that are more compact and easier to process. Learn more about the OpenAIRE Research Graph at https://graph.openaire.eu.

    The OpenAIRE Research Graph is exported as several dumps, so you can download only the parts you are interested in.

    • publication.gz: metadata records about research literature (includes types of publications listed here)
    • dataset.gz: metadata records about research data (includes the subtypes listed here)
    • software.gz: metadata records about research software (includes the subtypes listed here)
    • orp.gz: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed here)
    • organization.gz: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.
    • datasource.gz: metadata records about providers whose content is available in the OpenAIRE Research Graph. They include institutional and thematic repositories, journals, aggregators, funders' databases.
    • project.gz: metadata records about projects funded by a given funder.
    • : metadata records about research results (publications, datasets, software, and other research products) funded by a given funder.

    Please go to http://develop.openaire.eu/graph-dumps.html for instructions on how to consume the dumps.

    Libraries: this blog describes the openairegraph libraries, which can be used to perform analytics on this dataset.
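
    As a rough illustration of consuming one of the `.gz` dumps listed above, the sketch below streams a gzip-compressed file without unpacking it to disk. It assumes one JSON object per line, which is an assumption and not something this description confirms; the actual record layout should be checked against the official instructions. The function names and the `type` field in the usage note are hypothetical.

```python
import gzip
import json
from collections import Counter


def iter_records(path):
    """Yield one parsed record per non-empty line of a gzip-compressed file.

    Assumption: the dump stores one JSON object per line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_field(path, field):
    """Count records by the value of a top-level field (hypothetical helper)."""
    counts = Counter()
    for record in iter_records(path):
        counts[record.get(field, "<missing>")] += 1
    return counts
```

    A call such as `count_by_field("publication.gz", "type")` would then tally records by a `type` field if the file had that shape. Streaming line by line keeps memory use flat even for large dumps.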

  18. OAG Dataset for H2GB

    • kaggle.com
    zip
    Updated Jun 11, 2024
    Cite
    Junhong Lin (2024). OAG Dataset for H2GB [Dataset]. https://www.kaggle.com/datasets/junhonglin/oag-dataset-for-h2gb
    Available download formats: zip (15456920438 bytes)
    Dataset updated
    Jun 11, 2024
    Authors
    Junhong Lin
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    oag-cs, oag-eng, and oag-chem are new heterogeneous networks composed of subsets of the Open Academic Graph (OAG). The three datasets contain papers from three different subject domains: computer science, engineering, and chemistry. They also contain four types of entities: papers, authors, institutions, and fields of study. Each paper is associated with a 768-dimensional feature vector generated by applying a pre-trained XLNet to the paper title. The representation of each word in the title is weighted by that word's attention to obtain the title representation for the paper. Each paper node is labeled with its published venue (paper or conference). Papers published up to 2016 form the training set, papers published in 2017 form the validation set, and papers published in 2018 and 2019 form the test set. The publication year of each paper is also included, so the datasets can also be converted to use the publication year as the class label.
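
    The year-based split described above can be sketched as follows. This is a minimal illustration that assumes the publication years are available as a plain list of integers; the function name is hypothetical and the dataset's own loader may organise the split differently.

```python
def split_by_year(years):
    """Assign paper indices to train/valid/test by publication year.

    Follows the split in the description: up to 2016 for training,
    2017 for validation, 2018 and 2019 for testing. Papers from any
    other year are left unassigned.
    """
    split = {"train": [], "valid": [], "test": []}
    for index, year in enumerate(years):
        if year <= 2016:
            split["train"].append(index)
        elif year == 2017:
            split["valid"].append(index)
        elif year in (2018, 2019):
            split["test"].append(index)
    return split


# Hypothetical publication years for six papers.
example = split_by_year([2010, 2017, 2018, 2016, 2019, 2020])
```

    Splitting by time rather than at random means the model is evaluated on papers published after everything it was trained on.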

  19. DataSheet5_Classifying breast cancer using multi-view graph neural network...

    • frontiersin.figshare.com
    txt
    Updated Feb 20, 2024
    + more versions
    Cite
    Yanjiao Ren; Yimeng Gao; Wei Du; Weibo Qiao; Wei Li; Qianqian Yang; Yanchun Liang; Gaoyang Li (2024). DataSheet5_Classifying breast cancer using multi-view graph neural network based on multi-omics data.CSV [Dataset]. http://doi.org/10.3389/fgene.2024.1363896.s005
    Available download formats: txt
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    Frontiers Media http://www.frontiersin.org/
    Authors
    Yanjiao Ren; Yimeng Gao; Wei Du; Weibo Qiao; Wei Li; Qianqian Yang; Yanchun Liang; Gaoyang Li
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: As evaluation indices, cancer grading and subtyping have diverse clinical, pathological, and molecular characteristics with prognostic and therapeutic implications. Although researchers have begun to study cancer differentiation and subtype prediction, most of the relevant methods are based on traditional machine learning and rely on single-omics data. It is necessary to explore a deep learning algorithm that integrates multi-omics data to achieve classification prediction of cancer differentiation and subtypes.

    Methods: This paper proposes a multi-omics data fusion algorithm based on a multi-view graph neural network (MVGNN) for predicting cancer differentiation and subtype classification. The model framework consists of a graph convolutional network (GCN) module for learning features from different omics data and an attention module for integrating multi-omics data. Three different types of omics data are used. For each type of omics data, feature selection is performed using methods such as the chi-square test and minimum redundancy maximum relevance (mRMR). Weighted patient similarity networks are constructed based on the selected omics features, and the GCN is trained using the omics features and the corresponding similarity networks. Finally, an attention module integrates the different types of omics features and performs the final cancer classification prediction.

    Results: To validate the cancer classification performance of the MVGNN model, we conducted experimental comparisons with traditional machine learning models and currently popular methods based on integrating multi-omics data, using 5-fold cross-validation. Additionally, we performed comparative experiments on cancer differentiation and its subtypes based on single-omics data, two types of omics data, and three types of omics data.

    Discussion: This paper proposed the MVGNN model, which performed well in cancer classification prediction based on multiple omics data.
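
    To make the "weighted patient similarity network" step more concrete, here is a minimal sketch that builds a weighted adjacency matrix from per-patient feature vectors. The description does not say which similarity measure or sparsification rule the authors use, so cosine similarity and the fixed `threshold` below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_network(features, threshold=0.5):
    """Build a symmetric weighted adjacency matrix over patients.

    Assumption: an edge is kept when the cosine similarity of two
    patients' selected omics features reaches the threshold, and the
    similarity itself is used as the edge weight.
    """
    n = len(features)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = cosine(features[i], features[j])
            if weight >= threshold:
                adjacency[i][j] = adjacency[j][i] = weight
    return adjacency


# Three hypothetical patients with two selected features each.
network = similarity_network([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

    One such network would be built per omics type, and each would be passed with its feature matrix to a GCN as the description outlines.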

  20. Global NoSQL Database Market By Type (Key-Value Store, Document Database,...

    • verifiedmarketresearch.com
    Updated Oct 14, 2025
    Cite
    VERIFIED MARKET RESEARCH (2025). Global NoSQL Database Market By Type (Key-Value Store, Document Database, Column Based Store, Graph Database), By Application (Data Storage, Mobile Apps, Web Apps, Data Analytics), By End-User Industry (Retail, Gaming, IT), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/nosql-database-market/
    Dataset updated
    Oct 14, 2025
    Dataset provided by
    Verified Market Research https://www.verifiedmarketresearch.com/
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    Global
    Description

    NoSQL Database Market size was valued at USD 6.47 Billion in 2024 and is expected to reach USD 44.66 Billion by 2032, growing at a CAGR of 30.14% from 2026 to 2032.

    Global NoSQL Database Market Drivers

    • Exponential Growth of Big Data and IoT: The explosion of Big Data and Internet of Things (IoT) applications is a primary catalyst for NoSQL adoption, requiring database solutions that can ingest and process colossal volumes of unstructured and semi-structured data from diverse sources like sensors, social media, and web logs. Unlike rigid relational systems,
    • Increasing Demand for Real-Time Web and Mobile Applications: The surging demand for real-time web and mobile applications is significantly fueling the NoSQL market, as these modern applications require sub-millisecond latency and exceptionally high throughput to deliver a seamless user experience. NoSQL database types, particularly key-value stores and document databases, are architecturally optimized for rapid read/write operations and horizontal scaling,
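
    The figures above can be related through the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is only an arithmetic illustration: the description gives a 2024 value and a 2032 value but quotes the CAGR for 2026 to 2032, and the 2026 base value is not stated, so the code back-computes it as an assumption and makes no claim about the publisher's methodology.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


def implied_start(end_value, rate, years):
    """Start value implied by an end value, a yearly growth rate and a duration."""
    return end_value / (1 + rate) ** years


# Growth rate implied by the two stated values (2024 to 2032, 8 years).
rate_2024_2032 = cagr(6.47, 44.66, 8)

# 2026 base value implied by the quoted 30.14% CAGR over 2026 to 2032.
# Assumption: the description does not state this base value.
base_2026 = implied_start(44.66, 0.3014, 6)
```

    Under these assumptions the 2024 to 2032 rate comes out at about 27% per year, and the quoted 30.14% rate implies a 2026 base of roughly USD 9.2 Billion.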

