2 datasets found
  1.

    Data from: Dataset for Collective Intelligence Architecture for IoT Using...

    • produccioncientifica.uca.es
    • portalcientifico.universidadeuropea.com
    Updated 2025
    Cite
    Rosa-Bilbao, Jesús; Reina Quintero, Antonia M.; Varela-Vaca, Angel Jesus; Gómez-López, María Teresa (2025). Dataset for Collective Intelligence Architecture for IoT Using Federated Process Mining [Dataset]. https://produccioncientifica.uca.es/documentos/67bc32b6478fbf5d29390c94
    Dataset updated
    2025
    Authors
    Rosa-Bilbao, Jesús; Reina Quintero, Antonia M.; Varela-Vaca, Angel Jesus; Gómez-López, María Teresa
    Description

    This dataset contains the key elements used in the paper "Collective Intelligence Architecture for IoT Using Federated Process Mining", ranging from complex event processing to process mining applied over multiple datasets. The information is organized into the following sections:

    1.- CEPApp.siddhi: Contains the rules and configurations used for pattern detection and real-time event processing.

    2.- ProcessStorage.sol: Smart contract code used in the case study, implemented in Solidity on the Polygon blockchain platform.

    3.- Datasets Used ({adlinterweave_dataset, adlmr_dataset, twor_dataset}.zip): Three datasets used in the study, each with events that have been processed using the CEP engine. The datasets are divided according to the rooms of the house:

    _room.csv: CSV file with data on the interactions during stays in the room.

    _bathroom.csv: CSV file with data on the interactions during stays in the bathroom.

    _other.csv: CSV file with data on the interactions in the rest of the rooms.

    4.- CEP Engine Processing Results ({cepresult_adlinterweave, cepresult_adlmr, cepresult_twor}.json): Output generated by the Siddhi CEP engine, stored in JSON format. The data is categorized into different files based on the type of detected activity:

    _room.json: Contains the events related to the stay in the room.

    _bathroom.json: Contains the events related to the stay in the bathroom.

    _other.json: Contains the events related to the rest of the rooms.

    5.- Federated Event Logs ({xesresult_adlinterweave, xesresult_adlmr, xesresult_twor}.xes): Federated event logs in XES format, the standard format in process mining. They contain the event traces obtained after running the Event Log Integrator.

    6.- Process Mining Results: Models generated from the processed event logs:

    Process Trees ({procestree_adlinterweave, procestree_adlmr, procestree_twor}.svg): Structured representation of the detected workflows.

    Petri Nets ({petrinet_adlinterweave, petrinet_adlmr, petrinet_twor}.svg): Mathematical model of the discovered processes, useful for compliance analysis and simulations.

    Disco Results ({disco_adlinterweave, disco_adlmr, disco_twor}.pdf): Process models discovered with the Disco tool.

    ProM Results ({prom_adlinterweave, prom_adlmr, prom_twor}.pdf): Models generated with the ProM tool.
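
For readers who want to inspect one of the federated event logs listed in section 5, the following is a minimal sketch using only the Python standard library. It relies on the generic XES layout (a `log` element containing `trace` elements, each containing `event` elements whose activity label is stored under the `concept:name` key). The function names `local_name` and `summarize_xes` are hypothetical, and the attributes actually present in these particular files have not been verified here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an XML tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_xes(source) -> dict:
    """Count traces, events, and activity labels in an XES document.

    `source` is a file path or a file-like object accepted by ElementTree.
    """
    root = ET.parse(source).getroot()
    traces = [el for el in root if local_name(el.tag) == "trace"]
    activities = Counter()
    n_events = 0
    for trace in traces:
        for event in trace:
            # Skip trace-level attributes; only <event> children are counted.
            if local_name(event.tag) != "event":
                continue
            n_events += 1
            for attr in event:
                # XES stores the activity label under the concept:name key.
                if attr.get("key") == "concept:name":
                    activities[attr.get("value")] += 1
    return {
        "traces": len(traces),
        "events": n_events,
        "activities": dict(activities),
    }
```

Assuming the archive has been downloaded locally, a call such as `summarize_xes("xesresult_twor.xes")` would return the number of traces and events plus a count per activity label. A dedicated process mining library would be the more usual choice for real analysis; the standard-library version is shown only to make the file structure concrete.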

  2.

    Data from: PDFDataExtractor: A Tool for Reading Scientific Text and...

    • acs.figshare.com
    zip
    Updated Jun 7, 2023
    Cite
    Miao Zhu; Jacqueline M. Cole (2023). PDFDataExtractor: A Tool for Reading Scientific Text and Interpreting Metadata from the Typeset Literature in the Portable Document Format [Dataset]. http://doi.org/10.1021/acs.jcim.1c01198.s001
    Available download formats
    zip
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    ACS Publications
    Authors
    Miao Zhu; Jacqueline M. Cole
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The layout of portable document format (PDF) files is constant to any screen, and the metadata therein are latent, compared to mark-up languages such as HTML and XML. No semantic tags are usually provided, and a PDF file is not designed to be edited or its data interpreted by software. However, data held in PDF files need to be extracted in order to comply with open-source data requirements that are now government-regulated. In the chemical domain, related chemical and property data also need to be found, and their correlations need to be exploited to enable data science in areas such as data-driven materials discovery. Such relationships may be realized using text-mining software such as the “chemistry-aware” natural-language-processing tool, ChemDataExtractor; however, this tool has limited data-extraction capabilities from PDF files. This study presents the PDFDataExtractor tool, which can act as a plug-in to ChemDataExtractor. It outperforms other PDF-extraction tools for the chemical literature by coupling its functionalities to the chemical-named entity-recognition capabilities of ChemDataExtractor. The intrinsic PDF-reading abilities of ChemDataExtractor are much improved. The system features a template-based architecture. This enables semantic information to be extracted from the PDF files of scientific articles in order to reconstruct the logical structure of articles. While other existing PDF-extracting tools focus on quantity mining, this template-based system is more focused on quality mining on different layouts. PDFDataExtractor outputs information in JSON and plain text, including the metadata of a PDF file, such as paper title, authors, affiliation, email, abstract, keywords, journal, year, document object identifier (DOI), reference, and issue number. With a self-created evaluation article set, PDFDataExtractor achieved promising precision for all key assessed metadata areas of the document text.

