83 datasets found
  1. Dataflow(test) Dataset

    • universe.roboflow.com
    zip
    Updated Jul 24, 2023
    Cite
    data flow (2023). Dataflow(test) Dataset [Dataset]. https://universe.roboflow.com/data-flow-lz3tb/dataflow-test
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 24, 2023
    Dataset authored and provided by
    data flow
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Dots Defect Test Bounding Boxes
    Description

    Dataflow(test)

    ## Overview
    
    Dataflow(test) is a dataset for object detection tasks - it contains Dots Defect Test annotations for 405 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
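    As a minimal sketch of one way to pull the export programmatically, the snippet below uses the `roboflow` pip package; the workspace and project slugs come from the citation URL above, while the API key, version number, and export format are placeholders, not values from this listing.

```python
# Hedged sketch: fetch the export via the roboflow package (pip install roboflow).
# Workspace/project slugs come from the dataset URL above; the API key, version
# number, and export format below are placeholders, not values from this listing.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("data-flow-lz3tb").project("dataflow-test")
dataset = project.version(1).download("coco")  # assumed version and format

print(dataset.location)  # local folder with images and annotation files
```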
    
      ## License
    
  This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Dataflow Dataset

    • universe.roboflow.com
    zip
    Updated Aug 4, 2023
    + more versions
    Cite
    data flow (2023). Dataflow Dataset [Dataset]. https://universe.roboflow.com/data-flow-lz3tb/dataflow/model/25
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 4, 2023
    Dataset authored and provided by
    data flow
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Defect Bounding Boxes
    Description

    Dataflow

    ## Overview
    
    Dataflow is a dataset for object detection tasks - it contains Defect annotations for 613 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
  This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. Healthcare Operational Data Flows: Acute data set

    • standards.nhs.uk
    Updated May 23, 2025
    Cite
    NHS England (2025). Healthcare Operational Data Flows: Acute data set [Dataset]. https://standards.nhs.uk/published-standards/healthcare-operational-data-flows-acute-data-set
    Explore at:
    Dataset updated
    May 23, 2025
    Dataset provided by
    National Health Service, https://www.nhs.uk/
    Authors
    NHS England
    Description

    The Healthcare Operational Data Flows (HODF): Acute Data Set provides an automated patient-based daily data collection to support NHS delivery plans for the recovery of elective care and emergency and urgent care.

  4. Data from: AGENT Guidelines for dataflow

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, pdf
    Updated Mar 7, 2025
    + more versions
    Cite
    Michael Alaux; Anne-Françoise Adam-Blondon; Matthijs Brouwer; Paul Kersey; Matthias Lange; Erwan Le Floch; Cyril Pommier; Danuta Schüler; Nils Stein; Stephan Weise (2025). AGENT Guidelines for dataflow [Dataset]. http://doi.org/10.5281/zenodo.14989870
    Explore at:
    Available download formats: bin, pdf
    Dataset updated
    Mar 7, 2025
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Michael Alaux; Anne-Françoise Adam-Blondon; Matthijs Brouwer; Paul Kersey; Matthias Lange; Erwan Le Floch; Cyril Pommier; Danuta Schüler; Nils Stein; Stephan Weise
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The AGENT project aims at integrating data from different sources (genebanks, research institutes, international archives) and types (passport, phenotypic, genomic data).

    These guidelines have been developed to explain the data flow within the AGENT project and should be useful for other projects.

    The phenotypic data templates are included.

  5. Data from: Managing Your Data Flows: Architecture and Data Provenance For Your Institution

    • vivo.figshare.com
    • figshare.com
    • +1 more
    pdf
    Updated Mar 31, 2016
    Cite
    John Fereira; Violeta Ilik; Alex Viggio (2016). Managing Your Data Flows: Architecture and Data Provenance For Your Institution [Dataset]. http://doi.org/10.6084/m9.figshare.2002206.v3
    Explore at:
    Available download formats: pdf
    Dataset updated
    Mar 31, 2016
    Dataset provided by
    VIVO
    Authors
    John Fereira; Violeta Ilik; Alex Viggio
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    VIVO aims to support an open and networked research ecosystem. This workshop will apply methods to understand VIVO’s interaction with various data sources and the existing data ingest needs and challenges, highlighting how one can architect data ingest flows into a VIVO instance. We will cover the use of Karma and the VIVO Harvester, how Symplectic uses the Harvester, and how these tools are connected architecturally to the VIVO platform as a whole. The goal is to understand the diversity of tools and learn why and how different approaches to data ingest would meet specific use cases.

  6. Sensitive Data Flow Maps with AI Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Cite
    Growth Market Reports (2025). Sensitive Data Flow Maps with AI Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/sensitive-data-flow-maps-with-ai-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Sensitive Data Flow Maps with AI Market Outlook



    According to our latest research, the global Sensitive Data Flow Maps with AI market size reached USD 1.8 billion in 2024, reflecting rapid adoption across industries striving for enhanced data privacy and regulatory compliance. The market is expected to grow at a CAGR of 22.7% from 2025 to 2033, reaching a projected value of USD 13.2 billion by 2033. This robust growth is primarily fueled by increasing regulatory demands, the proliferation of sensitive data, and the need for automated, AI-driven solutions to map, monitor, and secure data flows across complex digital ecosystems.




    One of the core growth drivers for the Sensitive Data Flow Maps with AI market is the rapidly intensifying regulatory landscape. Governments and regulatory bodies worldwide are implementing stringent data privacy laws such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and similar frameworks in Asia Pacific and Latin America. These regulations require organizations to have a granular understanding of how sensitive data moves within their networks, who accesses it, and where it is stored. AI-powered data flow mapping tools enable real-time visibility and compliance, automating the identification and classification of sensitive data and mapping its lifecycle across the organization. This automation not only reduces the risk of non-compliance and associated penalties but also empowers organizations to proactively manage data privacy and security.




    Another significant growth factor is the exponential increase in data volume and complexity, driven by digital transformation initiatives, cloud migration, and the proliferation of connected devices. Organizations today manage vast, distributed data environments spanning on-premises infrastructure, public and private clouds, and edge devices. Traditional manual data mapping methods are no longer sufficient to keep pace with this complexity. AI-driven sensitive data flow maps leverage machine learning and natural language processing to automatically discover, classify, and monitor sensitive information as it traverses diverse systems. This capability is crucial for risk assessment, incident response, and ensuring that sensitive data is not inadvertently exposed or mishandled, thus bolstering the market’s growth trajectory.




    The integration of AI into data governance frameworks is also accelerating market adoption. Enterprises are increasingly recognizing the value of AI-powered data flow maps in supporting comprehensive data governance programs. These solutions provide actionable insights into data lineage, usage patterns, and access controls, enabling organizations to enforce data minimization, retention, and protection policies more effectively. Furthermore, as cyber threats become more sophisticated, the ability to visualize sensitive data flows in real-time enhances security postures, enabling rapid detection and mitigation of potential breaches. This convergence of compliance, governance, and security imperatives is driving sustained investment in advanced sensitive data flow mapping technologies.




    Regionally, North America currently leads the Sensitive Data Flow Maps with AI market, accounting for the largest revenue share in 2024 due to early regulatory adoption, advanced IT infrastructure, and a high concentration of tech-savvy enterprises. Europe follows closely, propelled by the GDPR and strong privacy advocacy. Asia Pacific is emerging as the fastest-growing region, driven by digitalization, expanding regulatory frameworks, and increasing cybersecurity investments across sectors such as BFSI, healthcare, and government. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as organizations in these regions ramp up efforts to modernize data management and comply with evolving data protection laws.





    Component Analysis



    The Sensitive Data Flow Maps with AI market is segmented by component into Software

  7. Sensitive Data Flow Mapping Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Sensitive Data Flow Mapping Market Research Report 2033 [Dataset]. https://dataintelo.com/report/sensitive-data-flow-mapping-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Sensitive Data Flow Mapping Market Outlook



    According to our latest research, the global Sensitive Data Flow Mapping market size reached USD 2.13 billion in 2024. The market is expected to grow at a robust CAGR of 17.8% from 2025 to 2033, resulting in a forecasted market size of USD 8.04 billion by 2033. This remarkable growth is driven primarily by the increasing complexity of data privacy regulations, the proliferation of digital transformation initiatives, and the heightened need for organizations to track and secure sensitive data across diverse environments.




    The growth of the Sensitive Data Flow Mapping market is fundamentally propelled by the exponential increase in data generation and the parallel rise in data privacy concerns. Organizations are grappling with vast volumes of structured and unstructured data distributed across on-premises and cloud environments. Mapping the flow of sensitive data—such as personally identifiable information (PII), financial records, and health data—has become crucial for compliance with regulations like GDPR, CCPA, and HIPAA. The growing awareness of data breaches and the resulting financial and reputational damage are compelling enterprises to invest in advanced data flow mapping solutions that offer real-time visibility and control over data movement, storage, and access.




    Another significant growth driver is the digital transformation wave sweeping across industries. As businesses embrace cloud computing, Internet of Things (IoT), and hybrid IT architectures, the complexity of data ecosystems has surged. Sensitive Data Flow Mapping solutions are increasingly being adopted to gain granular insights into data flows, dependencies, and vulnerabilities. These solutions enable organizations to identify potential risks, enforce security policies, and automate compliance processes. The integration of artificial intelligence (AI) and machine learning (ML) into data flow mapping tools further enhances their ability to detect anomalies, predict threats, and ensure continuous compliance, which is crucial in today’s fast-evolving threat landscape.




    The demand for Sensitive Data Flow Mapping is also being fueled by the growing emphasis on data governance and risk management. Enterprises are under mounting pressure to demonstrate accountability and transparency in their data handling practices. Data flow mapping tools provide a comprehensive view of how sensitive data traverses through various systems, applications, and third-party integrations. This visibility is essential for implementing robust data governance frameworks, conducting risk assessments, and responding promptly to regulatory audits. As a result, sectors such as banking, healthcare, government, and retail are leading adopters, seeking to safeguard sensitive information and build trust with customers and stakeholders.




    From a regional perspective, North America dominates the Sensitive Data Flow Mapping market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The high adoption rate in North America is attributed to stringent regulatory requirements, advanced IT infrastructure, and a strong focus on cybersecurity. Europe’s growth is driven by GDPR compliance and an increasing number of data protection initiatives, while Asia Pacific is witnessing rapid expansion due to the digitalization of enterprises and evolving regulatory frameworks. Latin America and the Middle East & Africa are also exhibiting steady growth, albeit at a slower pace, as organizations in these regions gradually ramp up their investments in data protection and compliance solutions.



    Component Analysis



    The Sensitive Data Flow Mapping market is segmented by component into software and services, each playing a pivotal role in the overall ecosystem. Software solutions form the backbone of this market, offering automated tools for discovering, mapping, and visualizing sensitive data flows across complex IT environments. These platforms leverage advanced analytics, AI, and ML algorithms to provide real-time insights, detect anomalies, and generate actionable reports. The increasing sophistication of cyber threats and the need for continuous compliance monitoring are driving the demand for comprehensive software solutions that can seamlessly integrate with existing security and data management systems.




    Services, on the other hand, encompass consulting, deployment,

  8. Guide to the use of data flows

    • data.europa.eu
    pdf
    Cite
    Airparif, Guide to the use of data flows [Dataset]. https://data.europa.eu/data/datasets/5ba0ac15634f411171af593d?locale=en
    Explore at:
    Available download formats: pdf (808408)
    Dataset authored and provided by
    Airparif
    License

    Open Database License (ODbL) v1.0, https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Descriptive guide to published flows.

  9. S1 Data -

    • plos.figshare.com
    bin
    Updated Aug 19, 2024
    + more versions
    Cite
    Yang Liu; Yuan Zhang; Rui Jiang; Jing Cheng; JingJing Dai (2024). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0308716.s001
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 19, 2024
    Dataset provided by
    PLOS, http://plos.org/
    Authors
    Yang Liu; Yuan Zhang; Rui Jiang; Jing Cheng; JingJing Dai
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Amidst growing skepticism towards globalization and rising digital trade, this study investigates the impact of Restrictions on Cross-Border Data Flows (RCDF) on Domestic Value Chains (DVCs) stability. As global value chains participation declines, the stability of DVCs—integral to internal economic dynamics—becomes crucial. This study situates within a framework exploring the role of innovation and RCDF in the increasingly interconnected global trade. Using a panel data fixed effect model, our analysis provides insights into the varying effects of RCDF on DVCs stability across countries with diverse economic structures and technological advancement levels. This approach allows for a nuanced understanding of the interplay between digital trade policies, value chain stability, and innovation. RCDF tend to disrupt DVCs by negatively impacting innovation, which necessitates proactive policy measures to mitigate these effects. In contrast, low-income countries experience a less detrimental impact; RCDF may even aid in integrating their DVCs into Global Value Chains, enhancing economic stability. It underscores the need for dynamic, adaptable policies and global collaboration to harmonize digital trade standards, thus offering guidance for policy-making in the context of an interconnected global economy.

  10. General data: Flows of employing legal units by autonomous community. CODEM (API identifier: 49326)

    • datos.gob.es
    Updated Apr 5, 2023
    + more versions
    Cite
    Instituto Nacional de Estadística (2023). General data: Flows of employing legal units by autonomous community. CODEM (API identifier: 49326) [Dataset]. https://datos.gob.es/en/catalogo/ea0010587-datos-generales-flujos-de-unidades-legales-empleadoras-por-comunidades-autonomas-codem-identificador-api-49326
    Explore at:
    Dataset updated
    Apr 5, 2023
    Dataset provided by
    National Statistics Institute, http://www.ine.es/
    Authors
    Instituto Nacional de Estadística
    License

    https://www.ine.es/aviso_legal

    Description

    Table of Experimental Statistics. General data: Flows of employing legal units by autonomous community. Quarterly. Autonomous Communities and Cities. Company Demographic Profile

  11. Sensitive Data Flow Maps with AI Market Research Report 2033

    • researchintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Research Intelo (2025). Sensitive Data Flow Maps with AI Market Research Report 2033 [Dataset]. https://researchintelo.com/report/sensitive-data-flow-maps-with-ai-market
    Explore at:
    Available download formats: pptx, csv, pdf
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Research Intelo
    License

    https://researchintelo.com/privacy-and-policy

    Time period covered
    2024 - 2033
    Area covered
    Global
    Description

    Sensitive Data Flow Maps with AI Market Outlook



    According to our latest research, the Sensitive Data Flow Maps with AI market size was valued at $1.2 billion in 2024 and is projected to reach $7.8 billion by 2033, expanding at a remarkable CAGR of 23.1% during the forecast period of 2025–2033. This robust growth is primarily driven by the increasing complexity and volume of sensitive data across industries, coupled with stringent regulatory frameworks that demand advanced, automated data mapping solutions. The integration of artificial intelligence into data flow mapping is rapidly transforming how organizations visualize, manage, and secure sensitive information, enabling real-time compliance, enhanced risk management, and proactive data governance. As organizations globally grapple with evolving privacy laws and the rising threat landscape, the adoption of AI-powered sensitive data flow mapping solutions is becoming a critical strategic imperative.



    Regional Outlook



    North America currently commands the largest share of the global Sensitive Data Flow Maps with AI market, accounting for over 42% of the total market value in 2024. This dominance is attributed to the region’s advanced technological infrastructure, a high concentration of data-driven enterprises, and the presence of leading AI solution providers. The United States, in particular, is at the forefront due to its proactive regulatory environment, such as the California Consumer Privacy Act (CCPA) and the Health Insurance Portability and Accountability Act (HIPAA), which necessitate robust data mapping and compliance tools. Additionally, North American organizations are early adopters of AI-driven security and privacy technologies, further fueling market expansion. The region’s mature ecosystem, abundant investment in cybersecurity, and a strong focus on digital transformation initiatives continue to position it as the benchmark for sensitive data flow mapping innovation.



    The Asia Pacific region is projected to be the fastest-growing market, with a forecasted CAGR exceeding 27% through 2033. This rapid acceleration is driven by the exponential growth of digital economies, increasing penetration of cloud computing, and the proliferation of data-intensive industries such as banking, healthcare, and telecommunications. Countries like China, India, Japan, and South Korea are witnessing significant investments in AI and cybersecurity infrastructure, motivated by heightened awareness of data privacy risks and the introduction of stricter data protection regulations. The surge in cross-border data flows, coupled with a burgeoning startup ecosystem, is further propelling demand for sophisticated, AI-powered data mapping solutions. As organizations in Asia Pacific strive to align with global best practices and regulatory standards, the region is set to emerge as a pivotal hub for sensitive data flow mapping innovation and deployment.



    Emerging economies in Latin America, the Middle East, and Africa are gradually embracing Sensitive Data Flow Maps with AI, though adoption remains in its nascent stages compared to developed markets. Challenges such as limited digital infrastructure, varying regulatory maturity, and budget constraints have tempered the pace of market penetration. However, the growing digitization of public and private sectors, coupled with increasing awareness of data privacy risks, is driving incremental demand. Localized regulatory reforms, such as Brazil’s LGPD and evolving data protection laws in the GCC countries, are beginning to incentivize investments in AI-driven compliance and risk management tools. While these regions face hurdles related to skills shortages and integration complexities, the long-term outlook remains positive as governments and enterprises prioritize digital resilience and regulatory alignment.



    Report Scope





    Attributes | Details
    Report Title | Sensitive Data Flow Maps with AI Market Research Report 2033
    By Component | Software, Services
    By Application | Data

  12. Nested parallelism and control flow in big data analytics systems

    • resodate.org
    Updated Jun 15, 2022
    Cite
    Gábor Etele Gévay (2022). Nested parallelism and control flow in big data analytics systems [Dataset]. http://doi.org/10.14279/depositonce-15711
    Explore at:
    Dataset updated
    Jun 15, 2022
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Gábor Etele Gévay
    Description

    Over the last 15 years, numerous distributed dataflow systems appeared for large-scale data analytics, such as Apache Flink and Apache Spark. Users of such systems write data analysis programs in a (more or less) high-level API, while the systems take care of the low-level details of executing the programs in a scalable way on a cluster of machines. The systems' APIs consist of distributed collection types (or distributed matrix, graph, etc. types), and corresponding parallel operations. Distributed dataflow systems work well for simple programs, which are straightforward to express by just a few of the system-provided parallel operations. However, modern data analytics often demands the composition of larger programs, where 1) parallel operations are surrounded by control flow statements (e.g., in iterative algorithms, such as PageRank or K-means clustering), and/or 2) parallel operations are nested into each other. In such cases, an unpleasant trade-off appears: we lose either performance or ease-of-use: If users compose these complex programs in a straightforward way, they run into performance issues. Expert users might be able to solve the performance issues, albeit at the cost of a significant effort of delving into low-level execution details. In this thesis, we solve this trade-off for the case of control flow statements as follows: Our system allows users to express control flow with easy-to-use, standard, imperative control flow constructs, and it compiles the program into a single dataflow job. Having a single job eliminates the job launch overhead from iteration steps, and enables several loop optimizations. We compile through an intermediate representation based on static single assignment form, which allows us to handle all the standard imperative control flow statements in a uniform way. A run-time component of our system coordinates the distributed execution of control flow statements, using a novel coordination algorithm, which leverages our intermediate representation to handle any imperative control flow. Furthermore, for handling nested parallel operations, we propose a compilation technique that flattens a nested program, i.e., creates an equivalent flat program where there is no nesting of parallel operations. The flattened program can then be executed on a standard distributed dataflow system. Our main design goal was to enable users to nest any data analysis program inside a parallel operation without changes, i.e., to not introduce significant restrictions on how the system's API can be used at inner nesting levels. An important example is that, contrary to previous systems that perform flattening, we can even handle programs where there is an iterative algorithm at inner nesting levels. We also show three optimizations, which solve performance problems that arise when applying the flattening technique in the context of distributed dataflow systems.
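    For readers unfamiliar with the trade-off described above, here is a minimal, assumed PySpark sketch of the "straightforward" composition: an imperative driver-side loop wrapped around parallel operations, so each iteration triggers its own distributed job and pays the launch overhead that the thesis's single-dataflow compilation removes. It is illustrative only and is not the system described in the thesis.

```python
# Minimal PySpark sketch (not the thesis's system): PageRank-style iteration with
# control flow in the driver. Each pass over the loop launches a separate job,
# which is the per-iteration overhead the abstract says its compiler avoids.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iterative-dataflow-sketch").getOrCreate()
sc = spark.sparkContext

links = sc.parallelize([("a", ["b", "c"]), ("b", ["c"]), ("c", ["a"])])
ranks = sc.parallelize([("a", 1.0), ("b", 1.0), ("c", 1.0)])

for _ in range(10):  # imperative control flow around parallel operations
    contribs = links.join(ranks).flatMap(
        lambda kv: [(dst, kv[1][1] / len(kv[1][0])) for dst in kv[1][0]])
    ranks = contribs.reduceByKey(lambda a, b: a + b) \
                    .mapValues(lambda r: 0.15 + 0.85 * r)
    ranks.count()  # action: forces one distributed job per iteration

print(sorted(ranks.collect()))
spark.stop()
```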

  13. Specification and optimization of analytical data flows

    • resodate.org
    Updated May 27, 2016
    Cite
    Fabian Hüske (2016). Specification and optimization of analytical data flows [Dataset]. http://doi.org/10.14279/depositonce-5150
    Explore at:
    Dataset updated
    May 27, 2016
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Fabian Hüske
    Description

    In the past, the majority of data analysis use cases was addressed by aggregating relational data. Since a few years, a trend is evolving, which is called “Big Data” and which has several implications on the field of data analysis. Compared to previous applications, much larger data sets are analyzed using more elaborate and diverse analysis methods such as information extraction techniques, data mining algorithms, and machine learning methods. At the same time, analysis applications include data sets with less or even no structure at all. This evolution has implications on the requirements on data processing systems. Due to the growing size of data sets and the increasing computational complexity of advanced analysis methods, data must be processed in a massively parallel fashion. The large number and diversity of data analysis techniques as well as the lack of data structure determine the use of user-defined functions and data types. Many traditional database systems are not flexible enough to satisfy these requirements. Hence, there is a need for programming abstractions to define and efficiently execute complex parallel data analysis programs that support custom user-defined operations. The success of the SQL query language has shown the advantages of declarative query specification, such as potential for optimization and ease of use. Today, most relational database management systems feature a query optimizer that compiles declarative queries into physical execution plans. Cost-based optimizers choose from billions of plan candidates the plan with the least estimated cost. However, traditional optimization techniques cannot be readily integrated into systems that aim to support novel data analysis use cases. For example, the use of user-defined functions (UDFs) can significantly limit the optimization potential of data analysis programs. Furthermore, lack of detailed data statistics is common when large amounts of unstructured data is analyzed. This leads to imprecise optimizer cost estimates, which can cause sub-optimal plan choices. In this thesis we address three challenges that arise in the context of specifying and optimizing data analysis programs. First, we propose a parallel programming model with declarative properties to specify data analysis tasks as data flow programs. In this model, data processing operators are composed of a system-provided second-order function and a user-defined first-order function. A cost-based optimizer compiles data flow programs specified in this abstraction into parallel data flows. The optimizer borrows techniques from relational optimizers and ports them to the domain of general-purpose parallel programming models. Second, we propose an approach to enhance the optimization of data flow programs that include UDF operators with unknown semantics. We identify operator properties and conditions to reorder neighboring UDF operators without changing the semantics of the program. We show how to automatically extract these properties from UDF operators by leveraging static code analysis techniques. Our approach is able to emulate relational optimizations such as filter and join reordering and holistic aggregation push-down while not being limited to relational operators. Finally, we analyze the impact of changing execution conditions such as varying predicate selectivities and memory budgets on the performance of relational query plans. 
We identify plan patterns that cause significantly varying execution performance for changing execution conditions. Plans that include such risky patterns are prone to cause problems in presence of imprecise optimizer estimates. Based on our findings, we introduce an approach to avoid risky plan choices. Moreover, we present a method to assess the risk of a query execution plan using a machine-learned prediction model. Experiments show that the prediction model outperforms risk predictions which are computed from optimizer estimates.
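    To make the operator model in the abstract concrete, below is a small, assumed Python sketch (not the thesis's actual API): an operator is a system-provided second-order function that applies a user-defined first-order function to each partition of a dataset. The names `map_operator` and `token_count` are hypothetical.

```python
# Illustrative sketch only: an "operator" as described above, i.e. a system-provided
# second-order function (map_operator) parameterized by a user-defined first-order
# function (the UDF). Names are hypothetical, not the thesis's API.
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")

def map_operator(udf: Callable[[T], U], partitions: List[List[T]]) -> List[List[U]]:
    """Second-order function: run the UDF over every element of each partition in parallel."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda part: [udf(x) for x in part], partitions))

# User-defined first-order function: opaque to the optimizer unless its code is analyzed.
def token_count(line: str) -> int:
    return len(line.split())

partitions = [["big data flows", "analytical data"], ["user defined functions"]]
print(map_operator(token_count, partitions))  # [[3, 2], [3]]
```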

  14. Data_Sheet_1_A parameter-optimization framework for neural decoding systems.pdf

    • frontiersin.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Jing Xie; Rong Chen; Shuvra S. Bhattacharyya (2023). Data_Sheet_1_A parameter-optimization framework for neural decoding systems.pdf [Dataset]. http://doi.org/10.3389/fninf.2023.938689.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Jing Xie; Rong Chen; Shuvra S. Bhattacharyya
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Real-time neuron detection and neural activity extraction are critical components of real-time neural decoding. They are modeled effectively in dataflow graphs. However, these graphs and the components within them in general have many parameters, including hyper-parameters associated with machine learning sub-systems. The dataflow graph parameters induce a complex design space, where alternative configurations (design points) provide different trade-offs involving key operational metrics including accuracy and time-efficiency. In this paper, we propose a novel optimization framework that automatically configures the parameters in different neural decoders. The proposed optimization framework is evaluated in depth through two case studies. Significant performance improvement in terms of accuracy and efficiency is observed in both case studies compared to the manual parameter optimization that was associated with the published results of those case studies. Additionally, we investigate the application of efficient multi-threading strategies to speed-up the running time of our parameter optimization framework. Our proposed optimization framework enables efficient and effective estimation of parameters, which leads to more powerful neural decoding capabilities and allows researchers to experiment more easily with alternative decoding models.
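    As a rough illustration of the kind of configuration search described above (not the authors' framework), the sketch below grid-searches two hypothetical dataflow-graph parameters and ranks design points by accuracy and runtime; `run_decoder` is a toy stand-in for the real decoding pipeline.

```python
# Illustrative sketch: exhaustively evaluate configurations of two hypothetical
# dataflow-graph parameters and pick the best accuracy/runtime trade-off.
# run_decoder is a toy stand-in for the real neural-decoding pipeline.
import itertools
import time

def run_decoder(threshold: float, window: int) -> float:
    """Placeholder evaluation returning a decoding 'accuracy' for a configuration."""
    return 1.0 - abs(threshold - 0.5) - 0.001 * window

search_space = {"threshold": [0.3, 0.5, 0.7], "window": [16, 32, 64]}

design_points = []
for threshold, window in itertools.product(*search_space.values()):
    start = time.perf_counter()
    accuracy = run_decoder(threshold, window)
    runtime = time.perf_counter() - start
    design_points.append({"threshold": threshold, "window": window,
                          "accuracy": accuracy, "runtime": runtime})

# One simple selection policy: highest accuracy, ties broken by lower runtime.
best = max(design_points, key=lambda p: (p["accuracy"], -p["runtime"]))
print("selected configuration:", best)
```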

  15. Data from: A Power-Aware, Self-Adaptive Macro Data Flow Framework

    • data.europa.eu
    • zenodo.org
    unknown
    Updated Jan 23, 2020
    Cite
    Zenodo (2020). A Power-Aware, Self-Adaptive Macro Data Flow Framework [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-1194485?locale=ro
    Explore at:
    Available download formats: unknown (11377329)
    Dataset updated
    Jan 23, 2020
    Dataset authored and provided by
    Zenodo, http://zenodo.org/
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: The dataflow programming model has been extensively used as an effective solution to implement efficient parallel programming frameworks. However, the amount of resources allocated to the runtime support is usually fixed once by the programmer or the runtime, and kept static during the entire execution. While there are cases where such a static choice may be appropriate, other scenarios may require to dynamically change the parallelism degree during the application execution. In this paper we propose an algorithm for multicore shared memory platforms, that dynamically selects the optimal number of cores to be used as well as their clock frequency according to either the workload pressure or to explicit user requirements. We implement the algorithm for both structured and unstructured parallel applications and we validate our proposal over three real applications, showing that it is able to save a significant amount of power, while not impairing the performance and not requiring additional effort from the application programmer. This dataset contains the raw data of the experiments and the scripts used to plot them.
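    A toy sketch of the self-adaptation idea described in the abstract is shown below; the paper's actual selection algorithm and clock-frequency scaling are not reproduced, and the thresholds and names are illustrative assumptions.

```python
# Toy sketch, not the paper's algorithm: adapt the number of workers to the
# observed backlog (workload pressure). Thresholds are illustrative; the clock
# frequency selection from the paper is omitted.
import queue

def adapt_parallelism(pending: "queue.Queue", workers: int,
                      min_workers: int = 1, max_workers: int = 8) -> int:
    backlog = pending.qsize()
    if backlog > 2 * workers and workers < max_workers:
        return workers + 1      # high pressure: add a worker/core
    if backlog < workers // 2 and workers > min_workers:
        return workers - 1      # low pressure: release a core to save power
    return workers

# Example: a backlog of 10 tasks with 4 workers triggers a scale-up decision.
q = queue.Queue()
for task in range(10):
    q.put(task)
print(adapt_parallelism(q, workers=4))  # -> 5
```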

  16. General data: Flows of self-employed workers by autonomous community. CODEM (API identifier: 49327)

    • datos.gob.es
    Updated Apr 5, 2023
    + more versions
    Cite
    Instituto Nacional de Estadística (2023). General data: Flows of self-employed workers by autonomous community. CODEM (API identifier: 49327) [Dataset]. https://datos.gob.es/en/catalogo/ea0010587-datos-generales-flujos-de-trabajadores-autonomos-por-comunidades-autonomas-codem-identificador-api-49327
    Explore at:
    Dataset updated
    Apr 5, 2023
    Dataset provided by
    National Statistics Institute, http://www.ine.es/
    Authors
    Instituto Nacional de Estadística
    License

    https://www.ine.es/aviso_legal

    Description

    Table of Experimental Statistics. General data: Flows of self-employed workers by autonomous community. Quarterly. Autonomous Communities and Cities. Company Demographic Profile

  17. An Efficient Multi-Topology Construction Method for Mobile Data Flows Scheduling in SDN

    • scidb.cn
    Updated Nov 22, 2024
    Cite
    Chi Zhang; Rui Han; Haojiang Deng (2024). An Efficient Multi-Topology Construction Method for Mobile Data Flows Scheduling in SDN [Dataset]. http://doi.org/10.57760/sciencedb.17235
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Chi Zhang; Rui Han; Haojiang Deng
    License

    https://api.github.com/licenses/unlicense

    Description

    data of the simulation.

  18. Stream Data Pipeline Processing Tool Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Dec 3, 2025
    Cite
    Archive Market Research (2025). Stream Data Pipeline Processing Tool Report [Dataset]. https://www.archivemarketresearch.com/reports/stream-data-pipeline-processing-tool-558610
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Dec 3, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming market for stream data pipeline processing tools! This in-depth analysis reveals a $15 billion market in 2025, projected to grow at 18% CAGR through 2033, driven by real-time analytics, cloud adoption, and IoT data explosion. Learn about key players, market trends, and regional insights.

  19. Stream Data Pipeline Processing Tool Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 15, 2025
    Cite
    Market Research Forecast (2025). Stream Data Pipeline Processing Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/stream-data-pipeline-processing-tool-35484
    Explore at:
    Available download formats: pdf, doc, ppt
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global stream data pipeline processing tool market is experiencing robust growth, driven by the exponential increase in real-time data generation across diverse sectors. The market, estimated at $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033, reaching approximately $50 billion by 2033. This expansion is fueled by the rising adoption of cloud-native architectures, the proliferation of IoT devices generating massive streaming data, and the increasing need for real-time analytics and decision-making capabilities across industries like finance (high-frequency trading, fraud detection), security (intrusion detection, threat intelligence), and others. The demand for sophisticated tools capable of handling high-volume, high-velocity data streams is paramount, leading to innovation in areas such as optimized data ingestion, processing, and storage solutions. Key players are strategically investing in advanced technologies like AI and machine learning to enhance the efficiency and analytical power of their offerings. The market is segmented by application (Finance, Security, and others), and tool type (real-time, proprietary, and cloud-native). The cloud-native segment is demonstrating the fastest growth due to its scalability and cost-effectiveness. While the North American market currently holds a significant share, regions like Asia-Pacific are exhibiting rapid growth, driven by increasing digitalization and technological adoption. Competition is intense, with established tech giants alongside specialized vendors vying for market dominance. Challenges include data security concerns, the need for skilled professionals, and the complexities of integrating these tools into existing infrastructure. The market's growth trajectory is further influenced by several key trends, including the increasing adoption of serverless architectures, the rise of edge computing, and the growing popularity of event-driven architectures. These trends enable organizations to process data closer to its source, reducing latency and enhancing real-time response capabilities. Furthermore, the integration of advanced analytics and machine learning capabilities into stream data pipeline processing tools is enhancing their value proposition by providing actionable insights from real-time data. However, the market faces certain restraints, such as the high initial investment costs associated with implementing these tools and the need for robust data governance frameworks to ensure data security and compliance. Despite these challenges, the overall market outlook remains positive, promising substantial growth opportunities for established and emerging players alike.

  20. European Users Energy Consumption

    • kaggle.com
    zip
    Updated Apr 19, 2024
    Cite
    Elahe Sarlakian (2024). European Users Energy Consumption [Dataset]. https://www.kaggle.com/datasets/elahesarlakian/europe-energy-consumption-by-users
    Explore at:
    Available download formats: zip (12676 bytes)
    Dataset updated
    Apr 19, 2024
    Authors
    Elahe Sarlakian
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The indicator measures the energy end-use in a country excluding all non-energy use of energy carriers (e.g. natural gas used not for combustion but for producing chemicals). “Final energy consumption” only covers the energy consumed by end users, such as industry, transport, households, services and agriculture; it excludes energy consumption of the energy sector itself and losses occurring during transformation and distribution of energy.

    The dataset contains the following columns: DATAFLOW: This column identifies the data flow, i.e. the source dataset or data category that each observation belongs to.

    In this case, the value "ESTAT:SDG_07_11 (1.0)" in the DATAFLOW column seems to be a specific code or identifier that represents the type of data flow or category of data within the dataset. Let's break down the components of this value:

    1. ESTAT: The agency code identifying the organization providing the data; here it refers to Eurostat, the statistical office of the European Union.

    2. SDG_07_11: The identifier of the specific dataset, in this case Eurostat's Sustainable Development Goal 7 (Affordable and Clean Energy) indicator 07_11, which reports the final energy consumption measure described above.

    3. (1.0): The version number of the data flow, indicating which iteration of the dataset structure the observations follow.

    Overall, the value "ESTAT:SDG_07_11 (1.0)" identifies the Eurostat data flow for the Sustainable Development Goal 7 indicator on final energy consumption. Understanding these identifiers helps you categorize and analyze the data within the context of energy consumption and the sustainable development goals.

    LAST UPDATE: This column contains the date when the data was last updated or modified.

    freq: This column indicates the frequency of data collection or reporting (daily, monthly, quarterly, annual, etc.), i.e. the time interval at which the data is recorded.

    unit: This column specifies the unit of measurement for the data values; for example, I05 is Eurostat's code for an index with the base year 2005 = 100.

    geo: This column gives the geographical location associated with the data; values such as AL (Albania) and AT (Austria) identify countries or regions.

    TIME_PERIOD: This column contains the time period to which each observation corresponds; in this dataset, years.

    OBS_VALUE: This column contains the observed or recorded values related to energy consumption. It represents the actual numerical data points in the dataset.
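    Given the column layout described above, a short, assumed pandas sketch for loading and slicing the file might look like this (the CSV file name is a placeholder):

```python
# Hedged sketch: read the Eurostat-style columns listed above with pandas.
# The CSV file name is an assumption; column names follow the description.
import pandas as pd

df = pd.read_csv("europe_energy_consumption.csv")

# Final-energy-consumption observations for one country, ordered by year.
austria = df.loc[df["geo"] == "AT", ["TIME_PERIOD", "OBS_VALUE", "unit"]]
print(austria.sort_values("TIME_PERIOD").head())
```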
