100+ datasets found
  1. A generalized residual technique for analyzing complex movement models using...

    • dataone.org
    • data.niaid.nih.gov
    • +3 more
    Updated Jun 1, 2025
    Cite
    Jonathan R. Potts; Marie Auger-Méthé; Karl Mokross; Mark A. Lewis (2025). A generalized residual technique for analyzing complex movement models using earth mover's distance [Dataset]. http://doi.org/10.5061/dryad.9h42f
    Dataset updated
    Jun 1, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Jonathan R. Potts; Marie Auger-Méthé; Karl Mokross; Mark A. Lewis
    Time period covered
    Aug 18, 2015
    Description
    1. Complex systems of moving and interacting objects are ubiquitous in the natural and social sciences. Predicting their behavior often requires models that mimic these systems with sufficient accuracy, while accounting for their inherent stochasticity. Though tools exist to determine which of a set of candidate models is best relative to the others, there is currently no generic goodness-of-fit framework for testing how close the best model is to the real complex stochastic system.
    2. We propose such a framework, using a novel application of the Earth mover's distance, also known as the Wasserstein metric. It is applicable to any stochastic process where the probability of the model's state at time t is a function of the state at previous times. It generalizes the concept of a residual, often used to analyze 1D summary statistics, to situations where the complexity of the underlying model's probability distribution makes standard residual analysis too imprecise for practical use.
    3. We...
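    As a rough illustration of the distance named in the description, the sketch below computes the one-dimensional earth mover's (Wasserstein-1) distance between two equal-sized samples in plain Python. It is not the authors' method, which the truncated description does not spell out; the function name `emd_1d` and the sample values are hypothetical.

```python
def emd_1d(xs, ys):
    """1D earth mover's (Wasserstein-1) distance between two equal-sized samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # For equal-sized empirical samples, the optimal transport plan
    # pairs the sorted values of one sample with those of the other.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical values, chosen only to make the result easy to check by hand.
observed = [0.0, 1.0, 3.0]
simulated = [5.0, 6.0, 8.0]
print(emd_1d(observed, simulated))
```

    Every simulated value here is the matching observed value shifted by 5, so the distance is 5.0. A larger value means the simulated sample sits further from the observed one.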
  2. Code and data for "Analysis of complex multidimensional optical spectra by...

    • catalog.data.gov
    • data.nist.gov
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). Code and data for "Analysis of complex multidimensional optical spectra by linear prediction" [Dataset]. https://catalog.data.gov/dataset/code-and-data-for-analysis-of-complex-multidimensional-optical-spectra-by-linear-predictio
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Python code for analyzing optical 2D spectroscopy data using linear prediction from singular value decomposition. Included as supplemental information with a publication on this topic. Includes scripts and associated data to generate the figures in the paper.

  3. List of networks.

    • plos.figshare.com
    xls
    Updated May 13, 2025
    Cite
    Sergey Shvydun (2025). List of networks. [Dataset]. http://doi.org/10.1371/journal.pcsy.0000042.t001
    Available download formats: xls
    Dataset updated
    May 13, 2025
    Dataset provided by
    PLOS Complex Systems
    Authors
    Sergey Shvydun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The concept of centrality is one of the essential tools for analyzing complex systems. Over the years, a large number of centrality indices have been proposed that account for different aspects of a network. Unfortunately, most real networks are substantially incomplete, which affects the results of the centrality measures. This article aims to evaluate the sensitivity of 16 centrality measures to the presence of errors or incomplete information about the structure of a complex network. Our experiments are performed across 113 empirical networks. As a result, we identify centrality indices that are highly vulnerable to incomplete data.

  4. Big Data Analysis Platform Market Report | Global Forecast From 2025 To 2033...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Big Data Analysis Platform Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-big-data-analysis-platform-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Big Data Analysis Platform Market Outlook



    The global market size for Big Data Analysis Platforms is projected to grow from USD 35.5 billion in 2023 to an impressive USD 110.7 billion by 2032, reflecting a CAGR of 13.5%. This substantial growth can be attributed to the increasing adoption of data-driven decision-making processes across various industries, the rapid proliferation of IoT devices, and the ever-growing volumes of data generated globally.
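    The growth rate quoted above can be checked with a short calculation. The sketch below is a minimal example, assuming the standard compound annual growth rate formula and nine compounding years (2023 to 2032). The report does not state how it counts periods, and the function name `cagr` is hypothetical.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.135 means 13.5%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the description: USD 35.5 billion (2023) to
# USD 110.7 billion (2032). The 9-year span is an assumption.
rate = cagr(35.5, 110.7, 9)
print(f"{rate:.1%}")
```

    Under these assumptions the result is close to the 13.5% stated in the report.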



    One of the primary growth factors for the Big Data Analysis Platform market is the escalating need for businesses to derive actionable insights from complex and voluminous datasets. With the advent of technologies such as artificial intelligence and machine learning, organizations are increasingly leveraging big data analytics to enhance their operational efficiency, customer experience, and competitiveness. The ability to process vast amounts of data quickly and accurately is proving to be a game-changer, enabling businesses to make more informed decisions, predict market trends, and optimize their supply chains.



    Another significant driver is the rise of digital transformation initiatives across various sectors. Companies are increasingly adopting digital technologies to improve their business processes and meet changing customer expectations. Big Data Analysis Platforms are central to these initiatives, providing the necessary tools to analyze and interpret data from diverse sources, including social media, customer transactions, and sensor data. This trend is particularly pronounced in sectors such as retail, healthcare, and BFSI (banking, financial services, and insurance), where data analytics is crucial for personalizing customer experiences, managing risks, and improving operational efficiencies.



    Moreover, the growing adoption of cloud computing is significantly influencing the market. Cloud-based Big Data Analysis Platforms offer several advantages over traditional on-premises solutions, including scalability, flexibility, and cost-effectiveness. Businesses of all sizes are increasingly turning to cloud-based analytics solutions to handle their data processing needs. The ability to scale up or down based on demand, coupled with reduced infrastructure costs, makes cloud-based solutions particularly appealing to small and medium-sized enterprises (SMEs) that may not have the resources to invest in extensive on-premises infrastructure.



    Data Science and Machine-Learning Platforms play a pivotal role in the evolution of Big Data Analysis Platforms. These platforms provide the necessary tools and frameworks for processing and analyzing vast datasets, enabling organizations to uncover hidden patterns and insights. By integrating data science techniques with machine learning algorithms, businesses can automate the analysis process, leading to more accurate predictions and efficient decision-making. This integration is particularly beneficial in sectors such as finance and healthcare, where the ability to quickly analyze complex data can lead to significant competitive advantages. As the demand for data-driven insights continues to grow, the role of data science and machine-learning platforms in enhancing big data analytics capabilities is becoming increasingly critical.



    From a regional perspective, North America currently holds the largest market share, driven by the presence of major technology companies, high adoption rates of advanced technologies, and substantial investments in data analytics infrastructure. Europe and the Asia Pacific regions are also experiencing significant growth, fueled by increasing digitalization efforts and the rising importance of data analytics in business strategy. The Asia Pacific region, in particular, is expected to witness the highest CAGR during the forecast period, propelled by rapid economic growth, a burgeoning middle class, and increasing internet and smartphone penetration.



    Component Analysis



    The Big Data Analysis Platform market can be broadly categorized into three components: Software, Hardware, and Services. The software segment includes analytics software, data management software, and visualization tools, which are crucial for analyzing and interpreting large datasets. This segment is expected to dominate the market due to the continuous advancements in analytics software and the increasing need for sophisticated data analysis tools. Analytics software enables organizations to process and analyze data from multiple sources,

  5. Statistical results of classification using different methods.

    • figshare.com
    xls
    Updated Feb 7, 2025
    Cite
    Guanqun Wang (2025). Statistical results of classification using different methods. [Dataset]. http://doi.org/10.1371/journal.pone.0318519.t002
    Available download formats: xls
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Guanqun Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistical results of classification using different methods.

  6. List of modifications in the structure of a network.

    • plos.figshare.com
    xls
    Updated May 13, 2025
    Cite
    Sergey Shvydun (2025). List of modifications in the structure of a network. [Dataset]. http://doi.org/10.1371/journal.pcsy.0000042.t002
    Available download formats: xls
    Dataset updated
    May 13, 2025
    Dataset provided by
    PLOS Complex Systems
    Authors
    Sergey Shvydun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of modifications in the structure of a network.

  7. Dataset of books called Risk analysis of complex and uncertain systems

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Risk analysis of complex and uncertain systems [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Risk+analysis+of+complex+and+uncertain+systems
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Risk analysis of complex and uncertain systems. It features 7 columns including author, publication date, language, and book publisher.

  8. ‘Cognitive modeling of complex systems’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Cognitive modeling of complex systems’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-cognitive-modeling-of-complex-systems-2670/latest
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Cognitive modeling of complex systems’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vbmokin/cognitive-modeling-of-complex-systems on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    The dataset contains examples of weights and descriptions of cognitive maps of complex systems for cognitive modeling in various fields

    Content

    Each cognitive map is described by two files: one contains the weights of its edges, and the other (with a similar name) contains the names of the vertices and their types (input, target, or state variable; by default, any vertex can serve as input or target in turn).

    Acknowledgements

    Examples of cognitive modeling techniques are described in my articles (with co-authors):
    - https://visnyk.vntu.edu.ua/index.php/visnyk/article/view/2238/2195 (from there, a picture was used for the dataset logo)
    - https://ieeexplore.ieee.org/document/8100371

    Thanks to the articles I refer to in my articles. Thanks for the article, where there is a simple and clear example - I use it in my baseline notebook.

    Inspiration

    I invite you to develop notebooks that allow cognitive modeling of complex systems (playing scenarios for their development, optimization of modes, etc.) using graph theory.

    --- Original source retains full ownership of the source dataset ---

  9. ‘Capitol Complex Renewable Energy Sources - Cumulative Total’ analyzed by...

    • analyst-2.ai
    Updated Jan 26, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Capitol Complex Renewable Energy Sources - Cumulative Total’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-capitol-complex-renewable-energy-sources-cumulative-total-6fc1/59bb3de1/?iid=000-346&v=presentation
    Dataset updated
    Jan 26, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Capitol Complex Renewable Energy Sources - Cumulative Total’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/3b11da61-01be-4d67-b5ca-af5cf3751ac4 on 26 January 2022.

    --- Dataset description provided by original source is as follows ---

    Running total of renewable energy sources in the State of Oklahoma Capitol complex.

    --- Original source retains full ownership of the source dataset ---

  10. Data Science Tool Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data Science Tool Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-science-tool-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Science Tool Market Outlook




    The global data science tool market size was valued at approximately USD 7.9 billion in 2023 and is projected to reach USD 29.8 billion by 2032, growing at a compound annual growth rate (CAGR) of 15.8% during the forecast period. This impressive growth is primarily driven by the escalating adoption of data science tools across various industries, driven by the need for data-driven decision making, advancements in machine learning and artificial intelligence, and an increasing amount of data generated worldwide.




    One of the significant growth factors for the data science tool market is the rising demand for big data analytics. Organizations across different sectors are increasingly recognizing the value of data analytics to gain insights, improve customer experience, and enhance operational efficiency. The surge in data generation, propelled by the proliferation of digital devices and social media, has necessitated the adoption of sophisticated data science tools to handle and analyze large datasets effectively. This growing reliance on data-driven decision-making is a key driver boosting the market growth.




    Another vital factor contributing to the market expansion is the advancements in artificial intelligence (AI) and machine learning (ML) technologies. Modern data science tools leverage AI and ML to offer advanced analytics capabilities, enabling organizations to predict trends, automate processes, and make more informed decisions. The continuous development in AI algorithms and the integration of these technologies into data science tools have significantly enhanced their capabilities, making them indispensable for businesses aiming to stay competitive in today's digital landscape.




    The increasing application of data science tools in various industries such as healthcare, finance, retail, manufacturing, and IT & telecommunications further propels market growth. In healthcare, data science tools are used for predictive analytics, patient care optimization, and operational efficiency. Financial institutions utilize these tools for risk management, fraud detection, and customer analytics. Similarly, in retail and e-commerce, data science tools are employed for inventory management, customer segmentation, and personalized marketing. The broadening scope of applications across different sectors underscores the growing importance of data science tools.




    From a regional perspective, North America holds the largest market share in the data science tool market, driven by the presence of major technology companies, high adoption rates of advanced technologies, and significant investments in AI and big data analytics. Europe follows closely, with increasing digital transformation initiatives and government support for data-driven innovations. The Asia Pacific region is anticipated to witness the highest growth rate during the forecast period, fueled by rapid industrialization, expanding IT sector, and growing awareness about the benefits of data analytics among businesses.



    The advent of AI data analysis tools has revolutionized the way businesses approach data analytics. These tools are designed to process and analyze vast amounts of data with remarkable speed and accuracy, enabling organizations to derive actionable insights in real time. By leveraging artificial intelligence, these tools can identify patterns and trends that might be missed by traditional data analysis methods. This capability is particularly beneficial for industries that rely heavily on data-driven decision-making, such as finance, healthcare, and retail. As businesses continue to generate more data, the demand for AI-powered data analysis tools is expected to grow, driving further innovation and development in this field.



    Component Analysis




    The data science tool market is segmented by component into software and services. The software segment includes a wide array of tools such as data preparation tools, data mining tools, data visualization tools, and predictive analytics tools. These software solutions are designed to assist data scientists and analysts in processing and analyzing complex data sets. The growing need for advanced data analytics solutions to manage and analyze large volumes of data is driving the demand for these software tools. The continuous innovation in software functionalities and the integrati

  11. Replication Data for: Online Coders, Open Codebooks: New Opportunities for...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Winter, Nicholas J. G. (2023). Replication Data for: Online Coders, Open Codebooks: New Opportunities for Content Analysis of Political Communication [Dataset]. http://doi.org/10.7910/DVN/LWWGKY
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Winter, Nicholas J. G.
    Description

    Analyzing audiovisual communication is challenging because its content is highly symbolic and less rule-governed than verbal material. But audiovisual messages are important to understand: they amplify, enrich, or complicate the content of textual information. To address these measurement challenges, we describe a fully reproducible approach to analyzing video content using minimally – but systematically – trained online workers. By aggregating the work of multiple coders, the online approach achieves reliability, validity, and costs that equal those of traditional, intensively trained research assistants, with much greater speed, transparency, and replicability. We argue that measurement strategies relying on the “wisdom of the crowd” provide unique advantages for researchers analyzing complex and intricate audiovisual political content.

  12. Visual Data Analysis Tool Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Visual Data Analysis Tool Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-visual-data-analysis-tool-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Visual Data Analysis Tool Market Outlook



    The global visual data analysis tool market size was valued at USD 4.2 billion in 2023 and is projected to reach USD 9.8 billion by 2032, growing at a CAGR of 9.6% during the forecast period. This robust growth is driven by the increasing demand for data-driven decision-making processes across various industries and the adoption of advanced analytical tools that offer visual insights into complex datasets. The market is benefiting from advancements in artificial intelligence and machine learning, which enhance the capabilities of these tools, making them more intuitive and powerful.



    One of the primary growth factors for the visual data analysis tool market is the exponential growth of data generation across industries. With the advent of IoT, social media, and digital transactions, enterprises are inundated with vast amounts of data. The need to transform this data into actionable insights has propelled the adoption of visual data analysis tools. These tools simplify the interpretation of complex data sets, enabling businesses to make quicker and more informed decisions, thereby driving business growth and operational efficiency.



    Another significant driver is the increasing emphasis on data democratization within organizations. There is a growing trend to make data accessible to a broader audience within the enterprise, not just to data scientists or IT departments. Visual data analysis tools provide user-friendly interfaces that allow non-technical users to engage with data, fostering a data-driven culture across the organization. This democratization is crucial for enhancing collaborative decision-making and fostering innovation, which in turn fuels market growth.



    The integration of advanced technologies such as AI and machine learning into visual data analysis tools has further accelerated market growth. These technologies enable predictive analytics, real-time data processing, and automated insights generation, significantly enhancing the value provided by these tools. As businesses increasingly seek to leverage AI and machine learning for competitive advantage, the demand for sophisticated visual data analysis tools is expected to rise. This adoption is particularly prominent in sectors like healthcare, finance, and retail, where data plays a crucial role in strategic planning and operational efficiency.



    The rise of Big Data Analytics Tools has significantly influenced the landscape of visual data analysis. These tools are designed to handle and process massive datasets, providing businesses with the ability to uncover hidden patterns and insights that were previously inaccessible. By integrating Big Data Analytics Tools with visual data analysis platforms, organizations can enhance their decision-making processes, enabling them to respond more swiftly to market changes and customer demands. This integration not only improves the accuracy of data-driven decisions but also empowers businesses to leverage predictive analytics for future planning. Industries such as healthcare, finance, and retail are particularly benefiting from this synergy, as they can now analyze complex datasets in real-time, leading to improved operational efficiency and strategic planning.



    From a regional perspective, North America holds a significant share of the visual data analysis tool market, driven by the presence of major technology companies and high adoption rates of advanced analytics tools. The region's strong technological infrastructure and focus on innovation contribute to its market leadership. Meanwhile, the Asia Pacific region is expected to witness the highest growth rate during the forecast period. The rapid digital transformation in countries like China and India, coupled with increased investments in big data and analytics, is driving the demand for visual data analysis tools in this region. Europe also represents a substantial market share, driven by regulatory pressures and the need for efficient data management solutions in various industries.



    Component Analysis



    The visual data analysis tool market is segmented by component into software, hardware, and services. The software segment dominates the market, driven by the continuous innovation and development of advanced analytical tools. These software solutions are designed to handle large datasets and provide intuitive visualizations that help users derive meaningful insights. The increasing adoption of cloud-based analytics solutions is also

  13. IPD Meta-Analysis of Complex Survey Data: A Simulation Study

    • search.gesis.org
    • datacatalogue.cessda.eu
    • +3 more
    Updated Dec 18, 2018
    Cite
    Haensch, Anna-Carolina; Weiß, Bernd (2018). IPD Meta-Analysis of Complex Survey Data: A Simulation Study [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-1799
    Dataset updated
    Dec 18, 2018
    Dataset provided by
    GESIS, Köln
    GESIS search
    Authors
    Haensch, Anna-Carolina; Weiß, Bernd
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    Replication files for the article "IPD Meta-Analysis of Complex Survey Data: A Simulation Study"

  14. TexBiG Dataset for Analysing Complex Document Layouts in the Digital...

    • data.niaid.nih.gov
    • doi.org
    • +1 more
    Updated Sep 27, 2023
    Cite
    Tschirschwitz, David (2023). TexBiG Dataset for Analysing Complex Document Layouts in the Digital Humanities [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6885143
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    Rodehorst, Volker
    Tschirschwitz, David
    Stein, Benno
    Klemstein, Franziska
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the second version of the dataset for the paper "A Dataset for Analysing Complex Document Layouts in the Digital Humanities and its Evaluation with Krippendorff’s Alpha". It contains an update of the test images (without annotations) from the paper "Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels". The organization of the dataset has also been updated to make it easier to use.

    TexBiG (from the German Text-Bild-Gefüge, meaning Text-Image-Structure) is a document layout analysis dataset for historical documents from the late 19th and early 20th centuries. The dataset provides instance segmentation annotations (bounding boxes and polygons/masks) for 19 different classes with more than 52,000 instances. The added test images can be used to make submissions to the leaderboard on EvalAI.

    The annotation guideline can be found in the first version of the dataset.

  15. 2015-H_Sayama-Intro to Modeling and Analysis of Complex Systems

    • qubeshub.org
    Updated Apr 1, 2023
    Cite
    H Sayama (2023). 2015-H_Sayama-Intro to Modeling and Analysis of Complex Systems [Dataset]. http://doi.org/10.25334/E4QB-1F11
    Dataset updated
    Apr 1, 2023
    Dataset provided by
    QUBES
    Authors
    H Sayama
    Description

    Introduces students to mathematical/computational modeling and analysis developed in the emerging interdisciplinary field of Complex Systems Science.

  16. Data from: Using Machine Learning to Analyze Molecular Dynamics Simulations...

    • acs.figshare.com
    • figshare.com
    zip
    Updated May 27, 2025
    Cite
    Alfie-Louise R. Brownless; Elisa Rheaume; Katie M. Kuo; Shina C. L. Kamerlin; James C. Gumbart (2025). Using Machine Learning to Analyze Molecular Dynamics Simulations of Biomolecules [Dataset]. http://doi.org/10.1021/acs.jpcb.4c08824.s001
    Available download formats: zip
    Dataset updated
    May 27, 2025
    Dataset provided by
    ACS Publications
    Authors
    Alfie-Louise R. Brownless; Elisa Rheaume; Katie M. Kuo; Shina C. L. Kamerlin; James C. Gumbart
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Machine learning (ML) techniques have become powerful tools in both industrial and academic settings. Their ability to facilitate analysis of complex data and generation of predictive insights is transforming how scientific problems are approached across a wide range of disciplines. In this tutorial, we present a cursory introduction to three widely used ML techniques (logistic regression, random forest, and multilayer perceptron) applied toward analyzing molecular dynamics (MD) trajectory data. We apply our chosen ML models to the study of the SARS-CoV-2 spike protein receptor binding domain interacting with the receptor ACE2. We develop a pipeline for processing MD simulation trajectory data and identifying residues that significantly impact the stability of the complex.

  17. ‘US Health Insurance Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 29, 2020
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘US Health Insurance Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-us-health-insurance-dataset-920a/latest
    Explore at:
    Dataset updated
    Feb 29, 2020
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘US Health Insurance Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/teertha/ushealthinsurancedataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    The venerable insurance industry is no stranger to data-driven decision making. Yet in today's rapidly transforming digital landscape, insurance is struggling to adapt to and benefit from new technologies compared to other industries, even within the BFSI sphere (for example, compared to the banking sector). Extremely complex underwriting rule-sets that differ radically across product lines, many non-KYC environments lacking a centralized customer information base, a complex relationship with consumers in traditional risk underwriting where customer centricity sometimes runs counter to business profit, and the inertia of regulatory compliance are some of the unique challenges faced by the insurance business.

    Despite this, emergent technologies like AI and blockchain have brought a radical change to insurance, and data analytics sits at the core of this transformation. We can identify four key factors behind the emergence of analytics as a crucial part of InsurTech:

    • Big Data: The explosion of unstructured data in the form of images, videos, text, emails, and social media
    • AI: The recent advances in Machine Learning and Deep Learning that enable businesses to gain insight, do predictive analytics, and build cost- and time-efficient innovative solutions
    • Real-time Processing: The ability to process information in real time through various data feeds (e.g., social media, news)
    • Increased Computing Power: A complex ecosystem of new analytics vendors and solutions that enable carriers to combine data sources, external insights, and advanced modeling techniques in order to glean insights that were not possible before.

    This dataset can support a simple yet illuminating study of risk underwriting in health insurance: the interplay of the insured's various attributes and how they affect the insurance premium.

    Content

    This dataset contains 1338 rows of insured data, where the Insurance charges are given against the following attributes of the insured: Age, Sex, BMI, Number of Children, Smoker and Region. There are no missing or undefined values in the dataset.

    Inspiration

    This relatively simple dataset should be an excellent starting point for EDA, Statistical Analysis and Hypothesis testing and training Linear Regression models for predicting Insurance Premium Charges.

    Proposed Tasks:

    • Exploratory Data Analytics
    • Statistical hypothesis testing
    • Statistical Modeling
    • Linear Regression

    --- Original source retains full ownership of the source dataset ---
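
    The linear regression task proposed above can be sketched in a few lines. This is a minimal illustration, not the source's method: it fits charges against a single attribute (age) with the closed-form least-squares formulas, the function name `fit_simple_ols` is hypothetical, and the five data points are synthetic placeholders rather than rows from the dataset.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and the cross-deviation sum of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic placeholder values (assumption): NOT taken from the dataset.
ages = [20, 30, 40, 50, 60]
charges = [3000.0, 5000.0, 7000.0, 9000.0, 11000.0]

intercept, slope = fit_simple_ols(ages, charges)
print(f"charges = {intercept:.1f} + {slope:.1f} * age")
```

    With the real data, the remaining attributes (Sex, BMI, Number of Children, Smoker, Region) would call for a multiple regression with encoded categorical variables, which a statistics library handles more conveniently than this single-predictor sketch.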

  18. Immersive Analytics Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Immersive Analytics Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/immersive-analytics-market
    Explore at:
    pptx, pdf, csv Available download formats
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Immersive Analytics Market Outlook



    The global immersive analytics market size was valued at USD 1.5 billion in 2023 and is expected to reach USD 7.8 billion by 2032, growing at a compound annual growth rate (CAGR) of 20.1% during the forecast period. The rapid growth of the market is fueled by the increasing demand for more intuitive and interactive data visualization tools that can enhance decision-making processes across various industries. This surge is further driven by advancements in augmented reality (AR), virtual reality (VR), and mixed reality (MR) technologies, which provide more engaging and effective ways to analyze complex data sets.
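
    As a quick arithmetic check, the quoted growth rate can be recomputed from the two market sizes in the paragraph above. The sketch assumes the growth compounds over the nine years from 2023 to 2032 and uses the standard compound annual growth rate formula; the function name `cagr` is illustrative and does not come from the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.20 means 20%)."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the report text: USD 1.5 billion (2023), USD 7.8 billion (2032).
# Assumption: nine compounding periods between the two years.
rate = cagr(1.5, 7.8, 2032 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

    Under that nine-year assumption the implied rate rounds to the 20.1% stated in the report, so the three quoted figures are mutually consistent.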



    One of the key growth factors of the immersive analytics market is the heightened need for data-driven decision-making in business environments. Organizations are increasingly recognizing the importance of leveraging data to gain competitive advantages, making immersive analytics an attractive solution for visualizing complex datasets. This need is particularly pronounced in sectors such as healthcare and financial services, where nuanced data interpretation can have significant impacts on outcomes and profitability. As a result, companies are investing heavily in immersive analytics solutions to enhance their analytical capabilities.



    Another driving force is the advancement of AR and VR technologies, which have significantly evolved in recent years. These technologies are now more accessible and cost-effective, allowing a broader range of industries to adopt immersive analytics solutions. The integration of AR and VR with big data analytics enables users to interact with data in a more meaningful way, leading to better insights and more informed decision-making. The development of more sophisticated and user-friendly hardware such as VR headsets and AR glasses also contributes to the broader adoption of immersive analytics.



    Additionally, the growing demand for customized and real-time analytics is propelling the market forward. Businesses are seeking more tailored analytics solutions that can provide insights specific to their unique operational challenges and objectives. Immersive analytics platforms offer the ability to delve deeper into data sets in real-time, providing dynamic insights that traditional analytics tools cannot match. This capability is essential for sectors like retail and manufacturing, where real-time data can drive more efficient operations and improved customer experiences.



    Immersive Media Solutions are becoming an integral part of the immersive analytics landscape, offering businesses innovative ways to engage with data. These solutions leverage cutting-edge technologies such as AR and VR to create compelling and interactive environments where data can be visualized and analyzed in real-time. By transforming static data into dynamic visual experiences, immersive media solutions enable users to gain deeper insights and make more informed decisions. This is particularly beneficial in industries where understanding complex data sets is crucial, such as healthcare and finance. As the demand for more engaging data visualization tools grows, immersive media solutions are poised to play a pivotal role in shaping the future of analytics.



    From a regional perspective, North America currently dominates the immersive analytics market due to its advanced technological infrastructure and high adoption rates of AR and VR technologies. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period. This growth is attributed to the rapid digital transformation, increasing investments in AR and VR technologies, and the rising number of tech-savvy consumers in countries such as China, Japan, and India. The European market is also poised for substantial growth due to the region's focus on innovation and the presence of several leading technology firms.



    Component Analysis



    The immersive analytics market can be segmented by component into software, hardware, and services. The software segment constitutes various platforms and tools that enable immersive data visualization and interaction. This segment is anticipated to hold the largest market share due to the continuous development of advanced analytics software that can seamlessly integrate with AR and VR technologies. Software solutions are becoming increasingly sophisticated, offering more features such as real-time analytics, predictive modeling, and personalized dashboards.


  19. Graphic Novel Character Networks and Statistics

    • data.niaid.nih.gov
    Updated Oct 1, 2024
    Cite
    Vincent Labatut (2024). Graphic Novel Character Networks and Statistics [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6395874
    Explore at:
    Dataset updated
    Oct 1, 2024
    Dataset authored and provided by
    Vincent Labatut
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description. This dataset contains the character networks extracted from the graphic novel Thorgal, as well as the statistics and plots produced when analyzing these networks.

    Source code. The source code used to produce these files is available on GitHub: https://github.com/CompNet/NaNet

    Citation. If you use these data, please cite the following article:

    V. Labatut, “Complex Network Analysis of a Graphic Novel: The Case of the Bande Dessinée Thorgal,” Advances in Complex Systems, p. 22400033, 2022. ⟨hal-03694768⟩ - DOI: 10.1142/S0219525922400033

    @Article{Labatut2022,
      author  = {Labatut, Vincent},
      title   = {Complex Network Analysis of a Graphic Novel: The Case of the Bande Dessinée {T}horgal},
      journal = {Advances in Complex Systems},
      year    = {2022},
      volume  = {25},
      number  = {5&6},
      pages   = {2240003},
      doi     = {10.1142/S0219525922400033},
    }

  20. Data_Sheet_1_Analyzing Complex Longitudinal Data in Educational Research: A...

    • frontiersin.figshare.com
    docx
    Updated Jun 2, 2023
    Cite
    Oi-Man Kwok; Mark Hok-Chio Lai; Fuhui Tong; Rafael Lara-Alecio; Beverly Irby; Myeongsun Yoon; Yu-Chen Yeh (2023). Data_Sheet_1_Analyzing Complex Longitudinal Data in Educational Research: A Demonstration With Project English Language and Literacy Acquisition (ELLA) Data Using xxM.docx [Dataset]. http://doi.org/10.3389/fpsyg.2018.00790.s001
    Explore at:
    docx Available download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Frontiers
    Authors
    Oi-Man Kwok; Mark Hok-Chio Lai; Fuhui Tong; Rafael Lara-Alecio; Beverly Irby; Myeongsun Yoon; Yu-Chen Yeh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    When analyzing complex longitudinal data, especially data from different educational settings, researchers generally focus only on the mean part (i.e., the regression coefficients), ignoring the equally important random part (i.e., the random effect variances) of the model. By using Project English Language and Literacy Acquisition (ELLA) data, we demonstrated the importance of taking the complex data structure into account by carefully specifying the random part of the model, showing that not only can it affect the variance estimates, the standard errors, and the tests of significance of the regression coefficients, it also can offer different perspectives of the data, such as information related to the developmental process. We used xxM (Mehta, 2013), which can flexibly estimate different grade-level variances separately and the potential carryover effect from each grade factor to the later time measures. Implications of the findings and limitations of the study are discussed.

Jonathan R. Potts; Marie Auger-Méthé; Karl Mokross; Mark A. Lewis (2025). A generalized residual technique for analyzing complex movement models using earth mover's distance [Dataset]. http://doi.org/10.5061/dryad.9h42f

A generalized residual technique for analyzing complex movement models using earth mover's distance

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 1, 2025
Dataset provided by
Dryad Digital Repository
Authors
Jonathan R. Potts; Marie Auger-Méthé; Karl Mokross; Mark A. Lewis
Time period covered
Aug 18, 2015
Description
  1. Complex systems of moving and interacting objects are ubiquitous in the natural and social sciences. Predicting their behavior often requires models that mimic these systems with sufficient accuracy, while accounting for their inherent stochasticity. Though tools exist to determine which of a set of candidate models is best relative to the others, there is currently no generic goodness-of-fit framework for testing how close the best model is to the real complex stochastic system.
  2. We propose such a framework, using a novel application of the Earth mover's distance, also known as the Wasserstein metric. It is applicable to any stochastic process where the probability of the model's state at time t is a function of the state at previous times. It generalizes the concept of a residual, often used to analyze 1D summary statistics, to situations where the complexity of the underlying model's probability distribution makes standard residual analysis too imprecise for practical use.
  3. We...
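
For readers unfamiliar with the Earth mover's distance, the simplest special case can be sketched as follows. This is not the authors' generalized residual technique, only the one-dimensional distance between two empirical samples of equal size, where the optimal transport plan pairs the sorted values. The function name `emd_1d` and the sample values are hypothetical.

```python
def emd_1d(sample_a, sample_b):
    """Earth mover's (Wasserstein-1) distance between two equal-size 1D samples.

    For equal-size empirical samples with uniform weights, the optimal
    transport plan matches the i-th smallest value of one sample to the
    i-th smallest value of the other.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(a - b) for a, b in pairs) / len(sample_a)


# Hypothetical values for illustration only (not from the dataset).
observed = [0.0, 1.0, 3.0]
simulated = [5.0, 6.0, 8.0]

distance = emd_1d(observed, simulated)
print(distance)  # every point moves 5 units, so the distance is 5.0
```

A distance of zero means the two samples coincide as distributions; larger values indicate that more probability mass must be moved, or moved farther, to turn one sample into the other.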