36 datasets found
  1. Anomaly Detection Market By Component (Solutions & Services), Technology...

    • verifiedmarketresearch.com
    Updated May 2, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Anomaly Detection Market By Component (Solutions & Services), Technology (Big Data Analytics, Machine Learning and Artificial Intelligence), Vertical (Manufacturing IT and Telecom), Service (Professional services & Managed services), & Region for 2024-2031 [Dataset]. https://www.verifiedmarketresearch.com/product/global-anomaly-detection-market-size-and-forecast/
    Dataset updated
    May 2, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2031
    Area covered
    Global
    Description

    Anomaly Detection Market size was valued at USD 5.66 Billion in 2024 and is projected to reach USD 19.4 Billion by 2031, growing at a CAGR of 16.65% from 2024 to 2031.
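
As a quick check on figures like these, the compound annual growth rate can be recomputed from the start and end values. The sketch below is illustrative only: the function name `cagr` is hypothetical, and the number of compounding periods is an assumption, since the listing does not say how the publisher counted years.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Figures quoted in the description above (USD billions).
start, end = 5.66, 19.4

# Assumption: try both 7 periods (2024 -> 2031) and 8 periods,
# because the listing does not state which convention was used.
for periods in (7, 8):
    print(f"{periods} periods: {cagr(start, end, periods):.2%}")
```

Under these assumptions, seven periods imply roughly 19% per year, while eight periods come out close to the quoted 16.65%, so the period count is worth confirming in the full report before reusing the rate.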

    The Anomaly Detection market is experiencing significant growth driven by several key factors. One primary driver is the escalating frequency and sophistication of cyber threats and security breaches across industries, compelling organizations to adopt advanced anomaly detection solutions to safeguard their digital assets and sensitive data. Additionally, the proliferation of big data and the Internet of Things (IoT) generates vast volumes of data that traditional security measures struggle to monitor effectively, creating a pressing need for anomaly detection capabilities. Moreover, the increasing adoption of artificial intelligence (AI) and machine learning (ML) technologies enhances anomaly detection algorithms’ accuracy and efficacy, enabling organizations to detect and mitigate anomalies in real-time. Furthermore, stringent regulatory requirements and compliance standards, particularly in sectors such as finance, healthcare, and telecommunications, are driving the adoption of anomaly detection solutions to ensure regulatory compliance and mitigate risks. Additionally, the growing demand for anomaly detection in fraud detection, network security, and operational monitoring applications further fuels market growth, presenting lucrative opportunities for vendors in the Anomaly Detection market.

  2. Anomaly Detection Technology Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 9, 2025
    Cite
    AMA Research & Media LLP (2025). Anomaly Detection Technology Report [Dataset]. https://www.archivemarketresearch.com/reports/anomaly-detection-technology-55077
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 9, 2025
    Dataset provided by
    AMA Research & Media LLP
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Anomaly Detection Technology market is experiencing robust growth, projected to reach a market size of $6,650.9 million in 2025. While the CAGR isn't explicitly provided, considering the rapid advancements in AI, machine learning, and the increasing need for cybersecurity and predictive maintenance across diverse sectors, a conservative estimate of the CAGR for the forecast period (2025-2033) would be around 15-20%. This growth is fueled by several key drivers. The increasing volume and complexity of data generated by businesses necessitates advanced analytics for identifying unusual patterns indicative of fraud, security breaches, equipment malfunctions, or other critical events. Furthermore, the rising adoption of cloud computing and the expanding deployment of IoT devices are contributing significantly to market expansion. The BFSI, manufacturing, and healthcare sectors are leading adopters, leveraging anomaly detection to improve risk management, optimize operational efficiency, and enhance customer experience. However, challenges remain, including the complexity of implementing and integrating anomaly detection solutions, the need for specialized expertise, and concerns related to data privacy and security. The market is segmented by type (Big Data Analytics, Data Mining and Business Intelligence, Machine Learning and Artificial Intelligence, Others) and application (BFSI, Manufacturing, Retail, Healthcare, Government, IT & Telecom, Others), reflecting the diverse applications of this technology. The competitive landscape is characterized by a mix of established technology giants like IBM, Microsoft, and Cisco, alongside specialized anomaly detection vendors. The future of the Anomaly Detection Technology market is bright, with continued growth driven by technological innovation and increasing adoption across various industries. 
The development of more sophisticated algorithms, improved data visualization tools, and the integration of anomaly detection into existing business processes will further fuel market expansion. The focus on addressing challenges related to data privacy and security, coupled with the emergence of specialized solutions catering to specific industry needs, will shape the market's trajectory in the coming years. While economic fluctuations and competitive pressures might influence growth rates, the fundamental need for advanced anomaly detection capabilities across multiple sectors guarantees the market's long-term viability and potential for substantial growth.

  3. Data from: Distributed Anomaly Detection using 1-class SVM for Vertically...

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Dec 7, 2023
    Cite
    Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data [Dataset]. https://catalog.data.gov/dataset/distributed-anomaly-detection-using-1-class-svm-for-vertically-partitioned-data
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    Dashlink
    Description

    There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amounts of flight operational data are downloaded for different commercial airlines. These different types of datasets need to be analyzed for finding outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations with only a subset of features available at any location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose only centralizes a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).

  4. Anomaly Detection Technology Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 9, 2025
    Cite
    Archive Market Research (2025). Anomaly Detection Technology Report [Dataset]. https://www.archivemarketresearch.com/reports/anomaly-detection-technology-54954
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 9, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global anomaly detection technology market is experiencing robust growth, projected to reach a market size of $4825.8 million in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 4.7% from 2025 to 2033. This expansion is driven by the increasing adoption of big data analytics and artificial intelligence (AI) across various sectors. Businesses are increasingly relying on anomaly detection to enhance cybersecurity, improve operational efficiency, and gain a competitive edge through predictive maintenance and fraud detection. The BFSI (Banking, Financial Services, and Insurance), manufacturing, and healthcare sectors are leading adopters, fueled by the need to protect sensitive data and improve risk management. The market's diverse segmentations, including solutions like Big Data Analytics, Machine Learning, and Business Intelligence, cater to specific industry needs. The rise of cloud-based solutions and the increasing sophistication of AI algorithms further contribute to the market's growth. Competitive landscape is shaped by major technology players such as IBM, Cisco, and Microsoft alongside specialized security firms like Splunk and Darktrace. The increasing prevalence of cyber threats and the need for real-time threat detection is a significant catalyst, driving investment and innovation within the market. The continued growth trajectory is anticipated to be influenced by factors such as the growing volume and complexity of data, increasing regulatory compliance requirements, and the escalating demand for advanced threat detection capabilities. However, challenges such as data privacy concerns, the need for skilled professionals, and the complexity of implementing and integrating anomaly detection systems might present some constraints. 
Nevertheless, the market is expected to witness sustained growth, driven by ongoing technological advancements and the increasing awareness of the critical role anomaly detection plays in safeguarding businesses and critical infrastructure. Geographic expansion, particularly in developing economies, is also projected to further fuel market growth in the coming years.

  5. Anomaly Detection in Sequences

    • catalog.data.gov
    • s.cnmilf.com
    • +3more
    Updated Dec 6, 2023
    Cite
    Dashlink (2023). Anomaly Detection in Sequences [Dataset]. https://catalog.data.gov/dataset/anomaly-detection-in-sequences
    Dataset updated
    Dec 6, 2023
    Dataset provided by
    Dashlink
    Description

    We present a set of novel algorithms, which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that arise from recordings of switch sensors in the cockpits of commercial airliners. While the algorithms we present are general and domain-independent, we focus on a specific problem that is critical to determining system-wide health of a fleet of aircraft. The approach taken uses unsupervised clustering of sequences using the normalized length of the longest common subsequence (nLCS) as a similarity measure, followed by a detailed analysis of outliers to detect anomalies. In this method, an outlier sequence is defined as a sequence that is far away from a cluster. We present new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence is deemed to be an outlier. The algorithm provides a coherent description to an analyst of the anomalies in the sequence when compared to more normal sequences. The final section of the paper demonstrates the effectiveness of sequenceMiner for anomaly detection on a real set of discrete sequence data from a fleet of commercial airliners. We show that sequenceMiner discovers actionable and operationally significant safety events. We also compare our innovations with standard Hidden Markov Models, and show that our methods are superior.
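
To make the similarity measure concrete, here is a minimal sketch of an nLCS computation. It is not the sequenceMiner implementation: the normalization by sqrt(len(a) * len(b)) is an assumption (the description does not give the exact formula), and the function names and switch-event symbols are hypothetical.

```python
from math import sqrt
from typing import Sequence


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def nlcs(a: Sequence, b: Sequence) -> float:
    """Normalized LCS similarity in [0, 1].

    Assumption: normalize by sqrt(len(a) * len(b)); the description
    above does not state the exact normalization.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


# Hypothetical switch-event sequences (symbols are made up).
normal = ["GEAR_UP", "FLAPS_5", "AP_ON", "FLAPS_0"]
odd = ["GEAR_UP", "AP_ON", "AP_OFF", "AP_ON", "FLAPS_0"]
print(nlcs(normal, normal), nlcs(normal, odd))
```

A sequence compared with itself scores 1, and the score drops as fewer symbols can be matched in order, which is what lets a clustering step treat low-similarity sequences as outlier candidates.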

  6. Data from: Multi-Source Distributed System Data for AI-powered Analytics

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 10, 2022
    Cite
    Sasho Nedelkoski; Jasmin Bogatinovski; Ajay Kumar Mandapati; Soeren Becker; Jorge Cardoso; Odej Kao (2022). Multi-Source Distributed System Data for AI-powered Analytics [Dataset]. http://doi.org/10.5281/zenodo.3549604
    Available download formats: zip
    Dataset updated
    Nov 10, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sasho Nedelkoski; Jasmin Bogatinovski; Ajay Kumar Mandapati; Soeren Becker; Jorge Cardoso; Odej Kao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract:

    In recent years there has been an increased interest in Artificial Intelligence for IT Operations (AIOps). This field utilizes monitoring data from IT systems, big data platforms, and machine learning to automate various operations and maintenance (O&M) tasks for distributed systems.
    The major contributions have been materialized in the form of novel algorithms.
    Typically, researchers took the challenge of exploring one specific type of observability data sources, such as application logs, metrics, and distributed traces, to create new algorithms.
    Nonetheless, due to the low signal-to-noise ratio of monitoring data, there is a consensus that only the analysis of multi-source monitoring data will enable the development of useful algorithms that have better performance.
    Unfortunately, existing datasets usually contain only a single source of data, often logs or metrics. This limits the possibilities for greater advances in AIOps research.
    Thus, we generated high-quality multi-source data composed of distributed traces, application logs, and metrics from a complex distributed system. This paper provides detailed descriptions of the experiment, statistics of the data, and identifies how such data can be analyzed to support O&M tasks such as anomaly detection, root cause analysis, and remediation.

    General Information:

    This repository contains simple scripts for data statistics and a link to the multi-source distributed system dataset.

    You may find details of this dataset from the original paper:

    Sasho Nedelkoski, Jasmin Bogatinovski, Ajay Kumar Mandapati, Soeren Becker, Jorge Cardoso, Odej Kao, "Multi-Source Distributed System Data for AI-powered Analytics".

    If you use the data, implementation, or any details of the paper, please cite!

    BIBTEX:


    @inproceedings{nedelkoski2020multi,
     title={Multi-source Distributed System Data for AI-Powered Analytics},
     author={Nedelkoski, Sasho and Bogatinovski, Jasmin and Mandapati, Ajay Kumar and Becker, Soeren and Cardoso, Jorge and Kao, Odej},
     booktitle={European Conference on Service-Oriented and Cloud Computing},
     pages={161--176},
     year={2020},
     organization={Springer}
    }
    


    The multi-source/multimodal dataset is composed of distributed traces, application logs, and metrics produced from running a complex distributed system (Openstack). In addition, we also provide the workload and fault scripts together with the Rally report, which can serve as ground truth. We provide two datasets, which differ in how the workload is executed. The sequential_data is generated by executing a workload of sequential user requests. The concurrent_data is generated by executing a workload of concurrent user requests.

    The raw logs in both datasets contain the same files. Users who want the logs filtered by time for either dataset should refer to the timestamps in the metrics, which provide the time window. In addition, we suggest using the provided aggregated, time-ranged logs for both datasets in CSV format.

    Important: The logs and the metrics are synchronized with respect to time, and both are recorded in CEST (Central European Summer Time, UTC+2). The traces are in UTC (Coordinated Universal Time), 2 hours behind CEST. They should be synchronized if the user develops multimodal methods. Please read the IMPORTANT_experiment_start_end.txt file before working with the data.
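
One way to line the sources up is to shift log and metric timestamps onto UTC before joining them with traces. The sketch below is only an illustration: it assumes a fixed two-hour offset for the whole recording, naive timestamps in the logs, and a made-up example timestamp; the helper name `cest_to_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+2 offset stands in for CEST, matching the
# two-hour difference stated in the note above.
CEST = timezone(timedelta(hours=2), name="CEST")


def cest_to_utc(naive_local: datetime) -> datetime:
    """Interpret a naive log/metric timestamp as CEST and convert it to UTC."""
    return naive_local.replace(tzinfo=CEST).astimezone(timezone.utc)


# Made-up log timestamp, for illustration only.
log_ts = datetime(2019, 7, 15, 18, 30, 0)
print(cest_to_utc(log_ts).isoformat())
```

After the conversion, a log line stamped 18:30 lands at 16:30 UTC and can be compared directly with trace timestamps.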

    Our GitHub repository with the code for the workloads and scripts for basic analysis can be found at: https://github.com/SashoNedelkoski/multi-source-observability-dataset/

  7. Data from: Detecting Anomalies in Multivariate Data Sets with Switching...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datasets.ai
    • +3more
    Updated Feb 19, 2025
    Cite
    nasa.gov (2025). Detecting Anomalies in Multivariate Data Sets with Switching Sequences and Continuous Streams [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/detecting-anomalies-in-multivariate-data-sets-with-switching-sequences-and-continuous-stre
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded in one-second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. Here, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large databases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequences of events in the discrete streams can lead to off-nominal system performance. We discuss the application domain, novel algorithms, and also briefly discuss results on synthetic and real-world data sets. Our algorithm uncovers operationally significant events in high-dimensional data streams in the aviation industry which are not detectable using state-of-the-art methods.

  8. OceanXtremes: Oceanographic Data-Intensive Anomaly Detection and Analysis...

    • data.wu.ac.at
    • data.amerigeoss.org
    xml
    Updated Jan 25, 2018
    Cite
    National Aeronautics and Space Administration (2018). OceanXtremes: Oceanographic Data-Intensive Anomaly Detection and Analysis Portal [Dataset]. https://data.wu.ac.at/schema/data_gov/N2M1NjFmOGYtMGVkMi00OTQ4LWE3ZDUtMDc0N2NhOTA4YmNi
    Available download formats: xml
    Dataset updated
    Jan 25, 2018
    Dataset provided by
    National Aeronautics and Space Administration
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Anomaly detection is a process of identifying items, events, or observations that do not conform to an expected pattern in a dataset or time series. Current and future missions and our research communities challenge us to rapidly identify features and anomalies in complex and voluminous observations to further science and improve decision support. Given this data-intensive reality, we propose to develop an anomaly detection system, called OceanXtremes, powered by an intelligent, elastic Cloud-based analytic service backend that enables execution of domain-specific, multi-scale anomaly and feature detection algorithms across the entire archive of ocean science datasets. A parallel analytics engine will be developed as the key computational and data-mining core of OceanXtremes' backend processing. This analytic engine will demonstrate three new technology ideas to provide rapid turnaround on climatology computation and anomaly detection: 1. An adaptation of the Hadoop/MapReduce framework for parallel data mining of science datasets, typically large 3- or 4-dimensional arrays packaged in NetCDF and HDF. 2. An algorithm profiling service to efficiently and cost-effectively scale up hybrid Cloud computing resources based on the needs of scheduled jobs (CPU, memory, network, and bursting from a private Cloud computing cluster to a public cloud provider such as Amazon Cloud services). 3. An extension to industry-standard search solutions (OpenSearch and Faceted search) to provide support for shared discovery and exploration of ocean phenomena and anomalies, along with unexpected correlations between key measured variables. We will use a hybrid Cloud compute cluster (private Eucalyptus on-premises at JPL with bursting to Amazon Web Services) as the operational backend. 
The key idea is that the parallel data-mining operations will be run 'near' the ocean data archives (a local 'network' hop) so that we can efficiently access the thousands of (say, daily) files making up a three-decade time series, and then cache key variables and pre-computed climatologies in a high-performance parallel database. OceanXtremes will be equipped with both web portal and web service interfaces for users and applications/systems to register and retrieve oceanographic anomalies data. By leveraging technology such as Datacasting (Bingham et al., 2007), users can also subscribe to anomaly or 'event' types of interest and have newly computed anomaly metrics and other information delivered to them by metadata feeds packaged in standard Rich Site Summary (RSS) format. Upon receiving new feed entries, users can examine the metrics and download relevant variables, by simply clicking on a link, to begin further analyzing the event. The OceanXtremes web portal will allow users to define their own anomaly or feature types where continuous backend processing will be scheduled to populate the new user-defined anomaly type by executing the chosen data mining algorithm (e.g., differences from climatology or gradients above a specified threshold). Metadata on the identified anomalies will be cataloged including temporal and geospatial profiles, key physical metrics, related observational artifacts and other relevant metadata to facilitate discovery, extraction, and visualization. Products created by the anomaly detection algorithm will be made explorable and subsettable using Webification (Huang et al., 2014) and OPeNDAP (http://opendap.org) technologies. Using this platform scientists can efficiently search for anomalies or ocean phenomena, compute data metrics for events or over time series of ocean variables, and efficiently find and access all of the data relevant to their study (and then download only that data).
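
The "difference from climatology above a specified threshold" rule mentioned above can be sketched in a few lines. This is a toy illustration, not the OceanXtremes algorithm: the function names, the per-day-of-year mean used as the climatology, the threshold, and the sample values are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def climatology(records):
    """Mean value per day-of-year over all years in the record."""
    by_day = defaultdict(list)
    for day_of_year, value in records:
        by_day[day_of_year].append(value)
    return {day: mean(values) for day, values in by_day.items()}


def find_anomalies(records, threshold):
    """Return (index, day_of_year, difference) where |value - climatology| > threshold."""
    clim = climatology(records)
    hits = []
    for i, (day, value) in enumerate(records):
        diff = value - clim[day]
        if abs(diff) > threshold:
            hits.append((i, day, diff))
    return hits


# Made-up sea-surface temperatures (deg C): two days of year across three years.
sst = [(1, 20.0), (2, 20.5), (1, 20.2), (2, 20.3), (1, 23.4), (2, 20.4)]
print(find_anomalies(sst, threshold=1.5))
```

In this toy record only the 23.4 reading is flagged. Note that the climatology here includes the anomalous value itself; a production system would more likely use a fixed baseline period.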

  9. Data from: Multiple Kernel Learning for Heterogeneous Anomaly Detection:...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datasets.ai
    • +4more
    Updated Feb 18, 2025
    Cite
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). Multiple Kernel Learning for Heterogeneous Anomaly Detection: Algorithm and Aviation Safety Case Study [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/multiple-kernel-learning-for-heterogeneous-anomaly-detection-algorithm-and-aviation-safety
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded in one-second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. In this paper, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large databases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequences of events in the discrete streams can lead to off-nominal system performance. We discuss the application domain, novel algorithms, and also discuss results on real-world data sets. Our algorithm uncovers operationally significant events in high-dimensional data streams in the aviation industry which are not detectable using state-of-the-art methods.

  10. CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, csv
    Updated Feb 26, 2025
    Cite
    Josef Koumar; Karel Hynek; Tomáš Čejka; Pavel Šiška (2025). CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting [Dataset]. http://doi.org/10.5281/zenodo.13382427
    Available download formats: csv, application/gzip
    Dataset updated
    Feb 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Josef Koumar; Karel Hynek; Tomáš Čejka; Pavel Šiška
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CESNET-TimeSeries24: The dataset for network traffic forecasting and anomaly detection

    The dataset called CESNET-TimeSeries24 was collected by long-term monitoring of selected statistical metrics for 40 weeks for each IP address on the ISP network CESNET3 (Czech Education and Science Network). The dataset encompasses network traffic from more than 275,000 active IP addresses, assigned to a wide variety of devices, including office computers, NATs, servers, WiFi routers, honeypots, and video-game consoles found in dormitories. Moreover, the dataset is also rich in network anomaly types since it contains all types of anomalies, ensuring a comprehensive evaluation of anomaly detection methods.

    Last but not least, the CESNET-TimeSeries24 dataset provides traffic time series on institutional and IP subnet levels to cover all possible anomaly detection or forecasting scopes. Overall, the time series dataset was created from the 66 billion IP flows that contain 4 trillion packets that carry approximately 3.7 petabytes of data. The CESNET-TimeSeries24 dataset is a complex real-world dataset that will finally bring insights into the evaluation of forecasting models in real-world environments.

    Please cite the usage of our dataset as:

    Koumar, J., Hynek, K., Čejka, T. et al. CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting. Sci Data 12, 338 (2025). https://doi.org/10.1038/s41597-025-04603-x

    @Article{cesnettimeseries24,
    author={Koumar, Josef and Hynek, Karel and {\v{C}}ejka, Tom{\'a}{\v{s}} and {\v{S}}i{\v{s}}ka, Pavel},
    title={CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting},
    journal={Scientific Data},
    year={2025},
    month={Feb},
    day={26},
    volume={12},
    number={1},
    pages={338},
    issn={2052-4463},
    doi={10.1038/s41597-025-04603-x},
    url={https://doi.org/10.1038/s41597-025-04603-x}
    }

    Time series

    We create evenly spaced time series for each IP address by aggregating IP flow records into time series datapoints. The created datapoints represent the behavior of IP addresses within a defined time window of 10 minutes. The vector of time-series metrics v_{ip, i} describes the IP address ip in the i-th time window. Thus, IP flows for vector v_{ip, i} are captured in time windows starting at t_i and ending at t_{i+1}. The time series are built from these datapoints.

    Datapoints created by the aggregation of IP flows contain the following time-series metrics:

    • Simple volumetric metrics: the number of IP flows, the number of packets, and the transmitted data size (i.e. number of bytes)
    • Unique volumetric metrics: the number of unique destination IP addresses, the number of unique destination Autonomous System Numbers (ASNs), and the number of unique destination transport layer ports. The aggregation of unique volumetric metrics is memory intensive since all unique values must be stored in an array. We used a server with 41 GB of RAM, which was enough for 10-minute aggregation on the ISP network.
    • Ratios metrics: the ratio of UDP/TCP packets, the ratio of UDP/TCP transmitted data size, the direction ratio of packets, and the direction ratio of transmitted data size
    • Average metrics: the average flow duration, and the average Time To Live (TTL)
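
The aggregation of flow records into per-window datapoints can be sketched as follows. This is not the authors' pipeline: the flow record keys (`ts`, `src_ip`, `dst_ip`, `packets`, `bytes`), the half-open window boundaries, and the sample flows are assumptions made for illustration, and only a subset of the metrics is computed.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute aggregation window


def aggregate_flows(flows):
    """Group flow records by (source IP, window index) and compute per-window metrics.

    Each flow is a dict with hypothetical keys: ts (epoch seconds),
    src_ip, dst_ip, packets, bytes.
    """
    buckets = defaultdict(list)
    for flow in flows:
        window = int(flow["ts"] // WINDOW_SECONDS)
        buckets[(flow["src_ip"], window)].append(flow)

    datapoints = {}
    for key, group in buckets.items():
        datapoints[key] = {
            "n_flows": len(group),
            "n_packets": sum(f["packets"] for f in group),
            "n_bytes": sum(f["bytes"] for f in group),
            # Unique metrics need the full set of values seen in the window.
            "n_dest_ip": len({f["dst_ip"] for f in group}),
        }
    return datapoints


flows = [
    {"ts": 0, "src_ip": "10.0.0.1", "dst_ip": "192.0.2.1", "packets": 3, "bytes": 300},
    {"ts": 599, "src_ip": "10.0.0.1", "dst_ip": "192.0.2.1", "packets": 2, "bytes": 120},
    {"ts": 600, "src_ip": "10.0.0.1", "dst_ip": "192.0.2.2", "packets": 1, "bytes": 60},
]
points = aggregate_flows(flows)
print(points[("10.0.0.1", 0)])
```

The set built for `n_dest_ip` is what makes the unique metrics memory intensive: every distinct value in the window has to be held until the window closes.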

    Multiple time aggregation: The original datapoints in the dataset are aggregated by 10 minutes of network traffic. The size of the aggregation interval influences anomaly detection procedures, mainly the training speed of the detection model. However, the 10-minute intervals can be too short for longitudinal anomaly detection methods. Therefore, we added two more aggregation intervals to the datasets--1 hour and 1 day.
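
For readers who want a coarser interval than the ones provided, additive counters can be rolled up from the 10-minute datapoints. The sketch below is an assumption-laden illustration (the `reaggregate` helper and the sample values are hypothetical) and is not how the published 1-hour and 1-day files were produced.

```python
def reaggregate(datapoints, factor=6):
    """Sum additive metrics over consecutive groups of `factor` windows.

    `datapoints` is a list of dicts ordered by time, one per 10-minute window.
    Only additive counters are combined; unique counts are skipped because a
    destination seen in two windows would be counted twice.
    """
    additive = ("n_flows", "n_packets", "n_bytes")
    coarse = []
    for start in range(0, len(datapoints), factor):
        group = datapoints[start:start + factor]
        coarse.append({name: sum(p[name] for p in group) for name in additive})
    return coarse


# Twelve made-up 10-minute datapoints, i.e. two hours of traffic.
ten_minute = [{"n_flows": i, "n_packets": 10 * i, "n_bytes": 100 * i} for i in range(1, 13)]
hourly = reaggregate(ten_minute)
print(hourly)
```

Unique and ratio metrics cannot be rebuilt this way, which is one reason to prefer the dataset's own 1-hour and 1-day aggregations when they cover the needed interval.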

    Time series of institutions: We identify 283 institutions inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution's data.

    Time series of institutional subnets: We identify 548 institution subnets inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution subnet's data.

    Data Records

    The file hierarchy is described below:

    cesnet-timeseries24/
    |- institution_subnets/
    | |- agg_10_minutes/
    | |- agg_1_hour/
    | |- agg_1_day/
    | |- identifiers.csv
    |- institutions/
    | |- agg_10_minutes/
    | |- agg_1_hour/
    | |- agg_1_day/
    | |- identifiers.csv
    |- ip_addresses_full/
    | |- agg_10_minutes/
    | |- agg_1_hour/
    | |- agg_1_day/
    | |- identifiers.csv
    |- ip_addresses_sample/
    | |- agg_10_minutes/
    | |- agg_1_hour/
    | |- agg_1_day/
    | |- identifiers.csv
    |- times/
    | |- times_10_minutes.csv
    | |- times_1_hour.csv
    | |- times_1_day.csv
    |- ids_relationship.csv
    |- weekends_and_holidays.csv

    The following list describes time series data fields in CSV files:

    • id_time: Unique identifier for each aggregation interval within the time series, used to segment the dataset into specific time periods for analysis.
    • n_flows: Total number of flows observed in the aggregation interval, indicating the volume of distinct sessions or connections for the IP address.
    • n_packets: Total number of packets transmitted during the aggregation interval, reflecting the packet-level traffic volume for the IP address.
    • n_bytes: Total number of bytes transmitted during the aggregation interval, representing the data volume for the IP address.
    • n_dest_ip: Number of unique destination IP addresses contacted by the IP address during the aggregation interval, showing the diversity of endpoints reached.
    • n_dest_asn: Number of unique destination Autonomous System Numbers (ASNs) contacted by the IP address during the aggregation interval, indicating the diversity of networks reached.
    • n_dest_port: Number of unique destination transport layer ports contacted by the IP address during the aggregation interval, representing the variety of services accessed.
    • tcp_udp_ratio_packets: Ratio of packets sent using TCP versus UDP by the IP address during the aggregation interval, providing insight into the transport protocol usage pattern. This metric lies in the closed interval [0, 1], where 1 means all packets are sent over TCP and 0 means all packets are sent over UDP.
    • tcp_udp_ratio_bytes: Ratio of bytes sent using TCP versus UDP by the IP address during the aggregation interval, highlighting the data volume distribution between protocols. This metric lies in the closed interval [0, 1], with the same rule as tcp_udp_ratio_packets.
    • dir_ratio_packets: Ratio of packet directions (outgoing versus incoming) for the IP address during the aggregation interval, indicating the balance of traffic flow directions. This metric lies in the closed interval [0, 1], where 1 means all packets are sent in the outgoing direction from the monitored IP address and 0 means all packets are sent in the incoming direction to the monitored IP address.
    • dir_ratio_bytes: Ratio of byte directions (outgoing versus incoming) for the IP address during the aggregation interval, showing the data volume distribution in traffic flows. This metric lies in the closed interval [0, 1], with the same rule as dir_ratio_packets.
    • avg_duration: Average duration of IP flows for the IP address during the aggregation interval, measuring the typical session length.
    • avg_ttl: Average Time To Live (TTL) of IP flows for the IP address during the aggregation interval, providing insight into the lifespan of packets.
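
One formula consistent with the stated endpoints of the ratio fields (1 when everything is TCP or outgoing, 0 when everything is UDP or incoming) is a simple share of the total. The sketch below assumes that definition; the source does not spell out the exact formula, and the helper names `share` and `ratio_metrics` are hypothetical.

```python
def share(numerator, other):
    """Return numerator / (numerator + other), a value in [0, 1].

    Returns None when both counts are zero; how the dataset encodes that
    case is not stated in the field descriptions.
    """
    total = numerator + other
    if total == 0:
        return None
    return numerator / total


def ratio_metrics(tcp_packets, udp_packets, out_packets, in_packets):
    """Compute the packet-based ratio fields under the assumed definition."""
    return {
        # 1.0 when all packets are TCP, 0.0 when all packets are UDP
        "tcp_udp_ratio_packets": share(tcp_packets, udp_packets),
        # 1.0 when all packets are outgoing, 0.0 when all are incoming
        "dir_ratio_packets": share(out_packets, in_packets),
    }
```

The byte-based fields (tcp_udp_ratio_bytes and dir_ratio_bytes) would follow the same pattern with byte counts in place of packet counts.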

    Moreover, the time series created by re-aggregation contain the following time series metrics instead of n_dest_ip, n_dest_asn, and n_dest_port:

    • sum_n_dest_ip: Sum of numbers of unique destination IP addresses.
    • avg_n_dest_ip: The average number of unique destination IP addresses.
    • std_n_dest_ip: Standard deviation of numbers of unique destination IP addresses.
    • sum_n_dest_asn: Sum of numbers of unique destination ASNs.
    • avg_n_dest_asn: The average number of unique destination ASNs.
    • std_n_dest_asn: Standard deviation of numbers of unique destination ASNs.
    • sum_n_dest_port: Sum of numbers of unique destination transport layer ports.
    • avg_n_dest_port: The average number of unique destination transport layer ports.
    • std_n_dest_port: Standard deviation of numbers of unique destination transport layer ports.
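
A unique count over a longer interval cannot be recovered from the per-window unique counts, because the same destination may appear in several 10-minute windows. Summary statistics such as those listed above are one way to carry the information forward. The sketch below shows the idea; the function name `reaggregate_unique` is hypothetical, and the use of the population standard deviation is an assumption, since the description does not say which estimator was used.

```python
from statistics import mean, pstdev


def reaggregate_unique(values, name):
    """Summarize per-window unique counts over a longer interval.

    `values` holds one unique count per 10-minute window (for example, six
    values for a 1-hour interval). pstdev (population standard deviation)
    is an assumption made for this sketch.
    """
    return {
        f"sum_{name}": sum(values),
        f"avg_{name}": mean(values),
        f"std_{name}": pstdev(values),
    }
```

For example, `reaggregate_unique([3, 5, 4, 4, 6, 2], "n_dest_ip")` would produce the keys sum_n_dest_ip, avg_n_dest_ip, and std_n_dest_ip for one hourly datapoint.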

  11. Anomaly Detection Global Market Report 2025

    • thebusinessresearchcompany.com
    pdf,excel,csv,ppt
    Updated Jan 14, 2025
    The Business Research Company (2025). Anomaly Detection Global Market Report 2025 [Dataset]. https://www.thebusinessresearchcompany.com/report/anomaly-detection-global-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Jan 14, 2025
    Dataset authored and provided by
    The Business Research Company
    License

    https://www.thebusinessresearchcompany.com/privacy-policy

    Description

    The Anomaly Detection Market is projected to grow at 18.1% CAGR, reaching $12.04 Billion by 2029. Where is the industry heading next? Get the sample report now!

  12. Controlled Anomalies Time Series (CATS) Dataset

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1more
    Updated Feb 16, 2023
    Patrick Fleith (2023). Controlled Anomalies Time Series (CATS) Dataset [Dataset]. http://doi.org/10.5281/zenodo.7646896
    Explore at:
    Dataset updated
    Feb 16, 2023
    Authors
    Patrick Fleith
    Description

    The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.

    The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:

    • Multivariate (17 variables) including sensor readings and control signals. It simulates the operational behaviour of an arbitrary complex system including:
        • 4 Deliberate Actuations / Control Commands sent by a simulated operator / controller, for instance, commands of an operator to turn ON/OFF some equipment.
        • 3 Environmental Stimuli / External Forces acting on the system and affecting its behaviour, for instance, the wind affecting the orientation of a large ground antenna.
        • 10 Telemetry Readings representing the observable states of the complex system by means of sensors, for instance, a position, a temperature, a pressure, a voltage, current, humidity, velocity, acceleration, etc.
    • 5 million timestamps. Sensor readings are at 1Hz sampling frequency.
        • 1 million nominal observations (the first 1 million datapoints). This is suitable to start learning the "normal" behaviour.
        • 4 million observations that include both nominal and anomalous segments. This is suitable to evaluate both semi-supervised approaches (novelty detection) as well as unsupervised approaches (outlier detection).
    • 200 anomalous segments. One anomalous segment may contain several successive anomalous observations / timestamps. Only the last 4 million observations contain anomalous segments.
    • Different types of anomalies to understand what anomaly types can be detected by different approaches. The categories are available in the dataset and in the metadata.
    • Fine control over ground truth. As this is a simulated system with deliberate anomaly injection, the start and end time of the anomalous behaviour is known very precisely. In contrast to real world datasets, there is no risk that the ground truth contains mislabelled segments, which is often the case for real data.
    • Suitable for root cause analysis. In addition to the anomaly category, the time series channel in which the anomaly first developed itself is recorded and made available as part of the metadata. This can be useful to evaluate the performance of algorithms to trace back anomalies to the right root cause channel.
    • Affected channels. In addition to the knowledge of the root cause channel in which the anomaly first developed itself, we provide information of channels possibly affected by the anomaly. This can also be useful to evaluate the explainability of anomaly detection systems which may point out to the anomalous channels (root cause and affected).
    • Obvious anomalies. The simulated anomalies have been designed to be "easy" to be detected for human eyes (i.e., there are very large spikes or oscillations), hence also detectable for most algorithms. It makes this synthetic dataset useful for screening tasks (i.e., to eliminate algorithms that are not capable of detecting those obvious anomalies). However, during our initial experiments, the dataset turned out to be challenging enough even for state-of-the-art anomaly detection approaches, making it suitable also for regular benchmark studies.
    • Context provided. Some variables can only be considered anomalous in relation to other behaviours. A typical example consists of a light and switch pair. The light being either on or off is nominal, the same goes for the switch, but having the switch on and the light off shall be considered anomalous. In the CATS dataset, users can choose (or not) to use the available context, and external stimuli, to test the usefulness of the context for detecting anomalies in this simulation.
    • Pure signal ideal for robustness-to-noise analysis. The simulated signals are provided without noise: while this may seem unrealistic at first, it is an advantage since users of the dataset can decide to add on top of the provided series any type of noise and choose an amplitude. This makes it well suited to test how sensitive and robust detection algorithms are against various levels of noise.
    • No missing data. You can drop whatever data you want to assess the impact of missing values on your detector with respect to a clean baseline.

    Change Log

    Version 2

    • Metadata: we include a metadata.csv with information about:
        • Anomaly categories
        • Root cause channel (signal in which the anomaly is first visible)
        • Affected channel (signal in which the anomaly might propagate) through coupled system dynamics
    • Removal of anomaly overlaps: version 1 contained anomalies which overlapped with each other resulting in only 190 distinct anomalous segments. Now, there are no more anomaly overlaps.
    • Two data files: CSV and parquet for convenience.

    [1] Example Benchmark of Anomaly Detection in Time Series: “Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive ...
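
Two of the workflows mentioned in the CATS description, learning from the leading nominal observations and adding noise of a chosen amplitude to the clean signals, might look like the sketch below. It works on a plain Python list standing in for one channel; the function names are hypothetical, and zero-mean Gaussian noise is only one of the noise types a user could choose.

```python
import random


def split_nominal(series, n_nominal=1_000_000):
    """Split a channel into the leading nominal part and the remainder.

    The default of 1,000,000 follows the description above, which states
    that the first 1 million datapoints are nominal.
    """
    return series[:n_nominal], series[n_nominal:]


def add_gaussian_noise(series, amplitude, seed=0):
    """Return a noisy copy of the series.

    Adds zero-mean Gaussian noise whose standard deviation is `amplitude`.
    The seed makes the perturbation reproducible across runs.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, amplitude) for x in series]
```

Running a detector on copies produced with increasing `amplitude` values, and comparing against the clean baseline, gives a simple robustness-to-noise curve.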

  13. Data from: Nonparametric Anomaly Detection on Time Series of Graphs

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly, two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.

  14. Unsupervised Learning Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 14, 2025
    AMA Research & Media LLP (2025). Unsupervised Learning Report [Dataset]. https://www.archivemarketresearch.com/reports/unsupervised-learning-56897
    Available download formats: ppt, doc, pdf
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    AMA Research & Media LLP
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The unsupervised learning market is experiencing robust growth, driven by the increasing volume of unstructured data and the need for businesses to extract valuable insights without pre-defined labels. This market is projected to reach $XX billion in 2025, exhibiting a Compound Annual Growth Rate (CAGR) of XX% during the forecast period of 2025-2033. This substantial growth is fueled by several key trends, including the rising adoption of cloud-based solutions for enhanced scalability and cost-effectiveness, the proliferation of big data analytics applications across various industries, and the increasing demand for advanced anomaly detection and pattern recognition capabilities. The market segmentation reveals a significant contribution from large enterprises due to their higher budgets and complex data management needs, while the cloud-based segment dominates owing to its flexibility and accessibility. Key players like Microsoft, IBM, and Google are heavily investing in R&D and strategic partnerships to consolidate their market share and capitalize on emerging opportunities in areas such as fraud detection, customer segmentation, and predictive maintenance. The market faces challenges such as the complexity of implementing unsupervised learning algorithms and the need for skilled data scientists; however, ongoing technological advancements and the growing availability of user-friendly tools are mitigating these restraints.

    The continued growth trajectory is anticipated to be further propelled by advancements in deep learning techniques, particularly in areas like generative adversarial networks (GANs) and autoencoders, which are enhancing the accuracy and efficiency of unsupervised learning models. The geographical distribution of the market shows strong performance in North America and Europe, due to early adoption and well-established technological infrastructure. However, the Asia-Pacific region presents a significant growth opportunity, driven by rapid digitalization and increasing investments in data analytics capabilities within emerging economies like India and China. The competitive landscape is characterized by both established technology giants and specialized AI startups, leading to continuous innovation and a wide range of solutions tailored to specific industry needs. The overall outlook for the unsupervised learning market remains highly promising, with significant potential for expansion across various sectors.

  15. Fraud Detection And Prevention Market Analysis North America, Europe, APAC,...

    • technavio.com
    Fraud Detection And Prevention Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, UK, Germany, China, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/fraud-detection-and-prevention-market-analysis
    Explore at:
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Europe, United States, Global
    Description


    Fraud Detection And Prevention Market Size 2024-2028

    The fraud detection and prevention market size is forecast to increase by USD 86.68 billion at a CAGR of 27.17% between 2023 and 2028.

    In the current business landscape, the market is experiencing significant growth due to several key factors. The increasing adoption of cloud infrastructure services, such as cloud computing and big data, is driving market expansion. These technologies enable organizations to store and process large volumes of data, which is essential for advanced fraud detection techniques like anomaly detection. Moreover, the healthcare services sector is increasingly relying on fraud detection solutions to safeguard sensitive patient data. In addition, the rise of business intelligence (BI) and machine-to-machine (M2M) services is leading to an increased need for robust fraud prevention measures. Phone-based authentication solutions are also gaining popularity as an effective method for securing user identities and preventing fraud. The technological advancement in fraud detection and prevention solutions and services, coupled with the complexity of IT infrastructure, is further fueling market growth.
    

    What will be the Size of the Fraud Detection And Prevention Market During the Forecast Period?


    The market encompasses a range of solutions designed to safeguard businesses and organizations from various types of financial and data breaches. Key end-use industries, including healthcare, manufacturing, governments, IT, business intelligence, and telecom, among others, increasingly rely on advanced technologies to mitigate risks. Market dynamics are driven by the growing adoption of cloud-based solutions, big data analytics, and blockchain technology. These innovations enable real-time fraud detection, enhancing the ability to prevent incidents such as payment fraud, identity theft, phishing scams, and money laundering.
    SMEs and large enterprises across sectors like travel and transportation, energy and utilities, media and entertainment, professional services, and insurance claims face similar challenges, making the market expansive and diverse. Authentication solutions, real-time fraud detection, and managed services are integral components of the market, catering to the evolving needs of businesses in an increasingly digital world.
    

    How is this Fraud Detection And Prevention Industry segmented and which is the largest segment?

    The fraud detection and prevention industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Component
      Solutions
      Services
    End-user
      Large enterprise
      SMEs
    Geography
      North America
        US
        Canada
      Europe
        Germany
        Spain
        UK
      APAC
        China
        Japan
        India
      South America
        South Africa
      Middle East and Africa

    By Component Insights

    The solutions segment is estimated to witness significant growth during the forecast period.
    

    The market is experiencing significant growth due to escalating cyber threats and the increasing need for robust security measures. Key drivers include the rising number of fraudulent activities such as identity theft, money laundering, and phishing scams, as well as economic uncertainty and the pandemic. In the solutions segment, authentication solutions have emerged as a major revenue generator. However, the high cost of biometric technology may hinder growth in this area. SMEs, healthcare, manufacturing, end-use enterprises, governments, IT and telecom, travel and transportation, energy and utilities, media and entertainment, and financial institutions are among the key industries investing in fraud detection and prevention. Digital technologies, including cloud-based solutions, Big Data, artificial intelligence, and machine learning, are increasingly being adopted for real-time fraud detection. Fraud complexity and online data transactions pose significant challenges, necessitating proactive measures and trained cybersecurity professionals.


    The Solutions segment was valued at USD 11.84 billion in 2018 and showed a gradual increase during the forecast period.

    Regional Analysis

    North America is estimated to contribute 40% to the growth of the global market during the forecast period.
    

    Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period.


    The North American market is projected to expand substantially due to the increasing prevalence of cyber threats in sectors like healthcare

  16. Global Graph Analytics Market Size By Deployment Mode, By Component, By...

    • verifiedmarketresearch.com
    Updated Feb 19, 2024
    VERIFIED MARKET RESEARCH (2024). Global Graph Analytics Market Size By Deployment Mode, By Component, By Application, By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/graph-analytics-market/
    Explore at:
    Dataset updated
    Feb 19, 2024
    Dataset provided by
    Verified Market Research https://www.verifiedmarketresearch.com/
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2030
    Area covered
    Global
    Description

    Graph Analytics Market size was valued at USD 77.1 Million in 2023 and is projected to reach USD 637.1 Million by 2030, growing at a CAGR of 35.1% during the forecast period 2024-2030.
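
As a quick sanity check on figures like these, the compound annual growth rate can be recomputed from the start value, end value, and number of years. The sketch below uses the standard CAGR formula; treating 2023 to 2030 as seven compounding periods is an assumption, since the report does not state its base year convention.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# USD 77.1 million in 2023 growing to USD 637.1 million by 2030
rate = cagr(77.1, 637.1, 2030 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate comes out at roughly 35%, in line with the stated 35.1%.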

    Global Graph Analytics Market Drivers
    The market drivers for the Graph Analytics Market can be influenced by various factors. These may include:

    Growing Need for Data Analysis: In order to extract insightful information from the massive amounts of data generated by social media, IoT devices, and corporate transactions, there is a growing need for sophisticated analytics tools like graph analytics.

    Growing Uptake of Big Data Tools: Graph analytics solutions are becoming more and more popular due to the spread of big data platforms and technology. Businesses are using these technologies to improve the efficiency of their analysis of intricately linked datasets.

    Developments in AI and ML: The capabilities of graph analytics solutions are being improved by advances in machine learning and artificial intelligence. These technologies make it possible for recommendation systems, anomaly detection, and forecasts based on graph data to be more accurate.

    Increasing Recognition of the Advantages of Graph Databases: Businesses are realizing the advantages of graph databases for handling and evaluating highly related data. Consequently, there’s been a sharp increase in the use of graph analytics tools to leverage the potential of graph databases for diverse applications.

    The use of advanced analytics solutions, such as graph analytics, for fraud detection, cybersecurity, and risk management is becoming more and more important as a result of the increase in cyberthreats and fraudulent activity.

    Demand for Personalized suggestions: Companies in a variety of sectors are using graph analytics to provide their clients with suggestions that are tailored specifically to them. Personalized recommendations increase consumer engagement and loyalty on social networking, e-commerce, and entertainment platforms.

    Analysis of Networks and Social Media is Necessary: In order to comprehend relationships, influence patterns, and community structures, networks and social media data must be analyzed using graph analytics. The capacity to do this is very helpful for security agencies, sociologists, and marketers.

    Government programs and Regulations: The need for graph analytics solutions is being driven by regulations pertaining to data security and privacy as well as government programs aimed at encouraging the adoption of data analytics. These tools are being purchased by organizations in order to guarantee compliance and reduce risks.

    Emergence of Industry-specific Use Cases: Graph analytics is finding applications in a number of areas, such as healthcare, finance, retail, and transportation. These use cases include supply chain management, customer attrition prediction, and financial fraud detection in addition to patient care optimization.

    Technological Developments in Graph Analytics Tools: As graph analytics tools, algorithms, and platforms continue to evolve, their capabilities and performance are being enhanced. Adoption is being fueled by this technological advancement across a variety of industries and use cases.

  17. Data from: Fleet Level Anomaly Detection of Aviation Safety Data

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Dec 6, 2023
    Dashlink (2023). Fleet Level Anomaly Detection of Aviation Safety Data [Dataset]. https://catalog.data.gov/dataset/fleet-level-anomaly-detection-of-aviation-safety-data
    Explore at:
    Dataset updated
    Dec 6, 2023
    Dataset provided by
    Dashlink
    Description

    For the purposes of this paper, the National Airspace System (NAS) encompasses the operations of all aircraft which are subject to air traffic control procedures. The NAS is a highly complex dynamic system that is sensitive to aeronautical decision-making and risk management skills. In order to ensure a healthy system with safe flights, a systematic approach to anomaly detection is very important when evaluating a given set of circumstances and for determination of the best possible course of action. Given the fact that the NAS is a vast and loosely integrated network of systems, it requires improved safety assurance capabilities to maintain an extremely low accident rate under increasingly dense operating conditions. Data mining based tools and techniques are required to support and aid operators’ (such as pilots, management, or policy makers) overall decision-making capacity. Within the NAS, the ability to analyze fleetwide aircraft data autonomously is still considered a significantly challenging task. For our purposes a fleet is defined as a group of aircraft sharing generally compatible parameter lists. Here, in this effort, we aim at developing a system level analysis scheme. In this paper we address the capability for detection of fleetwide anomalies as they occur, which itself is an important initiative toward the safety of the real-world flight operations. The flight data recorders archive millions of data points with valuable information on flights every day. The operational parameters consist of both continuous and discrete (binary & categorical) data from several critical subsystems and numerous complex procedures. In this paper, we discuss a system level anomaly detection approach based on the theory of kernel learning to detect potential safety anomalies in a very large database of commercial aircraft. We also demonstrate that the proposed approach uncovers some operationally significant events due to environmental, mechanical, and human factors issues in high dimensional, multivariate Flight Operations Quality Assurance (FOQA) data. We present the results of our detection algorithms on real FOQA data from a regional carrier.

  18. ESA Anomaly Dataset

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Jun 28, 2024
    Gabriele De Canio; Krzysztof Kotowski; Christoph Haskamp (2024). ESA Anomaly Dataset [Dataset]. http://doi.org/10.5281/zenodo.12528696
    Available download formats: zip
    Dataset updated
    Jun 28, 2024
    Dataset provided by
    European Space Agency http://www.esa.int/
    Authors
    Gabriele De Canio; Krzysztof Kotowski; Christoph Haskamp
    License

    Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Time period covered
    Jun 25, 2024
    Description

    ESA Anomaly Dataset is the first large-scale, real-life satellite telemetry dataset with curated anomaly annotations originating from three ESA missions. We hope that this unique dataset will allow researchers and scientists from academia, research institutes, national and international space agencies, and industry to benchmark models and approaches on a common baseline as well as research and develop novel, computationally efficient approaches for anomaly detection in satellite telemetry data.

    The dataset results from the work of an 18-month project carried out by an industry Consortium composed of Airbus Defence and Space, KP Labs and the European Space Agency’s European Space Operations Centre. The project, funded by the European Space Agency (ESA), is part of the Artificial Intelligence for Automation (A²I) Roadmap (De Canio et al., 2023), a large endeavour started in 2021 to automate space operations by leveraging artificial intelligence.

    Further details can be found on the arXiv and Github.

    References
    De Canio, G. et al. (2023) Development of an actionable AI roadmap for automating mission operations. In, 2023 SpaceOps Conference. American Institute of Aeronautics and Astronautics, Dubai, United Arab Emirates.

  19. Discovering Anomalous Aviation Safety Events Using Scalable Data Mining...

    • catalog.data.gov
    • datadiscoverystudio.org
    • +6more
    Updated Dec 6, 2023
    Discovering Anomalous Aviation Safety Events Using Scalable Data Mining Algorithms [Dataset]. https://catalog.data.gov/dataset/discovering-anomalous-aviation-safety-events-using-scalable-data-mining-algorithms
    Dataset updated
    Dec 6, 2023
    Dataset provided by
    Dashlink
    Description

    The worldwide civilian aviation system is one of the most complex dynamical systems created. Most modern commercial aircraft have onboard flight data recorders that record several hundred discrete and continuous parameters at approximately 1 Hz for the entire duration of the flight. These data contain information about the flight control systems, actuators, engines, landing gear, avionics, and pilot commands. In this paper, recent advances in the development of a novel knowledge discovery process consisting of a suite of data mining techniques for identifying precursors to aviation safety incidents are discussed. The data mining techniques include scalable multiple-kernel learning for large-scale distributed anomaly detection. A novel multivariate time-series search algorithm is used to search for signatures of discovered anomalies on massive datasets. The process can identify operationally significant events due to environmental, mechanical, and human factors issues in the high-dimensional flight operations quality assurance data. All discovered anomalies are validated by a team of independent domain experts. This novel automated knowledge discovery process is aimed at complementing the state-of-the-art human-generated exceedance-based analysis that fails to discover previously unknown aviation safety incidents. In this paper, the discovery pipeline, the methods used, and some of the significant anomalies detected on real-world commercial aviation data are discussed.

  20. Algorithms for Speeding up Distance-Based Outlier Detection

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1 more
    application/rdfxml +5
    Updated Jun 26, 2018
    (2018). Algorithms for Speeding up Distance-Based Outlier Detection [Dataset]. https://data.nasa.gov/dataset/Algorithms-for-Speeding-up-Distance-Based-Outlier-/hwws-rz2p
    Available download formats: csv, xml, tsv, application/rssxml, application/rdfxml, json
    Dataset updated
    Jun 26, 2018
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Area covered
    Speed limit
    Description

    The problem of distance-based outlier detection is difficult to solve efficiently in very large datasets because of potential quadratic time complexity. We address this problem and develop sequential and distributed algorithms that are significantly more efficient than state-of-the-art methods while still guaranteeing the same outliers. By combining simple but effective indexing and disk block accessing techniques, we have developed a sequential algorithm iOrca that is up to an order-of-magnitude faster than the state-of-the-art. The indexing scheme is based on sorting the data points in order of increasing distance from a fixed reference point and then accessing those points based on this sorted order. To speed up the basic outlier detection technique, we develop two distributed algorithms (DOoR and iDOoR) for modern distributed multi-core clusters of machines, connected on a ring topology. The first algorithm passes data blocks from each machine around the ring, incrementally updating the nearest neighbors of the points passed. By maintaining a cutoff threshold, it is able to prune a large number of points in a distributed fashion. The second distributed algorithm extends this basic idea with the indexing scheme discussed earlier. In our experiments, both distributed algorithms exhibit significant improvements compared to the state-of-the-art distributed methods.
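The description above combines two ideas: an index that orders points by distance from a fixed reference point, and a cutoff threshold that prunes points which can no longer be outliers. The following is a minimal single-machine sketch of those two ideas only. It is not the iOrca, DOoR, or iDOoR implementation. The function name `top_outliers`, the scoring rule (distance to the k-th nearest neighbor), the choice of the first point as reference, and the farthest-first candidate order are all assumptions made for illustration.

```python
import heapq
import math


def top_outliers(points, k=2, n_outliers=1):
    """Return [(score, index), ...] for the n_outliers points whose
    distance to their k-th nearest neighbor is largest."""
    # Index: point ids sorted by increasing distance from a fixed
    # reference point (assumed here to be the first point).
    ref = points[0]
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], ref))

    cutoff = 0.0  # score of the weakest outlier currently kept
    top = []      # min-heap of (score, index)

    # Assumed heuristic: test the points farthest from the reference first,
    # so that likely outliers raise the cutoff early.
    for i in reversed(order):
        neighbors = []  # negated distances: heap root is the k-th smallest so far
        pruned = False
        for j in order:
            if j == i:
                continue
            d = math.dist(points[i], points[j])
            if len(neighbors) < k:
                heapq.heappush(neighbors, -d)
            elif d < -neighbors[0]:
                heapq.heapreplace(neighbors, -d)
            # The running k-th neighbor distance only shrinks, so once it
            # drops below the cutoff this point cannot be a top outlier.
            if len(neighbors) == k and -neighbors[0] < cutoff:
                pruned = True
                break
        if pruned or len(neighbors) < k:
            continue
        score = -neighbors[0]
        if len(top) < n_outliers:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n_outliers:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_outliers(data, k=2, n_outliers=1))
```

Pruning only skips points whose score is already below the cutoff, so the result should match an exhaustive nested loop apart from ties. The worst case remains quadratic; the savings depend on how quickly the cutoff rises.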
