100+ datasets found
  1. Data from: Distributed Anomaly Detection using 1-class SVM for Vertically...

    • catalog.data.gov
    • data.nasa.gov
    • +2more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data [Dataset]. https://catalog.data.gov/dataset/distributed-anomaly-detection-using-1-class-svm-for-vertically-partitioned-data
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amounts of flight operational data are downloaded for different commercial airlines. These different types of datasets need to be analyzed to find outliers. Extracting information from such rich data sources using advanced data mining methodologies is challenging not only because of the massive volume of data, but also because these datasets are physically stored at different geographical locations, with only a subset of features available at any one location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose centralizes only a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).
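
    The idea of scoring locally and centralizing only a small sample can be illustrated with a toy sketch. This is not the paper's algorithm: a simple per-feature z-score stands in for the 1-class SVM, and the function names (site_scores, top_ids, candidate_rows) are hypothetical. Each site holds different features for the same rows and reports only the ids of its most unusual rows.

```python
import statistics

def site_scores(block):
    """Score each row using only this site's features (max absolute z-score).

    Stand-in for a local detector; the paper uses a 1-class SVM instead.
    """
    cols = list(zip(*block))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has zero spread; use 1.0 to avoid division by zero.
    stdevs = [statistics.pstdev(c) or 1.0 for c in cols]
    return [
        max(abs(v - m) / s for v, m, s in zip(row, means, stdevs))
        for row in block
    ]

def top_ids(scores, k):
    """Ids of the k highest-scoring rows."""
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return set(order[:k])

def candidate_rows(site_blocks, k):
    """Union of each site's top-k row ids; only these rows would be centralized."""
    ids = set()
    for block in site_blocks:
        ids |= top_ids(site_scores(block), k)
    return ids
```

    With two sites and k = 1, at most two row ids cross the network instead of every row, which is the communication saving the description refers to.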

  2. Radio observatory anomaly detection dataset Dataset

    • paperswithcode.com
    Updated Jul 2, 2023
    Cite
    Michael Mesarcik; Albert-Jan Boonstra; Marco Iacobelli; Elena Ranguelova; Cees de Laat; Rob van Nieuwpoort (2023). Radio observatory anomaly detection dataset Dataset [Dataset]. https://paperswithcode.com/dataset/radio-observatory-anomaly-detection-dataset
    Dataset updated
    Jul 2, 2023
    Authors
    Michael Mesarcik; Albert-Jan Boonstra; Marco Iacobelli; Elena Ranguelova; Cees de Laat; Rob van Nieuwpoort
    Description

    The ROAD dataset is made up of observations from the Low Frequency Array (LOFAR) telescope. LOFAR comprises 52 stations across Europe, where each station is an array of 96 dual-polarisation low-band antennas (LBA) in the 10–90 MHz range and 48 or 96 dual-polarisation high-band antennas (HBA) in the 110–250 MHz range. The data are four-dimensional, with the dimensions corresponding to time, frequency, polarisation, and station. The observation parameters dictate the array configuration (i.e. the number of stations used), the number of frequency channels (Nf), the time sampling, as well as the overall integration time (Nt) of the observing session. Furthermore, the dual polarisation of the antennas results in a correlation product (Npol) of size 4. The ROAD dataset contains ten classes that describe various system-wide phenomena and anomalies from data obtained by the LOFAR telescope. These classes are categorised into four groups: data processing system failures, electronic anomalies, environmental effects, and unwanted astronomical events, as shown by the table below.

    Category | Description | Band | Polarisation | Occurrence rate | Num Samples
    Normal | All non-characterised effects | Both | All | - | 4687
    Electric fence | RFI emitted from electric fences | Low | Cross | | 64
    Data processing
    First order data loss | Data loss from consecutive time and/or frequency channels | Both | All | 0.02 | 146
    Second order data loss | Data loss from single frequency and/or single time channels | Both | All | 0.04 | 283
    Electronic systems
    High noise element | High power disturbances caused by miscellaneous events | Both | All | 0.01 | 88
    Oscillating tile | Amplifier going into oscillation | High | All | 0.01 | 56
    Astronomical events
    Source in side-lobes | A-team source passing through side-lobes | High | All | 0.06 | 446
    Galactic plane | Galactic plane passing through the main lobe of the antenna | Both | Cross | 0.08 | 550
    Solar storm | Strong emissions from the sun | Low | All | 0.02 | 147
    Environmental effects
    Lightning | Lightning storm | Both | All | 0.06 | 389
    Ionospheric RFI reflections | RFI reflected from the ionosphere | Low | All | 0.04 | 261
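
    The statement that dual-polarisation antennas give a correlation product of size 4 can be made concrete with a short sketch. The feed labels "X" and "Y" and the helper name visibility_shape are assumptions for illustration; the axis order follows the description (time, frequency, polarisation, station).

```python
from itertools import product

# Two feeds per antenna (labels are an assumption); every ordered pair
# of feeds is one correlation product.
FEEDS = ("X", "Y")
CORRELATION_PRODUCTS = ["".join(pair) for pair in product(FEEDS, repeat=2)]

def visibility_shape(n_time, n_freq, n_station):
    """Shape of one observation: (time, frequency, polarisation, station)."""
    return (n_time, n_freq, len(CORRELATION_PRODUCTS), n_station)
```

    Here CORRELATION_PRODUCTS has four entries (two feeds squared), which is where Npol = 4 comes from.
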
  3. Anomaly Detection Market Analysis North America, Europe, APAC, South...

    • technavio.com
    Updated Jul 15, 2024
    Cite
    Technavio (2024). Anomaly Detection Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, Germany, UK, China, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/anomaly-detection-market-industry-analysis
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    United States, Global
    Description

    Anomaly Detection Market Size 2024-2028

    The anomaly detection market size is forecast to increase by USD 3.71 billion at a CAGR of 13.63% between 2023 and 2028. Anomaly detection is a critical aspect of cybersecurity, particularly in sectors like healthcare where abnormal patient conditions or unusual network activity can have significant consequences. The market for anomaly detection solutions is experiencing significant growth due to several factors. Firstly, the increasing incidence of internal threats and cyber frauds has led organizations to invest in advanced tools for detecting and responding to anomalous behavior. Secondly, the infrastructural requirements for implementing these solutions are becoming more accessible, making them a viable option for businesses of all sizes. Data science and machine learning algorithms play a crucial role in anomaly detection, enabling accurate identification of anomalies and minimizing the risk of incorrect or misleading conclusions.

    However, data quality is a significant challenge in this field, as poor quality data can lead to false positives or false negatives, undermining the effectiveness of the solution. Overall, the market for anomaly detection solutions is expected to grow steadily in the coming years, driven by the need for enhanced cybersecurity and the increasing availability of advanced technologies.

    What will be the Anomaly Detection Market Size During the Forecast Period?

    Anomaly detection, also known as outlier detection, is a critical data analysis technique used to identify observations or events that deviate significantly from the normal behavior or expected patterns in data. These deviations, referred to as anomalies or outliers, can indicate infrastructure failures, breaking changes, manufacturing defects, equipment malfunctions, or unusual network activity. In various industries, including manufacturing, cybersecurity, healthcare, and data science, anomaly detection plays a crucial role in preventing incorrect or misleading conclusions. Artificial intelligence and machine learning algorithms, such as statistical tests (Grubbs test, Kolmogorov-Smirnov test), decision trees, isolation forest, naive Bayesian, autoencoders, local outlier factor, and k-means clustering, are commonly used for anomaly detection.

    Furthermore, these techniques help identify anomalies by analyzing data points and their statistical properties using charts, visualization, and ML models. For instance, in manufacturing, anomaly detection can help identify defective products, while in cybersecurity, it can detect unusual network activity. In healthcare, it can be used to identify abnormal patient conditions. By applying anomaly detection techniques, organizations can proactively address potential issues and mitigate risks, ensuring optimal performance and security.

    Market Segmentation

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Deployment
      • Cloud
      • On-premise

    Geography
      • North America
        • US
      • Europe
        • Germany
        • UK
      • APAC
        • China
        • Japan
      • South America
      • Middle East and Africa

    By Deployment Insights

    The cloud segment is estimated to witness significant growth during the forecast period. The market is shifting notably towards cloud-based solutions because of their advantages over traditional on-premises systems. Cloud-based anomaly detection offers quicker deployment, enhanced flexibility and scalability, real-time data visibility, and customization capabilities. Service providers offer these features with flexible payment models such as monthly subscriptions and pay-as-you-go, making cloud-based software a cost-effective choice. Anodot, Ltd, Cisco Systems Inc, IBM Corp, and SAS Institute Inc are some prominent companies offering cloud-based anomaly detection solutions in addition to on-premise alternatives. Cloud-based anomaly detection is becoming increasingly popular for security threats, architectural optimization, marketing strategies, finance, fraud detection, manufacturing defects, and equipment malfunctions because it provides real-time insights and a swift response to anomalies.

    The cloud segment accounted for USD 1.59 billion in 2018 and showed a gradual increase during the forecast period.

    Regional Insights

    When it comes to Anomaly Detection Market growth, North America is estimated to contribute 37% to the global market during the forecast period. Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast per

  4. Comparative Analysis of Data-Driven Anomaly Detection Methods

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Apr 11, 2025
    + more versions
    Cite
    Dashlink (2025). Comparative Analysis of Data-Driven Anomaly Detection Methods [Dataset]. https://catalog.data.gov/dataset/comparative-analysis-of-data-driven-anomaly-detection-methods
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    This paper provides a review of three different advanced machine learning algorithms for anomaly detection in continuous data streams from a ground-test firing of a subscale Solid Rocket Motor (SRM). This study compares Orca, one-class support vector machines, and the Inductive Monitoring System (IMS) for anomaly detection on the data streams. We measure the performance of the algorithm with respect to the detection horizon for situations where fault information is available. These algorithms have been also studied by the present authors (and other co-authors) as applied to liquid propulsion systems. The trade space will be explored between these algorithms for both types of propulsion systems.

  5. Controlled Anomalies Time Series (CATS) Dataset

    • zenodo.org
    • explore.openaire.eu
    bin
    Updated Jul 12, 2024
    + more versions
    Cite
    Patrick Fleith (2024). Controlled Anomalies Time Series (CATS) Dataset [Dataset]. http://doi.org/10.5281/zenodo.7646897
    Available download formats: bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Solenix Engineering GmbH
    Authors
    Patrick Fleith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.

    The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:

    • Multivariate (17 variables), including sensor readings and control signals. It simulates the operational behaviour of an arbitrary complex system including:
      • 4 Deliberate Actuations / Control Commands sent by a simulated operator / controller, for instance, commands of an operator to turn ON/OFF some equipment.
      • 3 Environmental Stimuli / External Forces acting on the system and affecting its behaviour, for instance, the wind affecting the orientation of a large ground antenna.
      • 10 Telemetry Readings representing the observable states of the complex system by means of sensors, for instance, a position, a temperature, a pressure, a voltage, current, humidity, velocity, acceleration, etc.
    • 5 million timestamps. Sensor readings are at a 1 Hz sampling frequency.
      • 1 million nominal observations (the first 1 million datapoints). This is suitable to start learning the "normal" behaviour.
      • 4 million observations that include both nominal and anomalous segments. This is suitable to evaluate both semi-supervised approaches (novelty detection) as well as unsupervised approaches (outlier detection).
    • 200 anomalous segments. One anomalous segment may contain several successive anomalous observations / timestamps. Only the last 4 million observations contain anomalous segments.
    • Different types of anomalies to understand what anomaly types can be detected by different approaches.
    • Fine control over ground truth. As this is a simulated system with deliberate anomaly injection, the start and end time of the anomalous behaviour is known very precisely. In contrast to real world datasets, there is no risk that the ground truth contains mislabelled segments which is often the case for real data.
    • Obvious anomalies. The simulated anomalies have been designed to be "easy" for the human eye to detect (i.e., there are very large spikes or oscillations), and hence also detectable by most algorithms. This makes the synthetic dataset useful for screening tasks (i.e., to eliminate algorithms that are not capable of detecting those obvious anomalies). However, during our initial experiments, the dataset turned out to be challenging enough even for state-of-the-art anomaly detection approaches, making it suitable also for regular benchmark studies.
    • Context provided. Some variables can only be considered anomalous in relation to other behaviours. A typical example consists of a light and switch pair. The light being either on or off is nominal, the same goes for the switch, but having the switch on and the light off shall be considered anomalous. In the CATS dataset, users can choose (or not) to use the available context, and external stimuli, to test the usefulness of the context for detecting anomalies in this simulation.
    • Pure signal ideal for robustness-to-noise analysis. The simulated signals are provided without noise: while this may seem unrealistic at first, it is an advantage since users of the dataset can decide to add on top of the provided series any type of noise and choose an amplitude. This makes it well suited to test how sensitive and robust detection algorithms are against various levels of noise.
    • No missing data. You can drop whatever data you want to assess the impact of missing values on your detector with respect to a clean baseline.
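
    The properties listed above suggest a simple experimental setup, sketched here under assumptions: the data are taken to be an in-memory sequence ordered by timestamp, and the helper names (split_nominal, add_gaussian_noise, drop_at_random) are hypothetical rather than part of the dataset's tooling.

```python
import random

N_NOMINAL = 1_000_000  # per the description, the first 1 million rows are nominal

def split_nominal(rows, n_nominal=N_NOMINAL):
    """Split by position into (nominal training rows, evaluation rows)."""
    return rows[:n_nominal], rows[n_nominal:]

def add_gaussian_noise(series, amplitude, seed=0):
    """Return a noisy copy of a noise-free signal.

    amplitude is used as the standard deviation of the added Gaussian noise.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, amplitude) for x in series]

def drop_at_random(series, fraction, seed=0):
    """Replace roughly `fraction` of the samples with None to mimic missing data."""
    rng = random.Random(seed)
    return [None if rng.random() < fraction else x for x in series]
```

    Running a detector on the clean series and again on the noisy or thinned copies gives the baseline comparison the description has in mind.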

    [1] Example Benchmark of Anomaly Detection in Time Series: “Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive Evaluation. PVLDB, 15(9): 1779 - 1797, 2022. doi:10.14778/3538598.3538602”

    About Solenix

    Solenix is an international company providing software engineering, consulting services and software products for the space market. Solenix is a dynamic company that brings innovative technologies and concepts to the aerospace market, keeping up to date with technical advancements and actively promoting spin-in and spin-out technology activities. We combine modern solutions which complement conventional practices. We aspire to achieve maximum customer satisfaction by fostering collaboration, constructivism, and flexibility.

  6. Satellite telemetry data anomaly prediction

    • kaggle.com
    Updated Apr 17, 2025
    + more versions
    Cite
    Orvile (2025). Satellite telemetry data anomaly prediction [Dataset]. https://www.kaggle.com/datasets/orvile/satellite-telemetry-data-anomaly-prediction
    Dataset updated
    Apr 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Orvile
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    OPSSAT-AD - anomaly detection dataset for satellite telemetry

    This is the AI-ready benchmark dataset (OPSSAT-AD) containing the telemetry data acquired on board OPS-SAT, a CubeSat mission that has been operated by the European Space Agency.

    It is accompanied by the paper with baseline results obtained using 30 supervised and unsupervised classic and deep machine learning algorithms for anomaly detection. They were trained and validated using the training-test dataset split introduced in this work, and we present a suggested set of quality metrics that should always be calculated to confront the new algorithms for anomaly detection while exploiting OPSSAT-AD. We believe that this work may become an important step toward building a fair, reproducible, and objective validation procedure that can be used to quantify the capabilities of the emerging anomaly detection techniques in an unbiased and fully transparent way.

    The included files are:

    • segments.csv with the acquired telemetry signals from the ESA OPS-SAT spacecraft,
    • dataset.csv with the extracted synthetic features computed for each manually split and labeled telemetry segment,
    • code files for data processing and example modeling (dataset_generator.ipynb for data processing, modeling_examples.ipynb with simple examples, requirements.txt with details on the Python configuration, and the LICENSE file).

    Citation Bogdan, R. (2024). OPSSAT-AD - anomaly detection dataset for satellite telemetry [Data set]. Ruszczak. https://doi.org/10.5281/zenodo.15108715

  7. Unified Spacecraft Anomaly Detection Benchmark Dataset

    • ieee-dataport.org
    Updated Mar 30, 2024
    Cite
    Ankit Srivastava (2024). Unified Spacecraft Anomaly Detection Benchmark Dataset [Dataset]. https://ieee-dataport.org/documents/unified-spacecraft-anomaly-detection-benchmark-dataset
    Dataset updated
    Mar 30, 2024
    Authors
    Ankit Srivastava
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    finance

  8. Anomaly Detection in High-Dimensional Data

    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles (2023). Anomaly Detection in High-Dimensional Data [Dataset]. http://doi.org/10.6084/m9.figshare.12844508.v2
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level, under certain circumstances. In this article, we propose an algorithm that addresses these limitations. We define an anomaly as an observation where its k-nearest neighbor distance with the maximum gap is significantly different from what we would expect if the distribution of k-nearest neighbors with the maximum gap is in the maximum domain of attraction of the Gumbel distribution. An approach based on extreme value theory is used for the anomalous threshold calculation. Using various synthetic and real datasets, we demonstrate the wide applicability and usefulness of our algorithm, which we call the stray algorithm. We also demonstrate how this algorithm can assist in detecting anomalies present in other data structures using feature engineering. We show the situations where the stray algorithm outperforms the HDoutliers algorithm both in accuracy and computational time. This framework is implemented in the open source R package stray. Supplementary materials for this article are available online.
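
    The "k-nearest neighbor distance with the maximum gap" can be sketched as follows. This is a simplified, brute-force illustration and not the stray package's implementation: the extreme-value-theory threshold is omitted, the function name knn_max_gap_scores is hypothetical, and the input is assumed to contain more than k points.

```python
import math

def knn_max_gap_scores(points, k=3):
    """Score each point by its neighbour distance at the largest gap.

    For every point, sort the distances to its k nearest neighbours, find
    the largest jump between successive distances (the first jump is
    measured from zero), and return the distance just after that jump.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )[:k]
        gaps = [dists[0]] + [dists[m] - dists[m - 1] for m in range(1, len(dists))]
        widest = max(range(len(gaps)), key=gaps.__getitem__)
        scores.append(dists[widest])
    return scores
```

    Using the distance after the largest gap, rather than the plain nearest-neighbour distance, means a small group of outliers that sit close to each other still receives a high score.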

  9. Global Anomaly Detection Solution Market Size By Type (Statistical Anomaly...

    • verifiedmarketresearch.com
    Updated Nov 1, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Global Anomaly Detection Solution Market Size By Type (Statistical Anomaly Detection, Machine Learning Anomaly Detection), By Application (Network Security, Fraud Detection, Risk Management), By Industry Vertical (Banking, Financial Services, And Insurance (BFSI), Retail And E-commerce, Healthcare), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/anomaly-detection-solution-market/
    Dataset updated
    Nov 1, 2024
    Dataset authored and provided by
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Description

    Anomaly Detection Solution Market size was valued at USD 6.18 Billion in 2024 and is projected to reach USD 19.99 Billion by 2032, growing at a CAGR of 15.80% from 2026 to 2032.
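
    The relationship between the two valuations and the growth rate can be checked with the standard compound annual growth rate formula. This is a sketch under an assumption: the span is measured from the 2024 valuation to 2032 (eight years), whereas the description quotes the rate for 2026 to 2032. The function name cagr is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Assumption: eight compounding years, 2024 valuation to 2032 projection.
rate = cagr(6.18, 19.99, 2032 - 2024)
projected = 6.18 * (1.0 + rate) ** 8  # recovers the 2032 figure by construction
```

    Under that eight-year assumption the result is close to 0.158, in line with the quoted 15.80%.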

    Global Anomaly Detection Solution Market Dynamics

    The key market dynamics that are shaping the global Anomaly Detection Solution Market include:

    Key Market Drivers:

    • Increasing Cybersecurity Threats: The surge in sophisticated cyberattacks and data breaches is a key driver of the Anomaly Detection Solution Market. Cybercriminals are increasingly targeting organizations with innovative tactics for breaching security systems. Anomaly detection solutions are critical for detecting unexpected patterns or behaviors that could indicate a threat such as unauthorized access or insider threats.
    • Growing Volume of Data: The exponential rise of data generated by businesses, fueled by digital transformation and IoT devices, needs excellent anomaly detection.

  10. Anomaly Detection Industry Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Mar 4, 2025
    Cite
    Data Insights Market (2025). Anomaly Detection Industry Report [Dataset]. https://www.datainsightsmarket.com/reports/anomaly-detection-industry-14721
    Available download formats: ppt, doc, pdf
    Dataset updated
    Mar 4, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The anomaly detection market is experiencing robust growth, fueled by the increasing volume and complexity of data generated across various industries. A compound annual growth rate (CAGR) of 16.22% from 2019 to 2024 suggests a significant market expansion, driven by the imperative for businesses to enhance cybersecurity, improve operational efficiency, and gain valuable insights from their data. Key drivers include the rising adoption of cloud computing, the proliferation of IoT devices generating massive datasets, and the growing need for real-time fraud detection and prevention, particularly within the BFSI (Banking, Financial Services, and Insurance) sector. The market is segmented by solution type (software, services), end-user industry (BFSI, manufacturing, healthcare, IT and telecommunications, others), and deployment (on-premise, cloud). The cloud deployment segment is anticipated to witness faster growth due to its scalability, cost-effectiveness, and ease of implementation. The increasing sophistication of cyberattacks and the need for proactive security measures are further bolstering demand for advanced anomaly detection solutions. While data privacy concerns and the complexity of integrating these solutions into existing IT infrastructure represent potential restraints, the overall market trajectory indicates a sustained period of expansion. Companies like SAS Institute, IBM, and Microsoft are actively shaping this market with their comprehensive offerings. The significant growth trajectory is expected to continue through 2033. The substantial investments in research and development by major players and the growing adoption across diverse sectors, including healthcare for predictive maintenance and anomaly detection in medical imaging, will continue to fuel the expansion. The competitive landscape is characterized by both established players offering comprehensive solutions and emerging niche players focusing on specific industry needs. 
    This competitive dynamism fosters innovation and drives the development of more efficient and sophisticated anomaly detection technologies. While regional variations exist, North America and Europe currently hold a significant market share, with Asia-Pacific poised for rapid expansion due to increasing digitalization and investment in advanced technologies.

    This report provides a detailed analysis of the global anomaly detection market, projecting robust growth from $XXX million in 2025 to $YYY million by 2033. The study covers the historical period (2019-2024), base year (2025), and forecast period (2025-2033), offering invaluable insights for businesses navigating this rapidly evolving landscape.

    Keywords: Anomaly detection, machine learning, AI, cybersecurity, fraud detection, predictive analytics, data mining, big data analytics, real-time analytics.

    Recent developments include:
    • June 2023: Wipro has launched a new suite of banking financial services built on Microsoft Cloud. The partnership will combine Microsoft Cloud capabilities with Wipro FullStride Cloud, leverage Wipro's and Capco's deep domain expertise in financial services, and develop new solutions to help financial services clients accelerate growth and deepen client relationships.
    • June 2023: Cisco has announced delivering on its promise of the AI-driven Cisco Security Cloud to simplify cybersecurity and empower people to do their best work from anywhere, regardless of the increasingly sophisticated threat landscape. Cisco invests in cutting-edge artificial intelligence and machine learning innovations that will empower security teams by simplifying operations and increasing efficacy.

    Key drivers for this market are: Increasing Number of Cyber Crimes, Increasing Adoption of Anomaly Detection Solutions in Software Testing.
    Potential restraints include: Open Source Alternatives Pose as a Threat.
    Notable trends are: BFSI is Expected to Hold a Significant Part of the Market Share.

  11. Data set for anomaly detection on a HPC system

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 19, 2023
    Cite
    Andrea Borghesi (2023). Data set for anomaly detection on a HPC system [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3251872
    Dataset updated
    Apr 19, 2023
    Dataset provided by
    Andrea Borghesi
    Andrea Bartolini
    Francesco Beneventi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set contains the data collected on the DAVIDE HPC system (CINECA & E4 & University of Bologna, Bologna, Italy) in the period March-May 2018.

    The data set has been used to train an autoencoder-based model to automatically detect anomalies in a semi-supervised fashion, on a real HPC system.

    This work is described in:

    1) "Anomaly Detection using Autoencoders in High Performance Computing Systems", Andrea Borghesi, Andrea Bartolini, Michele Lombardi, Michela Milano, Luca Benini, IAAI19 (proceedings in process) -- https://arxiv.org/abs/1902.08447

    2) "Online Anomaly Detection in HPC Systems", Andrea Borghesi, Antonio Libri, Luca Benini, Andrea Bartolini, AICAS19 (proceedings in process) -- https://arxiv.org/abs/1811.05269

    See the git repository for usage examples & details --> https://github.com/AndreaBorghesi/anomaly_detection_HPC

  12. ComplexVAD Video Anomaly Detection Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 12, 2024
    Cite
    Furkan Mumcu; Mike Jones; Anoop Cherian; Yasin Yilmaz (2024). ComplexVAD Video Anomaly Detection Dataset [Dataset]. http://doi.org/10.5281/zenodo.11475281
    Available download formats: zip
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Furkan Mumcu; Mike Jones; Anoop Cherian; Yasin Yilmaz
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Introduction

    The ComplexVAD dataset consists of 104 training and 113 testing video sequences taken from a static camera looking at a scene of a two-lane street with sidewalks on either side of the street and another sidewalk going across the street at a crosswalk. The videos were collected over a period of a few months on the campus of the University of South Florida using a camcorder with 1920 x 1080 pixel resolution. Videos were collected at various times during the day and on each day of the week. Videos vary in duration with most being about 12 minutes long. The total duration of all training and testing videos is a little over 34 hours. The scene includes cars, buses and golf carts driving in two directions on the street, pedestrians walking and jogging on the sidewalks and crossing the street, people on scooters, skateboards and bicycles on the street and sidewalks, and cars moving in the parking lot in the background. Branches of a tree also move at the top of many frames.

    The 113 testing videos have a total of 118 anomalous events consisting of 40 different anomaly types.

    Ground truth annotations are provided for each testing video in the form of bounding boxes around each anomalous event in each frame. Each bounding box is also labeled with a track number, meaning each anomalous event is labeled as a track of bounding boxes. A single frame can have more than one anomaly labeled.

    At a Glance

    • The size of the unzipped dataset is ~39GB
    • The dataset consists of Train sequences (containing only videos with normal activity), Test sequences (containing some anomalous activity), a ground truth annotation file for each Test sequence, and a README.md file describing the data organization and ground truth annotation format.
    • The zip files contain a Train directory, a Test directory, an annotations directory, and a README.md file.

    License

    The ComplexVAD dataset is released under the CC-BY-SA-4.0 license.

    All data:

    Created by Mitsubishi Electric Research Laboratories (MERL), 2024
    
    SPDX-License-Identifier: CC-BY-SA-4.0
  13. Unsupervised Anomaly Detection for Liquid-Fueled Rocket Prop...

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Apr 9, 2025
    + more versions
    Dashlink (2025). Unsupervised Anomaly Detection for Liquid-Fueled Rocket Prop... [Dataset]. https://catalog.data.gov/dataset/unsupervised-anomaly-detection-for-liquid-fueled-rocket-prop
    Explore at:
    Dataset updated
    Apr 9, 2025
    Dataset provided by
    Dashlink
    Description

    Title: Unsupervised Anomaly Detection for Liquid-Fueled Rocket Propulsion Health Monitoring. Abstract: This article describes the results of applying four unsupervised anomaly detection algorithms to data from two rocket propulsion testbeds. The first testbed uses historical data from the Space Shuttle Main Engine. The second testbed uses data from an experimental rocket engine test stand located at NASA Stennis Space Center. The article describes nine anomalies detected by the four algorithms. The four algorithms use four different definitions of anomalousness. Orca uses a nearest-neighbor approach, defining a point to be an anomaly if its nearest neighbors in the data space are far away from it. The Inductive Monitoring System clusters the training data, and then uses the distance to the nearest cluster as its measure of anomalousness. GritBot learns rules from the training data, and then classifies points as anomalous if they violate these rules. One-class support vector machines map the data into a high-dimensional space in which most of the normal points are on one side of a hyperplane, and then classify points on the other side of the hyperplane as anomalous. Because of these different definitions of anomalousness, different algorithms detect different anomalies. We therefore conclude that it is useful to use multiple algorithms.
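
    The nearest-neighbor definition of anomalousness in the abstract above can be illustrated with a minimal sketch. This is not Orca's actual implementation: the scoring rule (mean distance to the k nearest neighbors), the function name `knn_anomaly_scores`, and the sample readings are all assumptions made for illustration.

```python
import math

def knn_anomaly_scores(points, k=2):
    """Score each point by its mean Euclidean distance to its k nearest neighbours.

    A larger score means the point's nearest neighbours are farther away,
    i.e. the point is more anomalous under this (assumed) definition.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical 2-D readings: four clustered points and one far from the rest.
readings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_anomaly_scores(readings, k=2)

# Index of the point whose nearest neighbours are farthest away.
most_anomalous = max(range(len(readings)), key=scores.__getitem__)
```

    This brute-force version compares every pair of points, so it is only suitable for small samples; it is meant to show the definition, not a scalable method.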

  14. Anomaly Detection

    • data.mendeley.com
    • narcis.nl
    Updated Jan 19, 2017
    Antonio Pescape' (2017). Anomaly Detection [Dataset]. http://doi.org/10.17632/dkg3b6vz65.1
    Explore at:
    Dataset updated
    Jan 19, 2017
    Authors
    Antonio Pescape'
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Time Series for Anomaly Detection

    The file is a Matlab data file. It contains 3 time series, representing the packet rate of 3 different traffic traces, related to inbound traffic of the UNINA Network. The traces were collected in year 2004. The packet rate was sampled with a period of 2 seconds and each trace lasts 2 hours. These data have been used for studies on volume-based anomaly detection and are related to time intervals during which no anomalies were observed on the UNINA network by the NOC operators. In other words, they can be considered anomaly-free.
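
    The sampling figures above imply the length of each series. The short sketch below works this out; it assumes one sample per 2-second period over exactly 2 hours, which the description implies but does not state, and the helper name `sample_time` is hypothetical.

```python
SAMPLING_PERIOD_S = 2            # packet rate sampled every 2 seconds
TRACE_DURATION_S = 2 * 60 * 60   # each trace lasts 2 hours

# Assumed: one sample per period, so each series has duration / period samples.
samples_per_trace = TRACE_DURATION_S // SAMPLING_PERIOD_S

def sample_time(index):
    """Return the offset in seconds of the sample at a zero-based index."""
    if not 0 <= index < samples_per_trace:
        raise IndexError("sample index outside the trace")
    return index * SAMPLING_PERIOD_S
```

    The actual variable names and array shapes inside the Matlab file are not given in the description, so they should be checked after loading the file.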

    When referring to our Anomaly Detection Dataset, please cite the following reference:

    A. Dainotti, A. Pescapè, G. Ventre, "A cascade architecture for DoS attacks detection based on the wavelet transform", Journal of Computer Security, Volume 17, Number 6/2009, Pages 945-968.

  15. Data from: Anomaly Detection in a Fleet of Systems

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datasets.ai
    • +3more
    Updated Feb 18, 2025
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). Anomaly Detection in a Fleet of Systems [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/anomaly-detection-in-a-fleet-of-systems
    Explore at:
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    A fleet is a group of systems (e.g., cars, aircraft) that are designed and manufactured the same way and are intended to be used the same way. For example, a fleet of delivery trucks may consist of one hundred instances of a particular model of truck, each of which is intended for the same type of service: almost the same amount of time and distance driven every day, approximately the same total weight carried, etc. For this reason, one may imagine that data mining for fleet monitoring merely involves collecting operating data from the multiple systems in the fleet and developing some sort of model, such as a model of normal operation that can be used for anomaly detection. However, each member of the fleet will be unique in some ways: there will be minor variations in manufacturing, quality of parts, and usage. For this reason, the assumption made by typical machine learning and statistics algorithms, that all the data are independent and identically distributed, is not correct. Data from each system in the fleet must be treated as unique so that one can notice significant changes in the operation of that system.

  16. Anomaly Detection Service Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Dec 3, 2024
    Dataintelo (2024). Anomaly Detection Service Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/anomaly-detection-service-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Dec 3, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Anomaly Detection Service Market Outlook




    The anomaly detection service market size is poised for substantial growth, with its valuation estimated at USD 4.5 billion in 2023 and projected to reach USD 12.8 billion by 2032, reflecting a robust CAGR of 12.4% during the forecast period. The exponential growth trajectory of this market is underpinned by several critical factors, including the increasing reliance on data-driven decision-making across industries, the rising sophistication of cyber threats, and the need for real-time monitoring and analysis. The growing integration of advanced technologies such as artificial intelligence and machine learning in anomaly detection solutions is further catalyzing market expansion by enhancing accuracy and reducing false positives.




    One of the primary growth drivers of the anomaly detection service market is the escalating volume of data generated across diverse sectors. With the proliferation of IoT devices, mobile applications, and digital platforms, industries are inundated with massive datasets that require real-time analysis to derive actionable insights. Anomaly detection services provide the capability to sift through vast amounts of data to identify irregular patterns and potential threats, enabling organizations to act swiftly and mitigate risks. Additionally, the increasing focus on enhanced customer experiences and operational efficiency is propelling businesses to invest in robust anomaly detection solutions that ensure seamless operations and prevent disruptions.




    The mounting frequency and complexity of cyberattacks have significantly contributed to the demand for advanced anomaly detection services. As cybercriminals employ more sophisticated methods to breach security systems, traditional security measures are often inadequate. Anomaly detection services, leveraging machine learning and artificial intelligence, can detect unusual patterns and deviations from normal behavior, thus providing an additional layer of security against cyber threats. Furthermore, regulatory requirements mandating data protection and privacy have compelled organizations to adopt anomaly detection solutions to comply with standards and safeguard sensitive information, driving further market growth.




    Technological advancements and innovations in the field of artificial intelligence and big data analytics are playing a pivotal role in shaping the anomaly detection service market. These technologies enable the development of more refined and accurate detection models that can process and analyze data in real time. The integration of AI and ML algorithms not only increases the precision of anomaly detection but also helps in predicting future anomalies, thereby allowing organizations to take pre-emptive measures. The ability to customize and scale solutions according to specific organizational needs is another factor that is attracting enterprises towards investing in anomaly detection services.




    The regional outlook for the anomaly detection service market is characterized by significant variations in growth rates and adoption patterns across different geographies. North America remains a dominant region due to the early adoption of cutting-edge technologies, a strong emphasis on cybersecurity, and substantial investments in IT infrastructure. Europe is also witnessing steady growth, driven by stringent regulatory norms and the increasing focus on safeguarding digital assets. Meanwhile, the Asia Pacific region is anticipated to exhibit the highest CAGR over the forecast period, fueled by rapid digital transformation, expanding IT and telecommunications sectors, and increasing awareness about the importance of cybersecurity in emerging economies.



    Component Analysis




    In the anomaly detection service market, the component segmentation into software and services encapsulates a dynamic aspect of market growth. The software segment is witnessing a significant surge in demand as organizations increasingly seek sophisticated tools capable of real-time anomaly detection. These software solutions, often powered by AI and ML algorithms, facilitate the seamless integration of data from various sources, enhancing overall system efficiency. The burgeoning need for customizable and scalable solutions that can be tailored to specific industry requirements positions the software segment as a pivotal growth driver in the anomaly detection landscape.




    On the other hand, the services segment is equally pivotal,

  17. Data from: Nonparametric Anomaly Detection on Time Series of Graphs

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
    Explore at:
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly, two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.

  18. Data from: Detecting Anomalies in Multivariate Data Sets with Switching...

    • data.nasa.gov
    • s.cnmilf.com
    • +3more
    Updated Mar 31, 2025
    nasa.gov (2025). Detecting Anomalies in Multivariate Data Sets with Switching Sequences and Continuous Streams [Dataset]. https://data.nasa.gov/dataset/detecting-anomalies-in-multivariate-data-sets-with-switching-sequences-and-continuous-stre
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded in one second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. Here, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large data bases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequence of events in the discrete streams can lead to off-nominal system performance. We discuss the application domain, novel algorithms, and also briefly discuss results on synthetic and real-world data sets. Our algorithm uncovers operationally significant events in high dimensional data streams in the aviation industry which are not detectable using state of the art methods.

  19. Anomaly detection from sound data- Fan

    • kaggle.com
    Updated Sep 22, 2023
    Vuppala Adithya Sairam (2023). Anomaly detection from sound data- Fan [Dataset]. https://www.kaggle.com/datasets/vuppalaadithyasairam/anomaly-detection-from-sound-data-fan
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 22, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vuppala Adithya Sairam
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The dataset is a subset of Task 2 of the DCASE 2020 Challenge. The challenge is to identify anomalous behavior of a machine from its audio data. The dataset has three parts (training, validation and testing), which have been combined into a single dataset.

    Training- https://zenodo.org/record/3678171

    Validation- https://zenodo.org/record/3727685

    Testing- https://zenodo.org/record/3841772

  20. SAIVT-Campus Dataset

    • researchdatafinder.qut.edu.au
    Updated Jun 30, 2016
    + more versions
    Dr Simon Denman (2016). SAIVT-Campus Dataset [Dataset]. https://researchdatafinder.qut.edu.au/individual/n2531
    Explore at:
    Dataset updated
    Jun 30, 2016
    Dataset provided by
    Queensland University of Technology (QUT)
    Authors
    Dr Simon Denman
    Description

    SAIVT-Campus Dataset

    Overview

    The SAIVT-Campus Database is an abnormal event detection database captured on a university campus, where the abnormal events are caused by the onset of a storm. Contact Dr Simon Denman or Dr Jingxin Xu for more information.

    Licensing

    The SAIVT-Campus database is © 2012 QUT and is licensed under the Creative Commons Attribution-ShareAlike 3.0 Australia License.

    Attribution

    To attribute this database, please include the following citation: Xu, Jingxin, Denman, Simon, Fookes, Clinton B., & Sridharan, Sridha (2012) Activity analysis in complicated scenes using DFT coefficients of particle trajectories. In 9th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2012), 18-21 September 2012, Beijing, China. available at eprints.

    Acknowledging the Database in your Publications

    In addition to citing our paper, we kindly request that the following text be included in an acknowledgements section at the end of your publications: We would like to thank the SAIVT Research Labs at Queensland University of Technology (QUT) for freely supplying us with the SAIVT-Campus database for our research.

    Installing the SAIVT-Campus database

    After downloading and unpacking the archive, you should have the following structure:

    SAIVT-Campus
    +-- LICENCE.txt
    +-- README.txt
    +-- test_dataset.avi
    +-- training_dataset.avi
    +-- Xu2012 - Activity analysis in complicated scenes using DFT coefficients of particle trajectories.pdf

    Notes

    The SAIVT-Campus dataset is captured at the Queensland University of Technology, Australia.

    It contains two video files from real-world surveillance footage without any actors:

    training_dataset.avi (the training dataset)
    test_dataset.avi (the test dataset).
    

    This dataset contains a mixture of crowd densities and it has been used in the following paper for abnormal event detection:

    Xu, Jingxin, Denman, Simon, Fookes, Clinton B., & Sridharan, Sridha (2012) Activity analysis in complicated scenes using DFT coefficients of particle trajectories. In 9th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2012), 18-21 September 2012, Beijing, China. Available at eprints. 
    This paper is also included with the database (Xu2012 - Activity analysis in complicated scenes using DFT coefficients of particle trajectories.pdf).

    Both video files are one hour in duration.
    

    The normal activities include pedestrians entering or exiting the building, entering or exiting a lecture theatre (yellow door), and going to the counter at the bottom right. The abnormal events are caused by a heavy rain outside, and include people running in from the rain, people walking towards the door to exit and turning back, wearing raincoats, loitering and standing near the door and overcrowded scenes. The rain happens only in the later part of the test dataset.

    As a result, we assume that the training dataset only contains the normal activities. We have manually annotated the data as follows:

    the training dataset does not have abnormal scenes
    the test dataset separates into two parts: only normal activities occur from 00:00:00 to 00:47:16; abnormalities are present from 00:47:17 to 01:00:00.

    We annotate the time 00:47:17 as the start time for the abnormal events because, from this time on, people stop walking or turn back from walking towards the door to exit, which indicates that the rain outside the building has influenced the activities inside the building. Should you have any questions, please do not hesitate to contact Dr Jingxin Xu.
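
    The time-based annotation of the test video can be turned into a per-timestamp label with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and it labels by timestamp rather than by frame, since the frame rate of the video is not stated in the description.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' timestamp to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Boundaries taken from the annotation of test_dataset.avi.
ABNORMAL_START = to_seconds("00:47:17")
VIDEO_END = to_seconds("01:00:00")

def is_abnormal(hms):
    """Return True if the timestamp falls in the annotated abnormal part."""
    return ABNORMAL_START <= to_seconds(hms) <= VIDEO_END
```

    For example, `is_abnormal("00:47:16")` is False and `is_abnormal("00:47:17")` is True. Mapping these labels to frame indices would additionally require the frame rate, which should be read from the video file itself.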
    