100+ datasets found
  1. d

    Data from: Distributed Anomaly Detection using 1-class SVM for Vertically...

    • catalog.data.gov
    • data.nasa.gov
    • +2more
    Updated Apr 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data [Dataset]. https://catalog.data.gov/dataset/distributed-anomaly-detection-using-1-class-svm-for-vertically-partitioned-data
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amount of flight operational data is downloaded for different commercial airlines. These different types of datasets need to be analyzed for finding outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations with only a subset of features available at any location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose only centralizes a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).

  2. d

    NCARDRS Congenital Anomaly Official Statistics Report, 2020

    • digital.nhs.uk
    Updated Dec 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2022). NCARDRS Congenital Anomaly Official Statistics Report, 2020 [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/ncardrs-congenital-anomaly-statistics-annual-data
    Explore at:
    Dataset updated
    Dec 1, 2022
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditionshttps://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Description

    This publication contains information on congenital anomalies in babies delivered in England in 2020. It includes this report showing key findings, spreadsheet tables with more detailed estimates and a methodology document.

  3. Comparative Analysis of Data-Driven Anomaly Detection Methods

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1more
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). Comparative Analysis of Data-Driven Anomaly Detection Methods [Dataset]. https://data.nasa.gov/dataset/comparative-analysis-of-data-driven-anomaly-detection-methods
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    This paper provides a review of three different advanced machine learning algorithms for anomaly detection in continuous data streams from a ground-test firing of a subscale Solid Rocket Motor (SRM). This study compares Orca, one-class support vector machines, and the Inductive Monitoring System (IMS) for anomaly detection on the data streams. We measure the performance of the algorithm with respect to the detection horizon for situations where fault information is available. These algorithms have been also studied by the present authors (and other co-authors) as applied to liquid propulsion systems. The trade space will be explored between these algorithms for both types of propulsion systems.

  4. f

    Anomaly Detection in High-Dimensional Data

    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles (2023). Anomaly Detection in High-Dimensional Data [Dataset]. http://doi.org/10.6084/m9.figshare.12844508.v2
    Explore at:
    txtAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level, under certain circumstances. In this article, we propose an algorithm that addresses these limitations. We define an anomaly as an observation where its k-nearest neighbor distance with the maximum gap is significantly different from what we would expect if the distribution of k-nearest neighbors with the maximum gap is in the maximum domain of attraction of the Gumbel distribution. An approach based on extreme value theory is used for the anomalous threshold calculation. Using various synthetic and real datasets, we demonstrate the wide applicability and usefulness of our algorithm, which we call the stray algorithm. We also demonstrate how this algorithm can assist in detecting anomalies present in other data structures using feature engineering. We show the situations where the stray algorithm outperforms the HDoutliers algorithm both in accuracy and computational time. This framework is implemented in the open source R package stray. Supplementary materials for this article are available online.

  5. c

    Data from: Detecting Anomalies in Multivariate Data Sets with Switching...

    • s.cnmilf.com
    • datasets.ai
    • +3more
    Updated Apr 11, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Detecting Anomalies in Multivariate Data Sets with Switching Sequences and Continuous Streams [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/detecting-anomalies-in-multivariate-data-sets-with-switching-sequences-and-continuous-stre
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded in one second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. Here, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large data bases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequence of events in the discrete streams can lead to off-nominal system performance. We discuss the application _domain, novel algorithms, and also briefly discuss results on synthetic and real-world data sets. Our algorithm uncovers operationally significant events in high dimensional data streams in the aviation industry which are not detectable using state of the art methods.

  6. Z

    Controlled Anomalies Time Series (CATS) Dataset

    • data.niaid.nih.gov
    Updated Jul 11, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Patrick Fleith (2024). Controlled Anomalies Time Series (CATS) Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7646896
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset authored and provided by
    Patrick Fleith
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.

    The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:

    Multivariate (17 variables) including sensors reading and control signals. It simulates the operational behaviour of an arbitrary complex system including:

    4 Deliberate Actuations / Control Commands sent by a simulated operator / controller, for instance, commands of an operator to turn ON/OFF some equipment.

    3 Environmental Stimuli / External Forces acting on the system and affecting its behaviour, for instance, the wind affecting the orientation of a large ground antenna.

    10 Telemetry Readings representing the observable states of the complex system by means of sensors, for instance, a position, a temperature, a pressure, a voltage, current, humidity, velocity, acceleration, etc.

    5 million timestamps. Sensors readings are at 1Hz sampling frequency.

    1 million nominal observations (the first 1 million datapoints). This is suitable to start learning the "normal" behaviour.

    4 million observations that include both nominal and anomalous segments. This is suitable to evaluate both semi-supervised approaches (novelty detection) as well as unsupervised approaches (outlier detection).

    200 anomalous segments. One anomalous segment may contain several successive anomalous observations / timestamps. Only the last 4 million observations contain anomalous segments.

    Different types of anomalies to understand what anomaly types can be detected by different approaches. The categories are available in the dataset and in the metadata.

    Fine control over ground truth. As this is a simulated system with deliberate anomaly injection, the start and end time of the anomalous behaviour is known very precisely. In contrast to real world datasets, there is no risk that the ground truth contains mislabelled segments which is often the case for real data.

    Suitable for root cause analysis. In addition to the anomaly category, the time series channel in which the anomaly first developed itself is recorded and made available as part of the metadata. This can be useful to evaluate the performance of algorithm to trace back anomalies to the right root cause channel.

    Affected channels. In addition to the knowledge of the root cause channel in which the anomaly first developed itself, we provide information of channels possibly affected by the anomaly. This can also be useful to evaluate the explainability of anomaly detection systems which may point out to the anomalous channels (root cause and affected).

    Obvious anomalies. The simulated anomalies have been designed to be "easy" to be detected for human eyes (i.e., there are very large spikes or oscillations), hence also detectable for most algorithms. It makes this synthetic dataset useful for screening tasks (i.e., to eliminate algorithms that are not capable to detect those obvious anomalies). However, during our initial experiments, the dataset turned out to be challenging enough even for state-of-the-art anomaly detection approaches, making it suitable also for regular benchmark studies.

    Context provided. Some variables can only be considered anomalous in relation to other behaviours. A typical example consists of a light and switch pair. The light being either on or off is nominal, the same goes for the switch, but having the switch on and the light off shall be considered anomalous. In the CATS dataset, users can choose (or not) to use the available context, and external stimuli, to test the usefulness of the context for detecting anomalies in this simulation.

    Pure signal ideal for robustness-to-noise analysis. The simulated signals are provided without noise: while this may seem unrealistic at first, it is an advantage since users of the dataset can decide to add on top of the provided series any type of noise and choose an amplitude. This makes it well suited to test how sensitive and robust detection algorithms are against various levels of noise.

    No missing data. You can drop whatever data you want to assess the impact of missing values on your detector with respect to a clean baseline.

    Change Log

    Version 2

    Metadata: we include a metadata.csv with information about:

    Anomaly categories

    Root cause channel (signal in which the anomaly is first visible)

    Affected channel (signal in which the anomaly might propagate) through coupled system dynamics

    Removal of anomaly overlaps: version 1 contained anomalies which overlapped with each other resulting in only 190 distinct anomalous segments. Now, there are no more anomaly overlaps.

    Two data files: CSV and parquet for convenience.

    [1] Example Benchmark of Anomaly Detection in Time Series: “Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive Evaluation. PVLDB, 15(9): 1779 - 1797, 2022. doi:10.14778/3538598.3538602”

    About Solenix

    Solenix is an international company providing software engineering, consulting services and software products for the space market. Solenix is a dynamic company that brings innovative technologies and concepts to the aerospace market, keeping up to date with technical advancements and actively promoting spin-in and spin-out technology activities. We combine modern solutions which complement conventional practices. We aspire to achieve maximum customer satisfaction by fostering collaboration, constructivism, and flexibility.

  7. f

    Data from: Nonparametric Anomaly Detection on Time Series of Graphs

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly, two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.

  8. Discovering Anomalous Aviation Safety Events Using Scalable Data Mining...

    • data.nasa.gov
    • datadiscoverystudio.org
    • +5more
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). Discovering Anomalous Aviation Safety Events Using Scalable Data Mining Algorithms [Dataset]. https://data.nasa.gov/dataset/discovering-anomalous-aviation-safety-events-using-scalable-data-mining-algorithms
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    The worldwide civilian aviation system is one of the most complex dynamical systems created. Most modern commercial aircraft have onboard flight data recorders that record several hundred discrete and continuous parameters at approximately 1Hz for the entire duration of the flight. These data contain information about the flight control systems, actuators, engines, landing gear, avionics, and pilot commands. In this paper, recent advances in the development of a novel knowledge discovery process consisting of a suite of data mining techniques for identifying precursors to aviation safety incidents are discussed. The data mining techniques include scalable multiple-kernel learning for large-scale distributed anomaly detection. A novel multivariate time-series search algorithm is used to search for signatures of discovered anomalies on massive datasets. The process can identify operationally significant events due to environmental, mechanical, and human factors issues in the high-dimensional flight operations quality assurance data. All discovered anomalies are validated by a team of independent domain experts. This novel automated knowledge discovery process is aimed at complementing the state-of-the-art human-generated exceedance-based analysis that fails to discover previously unknown aviation safety incidents. In this paper, the discovery pipeline, the methods used, and some of the significant anomalies detected on real-world commercial aviation data are discussed.

  9. i

    Unified Spacecraft Anomaly Detection Benchmark Dataset

    • ieee-dataport.org
    Updated Mar 30, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ankit Srivastava (2024). Unified Spacecraft Anomaly Detection Benchmark Dataset [Dataset]. https://ieee-dataport.org/documents/unified-spacecraft-anomaly-detection-benchmark-dataset
    Explore at:
    Dataset updated
    Mar 30, 2024
    Authors
    Ankit Srivastava
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    finance

  10. Anomaly Detection Market Analysis North America, Europe, APAC, South...

    • technavio.com
    Updated Jul 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Technavio (2024). Anomaly Detection Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, Germany, UK, China, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/anomaly-detection-market-industry-analysis
    Explore at:
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Global, United States
    Description

    Snapshot img

    Anomaly Detection Market Size 2024-2028

    The anomaly detection market size is forecast to increase by USD 3.71 billion at a CAGR of 13.63% between 2023 and 2028. Anomaly detection is a critical aspect of cybersecurity, particularly in sectors like healthcare where abnormal patient conditions or unusual network activity can have significant consequences. The market for anomaly detection solutions is experiencing significant growth due to several factors. Firstly, the increasing incidence of internal threats and cyber frauds has led organizations to invest in advanced tools for detecting and responding to anomalous behavior. Secondly, the infrastructural requirements for implementing these solutions are becoming more accessible, making them a viable option for businesses of all sizes. Data science and machine learning algorithms play a crucial role in anomaly detection, enabling accurate identification of anomalies and minimizing the risk of incorrect or misleading conclusions.

    However, data quality is a significant challenge in this field, as poor quality data can lead to false positives or false negatives, undermining the effectiveness of the solution. Overall, the market for anomaly detection solutions is expected to grow steadily in the coming years, driven by the need for enhanced cybersecurity and the increasing availability of advanced technologies.

    What will be the Anomaly Detection Market Size During the Forecast Period?

    Request Free Sample

    Anomaly detection, also known as outlier detection, is a critical data analysis technique used to identify observations or events that deviate significantly from the normal behavior or expected patterns in data. These deviations, referred to as anomalies or outliers, can indicate infrastructure failures, breaking changes, manufacturing defects, equipment malfunctions, or unusual network activity. In various industries, including manufacturing, cybersecurity, healthcare, and data science, anomaly detection plays a crucial role in preventing incorrect or misleading conclusions. Artificial intelligence and machine learning algorithms, such as statistical tests (Grubbs test, Kolmogorov-Smirnov test), decision trees, isolation forest, naive Bayesian, autoencoders, local outlier factor, and k-means clustering, are commonly used for anomaly detection.

    Furthermore, these techniques help identify anomalies by analyzing data points and their statistical properties using charts, visualization, and ML models. For instance, in manufacturing, anomaly detection can help identify defective products, while in cybersecurity, it can detect unusual network activity. In healthcare, it can be used to identify abnormal patient conditions. By applying anomaly detection techniques, organizations can proactively address potential issues and mitigate risks, ensuring optimal performance and security.

    Market Segmentation

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Deployment
    
      Cloud
      On-premise
    
    
    Geography
    
      North America
    
        US
    
    
      Europe
    
        Germany
        UK
    
    
      APAC
    
        China
        Japan
    
    
      South America
    
    
    
      Middle East and Africa
    

    By Deployment Insights

    The cloud segment is estimated to witness significant growth during the forecast period. The market is witnessing a notable shift towards cloud-based solutions due to their numerous advantages over traditional on-premises systems. Cloud-based anomaly detection offers breaking changes such as quicker deployment, enhanced flexibility, and scalability, real-time data visibility, and customization capabilities. These features are provided by service providers with flexible payment models like monthly subscriptions and pay-as-you-go, making cloud-based software a cost-effective and economical choice. Anodot, Ltd, Cisco Systems Inc, IBM Corp, and SAS Institute Inc are some prominent companies offering cloud-based anomaly detection solutions in addition to on-premise alternatives. In the context of security threats, architectural optimization, marketing strategies, finance, fraud detection, manufacturing, and defects, equipment malfunctions, cloud-based anomaly detection is becoming increasingly popular due to its ability to provide real-time insights and swift response to anomalies.

    Get a glance at the market share of various segments Request Free Sample

    The cloud segment accounted for USD 1.59 billion in 2018 and showed a gradual increase during the forecast period.

    Regional Insights

    When it comes to Anomaly Detection Market growth, North America is estimated to contribute 37% to the global market during the forecast period. Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast per

  11. d

    Anomaly Detection for Complex Systems

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Apr 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Anomaly Detection for Complex Systems [Dataset]. https://catalog.data.gov/dataset/anomaly-detection-for-complex-systems
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    In performance maintenance in large, complex systems, sensor information from sub-components tends to be readily available, and can be used to make predictions about the system's health and diagnose possible anomalies. However, existing methods can only use predictions of individual component anomalies to guess at systemic problems, not accurately estimate the magnitude of the problem, nor prescribe good solutions. Since physical complex systems usually have well-defined semantics of operation, we here propose using anomaly detection techniques drawn from data mining in conjunction with an automated theorem prover working on a domain-specific knowledge base to perform systemic anomalydetection on complex systems. For clarity of presentation, the remaining content of this submission is presented compactly in Fig 1.

  12. Data from: Theoretically Optimal Distributed Anomaly Detection

    • data.nasa.gov
    • datasets.ai
    • +2more
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). Theoretically Optimal Distributed Anomaly Detection [Dataset]. https://data.nasa.gov/dataset/theoretically-optimal-distributed-anomaly-detection
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    A novel general framework for distributed anomaly detection with theoretical performance guarantees is proposed. Our algorithmic approach combines existing anomaly detection procedures with a novel method for computing global statistics using local sufficient statistics. Under a Gaussian assumption, our distributed algorithm is guaranteed to perform as well as its centralized counterpart, a condition we call Ôzero information lossÕ. We further report experimental results on synthetic as well as real-world data to demonstrate the viability of our approach.

  13. P

    Radio observatory anomaly detection dataset Dataset

    • paperswithcode.com
    Updated Jul 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Michael Mesarcik; Albert-Jan Boonstra; Marco Iacobelli; Elena Ranguelova; Cees de Laat; Rob van Nieuwpoort (2023). Radio observatory anomaly detection dataset Dataset [Dataset]. https://paperswithcode.com/dataset/radio-observatory-anomaly-detection-dataset
    Explore at:
    Dataset updated
    Jul 2, 2023
    Authors
    Michael Mesarcik; Albert-Jan Boonstra; Marco Iacobelli; Elena Ranguelova; Cees de Laat; Rob van Nieuwpoort
    Description

    The ROAD dataset is made up of observations from the Low Frequency Array (LOFAR) telescope. LOFAR is comprised of 52 stations across Europe, where each station is an array of 96 dual polarisation low-band antennas (LBA) in the 10–90 MHz range and 48 or 96 dual polarisation high-band antenna antennas (HBA) in the 110–250 MHz range. The data are four dimensional, with the dimensions corresponding to time, frequency, polarisation, and station. dictate the array configuration (i.e. the number of stations used), the number of frequency channels (Nf), the time sampling, as well as the overall integration time (Nt) of the observing session. Furthermore, the dual-polarisation of the antennas results in a correlation product (Npol) of size 4. The ROAD dataset contains ten classes that describe various system-wide phenomena and anomalies from data obtained by the LOFAR telescope. These classes are categorised into four groups: data processing system failures, electronic anomalies, environmental effects, and unwanted astronomical events as shown by the table below.

    CategoryDescriptionBandPolarisationOccurrence rateNum Samples
    NormalAll non-characterised effectsBothAll-4687
    % Electric fenceRFI emitted from electric fencesLowCross64
    Data processing
    First order data lossData loss from consecutive time and/or frequency channelsBothAll0.02146
    Second order data lossData loss from single frequency and/or single time channelsBothAll0.04283
    Electronic systems
    High noise elementHigh power disturbances caused by miscellaneous eventsBothAll0.0188
    Oscillating tileAmplifier going into oscillationHighAll0.0156
    Astronomical events
    Source in side-lobesA-team source passing through side-lobesHighAll0.06446
    Galactic planeGalactic plane passing through the main lobe of the antennaBothCross0.08550
    Solar stormStrong emissions from the sunLowAll0.02147
    Environmental effects
    LightningLightning stormBothAll0.06389
    Ionospheric RFI reflectionsRFI reflected from the ionosphereLowAll0.04261
  14. c

    Data from: Discovering System Health Anomalies using Data Mining Techniques

    • s.cnmilf.com
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +3more
    Updated Apr 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Discovering System Health Anomalies using Data Mining Techniques [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/discovering-system-health-anomalies-using-data-mining-techniques
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    We discuss a statistical framework that underlies envelope detection schemes as well as dynamical models based on Hidden Markov Models (HMM) that can encompass both discrete and continuous sensor measurements for use in Integrated System Health Management (ISHM) applications. The HMM allows for the rapid assimilation, analysis, and discovery of system anomalies. We motivate our work with a discussion of an aviation problem where the identification of anomalous sequences is essential for safety reasons. The data in this application are discrete and continuous sensor measurements and can be dealt with seamlessly using the methods described here to discover anomalous flights. We specifically treat the problem of discovering anomalous features in the time series that may be hidden from the sensor suite and compare those methods to standard envelope detection methods on test data designed to accentuate the differences between the two methods. Identification of these hidden anomalies is crucial to building stable, reusable, and cost-efficient systems. We also discuss a data mining framework for the analysis and discovery of anomalies in high-dimensional time series of sensor measurements that would be found in an ISHM system. We conclude with recommendations that describe the tradeoffs in building an integrated scalable platform for robust anomaly detection in ISHM applications.

  15. Congenital anomaly statistics

    • data.europa.eu
    • data.wu.ac.at
    html
    Updated Oct 11, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2021). Congenital anomaly statistics [Dataset]. https://data.europa.eu/data/datasets/congenital_anomaly_statistics?locale=en
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Oct 11, 2021
    Dataset authored and provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    http://reference.data.gov.uk/id/open-government-licencehttp://reference.data.gov.uk/id/open-government-licence

    Description

    Birth notification details of babies born with anomalies. Source: National Congenital Anomaly System (NCAS) Publisher: Office for National Statistics (ONS) Geographies: Government Office Region (GOR), National, Strategic Health Authority (SHA) Geographic coverage: England and Wales Time coverage: 1999 to 2006 Type of data: Administrative data

  16. Annual precipitation anomaly in the United States 1900-2024

    • statista.com
    • ai-chatbox.pro
    Updated Feb 2, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Annual precipitation anomaly in the United States 1900-2024 [Dataset]. https://www.statista.com/statistics/1293607/precipitation-anomaly-in-the-us/
    Explore at:
    Dataset updated
    Feb 2, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    United States
    Description

    In 2024, precipitation in the United States stood 1.66 inches above the annual average recorded across the previous century (1901 to 2020). Except for 2022 and 2023, the past 10 years have all seen annual precipitation above the average, with the highest anomaly of the displayed period recorded in 2019, at nearly five inches of rainfall.

  17. Comparison of Unsupervised Anomaly Detection Methods

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1more
    Updated Mar 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.nasa.gov (2025). Comparison of Unsupervised Anomaly Detection Methods [Dataset]. https://data.nasa.gov/dataset/comparison-of-unsupervised-anomaly-detection-methods
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    Several different unsupervised anomaly detection algorithms have been applied to Space Shuttle Main Engine (SSME) data to serve the purpose of developing a comprehensive suite of Integrated Systems Health Management (ISHM) tools. As the theoretical bases for these methods vary considerably, it is reasonable to conjecture that the resulting anomalies detected by them may differ quite significantly as well. As such, it would be useful to apply a common metric with which to compare the results. However, for such a quantitative analysis to be statistically significant, a sufficient number of examples of both nominally categorized and anomalous data must be available. Due to the lack of sufficient examples of anomalous data, use of any statistics that rely upon a statistically significant sample of anomalous data is infeasible. Therefore, the main focus of this paper will be to compare actual examples of anomalies detected by the algorithms via the sensors in which they appear, as well the times at which they appear. We find that there is enough overlap in detection of the anomalies among all of the different algorithms tested in order for them to corroborate the severity of these anomalies. In certain cases, the severity of these anomalies is supported by their categorization as failures by experts, with realistic physical explanations. For those anomalies that can not be corroborated by at least one other method, this overlap says less about the severity of the anomaly, and more about their technical nuances, which will also be discussed.

  18. Anomaly detection from sound data- Fan

    • kaggle.com
    Updated Sep 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Vuppala Adithya Sairam (2023). Anomaly detection from sound data- Fan [Dataset]. https://www.kaggle.com/datasets/vuppalaadithyasairam/anomaly-detection-from-sound-data-fan
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 22, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Vuppala Adithya Sairam
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The dataset is a subset of the Task-2 of DCASE 2020 Challenge. The Challenge is to identify anomaly of a machine using the audio data. There are three different parts of the dataset, namely, training, validation and testing which have been combined into a single dataset.

    Training- https://zenodo.org/record/3678171

    Validation- https://zenodo.org/record/3727685

    Testing- https://zenodo.org/record/3841772

  19. Anomaly Detection Service Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Dec 3, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataintelo (2024). Anomaly Detection Service Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/anomaly-detection-service-market
    Explore at:
    csv, pdf, pptxAvailable download formats
    Dataset updated
    Dec 3, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Anomaly Detection Service Market Outlook




    The anomaly detection service market size is poised for substantial growth, with its valuation estimated at USD 4.5 billion in 2023 and projected to reach USD 12.8 billion by 2032, reflecting a robust CAGR of 12.4% during the forecast period. The exponential growth trajectory of this market is underpinned by several critical factors, including the increasing reliance on data-driven decision-making across industries, the rising sophistication of cyber threats, and the need for real-time monitoring and analysis. The growing integration of advanced technologies such as artificial intelligence and machine learning in anomaly detection solutions is further catalyzing market expansion by enhancing accuracy and reducing false positives.




    One of the primary growth drivers of the anomaly detection service market is the escalating volume of data generated across diverse sectors. With the proliferation of IoT devices, mobile applications, and digital platforms, industries are inundated with massive datasets that require real-time analysis to derive actionable insights. Anomaly detection services provide the capability to sift through vast amounts of data to identify irregular patterns and potential threats, enabling organizations to act swiftly and mitigate risks. Additionally, the increasing focus on enhanced customer experiences and operational efficiency is propelling businesses to invest in robust anomaly detection solutions that ensure seamless operations and prevent disruptions.




    The mounting frequency and complexity of cyberattacks have significantly contributed to the demand for advanced anomaly detection services. As cybercriminals employ more sophisticated methods to breach security systems, traditional security measures are often inadequate. Anomaly detection services, leveraging machine learning and artificial intelligence, can detect unusual patterns and deviations from normal behavior, thus providing an additional layer of security against cyber threats. Furthermore, regulatory requirements mandating data protection and privacy have compelled organizations to adopt anomaly detection solutions to comply with standards and safeguard sensitive information, driving further market growth.




    Technological advancements and innovations in the field of artificial intelligence and big data analytics are playing a pivotal role in shaping the anomaly detection service market. These technologies enable the development of more refined and accurate detection models that can process and analyze data in real time. The integration of AI and ML algorithms not only increases the precision of anomaly detection but also helps in predicting future anomalies, thereby allowing organizations to take pre-emptive measures. The ability to customize and scale solutions according to specific organizational needs is another factor that is attracting enterprises towards investing in anomaly detection services.




    The regional outlook for the anomaly detection service market is characterized by significant variations in growth rates and adoption patterns across different geographies. North America remains a dominant region due to the early adoption of cutting-edge technologies, a strong emphasis on cybersecurity, and substantial investments in IT infrastructure. Europe is also witnessing steady growth, driven by stringent regulatory norms and the increasing focus on safeguarding digital assets. Meanwhile, the Asia Pacific region is anticipated to exhibit the highest CAGR over the forecast period, fueled by rapid digital transformation, expanding IT and telecommunications sectors, and increasing awareness about the importance of cybersecurity in emerging economies.



    Component Analysis




    In the anomaly detection service market, the component segmentation into software and services encapsulates a dynamic aspect of market growth. The software segment is witnessing a significant surge in demand as organizations increasingly seek sophisticated tools capable of real-time anomaly detection. These software solutions, often powered by AI and ML algorithms, facilitate the seamless integration of data from various sources, enhancing overall system efficiency. The burgeoning need for customizable and scalable solutions that can be tailored to specific industry requirements positions the software segment as a pivotal growth driver in the anomaly detection landscape.




    On the other hand, the services segment is equally pivotal,

  20. P

    NAB Dataset

    • paperswithcode.com
    Updated Nov 9, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexander Lavin; Subutai Ahmad (2020). NAB Dataset [Dataset]. https://paperswithcode.com/dataset/nab
    Explore at:
    Dataset updated
    Nov 9, 2020
    Authors
    Alexander Lavin; Subutai Ahmad
    Description

    The First Temporal Benchmark Designed to Evaluate Real-time Anomaly Detectors Benchmark

    The growth of the Internet of Things has created an abundance of streaming data. Finding anomalies in this data can provide valuable insights into opportunities or failures. Yet it’s difficult to achieve, due to the need to process data in real time, continuously learn and make predictions. How do we evaluate and compare various real-time anomaly detection techniques?

    The Numenta Anomaly Benchmark (NAB) provides a standard, open source framework for evaluating real-time anomaly detection algorithms on streaming data. Through a controlled, repeatable environment of open-source tools, NAB rewards detectors that find anomalies as soon as possible, trigger no false alarms, and automatically adapt to any changing statistics.

    NAB comprises two main components: a scoring system designed for streaming data and a dataset with labeled, real-world time-series data.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Dashlink (2025). Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data [Dataset]. https://catalog.data.gov/dataset/distributed-anomaly-detection-using-1-class-svm-for-vertically-partitioned-data

Data from: Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data

Related Article
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description

There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amount of flight operational data is downloaded for different commercial airlines. These different types of datasets need to be analyzed for finding outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations with only a subset of features available at any location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose only centralizes a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).

Search
Clear search
Close search
Google apps
Main menu