100+ datasets found

Data from: Datasets for outlier detection
researchdata.edu.au
research-repository.rmit.edu.au
Updated Mar 27, 2019
Cite
Sevvandi Kandanaarachchi; Mario Munoz Acosta; Kate Smith-Miles; Rob Hyndman (2019). Datasets for outlier detection [Dataset]. http://doi.org/10.26180/5c6253c0b3323
Explore at:
Unique identifier
https://doi.org/10.26180/5c6253c0b3323
Dataset updated
Mar 27, 2019
Dataset provided by
Monash University
Authors
Sevvandi Kandanaarachchi; Mario Munoz Acosta; Kate Smith-Miles; Rob Hyndman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The zip files contain 12338 datasets for outlier detection investigated in the following papers:

(1) Instance space analysis for unsupervised outlier detection
Authors : Sevvandi Kandanaarachchi, Mario A. Munoz, Kate Smith-Miles

(2) On normalization and algorithm selection for unsupervised outlier detection
Authors : Sevvandi Kandanaarachchi, Mario A. Munoz, Rob J. Hyndman, Kate Smith-Miles

Some of these datasets were originally discussed in the paper:

On the evaluation of unsupervised outlier detection: measures, datasets and an empirical study
Authors : G. O. Campos, A. Zimek, J. Sander, R. J.G.B. Campello, B. Micenkova, E. Schubert, I. Assent, M.E. Houle.
Algorithms for Speeding up Distance-Based Outlier Detection
catalog.data.gov
cloud.csiss.gmu.edu
Updated Apr 10, 2025
Cite
Dashlink (2025). Algorithms for Speeding up Distance-Based Outlier Detection [Dataset]. https://catalog.data.gov/dataset/algorithms-for-speeding-up-distance-based-outlier-detection
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
The problem of distance-based outlier detection is difficult to solve efficiently in very large datasets because of potential quadratic time complexity. We address this problem and develop sequential and distributed algorithms that are significantly more efficient than state-of-the-art methods while still guaranteeing the same outliers. By combining simple but effective indexing and disk block accessing techniques, we have developed a sequential algorithm iOrca that is up to an order-of-magnitude faster than the state-of-the-art. The indexing scheme is based on sorting the data points in order of increasing distance from a fixed reference point and then accessing those points based on this sorted order. To speed up the basic outlier detection technique, we develop two distributed algorithms (DOoR and iDOoR) for modern distributed multi-core clusters of machines, connected on a ring topology. The first algorithm passes data blocks from each machine around the ring, incrementally updating the nearest neighbors of the points passed. By maintaining a cutoff threshold, it is able to prune a large number of points in a distributed fashion. The second distributed algorithm extends this basic idea with the indexing scheme discussed earlier. In our experiments, both distributed algorithms exhibit significant improvements compared to the state-of-the-art distributed methods.
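As a rough illustration of the cutoff-threshold pruning mentioned in this description, here is a minimal Python sketch of top-n distance-based outlier detection that scores each point by the distance to its k-th nearest neighbor. It is a plain nested-loop version under our own assumptions, not iOrca, DOoR, or iDOoR: the reference-point indexing, disk block access, and ring distribution are all omitted, and the function name `top_n_outliers` and its parameters are hypothetical.

```python
import heapq
import math


def top_n_outliers(points, k, n):
    """Return the n points with the largest k-th nearest-neighbor distance.

    Hypothetical helper for illustration; assumes 1 <= k < len(points).
    """
    cutoff = 0.0          # score of the weakest outlier currently kept
    top = []              # min-heap of (score, index), at most n entries
    for i, p in enumerate(points):
        nearest = []      # negated max-heap of the k smallest distances so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # The running k-th neighbor distance can only shrink, so once it
            # falls below the cutoff this point cannot enter the top n.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned:
            continue
        heapq.heappush(top, (-nearest[0], i))
        if len(top) > n:
            heapq.heappop(top)
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


# Made-up toy data: four clustered points and one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
result = top_n_outliers(points, k=1, n=1)
```

The pruning never changes which points are returned; it only skips distance computations for points that can no longer qualify, which is the property the description refers to as "guaranteeing the same outliers".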
MNIST dataset for Outliers Detection - [ MNIST4OD ]
figshare.com
application/gzip
Updated May 17, 2024
Cite
Giovanni Stilo; Bardh Prenkaj (2024). MNIST dataset for Outliers Detection - [ MNIST4OD ] [Dataset]. http://doi.org/10.6084/m9.figshare.9954986.v2
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.9954986.v2
Dataset updated
May 17, 2024
Dataset provided by
Figshare http://figshare.com/
Authors
Giovanni Stilo; Bardh Prenkaj
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here we present a dataset, MNIST4OD, of large size (number of dimensions and number of instances) suitable for the outlier detection task. The dataset is based on the famous MNIST dataset (http://yann.lecun.com/exdb/mnist/).

We build MNIST4OD in the following way: To distinguish between outliers and inliers, we choose the images belonging to a digit as inliers (e.g. digit 1) and we sample with uniform probability on the remaining images as outliers such that their number is equal to 10% of that of inliers. We repeat this dataset generation process for all digits. For implementation simplicity we then flatten the images (28 x 28) into vectors.

Each file MNIST_x.csv.gz contains the corresponding dataset where the inlier class is equal to x. The data contains one instance (vector) in each line where the last column represents the outlier label (yes/no) of the data point. The data also contains a column which indicates the original image class (0-9).

See the following numbers for a complete list of the statistics of each dataset:

Name | Instances | Dimensions | Number of Outliers in %
MNIST_0 | 7594 | 784 | 10
MNIST_1 | 8665 | 784 | 10
MNIST_2 | 7689 | 784 | 10
MNIST_3 | 7856 | 784 | 10
MNIST_4 | 7507 | 784 | 10
MNIST_5 | 6945 | 784 | 10
MNIST_6 | 7564 | 784 | 10
MNIST_7 | 8023 | 784 | 10
MNIST_8 | 7508 | 784 | 10
MNIST_9 | 7654 | 784 | 10
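The construction described above can be sketched in a few lines of Python. This is only an illustration on made-up toy vectors, not the authors' generation script: the function name `build_one_vs_rest`, the seed, the column order of the class and label columns, and the rounding of the 10% count are all assumptions.

```python
import random


def build_one_vs_rest(samples, inlier_digit, outlier_fraction=0.10, seed=0):
    """Build one outlier-detection dataset from (vector, digit) pairs.

    Hypothetical helper: all samples of `inlier_digit` become inliers, and
    outliers are drawn uniformly from the remaining samples so that their
    count is `outlier_fraction` times the number of inliers.
    """
    rng = random.Random(seed)
    inliers = [(x, y) for x, y in samples if y == inlier_digit]
    rest = [(x, y) for x, y in samples if y != inlier_digit]
    n_outliers = round(outlier_fraction * len(inliers))
    outliers = rng.sample(rest, n_outliers)
    # Each row: flattened vector, original class, then the outlier label last.
    rows = [list(x) + [y, "no"] for x, y in inliers]
    rows += [list(x) + [y, "yes"] for x, y in outliers]
    rng.shuffle(rows)
    return rows


# Toy stand-in for MNIST: 20 two-element "images" per digit.
samples = [([d, i], d) for d in range(10) for i in range(20)]
rows = build_one_vs_rest(samples, inlier_digit=1)
```

With 20 inliers in the toy data, the sketch draws 2 outliers, so each generated dataset has 22 rows.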
Outlier Set Two-step Method (OSTI)
orda.shef.ac.uk
application/x-rar
Updated Jul 1, 2025
Cite
Amal Sarfraz; Abigail Birnbaum; Flannery Dolan; Jonathan Lamontagne; Lyudmila Mihaylova; Charles Rouge (2025). Outlier Set Two-step Method (OSTI) [Dataset]. http://doi.org/10.15131/shef.data.28227974.v3
Explore at:
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.15131/shef.data.28227974.v3
Dataset updated
Jul 1, 2025
Dataset provided by
The University of Sheffield
Authors
Amal Sarfraz; Abigail Birnbaum; Flannery Dolan; Jonathan Lamontagne; Lyudmila Mihaylova; Charles Rouge
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These files are supplements to the paper titled 'A Robust Two-step Method for Detection of Outlier Sets'. This paper identifies and addresses the need for a robust method that identifies sets of points that collectively deviate from typical patterns in a dataset, which it calls "outlier sets", while excluding individual points from detection. This new methodology, Outlier Set Two-step Identification (OSTI), employs a two-step approach to detect and label these outlier sets. First, it uses Gaussian Mixture Models for probabilistic clustering, identifying candidate outlier sets based on cluster weights below a predetermined threshold. Second, OSTI measures the inter-cluster Mahalanobis distance between each candidate outlier set's centroid and the overall dataset mean. OSTI then tests the null hypothesis that this distance does not significantly differ from its theoretical chi-square distribution, enabling the formal detection of outlier sets. We test OSTI systematically on 8,000 synthetic 2D datasets across various inlier configurations and thousands of possible outlier set characteristics. Results show OSTI robustly and consistently detects outlier sets with an average F1 score of 0.92 and an average purity (the degree to which outlier sets identified correspond to those generated synthetically, i.e., our ground truth) of 98.58%. We also compare OSTI with state-of-the-art outlier detection methods, to illuminate how OSTI fills a gap as a tool for the exclusive detection of outlier sets.
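To make the second step more concrete, the following Python sketch computes a squared Mahalanobis distance for 2D data and the matching chi-square tail probability. It is not the authors' implementation: the Gaussian Mixture Model step is not shown, the use of the whole dataset's sample covariance is an assumption, and the function names are hypothetical. The only fact relied on is that a chi-square variable with 2 degrees of freedom has survival function exp(-x/2).

```python
import math


def mahalanobis_sq_2d(point, data):
    """Squared Mahalanobis distance from `point` to the mean of 2D `data`.

    Assumes the sample covariance of `data` is non-singular.
    """
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^T S^{-1} d, written out with the closed-form inverse of a 2x2 matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def chi2_sf_2dof(x):
    """Tail probability P(X > x) for a chi-square variable with 2 dof."""
    return math.exp(-x / 2)


# Made-up toy data; (3, 1) stands in for a candidate centroid.
data = [(0, 0), (2, 0), (0, 2), (2, 2)]
d2 = mahalanobis_sq_2d((3, 1), data)
p_value = chi2_sf_2dof(d2)
```

A small tail probability would suggest the candidate centroid lies unusually far from the bulk of the data; how OSTI sets its significance level is not stated in the description.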
Data from: An Effective Algorithm of Outlier Correction in Space-time Radar...
ieee-dataport.org
Updated Feb 13, 2024
Cite
Yongchan Kim (2024). An Effective Algorithm of Outlier Correction in Space-time Radar Rainfall Data Based on the Iterative Localized Analysis [Dataset]. https://ieee-dataport.org/documents/effective-algorithm-outlier-correction-space-time-radar-rainfall-data-based-iterative
Explore at:
Dataset updated
Feb 13, 2024
Authors
Yongchan Kim
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ensuring accurate representations in spatial and temporal data analyses.
Data from: A Diagnostic Procedure for Detecting Outliers in Linear...
tandf.figshare.com
figshare.com
txt
Updated Feb 9, 2024
Cite
Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow (2024). A Diagnostic Procedure for Detecting Outliers in Linear State–Space Models [Dataset]. http://doi.org/10.6084/m9.figshare.12162075.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12162075.v1
Dataset updated
Feb 9, 2024
Dataset provided by
Taylor & Francis
Authors
Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Outliers can be more problematic in longitudinal data than in independent observations due to the correlated nature of such data. It is common practice to discard outliers as they are typically regarded as a nuisance or an aberration in the data. However, outliers can also convey meaningful information concerning potential model misspecification, and ways to modify and improve the model. Moreover, outliers that occur among the latent variables (innovative outliers) have distinct characteristics compared to those impacting the observed variables (additive outliers), and are best evaluated with different test statistics and detection procedures. We demonstrate and evaluate the performance of an outlier detection approach for multi-subject state-space models in a Monte Carlo simulation study, with corresponding adaptations to improve power and reduce false detection rates. Furthermore, we demonstrate the empirical utility of the proposed approach using data from an ecological momentary assessment study of emotion regulation together with an open-source software implementation of the procedures.
mnist-outlier
huggingface.co
Updated Jun 16, 2023
Cite
Renumics (2023). mnist-outlier [Dataset]. https://huggingface.co/datasets/renumics/mnist-outlier
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jun 16, 2023
Dataset authored and provided by
Renumics
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for "mnist-outlier"

📚 This dataset is an enriched version of the MNIST Dataset. The workflow is described in the Medium article: Changes of Embeddings during Fine-Tuning of Transformers.

Explore the Dataset

The open source data curation tool Renumics Spotlight allows you to explore this dataset. You can find a Hugging Face Space running Spotlight with this dataset here: https://huggingface.co/spaces/renumics/mnist-outlier.

Or you can explore it locally: … See the full description on the dataset page: https://huggingface.co/datasets/renumics/mnist-outlier.
Data from: Privacy Preserving Outlier Detection through Random Nonlinear...
catalog.data.gov
data.amerigeoss.org
Updated Apr 10, 2025
Cite
Dashlink (2025). Privacy Preserving Outlier Detection through Random Nonlinear Data Distortion [Dataset]. https://catalog.data.gov/dataset/privacy-preserving-outlier-detection-through-random-nonlinear-data-distortion
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
Consider a scenario in which the data owner has some private/sensitive data and wants a data miner to access it for studying important patterns without revealing the sensitive information. Privacy preserving data mining aims to solve this problem by randomly transforming the data prior to its release to data miners. Previous work only considered the case of linear data perturbations — additive, multiplicative or a combination of both for studying the usefulness of the perturbed output. In this paper, we discuss nonlinear data distortion using potentially nonlinear random data transformation and show how it can be useful for privacy preserving anomaly detection from sensitive datasets. We develop bounds on the expected accuracy of the nonlinear distortion and also quantify privacy by using standard definitions. The highlight of this approach is to allow a user to control the amount of privacy by varying the degree of nonlinearity. We show how our general transformation can be used for anomaly detection in practice for two specific problem instances: a linear model and a popular nonlinear model using the sigmoid function. We also analyze the proposed nonlinear transformation in full generality and then show that for specific cases it is distance preserving. A main contribution of this paper is the discussion between the invertibility of a transformation and privacy preservation and the application of these techniques to outlier detection. Experiments conducted on real-life datasets demonstrate the effectiveness of the approach.
cifar10-outlier
huggingface.co
Updated Jul 3, 2023
Cite
Renumics (2023). cifar10-outlier [Dataset]. https://huggingface.co/datasets/renumics/cifar10-outlier
Explore at:
Croissant
Dataset updated
Jul 3, 2023
Dataset authored and provided by
Renumics
License
https://choosealicense.com/licenses/unknown/
Description
Dataset Card for "cifar10-outlier"

📚 This dataset is an enriched version of the CIFAR-10 Dataset. The workflow is described in the Medium article: Changes of Embeddings during Fine-Tuning of Transformers.

Explore the Dataset

The open source data curation tool Renumics Spotlight allows you to explore this dataset. You can find a Hugging Face Space running Spotlight with this dataset here:

Full Version (High hardware requirement) … See the full description on the dataset page: https://huggingface.co/datasets/renumics/cifar10-outlier.
Data from: Error and anomaly detection for intra-participant time-series...
tandf.figshare.com
xlsx
Updated Jun 1, 2023
Cite
David R. Mullineaux; Gareth Irwin (2023). Error and anomaly detection for intra-participant time-series data [Dataset]. http://doi.org/10.6084/m9.figshare.5189002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.5189002
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
David R. Mullineaux; Gareth Irwin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identification of errors or anomalous values, collectively considered outliers, assists in exploring data or through removing outliers improves statistical analysis. In biomechanics, outlier detection methods have explored the ‘shape’ of the entire cycles, although exploring fewer points using a ‘moving-window’ may be advantageous. Hence, the aim was to develop a moving-window method for detecting trials with outliers in intra-participant time-series data. Outliers were detected through two stages for the strides (mean 38 cycles) from treadmill running. Cycles were removed in stage 1 for one-dimensional (spatial) outliers at each time point using the median absolute deviation, and in stage 2 for two-dimensional (spatial–temporal) outliers using a moving window standard deviation. Significance levels of the t-statistic were used for scaling. Fewer cycles were removed with smaller scaling and smaller window size, requiring more stringent scaling at stage 1 (mean 3.5 cycles removed for 0.0001 scaling) than at stage 2 (mean 2.6 cycles removed for 0.01 scaling with a window size of 1). Settings in the supplied Matlab code should be customised to each data set, and outliers assessed to justify whether to retain or remove those cycles. The method is effective in identifying trials with outliers in intra-participant time series data.
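For readers unfamiliar with the median absolute deviation (MAD), the Python sketch below shows the general idea behind a stage 1 style check: at each time point, a cycle is flagged when its value lies many MADs from the median across cycles. It is not the supplied Matlab code. The fixed `threshold` multiplier is an assumption standing in for the t-statistic based scaling the description mentions, the stage 2 moving-window step is not shown, and the function name and toy values are hypothetical.

```python
import statistics


def flag_cycles_mad(cycles, threshold=3.0):
    """Return indices of cycles with a MAD-based outlier at any time point.

    `cycles` is a list of equal-length sequences, one per cycle.
    """
    flagged = set()
    for t in range(len(cycles[0])):
        values = [cycle[t] for cycle in cycles]
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        if mad == 0:
            # No spread at this time point, so nothing can be scaled.
            continue
        for idx, v in enumerate(values):
            if abs(v - med) / mad > threshold:
                flagged.add(idx)
    return flagged


# Made-up toy data: five cycles of three time points; cycle 3 spikes at the end.
cycles = [
    [10.0, 20.0, 30.0],
    [11.0, 21.0, 31.0],
    [9.0, 19.0, 29.0],
    [10.0, 20.0, 90.0],
    [10.0, 20.0, 30.0],
]
flagged = flag_cycles_mad(cycles)
```

As the description advises, any threshold should be tuned to the data set, and flagged cycles should be inspected before they are removed.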
Outlier Detection and Removal Dataset
kaggle.com
Updated Jul 9, 2025
Cite
Aamir Shahzad (2025). Outlier Detection and Removal Dataset [Dataset]. https://www.kaggle.com/datasets/aamir5659/outlier-detection-and-removal-dataset
Explore at:
Croissant
Dataset updated
Jul 9, 2025
Dataset provided by
Kaggle http://kaggle.com/
Authors
Aamir Shahzad
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
📁 Files Included:
Outlier_Loan_datase.csv – Raw dataset with outliers
Final_Outliers_clean_dataset.csv – Cleaned dataset (IQR + Z-score)

This dataset is designed for practicing outlier detection and data cleaning techniques.
It includes both the original (uncleaned) and cleaned versions of a financial dataset.
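The two cleaning rules named in the file list, IQR and Z-score, can be sketched as follows. This is a generic illustration, not the procedure used to produce the cleaned file: the multiplier 1.5, the threshold 3.0, the function names, and the toy values are assumptions, and the dataset's actual columns are not used.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Values whose absolute z-score exceeds `threshold`."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]


# Made-up toy values with one obvious extreme.
amounts = [10, 12, 11, 13, 12, 11, 100]
by_iqr = iqr_outliers(amounts)
by_z = zscore_outliers(amounts)
by_z_loose = zscore_outliers(amounts, threshold=2.0)
```

On this toy sample the IQR rule flags 100, while the z-score rule at threshold 3.0 flags nothing, because the extreme value inflates the standard deviation it is measured against. That difference is one reason the two rules are often used together.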
Outlier Detection on Sensor Data - Dataset - LDM
service.tib.eu
Updated Dec 2, 2024
Cite
(2024). Outlier Detection on Sensor Data - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/outlier-detection-on-sensor-data
Explore at:
Dataset updated
Dec 2, 2024
Description
The dataset used for outlier detection on sensor data from temperature and humidity sensors deployed in sensorized farms and manufacturing units on Purdue University's campus.
Outlier Detection and Prevention
kaggle.com
Updated Jul 9, 2021
Cite
Omsingh Bais (2021). Outlier Detection and Prevention [Dataset]. https://www.kaggle.com/datasets/ombais/outlier-detection-and-prevention
Explore at:
Croissant
Dataset updated
Jul 9, 2021
Dataset provided by
Kaggle http://kaggle.com/
Authors
Omsingh Bais
Description
Dataset

This dataset was created by Omsingh Bais

Contents
Data from: Detection of outlier loci and their utility for fisheries...
open.library.ubc.ca
borealisdata.ca
Updated May 19, 2021
Cite
Russello, Michael A; Kirk, Stephanie L; Frazer, Karen K; Askey, Paul J (2021). Data from: Detection of outlier loci and their utility for fisheries management [Dataset]. http://doi.org/10.14288/1.0397632
Explore at:
Unique identifier
https://doi.org/10.14288/1.0397632
Dataset updated
May 19, 2021
Authors
Russello, Michael A; Kirk, Stephanie L; Frazer, Karen K; Askey, Paul J
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
Jun 24, 2020
Area covered
British Columbia
Description
Usage notes
Okanagan_Lake_kokanee_microsatellite_data
Length, in base-pairs, of alleles at up to 52 EST-linked and non-EST-linked microsatellite loci in 164 individual kokanee (Oncorhynchus nerka) sampled at seven spawning sites across Okanagan Lake, British Columbia over two sampling years (2007 and 2010). File in GenAlEx format with missing data coded as 0. Data collected with funds from NSERC, Habitat Conservation Trust Fund and Northwest Scientific Association.
Data from: Outlier detection in cylindrical data based on Mahalanobis...
tandf.figshare.com
text/x-tex
Updated Jan 2, 2025
Cite
Prashant S. Dhamale; Akanksha S. Kashikar (2025). Outlier detection in cylindrical data based on Mahalanobis distance [Dataset]. http://doi.org/10.6084/m9.figshare.24092089.v1
Explore at:
Available download formats: text/x-tex
Unique identifier
https://doi.org/10.6084/m9.figshare.24092089.v1
Dataset updated
Jan 2, 2025
Dataset provided by
Taylor & Francis
Authors
Prashant S. Dhamale; Akanksha S. Kashikar
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cylindrical data are bivariate data formed from the combination of circular and linear variables. Identifying outliers is a crucial step in any data analysis work. This paper proposes a new distribution-free procedure to detect outliers in cylindrical data using the Mahalanobis distance concept. The use of Mahalanobis distance incorporates the correlation between the components of the cylindrical distribution, which had not been accounted for in the earlier papers on outlier detection in cylindrical data. The threshold for declaring an observation to be an outlier can be obtained via parametric or non-parametric bootstrap, depending on whether the underlying distribution is known or unknown. The performance of the proposed method is examined via extensive simulations from the Johnson-Wehrly distribution. The proposed method is applied to two real datasets, and the outliers are identified in those datasets.
Fifth Generation Wireless Channels Outlier Detection and Clustering
ieee-dataport.org
Updated May 27, 2024
Cite
Jojo Blanza (2024). Fifth Generation Wireless Channels Outlier Detection and Clustering [Dataset]. https://ieee-dataport.org/documents/fifth-generation-wireless-channels-outlier-detection-and-clustering
Explore at:
Dataset updated
May 27, 2024
Authors
Jojo Blanza
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
lower latency
Find Outliers Minnesota Hospitals
umn.hub.arcgis.com
Updated May 6, 2020
Cite
University of Minnesota (2020). Find Outliers Minnesota Hospitals [Dataset]. https://umn.hub.arcgis.com/maps/UMN::find-outliers-minnesota-hospitals
Explore at:
Dataset updated
May 6, 2020
Dataset authored and provided by
University of Minnesota
Description
The following report outlines the workflow used to optimize your Find Outliers result:

Initial Data Assessment
There were 137 valid input features.
There were 4 outlier locations; these will not be used to compute the polygon cell size.

Incident Aggregation
The polygon cell size was 49251.0000 Meters.
The aggregation process resulted in 72 weighted areas.
Incident Count Properties: Min 1.0000, Max 21.0000, Mean 1.9028, Std. Dev. 2.4561

Scale of Analysis
The optimal fixed distance band selected was based on peak clustering found at 94199.9365 Meters.

Outlier Analysis
Creating the random reference distribution with 499 permutations.
There are 3 output features statistically significant based on a FDR correction for multiple testing and spatial dependence.
There are 2 statistically significant high outlier features.
There are 0 statistically significant low outlier features.
There are 0 features part of statistically significant low clusters.
There are 1 features part of statistically significant high clusters.

Output
Pink output features are part of a cluster of high values.
Light Blue output features are part of a cluster of low values.
Red output features represent high outliers within a cluster of low values.
Blue output features represent low outliers within a cluster of high values.
Vision Based Building Energy Data Outlier Detection Dataset
universe.roboflow.com
zip
Updated Apr 3, 2024
Cite
energy data outlier detection (2024). Vision Based Building Energy Data Outlier Detection Dataset [Dataset]. https://universe.roboflow.com/energy-data-outlier-detection/vision-based-building-energy-data-outlier-detection
Explore at:
Available download formats: zip
Dataset updated
Apr 3, 2024
Dataset authored and provided by
energy data outlier detection
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
11785 Bounding Boxes
Description
Vision Based Building Energy Data Outlier Detection

Overview
Vision Based Building Energy Data Outlier Detection is a dataset for object detection tasks. It contains 11785 annotations for 2,159 images.

Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License
This dataset is available under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).
Chemical outlier dataset
zenodo.org
bin
Updated Jan 24, 2020
Cite
Mario Lovric; Mario Lovric (2020). Chemical outlier dataset [Dataset]. http://doi.org/10.5281/zenodo.1167835
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.1167835
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Mario Lovric; Mario Lovric
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The objects are numbered. The Y-variable are boiling points. Other features are structural features of molecules. In the outlier column the outliers are assigned with a value of 1.

The data is derived from a published chemical dataset on boiling point measurements [1] and from public data [2]. Features were generated by means of the RDKit Python library [3]. The dataset was infused with known outliers (~5%) based on significant structural differences, i.e. polar and non-polar molecules.

[1] Cherqaoui D., Villemin D. Use of a Neural Network to determine the Boiling Point of Alkanes. J CHEM SOC FARADAY TRANS. 1994;90(1):97–102.

[2] https://pubchem.ncbi.nlm.nih.gov/

[3] RDKit: Open-source cheminformatics; http://www.rdkit.org
Outlier Shapes
ieee-dataport.org
Updated Jan 18, 2021
Cite
Kishor datta Gupta (2021). Outlier Shapes [Dataset]. https://ieee-dataport.org/documents/outlier-shapes
Explore at:
Dataset updated
Jan 18, 2021
Authors
Kishor datta Gupta
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data for outlier test
