3 datasets found

Classification results for: A Study on Classification in Imbalanced and...
figshare.com
application/gzip
Updated May 30, 2023
Cite
Robert Lyon (2023). Classification results for: A Study on Classification in Imbalanced and Partially-Labelled Data Streams [Dataset]. http://doi.org/10.6084/m9.figshare.1534548.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.1534548.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Robert Lyon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data sets supporting the results reported in the paper: A Study on Classification in Imbalanced and Partially-Labelled Data Streams, R. J. Lyon, J. M. Brooke, J. D. Knowles, B. W. Stappers, Systems, Man, and Cybernetics (SMC), 2013. DOI: 10.1109/SMC.2013.260

Contained in this distribution are results of stream and static classifier performance on four different data sets. These include:

- MAGIC Gamma Telescope Data Set: https://archive.ics.uci.edu/ml/datasets/MAGIC+Gamma+Telescope
- MiniBooNE particle identification Data Set: https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification
- Skin Segmentation Data Set: https://archive.ics.uci.edu/ml/datasets/Skin+Segmentation

The fourth data set is not publicly available at present. However, we are in the process of releasing it for public use. Please get in touch if you'd like to use it.
Classification results for: Hellinger Distance Trees for Imbalanced Streams
figshare.com
application/gzip
Updated May 31, 2023
Cite
Robert Lyon (2023). Classification results for: Hellinger Distance Trees for Imbalanced Streams [Dataset]. http://doi.org/10.6084/m9.figshare.1534549.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.1534549.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Robert Lyon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data sets supporting the results reported in the paper: Hellinger Distance Trees for Imbalanced Streams, R. J. Lyon, J. M. Brooke, J. D. Knowles, B. W. Stappers, 22nd International Conference on Pattern Recognition (ICPR), pp. 1969-1974, 2014. DOI: 10.1109/ICPR.2014.344

Contained in this distribution are results of stream classifier performance on four different data sets. Also included are the test results from our attempt at reproducing the outcome of the paper: Learning Decision Trees for Un-balanced Data, D. A. Cieslak and N. V. Chawla, in Machine Learning and Knowledge Discovery in Databases (W. Daelemans, B. Goethals, and K. Morik, eds.), vol. 5211 of LNCS, pp. 241-256, 2008.

The data sets used for these experiments include:

- MAGIC Gamma Telescope Data Set: https://archive.ics.uci.edu/ml/datasets/MAGIC+Gamma+Telescope
- MiniBooNE particle identification Data Set: https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification
- Skin Segmentation Data Set: https://archive.ics.uci.edu/ml/datasets/Skin+Segmentation
- Letter Recognition Data Set: https://archive.ics.uci.edu/ml/datasets/Letter+Recognition
- Pen-Based Recognition of Handwritten Digits Data Set: https://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits
- Statlog (Landsat Satellite) Data Set: https://archive.ics.uci.edu/ml/datasets/Statlog+(Landsat+Satellite)
- Statlog (Image Segmentation) Data Set: https://archive.ics.uci.edu/ml/datasets/Statlog+(Image+Segmentation)

A further data set used is not publicly available at present. However, we are in the process of releasing it for public use. Please get in touch if you'd like to use it.

A readme file accompanies the data describing it in more detail.
Pulsar Dataset HTRU2
kaggle.com
Updated Jan 10, 2021
Cite
Charitarth Chugh (2021). Pulsar Dataset HTRU2 [Dataset]. https://www.kaggle.com/charitarth/pulsar-dataset-htru2/tasks
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jan 10, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Charitarth Chugh
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Description:

Pulsars are a rare type of neutron star that produce radio emission detectable here on Earth. They are of considerable scientific interest as probes of space-time, the interstellar medium, and states of matter. Machine learning tools are now being used to automatically label pulsar candidates to facilitate rapid analysis. In particular, classification systems are widely adopted, which treat the candidate data sets as binary classification problems.

Attribute information:

Each candidate is described by 8 continuous variables and a single class variable. The first four are simple statistics obtained from the integrated pulse profile (folded profile). This is an array of continuous variables that describe a longitude-resolved version of the signal that has been averaged in both time and frequency. The remaining four variables are similarly obtained from the DM-SNR curve. These are summarised below:

1. Mean of the integrated profile.
2. Standard deviation of the integrated profile.
3. Excess kurtosis of the integrated profile.
4. Skewness of the integrated profile.
5. Mean of the DM-SNR curve.
6. Standard deviation of the DM-SNR curve.
7. Excess kurtosis of the DM-SNR curve.
8. Skewness of the DM-SNR curve.
9. Class

Descriptions courtesy of Ustav Murarka:

- Integrated Pulse Profile: Each pulsar produces a unique pattern of pulse emission known as its pulse profile. It is like a fingerprint of the pulsar, and it is possible to identify pulsars from their pulse profile alone. But the pulse profile varies slightly in every period, which makes the pulsar hard to detect: the signals are non-uniform and not entirely stable over time. However, these profiles do become stable when averaged over many thousands of rotations.
- DM-SNR Curve: Radio waves emitted from pulsars reach Earth after traveling long distances through space, which is filled with free electrons. Since radio waves are electromagnetic in nature, they interact with these electrons, and this interaction slows the wave down. The important point is that pulsars emit a wide range of frequencies, and the amount by which the electrons slow down the wave depends on the frequency. Waves with higher frequency are slowed down less than waves with lower frequency, i.e. lower frequencies reach the telescope later than higher frequencies. This is called dispersion.

Dataset Summary:

- 17,898 total examples.
- 1,639 positive examples.
- 16,259 negative examples.
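The four statistics listed under the attribute information (mean, standard deviation, excess kurtosis, skewness) can be sketched for any one-dimensional curve as follows. This is only an illustration, not the procedure used to build the dataset: the function name `profile_statistics`, the toy input, and the choice of population (biased) moment estimators are all assumptions, since the description does not say which estimators were used.

```python
import math
from typing import Dict, Sequence


def profile_statistics(values: Sequence[float]) -> Dict[str, float]:
    """Return mean, standard deviation, excess kurtosis and skewness.

    Assumption: population (biased) moment estimators are used; the
    dataset description does not state which estimator was chosen.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("constant input: skewness and kurtosis are undefined")
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        # Excess kurtosis subtracts 3 so a normal distribution scores 0.
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "skewness": m3 / m2 ** 1.5,
    }


# Hypothetical toy "integrated profile": a flat baseline with one pulse.
toy_profile = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]
features = profile_statistics(toy_profile)
```

Applying the same function to a DM-SNR curve would give the remaining four variables under the same assumptions. A single narrow pulse on a flat baseline yields a positive skewness, which is the kind of signal these summary statistics are meant to capture.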

Example

Example from Prof. Anna Scaife at the University of Manchester, UK: https://as595.github.io/classification/

Source: (https://archive.ics.uci.edu/ml/datasets/HTRU2)

Dr. Robert Lyon, University of Manchester, School of Physics and Astronomy, Alan Turing Building, Manchester M13 9PL, United Kingdom. robert.lyon '@' manchester.ac.uk
