3 datasets found
  1. Classification results for: A Study on Classification in Imbalanced and...

    • figshare.com
    application/gzip
    Updated May 30, 2023
    Cite
    Robert Lyon (2023). Classification results for: A Study on Classification in Imbalanced and Partially-Labelled Data Streams [Dataset]. http://doi.org/10.6084/m9.figshare.1534548.v1
    Available download formats: application/gzip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Robert Lyon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sets supporting the results reported in the paper: A Study on Classification in Imbalanced and Partially-Labelled Data Streams, R. J. Lyon, J. M. Brooke, J. D. Knowles, B. W. Stappers, Systems, Man, and Cybernetics (SMC), 2013. DOI: 10.1109/SMC.2013.260

    Contained in this distribution are results of stream and static classifier performance on four different data sets. These include:

    - MAGIC Gamma Telescope Data Set: https://archive.ics.uci.edu/ml/datasets/MAGIC+Gamma+Telescope
    - MiniBooNE particle identification Data Set: https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification
    - Skin Segmentation Data Set: https://archive.ics.uci.edu/ml/datasets/Skin+Segmentation

    The fourth data set is not publicly available at present. However, we are in the process of releasing it for public use. Please get in touch if you'd like to use it.

  2. Classification results for: Hellinger Distance Trees for Imbalanced Streams

    • figshare.com
    application/gzip
    Updated May 31, 2023
    Cite
    Robert Lyon (2023). Classification results for: Hellinger Distance Trees for Imbalanced Streams [Dataset]. http://doi.org/10.6084/m9.figshare.1534549.v1
    Available download formats: application/gzip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Robert Lyon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sets supporting the results reported in the paper: Hellinger Distance Trees for Imbalanced Streams, R. J. Lyon, J. M. Brooke, J. D. Knowles, B. W. Stappers, 22nd International Conference on Pattern Recognition (ICPR), pp. 1969-1974, 2014. DOI: 10.1109/ICPR.2014.344

    Contained in this distribution are results of stream classifier performance on four different data sets. Also included are the test results from our attempt at reproducing the outcome of the paper: Learning Decision Trees for Un-balanced Data, D. A. Cieslak and N. V. Chawla, in Machine Learning and Knowledge Discovery in Databases (W. Daelemans, B. Goethals, and K. Morik, eds.), vol. 5211 of LNCS, pp. 241-256, 2008.

    The data sets used for these experiments include:

    - MAGIC Gamma Telescope Data Set: https://archive.ics.uci.edu/ml/datasets/MAGIC+Gamma+Telescope
    - MiniBooNE particle identification Data Set: https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification
    - Skin Segmentation Data Set: https://archive.ics.uci.edu/ml/datasets/Skin+Segmentation
    - Letter Recognition Data Set: https://archive.ics.uci.edu/ml/datasets/Letter+Recognition
    - Pen-Based Recognition of Handwritten Digits Data Set: https://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits
    - Statlog (Landsat Satellite) Data Set: https://archive.ics.uci.edu/ml/datasets/Statlog+(Landsat+Satellite)
    - Statlog (Image Segmentation) Data Set: https://archive.ics.uci.edu/ml/datasets/Statlog+(Image+Segmentation)

    A further data set used is not publicly available at present. However, we are in the process of releasing it for public use. Please get in touch if you'd like to use it.

    A readme file accompanies the data describing it in more detail.

  3. Pulsar Dataset HTRU2

    • kaggle.com
    Updated Jan 10, 2021
    Cite
    Charitarth Chugh (2021). Pulsar Dataset HTRU2 [Dataset]. https://www.kaggle.com/charitarth/pulsar-dataset-htru2/tasks
    Dataset updated
    Jan 10, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Charitarth Chugh
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Pulsars are a rare type of neutron star that produce radio emission detectable here on Earth. They are of considerable scientific interest as probes of space-time, the interstellar medium, and states of matter. Machine learning tools are now being used to automatically label pulsar candidates to facilitate rapid analysis. In particular, classification systems are widely adopted, which treat the candidate data sets as binary classification problems.

    Attribute information:

    Each candidate is described by 8 continuous variables and a single class variable. The first four are simple statistics obtained from the integrated pulse profile (folded profile). This is an array of continuous variables that describe a longitude-resolved version of the signal that has been averaged in both time and frequency. The remaining four variables are similarly obtained from the DM-SNR curve. These are summarised below:

    1. Mean of the integrated profile.
    2. Standard deviation of the integrated profile.
    3. Excess kurtosis of the integrated profile.
    4. Skewness of the integrated profile.
    5. Mean of the DM-SNR curve.
    6. Standard deviation of the DM-SNR curve.
    7. Excess kurtosis of the DM-SNR curve.
    8. Skewness of the DM-SNR curve.
    9. Class.

    Descriptions courtesy of Ustav Murarka:

    - Integrated Pulse Profile: Each pulsar produces a unique pattern of pulse emission known as its pulse profile. It is like a fingerprint of the pulsar, and it is possible to identify pulsars from their pulse profile alone. However, the pulse profile varies slightly in every period, because the signals are non-uniform and not entirely stable over time. This makes the pulsar hard to detect. The profiles do become stable when averaged over many thousands of rotations.
    - DM-SNR Curve: Radio waves emitted from pulsars reach Earth after traveling long distances through space, which is filled with free electrons. Since radio waves are electromagnetic in nature, they interact with these electrons, and this interaction slows the wave down. The important point is that pulsars emit a wide range of frequencies, and the amount by which the electrons slow down the wave depends on the frequency. Waves with higher frequency are slowed down less than waves with lower frequency, i.e. lower frequencies reach the telescope later than higher frequencies. This is called dispersion.

    Dataset Summary:

    - 17,898 total examples.
    - 1,639 positive examples.
    - 16,259 negative examples.
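
    The four statistics named in the attribute list (mean, standard deviation, excess kurtosis, skewness) and the class balance from the dataset summary can be illustrated with a minimal Python sketch. The function name summary_features is hypothetical, and the use of population (divide-by-n) moments is an assumption: the description does not state which estimator the dataset authors used, so values may differ slightly from the published features.

```python
import math


def summary_features(values):
    """Return (mean, std, excess kurtosis, skewness) of a 1-D sequence.

    Assumption: population moments (divide by n). The dataset description
    does not say which estimator was used to build the published features.
    """
    x = [float(v) for v in values]
    n = len(x)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    if variance == 0.0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    std = math.sqrt(variance)
    # Standardised third and fourth central moments.
    skewness = sum(((v - mean) / std) ** 3 for v in x) / n
    # "Excess" kurtosis subtracts 3, the kurtosis of a normal distribution.
    excess_kurtosis = sum(((v - mean) / std) ** 4 for v in x) / n - 3.0
    return mean, std, excess_kurtosis, skewness


# Class balance, using the counts quoted in the dataset summary.
positives, negatives = 1639, 16259
total = positives + negatives
positive_fraction = positives / total

if __name__ == "__main__":
    # Toy stand-in for an integrated profile or DM-SNR curve (not real data).
    toy_curve = [0.1, 0.2, 0.1, 0.9, 3.5, 0.8, 0.2, 0.1]
    print(summary_features(toy_curve))
    print(f"{positives} of {total} examples are positive "
          f"({100 * positive_fraction:.1f}%)")
```

    Applying the same function once to the integrated profile and once to the DM-SNR curve would give the eight continuous variables in the order listed. The small positive fraction is why plain accuracy is a weak metric on this data: a classifier that always predicts the negative class would still be right most of the time.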

    Example

    Example from Prof. Anna Scaife at the University of Manchester, UK: https://as595.github.io/classification/

    Source: https://archive.ics.uci.edu/ml/datasets/HTRU2

    Dr. Robert Lyon University of Manchester School of Physics and Astronomy Alan Turing Building Manchester M13 9PL United Kingdom robert.lyon '@' manchester.ac.uk

