2 datasets found
  1. SDSS Galaxy Subset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Sep 6, 2022
    Nuno Ramos Carvalho (2022). SDSS Galaxy Subset [Dataset]. http://doi.org/10.5281/zenodo.7050898
    Available download formats: application/gzip
    Dataset updated: Sep 6, 2022
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Nuno Ramos Carvalho
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Sloan Digital Sky Survey (SDSS) is a comprehensive survey of the northern sky. This dataset contains a subset of that survey: 100077 objects classified as galaxies. It includes a CSV file with a collection of information and a set of files for each object, namely JPG image files, FITS data, and spectra data. The dataset is used to train and explore the astromlp-models collection of deep learning models for galaxy characterisation.

    The CSV data file has one row per object from the SDSS database, with the following columns (note that some data may not be available for all objects):

    • objid: unique SDSS object identifier
    • mjd: MJD of observation
    • plate: plate identifier
    • tile: tile identifier
    • fiberid: fiber identifier
    • run: run number
    • rerun: rerun number
    • camcol: camera column
    • field: field number
    • ra: right ascension
    • dec: declination
    • class: spectroscopic class (only objects classified as GALAXY are included)
    • subclass: spectroscopic subclass
    • modelMag_u: better of DeV/Exp magnitude fit for band u
    • modelMag_g: better of DeV/Exp magnitude fit for band g
    • modelMag_r: better of DeV/Exp magnitude fit for band r
    • modelMag_i: better of DeV/Exp magnitude fit for band i
    • modelMag_z: better of DeV/Exp magnitude fit for band z
    • redshift: final redshift from the SDSS data (z)
    • stellarmass: stellar mass extracted from the eBOSS Firefly catalog
    • w1mag: WISE W1 "standard" aperture magnitude
    • w2mag: WISE W2 "standard" aperture magnitude
    • w3mag: WISE W3 "standard" aperture magnitude
    • w4mag: WISE W4 "standard" aperture magnitude
    • gz2c_f: Galaxy Zoo 2 classification from Willett et al 2013
    • gz2c_s: simplified version of Galaxy Zoo 2 classification (labels set)
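
As a rough illustration of working with these columns, the sketch below parses a CSV with the documented header names, tolerates missing values, converts mjd to a calendar date, and derives a u − r colour from the model magnitudes. The two sample rows are invented for illustration and only a few of the columns are shown; the helper names (to_float, load_rows) are hypothetical and not part of the dataset or of astromlp-models.

```python
import csv
import io
from datetime import date, timedelta

# Two made-up rows standing in for data.csv. Every value is invented for
# illustration; only the column names come from the dataset description.
SAMPLE_CSV = """objid,mjd,ra,dec,class,modelMag_u,modelMag_r,redshift
1001,51544,150.1,2.2,GALAXY,19.5,17.0,0.08
1002,52000,151.3,2.9,GALAXY,,17.8,0.12
"""

MJD_EPOCH = date(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17


def to_float(text):
    """Return a float, or None when the field is empty or not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def load_rows(handle):
    """Parse rows, converting numeric fields and tolerating missing data."""
    rows = []
    for raw in csv.DictReader(handle):
        u = to_float(raw.get("modelMag_u"))
        r = to_float(raw.get("modelMag_r"))
        mjd = to_float(raw.get("mjd"))
        rows.append({
            # Kept as text so it can be matched against file names later.
            "objid": raw["objid"],
            "observed": MJD_EPOCH + timedelta(days=int(mjd)) if mjd is not None else None,
            "redshift": to_float(raw.get("redshift")),
            # Colour index; None when either magnitude is missing.
            "u_minus_r": u - r if u is not None and r is not None else None,
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row)
```

To read the real file, the same function could be called with open("sdss-gs/data.csv", newline=""), assuming the archive has been extracted into the current directory.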

    Besides the CSV file, the dataset includes a set of directories. Each directory holds files named after the objid column of the CSV file, containing the corresponding data. The following directory tree is available:

    sdss-gs/
    ├── data.csv
    ├── fits
    ├── img
    ├── spectra
    └── ssel

    Each directory contains:

    • img: RGB images of the object in JPEG format, 150x150 pixels, generated using the SkyServer DR16 API
    • fits: FITS data subsets around the object across the u, g, r, i, z bands; the cutout is made using the ImageCutter library
    • spectra: full best fit spectra data from SDSS at wavelengths between 4000 and 9000 Å
    • ssel: best fit spectra data from SDSS for specific selected intervals of wavelengths discussed by Sánchez Almeida 2010
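
Because files are named after objid, the data for one object can be gathered by looking in each directory for that identifier. The sketch below does this under stated assumptions: the description does not give file extensions, so it matches on the file stem rather than guessing them, and the function name files_for_object and the demo file names are hypothetical.

```python
from pathlib import Path
import tempfile

# Directory names taken from the dataset's directory tree.
SUBDIRS = ("img", "fits", "spectra", "ssel")


def files_for_object(root, objid):
    """Map each data directory to the file names that start with '<objid>.'.

    Extensions are not stated in the dataset description, so any
    extension is accepted instead of assuming one.
    """
    root = Path(root)
    found = {}
    for sub in SUBDIRS:
        directory = root / sub
        if directory.is_dir():
            found[sub] = sorted(p.name for p in directory.glob(f"{objid}.*"))
        else:
            found[sub] = []  # directory missing from this extraction
    return found


# Demo on a throwaway tree; the file names and extensions are invented.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "sdss-gs"
    for sub, name in [("img", "1001.jpg"), ("spectra", "1001.csv"), ("img", "1002.jpg")]:
        (demo_root / sub).mkdir(parents=True, exist_ok=True)
        (demo_root / sub / name).touch()
    result = files_for_object(demo_root, "1001")

print(result)
```

Matching on the stem keeps the lookup independent of the actual file formats, which also helps with the note that some data may not be available for all objects: a missing file simply yields an empty list.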

    Changelog

    • v0.0.4 - Increase number of objects to ~100k.
    • v0.0.3 - Increase number of objects to ~80k.
    • v0.0.2 - Increase number of objects to ~60k.
    • v0.0.1 - Initial import.
  2. FunnyData

    • huggingface.co
    Sajil Awale, FunnyData [Dataset]. https://huggingface.co/datasets/SajilAwale/FunnyData
    Authors: Sajil Awale
    License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically
    Description

    🃏 Labelled r/Jokes Dataset

    A dataset of Reddit jokes from r/Jokes annotated with humor, offensiveness, and sentiment using large language models (LLMs).

    📊 Dataset Overview

    • LLM-Labeled Subset (Mistral-7B): 55,278 jokes
    • Model-Predicted Subset (Fine-tuned RoBERTa): 518,124 jokes

    📄 Column Descriptions

    • date: Date the joke was posted on Reddit (r/Jokes)
    • joke: The full text of the joke
    • score: Number of…

    See the full description on the dataset page: https://huggingface.co/datasets/SajilAwale/FunnyData.


SDSS Galaxy Subset: 4 scholarly articles cite this dataset (View in Google Scholar)