Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Sloan Digital Sky Survey (SDSS) is a comprehensive survey of the northern sky. This dataset contains a subset of this survey, of 100077 objects classified as galaxies, it includes a CSV file with a collection of information and a set of files for each object, namely JPG image files, FITS and spectra data. This dataset is used to train and explore the astromlp-models collection of deep learning models for galaxies characterisation.
The dataset includes a CSV data file where each row is an object from the SDSS database, and with the following columns (note that some data may not be available for all objects):
Besides the CSV file a set of directories are included in the dataset, in each directory you'll find a list of files named after the objid column from the CSV file, with the corresponding data, the following directories tree is available:
sdss-gs/
├── data.csv
├── fits
├── img
├── spectra
└── ssel
Where, each directory contains:
Changelog
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
🃏 Labelled r/Jokes Dataset
A dataset of Reddit jokes from r/Jokes annotated with humor, offensiveness, and sentiment using large language models (LLMs).
📊 Dataset Overview
LLM-Labeled Subset (Mistral-7B): 55,278 jokes
Model-Predicted Subset (Fine-tuned RoBERTa): 518,124 jokes
📄 Column Descriptions
Column Description
date Date the joke was posted on Reddit (r/Jokes)
joke The full text of the joke
score Number of… See the full description on the dataset page: https://huggingface.co/datasets/SajilAwale/FunnyData.
Not seeing a result you expected?
Learn how you can add new datasets to our index.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Sloan Digital Sky Survey (SDSS) is a comprehensive survey of the northern sky. This dataset contains a subset of this survey, of 100077 objects classified as galaxies, it includes a CSV file with a collection of information and a set of files for each object, namely JPG image files, FITS and spectra data. This dataset is used to train and explore the astromlp-models collection of deep learning models for galaxies characterisation.
The dataset includes a CSV data file where each row is an object from the SDSS database, and with the following columns (note that some data may not be available for all objects):
Besides the CSV file a set of directories are included in the dataset, in each directory you'll find a list of files named after the objid column from the CSV file, with the corresponding data, the following directories tree is available:
sdss-gs/
├── data.csv
├── fits
├── img
├── spectra
└── ssel
Where, each directory contains:
Changelog