18 datasets found

Galaxy, star, quasar dataset
scidb.cn
Updated Feb 3, 2023
Cite
Li Xin (2023). Galaxy, star, quasar dataset [Dataset]. http://doi.org/10.57760/sciencedb.07177
Unique identifier
https://doi.org/10.57760/sciencedb.07177
Dataset updated
Feb 3, 2023
Dataset provided by
Science Data Bank
Authors
Li Xin
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data used in this paper come from the 16th data release of the Sloan Digital Sky Survey (SDSS-DR16), which contains a total of 930,268 photometric images, with 1.2 billion observed sources and tens of millions of spectra. The data were downloaded from the official SDSS website, using SQL queries submitted through the SkyServer API in the CasJobs sub-site. The SDSS photometric table PhotoObj can only classify observed sources as point sources or surface (extended) sources, whereas spectra allow the target sources to be classified more precisely as galaxies, stars and quasars. We therefore obtain calibrated sources in CasJobs by cross-matching SpecPhoto with the PhotoObj star list, together with the target position information (right ascension and declination). Calibrated sources can be told apart precisely and quickly: each one is labeled through the parameter "Class" as "galaxy", "star", or "quasar".

In this paper, observation-day areas 3462, 3478, 3530 and four other areas in SDSS-DR16 are selected as experimental data, because a large number of sources can be obtained in these areas, providing rich sample data for the experiment. For example, there are 9891 sources in the 3462 area, including 2790 galaxy sources, 2378 stellar sources and 4723 quasar sources. There are 3862 sources in the 3478 area, including 1759 galaxy sources, 577 stellar sources and 1526 quasar sources. FITS is a commonly used data format in the astronomical community. By cross-matching the star list and the FITS files in the local celestial region, we obtained images in the five bands u, g, r, i and z of 12499 galaxy sources, 16914 quasar sources and 16908 star sources as training and testing data.

1.1 Image Synthesis

SDSS photometric data include photometric images in five bands (u, g, r, i and z), each packaged as a single-band FITS file. Images in different bands contain different information. Since the g, r and i bands contain more feature information and less noise, astronomical researchers typically use these three bands as the three color channels (R, G and B) of a synthesized photometric image. Different bands generally cannot be combined directly, because the images of the different bands may not be aligned. Therefore, this paper adopts the RGB multi-band image synthesis software written by He Zhendong et al. to synthesize images from the g, r and i bands, which avoids the alignment problem. Each photometric image in this paper measures 2048×1489 pixels.

1.2 Data tailoring

We first crop the target images. Cropping can be done with image segmentation tools; in this paper it is implemented in Python. During cropping, we convert the right ascension and declination of each source in the star list into pixel coordinates on the photometric image through the coordinate conversion formula, and use the pixel coordinates to locate the source. These coordinates are taken as the center point, and a rectangular box is cut out around them. We found that the input image size affects the experimental results. Therefore, according to the target size of the sources, we tried three cutout sizes: 40×40, 60×60 and 80×80. Through experiment and analysis, we find that the convolutional neural network learns better and reaches higher accuracy on the smaller images. In the end, we cropped the surface-source galaxies and the point-source quasars and stars to 40×40.

1.3 Division of training and test data

For the algorithm to recognize sources accurately, we need enough image samples. The selection of the training set, verification set and test set is an important factor affecting the final recognition accuracy. In this paper, the training set, verification set and test set are split in the ratio 8:1:1. The verification set is used to revise the algorithm, and the test set is used to evaluate the generalization ability of the final algorithm. Table 1 shows the specific data partitioning information. The total sample size is 34,000 source images, including 11543 galaxy sources, 11967 star sources, and 10490 quasar sources.

1.4 Data preprocessing

In this experiment, the training set and test set are used as the training and test input of the algorithm after data preprocessing. The quantity and quality of the data largely determine the recognition performance of the algorithm. The training set and the test set are preprocessed differently. For the training set, we apply vertical flips, horizontal flips and scaling to the cropped images to enrich the data samples and improve the generalization ability of the algorithm. Since the features of celestial sources are flip-invariant, the labels of galaxies, stars and quasars do not change under these transformations. For the test set, preprocessing is simpler: we only scale the input image and feed the result to the algorithm.
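The cropping and 8:1:1 partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `crop_box` and `split_8_1_1`, the policy of shifting a cutout that would cross the image border, and the fixed random seed are all hypothetical. Only the 40×40 cutout size, the 2048×1489 image size, the 34,000-sample total and the 8:1:1 ratio come from the description.

```python
import random

def crop_box(cx, cy, size=40, width=2048, height=1489):
    """Return (left, top, right, bottom) of a size x size box centred on (cx, cy).

    The box is shifted (not shrunk) when it would cross the image border.
    That border policy is an assumption; the description does not state one.
    """
    half = size // 2
    left = min(max(int(round(cx)) - half, 0), width - size)
    top = min(max(int(round(cy)) - half, 0), height - size)
    return left, top, left + size, top + size

def split_8_1_1(items, seed=0):
    """Shuffle items and split them into training, verification and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed: assumption, for repeatability
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_8_1_1(range(34000))
print(len(train), len(val), len(test))  # 27200 3400 3400
print(crop_box(1000.2, 700.7))          # (980, 681, 1020, 721)
```

Shuffling before splitting keeps the three classes mixed across the partitions; a stratified split per class would be a reasonable alternative if exact class balance matters.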
UniverseMachine Data Release 1
arizona.figshare.com
png
Updated May 31, 2023
Cite
Peter Behroozi; Risa Wechsler; Andrew Hearin; Charlie Conroy (2023). UniverseMachine Data Release 1 [Dataset]. http://doi.org/10.25422/azu.data.12093972.v1
Available download formats: png
Unique identifier
https://doi.org/10.25422/azu.data.12093972.v1
Dataset updated
May 31, 2023
Dataset provided by
University of Arizona Research Data Repository
Authors
Peter Behroozi; Risa Wechsler; Andrew Hearin; Charlie Conroy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The UniverseMachine is a self-consistent empirical model of galaxy formation in dark matter halos. It is constrained via observed galaxy stellar mass functions, star formation rates, clustering, luminosity functions, and quenched fractions. This dataset includes derived constraints on galaxy-halo relationships, star formation histories, merger histories, and predicted observables.

Full mock catalogs with galaxy properties are available here.

For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests, trouble downloading, etc.) can be directed to data-management@arizona.edu.
Training data for 'Exome sequencing data analysis' tutorial (Galaxy Training...
zenodo.org
bin
Updated Aug 4, 2022
Cite
Wolfgang Maier; Wolfgang Maier (2022). Training data for 'Exome sequencing data analysis' tutorial (Galaxy Training Material) [Dataset]. http://doi.org/10.5281/zenodo.3054169
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.3054169
Dataset updated
Aug 4, 2022
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Wolfgang Maier; Wolfgang Maier
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The data used in this tutorial are a subset of the data published previously in Training material for the course "Exome analysis with GALAXY". Credit for uploading the original data goes to Paolo Uva and Gianmauro Cuccuru!

Specifically, you may need the following datasets for following the tutorial:

Raw sequencing reads

https://zenodo.org/record/3243160/files/father_R1.fq.gz

https://zenodo.org/record/3243160/files/father_R2.fq.gz

https://zenodo.org/record/3243160/files/mother_R1.fq.gz

https://zenodo.org/record/3243160/files/mother_R2.fq.gz

https://zenodo.org/record/3243160/files/proband_R1.fq.gz

https://zenodo.org/record/3243160/files/proband_R2.fq.gz

Premapped sequencing reads

https://zenodo.org/record/3243160/files/mapped_reads_father.bam

https://zenodo.org/record/3243160/files/mapped_reads_mother.bam

https://zenodo.org/record/3243160/files/mapped_reads_proband.bam

Reference sequence (human chromosome 8)

https://zenodo.org/record/3243160/files/hg19_chr8.fa.gz
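The file list above follows a regular naming pattern, so it can be generated rather than typed by hand. The sketch below only rebuilds the URLs already listed; the function name `tutorial_urls` and the grouping keys are hypothetical, and no download is performed.

```python
BASE_URL = "https://zenodo.org/record/3243160/files/"

def tutorial_urls():
    """Rebuild the tutorial's download URLs, grouped by kind of file."""
    members = ["father", "mother", "proband"]
    # Paired-end reads: one R1 and one R2 file per family member.
    raw = [f"{BASE_URL}{m}_R{i}.fq.gz" for m in members for i in (1, 2)]
    mapped = [f"{BASE_URL}mapped_reads_{m}.bam" for m in members]
    reference = [f"{BASE_URL}hg19_chr8.fa.gz"]
    return {"raw": raw, "mapped": mapped, "reference": reference}

urls = tutorial_urls()
for kind, links in urls.items():
    print(kind, len(links))
```

Any of these URLs can then be passed to a downloader of your choice or pasted into Galaxy's upload dialog.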

If you would just like to play with GEMINI rather than work through the full tutorial, you'll find below a prebuilt GEMINI database (for GEMINI version 0.20.1) for the family trio. You can start exploring this database without having to run GEMINI load and, in fact, without having to install GEMINI's bundled annotation data.
Photometric Galaxy Redshift Prediction
zenodo.org
csv
Updated Apr 26, 2024
Cite
grigorios tsagkatakis; grigorios tsagkatakis (2024). Photometric Galaxy Redshift Prediction [Dataset]. http://doi.org/10.5281/zenodo.11073039
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.11073039
Dataset updated
Apr 26, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
grigorios tsagkatakis; grigorios tsagkatakis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sloan Digital Sky Survey (SDSS) Galaxy Redshift Dataset

This dataset comprises a curated collection of galaxy observations from the Sloan Digital Sky Survey (SDSS). It features photometric and spectroscopic data for 100 galaxies, specifically selected to cover a range of redshifts from 0 to 0.4. The dataset includes the following key parameters for each galaxy:

Photometric Data: Magnitudes in the SDSS 'u', 'g', 'r', 'i', and 'z' bands.

Spectroscopic Data: Measured redshift (redshift) and its error (redshift_error).

Additional Metadata:

objid: Unique identifier for the photometric object.

specObjID: Unique identifier for the spectroscopic object.

ra: Right ascension in decimal degrees.

dec: Declination in decimal degrees.

class: Classification of the object, all marked as 'GALAXY'.

Purpose and Use

This dataset is intended for use in astronomical research and education, particularly in studies involving galaxy properties and distribution, cosmology, and machine learning applications such as redshift prediction models. The data is well-suited for developing and testing predictive models that estimate redshifts from photometric data, aiding in the expansion of accessible astronomical analysis tools.
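As a starting point for a redshift prediction model, the magnitudes can be turned into colors (differences between adjacent bands), which is one common choice of input features for photometric redshift estimation. The sketch below is an assumption-laden illustration: the two data rows are made up and only mimic the documented column names, and `color_features` is a hypothetical helper, not part of the dataset.

```python
import csv
import io

# Two made-up rows that only mimic the documented column names.
SAMPLE_CSV = """objid,specObjID,ra,dec,u,g,r,i,z,redshift,redshift_error,class
1,11,150.0,2.0,19.5,18.2,17.4,17.0,16.7,0.10,0.00002,GALAXY
2,22,151.0,2.5,20.1,19.0,18.1,17.6,17.3,0.25,0.00003,GALAXY
"""

def color_features(row):
    """Return the four adjacent-band colors (u-g, g-r, r-i, i-z) for one row."""
    bands = ["u", "g", "r", "i", "z"]
    mags = [float(row[b]) for b in bands]
    return [round(a - b, 6) for a, b in zip(mags, mags[1:])]

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
X = [color_features(r) for r in rows]      # model inputs
y = [float(r["redshift"]) for r in rows]   # regression targets
print(X[0], y[0])  # [1.3, 0.8, 0.4, 0.3] 0.1
```

With the real file, the same code applies once `io.StringIO(SAMPLE_CSV)` is replaced by an opened CSV file; `X` and `y` can then be handed to any regression model.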

Data Collection Method

The data was extracted using SQL queries against the public SDSS DR16 database, ensuring accuracy and relevance in current astronomical research contexts.

Accessibility

The dataset is made available under a CC0 license to promote open scientific research and collaboration within the astronomical community and beyond.
Bulk RNA-Seq Deconvolution with single-cell RNA-Seq Datasets
explore.openaire.eu
Updated Oct 6, 2021
Cite
Wendi Bacon; Mehmet Tekman (2021). Bulk RNA-Seq Deconvolution with single-cell RNA-Seq Datasets [Dataset]. http://doi.org/10.5281/zenodo.5719228
Unique identifier
https://doi.org/10.5281/zenodo.5719228
Dataset updated
Oct 6, 2021
Authors
Wendi Bacon; Mehmet Tekman
Description
Bulk data of human pancreas

The dataset from Fadista et al. (2014) contains raw read counts from bulk RNA-seq of human pancreatic islets, collected to study glucose metabolism in healthy and hyper-hypoglycemic conditions. For the purpose of this vignette, the dataset is pre-processed and made available on the data download page. In addition to read counts, this dataset also contains HbA1c levels, BMI, gender and age information for each subject.

Single cell data of human pancreas

The single-cell data are from Segerstolpe et al. (2016) and contain read counts for 25453 genes across 2209 cells. Here we only include the 1097 cells from 6 healthy subjects. The read counts are available on the data download page, in the form of an ExpressionSet. Another single-cell dataset is from Xin et al. (2016), which has 39849 genes and 1492 cells. The read counts are available on the data download page, in the form of an ExpressionSet.

The deconvolution of 89 subjects from Fadista et al. (2014) is performed with bulk data GSE50244.bulk.eset and single-cell reference EMTAB.eset. We constrained our estimation to 6 major cell types: alpha, beta, delta, gamma, acinar and ductal, which make up over 90% of the whole islet.
Galaxy Zoo 2: Images
kaggle.com
Updated Jan 26, 2021
+ more versions
Cite
Jaime Trickz (2021). Galaxy Zoo 2: Images [Dataset]. https://www.kaggle.com/jaimetrickz/galaxy-zoo-2-images/tasks
Dataset updated
Jan 26, 2021
Dataset provided by
Kaggle
Authors
Jaime Trickz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

The Galaxy Zoo team regularly receives requests for subject images for various versions of Galaxy Zoo, in order to facilitate other investigations, e.g. machine learning projects. This repository is an updated attempt to provide those in a way that is useful to the wider community.

Content

There are 243,434 images in total. This is off by about 0.08% from the total count in the tables; it's not clear what causes the discrepancy.

The images are available in the file images_gz2.

The most recent and reliable source for morphology measurements is "GZ2 - Table 1 - Normal-depth sample with new debiasing method – CSV" (from Hart et al. 2016), which is available at data.galaxyzoo.org. To cross-reference the images with Table 1, this sample includes another CSV table (gz2_filename_mapping.csv), which contains three columns and 355,990 rows. The columns are:

objid: the Data Release 7 (DR7) object ID for each galaxy. This should match the first column in Table 1.

sample: string indicating the subsampling of the galaxy.

asset_id: an integer that corresponds to the filename of the image in the zipped file linked above.
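The cross-reference described above amounts to a lookup from objid to image file. The following sketch shows the idea with the standard library only. The sample rows are made up, `image_paths` is a hypothetical helper, and the `.jpg` extension and the `images_gz2/` directory layout are assumptions that should be checked against the unpacked archive.

```python
import csv
import io

# Made-up rows using the three documented columns of gz2_filename_mapping.csv.
SAMPLE_MAPPING = """objid,sample,asset_id
1001,original,7
1002,original,8
"""

def image_paths(mapping_file, image_dir="images_gz2", ext=".jpg"):
    """Map each DR7 objid (kept as a string) to the path of its image file.

    image_dir and ext are assumptions about how the archive is laid out.
    """
    reader = csv.DictReader(mapping_file)
    return {row["objid"]: f"{image_dir}/{row['asset_id']}{ext}" for row in reader}

paths = image_paths(io.StringIO(SAMPLE_MAPPING))
print(paths["1001"])  # images_gz2/7.jpg
```

Keeping objid as a string avoids any loss of precision when the long SDSS identifiers pass through tools that would otherwise read them as floating-point numbers.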

Acknowledgements

They are the "original" sample of subject images in Galaxy Zoo 2 (Willett et al. 2013, MNRAS, 435, 2835, DOI: 10.1093/mnras/stt1458) as identified in Table 1 of Willett et al. and also in Hart et al. (2016, MNRAS, 461, 3663, DOI: 10.1093/mnras/stw1588).

Inspiration

I want to know if it's possible to cluster the images into the galaxy shape types of the Hubble - de Vaucouleurs Galaxy Morphology Diagram:

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F6067505%2F8ac7df09aa0f85a1a07ac9dc0a81b57f%2FHubble_-_de_Vaucouleurs_Galaxy_Morphology_Diagram.png?generation=1611680439647479&alt=media

Ellipticals: shapes from spherical to cylindrical, with almost homogeneous density.

Spirals: two or more arms (like the classical view of the Milky Way Galaxy) and a dense core.

Irregulars: no defined shape, heterogeneous density.

If these three are not enough and you want to improve your notebook, it is possible to add:

Lenticulars: disk-shaped galaxies with a dense core.

Barred Spirals: spirals whose arms are straight near the core and bent far from it.

Usual Spirals: spirals whose arms are bent from the core to the end.

Intermediate Spirals: spirals with non-defined arms.

Dwarf Galaxy: tiny, irregular, heterogeneous galaxy.

I didn't add these to the first clusters because, depending on the angle of the galaxy, some lenticulars may look like ellipticals or spirals, the arms of spiral galaxies are not always visible, and it is hard to tell whether a galaxy is tiny or big from just a photograph with nothing to compare it against.
Data from: PiRATE: a Pipeline to Retrieve and Annotate Transposable Elements...
seanoe.org
bin
Updated 2018
+ more versions
Cite
Jeremy Berthelier; Nathalie Casse; Nicolas Daccord; Véronique Jamilloux; Bruno Saint-Jean; Gregory Carrier (2018). PiRATE: a Pipeline to Retrieve and Annotate Transposable Elements [Dataset]. http://doi.org/10.17882/51795
Available download formats: bin
Unique identifier
https://doi.org/10.17882/51795
Dataset updated
2018
Dataset provided by
SEANOE
Authors
Jeremy Berthelier; Nathalie Casse; Nicolas Daccord; Véronique Jamilloux; Bruno Saint-Jean; Gregory Carrier
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
To date, genome assemblies of non-model organisms are usually not at chromosomal level and are highly fragmented. This fragmentation is recognized to be, in part, the result of a bad assembly of the transposable element (TE) copies, which increases the difficulty of detecting and annotating them.

In this context, we designed a new bioinformatics pipeline named PiRATE to detect, classify and annotate the TEs of non-model organisms. PiRATE combines multiple analysis packages representing all the major approaches for TE detection. The goal is to promote the detection of complete TE sequences of every TE family. Detecting complete TE sequences, bearing recognizable conserved domains or specific motifs, facilitates the classification step. The classification step of PiRATE has been optimized for algal genomes.

All the tools used by PiRATE are automated into a stand-alone Galaxy. This PiRATE-Galaxy can be used through a virtual machine, which can be downloaded below.

This PiRATE-Galaxy is a suitable and flexible platform to study TEs in the genome of any organism.

You can find a tutorial below.

Please contact us if you have any issues or comments: berthelier.j [at] laposte.net or gregory.carrier [at] ifremer.fr, or you can leave a message on GitHub: https://github.com/jberthelier/pirate/issues
Training data for 'Unicycler assembly of SARS-CoV-2 genome with...
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Aug 4, 2022
Cite
Wolfgang Maier; Wolfgang Maier (2022). Training data for 'Unicycler assembly of SARS-CoV-2 genome with preprocessing to remove human genome reads' tutorial (Galaxy Training Material) [Dataset]. http://doi.org/10.5281/zenodo.3732359
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.3732359
Dataset updated
Aug 4, 2022
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Wolfgang Maier; Wolfgang Maier
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The data here is a copy of the corresponding SRR records in the NCBI SRA. The duplication serves a dual purpose:

as a backup should there be problems connecting to NCBI servers, e.g., during Galaxy user trainings.

to illustrate how to obtain raw sequencing data from alternative sources, and to organize the data into the same collection structure in a Galaxy history that is generated by specialized Galaxy SRA download tools.
Galaxy Entertainment Group Ltd. Online Gambling Market Insights
statistics.technavio.org
Updated Feb 9, 2024
Cite
Technavio (2024). Galaxy Entertainment Group Ltd. Online Gambling Market Insights [Dataset]. https://statistics.technavio.org/galaxy-entertainment-group-ltd-online-gambling-market-insights
Dataset updated
Feb 9, 2024
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2021 - 2025
Area covered
Worldwide
Description
The online gambling market is expected to grow at a CAGR of 11% during the forecast period. This market growth can be attributed to various factors including rising popularity of the freemium model.

The online gambling market report offers several other valuable insights such as:

CAGR of the market during the forecast period 2020-2024

Detailed information on factors that will drive online gambling market growth during the next five years

Precise estimation of the online gambling market size and its contribution to the parent market

Accurate predictions on upcoming trends and changes in consumer behavior

The growth of the online gambling market industry across APAC, Europe, MEA, North America, and South America

A thorough analysis of the market’s competitive landscape and detailed information on vendors

Comprehensive details of factors that will challenge the growth of online gambling market vendors
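For readers unfamiliar with the term, a compound annual growth rate (CAGR) compounds once per year, so a value grows by a factor of (1 + rate) to the power of the number of years. The sketch below illustrates the arithmetic with the 11% rate quoted in the description; the starting value of 100 and the four-year horizon are arbitrary illustrative numbers, not figures from the report, and the function names are hypothetical.

```python
def project(value, cagr, years):
    """Value after compounding at a constant annual growth rate."""
    return value * (1 + cagr) ** years

def implied_cagr(start, end, years):
    """Constant annual growth rate that takes start to end in the given years."""
    return (end / start) ** (1 / years) - 1

# 100 is an arbitrary index value, not a market size from the report.
print(round(project(100.0, 0.11, 4), 2))  # 151.81
```

At 11% per year, a market therefore grows by roughly half over four years.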
Galactic interstellar dust Gaia-2MASS 3D maps
cdsarc.u-strasbg.fr
Updated May 28, 2019
+ more versions
Cite
CDS (2019). Galactic interstellar dust Gaia-2MASS 3D maps [Dataset]. http://doi.org/10.26093/cds/vizier.36250135
Unique identifier
https://doi.org/10.26093/cds/vizier.36250135
Dataset updated
May 28, 2019
Dataset provided by
CDS
Description
VizieR Online Data Catalog: Galactic interstellar dust Gaia-2MASS 3D maps (Lallement R.+, 2019)
Data from: Genetic Characteristics and Phylogenetic Relationships of 18...
figshare.com
zip
Updated Mar 29, 2025
Cite
Wenyu Sun (2025). Genetic Characteristics and Phylogenetic Relationships of 18 Anchovy Species Based on Mitochondrial Genomes in the Seas Around China [Dataset]. http://doi.org/10.6084/m9.figshare.28227167.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.28227167.v2
Dataset updated
Mar 29, 2025
Dataset provided by
Figshare: http://figshare.com/
Authors
Wenyu Sun
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We downloaded the complete mitochondrial genome data of 18 Engraulidae fish species from the NCBI database (https://www.ncbi.nlm.nih.gov/). These files were stored in the “Download data” folder. Subsequently, we reannotated these mitochondrial genomes using the MITOS2 online tool available on the Galaxy website (https://usegalaxy.org/) and manually modified the original gb files to adjust the inaccurately annotated control regions and to add the annotation information for the light-strand replication origin. The revised files were saved in the “Reannotation” folder and were used for subsequent analyses.
Data from: The Massive and Distant Clusters of WISE Survey. XII. Exploring...
zenodo.org
bin, zip
Updated Apr 26, 2024
+ more versions
Cite
Mustafa Muhibullah; Mustafa Muhibullah (2024). The Massive and Distant Clusters of WISE Survey. XII. Exploring X-ray AGN in Dynamically Active Massive Galaxy Clusters at z ∼ 1 [Dataset]. http://doi.org/10.5281/zenodo.11074555
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.11074555
Dataset updated
Apr 26, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Mustafa Muhibullah; Mustafa Muhibullah
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset corresponds to the manuscript titled "The Massive and Distant Clusters of WISE Survey. XII. Exploring X-ray AGN in Dynamically Active Massive Galaxy Clusters at z ∼ 1," which has been submitted to The Astrophysical Journal. To reproduce the plots and access the catalogs used in the paper, please download and extract all the zip folders and the Jupyter Notebook "madcows_master_notebook.ipynb" provided under the same directory. Then, open the notebook and follow the instructions provided within. If you encounter any issues, please contact the corresponding author for assistance.
Global smartphone unit shipments of Samsung 2010-2024, by quarter
statista.com
Updated Jul 1, 2025
Cite
Statista (2025). Global smartphone unit shipments of Samsung 2010-2024, by quarter [Dataset]. https://www.statista.com/statistics/299144/samsung-smartphone-shipments-worldwide/
Dataset updated
Jul 1, 2025
Dataset authored and provided by
Statista: http://statista.com/
Area covered
Worldwide
Description
In the fourth quarter of 2024, Samsung shipped around ** million smartphones, a decrease from both the previous quarter and the same quarter of the previous year. Samsung’s sales consistently place the smartphone giant among the top three smartphone vendors in the world, alongside Xiaomi and Apple.

Samsung smartphone sales – how many phones does Samsung sell?

Global smartphone sales reached over *** billion units during 2024. While the global smartphone market is led by Samsung and Apple, Xiaomi has gained ground following the decline of Huawei. Together, these three companies hold more than ** percent of the global smartphone market share.
Sloan Digital Sky Survey - DR18
kaggle.com
Updated Jul 29, 2023
Cite
Farid R (2023). Sloan Digital Sky Survey - DR18 [Dataset]. https://www.kaggle.com/datasets/diraf0/sloan-digital-sky-survey-dr18/code
Dataset updated
Jul 29, 2023
Dataset provided by
Kaggle
Authors
Farid R
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset consists of 100,000 observations from the Data Release (DR) 18 of the Sloan Digital Sky Survey (SDSS). Each observation is described by 42 features and 1 class column classifying the observation as either:

a STAR

a GALAXY

a QSO (Quasi-Stellar Object) or a Quasar.

You can read more about the features below:

Objid, Specobjid - Object Identifiers

ra - J2000 Right Ascension

dec - J2000 Declination

redshift - Final Redshift of the celestial object

u, g, r, i, and z - better of DeV/Exp magnitude fit for u, g, r, i, and z. u, g, r, i, and z correspond to the five photometric bands namely ultraviolet band, green band, red band, infrared band, and near infrared band respectively.

run - Run number

rerun - Rerun number

camcol - Camera column

field - Field number

The run number refers to a specific period in which the SDSS observes a part of the sky. SDSS is divided into several runs, each lasting for a certain amount of time, which are then combined to cover an extensive portion of the sky. The rerun number refers to the reprocessing of the data obtained.

In each run, multiple charge-coupled device (CCD) cameras are arranged into columns, each of which images a specific portion of the sky. camcol refers to the camera column number which imaged a specific observation. A field is a specific portion of the sky that is imaged during a single exposure of the telescope. The entire sky is divided into a number of fields, and the field number column refers to the field or portion of the sky from which an observation was obtained.

plate - Plate number

fiberID - Optical Fiber ID

A number of physical glass plates are mounted on the telescope, each containing a number of optical fibers corresponding to a specific position in the sky. When light hits these optical fibers, it is sent to spectrographs for analysis. plate number and fiberID refer to the number of the plate and the ID of the optical fiber responsible for gathering light from the celestial object respectively.

mjd - Modified Julian Date

Modified Julian Date represents the number of days that have passed since midnight Nov. 17, 1858. It is used in SDSS to keep track of the time of each observation.
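The definition above translates directly into a date conversion. The sketch below uses only the standard library; `mjd_to_datetime` is a hypothetical helper name, and the result is a naive datetime that ignores leap seconds and time-scale subtleties, which is an assumption that is usually acceptable for labelling observations.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # midnight, Nov. 17, 1858

def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date (days, possibly fractional) to a datetime."""
    return MJD_EPOCH + timedelta(days=mjd)

print(mjd_to_datetime(51544))    # 2000-01-01 00:00:00
print(mjd_to_datetime(51544.5))  # 2000-01-01 12:00:00
```

The fractional part of an MJD carries the time of day, so half a day corresponds to noon.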

petroRad_u, petroRad_g, petroRad_r, petroRad_i, and petroRad_z - Petrosian Radii for the five photometric bands u (ultraviolet), g (green), r (red), i (infrared), and z (near-infrared) respectively.

The Petrosian radius is a measure of the size of a galaxy, and it is calculated using the Petrosian flux profile, which measures how the brightness of an object varies with distance from its center. The Petrosian radius is defined as the distance from the galaxy's center at which the ratio of the local surface brightness to the average surface brightness reaches a certain predefined value. The local surface brightness is the brightness measured in a narrow ring at that distance from the center. The average surface brightness is the mean brightness inside that distance: the total light received from within the radius divided by the enclosed area.

These parameters help in characterizing the properties of celestial objects, especially when studying their morphologies, sizes, and how they evolve over time.

petroFlux_u, petroFlux_g, petroFlux_r, petroFlux_i, and petroFlux_z - Petrosian Fluxes for the five photometric bands u (ultraviolet), g (green), r (red), i (infrared), and z (near-infrared) respectively. These features describe the total amount of light emitted from the celestial objects.

These parameters help in studying the photometric properties of the celestial objects, particularly in analyzing the brightness, colors, and spectral energy distribution of the objects. By using Petrosian fluxes in different bands, astronomers can obtain a comprehensive view of an object's light emission across the electromagnetic spectrum.

petroR50_u, petroR50_g, petroR50_r, petroR50_i, and petroR50_z - Petrosian half-light radii for the five photometric bands u (ultraviolet), g (green), r (red), i (infrared), and z (near-infrared) respectively. PetroR50 is a measure of the radius at which half of the total light (or flux) emitted from a celestial object is enclosed within the Petrosian aperture. The Petrosian aperture is defined based on the Petrosian radius, which is a measure of the size of the celestial object. The Petrosian aperture allows a...
Pan-cancer Aberrant Pathway Activity Analysis (PAPAA)
zenodo.org
explore.openaire.eu
application/gzip, csv +1
Updated Dec 5, 2020
Cite
DANIEL BLANKENBERG; DANIEL BLANKENBERG; VIJAY NAGAMPALLI; VIJAY NAGAMPALLI (2020). Pan-cancer Aberrant Pathway Activity Analysis (PAPAA) [Dataset]. http://doi.org/10.5281/zenodo.3629709
Available download formats: application/gzip, tsv, csv
Unique identifier
https://doi.org/10.5281/zenodo.3629709
Dataset updated
Dec 5, 2020
Dataset provided by
Zenodo: http://zenodo.org/
Authors
DANIEL BLANKENBERG; DANIEL BLANKENBERG; VIJAY NAGAMPALLI; VIJAY NAGAMPALLI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Information about the dataset files:

1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. The PanCanAtlas file EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps of https://github.com/greenelab/pancancer/ [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]

2) pancan_mutation_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps of https://github.com/greenelab/pancancer/ [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information for the TCGA Pan-cancer dataset. This file was processed with the script process_copynumber.py by Gregory Way et al., as described in the data processing and initialization steps of https://github.com/greenelab/pancancer/ The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs to our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]

4) mutation_burden_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed with the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps of https://github.com/greenelab/pancancer/ [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: Lists the frozen samples as determined by the TCGA PanCancer Atlas consortium, along with raw RNAseq and mutation data. These samples were determined previously and are included in all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]

6) vogelstein_cancergenes.tsv: Compendium of oncogenes (OG) and tumor suppressor genes (TSG) used for the analysis. [https://github.com/greenelab/pancancer/]

7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available mutational data for CCLE cell lines from the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]

8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]

9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]

10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA-normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. The file gdsc_cell_line_RMA_proc_basalExp.csv was downloaded and subset to the 389 cell lines that are common to CCLE and GDSC. All GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]

11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.

12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]

13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]

14) compounds.csv: list of pharmacological compounds tested for our analysis

15) tcga_dictonary.tsv: list of cancer types used in the analysis.

16) seg_based_scores.tsv: Measurement of total copy number burden, expressed as the percent of the genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al., as described in the data processing and initialization steps of https://github.com/greenelab/pancancer/ [https://github.com/greenelab/pancancer/]
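Items 10 and 11 describe restricting GDSC data to the cell lines shared with CCLE and renaming them to CCLE names. The sketch below illustrates that kind of subsetting on toy in-memory data; it is not the authors' script, and the function name, the name mapping, and all cell line names and values are hypothetical.

```python
def subset_to_common(gdsc_rows, ccle_names, gdsc_to_ccle):
    """Keep only GDSC cell lines that also exist in CCLE, keyed by CCLE name.

    gdsc_rows    -- dict: GDSC cell line name -> list of measurements
    ccle_names   -- set of CCLE cell line names
    gdsc_to_ccle -- dict: GDSC name -> CCLE name (hypothetical lookup table)
    """
    common = {}
    for gdsc_name, values in gdsc_rows.items():
        ccle_name = gdsc_to_ccle.get(gdsc_name)
        if ccle_name is not None and ccle_name in ccle_names:
            common[ccle_name] = values
    return common


# Toy inputs, invented for illustration.
gdsc_rows = {
    "LINE-A": [5.1, 7.3],
    "LINE-B": [6.0, 2.2],
    "LINE-C": [4.4, 9.9],
}
gdsc_to_ccle = {"LINE-A": "LINEA_LUNG", "LINE-B": "LINEB_BREAST"}
ccle_names = {"LINEA_LUNG", "LINEX_SKIN"}

print(subset_to_common(gdsc_rows, ccle_names, gdsc_to_ccle))
```

Only `LINE-A` survives here: `LINE-B` maps to a name that is absent from the CCLE set, and `LINE-C` has no mapping at all.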
X-ray investigation of the remarkable galaxy group Nest200047
zenodo.org
application/gzip
Updated Jul 26, 2025
Anwesh Majumder; Aurora Simionescu; Tomáš Plšek; Marisa Brienza; Eugene Churazov; Ildar Khabibullin; Fabio Gastaldello; ANDREA BOTTEON; Huub Rottgering; Marcus Brüggen; Natalia Lyskova; Kamlesh Rajpurohit; Rashid Sunyaev; Michael Wise (2025). X-ray investigation of the remarkable galaxy group Nest200047 [Dataset]. http://doi.org/10.5281/zenodo.15650741
Available download formats
application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.15650741
Dataset updated
Jul 26, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anwesh Majumder; Aurora Simionescu; Tomáš Plšek; Marisa Brienza; Eugene Churazov; Ildar Khabibullin; Fabio Gastaldello; ANDREA BOTTEON; Huub Rottgering; Marcus Brüggen; Natalia Lyskova; Kamlesh Rajpurohit; Rashid Sunyaev; Michael Wise
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data reproduction package for the paper "X-ray investigation of the remarkable galaxy group Nest200047" by Anwesh Majumder, M.W. Wise, A. Simionescu, M.N. de Vries (accepted in MNRAS).

Raw data: The Chandra and XMM data can be downloaded from https://cda.harvard.edu/chaser/ and http://nxsa.esac.esa.int/nxsa-web/#search using the observation IDs. See the 'Data' section of the paper for the list of observations to download. Additional data sources are given in the footnotes of the paper.

Software required:

CIAO (https://cxc.cfa.harvard.edu/ciao/)

XMM-SAS (https://www.cosmos.esa.int/web/xmm-newton/download-and-install-sas)

Jupyter Notebook and Python-3.9 or higher (https://jupyter.org)

SPEX (https://spex-xray.github.io/spex-help/index.html)

PyProffit (https://pyproffit.readthedocs.io/en/latest/index.html)

CXBups (https://zenodo.org/records/2575495)

README files are provided inside the directories.
Datasets of the DIMet manuscript
zenodo.org
zip
Updated Apr 30, 2024
Johanna Galvis; Joris Guyon; Benjamin Dartigues; Helge Hecht; Florian Specque; Hayssam Soueidan; Slim Karkar; Thomas Daubon; Macha Nikolski (2024). Datasets of the DIMet manuscript [Dataset]. http://doi.org/10.5281/zenodo.10579862
Available download formats
zip
Unique identifier
https://doi.org/10.5281/zenodo.10579862
Dataset updated
Apr 30, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Johanna Galvis; Joris Guyon; Benjamin Dartigues; Helge Hecht; Florian Specque; Hayssam Soueidan; Slim Karkar; Thomas Daubon; Macha Nikolski
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets for reproducing the results of the manuscript "DIMet: An open-source tool for Differential analysis of targeted Isotope-labeled Metabolomics data". The DIMet tool is available online, and its documentation is accessible on the DIMet wiki page and on its Galaxy site.

Users of the Galaxy version of DIMet:

Download and decompress (unzip) the .zip file.

Inside 'datasets_manuscript_DIMet/' there is a sub-folder data/; keep it.

Inside 'datasets_manuscript_DIMet/' there is also a sub-folder config/; it can be deleted, as it is not used in the Galaxy version.

Use the .csv files provided in data/. The specific .csv files to give as input are explained in each 'dimet_' module in Galaxy.

Check the files metadata_endo_ldh.csv and metadata_timeseries.csv: if the strings in them are delimited by quotes ("), edit the file in a plain text editor (e.g. pad, gedit, etc.) and delete the quotes (replace every " with nothing). Quotes in the samples metadata are tolerated by the command-line version but are not allowed in the Galaxy version.
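The quote-removal step can also be scripted instead of done by hand. The following is a minimal sketch, not part of DIMet: the helper name `strip_double_quotes` is hypothetical, and it assumes the metadata files are UTF-8 text and that no field legitimately needs a double quote.

```python
from pathlib import Path


def strip_double_quotes(path):
    """Remove every double quote character from a text file, in place.

    Returns True if the file was changed, False if it had no quotes.
    """
    file_path = Path(path)
    original = file_path.read_text(encoding="utf-8")
    cleaned = original.replace('"', "")
    if cleaned == original:
        return False
    file_path.write_text(cleaned, encoding="utf-8")
    return True


if __name__ == "__main__":
    # File names as given in the instructions above; the data/ location
    # relative to the working directory is an assumption.
    for name in ("metadata_endo_ldh.csv", "metadata_timeseries.csv"):
        candidate = Path("data") / name
        if candidate.exists():
            changed = strip_double_quotes(candidate)
            print(f"{candidate}: {'quotes removed' if changed else 'unchanged'}")
```

Working on a copy of the files is advisable, since the replacement overwrites the original content.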

Users of the command-line version of DIMet:

Download the .zip file, decompress it, and follow the instructions in the documentation on the DIMet wiki page.
Anti-Spoofing Dataset, 95,000 sets
kaggle.com
Updated Jul 20, 2025
+ more versions
Axon Labs (2025). Anti-Spoofing Dataset, 95,000 sets [Dataset]. https://www.kaggle.com/datasets/axondata/face-anti-spoofing-dataset/code
Dataset updated
Jul 20, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Axon Labs
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Anti-Spoofing dataset: live, replay, cut, print, 3D masks - large-scale face anti-spoofing

This dataset delivers a single, end-to-end resource for training and benchmarking facial liveness-detection systems. By aggregating live sessions and eleven realistic presentation-attack classes into one collection, it accelerates development toward iBeta Level 1/2 compliance and strengthens model robustness against the full spectrum of spoofing tactics.


Why Comprehensive Anti-Spoofing Data?

Modern certification pipelines demand proof that a system resists all common attack vectors, not just prints or replays. This dataset delivers those vectors in one place, allowing you to:

- Benchmark a model's true generalisation
- Fine-tune against rare but high-impact threats (e.g., silicone or textile masks)
- Streamline audits by demonstrating coverage of every ISO 30107-3 attack category

Dataset Features

Dataset Size: ≈ 95 000 videos / image sequences spanning live captures and eleven spoof classes

Attack Diversity: 3D paper mask, wrapped 3D mask, photo print, mobile replay, display replay, cut-out 2D mask, silicone mask, latex mask, textile mask

Active Liveness Cues: Natural blinks and head rotations included across live and mask sessions

Attribute Range: different combinations of hairstyles, eyewear, facial hair, and accessories.

Environmental Variability: Indoor/outdoor scenes under various lighting conditions

Multi-angle Capture: Mostly the front (selfie) camera, with some rear-camera footage

Capture Devices: Footage from flagship and mid-range phones (iPhone 14 / 13 Pro, Galaxy S23, Pixel 7, Redmi Note 12 Pro+, Galaxy A54, Honor 70)

Additional Flexibility: Custom re-captures available on request

The full version of the dataset is available for commercial use; leave a request on the Axonlabs website to purchase it.

Technical Specifications

File Format: MP4 for video, JPEG/PNG for still sequences; all compatible with mainstream ML frameworks

Resolution & FPS: Up to 4K @ 60 fps; balanced presets included for rapid training

Best Uses

Ideal for companies pursuing or maintaining iBeta Level 1/2 certification, research groups exploring new PAD architectures, and vendors stress-testing production face-verification pipelines

Attack Classes

Live / Genuine: Natural faces with spontaneous movements across varied devices and lighting

3D Paper Mask: Folded paper masks with protruding nose/forehead

Wrapped 3D Print: Rigid paper moulds reproducing head geometry

Photo Print: Glossy still photos at multiple angles; the classic 2D spoof

Cylinder 3D Paper Mask: A folded or cylindrical sheet of paper that simulates volume

Mobile Replay: Face videos played on phone screens; includes glare and auto-brightness shifts

Display Replay: Attacks via monitors and laptops

Cut-out 2D Mask: Flat printed masks with eye/mouth holes plus active head motion

On-actor Print / Cuts: Paper elements (photos, cutouts) glued directly onto the actor's face

Silicone and Latex Masks: High-detail silicone/latex overlays with blinking and subtle mimicry

Cloth 3D Mask: Elastic fabric masks hugging facial contours during movement

High-Fidelity Resin Mask: Hyperrealistic masks with detailed skin texture

Conclusion

This dataset's scale, breadth of attack types, and real-world capture conditions make it well suited to building or evaluating biometric anti-spoofing solutions. Deploy it to harden your systems against today's and tomorrow's most sophisticated presentation attacks.
Galaxy, star, quasar dataset
Li Xin (2023). Galaxy, star, quasar dataset [Dataset]. http://doi.org/10.57760/sciencedb.07177
288 scholarly articles cite this dataset (View in Google Scholar)
Unique identifier
https://doi.org/10.57760/sciencedb.07177
Dataset updated
Feb 3, 2023
Dataset provided by
Science Data Bank
Authors
Li Xin
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data used in this paper come from the 16th data release of SDSS (SDSS-DR16), which contains a total of 930,268 photometric images, 1.2 billion observed sources and tens of millions of spectra. The data were downloaded from the official SDSS website; specifically, they were obtained through the SkyServer API by running SQL queries in the CasJobs sub-site. Because the current SDSS photometric table PhotoObj can only classify observed sources as point sources or extended sources, spectra are needed to classify the targets more precisely as galaxies, stars or quasars. We therefore obtain calibrated sources in CasJobs by cross-matching SpecPhoto with the PhotoObj catalog, together with the position of each target (right ascension and declination). The calibrated sources allow the three classes to be told apart precisely and quickly: each one is labeled through the parameter "Class" as "galaxy", "star" or "quasar". Sky regions 3462, 3478, 3530 and four others in SDSS-DR16 are selected as experimental data, because these regions contain a large number of sources and so provide rich sample data for the experiment. For example, region 3462 contains 9891 sources (2790 galaxies, 2378 stars and 4723 quasars), and region 3478 contains 3862 sources (1759 galaxies, 577 stars and 1526 quasars). FITS is a data format commonly used in the astronomical community. By cross-matching the catalog with the FITS files of each sky region, we obtained images in the five bands u, g, r, i and z for 12499 galaxy sources, 16914 quasar sources and 16908 star sources as training and testing data.

1.1 Image Synthesis

SDSS photometric data include images in the five bands u, g, r, i and z, each packaged as a single-band FITS file. Images in different bands contain different information.
Because the g, r and i bands contain more feature information and less noise, astronomical researchers typically map the g, r and i bands to the R, G and B channels to synthesize color photometric images. Images in different bands generally cannot be combined directly, because they may not be aligned with one another. This paper therefore uses the RGB multi-band image synthesis software written by He Zhendong et al. to combine the g, r and i images, which avoids the alignment problem. Each photometric image in this paper is 2048×1489 pixels.

1.2 Data tailoring

The target images are first cropped. Image segmentation tools could be used for this step; this paper implements it in Python. During cropping, the right ascension and declination of each source in the catalog are converted to pixel coordinates on the photometric image through the coordinate conversion formula, which fixes the position of the source. That position is taken as the center point, and a rectangular box around it is cut out. We found that the input image size affects the experimental results, so, according to the target size of the sources, we tried three crop sizes: 40×40, 60×60 and 80×80. Experiment and analysis showed that the convolutional neural network learns better and reaches higher accuracy on the smaller images. In the end, we cropped the extended-source galaxies and the point-source quasars and stars to 40×40.

1.3 Division of training and test data

For the algorithm to recognize sources accurately, enough image samples are needed. The choice of training, validation and test sets is an important factor in the final recognition accuracy.
In this paper, the training, validation and test sets are split in the ratio 8:1:1. The validation set is used to tune the algorithm, and the test set is used to evaluate the generalization ability of the final algorithm. Table 1 shows the specific data partitioning information. The total sample size is 34,000 source images, including 11543 galaxy sources, 11967 star sources and 10490 quasar sources.

1.4 Data preprocessing

In this experiment, the training and test sets are used as input to the algorithm after preprocessing. The quantity and quality of the data largely determine the recognition performance of the algorithm. The training set and the test set are preprocessed differently. For the training set, we apply vertical flips, horizontal flips and scaling to the cropped images to enrich the data samples and improve the generalization ability of the algorithm. Because the features of celestial sources are invariant under flipping, the labels of galaxies, stars and quasars do not change after these transformations. For the test set, preprocessing is simpler: we only rescale the input image and feed the result to the algorithm.
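The cropping, 8:1:1 partition and flip augmentation described in sections 1.2 to 1.4 can be sketched roughly as follows. This is an illustrative outline rather than the authors' code: images are plain nested lists, the source position is assumed to be already converted to pixel coordinates, and the function names, the border handling (shifting the window inward) and the random seed are all assumptions.

```python
import random


def crop_centered(image, cx, cy, size=40):
    """Cut a size x size window centred on pixel (cx, cy) from a 2D list.

    Assumes the image is at least `size` pixels in each dimension. The
    window is shifted inward if it would cross the image border.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    x0 = min(max(cx - half, 0), width - size)
    y0 = min(max(cy - half, 0), height - size)
    return [row[x0:x0 + size] for row in image[y0:y0 + size]]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return image[::-1]


def split_8_1_1(samples, seed=0):
    """Shuffle samples and split them into training, validation and test
    sets in the ratio 8:1:1 (integer arithmetic, remainder goes to test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Toy image whose pixel value encodes its own position (row * 100 + column).
image = [[r * 100 + c for c in range(100)] for r in range(80)]
patch = crop_centered(image, cx=50, cy=40, size=40)
augmented = [patch, flip_horizontal(patch), flip_vertical(patch)]

train, val, test = split_8_1_1(range(34000))
print(len(patch), len(patch[0]), len(train), len(val), len(test))
```

Flips are applied only to training patches, matching the description above; validation and test patches would be passed through unchanged apart from rescaling.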
