89 datasets found
  1. Data from: Satellite Image Classification Dataset

    • universe.roboflow.com
    zip
    Updated Mar 10, 2025
    + more versions
    Cite
    modelexamples (2025). Satellite Image Classification Dataset [Dataset]. https://universe.roboflow.com/modelexamples-eo848/satellite-image-classification-khyyl/model/7
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    modelexamples
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects
    Description

    Satellite Image Classification

    ## Overview
    
    Satellite Image Classification is a dataset for classification tasks - it contains Objects annotations for 2,000 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Data from: Dataset of very-high-resolution satellite RGB images to train...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +2more
    Updated Jul 6, 2022
    + more versions
    Cite
    Sergio Puertas (2022). Dataset of very-high-resolution satellite RGB images to train deep learning models to recognize high-mountain juniper shrubs from Sierra Nevada (Spain) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6793421
    Explore at:
    Dataset updated
    Jul 6, 2022
    Dataset provided by
    Rohaifa Khaldi
    Siham Tabik
    Sergio Puertas
    Domingo Alcaraz-Segura
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Sierra Nevada, Spain
    Description

    This dataset provides annotated very-high-resolution satellite RGB images extracted from Google Earth to train deep learning models to recognize Juniperus communis L. and Juniperus sabina L. shrubs. All images are from the high mountain of Sierra Nevada in Spain. The dataset contains 2000 images (.jpg) of size 512x512 pixels partitioned into two classes: Shrubs and NoShrubs. We also provide partitioning of the data into Train (1800 images), Test (100 images), and Validation (100 images) subsets.

  3. Landscape Object Detection On Satellite Images With Ai Dataset

    • universe.roboflow.com
    zip
    Updated Jun 28, 2023
    Cite
    Satellite Images (2023). Landscape Object Detection On Satellite Images With Ai Dataset [Dataset]. https://universe.roboflow.com/satellite-images-i8zj5/landscape-object-detection-on-satellite-images-with-ai
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 28, 2023
    Dataset authored and provided by
    Satellite Images
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Landscape Objects Bounding Boxes
    Description

    Detecting Landscape Objects on Satellite Images with Artificial Intelligence

    In recent years, there has been a significant increase in the use of artificial intelligence (AI) for image recognition and object detection. This technology has proven useful in a wide range of applications, from self-driving cars to facial recognition systems. In this project, the focus lies on using AI to detect landscape objects in satellite images (aerial photography angle), with the goal of creating an annotated map of the Netherlands with the coordinates of the given landscape objects.

    Background Information

    Problem Statement

    One of the things that Naturalis does is conduct research into the distribution of wild bees (Naturalis, n.d.). For this research they use a model that predicts whether or not a certain species can occur at a given location. When representing the real world in digital form, there is currently no way to generate an inventory of landscape features, such as the presence of trees, ponds and hedges, with their precise locations on the digital map. The current models rely on species observation data and climate variables, but it is expected that adding detailed physical landscape information could increase the prediction accuracy. Common maps do not contain this level of detail, but high-resolution satellite images do.

    Possible opportunities

    Based on the problem statement, Naturalis currently has no map with the level of detail needed to detect landscape elements. The idea emerged that it should be possible to use satellite images to find the locations of small landscape elements and produce an annotated map. By refining the accuracy of the current prediction model, researchers can gain a deeper understanding of wild bees in the Netherlands, with the goal of taking effective measures to protect wild bees and their living environment.

    Goal of project

    The goal of the project is to develop an artificial intelligence model for landscape detection on satellite images and to create an annotated map of the Netherlands, thereby increasing the prediction accuracy of the current model used at Naturalis. The project addresses the lack of detailed landscape maps and could change the way Naturalis conducts its research on wild bees. The long-term aim is to use this knowledge to protect both the wild bee population and its natural habitats in the Netherlands.

    Data Collection: Google Earth

    One of the main challenges of this project was obtaining a suitable dataset (with or without annotations). Acquiring high-quality satellite images presents challenges in terms of cost and time: high-quality satellite images covering the Netherlands would cost $1,038,575 in total. On top of that, the acquisition process involves various steps, from the initial request to the actual delivery of the images, and numerous protocols and processes need to be followed.

    After further research, the best available solution was to use Google Earth as the primary source of data. While Google Earth may not be used for commercial or promotional purposes, this project serves Naturalis' research on wild bees only, so that restriction does not apply in this case.

  4. Power Plant Satellite Imagery Dataset

    • figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Kyle Bradbury; Benjamin Brigman; Gouttham Chandrasekar; Leslie Collins; Shamikh Hossain; Marc Jeuland; Timothy Johnson; Boning Li; Trishul Nagenalli (2023). Power Plant Satellite Imagery Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.5307364.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Kyle Bradbury; Benjamin Brigman; Gouttham Chandrasekar; Leslie Collins; Shamikh Hossain; Marc Jeuland; Timothy Johnson; Boning Li; Trishul Nagenalli
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains satellite imagery of 4,454 power plants within the United States. The imagery is provided at two resolutions: 1m (4-band NAIP imagery with near-infrared) and 30m (Landsat 8, pansharpened to 15m). The NAIP imagery is available for the U.S., and Landsat 8 is available globally. This dataset may be of value for computer vision work and machine learning, as well as energy and environmental analyses.

    Additionally, annotations of the spatial extent of the power plants in each image are provided. These annotations were collected via the crowdsourcing platform Amazon Mechanical Turk, using multiple annotators for each image to ensure quality. Links to the sources of the imagery data, the annotation tool, and the team that created the dataset are included in the "References" section.

    To read more on these data, please refer to the "Power Plant Satellite Imagery Dataset Overview.pdf" file. To download a sample of the data without downloading the entire dataset, download "sample.zip", which includes two sample power plants and the NAIP, Landsat 8, and binary annotations for each.

    Note: the NAIP imagery may appear "washed out" when viewed in standard image viewing software because it includes a near-infrared band in addition to the standard RGB data.
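
    The washed-out appearance noted above comes from the fourth (near-infrared) band. A minimal sketch, assuming the NAIP tiles are GeoTIFFs with bands ordered R, G, B, NIR and using a hypothetical file name, writes an RGB-only copy with rasterio for natural-color viewing:

    # Hypothetical sketch: drop the near-infrared band from a 4-band NAIP tile.
    # The file name and band order (R, G, B, NIR) are assumptions.
    import rasterio

    with rasterio.open("sample/naip_tile.tif") as src:
        bands = src.read()            # shape: (4, height, width)
        profile = src.profile

    profile.update(count=3)
    with rasterio.open("naip_tile_rgb.tif", "w", **profile) as dst:
        dst.write(bands[:3])          # keep only the assumed RGB bands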

  5. Coast Train--Labeled imagery for training and evaluation of data-driven...

    • data.usgs.gov
    • catalog.data.gov
    Updated Aug 31, 2024
    + more versions
    Cite
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand (2024). Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation [Dataset]. http://doi.org/10.5066/P91NP87I
    Explore at:
    Dataset updated
    Aug 31, 2024
    Dataset provided by
    United States Geological Survey, http://www.usgs.gov/
    Authors
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand
    License

    U.S. Government Works, https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 2008 - Dec 31, 2020
    Description

    Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as from non-geospatial oblique and nadir imagery. Images cover a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the naming convention {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes us ...
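
    Since every image, annotation, and label image ships as an NPZ archive, a quick way to see what a file holds is to open it with numpy. A minimal sketch, assuming the archive has already been extracted to a .npz file; the file name and the array keys inside are assumptions:

    # Inspect one Coast Train NPZ archive (hypothetical file name).
    import numpy as np

    archive = np.load("NAIP_4class_001.npz", allow_pickle=True)
    print(archive.files)                      # names of the stored arrays
    for key in archive.files:
        arr = archive[key]
        print(key, getattr(arr, "shape", None), getattr(arr, "dtype", None))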

  6. Data from: VME: A Satellite Imagery Dataset and Benchmark for Detecting...

    • zenodo.org
    zip
    Updated Apr 10, 2025
    Cite
    Noora Al-Emadi; Noora Al-Emadi; Ingmar Weber; Ingmar Weber; Yin Yang; Yin Yang; Ferda Ofli; Ferda Ofli (2025). VME: A Satellite Imagery Dataset and Benchmark for Detecting Vehicles in the Middle East and Beyond [Dataset]. http://doi.org/10.5281/zenodo.14185684
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Noora Al-Emadi; Noora Al-Emadi; Ingmar Weber; Ingmar Weber; Yin Yang; Yin Yang; Ferda Ofli; Ferda Ofli
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Area covered
    Middle East
    Description

    This repository contains the VME dataset (images and annotation files) as well as the script for constructing the CDSI dataset.

    VME is a satellite imagery dataset built for vehicle detection in the Middle East. The VME images (satellite_images folder) are under the CC BY-NC-ND 4.0 license (https://creativecommons.org/licenses/by-nc-nd/4.0/), whereas the remaining folders (annotations_HBB, annotations_OBB, CDSI_construction_scripts) are under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).

    VME_CDSI_datasets.zip has four components:

    1. annotations_OBB: holds TXT files in YOLO format with Oriented Bounding Box (OBB) annotations. Each annotation file is named after the corresponding image.
    2. annotations_HBB: contains Horizontal Bounding Box (HBB) annotation files as JSON in MS-COCO format, where each box is defined by four pixel values (x_min, y_min, width, height), for the training, validation, and test splits (see the sketch after this list).
    3. satellite_images: contains the VME images of size 512x512 in PNG format.
    4. CDSI_construction_scripts: comprises all instructions needed to build the CDSI dataset in detail: a) instructions for downloading each dataset from its repository, b) the MS-COCO conversion script for each dataset, located under that dataset's folder, and c) the combination instructions. The training, validation, and test splits are available under the "CDSI_construction_scripts/data_utils" folder.
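
    A minimal sketch of reading the HBB annotations described in item 2, relying only on standard MS-COCO keys; the split file name is an assumption:

    # Count images and horizontal boxes in one MS-COCO style split (hypothetical name).
    import json
    from collections import defaultdict

    with open("annotations_HBB/train.json") as f:
        coco = json.load(f)

    boxes_per_image = defaultdict(list)
    for ann in coco["annotations"]:
        x_min, y_min, width, height = ann["bbox"]   # pixel values, as described above
        boxes_per_image[ann["image_id"]].append((x_min, y_min, width, height))

    print(len(coco["images"]), "images,", len(coco["annotations"]), "vehicle boxes")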

    annotations_HBB, annotations_OBB, and CDSI_construction_scripts are also available in our GitHub repository.

    Please cite our dataset and paper using the preferred format shown in the "Citation" section:

    @article{al-emadi_vme_2025,
      title = {{VME: A Satellite Imagery Dataset and Benchmark for Detecting Vehicles in the Middle East and Beyond}},
      volume = {12},
      issn = {2052-4463},
      url = {https://doi.org/10.1038/s41597-025-04567-y},
      doi = {10.1038/s41597-025-04567-y},
      pages = {500},
      number = {1},
      journal = {Scientific Data},
      author = {Al-Emadi, Noora and Weber, Ingmar and Yang, Yin and Ofli, Ferda},
      date = {2025-03-25},
      publisher={Springer Nature},
      year={2025}
    }
  7. Data from: Satellite Image Classification

    • kaggle.com
    Updated Aug 21, 2021
    Cite
    Mahmoud Reda (2021). Satellite Image Classification [Dataset]. https://www.kaggle.com/mahmoudreda55/satellite-image-classification/metadata
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 21, 2021
    Dataset provided by
    Kaggle
    Authors
    Mahmoud Reda
    Description

    Context

    Satellite Image Classification Dataset (RSI-CB256). This dataset has 4 different classes, mixed from sensor data and Google Map snapshots.

    Content

    The past years have witnessed great progress on remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for the automatic interpretation of these images. In this context, the benchmark datasets serve as essential prerequisites for developing and testing intelligent interpretation algorithms. After reviewing existing benchmark datasets in the research community of RS image interpretation, this article discusses the problem of how to efficiently prepare a suitable benchmark dataset for RS image interpretation. Specifically, we first analyze the current challenges of developing intelligent algorithms for RS image interpretation with bibliometric investigations. We then present the general guidance on creating benchmark datasets in efficient manners. Following the presented guidance, we also provide an example on building RS image dataset, i.e., Million-AID, a new large-scale benchmark dataset containing a million instances for RS image scene classification. Several challenges and perspectives in RS image annotation are finally discussed to facilitate the research in benchmark dataset construction. We do hope this paper will provide the RS community an overall perspective on constructing large-scale and practical image datasets for further research, especially data-driven ones.

    Acknowledgements

    Annotated Datasets for RS Image Interpretation

    The interpretation of RS images has been playing an increasingly important role in a large diversity of applications and has thus attracted remarkable research attention. Consequently, various datasets have been built to advance the development of interpretation algorithms for RS images. Covering literature published over the past decade, we perform a systematic review of the existing RS image datasets concerning the current mainstream of RS image interpretation tasks, including scene classification, object detection, semantic segmentation and change detection.

    Inspiration

    Artificial Intelligence, Computer Vision, Image Processing, Deep Learning, Satellite Image, Remote Sensing

  8. Oil and Gas Tank Dataset

    • data.mendeley.com
    Updated Feb 17, 2020
    + more versions
    Cite
    Jakaria Rabbi (2020). Oil and Gas Tank Dataset [Dataset]. http://doi.org/10.17632/bkxj8z84m9.3
    Explore at:
    Dataset updated
    Feb 17, 2020
    Authors
    Jakaria Rabbi
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Tank Detection and Count Dataset contains 760 satellite image tiles of size 512x512 pixels, where one pixel covers 30cm x 30cm at ground level. Each tile is associated with an .xml and a .txt file. Both files contain the same oil/gas tank annotations in different formats: the .xml file uses the Pascal VOC format, while in the .txt file every line contains the class of the tank and the four coordinates of the bounding box (xmin, ymin, xmax, ymax).
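
    A minimal parser for one tile's .txt annotation file, following the line format described above (class followed by xmin, ymin, xmax, ymax); the whitespace-separated layout and the tile name are assumptions:

    # Parse one tile's .txt annotations into a list of box dictionaries.
    def load_boxes(path):
        boxes = []
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) < 5:
                    continue                      # skip empty or malformed lines
                cls = parts[0]
                xmin, ymin, xmax, ymax = map(float, parts[1:5])
                boxes.append({"class": cls, "xmin": xmin, "ymin": ymin,
                              "xmax": xmax, "ymax": ymax})
        return boxes

    boxes = load_boxes("tile_0001.txt")           # hypothetical tile name
    print(len(boxes), "tanks annotated")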

  9. Dataset of Deep Learning from Landsat-8 Satellite Images for Estimating...

    • data.mendeley.com
    Updated Jun 6, 2022
    + more versions
    Cite
    Yudhi Prabowo (2022). Dataset of Deep Learning from Landsat-8 Satellite Images for Estimating Burned Areas in Indonesia [Dataset]. http://doi.org/10.17632/fs7mtkg2wk.5
    Explore at:
    Dataset updated
    Jun 6, 2022
    Authors
    Yudhi Prabowo
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Indonesia
    Description

    The dataset consists of three categories: image subsets, burned area masks, and quicklooks. The image subsets are derived from Landsat-8 scenes taken during the years 2019 and 2021. Each image has a size of 512x512 pixels and consists of 8 multispectral bands. The sequence of band names from band 1 to band 7 of an image subset is the same as in the original Landsat-8 scene, except for band 8 of the image subset, which corresponds to band 9 (the cirrus band) of the original Landsat-8 scene. The image subsets are saved in GeoTIFF format using the latitude/longitude coordinate system with WGS 1984 as the datum. The spatial resolution of the image subsets is 0.00025 degree, and the pixel values are stored as 16-bit unsigned integers with values ranging from 0 to 65535. The dataset totals 227 images, containing burned areas surrounded by ecologically diverse backgrounds such as forest, shrub, grassland, waterbody, bare land, settlement, cloud, and cloud shadow. In some cases, the burned areas are covered by smoke because the fire was still active. Some image subsets also overlap each other to cover burn scars that are too large for a single image.

    The burned area mask is a binary annotation image with two classes: burned area as the foreground and non-burned area as the background. These binary images are saved as 8-bit unsigned integers, where the burned area is indicated by a pixel value of 1 and the non-burned area by 0. The burned area masks in this dataset contain only burn scars and are not contaminated with thick clouds, shadows, or vegetation. Among the 227 images, 206 contain burned areas whereas 21 contain only background. The dataset is dominated by images with a burned-area coverage between 0 and 10 percent.

    The dataset also provides a quicklook image as a quick preview of each image subset. It offers a fast, full-size preview of an image subset without opening the file in GIS software. The quicklook images can also be used for training and evaluating models as a substitute for the image subsets. The image size is 512x512 pixels, the same as the image subsets and annotation images. Each quicklook consists of three bands as a false-color composite, combining band 7 (SWIR-2), band 5 (NIR), and band 4 (red). Contrast stretching has been applied to these RGB composites to enhance visualization. The quicklook images are stored in GeoTIFF format as 8-bit unsigned integers.
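
    Because the masks are single-band 8-bit GeoTIFFs with 1 for burned and 0 for background, the burned-area coverage of a tile can be computed directly. A minimal sketch with rasterio; the file name is an assumption:

    # Read one burned-area mask and report its coverage.
    import rasterio

    with rasterio.open("masks/subset_001_mask.tif") as src:
        mask = src.read(1)                        # single annotation band

    burned_fraction = (mask == 1).mean()
    print(f"Burned area covers {100 * burned_fraction:.1f}% of the tile")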

    This work was financed by Riset Inovatif Produktif (RISPRO) fund through Prioritas Riset Nasional (PRN) project, grant no. 255/E1/PRN/2020 for 2020 - 2021 contract period.

  10. Merged Satellite Flood Images Dataset

    • universe.roboflow.com
    zip
    Updated Jun 21, 2023
    Cite
    Fire (2023). Merged Satellite Flood Images Dataset [Dataset]. https://universe.roboflow.com/fire-fs3r3/merged-satellite-flood-images
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 21, 2023
    Dataset authored and provided by
    Fire
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Flood Bounding Boxes
    Description

    Merged Satellite Flood Images

    ## Overview
    
    Merged Satellite Flood Images is a dataset for object detection tasks - it contains Flood annotations for 440 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  11. Data Labeling Market Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Mar 8, 2025
    Cite
    Data Insights Market (2025). Data Labeling Market Report [Dataset]. https://www.datainsightsmarket.com/reports/data-labeling-market-20383
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Mar 8, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The data labeling market is experiencing robust growth, projected to reach $3.84 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 28.13% from 2025 to 2033. This expansion is fueled by the increasing demand for high-quality training data across various sectors, including healthcare, automotive, and finance, which heavily rely on machine learning and artificial intelligence (AI). The surge in AI adoption, particularly in areas like autonomous vehicles, medical image analysis, and fraud detection, necessitates vast quantities of accurately labeled data. The market is segmented by sourcing type (in-house vs. outsourced), data type (text, image, audio), labeling method (manual, automatic, semi-supervised), and end-user industry. Outsourcing is expected to dominate the sourcing segment due to cost-effectiveness and access to specialized expertise. Similarly, image data labeling is likely to hold a significant share, given the visual nature of many AI applications. The shift towards automation and semi-supervised techniques aims to improve efficiency and reduce labeling costs, though manual labeling will remain crucial for tasks requiring high accuracy and nuanced understanding. Geographical distribution shows strong potential across North America and Europe, with Asia-Pacific emerging as a key growth region driven by increasing technological advancements and digital transformation. Competition in the data labeling market is intense, with a mix of established players like Amazon Mechanical Turk and Appen, alongside emerging specialized companies. The market's future trajectory will likely be shaped by advancements in automation technologies, the development of more efficient labeling techniques, and the increasing need for specialized data labeling services catering to niche applications. Companies are focusing on improving the accuracy and speed of data labeling through innovations in AI-powered tools and techniques. Furthermore, the rise of synthetic data generation offers a promising avenue for supplementing real-world data, potentially addressing data scarcity challenges and reducing labeling costs in certain applications. This will, however, require careful attention to ensure that the synthetic data generated is representative of real-world data to maintain model accuracy. This comprehensive report provides an in-depth analysis of the global data labeling market, offering invaluable insights for businesses, investors, and researchers. The study period covers 2019-2033, with 2025 as the base and estimated year, and a forecast period of 2025-2033. We delve into market size, segmentation, growth drivers, challenges, and emerging trends, examining the impact of technological advancements and regulatory changes on this rapidly evolving sector. The market is projected to reach multi-billion dollar valuations by 2033, fueled by the increasing demand for high-quality data to train sophisticated machine learning models. Recent developments include: September 2024: The National Geospatial-Intelligence Agency (NGA) is poised to invest heavily in artificial intelligence, earmarking up to USD 700 million for data labeling services over the next five years. This initiative aims to enhance NGA's machine-learning capabilities, particularly in analyzing satellite imagery and other geospatial data. 
    The agency has opted for a multi-vendor indefinite-delivery/indefinite-quantity (IDIQ) contract, emphasizing the importance of annotating raw data, be it images or videos, to render it understandable for machine learning models. For instance, when dealing with satellite imagery, the focus could be on labeling distinct entities such as buildings, roads, or patches of vegetation. October 2023: Refuel.ai unveiled a new platform, Refuel Cloud, and a specialized large language model (LLM) for data labeling. Refuel Cloud harnesses advanced LLMs, including its proprietary model, to automate data cleaning, labeling, and enrichment at scale, catering to diverse industry use cases. Recognizing that clean data underpins modern AI and data-centric software, Refuel Cloud addresses the historical challenge of human labor bottlenecks in data production. With Refuel Cloud, enterprises can swiftly generate the expansive, precise datasets they require in mere minutes, a task that traditionally spanned weeks. Key drivers for this market are: Rising Penetration of Connected Cars and Advances in Autonomous Driving Technology, Advances in Big Data Analytics based on AI and ML. Potential restraints include: Rising Penetration of Connected Cars and Advances in Autonomous Driving Technology, Advances in Big Data Analytics based on AI and ML. Notable trends are: Healthcare is Expected to Witness Remarkable Growth.

  12. DOTA Dataset

    • paperswithcode.com
    Updated Feb 2, 2021
    Cite
    Gui-Song Xia; Xiang Bai; Jian Ding; Zhen Zhu; Serge Belongie; Jiebo Luo; Mihai Datcu; Marcello Pelillo; Liangpei Zhang (2021). DOTA Dataset [Dataset]. https://paperswithcode.com/dataset/dota
    Explore at:
    Dataset updated
    Feb 2, 2021
    Authors
    Gui-Song Xia; Xiang Bai; Jian Ding; Zhen Zhu; Serge Belongie; Jiebo Luo; Mihai Datcu; Marcello Pelillo; Liangpei Zhang
    Description

    DOTA is a large-scale dataset for object detection in aerial images. It can be used to develop and evaluate object detectors in aerial images. The images are collected from different sensors and platforms. Each image ranges in size from 800 × 800 to 20,000 × 20,000 pixels and contains objects exhibiting a wide variety of scales, orientations, and shapes. The instances in DOTA images are annotated by experts in aerial image interpretation using arbitrary (8 d.o.f.) quadrilaterals. DOTA will continue to be updated, growing in size and scope to reflect evolving real-world conditions. It currently has three versions:

    DOTA-v1.0 contains 15 common categories, 2,806 images and 188,282 instances. The proportions of the training set, validation set, and testing set in DOTA-v1.0 are 1/2, 1/6, and 1/3, respectively.

    DOTA-v1.5 uses the same images as DOTA-v1.0, but the extremely small instances (less than 10 pixels) are also annotated. Moreover, a new category, "container crane", is added. It contains 403,318 instances in total. The number of images and dataset splits are the same as in DOTA-v1.0. This version was released for the DOAI Challenge 2019 on Object Detection in Aerial Images in conjunction with IEEE CVPR 2019.

    DOTA-v2.0 collects more Google Earth, GF-2 Satellite, and aerial images. There are 18 common categories, 11,268 images and 1,793,658 instances in DOTA-v2.0. Compared to DOTA-v1.5, it further adds the new categories "airport" and "helipad". The 11,268 images of DOTA are split into training, validation, test-dev, and test-challenge sets. To avoid the problem of overfitting, the proportion of the training and validation sets is smaller than that of the test sets. Furthermore, there are two test sets, namely test-dev and test-challenge. Training contains 1,830 images and 268,627 instances. Validation contains 593 images and 81,048 instances. We released the images and ground truths for the training and validation sets. Test-dev contains 2,792 images and 353,346 instances. We released the images but not the ground truths. Test-challenge contains 6,053 images and 1,090,637 instances.
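
    Since each instance is annotated as an arbitrary quadrilateral (four corner points, eight degrees of freedom), a small geometry helper is often handy regardless of the on-disk format. The sketch below, with made-up sample corners, derives the enclosing axis-aligned box and the polygon area via the shoelace formula:

    # Helpers for quadrilateral (OBB-style) annotations given as four (x, y) corners.
    def quad_to_aabb(corners):
        xs = [x for x, _ in corners]
        ys = [y for _, y in corners]
        return min(xs), min(ys), max(xs), max(ys)

    def quad_area(corners):
        area = 0.0
        for (x1, y1), (x2, y2) in zip(corners, corners[1:] + corners[:1]):
            area += x1 * y2 - x2 * y1             # shoelace formula
        return abs(area) / 2.0

    quad = [(120.0, 40.0), (180.0, 55.0), (170.0, 95.0), (110.0, 80.0)]   # example corners
    print(quad_to_aabb(quad), quad_area(quad))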

  13. Burned Area Delineation from Satellite Imagery Dataset

    • paperswithcode.com
    Updated Nov 21, 2021
    Cite
    (2021). Burned Area Delineation from Satellite Imagery Dataset [Dataset]. https://paperswithcode.com/dataset/burned-area-delineation-from-satellite
    Explore at:
    Dataset updated
    Nov 21, 2021
    Description

    The dataset contains 73 satellite images of different forests damaged by wildfires across Europe with a resolution of up to 10m per pixel. Data were collected from the Sentinel-2 L2A satellite mission and the target labels were generated from the Copernicus Emergency Management Service (EMS) annotations, with five different severity levels, ranging from undamaged to completely destroyed.

  14. Data from: Satellite Image Dataset

    • universe.roboflow.com
    zip
    Updated Oct 28, 2022
    Cite
    project (2022). Satellite Image Dataset [Dataset]. https://universe.roboflow.com/project-5jlv8/satellite-image-gwhnn/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 28, 2022
    Dataset authored and provided by
    project
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Land Detection Bounding Boxes
    Description

    Satellite Image

    ## Overview
    
    Satellite Image is a dataset for object detection tasks - it contains Land Detection annotations for 377 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  15. OHID-1 Hyperspectral Image Dataset

    • kaggle.com
    Updated Mar 14, 2025
    Cite
    Rusab Sarmun (2025). OHID-1 Hyperspectral Image Dataset [Dataset]. https://www.kaggle.com/datasets/rusabsarmun/ohid-1-hyperspectral-image-dataset/versions/1
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Kaggle, http://kaggle.com/
    Authors
    Rusab Sarmun
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Mani et al. built a new set of hyperspectral data with complex characteristics using data from Orbita, and named it the Orbita Hyperspectral Images Dataset-1 (OHID-1). It covers different types of areas in Zhuhai City, China.

    This dataset provides access to the raw data and annotations of the OHID-1 dataset, which includes two different data formats: .mat and .tif. All data have a size of 5056x5056 pixels. The raw data consists of 32 bands, while the annotation data consists of 1 band.

    This is the original raw data for this dataset: https://figshare.com/articles/online_resource/OHID-1/27966024?file=51215300
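
    A minimal sketch for inspecting the .mat variant; the file name and the variable keys inside are assumptions (loadmat simply lists whatever is present):

    # List the arrays stored in the OHID-1 .mat file (hypothetical file name).
    from scipy.io import loadmat

    mat = loadmat("OHID-1.mat")
    for key, value in mat.items():
        if not key.startswith("__"):              # skip MATLAB header entries
            print(key, getattr(value, "shape", None))   # raw data should be 5056 x 5056 x 32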

  16. Satellite Images of Vehicles

    • kaggle.com
    Updated Aug 4, 2024
    Cite
    Shubham Ghosal (2024). Satellite Images of Vehicles [Dataset]. https://www.kaggle.com/datasets/shubhamghosal/satellite-images-of-vehicles/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 4, 2024
    Dataset provided by
    Kaggle, http://kaggle.com/
    Authors
    Shubham Ghosal
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Shubham Ghosal

    Released under MIT

    Contents

  17. Bonn Roof Material + Satellite Imagery Dataset

    • figshare.com
    zip
    Updated Apr 18, 2025
    Cite
    Julian Huang; Yue Lin; Alex Nhancololo (2025). Bonn Roof Material + Satellite Imagery Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28713194.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 18, 2025
    Dataset provided by
    figshare
    Authors
    Julian Huang; Yue Lin; Alex Nhancololo
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bonn
    Description

    This dataset consists of annotated high-resolution aerial imagery of roof materials in Bonn, Germany, in the Ultralytics YOLO instance segmentation dataset format. Aerial imagery was sourced from OpenAerialMap, specifically from the Maxar Open Data Program. Roof material labels and building outlines were sourced from OpenStreetMap. Images and labels are split into training, validation, and test sets, meant for training future machine learning models for both building segmentation and roof type classification.

    The dataset is intended for applications such as informing studies on thermal efficiency, roof durability, heritage conservation, or socioeconomic analyses. There are six roof material types: roof tiles, tar paper, metal, concrete, gravel, and glass.

    Note: The data is in a .zip due to file upload limits. Please find a more detailed dataset description in the README.md.

  18. DOTA Dataset

    • datasetninja.com
    Updated Feb 25, 2021
    Cite
    Jian Ding; Nan Xue; Gui-Song Xia (2021). DOTA Dataset [Dataset]. https://datasetninja.com/dota
    Explore at:
    Dataset updated
    Feb 25, 2021
    Dataset provided by
    Dataset Ninja
    Authors
    Jian Ding; Nan Xue; Gui-Song Xia
    License

    https://captain-whu.github.io/DOTA/dataset.html

    Description

    In the past decade, significant progress in object detection has been made in natural images, but authors of the DOTA v2.0: Dataset of Object deTection in Aerial images note that this progress hasn't extended to aerial images. The main reason for this discrepancy is the substantial variations in object scale and orientation caused by the bird's-eye view of aerial images. One major obstacle to the development of object detection in aerial images (ODAI) is the lack of large-scale benchmark datasets. The DOTA dataset contains 1,793,658 object instances spanning 18 different categories, all annotated with oriented bounding box annotations (OBB). These annotations were collected from a total of 11,268 aerial images. Using this extensive and meticulously annotated dataset, the authors establish baselines covering ten state-of-the-art algorithms, each with over 70 different configurations. These configurations are evaluated for both speed and accuracy performance.

  19. Data from: CloudTracks: A Dataset for Localizing Ship Tracks in Satellite...

    • zenodo.org
    zip
    Updated Nov 1, 2023
    Cite
    Muhammad Ahmed Chaudhry; Muhammad Ahmed Chaudhry; Lyna Kim; Jeremy Irvin; Jeremy Irvin; Yuzu Ido; Sonia Chu; Jared Thomas Isobe; Andrew Y. Ng; Duncan Watson-Parris; Lyna Kim; Yuzu Ido; Sonia Chu; Jared Thomas Isobe; Andrew Y. Ng; Duncan Watson-Parris (2023). CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds [Dataset]. http://doi.org/10.5281/zenodo.8412855
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 1, 2023
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Muhammad Ahmed Chaudhry; Muhammad Ahmed Chaudhry; Lyna Kim; Jeremy Irvin; Jeremy Irvin; Yuzu Ido; Sonia Chu; Jared Thomas Isobe; Andrew Y. Ng; Duncan Watson-Parris; Lyna Kim; Yuzu Ido; Sonia Chu; Jared Thomas Isobe; Andrew Y. Ng; Duncan Watson-Parris
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    [Please use version 1.0.1]

    The CloudTracks dataset consists of 1,780 MODIS satellite images hand-labeled for the presence of more than 12,000 ship tracks. More information about how the dataset was constructed may be found at github.com/stanfordmlgroup/CloudTracks. The file structure of the dataset is as follows:

    CloudTracks/
      full/
        images/
          mod2002121.1920D.png   (sample image name)
        jsons/
          mod2002121.1920D.json  (sample json name)

    The naming convention is as follows:
    mod2002121.1920D: the first 3 letters specify which of the sensors on the two MODIS satellites captured the image, mod for Terra and myd for Aqua. This is followed by a 4 digit year (2002) and a 3 digit day of the year (121). The following 4 digits specify the time of day (1920; 24 hour format in the UTC timezone), followed by D or N for Day or Night.

    The 1,780 MODIS Terra and Aqua images were collected between 2002 and 2021 inclusive over various stratocumulus cloud regions (such as the East Pacific and East Atlantic) where ship tracks have commonly been observed. Each image has dimension 1354 x 2030 and a spatial resolution of 1km. Of the 36 bands collected by the instruments, we selected channels 1, 20, and 32 to capture useful physical properties of cloud formations.

    The labels are found in the corresponding JSON files for each image. The following keys in the json are particularly important:

    • imagePath: the filename of the image.
    • shapes: the list of annotations corresponding to the image, where each element of the list is a dictionary corresponding to a single instance annotation. The dictionary carries a label with value "shiptrack" or "uncertain" and a corresponding linestrip detailing the ship track path (see the sketch below).
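
    A minimal sketch that decodes the naming convention above and counts ship-track annotations in the matching JSON; the per-shape key holding the "shiptrack"/"uncertain" label is an assumption (a labelme-style "label" key is used here):

    # Decode a sample name and count ship tracks in its JSON annotation file.
    import json

    name = "mod2002121.1920D"                     # sample name from above
    sensor = {"mod": "Terra", "myd": "Aqua"}[name[:3]]
    year, day_of_year = name[3:7], name[7:10]
    time_utc, day_night = name[11:15], name[15]

    with open(f"CloudTracks/full/jsons/{name}.json") as f:
        annotation = json.load(f)

    tracks = [s for s in annotation["shapes"] if s.get("label") == "shiptrack"]
    print(sensor, year, day_of_year, time_utc, day_night, "-", len(tracks), "ship tracks")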

    Further pre-processing details may be found at the GitHub link above. If you have any questions about the dataset, contact us at:
    mahmedch@stanford.edu, lynakim@stanford.edu, jirvin16@cs.stanford.edu

  20. Data from: SeasoNet: A Seasonal Scene Classification, Segmentation and...

    • zenodo.org
    • data.niaid.nih.gov
    csv, zip
    Updated Aug 10, 2022
    Cite
    Dominik Koßmann; Dominik Koßmann; Viktor Brack; Viktor Brack; Thorsten Wilhelm; Thorsten Wilhelm (2022). SeasoNet: A Seasonal Scene Classification, Segmentation and Retrieval Dataset for Satellite Imagery over Germany [Dataset]. http://doi.org/10.5281/zenodo.6979994
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 10, 2022
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Dominik Koßmann; Dominik Koßmann; Viktor Brack; Viktor Brack; Thorsten Wilhelm; Thorsten Wilhelm
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    This dataset consists of 1,759,830 multi-spectral image patches from the Sentinel-2 mission, annotated with image- and pixel-level land cover and land usage labels from the German land cover model LBM-DE2018, with land cover classes based on the CORINE Land Cover database (CLC) 2018. It includes pixel-synchronous examples from each of the four seasons, plus an additional snowy set, spanning the time from April 2018 to February 2019. The patches were taken from 519,547 unique locations, covering the whole surface area of Germany, with each patch covering an area of 1.2km x 1.2km. The set is split into two overlapping grids of roughly 880,000 samples each, which are shifted by half the patch size in both dimensions. The images within each individual grid do not overlap.

    Contents

    Each sample includes:

    • 3 10m resolution bands (RGB), 120px x 120px
    • 1 10m resolution band (infrared), 120px x 120px
    • 6 20m resolution bands, 60px x 60px
    • 2 60m resolution bands, 20px x 20px
    • 1 pixel-level label map
    • 2 binary masks for cloud and snow coverage
    • 2 binary masks for easy and medium segmentation difficulties, marks areas <300px and <100px respectively
    • 1 JSON-file containing additional meta-information

    The meta.csv contains the following information about each sample (a minimal loading sketch follows this list):

    • Which season it belongs to
    • Which of the two grids it belongs to
    • Coordinates of the patch center
    • Whether it was acquired from Sentinel-2 Satellite A or B
    • Date and time of image acquisition
    • Snow and cloud coverage percentages
    • Image-level multi-class labels
    • Three additional image-level urbanization labels, based on the center pixel (details below)
    • The path to the sample
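
    A minimal sketch for browsing the metadata with pandas; the concrete column names used in the filter are assumptions, so printing the columns first shows what is actually available:

    # Load the SeasoNet metadata and apply a hypothetical season/grid filter.
    import pandas as pd

    meta = pd.read_csv("meta.csv")
    print(meta.columns.tolist())                  # season, grid, coordinates, sensor, ...

    subset = meta[(meta["Season"] == "Spring") & (meta["Grid"] == 1)]   # hypothetical columns
    print(len(subset), "samples selected")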

    Classes

    ID  Class
     1  Continuous urban fabric
     2  Discontinuous urban fabric
     3  Industrial or commercial units
     4  Road and rail networks and associated land
     5  Port areas
     6  Airports
     7  Mineral extraction sites
     8  Dump sites
     9  Construction sites
    10  Green urban areas
    11  Sport and leisure facilities
    12  Non-irrigated arable land
    13  Vineyards
    14  Fruit trees and berry plantations
    15  Pastures
    16  Broad-leaved forest
    17  Coniferous forest
    18  Mixed forest
    19  Natural grasslands
    20  Moors and heathland
    21  Transitional woodland/shrub
    22  Beaches, dunes, sands
    23  Bare rock
    24  Sparsely vegetated areas
    25  Inland marshes
    26  Peat bogs
    27  Salt marshes
    28  Intertidal flats
    29  Water courses
    30  Water bodies
    31  Coastal lagoons
    32  Estuaries
    33  Sea and ocean

    Urbanization classes

    • SLRAUM
      • 0: None
      • 1: Ländlicher Raum (~ rural area)
      • 2: Städtischer Raum (~ urban area)
    • RTYP3
      • 0: None
      • 1: Ländliche Regionen (~ rural areas)
      • 2: Regionen mit Verstädterungsansätzen (~ urbanizing areas)
      • 3: Städtische Regionen (~ urban areas)
    • KTYP4
      • 0: None
      • 1: Dünn besiedelte ländliche Kreise (~ sparsely populated rural districts)
      • 2: Kreisfreie Großstädte (~ large independent cities)
      • 3: Ländliche Kreise mit Verdichtungsansätzen (~ rural districts with densification tendencies)
      • 4: Städtische Kreise (~ urban districts)

    Further information on the urbanization classes can be found here:

    SLRAUM

    https://www.bbsr.bund.de/BBSR/DE/forschung/raumbeobachtung/Raumabgrenzungen/deutschland/kreise/staedtischer-laendlicher-raum/kreistypen.html

    RTYP3

    https://www.bbsr.bund.de/BBSR/DE/forschung/raumbeobachtung/Raumabgrenzungen/deutschland/regionen/siedlungsstrukturelle-regionstypen/regionstypen.html

    KTYP4

    https://www.bbsr.bund.de/BBSR/DE/forschung/raumbeobachtung/Raumabgrenzungen/deutschland/kreise/siedlungsstrukturelle-kreistypen/kreistypen.html

    License of landcover model

    Bundesamt für Kartographie und Geodäsie

    dl-de/by-2-0 from https://www.govdata.de/dl-de/by-2-0

    © GeoBasis-DE / BKG 2022

    Source of landcover model

    https://gdz.bkg.bund.de/index.php/default/catalog/product/view/id/1071/s/corine-land-cover-5-ha-stand-2018-clc5-2018/
