Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Today, deep neural networks are widely used in many computer vision problems, including those involving geographic information systems (GIS) data. This type of data is commonly used for urban analyses and spatial planning. We used orthophotographic images of two residential districts in Kielce, Poland, for research on automatic urban sprawl analysis with a Transformer-based neural network. Orthophotomaps were obtained from the Kielce GIS portal. The maps were then manually masked into building and building-surroundings classes. Finally, each orthophotomap and its corresponding classification mask were simultaneously divided into small tiles, a common preprocessing step when preparing image data for the training phase of machine learning algorithms. The data contain the two original orthophotomaps from the Wietrznia and Pod Telegrafem residential districts with corresponding masks, as well as their tiled versions, ready to use as training data for machine learning models.
The Transformer-based neural network was trained on the Wietrznia dataset for semantic segmentation of the tiles into the building and surroundings classes. Model inference was then used to test the model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfactory, so it can be used for automatic semantic building segmentation. The tiling process can then be reversed and the complete classification mask reconstructed. This mask can be used for building-area calculations and urban sprawl monitoring if the research is repeated for GIS data covering a wider time horizon.
Since the dataset was collected from the Kielce GIS portal, as part of the data resources of the Polish Main Office of Geodesy and Cartography, it may be used only for non-profit and non-commercial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". There are no other legal or ethical considerations regarding reuse potential.
Data information is presented below.
- wietrznia_2019.jpg - orthophotomap of Wietrznia district - used for model's training, as an explanatory image
- wietrznia_2019.png - classification mask of Wietrznia district - used for model's training, as a target image
- wietrznia_2019_validation.jpg - one image from Wietrznia district - used for model's validation during the training phase
- pod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district - used for model's evaluation after the training phase
- wietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each); tiles with no information were manually removed so that the training data contain only informative tiles; tiles were presented to the model during training (images and annotations for fitting the model to the data)
- wietrznia_2019_validation - folder with the wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each); tiles were presented to the model during training (images for validating the model's efficiency); they were not part of the training data
- pod_telegrafem_2019 - folder with the pod_telegrafem.jpg image divided into 196 tiles (256 x 256 pixels each); tiles were presented to the model during inference (images for evaluating the model's robustness)
The dataset was created as described below.
Firstly, the orthophotomaps were collected from Kielce Geoportal (https://gis.kielce.eu). Kielce Geoportal offers a recent .pst map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from a plane flight at 700 meters above ground level, taken with a camera for vertical photos. The download was done via WMS in the open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, then converted to a 1200 dpi PNG image.
Secondly, the map of the Wietrznia residential district was manually labelled, also in QGIS, in the same scope as the orthophotomap. The annotation was based on land cover map information, also obtained from Kielce Geoportal. There are two classes: residential building and surroundings. The second map, of the Pod Telegrafem district, was not annotated, since it was used in the testing phase and imitates a situation where there is no annotation for new data presented to the model.
Next, the images were converted to RGB JPG images, and the annotation map was converted to an 8-bit GRAY PNG image.
Finally, the Wietrznia data files were tiled into 512 x 512 pixel tiles using the Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed. So, from the 29113 x 15938 pixel orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. The Pod Telegrafem orthophotomap was tiled without manual removal, so from the 7168 x 7168 pixel orthophotomap, 197 tiles with 256 x 256 pixel resolution were created. There was also an image of one residential building, used for the model's validation during the training phase; it was not part of the training data, but was part of the Wietrznia residential area. It was a 2048 x 2048 pixel orthophotomap, tiled into 16 tiles of 256 x 256 pixels each.
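For reference, a minimal Python sketch of the tiling step, assuming Pillow (PIL) is available. The tile size and file names come from the description above; the output folder layout is an assumption, and the manual removal of uninformative tiles performed by the authors is not reproduced here.

```python
# Minimal sketch of the tiling step, assuming Pillow (PIL) is installed and that
# the orthophotomap and its classification mask share the same pixel grid.
from pathlib import Path
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # the 29113 x 15938 px orthophotomap exceeds PIL's default limit

def tile_pair(image_path, mask_path, out_dir, tile_size=512):
    """Cut an orthophotomap and its classification mask into aligned square tiles."""
    image = Image.open(image_path)   # RGB JPG orthophotomap
    mask = Image.open(mask_path)     # 8-bit GRAY PNG annotation
    out = Path(out_dir)
    (out / "images").mkdir(parents=True, exist_ok=True)
    (out / "annotations").mkdir(parents=True, exist_ok=True)
    width, height = image.size
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            box = (left, top, left + tile_size, top + tile_size)
            name = f"tile_{top}_{left}"
            image.crop(box).save(out / "images" / f"{name}.jpg")
            mask.crop(box).save(out / "annotations" / f"{name}.png")

# Example: produce 512 x 512 px tiles from the Wietrznia image/mask pair
# (uninformative tiles were then removed manually by the authors).
tile_pair("wietrznia_2019.jpg", "wietrznia_2019.png", "wietrznia_2019_tiles")
```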
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or 'label images') collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1 m) orthomosaics and satellite image tiles (10–30 m).
Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the naming convention {datasource}{numberofclasses}{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detailed information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original input image, unedited), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from the annotations). All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detailed information on the complete imagery collection is available at the top level of this data release; details are available in the Entity and Attribute section of this metadata file.
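As an illustration, a minimal sketch for inspecting one extracted NPZ file with NumPy. The array names follow the NPY file list above, the example file name is a placeholder, and the utilities in Doodler (Buscombe, 2022) remain the reference way to extract these files.

```python
# Minimal sketch: open one Coast Train NPZ file and look at its NPY members.
# Only the .npy members are exposed through numpy.load; the bundled CSV is
# handled by the Doodler extraction utilities.
import numpy as np

with np.load("example_image.npz", allow_pickle=True) as data:
    print("arrays in file:", data.files)
    label = data["label"]        # labelled image built from the annotations
    classes = data["classes"]    # classes annotated/present in the labelled image
    print("label image shape:", label.shape)
    print("classes:", classes)
```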
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The SkySeaLand Dataset is a high-resolution satellite imagery collection developed for object detection, classification, and aerial analysis tasks. It focuses on transportation-related objects observed from diverse geospatial contexts, offering precise YOLO-formatted annotations for four categories: airplane, boat, car, and ship.
This dataset bridges terrestrial, maritime, and aerial domains, providing a unified resource for developing and benchmarking computer vision models in complex real-world environments.
Annotations are provided in YOLO format (one .txt file per image). The SkySeaLand Dataset is divided into subsets for training, validation, and testing.
This split ensures a balanced distribution for training, validating, and testing models, facilitating robust model evaluation and performance analysis.
| Class Name | Object Count |
|---|---|
| Airplane | 4,847 |
| Boat | 3,697 |
| Car | 6,932 |
| Ship | 3,627 |
The dataset maintains a moderately balanced distribution among categories, ensuring stable model performance during multi-class training and evaluation.
Each label file contains normalized bounding box annotations in YOLO format.
The format for each line is:

`class_id x_center y_center width height`

Where:
- class_id: the class of the object (refer to the table below).
- x_center, y_center: the center coordinates of the bounding box, normalized between 0 and 1 relative to the image width and height.
- width, height: the width and height of the bounding box, also normalized between 0 and 1.
| Class ID | Category |
|---|---|
| 0 | Airplane |
| 1 | Boat |
| 2 | Car |
| 3 | Ship |
All coordinates are normalized between 0 and 1 relative to the image width and height.
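For illustration, a minimal Python sketch that parses one YOLO label file using the format and class IDs above and converts the normalized boxes to pixel coordinates; the file names are hypothetical placeholders.

```python
# Minimal sketch: read a SkySeaLand YOLO label file and convert the normalized
# boxes back to pixel coordinates for the matching image.
from PIL import Image

CLASS_NAMES = {0: "Airplane", 1: "Boat", 2: "Car", 3: "Ship"}

def read_yolo_labels(label_path, image_path):
    """Return a list of (class_name, x_min, y_min, x_max, y_max) in pixels."""
    img_w, img_h = Image.open(image_path).size
    boxes = []
    with open(label_path) as f:
        for line in f:
            class_id, x_c, y_c, w, h = line.split()
            x_c, w = float(x_c) * img_w, float(w) * img_w
            y_c, h = float(y_c) * img_h, float(h) * img_h
            boxes.append((
                CLASS_NAMES[int(class_id)],
                x_c - w / 2, y_c - h / 2,   # top-left corner
                x_c + w / 2, y_c + h / 2,   # bottom-right corner
            ))
    return boxes

# Example usage with hypothetical file names:
# for box in read_yolo_labels("labels/scene_0001.txt", "images/scene_0001.jpg"):
#     print(box)
```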
Data Source:
- Satellite imagery was obtained from Google Earth Pro under fair-use and research guidelines.
- The dataset was prepared solely for academic and educational computer vision research.
Annotation Tools:
- Manual annotations were performed and verified using:
- CVAT (Computer Vision Annotation Tool)
- Roboflow
These tools were used to ensure consistent annotation quality and accurate bounding box placement across all object classes.
This dataset features over 1,300,000 high-quality images capturing a wide spectrum of weather conditions, sourced from photographers worldwide. Curated specifically for AI and machine learning applications, it provides richly annotated, visually diverse content across atmospheric states and environmental settings.
Key Features:
1. Comprehensive Metadata: includes full EXIF data along with detailed annotations for weather type (e.g., sunny, cloudy, foggy, rainy, snowy, stormy), visibility, lighting conditions, and environmental context. This enables use in classification, detection, forecasting support, and image-to-text training. Location and timestamp metadata are also included where available.
2. Unique Sourcing Capabilities: images are collected via a proprietary gamified photography platform with competitions focused on weather, seasons, and natural phenomena. Custom datasets can be delivered within 72 hours to target specific conditions, times of day, regions, or intensity levels (e.g., heavy rain vs. light drizzle).
3. Global Diversity: contributions from over 100 countries provide coverage of diverse climates, geographic regions, and seasonal cycles. From tropical downpours and desert heatwaves to arctic snowfalls and monsoon clouds, the dataset captures an unparalleled range of real-world weather phenomena.
4. High-Quality Imagery: includes standard to ultra-HD images, featuring both dramatic weather events and subtle atmospheric changes. A mix of landscape, street-level, and aerial perspectives enhances model training for real-world recognition and simulation.
5. Popularity Scores: each image carries a popularity score based on GuruShots competition performance, helping guide aesthetic evaluation and dataset curation for applications involving user engagement or weather-related visual content.
6. AI-Ready Design: ideal for training models in weather classification, environmental analysis, climate research, visual forecasting, and style transfer. Fully compatible with popular machine learning and geospatial frameworks.
7. Licensing & Compliance: dataset complies with international data use regulations and is licensed transparently for both commercial and academic use.
Use Cases:
1. Training AI for weather condition recognition in autonomous systems, drones, and outdoor devices.
2. Enhancing image captioning, climate classification, and weather-aware generative models.
3. Supporting AR/VR simulations and environmental visualization tools.
4. Powering content moderation, scene adaptation, and weather-sensitive product personalization.
This dataset provides a scalable, high-fidelity resource for AI models that require real-world weather context, visual diversity, and global applicability. Custom subsets and filters available. Contact us to learn more!
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Our strategy is to reuse images from existing benchmark datasets as much as possible and manually annotate new land cover labels. We selected xBD, Inria, Open Cities AI, SpaceNet, Landcover.ai, AIRS, GeoNRW, and HTCD datasets. For countries and regions not covered by the existing datasets, aerial images publicly available in such countries or regions were collected to mitigate the regional gap, which is an issue in most of the existing benchmark datasets. The open data were downloaded from OpenAerialMap and geospatial agencies in Peru and Japan. The attribution of source data is summarized here.
We provide annotations with eight classes: bareland, rangeland, developed space, road, tree, water, agriculture land, and building. Their color and proportion of pixels are summarized below. All the labeling was done manually, and it took 2.5 hours per image on average.
| Color (HEX) | Class | % of pixels |
|---|---|---|
| #800000 | Bareland | 1.5 |
| #00FF24 | Rangeland | 22.9 |
| #949494 | Developed space | 16.1 |
| #FFFFFF | Road | 6.7 |
| #226126 | Tree | 20.2 |
| #0045FF | Water | 3.3 |
| #4BB549 | Agriculture land | 13.7 |
| #DE1F07 | Building | 15.6 |
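As an illustration, a minimal sketch that maps the class colors above to integer indices, assuming the labels are distributed as RGB images using exactly these colors (if the release already ships integer label maps, this step is unnecessary); the index numbering here is arbitrary and chosen only for the example.

```python
# Minimal sketch: convert an RGB OpenEarthMap-style label image to class indices.
import numpy as np
from PIL import Image

CLASS_COLORS = {  # HEX color -> (assumed class index, class name)
    "#800000": (1, "Bareland"),
    "#00FF24": (2, "Rangeland"),
    "#949494": (3, "Developed space"),
    "#FFFFFF": (4, "Road"),
    "#226126": (5, "Tree"),
    "#0045FF": (6, "Water"),
    "#4BB549": (7, "Agriculture land"),
    "#DE1F07": (8, "Building"),
}

def hex_to_rgb(hex_color):
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))

def rgb_label_to_indices(path):
    """Convert an RGB label image to a 2-D array of class indices (0 = unmatched)."""
    rgb = np.array(Image.open(path).convert("RGB"))
    indices = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for hex_color, (class_id, _) in CLASS_COLORS.items():
        mask = np.all(rgb == hex_to_rgb(hex_color), axis=-1)
        indices[mask] = class_id
    return indices
```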
Label data of OpenEarthMap are provided under the same license as the original RGB images, which varies with each source dataset. For more details, please see the attribution of source data here. Label data for regions where the original RGB images are in the public domain or where the license is not explicitly stated are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
@inproceedings{xia_2023_openearthmap,
title = {OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping},
author = {Junshi Xia and Naoto Yokoya and Bruno Adriano and Clifford Broni-Bediako},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2023},
pages = {6254-6264}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains annotations (i.e. polygons) for solar photovoltaic (PV) objects in the previously published dataset "Classification Training Dataset for Crop Types in Rwanda" published by RTI International (DOI: 10.34911/rdnt.r4p1fr [1]). These polygons are intended to enable the use of this dataset as a machine learning training dataset for solar PV identification in drone imagery. Note that this dataset contains ONLY the solar panel polygon labels and needs to be used with the original RGB UAV imagery "Drone Imagery Classification Training Dataset for Crop Types in Rwanda" (https://mlhub.earth/data/rti_rwanda_crop_type). The original dataset contains UAV imagery (RGB) in .tiff format from six provinces in Rwanda, each imaged in three phases, and our solar PV annotation dataset follows the same data structure, with province and phase labels in each subfolder.
Data processing:
Please refer to this GitHub repository for further details: https://github.com/BensonRen/Drone_based_solar_PV_detection. The original dataset is divided into 8000x8000 pixel image tiles and manually labeled with polygons (mainly rectangles) to indicate the presence of solar PV. These polygons are converted into pixel-wise, binary class annotations.
Other information:
1. The six provinces the UAV imagery came from are: (1) Cyampirita, (2) Kabarama, (3) Kaberege, (4) Kinyaga, (5) Ngarama, (6) Rwakigarati. The original data collections were staged across 18 phases, each collecting a set of imagery from a given province (each province had 3 phases of collection). We have annotated 15 out of 18 phases; the missing ones are Kabarama-Phase2, Kaberege-Phase3, and Kinyaga-Phase3, which were excluded due to data compatibility issues.
2. The annotated polygons are transformed into binary maps with the same size as the image tiles, where each pixel is either 0 or 1; 0 represents background and 1 represents solar PV pixels. These binary maps are in .png format, and each province/phase set has between 9 and 49 annotation patches. Using the code provided in the repository above, the same image patches can be cropped from the original RGB imagery.
3. Solar PV densities vary across the image patches. In total, 214 solar PV instances were labeled across the 15 annotated phases.
Associated publication:
"Utilizing geospatial data for assessing energy security: Mapping small solar home systems using unmanned aerial vehicles and deep learning" [https://arxiv.org/abs/2201.05548]
This dataset is published under the CC-BY-NC-SA-4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/).
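For illustration, a minimal sketch of the polygon-to-binary-mask step using Pillow; the polygon coordinates, tile size, and file names are placeholders, and the linked GitHub repository contains the authors' actual processing code, which should be treated as the reference implementation.

```python
# Minimal sketch: rasterize solar PV polygons into a binary mask
# (1 = solar PV, 0 = background), matching the tile size described above.
from PIL import Image, ImageDraw

def polygons_to_binary_mask(polygons, width=8000, height=8000):
    """Rasterize a list of polygons (each a list of (x, y) pixel vertices)."""
    mask = Image.new("L", (width, height), 0)       # 8-bit grayscale, all background
    draw = ImageDraw.Draw(mask)
    for polygon in polygons:
        draw.polygon(polygon, outline=1, fill=1)    # fill PV pixels with 1
    return mask

# Example with a single hypothetical rectangular PV label:
mask = polygons_to_binary_mask([[(100, 100), (180, 100), (180, 150), (100, 150)]])
mask.save("example_binary_mask.png")
```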
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
FASDD is the largest and most generalized Flame And Smoke Detection Dataset for object detection tasks, characterized by the utmost complexity in fire scenes, the highest heterogeneity in feature distribution, and the most significant variations in image size and shape. FASDD serves as a benchmark for developing advanced fire detection models, which can be deployed on watchtowers, drones, or satellites in a space-air-ground integrated observation network for collaborative fire warning. This endeavor provides valuable insights for government decision-making and fire rescue operations. FASDD contains fire, smoke, and confusing non-fire/non-smoke images acquired at different distances (near and far), in different scenes (indoor and outdoor), under different light intensities (day and night), and from various visual sensors (surveillance cameras, UAVs, and satellites).
FASDD consists of three sub-datasets: a Computer Vision (CV) dataset (FASDD_CV), an Unmanned Aerial Vehicle (UAV) dataset (FASDD_UAV), and a Remote Sensing (RS) dataset (FASDD_RS). FASDD comprises 122,634 samples, with 70,581 annotated as positive samples and 52,073 labeled as negative samples. There are 113,154 instances of flame objects and 73,072 instances of smoke objects in the entire dataset. FASDD_CV contains 95,314 samples for general computer vision, FASDD_UAV consists of 25,097 samples captured by UAV, and FASDD_RS comprises 2,223 samples from satellite imagery. FASDD_CV contains 73,297 fire instances and 53,080 smoke instances. The CV dataset exhibits considerable variation in image size, ranging from 78 to 10,600 pixels in width and 68 to 8,858 pixels in height. The aspect ratios of the images also vary significantly, ranging from 1:6.6 to 1:0.18. FASDD_UAV contains 36,308 fire instances and 17,222 smoke instances, with image aspect ratios primarily distributed between 4:3 and 16:9. In FASDD_RS, there are 2,770 smoke instances and 3,549 flame instances. The sizes of remote sensing images are predominantly around 1,000×1,000 pixels.
FASDD is provided in three compressed files: FASDD_CV.zip, FASDD_UAV.zip, and FASDD_RS.zip, which correspond to the CV, UAV, and RS datasets, respectively. Additionally, there is a FASDD_RS_SWIR.zip folder storing pseudo-color images for detecting flame objects in remote sensing imagery. Each zip file contains two folders: "images" for the source data and "annotations" for the labels. The "annotations" folder consists of label files in four formats: YOLO, VOC, COCO, and TDML. Within each label format, the dataset is divided randomly into training, validation, and test sets with a ratio of 1/2, 1/3, and 1/6, respectively. In FASDD_CV, FASDD_UAV, and FASDD_RS, images and their corresponding annotation files have been individually numbered starting from 0. The flame and smoke objects in FASDD are given the labels "fire" and "smoke" for the object detection task. The names of all images and annotation files are prefixed with "Fire", "Smoke", "FireAndSmoke", or "NeitherFireNorSmoke", representing the different categories for scene classification tasks.
When using this dataset, please cite the following paper. Thank you very much for your support and cooperation:
Wang, M., Yue, P., Jiang, L., Yu, D., Tuo, T., & Li, J. (2025). An open flame and smoke detection dataset for deep learning in remote sensing based fire detection. Geo-spatial Information Science, 28(2), 511-526.
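As an illustration, a minimal Python sketch that recovers the scene-classification category from the file-name prefixes described above; the directory layout and .jpg extension in the example are assumptions, only the four prefixes come from the dataset description.

```python
# Minimal sketch: map FASDD file-name prefixes to scene-classification categories.
from pathlib import Path
from collections import Counter

# Check longer prefixes first so "FireAndSmoke" is not matched as "Fire".
PREFIXES = ("NeitherFireNorSmoke", "FireAndSmoke", "Smoke", "Fire")

def scene_category(file_name):
    """Return the scene-classification category encoded in a FASDD file name."""
    for prefix in PREFIXES:
        if file_name.startswith(prefix):
            return prefix
    raise ValueError(f"Unexpected FASDD file name: {file_name}")

def count_categories(image_dir):
    """Count images per scene category in one folder (assumed .jpg extension)."""
    return Counter(scene_category(p.name) for p in Path(image_dir).glob("*.jpg"))

# Example with a hypothetical file name:
print(scene_category("FireAndSmoke_000123.jpg"))   # -> "FireAndSmoke"
```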
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RoadSens-4M Dataset provides a multimodal collection of sensor, video, weather, and GIS data designed to support research in intelligent transportation systems, road condition monitoring, and machine-learning-based anomaly detection. This dataset integrates synchronized smartphone sensor data (accelerometer, gyroscope, magnetometer, GPS) with video annotations, weather, and geospatial information to accurately identify and classify road surface anomalies, including bumps, potholes, and normal road segments.
The dataset comprises 103 data sessions organized in a hierarchical structure to facilitate flexible access and multi-level analysis. It is divided into four main components: Raw Data, Combined CSV with GIS and Weather Data, Isolated Data, and GIS Data. Each session folder contains all corresponding sensor CSV files, including both calibrated and uncalibrated readings from the accelerometer, gyroscope, magnetometer, barometer, compass, gravity, and GPS sensors, along with annotation and metadata files. Within every session, a dedicated camera subfolder holds annotation data and a text file linking to the corresponding video stored on Google Drive, allowing researchers to access complete recordings without manual segmentation.
The merged CSV files combine synchronized sensor, GIS, and weather information (temperature, humidity, wind speed, and atmospheric pressure) with a sampling interval of 0.01 seconds, ensuring high temporal resolution. The Isolated Data folder further separates normal and anomaly samples to enable focused comparative analysis, while the GIS Data folder contains QGIS and elevation files for spatial and topographical visualization.
This well-structured organization ensures seamless integration of sensor, video, geographic, and environmental data, supporting efficient navigation and in-depth multimodal research. The raw data are hosted separately on Google Drive and can be accessed via the following link:
🔗 https://drive.google.com/drive/folders/16tRSgXy6bjgIcJZzdw3U5unw7jpsKAHB?usp=drive_link
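As an illustration, a minimal pandas sketch for loading one merged session CSV and checking the stated 0.01 s sampling interval; the file name and the "timestamp" column name are hypothetical placeholders, so the actual schema should be taken from the session metadata files in the dataset.

```python
# Minimal sketch: load a merged RoadSens-4M session CSV and inspect its schema
# and sampling interval. Column and file names here are assumptions.
import pandas as pd

df = pd.read_csv("session_001_combined.csv")      # hypothetical merged CSV
print(df.shape, list(df.columns)[:10])            # quick look at the schema

# If a timestamp column (in seconds) is present, the median step should be ~0.01 s.
if "timestamp" in df.columns:
    step = df["timestamp"].diff().median()
    print(f"median sampling interval: {step:.4f} s")
```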