Cityscapes is a large-scale database which focuses on semantic understanding of urban street scenes. It provides semantic, instance-wise, dense pixel annotations for 30 classes grouped into 8 categories (flat surfaces, humans, vehicles, constructions, objects, nature, sky, and void). The dataset consists of around 5,000 finely annotated images and 20,000 coarsely annotated ones. Data was captured in 50 cities over several months, at various times of day, and in good weather conditions. It was originally recorded as video, so the frames were manually selected to have the following features: a large number of dynamic objects, varying scene layout, and varying background.
Cityscapes is a dataset of diverse urban street scenes recorded across 50 different cities at varying times of the year, together with ground truths for several vision tasks including semantic segmentation, instance-level segmentation, and stereo-pair disparity inference.
For segmentation tasks (default split, accessible via 'cityscapes/semantic_segmentation'), Cityscapes provides dense pixel-level annotations for 5,000 images at 1024 × 2048 resolution, pre-split into training (2975), validation (500), and test (1525) sets. Label annotations for segmentation tasks span 30 classes commonly encountered in driving scene perception. Detailed label information may be found here: https://github.com/mcordts/cityscapesScripts/blob/master/cityscapesscripts/helpers/labels.py#L52-L99
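As a quick sanity check, the three fine-annotation split sizes quoted above sum to the full 5,000-image set:

```python
# Fine-annotation split sizes quoted for cityscapes/semantic_segmentation.
SPLITS = {"train": 2975, "validation": 500, "test": 1525}
total = sum(SPLITS.values())  # 5000 finely annotated images in total
```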
Cityscapes also provides coarse-grained segmentation annotations (accessible via 'cityscapes/semantic_segmentation_extra') for 19,998 images in a 'train_extra' split, which may prove useful for pretraining or data-heavy models.
Besides segmentation, Cityscapes also provides stereo image pairs and ground truths for disparity inference tasks on both the normal and extra splits (accessible via 'cityscapes/stereo_disparity' and 'cityscapes/stereo_disparity_extra' respectively).
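As a minimal sketch (assuming the four config names quoted above are current), each builder config is selected through TFDS's 'dataset/config' naming scheme:

```python
# The four Cityscapes builder configs mentioned above.
CONFIGS = [
    "semantic_segmentation",        # fine annotations (default)
    "semantic_segmentation_extra",  # coarse 'train_extra' annotations
    "stereo_disparity",             # stereo pairs + disparity, normal splits
    "stereo_disparity_extra",       # stereo pairs + disparity, extra split
]

def tfds_name(config: str) -> str:
    """Full name to pass to tfds.load for a given Cityscapes config."""
    return f"cityscapes/{config}"

# e.g. tfds.load(tfds_name("stereo_disparity"), split="train")
names = [tfds_name(c) for c in CONFIGS]
```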
WARNING: this dataset requires users to set up a login and password in order to get the files.
To use this dataset:

```python
import tensorflow_datasets as tfds

ds = tfds.load('cityscapes', split='train')
for ex in ds.take(4):
  print(ex)
```
See the guide for more information on tensorflow_datasets.
Chris1/cityscapes dataset hosted on Hugging Face and contributed by the HF Datasets community
The Cityscapes Dataset focuses on semantic understanding of urban street scenes. In the following, we give an overview of the design choices that were made to target the dataset’s focus.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Foggy Cityscapes Dataset is a dataset for object detection tasks - it contains People, Vehicles, Objects, and Jag4 annotations for 1,500 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Download the Cityscapes Dataset for urban scene segmentation, AI training, and autonomous driving research.
This dataset is part of the CycleGAN datasets, originally hosted here: https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/
Citation
@article{DBLP:journals/corr/ZhuPIE17, author = {Jun{-}Yan Zhu and Taesung Park and Phillip Isola and Alexei A. Efros}, title = {Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks}, journal = {CoRR}, volume = {abs/1703.10593}, year… See the full description on the dataset page: https://huggingface.co/datasets/huggan/cityscapes.
Out-of-Context Cityscapes (OC-Cityscapes) is a new dataset built by replacing the roads in the Cityscapes validation data with various textures such as water, sand, and grass.
https://drive.google.com/file/d/1pKdlglcvsGseLzS1MX8SdjzQO2o1KZm6/view?usp=sharing
The Cityscapes dataset is a large-scale urban scene dataset containing 30,000 images of street scenes.
Cityscapes-VPS is a video extension of the Cityscapes validation split. It provides 2500-frame panoptic labels that temporally extend the 500 Cityscapes image-panoptic labels. In total there are 3000 frames of panoptic labels, corresponding to the 5th, 10th, 15th, 20th, 25th, and 30th frames of each of the 500 videos, with all instance ids associated over time. It not only supports the video panoptic segmentation (VPS) task, but also provides superset annotations for the video semantic segmentation (VSS) and video instance segmentation (VIS) tasks.
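The annotation layout above can be sketched as a small arithmetic check: six annotated frames per 30-frame video, for each of the 500 videos, yields the 3000 labeled frames.

```python
# Cityscapes-VPS annotation layout described above: every 5th frame of each
# 30-frame video carries a panoptic label.
ANNOTATED_FRAMES = [5, 10, 15, 20, 25, 30]
NUM_VIDEOS = 500
total_labels = len(ANNOTATED_FRAMES) * NUM_VIDEOS  # 3000 labeled frames
```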
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Most semantic segmentation works have obtained accurate segmentation results by exploring contextual dependencies. However, several major limitations need further investigation. For example, most approaches rarely distinguish different types of contextual dependencies, which may pollute scene understanding. Moreover, local convolutions are commonly used in deep learning models to learn attention and capture local patterns in the data. These convolutions operate on a small neighborhood of the input, focusing on nearby information and disregarding global structural patterns. To address these concerns, we propose a Global Domain Adaptation Attention with Data-Dependent Regulator (GDAAR) method to explore the contextual dependencies. Specifically, to effectively capture both the global distribution information and local appearance details, we suggest using a stacked relation approach. This involves incorporating the feature node itself and its pairwise affinities with all other feature nodes within the network, arranged in raster-scan order. By doing so, we can learn a global domain adaptation attention mechanism. Meanwhile, to improve the similarity of features belonging to the same segment region while keeping the discriminative power of features belonging to different segments, we design a data-dependent regulator to adjust the global domain adaptation attention on the feature map during inference. Extensive ablation studies demonstrate that our GDAAR better captures the global distribution information for the contextual dependencies and achieves state-of-the-art performance on several popular benchmarks.
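The stacked relation idea can be illustrated with a minimal NumPy sketch. This is our reading of the abstract, not the authors' implementation: each feature node, in raster-scan order, is concatenated with a softmax-normalized row of its pairwise affinities (here, dot products) to all nodes.

```python
import numpy as np

def stacked_relation(features):
    """Sketch of a stacked relation: each feature node is concatenated with
    its normalized pairwise affinities to all nodes.

    features: (N, C) array of N feature nodes with C channels.
    returns: (N, C + N) array: the node itself plus its affinity row.
    """
    # Pairwise affinities as dot products between feature nodes (one common choice).
    affinity = features @ features.T                    # (N, N)
    # Row-wise softmax turns affinities into attention weights.
    affinity -= affinity.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)
    # Stack the node's own features with its affinity row.
    return np.concatenate([features, weights], axis=1)

feats = np.random.default_rng(0).normal(size=(6, 4))   # 6 nodes, 4 channels
rel = stacked_relation(feats)                          # shape (6, 10)
```

In a real model the affinity rows would feed a learned attention module rather than being concatenated raw, and the data-dependent regulator would rescale them at inference time.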
Cityscapes data (dataset home page) contains labeled videos taken from vehicles driven in Germany. This version is a processed subsample created as part of the Pix2Pix paper. The dataset has still images from the original videos, and the semantic segmentation labels are shown in images alongside the original image. This is one of the best datasets around for semantic segmentation tasks.
This dataset has 2,975 training image files and 500 validation image files. Each image file is 256×512 pixels, and each file is a composite with the original photo on the left half of the image, alongside the labeled image (output of semantic segmentation) on the right half.
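Splitting a composite back into its two halves is a one-line slice. A minimal sketch, with a zero array standing in for a real file loaded via e.g. PIL.Image.open:

```python
import numpy as np

# A 256x512 composite: photo on the left half, segmentation colors on the right.
composite = np.zeros((256, 512, 3), dtype=np.uint8)  # stand-in for a real image
photo = composite[:, :256]    # left half: original street photo
labels = composite[:, 256:]   # right half: semantic segmentation rendering
```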
This dataset is the same as what is available here from the Berkeley AI Research group.
The Cityscapes data available from cityscapes-dataset.com has the following license:
This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data given that you agree:
Can you identify what objects are where in these images taken from a vehicle?
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Object Segmentation Cityscapes is a dataset for instance segmentation tasks - it contains Objects annotations for 582 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
EduardoLawson1/cityscapes dataset hosted on Hugging Face and contributed by the HF Datasets community
Sajid121/Cityscapes dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cityscape‑Adverse
A benchmark for evaluating semantic segmentation robustness under realistic adverse conditions.
Overview
Cityscape‑Adverse extends the original Cityscapes dataset by applying eight realistic environmental modifications—rainy, foggy, spring, autumn, snowy, sunny, night, and dawn—using diffusion‑based image editing. All transformations preserve the original 2048×1024 semantic labels, enabling direct evaluation of model robustness in out‑of‑distribution… See the full description on the dataset page: https://huggingface.co/datasets/naufalso/cityscape-adverse.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Cityscapes_full is a dataset for instance segmentation tasks - it contains Road annotations for 250 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Segmentation results of Cityscapes dataset experiment (unit: %).
The GTA5→Cityscapes dataset is a synthetic-to-real benchmark dataset for domain adaptation in semantic segmentation.