Cityscapes-VPS is a video extension of the Cityscapes validation split. It provides 2,500 frames of panoptic labels that temporally extend the 500 Cityscapes image-panoptic labels. In total there are 3,000 frames of panoptic labels, corresponding to the 5th, 10th, 15th, 20th, 25th, and 30th frames of each of the 500 videos, with all instance IDs associated over time. It not only supports the video panoptic segmentation (VPS) task but also provides super-set annotations for the video semantic segmentation (VSS) and video instance segmentation (VIS) tasks.
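A minimal sketch of that "super-set" relation, assuming each annotated frame is given as a per-pixel semantic-class map plus a temporally consistent per-pixel instance-ID map (this array layout is an illustrative assumption, not the official Cityscapes-VPS file format): VSS labels are the semantic map itself, and VIS labels are the per-instance masks of the thing classes.

```python
import numpy as np

# Illustrative sketch only: `semantic` and `instance` stand in for one annotated
# frame's panoptic label, split into a semantic-class map and an instance-ID map.
def vps_to_vss_and_vis(semantic: np.ndarray, instance: np.ndarray, thing_class_ids):
    """Derive VSS- and VIS-style labels from a VPS label of one frame."""
    vss = semantic.copy()  # video semantic segmentation: the semantic map itself
    vis = {}               # video instance segmentation: {(class_id, instance_id): mask}
    for cls in thing_class_ids:
        cls_mask = semantic == cls
        for inst in np.unique(instance[cls_mask]):
            vis[(int(cls), int(inst))] = cls_mask & (instance == inst)
    return vss, vis
```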
We design an all-day semantic segmentation benchmark, all-day CityScapes. It is the first semantic segmentation benchmark that contains samples from all-day scenarios, i.e., from dawn to night. Our dataset will be made publicly available at https://isis-data.science.uva.nl/cv/1ADcityscape.zip.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
Cross-domain object detection is a key problem in the research of intelligent detection models. Unlike many improved algorithms built on two-stage detection models, we take a different route: this paper introduces a simple and efficient one-stage model that jointly considers inference efficiency and detection precision and broadens the scope of cross-domain object detection. We name this gradient reversal layer-based model YOLO-G; it greatly improves object detection precision in cross-domain scenarios. Specifically, we add a feature alignment branch after the backbone, to which a gradient reversal layer and a classifier are attached. With only a small increase in computational cost, performance is substantially enhanced. Experiments on Cityscapes→Foggy Cityscapes, SIM10k→Cityscapes, PASCAL VOC→Clipart, and other benchmarks indicate that, compared with most state-of-the-art (SOTA) algorithms, the proposed model achieves much better mean Average Precision (mAP). Furthermore, ablation experiments were performed on four components to confirm the reliability of the model. The project is available at https://github.com/airy975924806/yolo-G.
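As a rough illustration of the feature-alignment branch described above, here is a minimal PyTorch-style sketch of a gradient reversal layer with a small domain classifier attached to the backbone features. Layer sizes, the weight lambda_, and module names are illustrative assumptions, not the authors' exact YOLO-G implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reversed (and scaled) gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambda_ * grad_output, None

class DomainClassifier(nn.Module):
    """Binary source/target classifier attached after the detector backbone."""
    def __init__(self, in_channels=512, lambda_=1.0):
        super().__init__()
        self.lambda_ = lambda_
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, 2),  # source vs. target domain
        )

    def forward(self, feats):
        # Reversing the gradient here pushes the backbone toward domain-invariant features.
        feats = GradReverse.apply(feats, self.lambda_)
        return self.head(feats)
```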
The CityPersons dataset is a subset of Cityscapes consisting only of person annotations. There are 2,975 images for training, 500 for validation, and 1,575 for testing. The average number of pedestrians per image is 7. Both visible-region and full-body annotations are provided.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
Effect of additional modules on segmentation performance: ablation study results on the Cityscapes dataset.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
Light across the full spectrum travels through the physical world, and trichromatic (RGB) human vision captures and interprets it. Machine vision follows this analogy, using RGB cameras for semantic segmentation and scene understanding. We argue that such machine vision suffers from metamerism: different objects may appear the same RGB color while being distinct in spectrum. While learning-based solutions, especially deep learning, have been heavily explored, they do not solve this fundamental physical limitation. In this paper, we propose to use hyperspectral images (HSIs), which capture hundreds of consecutive narrow bands of the real visible world, so that metamerism no longer exists. In short, we aim to "see beyond human vision". In practice, we introduce a novel large-scale, high-quality HSI dataset for semantic segmentation in cityscapes, named the Hyperspectral City dataset. The dataset contains 1,330 HSIs captured in typical urban driving scenes. Each HSI has a spatial resolution of 1889×1422 and 128 spectral channels ranging from 450 nm to 950 nm. The dataset provides pixel-level semantic annotation done manually by professional annotators. We believe this dataset enables a new direction for scene understanding.
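For quick intuition about the spectral sampling, the sketch below approximates each band's center wavelength, assuming the 450-950 nm range is sampled uniformly across the 128 channels (the listing does not state the exact band centers, so this is only an approximation).

```python
# Approximate band centers under a uniform-sampling assumption; the true
# sensor response of the Hyperspectral City camera may differ.
def band_wavelengths(num_bands=128, start_nm=450.0, end_nm=950.0):
    step = (end_nm - start_nm) / (num_bands - 1)  # ~3.94 nm between adjacent bands
    return [start_nm + i * step for i in range(num_bands)]

wavelengths = band_wavelengths()
print(wavelengths[0], wavelengths[64], wavelengths[-1])  # ~450.0, ~702.0, 950.0 nm
```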
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
Performance comparison of the proposed CycleGAN with other SOTA deep generative models.
The ADE20K semantic segmentation dataset contains more than 20K scene-centric images exhaustively annotated with pixel-level object and object-part labels. There are 150 semantic categories in total, including stuff classes such as sky, road, and grass, and discrete objects such as person, car, and bed.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
Performance comparison of semantic segmentation methods on Cityscapes and DensePASS.
DensePASS is a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, built specifically to study pinhole-to-panoramic transfer and accompanied by pinhole-camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360-degree images, with the labelled data comprising 19 classes that explicitly match the categories available in the source-domain (i.e., pinhole) data.
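Since the 19 labelled classes match the Cityscapes source-domain categories, a model trained on the pinhole data can be evaluated on DensePASS with the usual Cityscapes train-ID ordering. The list below is the standard Cityscapes 19-class set; treat it as a convenience sketch and defer to the official DensePASS label files for the authoritative mapping.

```python
# Standard Cityscapes 19-class (train-ID) ordering, shared with DensePASS.
# Convenience sketch; the official label definitions remain authoritative.
CITYSCAPES_TRAIN_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]
assert len(CITYSCAPES_TRAIN_CLASSES) == 19
```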
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically).
This paper presents a novel method for improving semantic segmentation performance in computer vision tasks. Our approach utilizes an enhanced UNet architecture that leverages an improved ResNet50 backbone. We replace the last layer of ResNet50 with deformable convolution to enhance feature representation. Additionally, we incorporate an attention mechanism, specifically ECA-ASPP (Efficient Channel Attention Atrous Spatial Pyramid Pooling), in the encoding path of UNet to capture multi-scale contextual information effectively. In the decoding path of UNet, we explore the use of attention mechanisms after concatenating low-level features with high-level features. Specifically, we investigate two types of attention mechanism: ECA (Efficient Channel Attention) and LKA (Large Kernel Attention). Our experiments demonstrate that incorporating attention after concatenation improves segmentation accuracy. Furthermore, we compare the performance of the ECA and LKA modules in the decoder path; the results indicate that the LKA module outperforms the ECA module, highlighting the importance of exploring different attention mechanisms and their impact on segmentation performance. To evaluate the effectiveness of the proposed method, we conduct experiments on benchmark datasets, including Stanford and Cityscapes, as well as the newly introduced WildPASS and DensePASS datasets. The proposed method achieves state-of-the-art results, including 85.79 mIoU on the Stanford dataset and 82.25 mIoU on the Cityscapes dataset, demonstrating high segmentation accuracy on these benchmarks.
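A minimal sketch of the decoder idea described above: attention applied after concatenating the upsampled high-level features with the low-level skip features. The ECA module follows its standard formulation (a 1-D convolution over the pooled channel descriptor); channel sizes and the surrounding block structure are illustrative assumptions rather than the paper's exact architecture, and an LKA module could be dropped in place of ECA.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: channel weights from a 1-D conv over the
    globally pooled channel descriptor (no dimensionality reduction)."""
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: (B, C, H, W) -> channel descriptor (B, 1, C) -> per-channel weights
        y = self.pool(x).squeeze(-1).transpose(1, 2)
        y = self.sigmoid(self.conv(y)).transpose(1, 2).unsqueeze(-1)
        return x * y  # re-weight channels

class DecoderBlock(nn.Module):
    """Upsample, concatenate the skip connection, then apply attention after
    the concatenation (here ECA; LKA could be substituted)."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.attn = ECA()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = torch.cat([self.up(x), skip], dim=1)
        return self.conv(self.attn(x))
```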
RailSem19 offers 8,500 unique images taken from the ego-perspective of rail vehicles (trains and trams). Extensive semantic annotations are provided, both geometry-based (rail-relevant polygons, all rails as polylines) and dense label maps with many Cityscapes-compatible road labels. Many frames show areas where road and rail traffic intersect (railway crossings, trams driving on city streets). RailSem19 is useful for rail and road applications alike.
Image credit: https://wilddash.cc/railsem19
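As a toy illustration of how polyline-style rail annotations like RailSem19's can be rasterized onto a dense label map (the coordinate format, image size, and label value used here are assumptions for illustration, not the RailSem19 annotation schema):

```python
import numpy as np
import cv2

# Rasterize one hypothetical rail polyline onto an empty single-channel label map.
label_map = np.zeros((1080, 1920), dtype=np.uint8)  # H x W label image
rail_polyline = np.array([[100, 1070], [400, 700],
                          [650, 400], [800, 200]], dtype=np.int32)  # (x, y) points
cv2.polylines(label_map, [rail_polyline], isClosed=False, color=255, thickness=3)
```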