13 datasets found
  1. Cityscapes-VPS Dataset

    • paperswithcode.com
    Updated Jun 14, 2022
    Cite
    Dahun Kim; Sanghyun Woo; Joon-Young Lee; In So Kweon (2022). Cityscapes-VPS Dataset [Dataset]. https://paperswithcode.com/dataset/cityscapes-vps
    Explore at:
    38 scholarly articles cite this dataset (View in Google Scholar)
    Authors
    Dahun Kim; Sanghyun Woo; Joon-Young Lee; In So Kweon
    Description

    Cityscapes-VPS is a video extension of the Cityscapes validation split. It provides 2,500 frames of panoptic labels that temporally extend the 500 Cityscapes image-panoptic labels, for a total of 3,000 labelled frames corresponding to the 5th, 10th, 15th, 20th, 25th, and 30th frames of each of the 500 videos, with all instance ids associated over time. It not only supports the video panoptic segmentation (VPS) task but also provides super-set annotations for video semantic segmentation (VSS) and video instance segmentation (VIS).
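
    A quick way to see how this annotation scheme plays out is a small sketch. This is not the official devkit; the frame indices and counts come straight from the description above, while the layout and function names are ours:

```python
# Hypothetical sketch: enumerate the annotated frames of Cityscapes-VPS.
# Assumes each of the 500 validation videos contributes 6 labelled frames
# (the 5th, 10th, 15th, 20th, 25th and 30th), as the description states.
ANNOTATED_FRAME_IDS = [5, 10, 15, 20, 25, 30]

def annotated_frames(num_videos: int = 500):
    """Yield (video_index, frame_index) pairs for every labelled frame."""
    for video in range(num_videos):
        for frame in ANNOTATED_FRAME_IDS:
            yield video, frame

if __name__ == "__main__":
    pairs = list(annotated_frames())
    print(len(pairs))  # 3000 labelled frames in total, matching the description
```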

  2. All-day CityScapes Dataset

    • paperswithcode.com
    Cite
    Qi Bi; ShaoDi You; Theo Gevers, All-day CityScapes Dataset [Dataset]. https://paperswithcode.com/dataset/all-day-cityscapes
    Authors
    Qi Bi; ShaoDi You; Theo Gevers
    Description

    We design All-day CityScapes, an all-day semantic segmentation benchmark. It is the first semantic segmentation benchmark that contains samples from all-day scenarios, i.e., from dawn to night. Our dataset will be made publicly available at https://isis-data.science.uva.nl/cv/1ADcityscape.zip.
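
    Assuming the archive URL above stays live, a minimal Python sketch for fetching and unpacking it (the local file and directory names are arbitrary choices):

```python
# Minimal sketch: download and unpack the All-day CityScapes archive.
import urllib.request
import zipfile

URL = "https://isis-data.science.uva.nl/cv/1ADcityscape.zip"
ARCHIVE = "1ADcityscape.zip"

urllib.request.urlretrieve(URL, ARCHIVE)      # download the archive
with zipfile.ZipFile(ARCHIVE) as zf:
    zf.extractall("all_day_cityscapes")       # unpack next to the script
    print(zf.namelist()[:5])                  # peek at the first few entries
```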

  3. The impact of different hyper-parameter α.

    • plos.figshare.com
    xls
    Updated Sep 11, 2023
    Cite
    Jian Wei; Qinzhao Wang; Zixu Zhao (2023). The impact of different hyper-parameter α. [Dataset]. https://plos.figshare.com/articles/dataset/The_impact_of_different_hyper-parameter_/24120249
    Available download formats: xls
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jian Wei; Qinzhao Wang; Zixu Zhao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cross-domain object detection is a key problem in the research of intelligent detection models. Unlike many improved algorithms built on two-stage detection models, we take another route: a simple and efficient one-stage model that balances inference efficiency and detection precision and widens the range of cross-domain object detection problems it can handle. We name this gradient-reverse-layer-based model YOLO-G; it greatly improves object detection precision in cross-domain scenarios. Specifically, we add a feature alignment branch after the backbone, to which a gradient reverse layer and a classifier are attached. With only a small increase in computation, performance is substantially enhanced. Experiments such as Cityscapes→Foggy Cityscapes, SIM10k→Cityscapes, and PASCAL VOC→Clipart indicate that, compared with most state-of-the-art (SOTA) algorithms, the proposed model achieves much better mean Average Precision (mAP). Furthermore, ablation experiments on four components confirm the reliability of the model. The project is available at https://github.com/airy975924806/yolo-G.
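
    The gradient reverse layer named above is the standard DANN-style construction, so a minimal PyTorch sketch looks like the following. The layer sizes and the DomainClassifier name are illustrative, not the authors' code; alpha plays the role of the hyper-parameter α studied in this table:

```python
# Generic gradient reversal layer (GRL) of the kind attached after a backbone.
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)          # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # flip and scale the gradient flowing back into the backbone
        return grad_output.neg() * ctx.alpha, None

class DomainClassifier(nn.Module):
    """Tiny classifier behind the GRL; layer sizes are illustrative."""
    def __init__(self, in_features: int, alpha: float = 1.0):
        super().__init__()
        self.alpha = alpha
        self.net = nn.Sequential(
            nn.Linear(in_features, 256), nn.ReLU(),
            nn.Linear(256, 1),       # source vs. target domain logit
        )

    def forward(self, feat):
        return self.net(GradReverse.apply(feat, self.alpha))
```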

  4. CityPersons Dataset

    • paperswithcode.com
    • opendatalab.com
    • +1more
    Updated Mar 23, 2021
    Cite
    Shanshan Zhang; Rodrigo Benenson; Bernt Schiele (2021). CityPersons Dataset [Dataset]. https://paperswithcode.com/dataset/citypersons
    Authors
    Shanshan Zhang; Rodrigo Benenson; Bernt Schiele
    Description

    The CityPersons dataset is a subset of Cityscapes consisting only of person annotations. There are 2,975 images for training, 500 for validation, and 1,575 for testing, with an average of 7 pedestrians per image. Both visible-region and full-body annotations are provided.
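
    A sanity-check sketch over the split sizes and pedestrian counts quoted above. The per-image JSON annotation layout assumed here is purely illustrative, not the official CityPersons release format:

```python
# Sketch: verify split sizes and the average person count per image.
import json
from pathlib import Path

SPLITS = {"train": 2975, "val": 500, "test": 1575}  # from the description

def mean_pedestrians(ann_dir: str) -> float:
    """Average number of annotated persons per image in one split
    (assumes one JSON list of boxes per image; layout is hypothetical)."""
    files = sorted(Path(ann_dir).glob("*.json"))
    total = sum(len(json.loads(f.read_text())) for f in files)
    return total / max(len(files), 1)

if __name__ == "__main__":
    print(SPLITS)
    # print(mean_pedestrians("citypersons/annotations/train"))  # ~7 expected
```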

  5. The impact of training tricks used in YOLO.

    • plos.figshare.com
    xls
    Updated Sep 11, 2023
    Cite
    Jian Wei; Qinzhao Wang; Zixu Zhao (2023). The impact of training tricks used in YOLO. [Dataset]. http://doi.org/10.1371/journal.pone.0291241.t007
    Available download formats: xls
    Dataset provided by
    PLOS ONE
    Authors
    Jian Wei; Qinzhao Wang; Zixu Zhao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cross-domain object detection is a key problem in the research of intelligent detection models. Unlike many improved algorithms built on two-stage detection models, we take another route: a simple and efficient one-stage model that balances inference efficiency and detection precision and widens the range of cross-domain object detection problems it can handle. We name this gradient-reverse-layer-based model YOLO-G; it greatly improves object detection precision in cross-domain scenarios. Specifically, we add a feature alignment branch after the backbone, to which a gradient reverse layer and a classifier are attached. With only a small increase in computation, performance is substantially enhanced. Experiments such as Cityscapes→Foggy Cityscapes, SIM10k→Cityscapes, and PASCAL VOC→Clipart indicate that, compared with most state-of-the-art (SOTA) algorithms, the proposed model achieves much better mean Average Precision (mAP). Furthermore, ablation experiments on four components confirm the reliability of the model. The project is available at https://github.com/airy975924806/yolo-G.
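
    Building on the GradReverse sketch under entry 3, one adversarial training step that combines the detection loss with the domain-classification loss from the feature-alignment branch might look as follows. The detector interface and module names are assumptions in the spirit of the paper, not the YOLO-G code:

```python
# Sketch: one domain-adversarial training step for a one-stage detector.
import torch
from torch import nn

bce = nn.BCEWithLogitsLoss()

def train_step(detector, domain_head, optimizer, src_imgs, src_targets, tgt_imgs):
    optimizer.zero_grad()
    # supervised detection loss on labelled source-domain images
    det_loss, src_feat = detector(src_imgs, src_targets)   # assumed interface
    # unlabelled target-domain images contribute only features
    _, tgt_feat = detector(tgt_imgs, None)
    # domain loss on both domains; the GRL inside domain_head reverses gradients
    logits = torch.cat([domain_head(src_feat), domain_head(tgt_feat)])
    labels = torch.cat([torch.zeros(len(src_feat), 1),
                        torch.ones(len(tgt_feat), 1)])
    dom_loss = bce(logits, labels)
    (det_loss + dom_loss).backward()
    optimizer.step()
    return det_loss.item(), dom_loss.item()
```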

  6. Effect of additional modules on segmentation performance: Ablation study results in Cityscapes dataset.

    • figshare.com
    xls
    Updated Jan 16, 2025
    Cite
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh (2025). Effect of additional modules on segmentation performance: Ablation study results in Cityscapes dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0305561.t005
    Available download formats: xls
    Dataset provided by
    PLOS ONE
    Authors
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Effect of additional modules on segmentation performance: Ablation study results in Cityscapes dataset.

  7. HSICityV2: Urban Scene Understanding via Hyperspectral Images

    • zenodo.org
    bin
    Updated Aug 29, 2022
    Cite
    Yuxing Huang; Tianqi Ren; Qiu Shen; Ying Fu; Shaodi You (2022). HSICityV2: Urban Scene Understanding via Hyperspectral Images [Dataset]. http://doi.org/10.5281/zenodo.7030857
    Available download formats: bin
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yuxing Huang; Tianqi Ren; Qiu Shen; Ying Fu; Shaodi You
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Light across the full spectrum travels through the physical world. Trichromatic (RGB) human vision captures and understands it, and machine vision follows the analogy by using RGB cameras for semantic segmentation and scene understanding. We argue that such machine vision suffers from metamerism: different objects may appear in the same RGB color while actually being distinct in spectrum. While learning-based solutions, especially deep learning, have been heavily explored, they do not solve this fundamental physical limitation. In this paper, we propose to use hyperspectral images (HSIs), which capture hundreds of consecutive narrow bands from the real visible world, so that metamerism no longer exists. In short, we aim to 'see beyond human vision'. In practice, we introduce a novel large-scale, high-quality HSI dataset for semantic segmentation in cityscapes: the Hyperspectral City dataset. The dataset contains 1330 HSIs captured in typical urban driving scenes. Each HSI has a 1889×1422 spatial resolution and 128 spectral channels ranging from 450 nm to 950 nm. The dataset provides pixel-level semantic annotation, done manually by professional annotators. We believe this dataset enables a new direction for scene understanding.
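
    A hedged sketch of loading one cube and projecting it to a pseudo-RGB preview, using the dimensions quoted above. The raw float32 band-major layout and the band-to-wavelength mapping are assumptions; consult the dataset's own reader for the actual "bin" format:

```python
# Sketch: load a hyperspectral cube and pick three bands as a pseudo-RGB view.
import numpy as np

BANDS, HEIGHT, WIDTH = 128, 1422, 1889   # 450-950 nm over 128 channels

def load_cube(path: str) -> np.ndarray:
    """Assumed layout: flat float32, band-major (bands, height, width)."""
    cube = np.fromfile(path, dtype=np.float32)
    return cube.reshape(BANDS, HEIGHT, WIDTH)

def pseudo_rgb(cube: np.ndarray) -> np.ndarray:
    """Pick three bands roughly at blue/green/red wavelengths."""
    # band index ~ (wavelength - 450) / (950 - 450) * 127
    idx = [int((wl - 450) / 500 * 127) for wl in (460, 540, 640)]
    rgb = cube[idx][::-1]                 # reorder blue/green/red to R, G, B
    rgb = rgb / max(rgb.max(), 1e-6)      # normalise to [0, 1]
    return np.transpose(rgb, (1, 2, 0))   # (H, W, 3) for display
```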

  8. Performance comparison of the proposed CycleGAN with other SOTA deep generation models.

    • plos.figshare.com
    xls
    Updated Nov 30, 2023
    Cite
    Balaji Ganesh Rajagopal; Manish Kumar; Abdulaziz H. Alshehri; Fayez Alanazi; Ahmed farouk Deifalla; Ahmed M. Yosri; Abdelhalim Azam (2023). Performance comparison of the proposed CycleGAN with other SOTA deep generation models. [Dataset]. http://doi.org/10.1371/journal.pone.0293978.t003
    Available download formats: xls
    Dataset provided by
    PLOS ONE
    Authors
    Balaji Ganesh Rajagopal; Manish Kumar; Abdulaziz H. Alshehri; Fayez Alanazi; Ahmed farouk Deifalla; Ahmed M. Yosri; Abdelhalim Azam
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of the proposed CycleGAN with other SOTA deep generation models.

  9. ADE20K Dataset

    • paperswithcode.com
    Updated Apr 2, 2022
    Cite
    Bolei Zhou; Hang Zhao; Xavier Puig; Sanja Fidler; Adela Barriuso; Antonio Torralba (2022). ADE20K Dataset [Dataset]. https://paperswithcode.com/dataset/ade20k
    Authors
    Bolei Zhou; Hang Zhao; Xavier Puig; Sanja Fidler; Adela Barriuso; Antonio Torralba
    Description

    The ADE20K semantic segmentation dataset contains more than 20K scene-centric images exhaustively annotated with pixel-level object and object-part labels. There are 150 semantic categories in total, including stuff classes such as sky, road, and grass, and discrete objects such as person, car, and bed.
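
    A small sketch of tallying the 150 categories in one annotation map, assuming the common SceneParse150 convention (uint8 label PNGs with 0 = unlabeled and 1..150 = category ids); the example path is illustrative:

```python
# Sketch: per-image category histogram over ADE20K-style label maps.
import numpy as np
from PIL import Image

def category_histogram(label_png: str) -> dict:
    """Map category id -> pixel count, skipping 0 (unlabeled)."""
    labels = np.array(Image.open(label_png))
    ids, counts = np.unique(labels, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts) if i != 0}

# usage (path illustrative):
# category_histogram("ADEChallengeData2016/annotations/training/ADE_train_00000001.png")
```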

  10. Performance comparison of semantic segmentation methods on Cityscapes, DensPASS.

    • figshare.com
    xls
    Updated Jan 16, 2025
    Cite
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh (2025). Performance comparison of semantic segmentation methods on Cityscapes, DensPASS. [Dataset]. http://doi.org/10.1371/journal.pone.0305561.t006
    Available download formats: xls
    Dataset provided by
    PLOS ONE
    Authors
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of semantic segmentation methods on Cityscapes, DensPASS.

  11. DensePASS Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Oct 22, 2021
    Cite
    Chaoxiang Ma; Jiaming Zhang; Kailun Yang; Alina Roitberg; Rainer Stiefelhagen (2021). DensePASS Dataset [Dataset]. https://paperswithcode.com/dataset/densepass
    Authors
    Chaoxiang Ma; Jiaming Zhang; Kailun Yang; Alina Roitberg; Rainer Stiefelhagen
    Description

    DensePASS is a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, built specifically to study the pinhole-to-panoramic transfer and accompanied by pinhole-camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360-degree images, with the labelled data comprising 19 classes that explicitly match the categories available in the source-domain (i.e., pinhole) data.
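
    The 19 labelled classes correspond to the standard Cityscapes train ids, so keeping source (pinhole) and target (panoramic) labels aligned needs only a small lookup. The class list below follows the usual Cityscapes convention and is not copied from the DensePASS release itself:

```python
# The 19 standard Cityscapes train-id classes, in train-id order.
CITYSCAPES_TRAIN_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]

NAME_TO_TRAIN_ID = {name: i for i, name in enumerate(CITYSCAPES_TRAIN_CLASSES)}
assert len(NAME_TO_TRAIN_ID) == 19   # matches the 19 labelled DensePASS classes
```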

  12. Structure of improved ResNet50.

    • figshare.com
    xls
    Updated Jan 16, 2025
    Cite
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh (2025). Structure of improved ResNet50. [Dataset]. http://doi.org/10.1371/journal.pone.0305561.t002
    Available download formats: xls
    Dataset provided by
    PLOS ONE
    Authors
    Effat Sahragard; Hassan Farsi; Sajad Mohamadzadeh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper presents a novel method for improving semantic segmentation performance in computer vision tasks. Our approach uses an enhanced UNet architecture with an improved ResNet50 backbone: we replace the last layer of ResNet50 with deformable convolution to enhance feature representation. Additionally, we incorporate an attention mechanism, ECA-ASPP (attention spatial pyramid pooling), in the encoding path of UNet to capture multi-scale contextual information effectively. In the decoding path, we explore attention mechanisms applied after concatenating low-level with high-level features, investigating two variants: ECA (Efficient Channel Attention) and LKA (Large Kernel Attention). Our experiments demonstrate that incorporating attention after concatenation improves segmentation accuracy, and that the LKA module outperforms the ECA module in the decoder path. This finding highlights the importance of exploring different attention mechanisms and their impact on segmentation performance. We evaluate the proposed method on benchmark datasets, including Stanford and Cityscapes, as well as the newly introduced WildPASS and DensePASS datasets, achieving state-of-the-art results with mIoU of 85.79 on Stanford and 82.25 on Cityscapes.
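
    The ECA block named above is well known from ECA-Net, so a minimal PyTorch version is easy to sketch. The kernel size and the exact wiring into ECA-ASPP or the decoder are assumptions, not the authors' implementation:

```python
# Minimal Efficient Channel Attention (ECA) block, ECA-Net style.
import torch
from torch import nn

class ECA(nn.Module):
    """Channel attention via a 1-D conv over pooled channel descriptors."""
    def __init__(self, k_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                       # x: (N, C, H, W)
        y = self.pool(x)                        # (N, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)       # (N, 1, C)
        y = self.conv(y)                        # mix neighbouring channels
        y = self.sigmoid(y).transpose(1, 2).unsqueeze(-1)  # (N, C, 1, 1)
        return x * y                            # channel-wise reweighting
```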

  13. RailSem19 Dataset

    • paperswithcode.com
    Updated Jun 15, 2019
    + more versions
    Cite
    Oliver Zendel; Markus Murschitz; Marcel Zeilinger; Daniel Steininger; Sara Abbasi; Csaba Beleznai (2019). RailSem19 Dataset [Dataset]. https://paperswithcode.com/dataset/railsem19
    Authors
    Oliver Zendel; Markus Murschitz; Marcel Zeilinger; Daniel Steininger; Sara Abbasi; Csaba Beleznai
    Description

    RailSem19 offers 8500 unique images taken from the ego-perspective of a rail vehicle (trains and trams). Extensive semantic annotations are provided, both geometry-based (rail-relevant polygons, all rails as polylines) and as dense label maps with many Cityscapes-compatible road labels. Many frames show areas of intersection between road and rail vehicles (railway crossings, trams driving on city streets). RailSem19 is useful for rail and road applications alike.

    Image credit: https://wilddash.cc/railsem19
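
    A hedged sketch of pulling the geometry-based rail polylines out of one per-frame annotation file. The JSON schema assumed here ("objects" entries with a "polyline" point list) is illustrative; check the official RailSem19 download for the real field names:

```python
# Sketch: extract rail polylines from one per-frame annotation JSON.
import json

def rail_polylines(json_path: str):
    """Return every polyline (list of (x, y) points) found in one frame."""
    with open(json_path) as f:
        frame = json.load(f)
    return [obj["polyline"] for obj in frame.get("objects", [])
            if "polyline" in obj]
```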

