4 datasets found
  1. Flickr30k

    • opendatalab.com
    • datasets.activeloop.ai
    • +2 more
    zip
    Updated Mar 17, 2023
    Cite
    University of Illinois Urbana-Champaign (2023). Flickr30k [Dataset]. https://opendatalab.com/OpenDataLab/Flickr30k
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 17, 2023
    Dataset provided by
    University of Illinois Urbana-Champaign
    License

    Flickr Terms of Use: https://www.flickr.com/help/terms/

    Description

    To produce the denotation graph, we have created an image caption corpus consisting of 158,915 crowd-sourced captions describing 31,783 images. This is an extension of our previous Flickr 8k Dataset. The new images and captions focus on people involved in everyday activities and events. Use of the images must abide by the Flickr Terms of Use. We do not own the copyright of the images.
    (A minimal caption-loading sketch appears after the result list.)

  2. Flickr Image

    • opendatalab.com
    zip
    Updated Sep 13, 2022
    Cite
    University of Illinois Urbana-Champaign (2022). Flickr Image [Dataset]. https://opendatalab.com/OpenDataLab/Flickr_Image
    Explore at:
    Available download formats: zip (8,859,405,232 bytes)
    Dataset updated
    Sep 13, 2022
    Dataset provided by
    University of Illinois Urbana-Champaign
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Flickr30k dataset has become a standard benchmark for sentence-based image description. This paper presents Flickr30k Entities, which augments the 158k captions from Flickr30k with 244k coreference chains, linking mentions of the same entities across different captions for the same image, and associating them with 276k manually annotated bounding boxes. Such annotations are essential for continued progress in automatic image description and grounded language understanding. They enable us to define a new benchmark for localization of textual entity mentions in an image. We present a strong baseline for this task that combines an image-text embedding, detectors for common objects, a color classifier, and a bias towards selecting larger objects.
    (A sketch of parsing entity-mention markup appears after the result list.)

  3. BLEU scores for different datasets in different languages.

    • plos.figshare.com
    xls
    Updated Jun 2, 2025
    Cite
    Rimsha Muzaffar; Syed Yasser Arafat; Junaid Rashid; Jungeun Kim; Usman Naseem (2025). BLEU scores for different datasets in different languages. [Dataset]. http://doi.org/10.1371/journal.pone.0320701.t009
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Rimsha Muzaffar; Syed Yasser Arafat; Junaid Rashid; Jungeun Kim; Usman Naseem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BLEU scores for different datasets in different languages.
    (A sketch of computing corpus-level BLEU appears after the result list.)

  4. Localized Narratives Dataset

    • paperswithcode.com
    + more versions
    Cite
    Jordi Pont-Tuset; Jasper Uijlings; Soravit Changpinyo; Radu Soricut; Vittorio Ferrari, Localized Narratives Dataset [Dataset]. https://paperswithcode.com/dataset/localized-narratives
    Explore at:
    Authors
    Jordi Pont-Tuset; Jasper Uijlings; Soravit Changpinyo; Radu Soricut; Vittorio Ferrari
    Description

    We propose Localized Narratives, a new form of multimodal image annotations connecting vision and language. We ask annotators to describe an image with their voice while simultaneously hovering their mouse over the region they are describing. Since the voice and the mouse pointer are synchronized, we can localize every single word in the description. This dense visual grounding takes the form of a mouse trace segment per word and is unique to our data. We annotated 849k images with Localized Narratives: the whole COCO, Flickr30k, and ADE20K datasets, and 671k images of Open Images, all of which we make publicly available. We provide an extensive analysis of these annotations showing they are diverse, accurate, and efficient to produce. We also demonstrate their utility on the application of controlled image captioning.
    (A sketch of reading per-word trace segments appears after the result list.)

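For the Flickr30k captions (result 1), the sketch below groups the roughly five crowd-sourced captions per image. It assumes the captions ship as a tab-separated file in which each line pairs an image file name plus a "#" caption index with one caption (the results_20130124.token layout seen in common distributions of the dataset); the file name and column layout are assumptions, not something stated in this listing.

```python
from collections import defaultdict

def load_flickr30k_captions(path):
    """Group Flickr30k captions by image.

    Assumes a tab-separated file where each line looks like
    '1000092795.jpg#0<TAB>Two young guys ...' -- the image file name,
    a '#' plus caption index, then the caption text (assumed layout).
    """
    captions = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, caption = line.rstrip("\n").split("\t", 1)
            image_id = key.split("#", 1)[0]   # drop the '#<index>' suffix
            captions[image_id].append(caption)
    return captions

# captions = load_flickr30k_captions("results_20130124.token")  # assumed file name
# print(len(captions))                  # ~31,783 images
# print(captions["1000092795.jpg"])     # ~5 captions per image
```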
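For Flickr30k Entities (result 2), the sketch below pulls coreference-chain mentions out of one annotated caption. The bracketed markup [/EN#&lt;chain id&gt;/&lt;types&gt; &lt;phrase&gt;] is an assumption about how the sentence annotations are distributed; chain ids would then be joined against the per-image bounding-box annotations, which are not parsed here.

```python
import re
from dataclasses import dataclass

# Bracketed mention markup assumed to look like: [/EN#283585/people A man]
MENTION_RE = re.compile(r"\[/EN#(?P<chain>\d+)/(?P<types>[^ \]]+) (?P<phrase>[^\]]+)\]")

@dataclass
class Mention:
    chain_id: int   # coreference chain shared across the image's captions
    types: list     # coarse entity types, e.g. ['people']
    phrase: str     # surface text of the mention

def parse_annotated_caption(caption):
    """Extract entity mentions from one annotated caption (assumed markup)."""
    return [Mention(int(m.group("chain")),
                    m.group("types").split("/"),
                    m.group("phrase"))
            for m in MENTION_RE.finditer(caption)]

# demo = "[/EN#283585/people A man] walks [/EN#283586/animals his dog] on a leash ."
# for mention in parse_annotated_caption(demo):
#     print(mention.chain_id, mention.types, mention.phrase)
```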
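For the BLEU table (result 3), the listing does not say which BLEU variant or tokenizer was used, so the sketch below only shows a generic corpus-level BLEU computation with NLTK on whitespace-tokenized toy data.

```python
# pip install nltk
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One list of reference translations (each a list of token lists) per hypothesis;
# tokenization here is plain whitespace splitting of toy sentences.
references = [
    [["a", "man", "rides", "a", "horse", "on", "the", "beach"]],
    [["two", "dogs", "play", "in", "the", "snow"]],
]
hypotheses = [
    ["a", "man", "is", "riding", "a", "horse", "on", "the", "beach"],
    ["two", "dogs", "are", "playing", "in", "snow"],
]

# Default 4-gram BLEU, smoothed so short sentences do not zero out.
score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"corpus BLEU: {score:.3f}")
```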
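For Localized Narratives (result 4), the sketch below recovers the mouse-trace points recorded while a single word was spoken, assuming a JSON Lines record with a timed caption ('utterance', 'start_time', 'end_time') and trace points ('x', 'y', 't') in normalized image coordinates; the field and file names are assumptions modeled on the dataset's public release, not confirmed by this listing.

```python
import json

def trace_segment_for_word(record, word_index):
    """Return (word, trace points) for one word of a localized narrative.

    Assumes 'timed_caption' is a list of {'utterance', 'start_time', 'end_time'}
    and 'traces' is a list of segments, each a list of {'x', 'y', 't'} points.
    Field names are assumptions.
    """
    word = record["timed_caption"][word_index]
    start, end = word["start_time"], word["end_time"]
    points = [p for segment in record["traces"] for p in segment
              if start <= p["t"] <= end]
    return word["utterance"], points

# with open("open_images_train_localized_narratives.jsonl") as f:  # assumed file name
#     record = json.loads(next(f))
#     utterance, points = trace_segment_for_word(record, 0)
#     print(utterance, len(points))
```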
