8 datasets found
  1. Flickr8k Dataset

    • academictorrents.com
    bittorrent
    Updated Mar 9, 2019
    Cite
    Micah Hodosh and Peter Young and Julia Hockenmaier (2019). Flickr8k Dataset [Dataset]. https://academictorrents.com/details/9dea07ba660a722ae1008c4c8afdd303b6f6e53b
    Available download formats: bittorrent (1117760547)
    Dataset updated
    Mar 9, 2019
    Dataset authored and provided by
    Micah Hodosh and Peter Young and Julia Hockenmaier
    License

    https://academictorrents.com/nolicensespecified

    Description

    8,000 photos and up to 5 captions for each photo.

    We introduce a new benchmark collection for sentence-based image description and search, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. … The images were chosen from six different Flickr groups, and tend not to contain any well-known people or locations, but were manually selected to depict a variety of scenes and situations.

    Citation: Hodosh, Micah, Peter Young, and Julia Hockenmaier. "Framing image description as a ranking task: Data, models and evaluation metrics." Journal of Artificial Intelligence Research 47 (2013): 853-899.

  2. FLickr 8k dataset

    • kaggle.com
    Updated Feb 10, 2023
    Cite
    Yash_Oza_12 (2023). FLickr 8k dataset [Dataset]. https://www.kaggle.com/datasets/yashoza12/flickr-8k-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 10, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yash_Oza_12
    Description

    This dataset consists of images, captions and a pickle file which contains the features of images extracted by VGG-16. This dataset was not made by me. I have used this dataset for a joint embeddings project.

  3. Flickr 8k Images

    • kaggle.com
    Updated Jun 30, 2025
    Cite
    Parth Deshmukh (2025). Flickr 8k Images [Dataset]. https://www.kaggle.com/datasets/parthd144/flickr-8k-images/data
    Explore at:
    Croissant
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    Kaggle
    Authors
    Parth Deshmukh
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Image dataset from the Flickr website. Contains a total of 8,091 images with corresponding captions. Mainly used for image caption generation.

    Tasks supported:

    1. Image Feature Extraction
    2. Image caption generation

  4. Flickr 8k Audio Caption Corpus

    • kaggle.com
    Updated Apr 27, 2023
    Cite
    Chirag Chauhan (2023). Flickr 8k Audio Caption Corpus [Dataset]. https://www.kaggle.com/datasets/warcoder/flickr-8k-audio-caption-corpus/data
    Explore at:
    Croissant
    Dataset updated
    Apr 27, 2023
    Dataset provided by
    Kaggle
    Authors
    Chirag Chauhan
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    The wavs/ directory contains 40,000 spoken audio captions in .wav audio format, one for each caption included in the train, dev, and test splits in the original Flickr 8k corpus (as defined by the files Flickr_8k.trainImages.txt, Flickr_8k.devImages.txt, and Flickr_8k.testImages.txt).

    The audio is sampled at 16000 Hz with 16-bit depth, and stored in Microsoft WAVE audio format.
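
As a rough illustration of the audio format stated above, the following sketch checks whether a .wav file is sampled at 16000 Hz with 16-bit depth, using Python's standard `wave` module. The function and constant names are hypothetical and are not part of the corpus.

```python
import wave

# Values taken from the corpus description above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit depth = 2 bytes per sample


def matches_corpus_format(path: str) -> bool:
    """Return True if the WAVE file at `path` is 16000 Hz, 16-bit audio.

    Hypothetical helper for illustration only; it inspects the file
    header and does not validate the audio content itself.
    """
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getframerate() == EXPECTED_RATE_HZ
            and wav_file.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
        )
```

A call such as `matches_corpus_format("wavs/some_file.wav")` (with a placeholder path) would flag files that have been resampled or re-encoded.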

    The file wav2capt.txt contains a mapping from the .wav file names to the corresponding .jpg images and the caption number. The .jpg file names and caption numbers can then be mapped to the caption text via the Flickr8k.token.txt file from the original Flickr 8k corpus.

    The file wav2spk.txt contains a mapping from the .wav file names to their speakers. Each unique speaker is numbered consecutively from 1 to 183 (the total number of unique speakers).
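
The two-step lookup described above (.wav name to image and caption number, then to caption text) can be sketched as follows. The line layouts are assumptions that the description does not spell out: `<name>.wav <name>.jpg #<n>` for wav2capt.txt and `<name>.jpg#<n><TAB><caption>` for Flickr8k.token.txt. All function names and sample lines are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def parse_wav2capt(lines: Iterable[str]) -> Dict[str, Tuple[str, int]]:
    """Map each .wav name to (jpg name, caption number).

    Assumed line layout: "<name>.wav <name>.jpg #<n>".
    """
    mapping: Dict[str, Tuple[str, int]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        wav_name, jpg_name, caption_field = fields
        caption_no = caption_field.lstrip("#")
        if not caption_no.isdigit():
            continue
        mapping[wav_name] = (jpg_name, int(caption_no))
    return mapping


def parse_token_file(lines: Iterable[str]) -> Dict[Tuple[str, int], str]:
    """Map (jpg name, caption number) to caption text.

    Assumed line layout: "<name>.jpg#<n><TAB><caption text>".
    """
    captions: Dict[Tuple[str, int], str] = {}
    for line in lines:
        key, tab, text = line.rstrip("\n").partition("\t")
        jpg_name, hash_mark, caption_no = key.partition("#")
        if not tab or not hash_mark or not caption_no.isdigit():
            continue
        captions[(jpg_name, int(caption_no))] = text.strip()
    return captions


def caption_for_wav(
    wav_name: str,
    wav2capt: Dict[str, Tuple[str, int]],
    captions: Dict[Tuple[str, int], str],
) -> str:
    """Resolve a .wav file name to its caption text via the two mappings."""
    return captions[wav2capt[wav_name]]


# Made-up sample lines standing in for the real files.
sample_wav2capt = ["example_image_0.wav example_image.jpg #0\n"]
sample_tokens = ["example_image.jpg#0\tA dog runs across a field .\n"]

wav2capt = parse_wav2capt(sample_wav2capt)
captions = parse_token_file(sample_tokens)
print(caption_for_wav("example_image_0.wav", wav2capt, captions))
```

With the real files, the sample lists would be replaced by open file handles. The wav2spk.txt mapping is not covered here because its layout is not described beyond file name and speaker number.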

    Citing:

    D. Harwath and J. Glass, "Deep Multimodal Semantic Embeddings for Speech and Images," 2015 IEEE Automatic Speech Recognition and Understanding Workshop, pp. 237-244, Scottsdale, Arizona, USA, December 2015

    M. Hodosh, P. Young and J. Hockenmaier (2013) "Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics", Journal of Artificial Intelligence Research, Volume 47, pages 853-899 https://www.jair.org/index.php/jair/article/view/10833/25854

  5. Flickr Datasets

    • zenodo.org
    bin, zip
    Updated Jun 27, 2024
    Cite
    V D; V D (2024). Flickr Datasets [Dataset]. http://doi.org/10.5281/zenodo.12572919
    Available download formats: bin, zip
    Dataset updated
    Jun 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    V D; V D
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Flickr image-to-text pair datasets (8k and 30k) each contain five captions per image.

  6. Flickr8k Image Dataset

    • gts.ai
    json
    Updated Jan 25, 2025
    Cite
    GTS (2025). Flickr8k Image Dataset [Dataset]. https://gts.ai/dataset-download/page/83/
    Available download formats: json
    Dataset updated
    Jan 25, 2025
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Explore the Flickr 8k Image Dataset, featuring 8,092 images with descriptive captions, perfect for machine learning beginners.

  7. Eye For Blind Dataset

    • kaggle.com
    Updated Oct 13, 2023
    Cite
    Venkata Hemanth Gubbala (2023). Eye For Blind Dataset [Dataset]. https://www.kaggle.com/datasets/venkatahemanthg/eye-for-blind-dataset
    Explore at:
    Croissant
    Dataset updated
    Oct 13, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Venkata Hemanth Gubbala
    Description

    Dataset

    This dataset was created by Venkata Hemanth Gubbala

    Released under Other (specified in description)


  8. Flickr30k

    • opendatalab.com
    • datasets.activeloop.ai
    • +1 more
    zip
    Updated Mar 17, 2023
    Cite
    University of Illinois Urbana-Champaign (2023). Flickr30k [Dataset]. https://opendatalab.com/OpenDataLab/Flickr30k
    Available download formats: zip
    Dataset updated
    Mar 17, 2023
    Dataset provided by
    University of Illinois Urbana-Champaign
    License

    https://www.flickr.com/help/terms/

    Description

    To produce the denotation graph, we have created an image caption corpus consisting of 158,915 crowd-sourced captions describing 31,783 images. This is an extension of our previous Flickr 8k Dataset. The new images and captions focus on people involved in everyday activities and events. Use of the images must abide by the Flickr Terms of Use. We do not own the copyright of the images.
