4 datasets found
  1. ForestFire-VQA Dataset: A Benchmark for Multimodal Understanding of Forest Fire Scenes

    • kaggle.com
    zip
    Updated Oct 24, 2025
    Cite
    Charuni Kosala (2025). ForestFire-VQA Dataset [Dataset]. https://www.kaggle.com/datasets/charunikosala/forestfire-vqa-dataset
    Explore at:
    zip (39,658,074 bytes)
    Dataset updated
    Oct 24, 2025
    Authors
    Charuni Kosala
    Description

    The Forest Fire Visual Question Answering (FF-VQA) Dataset is a multimodal dataset designed to train and evaluate AI models capable of interpreting visual information about forest fires and responding to natural language questions. It combines images, audio recordings, and tabular data related to forest fires with corresponding human-annotated questions and answers to enable advanced reasoning, visual understanding, and decision support in environmental monitoring systems. The dataset includes question–answer pairs in two languages, Sinhala and English, supporting multilingual AI research and cross-lingual model development. This dataset aims to advance research in visual reasoning, environmental AI, disaster response automation, and remote sensing analysis. It provides rich, high-quality annotations for supervised training and benchmarking in both computer vision and natural language understanding tasks.

    Data Collection Method: Images, audio, and tabular data were collected from publicly available wildfire imagery databases and verified open-source repositories, including sources such as the Kaggle Wildfire Archives. Visual Question Answering (VQA) pairs were created, annotated, and validated by domain experts in forest fire research, including professors and forest department officers, to ensure accuracy, contextual relevance, and domain consistency. The question–answer pairs were reviewed for grammatical correctness and linguistic clarity, then categorized into five thematic areas: situation awareness, safety and risk assessment, incident analysis and restoration, prevention and continuous learning, and environmental assessment. Each pair was labeled in both Sinhala and English to support bilingual training and evaluation for multilingual VQA models. A minimal sketch of one such record is shown below.
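
    The description above implies a simple per-record structure: one image (plus optional audio/tabular context), a Sinhala/English question–answer pair, and one of five thematic categories. The Python sketch below illustrates such a record; all field names are assumptions for illustration, not the actual column names inside the Kaggle zip.

    from dataclasses import dataclass

    # The five thematic areas named in the data collection notes above.
    CATEGORIES = {
        "situation awareness",
        "safety and risk assessment",
        "incident analysis and restoration",
        "prevention and continuous learning",
        "environmental assessment",
    }

    @dataclass
    class FFVQARecord:
        # All field names here are illustrative assumptions, not the archive's real columns.
        image_path: str   # wildfire image inside the zip
        question_en: str  # English question
        question_si: str  # Sinhala question
        answer_en: str    # English answer
        answer_si: str    # Sinhala answer
        category: str     # one of the five thematic areas

        def __post_init__(self) -> None:
            if self.category not in CATEGORIES:
                raise ValueError(f"unknown category: {self.category!r}")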

  2. Table_1_Evaluation of Deep Learning-Based Automated Detection of Primary Spine Tumors on MRI Using the Turing Test.docx

    • frontiersin.figshare.com
    docx
    Updated Jun 3, 2023
    Cite
    Hanqiang Ouyang; Fanyu Meng; Jianfang Liu; Xinhang Song; Yuan Li; Yuan Yuan; Chunjie Wang; Ning Lang; Shuai Tian; Meiyi Yao; Xiaoguang Liu; Huishu Yuan; Shuqiang Jiang; Liang Jiang (2023). Table_1_Evaluation of Deep Learning-Based Automated Detection of Primary Spine Tumors on MRI Using the Turing Test.docx [Dataset]. http://doi.org/10.3389/fonc.2022.814667.s001
    Explore at:
    docx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Hanqiang Ouyang; Fanyu Meng; Jianfang Liu; Xinhang Song; Yuan Li; Yuan Yuan; Chunjie Wang; Ning Lang; Shuai Tian; Meiyi Yao; Xiaoguang Liu; Huishu Yuan; Shuqiang Jiang; Liang Jiang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Recently, the Turing test has been used to investigate whether machines have intelligence similar to humans. Our study aimed to assess the ability of an artificial intelligence (AI) system for spine tumor detection using the Turing test.
    Methods: Our retrospective study data included 12,179 images from 321 patients for developing the AI detection system and 6,635 images from 187 patients for the Turing test. We utilized a deep learning-based tumor detection system with a Faster R-CNN architecture, which generates region proposals with a Region Proposal Network in the first stage and corrects the position and size of the bounding box of the lesion area in the second stage. Each choice question featured four bounding boxes enclosing an identical tumor. Three were detected by the proposed deep learning model, whereas the fourth was annotated by a doctor; the results were shown to six doctors as respondents. If a respondent did not correctly identify the image annotated by a human, the answer was considered a misclassification. If all misclassification rates were >30%, the respondents were considered unable to distinguish the AI-detected tumor from the human-annotated one, which indicated that the AI system passed the Turing test.
    Results: The average misclassification rates in the Turing test were 51.2% (95% CI: 45.7%–57.5%) in the axial view (maximum of 62%, minimum of 44%) and 44.5% (95% CI: 38.2%–51.8%) in the sagittal view (maximum of 59%, minimum of 36%). The misclassification rates of all six respondents were >30%; therefore, our AI system passed the Turing test.
    Conclusion: Our proposed intelligent spine tumor detection system has a detection ability similar to that of the annotating doctors and may be an efficient tool to assist radiologists or orthopedists in primary spine tumor detection.
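
    The pass criterion described in the Methods reduces to simple arithmetic: for each respondent, the misclassification rate is the fraction of questions where the human-annotated box was not identified, and the system passes only if every respondent's rate exceeds 30%. The sketch below illustrates that computation; the answer flags are made-up placeholders, not study data.

    from typing import Sequence

    def misclassification_rate(correct_flags: Sequence[bool]) -> float:
        # Fraction of questions where the human-annotated box was NOT identified.
        return 1.0 - sum(correct_flags) / len(correct_flags)

    def passes_turing_test(per_respondent_flags: Sequence[Sequence[bool]],
                           threshold: float = 0.30) -> bool:
        # The study's criterion: every respondent's misclassification rate must exceed 30%.
        return all(misclassification_rate(flags) > threshold
                   for flags in per_respondent_flags)

    # Toy placeholders (True = respondent picked the human-annotated box), not study data.
    toy = [
        [True, False, False, True, False],   # 60% misclassification
        [False, False, True, False, True],   # 60% misclassification
    ]
    print(passes_turing_test(toy))  # True: both rates exceed the 30% threshold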

  3. 2,504 Images – Chinese Handwriting OCR Data

    • nexdata.ai
    Updated Nov 11, 2023
    Cite
    Nexdata (2023). 2,504 Images – Chinese Handwriting OCR Data [Dataset]. https://www.nexdata.ai/datasets/ocr/1333
    Explore at:
    Dataset updated
    Nov 11, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Data size, Data format, Writing type, Accuracy rate, Data diversity, Writing content, Annotation content, Photographic angle, Collecting environment
    Description

    2,504 Images – Chinese Handwriting OCR Data. The writing environments include A4 paper, square paper, lined paper, whiteboard, color notes, answer sheets, etc. The writing contents include poetry, prose, store activity notices, greetings, wish lists, excerpts, compositions, notes, etc. The data diversity covers multiple writing papers, multiple fonts, multiple writing contents, and multiple photographic angles. The collection angles are a looking-up angle and an eye-level angle. For annotation, line-level/column-level quadrilateral bounding boxes and transcriptions of the texts are provided. The dataset can be used for tasks such as Chinese handwriting OCR.
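
    As a rough illustration of what a line-level quadrilateral bounding box with transcription can look like, here is a minimal sketch; the JSON layout and key names are assumptions for illustration only and may not match Nexdata's actual delivery format.

    import json

    # Key names ("quad", "transcription", etc.) are assumptions, not the vendor schema.
    annotation = {
        "image": "sample_0001.jpg",
        "lines": [
            {
                # Four (x, y) corner points of one handwritten line, clockwise from top-left, in pixels.
                "quad": [[120, 85], [840, 92], [838, 140], [118, 133]],
                "transcription": "山重水复疑无路，柳暗花明又一村。",
            }
        ],
    }

    print(json.dumps(annotation, ensure_ascii=False, indent=2))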

  4. Viet-Sketches-VQA

    • huggingface.co
    Updated Oct 14, 2024
    Cite
    Fifth Civil Defender - 5CD (2024). Viet-Sketches-VQA [Dataset]. https://huggingface.co/datasets/5CD-AI/Viet-Sketches-VQA
    Explore at:
    Dataset updated
    Oct 14, 2024
    Dataset authored and provided by
    Fifth Civil Defender - 5CD
    Description

    Dataset Overview

    This dataset was created from 3,088 Vietnamese sketch 🇻🇳 images from books. Each image has been analyzed and annotated using advanced Visual Question Answering (VQA) techniques to produce a comprehensive dataset. It contains a set of 18,000 detailed descriptions and query-based questions and answers generated by the Gemini 1.5 Flash model, currently Google's leading model on the WildVision Arena Leaderboard. This results in a richly annotated dataset, ideal for… See the full description on the dataset page: https://huggingface.co/datasets/5CD-AI/Viet-Sketches-VQA.
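
    Since the dataset is hosted on the Hugging Face Hub, it can presumably be loaded with the datasets library as sketched below; the split name is an assumption, so inspect the loaded object for the actual splits and column names rather than relying on this sketch.

    from datasets import load_dataset  # pip install datasets

    # "train" split assumed; check the dataset page for the real splits and schema.
    ds = load_dataset("5CD-AI/Viet-Sketches-VQA", split="train")
    print(ds)            # prints the actual column names and row count
    print(ds[0].keys())  # inspect one record rather than guessing field names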
