
    Simple Multimodal Algorithmic Reasoning Task Dataset (SMART-101)

    • data.niaid.nih.gov
    Updated Mar 28, 2023
    Cite
    Lohit, Suhas (2023). Simple Multimodal Algorithmic Reasoning Task Dataset (SMART-101) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7761799
    Dataset provided by
    Lohit, Suhas
    Cherian, Anoop
    Tenenbaum, Joshua B.
    Peng, Kuan-Chuan
    Smith, Kevin A.
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Introduction

    Recent times have witnessed an increasing number of applications of deep neural networks to tasks that require superior cognitive abilities, e.g., playing Go, generating art, and ChatGPT. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills? To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task (and the associated SMART-101 dataset) for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for young children (ages 6-8). Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and solving it requires a mix of several elementary skills, including pattern recognition, algebra, and spatial reasoning, among others. To train deep neural networks, we programmatically augment each puzzle to 2,000 new instances; each instance varies in its appearance, associated natural language question, and solution. To foster research and progress in the quest for artificial general intelligence, we are publicly releasing the SMART-101 dataset, consisting of the full set of programmatically generated instances of the 101 puzzles and their solutions.

    The dataset was introduced in our paper Are Deep Neural Networks SMARTer than Second Graders? by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin A. Smith, and Joshua B. Tenenbaum, CVPR 2023.

    Files in the unzipped folder:

    ./README.md: This Markdown file

    ./SMART101-Data: Folder containing all the puzzle data. See below for details.

    ./puzzle_type_info.csv: Puzzle categorization (into 8 skill classes).

    Dataset Organization

    The dataset consists of 101 folders (numbered 1-101); each folder corresponds to one distinct puzzle (the root puzzle). There are 2000 puzzle instances programmatically created for each root puzzle, numbered 1-2000. Every root puzzle folder (index in [1,101]) contains: (i) img/ and (ii) puzzle_.csv. The folder img/ holds the puzzle instance images, and puzzle_.csv contains the non-image part of a puzzle. Specifically, a row of puzzle_.csv is the tuple `id, Question, image, A, B, C, D, E, Answer`, where `id` is the puzzle instance id (in [1,2000]), `Question` is the puzzle question associated with the instance, `image` is the name of the image (in the img/ folder) corresponding to this instance id, `A, B, C, D, E` are the five answer candidates, and `Answer` is the answer to the question.

    At a Glance

    The size of the unzipped dataset is ~12GB.

    The dataset consists of 101 folders (numbered from 1-101); each folder corresponds to one distinct puzzle (root puzzle).

    There are 2000 puzzle instances programmatically created for each root puzzle, numbered from 1-2000.

    Every root puzzle index (in [1,101]) folder contains: (i) img/ and (ii) puzzle_.csv.

    The folder img/ is the location where the puzzle instance images are stored, and puzzle_.csv contains the non-image part of a puzzle. Specifically, a row of puzzle_.csv is the tuple `id, Question, image, A, B, C, D, E, Answer`, where `id` is the puzzle instance id (in [1,2000]), `Question` is the puzzle question associated with the instance, `image` is the name of the image (in the img/ folder) corresponding to this instance id, `A, B, C, D, E` are the five answer candidates, and `Answer` is the correct answer to the question.
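A row of puzzle_.csv can be parsed with the standard csv module. The sketch below is illustrative only: the header names and sample values are assumptions based on the tuple description above, not the exact contents of the released files.

```python
import csv
import io

# Hypothetical sample row, assuming the column order described above:
# id, Question, image, A, B, C, D, E, Answer.
SAMPLE = """id,Question,image,A,B,C,D,E,Answer
1,How many apples are there?,puzzle_1_1.png,3,4,5,6,7,B
"""

def load_puzzle_rows(text):
    """Parse the non-image part of a puzzle into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

rows = load_puzzle_rows(SAMPLE)
first = rows[0]
# `Answer` names one of the five candidate columns A-E, so the correct
# answer's value is looked up indirectly.
correct_value = first[first["Answer"]]
```

In real use, `text` would come from reading `SMART101-Data/<puzzle>/puzzle_.csv`, and the corresponding image would be loaded from the puzzle's img/ folder using the `image` field.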

    Other Details

    In our paper Are Deep Neural Networks SMARTer than Second Graders?, we provide four dataset splits for evaluation: (i) Instance Split (IS), (ii) Answer Split (AS), (iii) Puzzle Split (PS), and (iv) Few-shot Split (FS). Below, we provide the details of each split to enable fair comparisons to the results reported in our paper.

    Puzzle Split (PS)

    We use the following root puzzle ids as the Train and Test sets.

    | Split   | Root Puzzle Id Sets |
    |---------|---------------------|
    | `Test`  | {94, 95, 96, 97, 98, 99, 101, 61, 62, 65, 66, 67, 69, 70, 71, 72, 73, 74, 75, 76, 77} |
    | `Train` | {1, 2, ..., 101} \ Test |

    Evaluation is performed on all the Test puzzles, and their accuracies are averaged. For the Test puzzles, we use instance indices 1701-2000 in the evaluation.
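The PS sets above can be expressed directly in code; Train is simply the complement of Test within {1, ..., 101}:

```python
# Root puzzle ids held out for testing in the Puzzle Split (PS),
# as listed in the table above.
TEST_PUZZLES = {94, 95, 96, 97, 98, 99, 101, 61, 62, 65, 66, 67,
                69, 70, 71, 72, 73, 74, 75, 76, 77}

# Train = {1, 2, ..., 101} \ Test
TRAIN_PUZZLES = set(range(1, 102)) - TEST_PUZZLES

# Per the evaluation protocol, Test puzzles are evaluated on
# instance indices 1701-2000.
TEST_INSTANCE_IDS = range(1701, 2001)
```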

    Few-shot Split (FS)

    We randomly select k instances from the Test-set puzzles (those used in the PS split above) for training in the FS split (e.g., k=100). These k few-shot samples are taken from instance indices 1-1600 of the respective puzzles, and evaluation is conducted on instance ids 1701-2000.
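A minimal sketch of the FS sampling, under the stated index ranges (the seeding strategy here is an assumption for reproducibility, not part of the protocol):

```python
import random

def few_shot_sample(k, seed=0):
    """Draw k distinct training instance ids from the 1-1600 range
    of a PS Test puzzle; evaluation stays on ids 1701-2000."""
    rng = random.Random(seed)
    return rng.sample(range(1, 1601), k)

shots = few_shot_sample(100)  # e.g., k=100 as in the text
```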

    Instance Split (IS)

    We split the instances under every root puzzle as: Train = 1-1600, Val = 1601-1700, Test = 1701-2000. We train the neural network models using the Train split puzzle instances from all the root puzzles together and evaluate on the Test split of all puzzles.
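The IS index ranges above map every instance id to a split; a small helper (hypothetical, for illustration) makes the boundaries explicit:

```python
def split_of(instance_id):
    """Map an instance id (1-2000) to its Instance Split (IS) name:
    Train = 1-1600, Val = 1601-1700, Test = 1701-2000."""
    if 1 <= instance_id <= 1600:
        return "train"
    if 1601 <= instance_id <= 1700:
        return "val"
    if 1701 <= instance_id <= 2000:
        return "test"
    raise ValueError(f"instance id out of range: {instance_id}")
```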

    Answer Split (AS)

    For every root puzzle, we find the median answer value among all 2000 instances, and we use the set of instances whose answer equals that median as the Test set for evaluation (this set is excluded from the training of the neural networks).
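The AS selection can be sketched as follows; the answer values here are made up for illustration, and this assumes numeric answers for the median to be well-defined:

```python
from statistics import median

def answer_split_test_ids(answers):
    """Given answers: dict mapping instance id -> numeric answer value,
    return the ids whose answer equals the median answer value.
    These instances form the AS Test set and are excluded from training."""
    med = median(answers.values())
    return sorted(i for i, a in answers.items() if a == med)

example = {1: 3, 2: 5, 3: 5, 4: 7, 5: 9}   # hypothetical answers
held_out = answer_split_test_ids(example)  # ids with answer == 5
```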

    Puzzle Categorization

    Please see puzzle_type_info.csv for details on the categorization of the puzzles into eight classes, namely (i) counting, (ii) logic, (iii) measure, (iv) spatial, (v) arithmetic, (vi) algebra, (vii) pattern finding, and (viii) path tracing.

    Other Resources

    PyTorch code for using the dataset to train deep neural networks is available here.

    Contact

    Anoop Cherian (cherian@merl.com), Kuan-Chuan Peng (kpeng@merl.com), or Suhas Lohit (slohit@merl.com)

    Citation

    If you use the SMART-101 dataset in your research, please cite our paper:

        @article{cherian2022deep,
          title={Are Deep Neural Networks SMARTer than Second Graders?},
          author={Cherian, Anoop and Peng, Kuan-Chuan and Lohit, Suhas and Smith, Kevin and Tenenbaum, Joshua B},
          journal={arXiv preprint arXiv:2212.09993},
          year={2022}
        }

    Copyright and Licenses

    The SMART-101 dataset is released under CC-BY-SA-4.0.

    Created by Mitsubishi Electric Research Laboratories (MERL), 2022-2023

    SPDX-License-Identifier: CC-BY-SA-4.0
