2 datasets found
  1. agi_eval_en

    • huggingface.co
    Updated Nov 16, 2023
    Cite
    Evaluation datasets (2023). agi_eval_en [Dataset]. https://huggingface.co/datasets/lighteval/agi_eval_en
    Explore at: https://huggingface.co/datasets/lighteval/agi_eval_en
    Dataset updated: Nov 16, 2023
    Dataset authored and provided by: Evaluation datasets
    Description

    Introduction

    AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school admission tests, math competitions… See the full description on the dataset page: https://huggingface.co/datasets/lighteval/agi_eval_en.

  2. AGIEval

    • opendatalab.com
    Updated Apr 1, 2023
    Cite
    Microsoft (2023). AGIEval [Dataset]. https://opendatalab.com/OpenDataLab/AGIEval
    Explore at: https://opendatalab.com/OpenDataLab/AGIEval
    Available download formats: zip
    Dataset updated: Apr 1, 2023
    Dataset provided by: Microsoft (http://microsoft.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically.

    Description

    AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school admission tests, math competitions, lawyer qualification tests, and national civil service exams. For a full description of the benchmark, see the dataset page: https://opendatalab.com/OpenDataLab/AGIEval.
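Both entries describe the same benchmark of exam-style multiple-choice questions. As an illustration, here is a minimal sketch of how one such record might be formatted into a zero-shot evaluation prompt and scored. The field names (`query`, `choices`, `gold`) and the sample record are assumptions for illustration, not the dataset's confirmed schema; inspect the actual files (e.g., after fetching them from one of the pages above) before relying on them.

```python
# Minimal sketch for AGIEval-style multiple-choice evaluation.
# The record schema ("query", "choices", "gold") is an assumption.
import string


def format_prompt(record: dict) -> str:
    """Render a question and lettered options as a single prompt string."""
    lines = [record["query"], ""]
    for letter, choice in zip(string.ascii_uppercase, record["choices"]):
        lines.append(f"({letter}) {choice}")
    lines.append("")
    lines.append("Answer:")
    return "\n".join(lines)


def is_correct(record: dict, predicted_letter: str) -> bool:
    """Compare a predicted option letter against the gold option index."""
    return string.ascii_uppercase.index(predicted_letter.upper()) == record["gold"]


# Hypothetical record in the assumed schema:
example = {
    "query": "Which value of x satisfies 2x + 3 = 11?",
    "choices": ["2", "3", "4", "5"],
    "gold": 2,  # index of the correct option, here "(C) 4"
}

print(format_prompt(example))
print(is_correct(example, "C"))  # True
```

The same pattern extends to batch scoring: map `format_prompt` over the records, collect one predicted letter per prompt from the model under test, and average `is_correct` to get accuracy.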


agi_eval_en (lighteval/agi_eval_en): 2 scholarly articles cite this dataset (View in Google Scholar).