41 datasets found

agi_eval_en
huggingface.co
Updated Nov 16, 2023
Cite
Evaluation datasets (2023). agi_eval_en [Dataset]. https://huggingface.co/datasets/lighteval/agi_eval_en
Dataset updated
Nov 16, 2023
Dataset authored and provided by
Evaluation datasets
Description
Introduction

AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school admission tests, math competitions… See the full description on the dataset page: https://huggingface.co/datasets/lighteval/agi_eval_en.
Q-Eval-100K
huggingface.co
Updated Jun 9, 2025
Cite
AGI-Eval-Official (2025). Q-Eval-100K [Dataset]. https://huggingface.co/datasets/AGI-Eval-Official/Q-Eval-100K
Dataset updated
Jun 9, 2025
Authors
AGI-Eval-Official
Description
Q-Eval-100K Dataset (CVPR 2025 Oral)

📝 Introduction

The Q-Eval-100K dataset encompasses both text-to-image and text-to-video models, with 960K human annotations specifically focused on visual quality and alignment for 100K instances (60K images and 40K videos). We utilize multiple popular text-to-image and text-to-video models to ensure diversity, which include FLUX, Lumina-T2X, PixArt, Stable Diffusion 3, Stable Diffusion XL, DALL·E 3, Wanx, Midjourney, Hunyuan-DiT… See the full description on the dataset page: https://huggingface.co/datasets/AGI-Eval-Official/Q-Eval-100K.
agi eval
kaggle.com
Updated Sep 4, 2024
Cite
Test013 (2024). agi eval [Dataset]. https://www.kaggle.com/datasets/test013/agi-eval/data
Dataset updated
Sep 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Test013
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Test013

Released under MIT

Contents
OIBench
huggingface.co
Updated Jun 12, 2025
Cite
AGI-Eval (2025). OIBench [Dataset]. https://huggingface.co/datasets/AGI-Eval/OIBench
Dataset updated
Jun 12, 2025
Dataset authored and provided by
AGI-Eval
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) (https://creativecommons.org/licenses/by-nd/4.0/)
License information was derived automatically
Description
OIBench Dataset

Dataset Overview

OIBench is a high-quality, private, and challenging olympiad-level informatics benchmark consisting of 250 carefully curated original problems. The OIBench Dataset's HuggingFace repo contains algorithm problem statements, solutions, and associated metadata such as test cases, pseudo code, and difficulty levels. The dataset has been processed and stored in Parquet format for efficient access and analysis. We provide complete information… See the full description on the dataset page: https://huggingface.co/datasets/AGI-Eval/OIBench.
agi-eval
huggingface.co
Updated Jul 10, 2023
Cite
Orion Research (2023). agi-eval [Dataset]. https://huggingface.co/datasets/orion-research/agi-eval
Dataset updated
Jul 10, 2023
Dataset authored and provided by
Orion Research
Description
orion-research/agi-eval dataset hosted on Hugging Face and contributed by the HF Datasets community
agi-eval-sat-math-judgments-no-multiple-choice
huggingface.co
Updated May 1, 2025
Cite
Freddie Vargus (2025). agi-eval-sat-math-judgments-no-multiple-choice [Dataset]. https://huggingface.co/datasets/freddie/agi-eval-sat-math-judgments-no-multiple-choice
Dataset updated
May 1, 2025
Authors
Freddie Vargus
Description
freddie/agi-eval-sat-math-judgments-no-multiple-choice dataset hosted on Hugging Face and contributed by the HF Datasets community
Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Rwanda
catalog.ihsn.org
datacatalog.ihsn.org
Updated Mar 29, 2019
Cite
Sarah Haddock (2019). Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Rwanda [Dataset]. http://catalog.ihsn.org/catalog/study/RWA_2012-2014_AGIE_v01_M
Dataset updated
Mar 29, 2019
Dataset provided by
Sarah Haddock
Shubha Chakravarty
Time period covered
2012 - 2014
Area covered
Rwanda
Description
Abstract

The Adolescent Girls Initiative (AGI) pilot was implemented by the Government of Rwanda as part of an eight-country initiative led by the World Bank aimed at promoting the economic empowerment of adolescent girls. The development objective of the Rwanda AGI was to improve employment, incomes and empowerment of disadvantaged adolescent girls and young women (aged 16-24), and to test two integrated models for promoting these goals.

The Rwanda AGI had three components: Component I: Skills Development and Entrepreneurship Support; Component II: Scholarships to Resume Formal Education; Component III: Project Implementation Support.

This evaluation focused exclusively on Component I, which was carried out by the Workforce Development Authority (WDA), under the supervision of the Ministry of Gender and Family Promotion (MIGEPROF). It was delivered sequentially to roughly 2,000 vulnerable girls and young women in three equal-sized cohorts between 2012 and 2014. The project was targeted geographically in four districts (Gasabo, Kicukiro, Gicumbi, and Rulindo), where nine vocational training centers (VTCs) provided the training.

The three objectives of the evaluation were:
- To examine how well the AGI project delivered the planned activities
- To assess the usefulness of the training provided
- To measure the change in beneficiary outcomes before and after the AGI project.

The evaluation was conducted on the second cohort of beneficiaries, from which 160 girls were randomly selected to participate in baseline and endline surveys.

Geographic coverage

The project was targeted geographically to four districts that already had training centers: Gasabo, Kicukiro, Gicumbi and Rulindo.

Analysis unit

individuals

Kind of data

Sample survey data [ssd]

Sampling procedure

After the initial pre-screening for eligibility, the sample was stratified by the sector of participants' residence and selected through a public lottery conducted by the Workforce Development Authority and the Ministry of Gender and Family Promotion in each of the 11 recruitment sectors. The girls were invited to attend, and directly after the lottery, Laterite Limited, an independently contracted research firm, conducted uniform random sampling (in Excel) to select a subset of admitted applicants for the baseline survey. However, the baseline survey was administered only to those who were physically present at the lottery. In 6 of the 11 sectors of recruitment, girls who did not appear for the lottery were excluded from the project, so the evaluation sample reflects the project sample. In the other 5 sectors, absent applicants who were randomly selected for project admission were still allowed to join, but they were excluded from the baseline survey. Specifically, cohort 2 had 1,364 applicants who passed the screening committee, of whom 712 were randomly selected for project admission. Further, unsuccessful but eligible applicants were allowed to enter the lottery for the third cohort, which started just one month after the second cohort. Hence, there was no feasible way to use the rejected applicants as a control group for an evaluation.

A follow-up survey was administered to 160 of the 182 randomly sampled beneficiaries that responded to the baseline survey. Though special effort was made to follow up with the 43 individuals from the baseline survey who did not complete the project, the team was only able to interview 21 of them.
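
The two-stage selection described above (a lottery within each recruitment sector, followed by a uniform random draw of admitted applicants for the baseline survey) can be illustrated with a short Python sketch. This is not the procedure's actual implementation (the source says the draw was done in Excel): the function names, the proportional per-sector quota, and the seeds are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_lottery(applicants, n_admit, seed=0):
    """Admit about n_admit applicants, drawing separately within each sector.

    applicants: list of (person_id, sector) pairs.
    ASSUMPTION: each sector's quota is proportional to its share of
    applicants; the source does not state how quotas were set.
    """
    rng = random.Random(seed)
    by_sector = defaultdict(list)
    for person_id, sector in applicants:
        by_sector[sector].append(person_id)

    total = len(applicants)
    admitted = []
    for sector in sorted(by_sector):
        ids = by_sector[sector]
        quota = round(n_admit * len(ids) / total)
        # Uniform draw without replacement inside the sector.
        admitted.extend(rng.sample(ids, min(quota, len(ids))))
    return admitted


def baseline_sample(admitted, present, n_draw, seed=0):
    """Draw n_draw admitted applicants uniformly at random, then keep
    only those who were physically present at the lottery."""
    rng = random.Random(seed)
    drawn = rng.sample(admitted, min(n_draw, len(admitted)))
    return [p for p in drawn if p in present]
```

Because quotas are rounded per sector, the number admitted can differ from `n_admit` by a few people; a real lottery would fix the per-sector counts in advance.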

Mode of data collection

Computer Assisted Personal Interview [capi]

Cleaning operations

After the collection of survey data, Laterite Limited prepared the data for analysis by correcting duplicate identification numbers, renaming endline variable names in order to match baseline variable names, dropping confidential personal identification variables (e.g. name, mobile phone number), GPS coordinates, device numbers, codifying variables stored as names of income-generating activities (IGAs), and merging baseline and endline datasets.

A number of additional changes to the data were made during the quantitative analysis:
- Values of specific variables (e.g. business type, first or second income-generating activity) recorded as "other" that fit existing answer options were re-codified.
- To address inconsistencies between different sections of the survey, values entered for the IGA screening sections (whether the respondent was engaged in any household agricultural activities, wage employment, non-farm business or internship) were corrected based on information provided in subsequent, more detailed questions on the two main income-generating activities and/or business. No changes were made in the absence of supporting information. Where both wage employment and non-farm businesses were indicated for the same IGA, answers to screening questions were reconciled based on whether the respondent reported working for herself (business) or for a non-relative (paid job).
- Because 86 out of 160 values for age at baseline were missing in the merged dataset provided by Laterite Limited, data on age was extracted from the baseline dataset.
- Outliers: 3 income values (extra 0 at the end, or amount entered as in-kind daily payment instead of monthly income) and 4 in-kind amount values (divided by 10 to fit in ranges of reported in-kind amounts for same occupation) were considered typos; for the remaining outliers, values above the 99th percentile were dropped from the estimations.
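
As a rough illustration of the last cleaning step, the sketch below drops values above an upper cutoff, reading that cutoff as the 99th percentile. The source does not say how the cutoff was computed, so the nearest-rank definition and both function names are assumptions.

```python
def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method.

    pct must be an integer from 1 to 100. Integer arithmetic avoids
    floating-point surprises when computing the rank.
    """
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceil(pct * n / 100)
    return ordered[rank - 1]


def drop_above_percentile(values, pct=99):
    """Keep only values at or below the pct-th percentile cutoff."""
    cutoff = percentile_nearest_rank(values, pct)
    return [v for v in values if v <= cutoff]
```

With 160 distinct observations, as in this survey, the nearest-rank cutoff is the 159th smallest value, so only the single largest value would be dropped.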
agieval-logiqa-en
huggingface.co
Cite
dmayhem93, agieval-logiqa-en [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-logiqa-en
Authors
dmayhem93
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
Dataset Card for "agieval-logiqa-en"

Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. Raw dataset: https://github.com/lgw863/LogiQA-dataset Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) @misc{zhong2023agieval, title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models}, author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-logiqa-en.
agieval
huggingface.co
Updated Aug 4, 2023
Cite
Baber Abbasi (2023). agieval [Dataset]. https://huggingface.co/datasets/baber/agieval
Dataset updated
Aug 4, 2023
Authors
Baber Abbasi
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for AGIEval

Dataset Summary

AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school… See the full description on the dataset page: https://huggingface.co/datasets/baber/agieval.
Rwanda - Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Dataset -...
waterdata3.staging.derilinx.com
Updated Mar 16, 2020
Cite
The citation is currently not available for this dataset.
Dataset updated
Mar 16, 2020
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Rwanda
Description
The Adolescent Girls Initiative (AGI) pilot was implemented by the Government of Rwanda as part of an eight-country initiative led by the World Bank aimed at promoting the economic empowerment of adolescent girls. The development objective of the Rwanda AGI was to improve employment, incomes and empowerment of disadvantaged adolescent girls and young women (aged 16-24), and to test two integrated models for promoting these goals. The Rwanda AGI had three components: Component I: Skills Development and Entrepreneurship Support; Component II: Scholarships to Resume Formal Education; Component III: Project Implementation Support. This evaluation focused exclusively on Component I, which was carried out by the Workforce Development Authority (WDA), under the supervision of the Ministry of Gender and Family Promotion (MIGEPROF). It was delivered sequentially to roughly 2,000 vulnerable girls and young women in three equal-sized cohorts between 2012 and 2014. The project was targeted geographically in four districts (Gasabo, Kicukiro, Gicumbi, and Rulindo), where nine vocational training centers (VTCs) provided the training. The three objectives of the evaluation were: to examine how well the AGI project delivered the planned activities; to assess the usefulness of the training provided; and to measure the change in beneficiary outcomes before and after the AGI project. The evaluation was conducted on the second cohort of beneficiaries, from which 160 girls were randomly selected to participate in baseline and endline surveys.
Data from: Illumination and gaze effects on face evaluation: the Bi-AGI...
board.unimib.it
Updated Nov 6, 2023
Cite
Giulia Mattavelli (2023). Illumination and gaze effects on face evaluation: the Bi-AGI Database [Dataset]. http://doi.org/10.17632/rx6kpwmvtf.3
Unique identifier
https://doi.org/10.17632/rx6kpwmvtf.3
Dataset updated
Nov 6, 2023
Authors
Giulia Mattavelli
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) (https://creativecommons.org/licenses/by-nc/3.0/)
License information was derived automatically
Description
Face evaluation and first impression generation can be affected by multiple face elements such as invariant facial features, gaze direction and environmental context; however, the composite modulation of eye gaze and illumination on faces of different gender and ages has not been previously investigated. We aimed at testing how these different facial and contextual features affect ratings of social attributes. Thus, we created and validated the Bi-AGI Database, a freely available new set of male and female face stimuli varying in age across lifespan from 18 to 87 years, gaze direction and illumination conditions. Judgments on attractiveness, femininity-masculinity, dominance and trustworthiness were collected for each stimulus. Results evidence the interaction of the different variables in modulating social trait attribution, in particular illumination differently affects ratings across age, gaze and gender, with less impact on older adults and greater effect on young faces.
agi-eval-sat-math-judgments
huggingface.co
Updated May 1, 2025
Cite
Freddie Vargus (2025). agi-eval-sat-math-judgments [Dataset]. https://huggingface.co/datasets/freddie/agi-eval-sat-math-judgments
Dataset updated
May 1, 2025
Authors
Freddie Vargus
Description
freddie/agi-eval-sat-math-judgments dataset hosted on Hugging Face and contributed by the HF Datasets community
agieval-sat-math
huggingface.co
Updated Jun 18, 2023
Cite
dmayhem93 (2023). agieval-sat-math [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-sat-math
Dataset updated
Jun 18, 2023
Authors
dmayhem93
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for "agieval-sat-math"

Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. MIT License Copyright (c) Microsoft Corporation. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-sat-math.
agieval-lsat-lr
huggingface.co
Updated Jun 18, 2023
Cite
dmayhem93 (2023). agieval-lsat-lr [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-lsat-lr
Dataset updated
Jun 18, 2023
Authors
dmayhem93
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for "agieval-lsat-lr"

Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. Raw dataset: https://github.com/zhongwanjun/AR-LSAT MIT License Copyright (c) 2022 Wanjun Zhong Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-lsat-lr.
Quantitative predictor and outcome variables used to analyze the indirect...
figshare.com
xls
Updated Jun 1, 2023
Cite
Nia King; Rachael Vriezen; Victoria L. Edge; James Ford; Michele Wood; Sherilee Harper (2023). Quantitative predictor and outcome variables used to analyze the indirect costs of acute gastrointestinal illness (AGI) in Rigolet, Nunatsiavut, Canada. [Dataset]. http://doi.org/10.1371/journal.pone.0196990.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0196990.t002
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Nia King; Rachael Vriezen; Victoria L. Edge; James Ford; Michele Wood; Sherilee Harper
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Canada, Rigolet, Nunatsiavut
Description
Quantitative predictor and outcome variables used to analyze the indirect costs of acute gastrointestinal illness (AGI) in Rigolet, Nunatsiavut, Canada.
AGIQA-1K Dataset
paperswithcode.com
Updated Mar 21, 2023
Cite
(2023). AGIQA-1K Dataset [Dataset]. https://paperswithcode.com/dataset/agiqa-1k
Dataset updated
Mar 21, 2023
Description
AI Generated Content (AIGC) refers to any form of content, such as text, images, audio, or video, that is created with the help of artificial intelligence technology. With the flourishing development of deep learning, the efficiency of AIGC generation has increased, and AI-Generated Image (AGI) is becoming more prevalent in areas such as culture, entertainment, education, social media, etc.

Unlike Natural Scene Images (NSIs) captured from natural scenes, AGIs are directly generated from AI models. Thus, AGIs have some unique quality characteristics, and viewers tend to evaluate their quality from aspects that differ from those used for NSIs.

Therefore, we propose the first perceptual AGI Quality Assessment (AGIQA-1K) database, which provides 1,080 AGIs along with quality labels, including technical issues, AI artifacts, unnaturalness, discrepancy, and aesthetics as major evaluation aspects.
BIOGRID CURATED DATA FOR PUBLICATION: Novel ASK1 inhibitor AGI-1067 improves...
thebiogrid.org
zip
Updated Sep 6, 2018
Cite
BioGRID Project (2018). BIOGRID CURATED DATA FOR PUBLICATION: Novel ASK1 inhibitor AGI-1067 improves AGE-induced cardiac dysfunction by inhibiting MKKs/p38 MAPK and NF-κB apoptotic signaling. [Dataset]. https://thebiogrid.org/255093/publication/novel-ask1-inhibitor-agi-1067-improves-age-induced-cardiac-dysfunction-by-inhibiting-mkksp38-mapk-and-nf-b-apoptotic-signaling.html
Available download formats: zip
Dataset updated
Sep 6, 2018
Dataset authored and provided by
BioGRID Project
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for Liu Z (2018): Novel ASK1 inhibitor AGI-1067 improves AGE-induced cardiac dysfunction by inhibiting MKKs/p38 MAPK and NF-κB apoptotic signaling. Curated by BioGRID (https://thebiogrid.org); ABSTRACT: Heart failure has been identified as one of the clinical manifestations of diabetic cardiovascular complications. Excessive myocardium apoptosis characterizes cardiac dysfunctions, which are correlated with an increased level of advanced glycation end products (AGEs). In this study, we investigated the participation of reactive oxygen species (ROS) and the involvements of apoptosis signal-regulating kinase 1 (ASK1)/mitogen-activated protein kinase (MAPK) kinases (MKKs)/p38 MAPK and nuclear factor κB (NF-κB) pathways in AGE-induced apoptosis-mediated cardiac dysfunctions. The antioxidant and therapeutic effects of a novel ASK1 inhibitor, AGI-1067, were also studied. Myocardium and isolated primary myocytes were exposed to AGEs and treated with AGI-1067. Invasive hemodynamic and echocardiographic assessments were used to evaluate the cardiac functions. ROS formation was evaluated by dihydroethidium fluorescence staining. A terminal deoxynucleotidyl transferase dUTP nick end labelling assay was used to detect the apoptotic cells. ASK1 and NADPH activities were determined by kinase assays. The association between ASK1 and thioredoxin 1 (Trx1) was assessed by immunoprecipitation. Western blotting was used to evaluate the phosphorylation and expression levels of proteins. Our results showed that AGE exposure significantly activated ASK1/MKKs/p38 MAPK, which led to increased cardiac apoptosis and cardiac impairments. AGI-1067 administration inhibited the activation of MKKs/p38 MAPK by inhibiting the disassociation of ASK1 and Trx1, which suppressed the AGE-induced myocyte apoptosis. Moreover, the NF-κB activation as well as the ROS generation was inhibited. As a result, cardiac functions were improved.
Our findings suggested that AGI-1067 recovered AGE-induced cardiac dysfunction by blocking both ASK1/MKKs/p38 and NF-κB apoptotic signaling pathways.
step-wise-eval-addtional-with-tao
huggingface.co
Updated Sep 19, 2024
Cite
Language & AGI Lab (2024). step-wise-eval-addtional-with-tao [Dataset]. https://huggingface.co/datasets/LangAGI-Lab/step-wise-eval-addtional-with-tao
Dataset updated
Sep 19, 2024
Dataset authored and provided by
Language & AGI Lab
Description
LangAGI-Lab/step-wise-eval-addtional-with-tao dataset hosted on Hugging Face and contributed by the HF Datasets community
step-wise-eval-additional-refined-tao
huggingface.co
Updated Sep 19, 2024
Cite
Language & AGI Lab (2024). step-wise-eval-additional-refined-tao [Dataset]. https://huggingface.co/datasets/LangAGI-Lab/step-wise-eval-additional-refined-tao
Dataset updated
Sep 19, 2024
Dataset authored and provided by
Language & AGI Lab
Description
LangAGI-Lab/step-wise-eval-additional-refined-tao dataset hosted on Hugging Face and contributed by the HF Datasets community
Data from: Eval
huggingface.co
Updated May 7, 2024
Cite
RIT AGI (2024). Eval [Dataset]. https://huggingface.co/datasets/RIT4AGI/Eval
Dataset updated
May 7, 2024
Dataset authored and provided by
RIT AGI
Description
RIT4AGI/Eval dataset hosted on Hugging Face and contributed by the HF Datasets community

2 scholarly articles cite the agi_eval_en dataset (lighteval/agi_eval_en).
