41 datasets found
  1. agi_eval_en

    • huggingface.co
    Updated Nov 16, 2023
    Cite
    Evaluation datasets (2023). agi_eval_en [Dataset]. https://huggingface.co/datasets/lighteval/agi_eval_en
    Dataset updated
    Nov 16, 2023
    Dataset authored and provided by
    Evaluation datasets
    Description

    Introduction

    AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school admission tests, math competitions… See the full description on the dataset page: https://huggingface.co/datasets/lighteval/agi_eval_en.

  2. Q-Eval-100K

    • huggingface.co
    Updated Jun 9, 2025
    Cite
    AGI-Eval-Official (2025). Q-Eval-100K [Dataset]. https://huggingface.co/datasets/AGI-Eval-Official/Q-Eval-100K
    Dataset updated
    Jun 9, 2025
    Authors
    AGI-Eval-Official
    Description

    Q-Eval-100K Dataset (CVPR 2025 Oral)

      📝 Introduction
    

    The Q-Eval-100K dataset encompasses both text-to-image and text-to-video models, with 960K human annotations specifically focused on visual quality and alignment for 100K instances (60K images and 40K videos). We utilize multiple popular text-to-image and text-to-video models to ensure diversity, which include FLUX, Lumina-T2X, PixArt, Stable Diffusion 3, Stable Diffusion XL, DALL·E 3, Wanx, Midjourney, Hunyuan-DiT… See the full description on the dataset page: https://huggingface.co/datasets/AGI-Eval-Official/Q-Eval-100K.

  3. agi eval

    • kaggle.com
    Updated Sep 4, 2024
    Cite
    Test013 (2024). agi eval [Dataset]. https://www.kaggle.com/datasets/test013/agi-eval/data
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Test013
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Test013

    Released under MIT


  4. OIBench

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    AGI-Eval (2025). OIBench [Dataset]. https://huggingface.co/datasets/AGI-Eval/OIBench
    Dataset updated
    Jun 12, 2025
    Dataset authored and provided by
    AGI-Eval
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    OIBench Dataset

      Dataset Overview
    

    OIBench is a high-quality, private, and challenging olympiad-level informatics benchmark consisting of 250 carefully curated original problems. The OIBench Dataset's HuggingFace repo contains algorithm problem statements, solutions, and associated metadata such as test cases, pseudo code, and difficulty levels. The dataset has been processed and stored in Parquet format for efficient access and analysis. We provide complete information… See the full description on the dataset page: https://huggingface.co/datasets/AGI-Eval/OIBench.

  5. agi-eval

    • huggingface.co
    Updated Jul 10, 2023
    Cite
    Orion Research (2023). agi-eval [Dataset]. https://huggingface.co/datasets/orion-research/agi-eval
    Dataset updated
    Jul 10, 2023
    Dataset authored and provided by
    Orion Research
    Description

    orion-research/agi-eval dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. agi-eval-sat-math-judgments-no-multiple-choice

    • huggingface.co
    Updated May 1, 2025
    Cite
    Freddie Vargus (2025). agi-eval-sat-math-judgments-no-multiple-choice [Dataset]. https://huggingface.co/datasets/freddie/agi-eval-sat-math-judgments-no-multiple-choice
    Dataset updated
    May 1, 2025
    Authors
    Freddie Vargus
    Description

    freddie/agi-eval-sat-math-judgments-no-multiple-choice dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Rwanda

    • catalog.ihsn.org
    • datacatalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Sarah Haddock (2019). Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Rwanda [Dataset]. http://catalog.ihsn.org/catalog/study/RWA_2012-2014_AGIE_v01_M
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    Sarah Haddock
    Shubha Chakravarty
    Time period covered
    2012 - 2014
    Area covered
    Rwanda
    Description

    Abstract

    The Adolescent Girls Initiative (AGI) pilot was implemented by the Government of Rwanda as part of an eight-country initiative led by the World Bank aimed at promoting the economic empowerment of adolescent girls. The development objective of the Rwanda AGI was to improve employment, incomes and empowerment of disadvantaged adolescent girls and young women (aged 16-24), and to test two integrated models for promoting these goals.

    The Rwanda AGI had three components: Component I: Skills Development and Entrepreneurship Support; Component II: Scholarships to Resume Formal Education; and Component III: Project Implementation Support.

    This evaluation focused exclusively on Component I, which was carried out by the Workforce Development Authority (WDA), under the supervision of the Ministry of Gender and Family Promotion (MIGEPROF). It was delivered sequentially to roughly 2,000 vulnerable girls and young women in three equal-sized cohorts between 2012 and 2014. The project was targeted geographically in four districts (Gasabo, Kicukiro, Gicumbi, and Rulindo), where nine vocational training centers (VTCs) provided the training.

    The three objectives of the evaluation were:

    • To examine how well the AGI project delivered the planned activities
    • To assess the usefulness of the training provided
    • To measure the change in beneficiary outcomes before and after the AGI project

    The evaluation was conducted on the second cohort of beneficiaries, from which 160 girls were randomly selected to participate in baseline and endline surveys.

    Geographic coverage

    The project was targeted geographically to four districts that already had training centers: Gasabo, Kicukiro, Gicumbi and Rulindo.

    Analysis unit

    • individuals

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    After the initial pre-screening for eligibility, the sample was stratified by the sector of participants' residence and selected through a public lottery conducted by the Workforce Development Authority and the Ministry of Gender and Family Promotion in each of the 11 recruitment sectors. The girls were invited to attend, and directly after the lottery, Laterite Limited - an independently contracted research firm - conducted uniform random sampling (in Excel) to select a subset of admitted applicants for the baseline survey. However, the baseline survey was administered only to those who were physically present at the lottery. In 6 of the 11 sectors of recruitment, girls who did not appear for the lottery were excluded from the project, so the evaluation sample reflects the project sample. In the other 5 sectors, absent applicants who were randomly selected for project admission were still allowed to join, but they were still excluded from the baseline survey. Specifically, cohort 2 had 1,364 applicants who passed the screening committee, of whom 712 were randomly selected for project admission. Further, unsuccessful but eligible applicants were allowed to enter the lottery for the third cohort, which started just one month after the second cohort. Hence, there was no feasible way to use the rejected applicants as a control group for an evaluation.

    A follow-up survey was administered to 160 of the 182 randomly sampled beneficiaries that responded to the baseline survey. Though special effort was made to follow up with the 43 individuals from the baseline survey who did not complete the project, the team was only able to interview 21 of them.
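
    The two-stage selection described above (a lottery within each recruitment sector, then a uniform random draw of admitted applicants for the baseline survey) can be sketched roughly as follows. This is only an illustration: the original sampling was done in Excel, and the function name, the admission fraction, the survey size, and the seed are all hypothetical and not taken from the study.

```python
import random


def sector_lottery(applicants_by_sector, admit_fraction, survey_size, seed=0):
    """Admit a random share of applicants in each sector, then draw a
    uniform random survey sample from everyone admitted.

    All parameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    admitted = []
    for sector in sorted(applicants_by_sector):
        pool = applicants_by_sector[sector]
        n_admit = round(len(pool) * admit_fraction)
        # Lottery within the sector: sampling without replacement.
        admitted.extend(rng.sample(pool, n_admit))
    # Uniform random subsample of admitted applicants for the baseline survey.
    surveyed = rng.sample(admitted, min(survey_size, len(admitted)))
    return admitted, surveyed


# Hypothetical input: 11 sectors with 20 eligible applicants each.
applicants = {f"sector_{i}": [f"s{i}_a{j}" for j in range(20)] for i in range(11)}
admitted, surveyed = sector_lottery(applicants, admit_fraction=0.5, survey_size=30)
```

    The sketch omits the attendance rule from the text, under which only applicants physically present at the lottery were given the baseline survey.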

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Cleaning operations

    After the collection of survey data, Laterite Limited prepared the data for analysis by correcting duplicate identification numbers, renaming endline variable names in order to match baseline variable names, dropping confidential personal identification variables (e.g. name, mobile phone number), GPS coordinates, device numbers, codifying variables stored as names of income-generating activities (IGAs), and merging baseline and endline datasets.

    A number of additional changes to the data were made during the quantitative analysis:

    • Values of specific variables (e.g. business type, first or second income-generating activity) recorded as "other" that fit existing answer options were recoded.
    • To address inconsistencies between different sections of the survey, values entered for the IGA screening sections (whether the respondent was engaged in any household agricultural activities, wage employment, non-farm business or internship) were corrected based on information provided in subsequent, more detailed questions on the two main income-generating activities and/or business. No changes were made in the absence of supporting information. Where both wage employment and non-farm businesses were indicated for the same IGA, answers to screening questions were reconciled based on whether the respondent reported working for herself (business) or for a non-relative (paid job).
    • Because 86 out of 160 values for age at baseline were missing in the merged dataset provided by Laterite Limited, data on age was extracted from the baseline dataset.
    • Outliers: 3 income values (extra 0 at the end, or amount entered as in-kind daily payment instead of monthly income) and 4 in-kind amount values (divided by 10 to fit in ranges of reported in-kind amounts for same occupation) were considered typos; for the remaining outliers, values above the 99th percentile were dropped from the estimations.
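
    The final outlier rule can be illustrated with a minimal sketch, reading the cutoff as the 99th percentile (an assumption about the intended statistic). The function name and the sample values are hypothetical; the study's actual estimation code is not part of this listing.

```python
import statistics


def drop_above_percentile(values, percentile=99):
    """Return the values at or below the given percentile cutoff.

    Hypothetical helper: statistics.quantiles with n=100 yields the 1st..99th
    percentile cut points, so index percentile - 1 is the requested cutoff.
    """
    cutoffs = statistics.quantiles(values, n=100, method="inclusive")
    cutoff = cutoffs[percentile - 1]
    return [v for v in values if v <= cutoff]


# Made-up monthly incomes: 1..100 plus one extreme value.
incomes = list(range(1, 101)) + [10_000]
kept = drop_above_percentile(incomes)
```

    With these made-up values only the single extreme entry lies above the cutoff, so it is the only one excluded.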

  8. agieval-logiqa-en

    • huggingface.co
    Cite
    dmayhem93, agieval-logiqa-en [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-logiqa-en
    Authors
    dmayhem93
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for "agieval-logiqa-en"

    Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. Raw dataset: https://github.com/lgw863/LogiQA-dataset Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) @misc{zhong2023agieval, title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models}, author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-logiqa-en.

  9. agieval

    • huggingface.co
    Updated Aug 4, 2023
    Cite
    Baber Abbasi (2023). agieval [Dataset]. https://huggingface.co/datasets/baber/agieval
    Dataset updated
    Aug 4, 2023
    Authors
    Baber Abbasi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for AGIEval

      Dataset Summary
    

    AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school… See the full description on the dataset page: https://huggingface.co/datasets/baber/agieval.

  10. Rwanda - Adolescent Girls Initiative (AGI) Evaluation 2012-2014 - Dataset -...

    • waterdata3.staging.derilinx.com
    Updated Mar 16, 2020
    Cite
    The citation is currently not available for this dataset.
    Dataset updated
    Mar 16, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Rwanda
    Description

    The Adolescent Girls Initiative (AGI) pilot was implemented by the Government of Rwanda as part of an eight-country initiative led by the World Bank aimed at promoting the economic empowerment of adolescent girls. The development objective of the Rwanda AGI was to improve employment, incomes and empowerment of disadvantaged adolescent girls and young women (aged 16-24), and to test two integrated models for promoting these goals.

    The Rwanda AGI had three components: Component I: Skills Development and Entrepreneurship Support; Component II: Scholarships to Resume Formal Education; and Component III: Project Implementation Support.

    This evaluation focused exclusively on Component I, which was carried out by the Workforce Development Authority (WDA), under the supervision of the Ministry of Gender and Family Promotion (MIGEPROF). It was delivered sequentially to roughly 2,000 vulnerable girls and young women in three equal-sized cohorts between 2012 and 2014. The project was targeted geographically in four districts (Gasabo, Kicukiro, Gicumbi, and Rulindo), where nine vocational training centers (VTCs) provided the training.

    The three objectives of the evaluation were:

    • To examine how well the AGI project delivered the planned activities
    • To assess the usefulness of the training provided
    • To measure the change in beneficiary outcomes before and after the AGI project

    The evaluation was conducted on the second cohort of beneficiaries, from which 160 girls were randomly selected to participate in baseline and endline surveys.

  11. Data from: Illumination and gaze effects on face evaluation: the Bi-AGI...

    • board.unimib.it
    Updated Nov 6, 2023
    Cite
    Giulia Mattavelli (2023). Illumination and gaze effects on face evaluation: the Bi-AGI Database [Dataset]. http://doi.org/10.17632/rx6kpwmvtf.3
    Dataset updated
    Nov 6, 2023
    Authors
    Giulia Mattavelli
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Face evaluation and first impression generation can be affected by multiple face elements such as invariant facial features, gaze direction and environmental context; however, the composite modulation of eye gaze and illumination on faces of different gender and ages has not been previously investigated. We aimed at testing how these different facial and contextual features affect ratings of social attributes. Thus, we created and validated the Bi-AGI Database, a freely available new set of male and female face stimuli varying in age across lifespan from 18 to 87 years, gaze direction and illumination conditions. Judgments on attractiveness, femininity-masculinity, dominance and trustworthiness were collected for each stimulus. Results evidence the interaction of the different variables in modulating social trait attribution, in particular illumination differently affects ratings across age, gaze and gender, with less impact on older adults and greater effect on young faces.

  12. agi-eval-sat-math-judgments

    • huggingface.co
    Updated May 1, 2025
    Cite
    Freddie Vargus (2025). agi-eval-sat-math-judgments [Dataset]. https://huggingface.co/datasets/freddie/agi-eval-sat-math-judgments
    Dataset updated
    May 1, 2025
    Authors
    Freddie Vargus
    Description

    freddie/agi-eval-sat-math-judgments dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. agieval-sat-math

    • huggingface.co
    Updated Jun 18, 2023
    Cite
    dmayhem93 (2023). agieval-sat-math [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-sat-math
    Dataset updated
    Jun 18, 2023
    Authors
    dmayhem93
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "agieval-sat-math"

    Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. MIT License Copyright (c) Microsoft Corporation. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-sat-math.

  14. agieval-lsat-lr

    • huggingface.co
    Updated Jun 18, 2023
    Cite
    dmayhem93 (2023). agieval-lsat-lr [Dataset]. https://huggingface.co/datasets/dmayhem93/agieval-lsat-lr
    Dataset updated
    Jun 18, 2023
    Authors
    dmayhem93
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "agieval-lsat-lr"

    Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. Raw dataset: https://github.com/zhongwanjun/AR-LSAT MIT License Copyright (c) 2022 Wanjun Zhong Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-lsat-lr.

  15. Quantitative predictor and outcome variables used to analyze the indirect...

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Nia King; Rachael Vriezen; Victoria L. Edge; James Ford; Michele Wood; Sherilee Harper (2023). Quantitative predictor and outcome variables used to analyze the indirect costs of acute gastrointestinal illness (AGI) in Rigolet, Nunatsiavut, Canada. [Dataset]. http://doi.org/10.1371/journal.pone.0196990.t002
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Nia King; Rachael Vriezen; Victoria L. Edge; James Ford; Michele Wood; Sherilee Harper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Canada, Rigolet, Nunatsiavut
    Description

    Quantitative predictor and outcome variables used to analyze the indirect costs of acute gastrointestinal illness (AGI) in Rigolet, Nunatsiavut, Canada.

  16. AGIQA-1K Dataset

    • paperswithcode.com
    Updated Mar 21, 2023
    Cite
    (2023). AGIQA-1K Dataset [Dataset]. https://paperswithcode.com/dataset/agiqa-1k
    Dataset updated
    Mar 21, 2023
    Description

    AI Generated Content (AIGC) refers to any form of content, such as text, images, audio, or video, that is created with the help of artificial intelligence technology. With the flourishing development of deep learning, the efficiency of AIGC generation has increased, and AI-Generated Image (AGI) is becoming more prevalent in areas such as culture, entertainment, education, social media, etc.

    Unlike Natural Scene Images (NSIs) captured from natural scenes, AGIs are directly generated from AI models. Thus, AGIs have some unique quality characteristics, and viewers tend to evaluate the quality of AGIs on aspects different from those used for NSIs.

    Therefore, we propose the first perceptual AGI Quality Assessment (AGIQA-1K) database, which provides 1,080 AGIs along with quality labels, including technical issues, AI artifacts, unnaturalness, discrepancy, and aesthetics as major evaluation aspects.

  17. BIOGRID CURATED DATA FOR PUBLICATION: Novel ASK1 inhibitor AGI-1067 improves...

    • thebiogrid.org
    zip
    Updated Sep 6, 2018
    Cite
    BioGRID Project (2018). BIOGRID CURATED DATA FOR PUBLICATION: Novel ASK1 inhibitor AGI-1067 improves AGE-induced cardiac dysfunction by inhibiting MKKs/p38 MAPK and NF-κB apoptotic signaling. [Dataset]. https://thebiogrid.org/255093/publication/novel-ask1-inhibitor-agi-1067-improves-age-induced-cardiac-dysfunction-by-inhibiting-mkksp38-mapk-and-nf-b-apoptotic-signaling.html
    Available download formats: zip
    Dataset updated
    Sep 6, 2018
    Dataset authored and provided by
    BioGRID Project
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Protein-Protein, Genetic, and Chemical Interactions for Liu Z (2018): Novel ASK1 inhibitor AGI-1067 improves AGE-induced cardiac dysfunction by inhibiting MKKs/p38 MAPK and NF-κB apoptotic signaling. curated by BioGRID (https://thebiogrid.org); ABSTRACT: Heart failure has been identified as one of the clinical manifestations of diabetic cardiovascular complications. Excessive myocardium apoptosis characterizes cardiac dysfunctions, which are correlated with an increased level of advanced glycation end products (AGEs). In this study, we investigated the participation of reactive oxygen species (ROS) and the involvements of apoptosis signal-regulating kinase 1 (ASK1)/mitogen-activated protein kinase (MAPK) kinases (MKKs)/p38 MAPK and nuclear factor κB (NF-κB) pathways in AGE-induced apoptosis-mediated cardiac dysfunctions. The antioxidant and therapeutic effects of a novel ASK1 inhibitor, AGI-1067, were also studied. Myocardium and isolated primary myocytes were exposed to AGEs and treated with AGI-1067. Invasive hemodynamic and echocardiographic assessments were used to evaluate the cardiac functions. ROS formation was evaluated by dihydroethidium fluorescence staining. A terminal deoxynucleotidyl transferase dUTP nick end labelling assay was used to detect the apoptotic cells. ASK1 and NADPH activities were determined by kinase assays. The association between ASK1 and thioredoxin 1 (Trx1) was assessed by immunoprecipitation. Western blotting was used to evaluate the phosphorylation and expression levels of proteins. Our results showed that AGE exposure significantly activated ASK1/MKKs/p38 MAPK, which led to increased cardiac apoptosis and cardiac impairments. AGI-1067 administration inhibited the activation of MKKs/p38 MAPK by inhibiting the disassociation of ASK1 and Trx1, which suppressed the AGE-induced myocyte apoptosis. Moreover, the NF-κB activation as well as the ROS generation was inhibited. As a result, cardiac functions were improved. Our findings suggested that AGI-1067 recovered AGE-induced cardiac dysfunction by blocking both ASK1/MKKs/p38 and NF-κB apoptotic signaling pathways.

  18. step-wise-eval-addtional-with-tao

    • huggingface.co
    Updated Sep 19, 2024
    Cite
    Language & AGI Lab (2024). step-wise-eval-addtional-with-tao [Dataset]. https://huggingface.co/datasets/LangAGI-Lab/step-wise-eval-addtional-with-tao
    Dataset updated
    Sep 19, 2024
    Dataset authored and provided by
    Language & AGI Lab
    Description

    LangAGI-Lab/step-wise-eval-addtional-with-tao dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. step-wise-eval-additional-refined-tao

    • huggingface.co
    Updated Sep 19, 2024
    Cite
    Language & AGI Lab (2024). step-wise-eval-additional-refined-tao [Dataset]. https://huggingface.co/datasets/LangAGI-Lab/step-wise-eval-additional-refined-tao
    Dataset updated
    Sep 19, 2024
    Dataset authored and provided by
    Language & AGI Lab
    Description

    LangAGI-Lab/step-wise-eval-additional-refined-tao dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. Data from: Eval

    • huggingface.co
    Updated May 7, 2024
    Cite
    RIT AGI (2024). Eval [Dataset]. https://huggingface.co/datasets/RIT4AGI/Eval
    Dataset updated
    May 7, 2024
    Dataset authored and provided by
    RIT AGI
    Description

    RIT4AGI/Eval dataset hosted on Hugging Face and contributed by the HF Datasets community

lighteval/agi_eval_en

2 scholarly articles cite this dataset (View in Google Scholar)