Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for GPQA
GPQA is a multiple-choice Q&A dataset of very hard questions written and validated by experts in biology, physics, and chemistry. When attempting questions outside their own domain (e.g., a physicist answering a chemistry question), these experts reach only 34% accuracy, despite spending more than 30 minutes per question with full access to Google. The authors request that examples from this dataset not be revealed in plain text or images online, to reduce the risk of leakage into foundation model… See the full description on the dataset page: https://huggingface.co/datasets/Idavidrein/gpqa.
GPQA stands for Graduate-Level Google-Proof Q&A Benchmark. It is a challenging dataset designed to evaluate the capabilities of Large Language Models (LLMs) and of scalable oversight mechanisms. More details:
Description: GPQA consists of 448 multiple-choice questions meticulously crafted by domain experts in biology, physics, and chemistry. These questions are intentionally designed to be high-quality and extremely difficult.
Expert Accuracy: Even experts who hold or are pursuing PhDs in the corresponding domains achieve only 65% accuracy on these questions (74% when clear mistakes identified in retrospect are excluded).
Google-Proof: The questions are "Google-proof": even with unrestricted access to the web, highly skilled non-expert validators reach only 34% accuracy despite spending over 30 minutes searching for answers.
AI Systems Difficulty: State-of-the-art AI systems, including the strongest GPT-4-based baseline reported in the paper, achieve only 39% accuracy on this dataset.
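Each GPQA item pairs one correct answer with three expert-written distractors. A minimal sketch of turning such a record into a labelled multiple-choice prompt; the field layout here is an assumption for illustration, not the dataset's exact schema:

```python
import random

def format_question(question, correct, incorrect, seed=0):
    """Shuffle one correct and three incorrect answers into labelled
    choices (A-D); return the prompt text and the correct letter."""
    rng = random.Random(seed)  # seeded so the shuffle is reproducible
    options = [correct] + list(incorrect)
    rng.shuffle(options)
    letters = "ABCD"
    lines = [question] + [f"({letters[i]}) {opt}" for i, opt in enumerate(options)]
    answer_letter = letters[options.index(correct)]
    return "\n".join(lines), answer_letter
```

Because the shuffle is seeded per question, the same record always yields the same choice order, which keeps evaluations comparable across runs.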
The difficulty of GPQA for both skilled non-experts and cutting-edge AI systems makes it an excellent resource for conducting realistic scalable oversight experiments. These experiments aim to explore ways for human experts to reliably obtain truthful information from AI systems that surpass human capabilities (1)(3).
In summary, GPQA serves as a valuable benchmark for assessing the robustness and limitations of language models, especially when faced with complex and nuanced questions. Its difficulty level encourages research into effective oversight methods, bridging the gap between AI and human expertise.
(1) GPQA: A Graduate-Level Google-Proof Q&A Benchmark - arXiv. https://arxiv.org/abs/2311.12022
(2) GPQA: A Graduate-Level Google-Proof Q&A Benchmark - Klu. https://klu.ai/glossary/gpqa-eval
(3) GPA Dataset (Spring 2010 through Spring 2020) - Data Science Discovery. https://discovery.cs.illinois.edu/dataset/gpa/
(4) GPQA: A Graduate-Level Google-Proof Q&A Benchmark - GitHub. https://github.com/idavidrein/gpqa
(5) Data Sets - OpenIntro. https://www.openintro.org/data/index.php?data=satgpa
math-ai/gpqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Overview
This contains the GPQA correctness preference evaluation set for Preference Proxy Evaluations (PPE). The prompts are sampled from GPQA. This dataset is meant for benchmarking and evaluation, not for training.
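In a best-of-K setup, a reward model scores K candidate responses per prompt and the highest-scoring one is kept; the benchmark then measures how often that selection is correct. A minimal sketch of the selection step, with illustrative function names and data layout rather than the PPE codebase's actual API:

```python
def best_of_k(candidates, scores):
    """Return the candidate with the highest reward-model score."""
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_idx]

def best_of_k_accuracy(samples):
    """samples: list of (correctness_flags, scores) pairs, one per prompt.
    Counts how often the top-scored candidate is a correct answer."""
    hits = 0
    for flags, scores in samples:
        idx = max(range(len(scores)), key=scores.__getitem__)
        hits += flags[idx]
    return hits / len(samples)
```

A stronger reward model ranks correct candidates above incorrect ones more often, so this accuracy rises with reward-model quality.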
License
User prompts are licensed under CC BY 4.0, and model outputs are governed by the terms of use set by the respective model providers.
Citation
@misc{frick2024evaluaterewardmodelsrlhf, title={How to Evaluate Reward Models for… See the full description on the dataset page: https://huggingface.co/datasets/lmarena-ai/PPE-GPQA-Best-of-K.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a reformatted version of the original GPQA dataset from Idavidrein/gpqa. It includes only the main question, four shuffled answer choices, the correct answer index, subdomain, and a unique id for each entry. Please cite the GPQA paper if you use this data: GPQA: A Graduate-Level Google-Proof Q&A Benchmark.
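A sketch of such a reformatting step, assuming original-style field names like "Correct Answer" and "Incorrect Answer 1"; the actual column names and id scheme of the source dataset may differ:

```python
import hashlib
import random

def reformat_record(rec, seed=0):
    """Convert an original-style GPQA record into question / shuffled
    choices / answer index / subdomain / id (field names are assumptions)."""
    rng = random.Random(seed)
    choices = [rec["Correct Answer"], rec["Incorrect Answer 1"],
               rec["Incorrect Answer 2"], rec["Incorrect Answer 3"]]
    rng.shuffle(choices)
    return {
        "question": rec["Question"],
        "choices": choices,
        "answer": choices.index(rec["Correct Answer"]),  # index after shuffling
        "subdomain": rec.get("Subdomain", ""),
        # illustrative id: a short hash of the question text
        "id": hashlib.md5(rec["Question"].encode()).hexdigest()[:12],
    }
```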
dogtooth/gpqa dataset hosted on Hugging Face and contributed by the HF Datasets community
jerome-white/leaderboard-documents-gpqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Multi-view imagery of people interacting with a variety of rich 3D environments.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Data on 78 students including GPA, IQ, and gender.
A data frame with 78 observations representing students on the following 5 variables.
macabdul9/gpqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for GPQA
Formatted version of the original GPQA dataset. This removes most columns and adds two columns, options and answer, containing the list of possible answers and the index of the correct one. GPQA is a multiple-choice Q&A dataset of very hard questions written and validated by experts in biology, physics, and chemistry. When attempting questions outside their own domain (e.g., a physicist answering a chemistry question), these experts get only 34% accuracy… See the full description on the dataset page: https://huggingface.co/datasets/jeggers/gpqa_formatted.
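With options stored as a list and answer as an index, grading a model's letter predictions reduces to an index comparison. A minimal, illustrative scorer, not the dataset's official evaluation code:

```python
def score_predictions(records, predicted_letters):
    """records: dicts with 'options' (list of 4 strings) and 'answer'
    (int index of the correct option). predicted_letters: model outputs
    such as 'A'..'D'. Returns the fraction answered correctly."""
    correct = 0
    for rec, letter in zip(records, predicted_letters):
        correct += ("ABCD".index(letter) == rec["answer"])
    return correct / len(records)
```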
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GPA reported a P/E (price-to-earnings) ratio of 2.64 for its fiscal quarter ending in September 2024. Data for GPA | PCAR3 - PE Price to Earnings, including historical tables and charts, was last updated by Trading Economics in July 2025.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GPA stock price, live market quote, shares value, historical data, intraday chart, earnings per share and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GPA reported a dividend yield of 9.85 for its fiscal quarter ending in March 2024. Data for GPA | PCAR3 - Dividend Yield, including historical tables and charts, was last updated by Trading Economics in July 2025.
Comparison by model of the Artificial Analysis Intelligence Index, which incorporates 7 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, and MATH-500.
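One simple way to aggregate seven evaluation scores is an unweighted mean. This is only an illustration: Artificial Analysis may weight or normalize the component benchmarks differently.

```python
def intelligence_index(scores):
    """Unweighted mean of the seven benchmark scores (0-100 scale).
    Illustrative aggregation only; the real index's weighting is not
    specified here."""
    benchmarks = ["MMLU-Pro", "GPQA Diamond", "Humanity's Last Exam",
                  "LiveCodeBench", "SciCode", "AIME", "MATH-500"]
    missing = [b for b in benchmarks if b not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return sum(scores[b] for b in benchmarks) / len(benchmarks)
```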
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for GPA-13 (Caenorhabditis elegans) curated by BioGRID (https://thebiogrid.org); DEFINITION: Protein GPA-13
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dimers of transmembrane (TM) peptides based on the Glycophorin A (GpA) dimer are simulated in different membrane environments. Three different homodimers with varying TM domain lengths and one heterodimer are considered. The homodimers are each formed of two identical peptides, while the heterodimer consists of one 17L peptide and one 29L peptide. In the sequences, the bold letters denote the amino acids involved in the GpA dimerization motif. The dimers are simulated in DLPC (12:0 PC), DOPC (18:1 PC), or DEPC (22:1 PC) bilayers. Additionally, a polyleucine dimer is simulated in a DOPC bilayer. Bilayers consist of 400 lipids and are adequately hydrated with 24,000 water molecules and 134 mM NaCl. The simulations are 100 ns long, with trajectories written every 100 ps.
The files are named as XXX-YYYY.ZZZ, where XXX denotes the peptide type ('het' for the heterodimer and 'polyl' for the polyleucine), YYYY denotes the bilayer type, and ZZZ denotes the file type. Files are in Gromacs format: .xtc for trajectories, .edr for energy data, .cpt for checkpoint files, .ndx for index files, .top for topology files, and .tpr for run input files (Gromacs 5.1). The simulation parameter file (md.mdp) is common to all systems. The CHARMM36 force field is used; topologies are obtained from CHARMM-GUI, and those of the peptides are included in Gromacs format (.itp).
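The XXX-YYYY.ZZZ convention can be split mechanically when iterating over the archive. A small helper sketch, assuming exactly one hyphen separates the peptide and bilayer codes (shared files like md.mdp do not follow the pattern and are rejected):

```python
import re

def parse_sim_filename(name):
    """Split names of the form XXX-YYYY.ZZZ into peptide type,
    bilayer type, and file type."""
    m = re.fullmatch(r"([^-]+)-([^.]+)\.(\w+)", name)
    if m is None:
        raise ValueError(f"unexpected filename: {name}")
    peptide, bilayer, ext = m.groups()
    return {"peptide": peptide, "bilayer": bilayer, "filetype": ext}
```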
More information on the systems is available in the publication, available here: (TO BE INCLUDED!)
Note that the data for the heterodimer and for the polyleucine are in part 2/2, available at https://doi.org/10.5281/zenodo.573274
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for GPA-16 (Caenorhabditis elegans) curated by BioGRID (https://thebiogrid.org); DEFINITION: Protein GPA-16