Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Chat-UniVi/browsecomp dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
BrowseComp Long Context
BrowseComp Long Context is a dataset based on BrowseComp that benchmarks an LLM's ability to retrieve relevant information from noisy data in its context. It converts the agentic question-answering tasks from BrowseComp into long-context tasks. For each question in a subset of BrowseComp, a list of URLs is attached. Each URL is paired with an indicator of whether the content of the web page is required to answer the question or is… See the full description on the dataset page: https://huggingface.co/datasets/openai/BrowseCompLongContext.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Relevant links:
* implementation: https://www.kaggle.com/code/aminmohamedmohami/browsecomp-benchmark-starter-code
* publication: https://arxiv.org/abs/2504.12516
* original repository: https://github.com/openai/simple-evals/tree/main
Abstract: We present BrowseComp, a simple yet challenging benchmark for measuring the ability for agents to browse the web. BrowseComp comprises 1,266 questions that require persistently navigating the internet in search of hard-to-find, entangled information. Despite the difficulty of the questions, BrowseComp is simple and easy-to-use, as predicted answers are short and easily verifiable against reference answers. BrowseComp for browsing agents can be seen as analogous to how programming competitions are an incomplete but useful benchmark for coding agents. While BrowseComp sidesteps challenges of a true user query distribution, like generating long answers or resolving ambiguity, it measures the important core capability of exercising persistence and creativity in finding information.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
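The BrowseComp abstract notes that predicted answers are short and easy to verify against reference answers. As a rough illustration only, the sketch below compares a prediction with a reference after simple normalization. The function names and the normalization rules are assumptions made for this example; this is not the benchmark's official grader, and the evaluation in the linked repository may work differently.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (hypothetical rules)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def is_match(predicted: str, reference: str) -> bool:
    """Return True when the two short answers are equal after normalization."""
    return normalize_answer(predicted) == normalize_answer(reference)
```

A string comparison like this only catches surface-level agreement; answers that are phrased differently but mean the same thing would need a more lenient check.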
BrowseComp-Plus
BrowseComp-Plus is a new benchmark for Deep-Research systems that isolates the effects of the retriever and the LLM agent to enable fair, transparent comparisons of Deep-Research agents. The benchmark sources challenging, reasoning-intensive queries from OpenAI's BrowseComp. However, instead of searching the live web, BrowseComp-Plus evaluates against a fixed, curated corpus of ~100K web documents. The corpus includes both… See the full description on the dataset page: https://huggingface.co/datasets/Tevatron/browsecomp-plus-corpus.
ychaohao/browsecomp-filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
smolagents/browse_comp dataset hosted on Hugging Face and contributed by the HF Datasets community
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to browsecomp.com (Domain). Get insights into ownership history and changes over time.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
BM25 and embedding indexes used in BrowseComp-Plus. To download the indexes:
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="bm25/*" --local-dir ./indexes
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="qwen3-embedding-0.6b/*" --local-dir ./indexes
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="qwen3-embedding-4b/*" --local-dir ./indexes
huggingface-cli download…
See the full description on the dataset page: https://huggingface.co/datasets/Tevatron/browsecomp-plus-indexes.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
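The index download commands for Tevatron/browsecomp-plus-indexes can also be issued from Python. The sketch below is one possible equivalent, assuming the `huggingface_hub` package is installed; it uses `snapshot_download` with `allow_patterns` to play the role of the CLI's `--include` flag. The helper names `build_download_kwargs` and `download_index` are made up for this example, and only the three index folders named in the listing are included.

```python
from typing import Any, Dict

REPO_ID = "Tevatron/browsecomp-plus-indexes"
# Index folders named in the listing; the truncated fourth command is not guessed.
INDEX_NAMES = ["bm25", "qwen3-embedding-0.6b", "qwen3-embedding-4b"]


def build_download_kwargs(index_name: str, local_dir: str = "./indexes") -> Dict[str, Any]:
    """Build snapshot_download arguments mirroring one CLI command (hypothetical helper)."""
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",                 # corresponds to --repo-type=dataset
        "allow_patterns": [f"{index_name}/*"],  # corresponds to --include="<name>/*"
        "local_dir": local_dir,                 # corresponds to --local-dir ./indexes
    }


def download_index(index_name: str, local_dir: str = "./indexes") -> str:
    """Download one index folder and return the local path (requires network access)."""
    # Imported lazily so the helper above can be used without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(index_name, local_dir))
```

Calling `download_index(name)` for each entry in `INDEX_NAMES` would fetch the same folders as the three full commands in the listing.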
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Paper: https://arxiv.org/abs/2508.13186v1
Code: https://github.com/MMBrowseComp/MM-BrowseComp
The specific evaluation methods can be found in our GitHub repository.