21 datasets found

ConViS-Bench
huggingface.co
Cite
submission1335, ConViS-Bench [Dataset]. https://huggingface.co/datasets/submission1335/ConViS-Bench
Authors
submission1335
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
This dataset is associated with submission 1335 at NeurIPS 2025 - Dataset and Benchmarks track. The benchmark is intended to be used with the proposed submission environments (see the source code). See the provided README for information about dataset downloading and running the evaluations.
ProactiveBench
huggingface.co
Updated May 11, 2025
Cite
submission1331 (2025). ProactiveBench [Dataset]. https://huggingface.co/datasets/submission1331/ProactiveBench
Dataset updated
May 11, 2025
Authors
submission1331
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
This is the benchmark associated with submission 1331 at NeurIPS 2025 - Dataset and Benchmarks track. The benchmark is intended to be used with the proposed submission environments (see the source code). The .jsonl files do not contain proper image paths but rather image path templates, as each .jsonl entry is a sample, and each sample corresponds to a different environment with its own images. See the submitted code README for information about dataset downloading and preprocessing, and to… See the full description on the dataset page: https://huggingface.co/datasets/submission1331/ProactiveBench.
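As a rough illustration of what resolving such path templates might involve, here is a minimal sketch. The `image` key and the `{env_dir}` placeholder are hypothetical names chosen for this example; the actual field names and template syntax are defined in the submission's README, which is not reproduced here.

```python
import json


def resolve_samples(jsonl_text, env_dir):
    """Parse JSONL text and fill a per-sample image path template.

    Assumption: each entry stores its template under a hypothetical
    "image" key and uses a hypothetical "{env_dir}" placeholder.
    """
    samples = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        sample = json.loads(line)
        # str.replace is used so other braces in a path are left untouched.
        sample["image"] = sample["image"].replace("{env_dir}", env_dir)
        samples.append(sample)
    return samples


# Two made-up entries, one JSON object per line.
example = (
    '{"id": 0, "image": "{env_dir}/frame_000.png"}\n'
    '{"id": 1, "image": "{env_dir}/frame_001.png"}'
)
resolved = resolve_samples(example, "/data/env_a")
```

Each sample is resolved independently, which matches the description that every .jsonl entry corresponds to its own environment.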
NIPS2025D&B_AMPBenchmark
dataverse.harvard.edu
Updated May 16, 2025
Cite
Boyao Wan (2025). NIPS2025D&B_AMPBenchmark [Dataset]. http://doi.org/10.7910/DVN/E9A88D
Unique identifier
https://doi.org/10.7910/DVN/E9A88D
Dataset updated
May 16, 2025
Dataset provided by
Harvard Dataverse
Authors
Boyao Wan
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Dataset for "A Benchmark for Antimicrobial Peptide Recognition Based on Structure and Sequence Representation" at NeurIPS 2025 Dataset and Benchmark
SIMSHIFT_data
huggingface.co
Updated May 11, 2025
Cite
SIMSHIFT (2025). SIMSHIFT_data [Dataset]. https://huggingface.co/datasets/simshift/SIMSHIFT_data
Dataset updated
May 11, 2025
Authors
SIMSHIFT
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
SIMSHIFT: A Benchmark for Adapting Neural Surrogates to Distribution Shifts

This is the official data repository for the NeurIPS 2025 Datasets & Benchmarks Track submission.

Usage

We provide dataset loading utilities and full training and evaluation pipelines in the accompanying code repository that will be released upon publication.
CodeR-Pile
huggingface.co
Updated May 23, 2025
Cite
nebula (2025). CodeR-Pile [Dataset]. https://huggingface.co/datasets/nebula2025/CodeR-Pile
Dataset updated
May 23, 2025
Authors
nebula
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Introduction

CodeR-Pile. Dataset of submission to NeurIPS 2025 Datasets and Benchmarks Track. Under review.

Load Dataset

An example to load the dataset:

import datasets

# dict for mapping task to main task type
task_to_main_task_type = {
    # text2code
    ## seed tasks
    "web_code_retrieval": "text2code",
    "code_contest_retrieval": "text2code",
    "text2sql_retrieval": "text2code",
    ## new tasks
    "error_message_retrieval": "text2code"… See the full description on the dataset page: https://huggingface.co/datasets/nebula2025/CodeR-Pile.
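One way such a task-to-type mapping could be used is to group task names by their main type. The sketch below is illustrative only: it includes just the four entries visible in the truncated excerpt, and the helper `group_by_main_type` is a hypothetical name, not part of the dataset's code.

```python
from collections import defaultdict

# Only the four entries visible in the truncated excerpt; the full dict
# on the dataset page contains more tasks that are not reproduced here.
task_to_main_task_type = {
    "web_code_retrieval": "text2code",
    "code_contest_retrieval": "text2code",
    "text2sql_retrieval": "text2code",
    "error_message_retrieval": "text2code",
}


def group_by_main_type(mapping):
    """Invert a task -> main-type mapping into main-type -> sorted task list."""
    groups = defaultdict(list)
    for task, main_type in mapping.items():
        groups[main_type].append(task)
    return {main_type: sorted(tasks) for main_type, tasks in groups.items()}


groups = group_by_main_type(task_to_main_task_type)
```

With the full mapping from the dataset page, the same inversion would yield one task list per main task type.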
NLID
huggingface.co
Updated May 28, 2025
Cite
GWDx (2025). NLID [Dataset]. https://huggingface.co/datasets/GWDx/NLID
Dataset updated
May 28, 2025
Authors
GWDx
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NLID: A Large-Scale Neuromorphic Liquid Identification Dataset

NeurIPS 2025 Datasets and Benchmarks Track paper: Bubbles Talk: A Neuromorphic Dataset for Liquid Identification from Pouring process.
CASE118 with energy cost variation
figshare.com
Updated May 23, 2025
Cite
Anonymous USER (2025). CASE118 with energy cost variation [Dataset]. http://doi.org/10.6084/m9.figshare.29066399.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.29066399.v1
Dataset updated
May 23, 2025
Dataset provided by
figshare
Authors
Anonymous USER
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the first dataset of the NeurIPS 2025 submission: The SafePowerGraph Benchmark: Toward Reliable and Realistic Graph Learning in Power Grids.
TMGBench
huggingface.co
Updated May 28, 2025
Cite
Haochuan Wang (2025). TMGBench [Dataset]. https://huggingface.co/datasets/pinkex/TMGBench
Dataset updated
May 28, 2025
Authors
Haochuan Wang
Description
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs

This repository contains the code, data, and metadata for our NeurIPS 2025 Datasets and Benchmarks submission: TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs. The benchmark evaluates the strategic reasoning of large language models using 2x2 matrix games with narrative contexts and theory-of-mind variations.
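For readers unfamiliar with 2x2 matrix games, the following sketch finds the pure-strategy Nash equilibria of one such game. It is a generic textbook illustration, not the benchmark's code; the function name and the Prisoner's Dilemma payoffs are assumptions chosen for the example.

```python
def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a 2x2 game.

    payoffs[r][c] is a (row_payoff, col_payoff) tuple for row action r
    and column action c.
    """
    equilibria = []
    for r in range(2):
        for c in range(2):
            row_pay, col_pay = payoffs[r][c]
            # Row player cannot gain by switching rows (column fixed).
            row_ok = row_pay >= payoffs[1 - r][c][0]
            # Column player cannot gain by switching columns (row fixed).
            col_ok = col_pay >= payoffs[r][1 - c][1]
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


# Textbook Prisoner's Dilemma: action 0 = cooperate, action 1 = defect.
prisoners_dilemma = [
    [(3, 3), (0, 5)],
    [(5, 0), (1, 1)],
]
result = pure_nash_equilibria(prisoners_dilemma)
```

Here mutual defection, the cell at index (1, 1), is the only cell where neither player can gain by deviating alone.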

Directory Structure… See the full description on the dataset page: https://huggingface.co/datasets/pinkex/TMGBench.
neurips_submission_23
huggingface.co
Updated Apr 15, 2024
Cite
Anonymous User (2024). neurips_submission_23 [Dataset]. https://huggingface.co/datasets/annonymousa378/neurips_submission_23
Dataset updated
Apr 15, 2024
Authors
Anonymous User
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
NeurIPS 2025 Dataset Track Submission #23

The repository provides the dataset for VideoMathQA benchmark. We provide a task implementation compatible with lmms_eval to improve the reproducibility of our experiments.

🏆 VideoMathQA Leaderboard (MCQ & MBin with Subtitles)

Rank | Model | Size | MCQ (Subtitles) 🔍 | MBin (Subtitles) 📊 | Model Type
1️⃣ | GPT-4o-mini | | 61.4 | 44.2 | 🔒 Proprietary
2️⃣ | Qwen2.5-VL (72B) | 72B | 36.9 | 28.6 | 🪶 Open Weights
3️⃣ | InternVL3 (78B) | 78B | 37.1… See the full description on the dataset page: https://huggingface.co/datasets/annonymousa378/neurips_submission_23.
Data from: COinCO
huggingface.co
Updated Jun 3, 2025
Cite
Tianze Yang (2025). COinCO [Dataset]. https://huggingface.co/datasets/ytz009/COinCO
Dataset updated
Jun 3, 2025
Authors
Tianze Yang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
🖼️ COinCO: Common Inpainted Objects In-N-Out of Context

Authors: Tianze Yang*, Tyson Jordan*, Ninghao Liu, Jin Sun (*Equal contribution)
Affiliation: University of Georgia
Status: Submitted to NeurIPS 2025 Datasets and Benchmarks Track (under review)

📦 1. Dataset Overview

The COinCO dataset is a large-scale benchmark constructed from the COCO dataset to study object-scene contextual relationships via inpainting. Each image in COinCO contains one inpainted object, and… See the full description on the dataset page: https://huggingface.co/datasets/ytz009/COinCO.
mmlongbench-doc-results
huggingface.co
Cite
IXCLab@Shanghai AI Lab, mmlongbench-doc-results [Dataset]. https://huggingface.co/datasets/OpenIXCLab/mmlongbench-doc-results
Dataset authored and provided by
IXCLab@Shanghai AI Lab
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
📊 MMLongBench-Doc Evaluation Results

Official evaluation results: GPT-4.1 (2025-04-14) & GPT-4o (2024-11-20) 📄 Paper: MMLongBench-Doc, NeurIPS 2024 Datasets and Benchmarks Track (Spotlight)
EgoGazeVQA-91-nips25DB
huggingface.co
Updated May 31, 2025
Cite
Anonymous (2025). EgoGazeVQA-91-nips25DB [Dataset]. https://huggingface.co/datasets/anonupload/EgoGazeVQA-91-nips25DB
Dataset updated
May 31, 2025
Authors
Anonymous
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
EgoGazeVQA-91 • NeurIPS 2025 Datasets & Benchmarks submission

In the Eye of MLLM: Benchmarking Egocentric Video Intent Understanding with Gaze-Guided Prompting

1 Folder layout

EgoGazeVQA-91-nips25DB/
├── qa_pairs/ # VQA supervision
│ ├── causal_ego4d.{csv,json}
│ ├── spatial_ego4d.{csv,json}
│ ├── temporal_ego4d.{csv,json}
│ └── ...
├── keyframe_tar/
│ ├── ego4d.tar.gz
│ ├── egoexo.tar.gz
│ └── egtea.tar.gz
└──… See the full description on the dataset page: https://huggingface.co/datasets/anonupload/EgoGazeVQA-91-nips25DB.
M3DRS
huggingface.co
Updated May 15, 2025
Cite
HEIG-Vd Geomatic (2025). M3DRS [Dataset]. https://huggingface.co/datasets/heig-vd-geo/M3DRS
Dataset updated
May 15, 2025
Dataset authored and provided by
HEIG-Vd Geomatic
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
M3DRS: Multi-Modal Multi-Resolution Remote Sensing Dataset

This repository hosts the M3DRS dataset, a comprehensive collection of 5-channel remote sensing images (RGB, NIR, nDSM) from Switzerland, France, and Italy. The dataset is unlabelled and specifically designed to support self-supervised learning tasks. It is part of our submission to the NeurIPS 2025 Datasets and Benchmarks Track. The dataset is organized into three folders, each containing ZIP archives of images grouped by… See the full description on the dataset page: https://huggingface.co/datasets/heig-vd-geo/M3DRS.
MTVBench
huggingface.co
Updated May 26, 2025
Cite
Xiaodong Cun (2025). MTVBench [Dataset]. https://huggingface.co/datasets/vinthony/MTVBench
Dataset updated
May 26, 2025
Authors
Xiaodong Cun
Description
MTVBench: Benchmarking Video Generation Models with Multiple Transition Text Prompts

Submission to NeurIPS 2025 Dataset and Benchmark track. Xiaodong Cun, Xiuli Bi, Ruihuan Yang, Jianfei Yuan, Bin Xiao, Bo Liu. GVC Lab, Great Bay University, and Chongqing University of Posts and Telecommunications. We list all the generated videos here in each folder. The prompt benchmark we collected is in the prompt folder.
3D-ADAM
huggingface.co
Updated May 28, 2025
Cite
Paul McHard (2025). 3D-ADAM [Dataset]. http://doi.org/10.57967/hf/5526
Unique identifier
https://doi.org/10.57967/hf/5526
Dataset updated
May 28, 2025
Authors
Paul McHard
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Repository for the 3D-ADAM (3D Anomaly Detection in Advanced Manufacturing) Dataset. Submitted to NeurIPS 2025 - Datasets and Benchmarks Track. This project has been supported by the 1851 Royal Commission and HAL Robotics.

osiris
huggingface.co
Updated May 28, 2025
Cite
Hardware-Fab (2025). osiris [Dataset]. https://huggingface.co/datasets/hardware-fab/osiris
Dataset updated
May 28, 2025
Dataset authored and provided by
Hardware-Fab
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Osiris: A Scalable Dataset Generation Pipeline for Machine Learning in Analog Circuit Design

Osiris is an end-to-end analog circuit design pipeline capable of producing, validating, and evaluating layouts for generic analog circuits. The Osiris GitHub repository hosts the code that implements the randomized pipeline as well as the reinforcement learning-driven baseline methodology discussed in the paper submitted to the NeurIPS 2025 Datasets & Benchmarks Track. The Osiris 🤗… See the full description on the dataset page: https://huggingface.co/datasets/hardware-fab/osiris.
EmbedMol
huggingface.co
Updated May 29, 2025
Cite
Zhiyao Tang (2025). EmbedMol [Dataset]. https://huggingface.co/datasets/zhiyaot/EmbedMol
Dataset updated
May 29, 2025
Authors
Zhiyao Tang
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
EmbedMol: An Open Billion-scale Molecular Embedding Dataset for Molecular Discovery

(For NeurIPS 2025 Datasets and Benchmark Track Review ONLY)

Description of the Data and File Structure

This repository contains the complete EmbedMol-1B dataset, partitioned into 40 parts. Each part is provided as a tarball (e.g., embedmol-1b-part-39.tar.gz). Inside each tarball, the embeddings are further split into four batches, with each batch stored as a compressed NumPy array of… See the full description on the dataset page: https://huggingface.co/datasets/zhiyaot/EmbedMol.
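The layout described above (40 part tarballs, each holding several batch files) could be walked with the standard library roughly as follows. This is a sketch under stated assumptions: the zero-based, unpadded part numbering is inferred from the single example name `embedmol-1b-part-39.tar.gz`, and the batch file names inside the demo tarball are made up, since the description is truncated before it names them.

```python
import io
import os
import tarfile
import tempfile


def part_names(num_parts=40):
    """Tarball names, assuming zero-based, unpadded indices (an assumption)."""
    return [f"embedmol-1b-part-{i}.tar.gz" for i in range(num_parts)]


def list_batches(tar_path):
    """Return sorted names of the regular files inside one part tarball."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


# Demo on a tiny synthetic tarball; the batch file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo-part.tar.gz")
    with tarfile.open(demo_path, "w:gz") as tar:
        for i in range(4):
            payload = b"placeholder"
            info = tarfile.TarInfo(name=f"batch_{i}.npz")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    demo_batches = list_batches(demo_path)
```

Listing members first, rather than extracting everything, keeps disk usage low when only some batches of a part are needed.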
EgoExOR
huggingface.co
Updated May 15, 2025
Cite
Arda Mamur (2025). EgoExOR [Dataset]. https://huggingface.co/datasets/ardamamur/EgoExOR
Dataset updated
May 15, 2025
Authors
Arda Mamur
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
EgoExOR: An Egocentric–Exocentric Operating Room Dataset for Comprehensive Understanding of Surgical Activities

Official code of the paper "EgoExOR: An Egocentric–Exocentric Operating Room Dataset for Comprehensive Understanding of Surgical Activities" submitted at NeurIPS 2025 Datasets & Benchmarks Track. Authors: Ege Özsoy, Arda Mamur, Felix Tristram, Chantal Pellegrini… See the full description on the dataset page: https://huggingface.co/datasets/ardamamur/EgoExOR.
MVVLP
huggingface.co
Cite
IET, MVVLP [Dataset]. https://huggingface.co/datasets/TJIET/MVVLP
Authors
IET
Description
🅿️ MVVLP: A Multi-View Benchmark Dataset for Vision-and-Language Navigation in Real-World Parking Scenarios

Paper: MVVLP: A Multi-View Benchmark Dataset for Vision-and-Language Navigation in Real-World Parking Scenarios
Conference: NeurIPS 2025 (submitted)
Authors: Pengyu Fu*, Jincheng Hu*, Jihao Li*, Ming Liu, Jingjing Jiang, Yuanjian Zhang
All code is available on GitHub.

📌 Dataset Summary

MVVLP includes a multi-view image dataset collected in real parking lots, a… See the full description on the dataset page: https://huggingface.co/datasets/TJIET/MVVLP.
FruitBench
huggingface.co
Updated May 11, 2025
Cite
IET (2025). FruitBench [Dataset]. https://huggingface.co/datasets/TJIET/FruitBench
Dataset updated
May 11, 2025
Authors
IET
Description
🥭 FruitBench: A Multimodal Benchmark for Fruit Growth Understanding

Paper: FruitBench: A Multimodal Benchmark for Comprehensive Fruit Growth Understanding in Real-World Agriculture
Conference: NeurIPS 2025 (submitted)
Authors: Jihao Li*, Jincheng Hu*, Pengyu Fu*, Ming Liu, et al.

📌 Dataset Summary

FruitBench is the first large-scale multimodal benchmark designed to evaluate vision-language models on real-world agricultural understanding. It focuses on fruit growth… See the full description on the dataset page: https://huggingface.co/datasets/TJIET/FruitBench.
