MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
📊 MMLongBench-Doc Evaluation Results
Official evaluation results: GPT-4.1 (2025-04-14) & GPT-4o (2024-11-20) 📄 Paper: MMLongBench-Doc, NeurIPS 2024 Datasets and Benchmarks Track (Spotlight)
Accepted by NeurIPS 2024 Datasets and Benchmarks Track
We introduce the RePAIR puzzle-solving dataset, a large-scale real-world dataset of fractured frescoes from the archaeological site of Pompeii. Our dataset consists of over 1,000 fractured fresco fragments. RePAIR poses a realistic computational challenge for 2D and 3D puzzle-solving methods, serves as a benchmark for the study of fractured object reassembly, and presents new challenges for geometric shape understanding. Please visit our website for more information about the dataset, access to the source code and scripts, and an interactive gallery of dataset samples.
We provide a compressed version of our dataset in two separate files: one for the 2D version and one for the 3D version.
Our full dataset contains over one thousand individual fractured fragments, divided into groups, each with its own folder, and compressed into separate archives for the 2D and 3D sub-sets. In the 2D dataset, each fragment is saved as a .PNG image, and each group has the corresponding ground-truth transformation that solves the puzzle as a .TXT file. In the 3D dataset, each fragment is saved as a mesh in the widely used .OBJ format with the corresponding material (.MTL) and texture (.PNG) files. The meshes are already in the assembled position and orientation, so no additional information is needed. All additional metadata is given as .JSON files.
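As a minimal sketch of how a 2D group might be consumed (the folder layout and the column order inside the ground-truth .TXT file are assumptions; the dataset's own documentation and reader scripts define the exact format):

~~~python
# Hypothetical loader for one 2D RePAIR group: fragment .PNG images plus a
# ground-truth transformation .TXT file. File naming and transform columns
# are assumptions, not the official reader.
from pathlib import Path
from PIL import Image

def load_group_2d(group_dir: str):
    group = Path(group_dir)
    fragments = {p.stem: Image.open(p) for p in sorted(group.iterdir())
                 if p.suffix.lower() == ".png"}
    transforms = {}
    gt_files = [p for p in group.iterdir() if p.suffix.lower() == ".txt"]
    if gt_files:
        # Assumed layout: one whitespace-separated row per fragment,
        # e.g. "<fragment_id> <x> <y> <rotation>".
        for line in gt_files[0].read_text().splitlines():
            parts = line.split()
            if parts:
                transforms[parts[0]] = [float(v) for v in parts[1:]]
    return fragments, transforms

fragments, gt = load_group_2d("RePAIR_2D/group_0001")  # hypothetical path
print(len(fragments), "fragments,", len(gt), "ground-truth transforms")
~~~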
Please be advised that downloading and reusing this dataset is permitted only upon acceptance of the following license terms.
The Istituto Italiano di Tecnologia (IIT) declares, and the user (“User”) acknowledges, that the "RePAIR puzzle-solving dataset" contains 3D scans, texture maps, rendered images and meta-data of fresco fragments acquired at the Archaeological Site of Pompeii. IIT is authorised to publish the RePAIR puzzle-solving dataset herein only for scientific and cultural purposes and in connection with an academic publication referenced as Tsesmelis et al., "Re-assembling the past: The RePAIR dataset and benchmark for real world 2D and 3D puzzle solving", NeurIPS 2024. Use of the RePAIR puzzle-solving dataset by User is limited to downloading and viewing such images and comparing these with data or content in other datasets. User is not authorised to reproduce, copy, or distribute the RePAIR puzzle-solving dataset; any commercial use, including use in conjunction with the promotion of a commercial enterprise and/or its product(s) or service(s), is explicitly excluded. User will not use the RePAIR puzzle-solving dataset in any way prohibited by applicable laws. The RePAIR puzzle-solving dataset is provided to User without warranty of any kind, either expressed or implied. User will be solely responsible for their use of the RePAIR puzzle-solving dataset. In no event shall IIT be liable for any damages arising from such use.
License: https://choosealicense.com/licenses/cc/
Semi Truths Dataset: A Large-Scale Dataset for Testing Robustness of AI-Generated Image Detectors (NeurIPS 2024 Track Datasets & Benchmarks Track)
Recent efforts have developed AI-generated image detectors claiming robustness against various augmentations, but their effectiveness remains unclear. Can these systems detect varying degrees of augmentation?
To address these questions, we introduce Semi-Truths, featuring 27,600 real images, 223,400 masks, and 1,472,700… See the full description on the dataset page: https://huggingface.co/datasets/semi-truths/Semi-Truths.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
MmCows is a large-scale multimodal dataset for behavior monitoring, health management, and dietary management of dairy cattle.
The dataset consists of data from 16 dairy cows collected during a 14-day real-world deployment, divided into two modality groups. The primary group includes 3D UWB location, cows' neck IMMU acceleration, air pressure, cows' CBT, ankle acceleration, multi-view RGB images, indoor THI, outdoor weather, and milk yield. The secondary group contains measured UWB distances, cows' head direction, lying behavior, and health records.
MmCows also contains 20,000 isometric-view images from multiple camera views, collected in one day and annotated with cow IDs and behaviors as ground truth. The annotated cow IDs from the multiple views are used to derive the 3D body-location ground truth.
Below is a portion of the whole dataset. More details of the dataset and benchmarks are available at https://github.com/neis-lab/mmcows.
This link offers faster and more reliable download: https://huggingface.co/datasets/neis-lab/mmcows
Brief overview video: https://www.youtube.com/watch?v=YBDvz-HoLWg
DOI: 10.57967/hf/5965 (cow)
SciFIBench
Jonathan Roberts, Kai Han, Neil Houlsby, and Samuel Albanie
NeurIPS 2024
Note: This repo has been updated to add two splits ('General_Figure2Caption' and 'General_Caption2Figure') with an additional 1000 questions. The original version splits are preserved and have been renamed as follows: 'Figure2Caption' -> 'CS_Figure2Caption' and 'Caption2Figure' -> 'CS_Caption2Figure'.
Dataset Summary
The SciFIBench (Scientific Figure… See the full description on the dataset page: https://huggingface.co/datasets/jonathan-roberts1/SciFIBench.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The paper has been accepted at NeurIPS 2024 (Datasets & Benchmarks Track). paper | repo
☠️ Warning: The samples presented by this paper may be considered offensive or vulgar.
The opinions and findings contained in the samples of our presented dataset should not be interpreted as representing the views expressed or implied by the authors. We acknowledge the risk of malicious actors attempting to reverse-engineer memes. We sincerely hope that users will employ the dataset responsibly and appropriately, avoiding misuse or abuse. We believe the benefits of our proposed resources outweigh the associated risks. All resources are intended solely for scientific research and are prohibited from commercial use.
To adapt to the Chinese online environment, we introduce the definition of Chinese harmful memes:
Chinese harmful memes are multimodal units consisting of an image and Chinese inline text that have the potential to cause harm to an individual, an organization, a community, a social group, or society as a whole. These memes can range from offensive or joking content that perpetuates harmful stereotypes towards specific social entities, to memes that are more subtle and general but still have the potential to cause harm. It is important to note that Chinese harmful memes can be created and spread intentionally or unintentionally. They often reflect and reinforce underlying negative values and cultural attitudes on the Chinese Internet, which are detrimental from legal or moral perspectives.
According to the definition, we identify the most common harmful types of memes on Chinese platforms, including targeted harmful, general offense, sexual innuendo, and dispirited culture. We focus on these harmful types when constructing the dataset.
During the annotation, we label memes from two aspects: harmful types (i.e., the above four types) and modality combination (i.e., analyzing toxicity through fused or independent features, including Text-Image Fusion, Harmful Text, and Harmful Image). Finally, we present the ToxiCN MM dataset, which contains 12,000 samples.
Considering the potential risk of abuse, please fill out the following form to request the datasets: https://forms.gle/UN61ZNfTgMZKfMrv7. After we get your request, we will send the dataset to your email as soon as possible.
The dataset labels and the captions generated by GPT-4V have been saved as train_data_discription.json and test_data_discription.json in the ./data/ directory. Here we briefly describe each fine-grained label.
| Label | Description |
|---|---|
| label | Identify if a meme is Harmful (1) or Non-harmful (0). |
| type | Non-harmful: 0, Targeted Harmful: 1, Sexual Innuendo: 2, General Offense: 3, Dispirited Culture: 4 |
| modal | Non-harmful / Text-Image Fusion: [0, 0], Only Harmful Text: [1, 0], Only Harmful Image: [0, 1], Harmful Text & Image: [1, 1] |
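For illustration, a small sketch of tallying these labels, assuming the JSON files are lists of records whose fields match the table above (verify the actual structure against the released files):

~~~python
# Count label / type / modality statistics for ToxiCN_MM (assumed record layout).
import json
from collections import Counter

TYPE_NAMES = {0: "Non-harmful", 1: "Targeted Harmful", 2: "Sexual Innuendo",
              3: "General Offense", 4: "Dispirited Culture"}

with open("./data/train_data_discription.json", encoding="utf-8") as f:
    records = json.load(f)

print("harmful memes:", sum(1 for r in records if r["label"] == 1))
print("types:", {TYPE_NAMES.get(t, t): n
                 for t, n in Counter(r["type"] for r in records).items()})
print("modality combinations:", Counter(tuple(r["modal"]) for r in records))
~~~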
We present a Multimodal Knowledge Enhancement detector for effective detection. It incorporates LLM-generated contextual information about meme content to enhance the detector's understanding of Chinese memes. The requirements.txt file lists the specific dependencies of the project.
This work is licensed under a Creative Commons Attribution- NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).
Poster: https://github.com/user-attachments/assets/c3cb7793-33f2-4e3e-ad72-e0d84530c658
If you want to use the resources, please cite the following paper. The camera-ready version of the paper will be released after the conference:
~~~
@article{lu2024towards,
  title={Towards Comprehensive Detection of Chinese Harmful Memes},
  author={Lu, Junyu and Xu, Bo and Zhang, Xiaokun and Wang, Hongbo and Zhu, Haohao and Zhang, Dongyu and Yang, Liang and Lin, Hongfei},
  journal={arXiv preprint arXiv:2410.02378},
  year={2024}
}
~~~
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
WikiDBs is an open-source corpus of 100,000 relational databases. We aim to support research on tabular representation learning on multi-table data. The corpus is based on Wikidata and aims to follow certain characteristics of real-world databases.
WikiDBs was published as a spotlight paper in the Datasets and Benchmarks Track at NeurIPS 2024.
WikiDBs contains the database schemas as well as the table contents. The database tables are provided as CSV files, and each database schema as JSON. The 100,000 databases are available in five splits containing 20k databases each. In total, around 165 GB of disk space is needed for the full corpus. We also provide a script to convert the databases into SQLite.
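A rough sketch of how one database could be pulled into SQLite (the per-database folder layout and schema structure are assumptions; the conversion script shipped with the corpus should be preferred):

~~~python
# Load one WikiDBs database (tables as CSV, schema as JSON) into SQLite.
# Folder layout is an assumption; prefer the official conversion script.
import json
import sqlite3
from pathlib import Path

import pandas as pd

def db_to_sqlite(db_dir: str, out_path: str) -> None:
    db_dir = Path(db_dir)
    schema_files = sorted(db_dir.glob("*.json"))
    if schema_files:
        schema = json.loads(schema_files[0].read_text(encoding="utf-8"))
        print("schema keys:", list(schema)[:5])  # structure depends on the corpus
    with sqlite3.connect(out_path) as conn:
        for csv_file in sorted(db_dir.rglob("*.csv")):
            # One SQLite table per CSV file, named after the file stem.
            pd.read_csv(csv_file).to_sql(csv_file.stem, conn,
                                         if_exists="replace", index=False)

db_to_sqlite("wikidbs/split_00/database_00042", "database_00042.sqlite")  # hypothetical paths
~~~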
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
BLEnD
This is the official repository of BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages (submitted to the NeurIPS 2024 Datasets and Benchmarks Track).
24/12/05: Updated translation errors
25/05/02: Updated multiple choice questions file (v1.1)
About
Large language models (LLMs) often lack culture-specific everyday knowledge, especially across diverse regions and non-English languages. Existing benchmarks for evaluating LLMs' cultural… See the full description on the dataset page: https://huggingface.co/datasets/nayeon212/BLEnD.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
MatSeg Dataset and benchmark for zero-shot material state segmentation.
The MatSeg benchmark, containing 1,220 real-world images and their annotations, is available in MatSeg_Benchmark.zip; the file also contains documentation and Python readers.
The MatSeg dataset, containing synthetic images infused with natural image patterns, is available in MatSeg2D_part_*.zip and MatSeg3D_part_*.zip (* stands for a part number).
MatSeg3D_part_*.zip: contains synthetic 3D scenes
MatSeg2D_part_*.zip: contains synthetic 2D scenes
Readers and documentation for the synthetic data are available at: Dataset_Documentation_And_Readers.zip
Readers and documentation for the real-images benchmark are available at: MatSeg_Benchmark.zip
The Code used to generate the MatSeg Dataset is available at: https://zenodo.org/records/11401072
Additional permanent sources for downloading the dataset and metadata: 1, 2
Evaluation scripts for the Benchmark are now available at:
https://zenodo.org/records/13402003 and https://e.pcloud.link/publink/show?code=XZsP8PZbT7AJzG98tV1gnVoEsxKRbBl8awX
Materials and their states form a vast array of patterns and textures that define the physical and visual world. Minerals in rocks, sediment in soil, dust on surfaces, infections on leaves, stains on fruits, and foam in liquids are just a few of this almost infinite number of states and patterns.
Image segmentation of materials and their states is fundamental to the understanding of the world and is essential for a wide range of tasks, from cooking and cleaning to construction, agriculture, and chemistry laboratory work.
The MatSeg dataset focuses on zero-shot segmentation of materials and their states, meaning identifying the regions of an image belonging to a specific material type or state without prior knowledge of, or training on, the material type, state, or environment.
The dataset contains a large set of synthetic images (100k) and a benchmark of 1,220 real-world images for testing.
The benchmark contains 1,220 real-world images covering a wide range of material states and settings: food states (cooked/burned), plants (infected/dry), rocks/soil (minerals/sediment), construction/metals (rusted/worn), liquids (foam/sediment), and many other states, without being limited to a fixed set of classes or environments. The goal is to evaluate the segmentation of materials without knowledge of, or pretraining on, the material or setting. The focus is on materials with complex scattered boundaries and gradual transitions (like the level of wetness of a surface).
Evaluation scripts for the Benchmark are now available at: 1 and 2.
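For orientation only, the kind of per-image computation such an evaluation typically involves is a mask IoU; the official scripts linked above define the actual metrics and matching protocol:

~~~python
# Illustrative IoU between a predicted and a ground-truth material-state mask.
# This is not the official MatSeg metric; use the released evaluation scripts.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), dtype=bool); gt[20:50, 20:50] = True
print(f"IoU = {mask_iou(pred, gt):.3f}")
~~~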
The synthetic dataset is composed of synthetic scenes rendered in 2D and 3D using Blender. The synthetic data is infused with patterns, materials, and textures automatically extracted from real images, allowing it to capture the complexity and diversity of the real world while maintaining the precision and scale of synthetic data. 100k images and their annotations are available to download.
License
This dataset, including all its components, is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. To the extent possible under law, the authors have dedicated all copyright and related and neighboring rights to this dataset to the public domain worldwide. This dedication applies to the dataset and all derivative works.
The MatSeg 2D and 3D synthetic datasets were generated using the Open Images dataset, which is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0). For these components, you must comply with the terms of the Apache License. In addition, the MatSeg3D dataset uses ShapeNet 3D assets under a GNU license.
An example of training and evaluation code for a network trained on the dataset and evaluated on the benchmark is given at these URLs: 1, 2. This includes an evaluation script for the MatSeg benchmark, a training script using the MatSeg dataset, and the weights of a trained model.
Paper:
More detail on the work can be found in the paper "Infusing Synthetic Data with Real-World Patterns for Zero-Shot Material State Segmentation".
Croissant metadata and additional sources for downloading the dataset are available at 1,2
Dataset Card for ImagineBench
A benchmark for evaluating reinforcement learning algorithms that train the policies using both real data and imaginary rollouts from LLMs. The concept of imaginary rollouts was proposed by KALM (NeurIPS 2024), which focuses on extracting knowledge from LLMs, in the form of environmental rollouts, to improve RL policies' performance on novel tasks. Please check the paper for ImagineBench for more details. Core focus: Measuring how well agents can learn… See the full description on the dataset page: https://huggingface.co/datasets/NJU-RLer/ImagineBench.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset contains UAV footage of wild antelopes (blackbucks) in grassland habitats. It can mainly be used for two tasks: multi-object tracking (MOT) and re-identification (Re-ID). We provide annotations for the position of animals in each frame, allowing us to offer very long videos (up to 3 min) that are completely annotated while maintaining the identity of each animal throughout the video. The Re-ID data offers pairs of videos that capture the movement of some animals simultaneously from two different UAVs; the Re-ID task is to find the same individual in two videos taken simultaneously from slightly different perspectives. The relevant paper will be published in the NeurIPS 2024 Datasets and Benchmarks Track: https://nips.cc/virtual/2024/poster/97563
Resolution: 5.4K
MOT: 12 videos (MOT17 format)
Re-ID: 6 sets, each with a pair of drones (custom format)
Detection: 320 images (COCO, YOLO)
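Since the MOT annotations follow the MOT17 text format, they can be grouped per individual roughly as below (the first six comma-separated fields are frame, track ID, and the bounding box; trailing columns differ between ground-truth and detection files, and the annotation path shown is hypothetical):

~~~python
# Group MOT17-style annotation rows (frame, id, bb_left, bb_top, bb_width,
# bb_height, conf, ...) by track ID. The example path is hypothetical.
import csv
from collections import defaultdict

def load_mot(path: str):
    tracks = defaultdict(list)  # track id -> list of (frame, x, y, w, h)
    with open(path, newline="") as f:
        for row in csv.reader(f):
            frame, track_id = int(row[0]), int(row[1])
            x, y, w, h = map(float, row[2:6])
            tracks[track_id].append((frame, x, y, w, h))
    return tracks

tracks = load_mot("MOT/video_01/gt/gt.txt")  # hypothetical path
print(len(tracks), "annotated individuals")
~~~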
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides pre-extracted features from multimodal environmental data and expert-verified species observations, ready to be integrated into your models. Whether you're here for research, experimentation, or competition, you're in the right place!
🔎 Check out the key resources below to get started:

| Resource | Description | Link |
| --- | --- | --- |
| 📄 Dataset Paper | NeurIPS 2024 paper detailing the dataset, benchmark setup, etc. | NeurIPS Paper (PDF) |
| 🧠 GitHub Repository | Codebase with data loaders, baseline models, and utilities | GeoPlant Repo |
| 🚀 Starter Notebooks | Baseline models, multimodal pipelines, and training scripts | GeoPlant Code on Kaggle |
| 📦 Full Dataset | All provided data including the Presence-Only (PO) species observations. | GeoPlant Seafile |
The species-related training data comprises:

1. Presence-Absence (PA) surveys: around 90 thousand surveys covering roughly 10,000 species of the European flora. The presence-absence (PA) data is provided to compensate for the false absences in PO data and to calibrate models to avoid the associated biases.
2. Presence-Only (PO) occurrences: around five million observations combined from numerous datasets gathered from the Global Biodiversity Information Facility (GBIF, www.gbif.org). This data constitutes the larger part of the training data and covers all countries of our study area, but it has been sampled opportunistically (without a standardized sampling protocol), leading to various sampling biases. The local absence of a species among PO data doesn't mean it is truly absent: an observer might not have reported it because it was difficult to see at that time of the year, because it was not a monitoring target, or because it was simply unattractive.
There are two CSVs with species occurrence data on the Seafile available for training. The detailed description is provided again on SeaFile in separate ReadME files in relevant folders.
- The PO metadata are available in PresenceOnlyOccurences/GLC24_PO_metadata_train.csv.
- The PA metadata are available in PresenceAbsenceSurveys/GLC24_PA_metadata_train.csv.
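A minimal sketch of loading the two occurrence tables listed above with pandas (column names are documented in the ReadMe files on SeaFile, so inspect the frames before building on them):

~~~python
# Load the GeoPlant species-occurrence metadata (paths as listed above).
import pandas as pd

po = pd.read_csv("PresenceOnlyOccurences/GLC24_PO_metadata_train.csv")
pa = pd.read_csv("PresenceAbsenceSurveys/GLC24_PA_metadata_train.csv")

print("presence-only records:", len(po))
print("presence-absence surveys:", len(pa))
print("PA columns:", pa.columns.tolist())  # inspect before modelling
~~~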
[Figure 1: data composition overview]
Besides species data, we provide spatialized geographic and environmental data as additional input variables (see Figure 1). More precisely, for each species observation location, we provide:

1. Satellite image patches: 3-band (RGB) and 1-band (NIR) 128x128 images at 10 m resolution.
2. Satellite time series: up to 20 years of values for six satellite bands (R, G, B, NIR, SWIR1, and SWIR2).
3. Environmental rasters: various climatic, pedologic, land use, and human footprint variables at the European scale. We provide scalar values, time series, and the original rasters from which you may extract local 2D images.
There are three separate folders with the relevant data on the Seafile available for training. The detailed description is provided below and again on SeaFile in separate "Readme" files in relevant folders.
- The Satellite image patches in ./SatellitePatches/.
- The Satellite time series in ./SatelliteTimeSeries/.
- The Environmental rasters in ./EnvironmentalRasters/.
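As an example of combining the image modalities, the RGB and NIR patches of one observation could be stacked into a single 4-band array roughly as follows (the file naming inside ./SatellitePatches/ is an assumption; check the folder ReadMe for the real layout):

~~~python
# Stack the RGB and NIR 128x128 satellite patches of one observation into a
# (4, 128, 128) array. File naming is hypothetical; see the SatellitePatches ReadMe.
import numpy as np
from PIL import Image

def load_patch(rgb_path: str, nir_path: str) -> np.ndarray:
    rgb = np.asarray(Image.open(rgb_path).convert("RGB"))     # (128, 128, 3)
    nir = np.asarray(Image.open(nir_path).convert("L"))       # (128, 128)
    patch = np.dstack([rgb, nir]).astype(np.float32) / 255.0  # (128, 128, 4)
    return patch.transpose(2, 0, 1)                           # channels first

patch = load_patch("SatellitePatches/rgb/4859165.png",
                   "SatellitePatches/nir/4859165.png")  # hypothetical paths
print(patch.shape)  # (4, 128, 128)
~~~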
Figure. Illustration of the environmental data for an occurrence (glcID=4859165) collected in northern Switzerland (lon=8.5744; lat=47.7704) in 2021. A. The 1280x1280 m satellite image patches were sampled in 2021 around the observation. B. Quarterly time series of six satellite ...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ChronoMagic Dataset
This dataset contains time-lapse video-text pairs curated for metamorphic video generation. It was presented in the paper ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation. Project page: https://pku-yuangroup.github.io/ChronoMagic-Bench
Usage
~~~
cat ChronoMagic-Pro.zip.part-* > ChronoMagic-Pro.zip
unzip ChronoMagic-Pro.zip
~~~
[NeurIPS D&B 2024 Spotlight] ChronoMagic-Bench: A Benchmark for Metamorphic… See the full description on the dataset page: https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Pro.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
NeurIPS 2024 🏠Home (🚧Still in construction) | 🤗Data | 🥇Leaderboard | 🖥️Code | 📄Paper This repo contains the full dataset for our paper CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs, which is a diverse and challenging chart understanding benchmark fully curated by human experts. It includes 2,323 high-resolution charts manually sourced from arXiv preprints. Each chart is… See the full description on the dataset page: https://huggingface.co/datasets/princeton-nlp/CharXiv.