lmms-lab/Video-MME dataset hosted on Hugging Face and contributed by the HF Datasets community
Video-MME stands for Video Multi-Modal Evaluation. Its authors describe it as the first comprehensive evaluation benchmark designed specifically for Multi-modal Large Language Models (MLLMs) in video analysis¹. The benchmark addresses the need for high-quality assessment of how MLLMs process sequential visual data, an area that has been explored less than static image understanding.
The Video-MME benchmark is characterized by its:
1. Diversity in video types, covering 6 primary visual domains with 30 subfields for broad scenario generalizability.
2. Duration in the temporal dimension, including short-, medium-, and long-term videos ranging from 11 seconds to 1 hour, to assess robust contextual dynamics.
3. Breadth in data modalities, integrating multi-modal inputs such as video frames, subtitles, and audio.
4. Quality in annotations, with rigorous manual labeling by expert annotators for precise and reliable model assessment¹.
The benchmark includes 900 videos totaling 256 hours, manually selected and annotated, resulting in 2,700 question-answer pairs. It has been used to evaluate various state-of-the-art MLLMs, including the GPT-4 series and Gemini 1.5 Pro, as well as open-source image and video models¹. The findings from Video-MME highlight the need for further improvements in handling longer sequences and multi-modal data, which is crucial for the advancement of MLLMs¹.
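As a rough illustration of the figures above, the sketch below derives the questions-per-video ratio and the mean video length from the quoted totals, then computes multiple-choice accuracy per duration category. The record keys `duration`, `answer`, and `prediction` are hypothetical names chosen for this example; they are not taken from the dataset's actual schema.

```python
from collections import defaultdict

# Totals quoted in the description above.
NUM_VIDEOS = 900
NUM_QA_PAIRS = 2700
TOTAL_HOURS = 256

# Derived figures: questions per video and mean video length in minutes.
questions_per_video = NUM_QA_PAIRS / NUM_VIDEOS
avg_minutes = TOTAL_HOURS * 60 / NUM_VIDEOS


def accuracy_by_duration(records):
    """Return {duration_category: accuracy} for multiple-choice predictions.

    Each record is a dict with the hypothetical keys 'duration'
    (e.g. 'short', 'medium', 'long'), 'answer', and 'prediction'.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        category = record["duration"]
        total[category] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if record["prediction"].strip().upper() == record["answer"].strip().upper():
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Made-up records, for illustration only.
sample = [
    {"duration": "short", "answer": "A", "prediction": "a"},
    {"duration": "short", "answer": "B", "prediction": "C"},
    {"duration": "long", "answer": "D", "prediction": " D "},
]
scores = accuracy_by_duration(sample)
```

Reporting accuracy separately for each duration category makes it easier to see whether a model degrades on longer videos, which is the weakness the benchmark's findings point to.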
(1) [2405.21075] Video-MME: The First-Ever Comprehensive Evaluation .... https://arxiv.org/abs/2405.21075
(2) Video-MME. https://video-mme.github.io/home_page.html
(3) Video-MME: Welcome. https://video-mme.github.io/
(4) DOI for arXiv:2405.21075. https://doi.org/10.48550/arXiv.2405.21075
topyun/Video-MME-Long dataset hosted on Hugging Face and contributed by the HF Datasets community
Reacherx/Video-MME-Pro dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset is from the paper MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios. See https://github.com/DogNeverSleep/MME-VideoOCR_Dataset for more information, including the license.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Colombia Imports Value: cif: MME: RTC: Radio & TV Receivers, Apparatus for Recording & Reproducing Sound & Video, etc. was reported at 75,673.668 USD th in Apr 2019, a decrease from 83,199.965 USD th in Mar 2019. The series is updated monthly, with a median of 75,673.668 USD th from Feb 2013 to Apr 2019 across 75 observations. It reached an all-time high of 440,430.861 USD th in May 2013 and a record low of 45,553.582 USD th in Jan 2015. The series remains active in CEIC and is reported by the National Statistics Administrative Department. It is categorized under Global Database’s Colombia – Table CO.JA034: Imports: Value: CPC Ver 2 AC.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Colombia Imports Volume: cif: MME: RTC: Radio & TV Receivers, Apparatus for Recording & Reproducing Sound & Video, etc. was reported at 3,265.446 Metric Ton in Apr 2019, a decrease from 3,951.599 Metric Ton in Mar 2019. The series is updated monthly, with a median of 3,715.305 Metric Ton from Feb 2013 to Apr 2019 across 75 observations. It reached an all-time high of 20,085.733 Metric Ton in May 2013 and a record low of 2,304.283 Metric Ton in Jan 2015. The series remains active in CEIC and is reported by the National Statistics Administrative Department. It is categorized under Global Database’s Colombia – Table CO.JA035: Imports: Volume: CPC Ver 2 AC.
VideoEval-Pro
VideoEval-Pro is a robust and realistic long video understanding benchmark containing open-ended, short-answer QA problems. The dataset is constructed by reformatting questions from four existing long video understanding MCQ benchmarks (Video-MME, MLVU, LVBench, and LongVideoBench) into free-form questions. The paper can be found here. The evaluation code and scripts are available at: TIGER-AI-Lab/VideoEval-Pro
Dataset Structure
Each example in the… See the full description on the dataset page: https://huggingface.co/datasets/TIGER-Lab/VideoEval-Pro.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Colombia Imports Value: cif: MME: RTC: Radio & TV Receivers, Apparatus for Recording & Reproducing Sound & Video, etc. was reported at 75,673.668 USD th in Apr 2019, a decrease from 83,199.965 USD th in Mar 2019. The series is updated monthly, averaging 75,673.668 USD th from Feb 2013 to Apr 2019 across 75 observations. It reached an all-time high of 440,430.861 USD th in May 2013 and a record low of 45,553.582 USD th in Jan 2015. The series is updated regularly in CEIC, is sourced from the National Statistics Administrative Department, and is categorized under Global Database's Colombia – Table CO.JA034: Imports: Value: CPC Ver 2 AC.