The MOTChallenge datasets are designed for the task of multiple object tracking. Several variants of the dataset have been released over the years, such as MOT15, MOT17, and MOT20.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset contains object tracking data for the MOT Challenge datasets. The inference data was generated with YOLOv5 and DeepSORT.
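The entry does not include the generation script, but a minimal sketch of this kind of YOLOv5 + DeepSORT pipeline might look like the following. The `deep-sort-realtime` package, the input path, and all parameters are my assumptions for illustration, not the uploader's actual setup:

```python
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort  # pip install deep-sort-realtime (assumed tracker package)

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained YOLOv5 detector
tracker = DeepSort(max_age=30)  # drop tracks unseen for 30 consecutive frames

cap = cv2.VideoCapture("sequence.mp4")  # placeholder input video
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_idx += 1
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV is BGR; the model expects RGB
    # YOLOv5 results: one row per detection, [x1, y1, x2, y2, conf, class]
    dets = model(rgb).xyxy[0].cpu().numpy()
    # deep-sort-realtime expects ([left, top, width, height], confidence, class)
    raw = [([x1, y1, x2 - x1, y2 - y1], conf, int(cls))
           for x1, y1, x2, y2, conf, cls in dets]
    for trk in tracker.update_tracks(raw, frame=frame):
        if not trk.is_confirmed():
            continue
        l, t, r, b = trk.to_ltrb()
        # One line per box in the MOTChallenge result format:
        # frame, id, bb_left, bb_top, bb_width, bb_height, conf, -1, -1, -1
        print(f"{frame_idx},{trk.track_id},{l:.1f},{t:.1f},{r - l:.1f},{b - t:.1f},1,-1,-1,-1")
cap.release()
```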
The dataset covers 2D object tracking in videos shot with both fixed and moving cameras. Most of the tracked objects are pedestrians, but a few other object classes appear as well.
I just downloaded the zip and am now looking at what is actually inside. A kernel will hopefully clarify how the ground truth can be read.
The dataset was originally downloaded from the MOT challenge site at https://motchallenge.net/data/2D_MOT_2015/#download
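For reference, MOT15 ground truth ships as one comma-separated `gt.txt` per sequence. A minimal reading sketch (the sequence path below is just one of the MOT15 training sequences, used as an example):

```python
import numpy as np

# MOT15 gt.txt: one row per box:
# frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z
# (x, y, z are world coordinates and are usually -1 in MOT15)
gt = np.loadtxt("train/ADL-Rundle-6/gt/gt.txt", delimiter=",")

first_frames = np.unique(gt[:, 0]).astype(int)[:3]
for frame_id in first_frames:
    boxes = gt[gt[:, 0] == frame_id]
    ids = boxes[:, 1].astype(int).tolist()
    print(f"frame {frame_id}: {len(boxes)} boxes, track ids {ids}")
```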
MOT20 is a dataset for multiple object tracking. The dataset contains 8 challenging video sequences (4 train, 4 test) captured in unconstrained, crowded environments such as train stations, town squares and a sports stadium.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
The Growing Strawberries Dataset (GSD) is a curated multiple-object tracking dataset inspired by the growth monitoring of strawberries. The frames were taken at hourly intervals by six cameras over a total of 16 months in 2021 and 2022, covering 12 plants in two greenhouses. The dataset consists of the hourly images collected during the cultivation period, bounding box (bbox) annotations of strawberry fruits, and consistent identification and tracking of individual strawberries over time. GSD contains two types of images, RGB (visual spectrum) and OCN (orange, cyan, near-infrared). Each image sequence comprises all the images captured by one camera during a year of cultivation, and the sequences follow a consistent naming format.
SOMPT22 is a multi-object tracking (MOT) benchmark focused on surveillance-style pedestrian tracking.
22 long video sequences (static pole-mounted cameras, 6–8 m height)
~51k annotated frames with bounding boxes + unique track IDs
Outdoor scenes with illumination changes, partial occlusions and appearance similarity
Single class: person
Split files ready for training/validation and standard MOT evaluation tools (see the evaluation sketch below)
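Since SOMPT22 ships in the standard MOTChallenge format, any MOT-compatible evaluator should work. A minimal sketch with the `py-motmetrics` package (my choice here, not something the dataset mandates), using made-up boxes for a single frame:

```python
import motmetrics as mm  # pip install motmetrics
import numpy as np

acc = mm.MOTAccumulator(auto_id=True)

# One frame of toy data: two ground-truth tracks vs. two hypotheses.
gt_ids, hyp_ids = [1, 2], [1, 2]
gt_boxes = np.array([[10, 20, 50, 80], [200, 40, 60, 90]])   # x, y, w, h
hyp_boxes = np.array([[12, 22, 50, 80], [205, 45, 60, 90]])

# Pairwise cost = 1 - IoU; pairs below the max_iou overlap cutoff become NaN (no match).
dists = mm.distances.iou_matrix(gt_boxes, hyp_boxes, max_iou=0.5)
acc.update(gt_ids, hyp_ids, dists)  # call once per frame over a real sequence

mh = mm.metrics.create()
print(mh.compute(acc, metrics=["mota", "motp", "idf1"], name="demo"))
```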
SOMPT22 aims to complement generic MOTChallenge-style datasets by stressing long-term ID maintenance under sparse-to-medium crowd density instead of dense, short clips.
Homepage → https://sompt22.github.io
Download → Google Drive link on the homepage
Citation →
```bibtex
@misc{simsek2022sompt22,
  author        = {Simsek, Fatih Emre and Cigla, Cevahir and Kayabol, Koray},
  title         = {SOMPT22: A Surveillance Oriented Multi-Pedestrian Tracking Dataset},
  year          = {2022},
  eprint        = {2208.02580},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
All datasets are derived from the official release of the 4th Anti-UAV Challenge (https://zenodo.org/records/15103888), featuring thermal infrared videos.
Most existing MOT datasets are captured using pinhole cameras, which are characterized by a narrow-FoV and linear sensor motion. However, when panoramic-FoV capture devices experience even slight movements, the entire scene can change drastically, posing significant challenges for object tracking. QuadTrack addresses this challenge by providing a benchmark specifically designed to test MOT algorithms under dynamic, non-linear motion conditions. It enables evaluating algorithm robustness in tracking objects with panoramic, non-uniform motion.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Enhanced animal welfare has emerged as a pivotal element in contemporary precision animal husbandry, with bovine monitoring constituting a significant facet of precision agriculture. The evolution of intelligent agriculture in recent years has significantly facilitated the integration of drone flight monitoring tools and innovative systems, leveraging deep learning to interpret bovine behavior. Smart drones, outfitted with monitoring systems, have evolved into viable solutions for wildlife protection and monitoring as well as animal husbandry. Nevertheless, challenges arise under actual and multifaceted ranch conditions, where scale alterations, unpredictable movements, and occlusions invariably influence the accurate tracking of unmanned aerial vehicles (UAVs). To address these challenges, this manuscript proposes a tracking algorithm based on deep learning, adhering to the Joint Detection Tracking (JDT) paradigm established by the CenterTrack algorithm. This algorithm is designed to satisfy the requirements of multi-object tracking in intricate practical scenarios. In comparison with several preeminent tracking algorithms, the proposed Multi-Object Tracking (MOT) algorithm demonstrates superior performance in Multiple Object Tracking Accuracy (MOTA), Multiple Object Tracking Precision (MOTP), and IDF1. Additionally, it exhibits enhanced efficiency in managing Identity Switches (IDSW), False Positives (FP), and False Negatives (FN). This algorithm proficiently mitigates the inherent challenges of MOT in complex, livestock-dense scenarios.
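For reference, the MOTA score mentioned above combines exactly the three error counts the abstract lists, per the standard CLEAR-MOT definition (FN: false negatives, FP: false positives, IDSW: identity switches, GT: ground-truth objects, summed over frames t):

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left( \mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t \right)}{\sum_t \mathrm{GT}_t}
```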
VETRA is a dataset for vehicle tracking in aerial image sequences and presents unique challenges such as low frame rates, small and fast-moving objects, as well as high camera movement. These characteristics allow for extended tracking of numerous vehicles with varying motion behaviors over large areas and pose new challenges for MOT algorithms. VETRA consists of 52 image sequences captured by airplanes and helicopters using DLR’s 3k and 4k camera systems. The acquisition sites are located in Germany and Austria. In addition to the classical training, validation and test sets, VETRA offers a second test set specifically designed for the application of large area monitoring (LAM). The LAM sequences are recorded over 7 rural roads and motorways with a fixed camera speed and configuration. Each road section is captured at 4 different times of the day, enabling the performance of MOT algorithms to be evaluated under different traffic loads in a static environment. Furthermore, the features extracted from the LAM sequences can be utilized in transport research applications.
Privacy policy: https://www.datainsightsmarket.com/privacy-policy
The global Modular Operating Theatre (MOT) market is experiencing robust growth, driven by increasing demand for advanced healthcare infrastructure, a surge in surgical procedures, and the need for efficient and flexible healthcare facilities. The market, estimated at $2.5 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 7% from 2025 to 2033, reaching an estimated market value of approximately $4.5 billion by 2033. Key factors propelling this growth include the rising prevalence of chronic diseases necessitating more surgeries, the increasing adoption of minimally invasive surgical techniques requiring specialized operating rooms, and the benefits of modular construction, such as faster deployment, cost-effectiveness, and adaptability to future needs. The segment comprising large hospitals accounts for a significant market share due to their higher capacity and investment capabilities. Stainless steel wall panels dominate the types segment due to their durability, ease of cleaning and sterilization, and overall hygiene benefits. Leading market players are continuously innovating and expanding their product portfolios, contributing to market expansion through technological advancements and strategic partnerships. The Asia-Pacific region, particularly India and China, is projected to witness significant growth due to burgeoning healthcare infrastructure development and rising disposable incomes.

While the market presents significant opportunities, challenges remain. High initial investment costs for MOTs could hinder market penetration, especially in resource-constrained settings. Furthermore, regulatory hurdles and stringent safety standards in various regions can pose obstacles to market expansion. However, the long-term cost-effectiveness, reduced construction time, and enhanced operational efficiency offered by MOTs are expected to outweigh these challenges, leading to sustained growth.

The competitive landscape is characterized by a mix of established players and emerging companies, fostering innovation and competitive pricing. Future growth will likely depend on the development of more technologically advanced MOTs with features such as integrated imaging systems, advanced ventilation, and enhanced infection control measures. The continued focus on improving patient safety and optimizing surgical workflows will also significantly influence market trends.
The RailEye3D dataset, a collection of train-platform scenarios for applications targeting passenger safety and automation of train dispatching, consists of 10 image sequences captured at 6 railway stations in Austria. Annotations for multi-object tracking are provided in both a unified format and the ground-truth format used in the MOTChallenge.
The TrajNet Challenge represents a large multi-scenario forecasting benchmark. The challenge consists of predicting 3161 human trajectories, observing for each trajectory 8 consecutive ground-truth positions (3.2 seconds), i.e., t−7, t−6, …, t, in world-plane coordinates (the so-called world-plane Human-Human protocol) and forecasting the following 12 (4.8 seconds), i.e., t+1, …, t+12. This 8-observed/12-predicted protocol is consistent with most trajectory forecasting approaches, which usually focus on the 5-dataset benchmark ETH-univ + ETH-hotel + UCY-zara01 + UCY-zara02 + UCY-univ. TrajNet substantially extends the 5-dataset scenario by diversifying the training data, thus stressing the flexibility and generalization an approach has to exhibit on unseen scenery and situations. In fact, TrajNet is a superset of diverse datasets that requires training on four families of trajectories, namely 1) BIWI Hotel (orthogonal bird's-eye flight view, moving people), 2) Crowds UCY (3 datasets, tilted bird's-eye view, cameras mounted on buildings or utility poles, moving people), 3) MOT PETS (multisensor, different human activities) and 4) Stanford Drone Dataset (8 scenes, high orthogonal bird's-eye flight view, different agents such as people, cars, etc.), for a total of 11448 trajectories. Testing is performed on held-out partitions of BIWI Hotel, Crowds UCY and the Stanford Drone Dataset, and is evaluated by a dedicated server (ground-truth testing data is unavailable to participants).
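To make the protocol concrete, here is a minimal constant-velocity baseline (a common naive reference for this kind of benchmark, not part of TrajNet itself) that maps 8 observed world-plane positions to 12 predicted ones:

```python
import numpy as np

def constant_velocity_forecast(observed: np.ndarray, horizon: int = 12) -> np.ndarray:
    """Extrapolate the last displacement of an (8, 2) observed track
    for `horizon` future steps (t+1 .. t+horizon)."""
    velocity = observed[-1] - observed[-2]      # last-step displacement
    steps = np.arange(1, horizon + 1)[:, None]  # 1..horizon as a column
    return observed[-1] + steps * velocity      # (horizon, 2) predicted points

# 8 observed positions (t-7 .. t), e.g. walking at constant speed along x
obs = np.stack([np.linspace(0.0, 2.8, 8), np.zeros(8)], axis=1)
pred = constant_velocity_forecast(obs)          # positions t+1 .. t+12
print(pred.shape)  # (12, 2)
```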
VisDrone is a large-scale benchmark with carefully annotated ground truth for various important computer vision tasks, aiming to make vision meet drones. The VisDrone2019 dataset is collected by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. The benchmark dataset consists of 288 video clips formed by 261,908 frames and 10,209 static images, captured by various drone-mounted cameras, covering a wide range of aspects including location (taken from 14 different cities separated by thousands of kilometers in China), environment (urban and country), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). Note that the dataset was collected using various drone platforms (i.e., drones with different models), in different scenarios, and under various weather and lighting conditions. These frames are manually annotated with more than 2.6 million bounding boxes of targets of frequent interest, such as pedestrians, cars, bicycles, and tricycles. Some important attributes, including scene visibility, object class and occlusion, are also provided for better data utilization.