The Moving MNIST dataset contains 10,000 video sequences, each consisting of 20 frames. In each video sequence, two digits move independently around the frame, which has a spatial resolution of 64×64 pixels. The digits frequently intersect with each other and bounce off the edges of the frame.
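For orientation, here is a minimal loading sketch in Python, assuming the commonly distributed mnist_test_seq.npy file from the original release (whose first axis indexes frames); the filename and local path are assumptions about your setup.

```python
# Minimal sketch of loading the canonical Moving MNIST file
# (mnist_test_seq.npy). The path is an assumption -- point it at
# wherever you downloaded the data.
import numpy as np

seqs = np.load("mnist_test_seq.npy")   # shape: (20, 10000, 64, 64)
seqs = seqs.transpose(1, 0, 2, 3)      # -> (num_videos, frames, H, W)

print(seqs.shape)                      # (10000, 20, 64, 64)
first_video = seqs[0]                  # one 20-frame 64x64 sequence
```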
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Detection Moving MNIST (Easy)
Description
Repository: https://github.com/maxploter/detection-moving-mnist
A synthetic video dataset for object detection and tracking, featuring moving MNIST digits with:
- 1-10 digits per sequence
- Linear trajectories with small random translations
- 128x128 resolution grayscale frames
- 20 frames per video sequence
- Digit size 28x28
- Per-frame annotations including:
  - Digit labels (0-9)
  - Center coordinates (x, y)
… See the full description on the dataset page: https://huggingface.co/datasets/Max-Ploter/detection-moving-mnist-easy.
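A hedged loading sketch using the Hugging Face datasets library; the repository id comes from the dataset page above, but the split name and annotation column names are assumptions, so inspect ds.features for the real schema.

```python
# Sketch of loading the dataset from the Hugging Face Hub.
# The repo id is taken from the dataset page; the "train" split and
# the per-frame annotation layout are assumptions -- check ds.features.
from datasets import load_dataset

ds = load_dataset("Max-Ploter/detection-moving-mnist-easy", split="train")
example = ds[0]
print(ds.features)   # shows the actual per-frame annotation schema
```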
MIT License: https://opensource.org/licenses/MIT
Note: This dataset includes four difficulty levels, and this is the Medium Version, where digits bounce back upon hitting the edges of the frame.
The dataset consists of videos of moving digits. Each video is paired with a corresponding text file that provides frame-by-frame captions describing the digits' movements.
This dataset is suitable for training diffusion models for video generation.
The other versions of this dataset are:
The dataset consists of 90 000 grayscale videos that show two objects of equal shape and size, in which one object approaches the other. The object's speed during the approach is modelled by a proportional-derivative (PD) controller. Three different shapes (rectangle, triangle, and circle) are provided. The initial configuration of the objects, such as position and color, was randomly sampled. Unlike the Moving MNIST dataset, the samples comprise a goal-oriented task: one object has to fully cover the other rather than move randomly, making the dataset better suited for testing the prediction capabilities of an ML model. For instance, one can use it as a toy dataset to investigate the capacity and output behavior of a deep neural network before testing it on real-world data.
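To make the PD-controlled approach behavior concrete, here is an illustrative one-dimensional proportional-derivative controller; the gains, time step, and step count are invented for the example and are not the parameters used to generate the dataset.

```python
# Illustrative 1-D PD controller driving one object's position toward
# a target. Gains kp/kd and the time step are made up for the example;
# the dataset's actual generation parameters are not stated here.
kp, kd, dt = 2.0, 0.8, 0.05

pos, vel, target = 0.0, 0.0, 10.0
prev_err = target - pos
for _ in range(400):
    err = target - pos
    accel = kp * err + kd * (err - prev_err) / dt  # P term + D term
    vel += accel * dt
    pos += vel * dt
    prev_err = err

print(round(pos, 3))  # pos has settled near the target (10.0)
```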
Brief Description
The Neuromorphic-MNIST (N-MNIST) dataset is a spiking version of the original frame-based MNIST dataset. It consists of the same 60 000 training and 10 000 testing samples as the original MNIST dataset, captured at the same visual scale (28x28 pixels). The N-MNIST dataset was captured by mounting the ATIS sensor on a motorized pan-tilt unit and having the sensor move while it views MNIST examples on an LCD monitor. A full description of the dataset and how it was created can be found in the paper below. Please cite this paper if you make use of the dataset.
Orchard, G.; Cohen, G.; Jayawant, A.; and Thakor, N., "Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades", Frontiers in Neuroscience, vol. 9, no. 437, Oct. 2015.
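Event-camera data such as N-MNIST arrives as a stream of (x, y, timestamp, polarity) events rather than frames; a common first step is to accumulate events into a 2-D histogram. Below is a generic sketch under that assumption, using synthetic events in place of a decoded N-MNIST sample (the actual binary format is documented with the dataset).

```python
# Generic sketch: accumulate an (x, y, t, polarity) event stream into a
# 28x28 frame. The random events stand in for a decoded N-MNIST sample;
# parsing the real .bin files is documented with the dataset release.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.integers(0, 28, n)
y = rng.integers(0, 28, n)
t = np.sort(rng.integers(0, 300_000, n))  # microsecond timestamps
p = rng.integers(0, 2, n)                 # polarity: 0 = OFF, 1 = ON

frame = np.zeros((28, 28))
np.add.at(frame, (y, x), np.where(p == 1, 1.0, -1.0))
print(frame.shape, frame.min(), frame.max())
```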
No license specified: https://academictorrents.com/nolicensespecified
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field. With some classification methods (particularly template-based methods, such as SVM and K-nearest neighbors), the error rate improves when the digits are centered by bounding box rather than center of mass.
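The center-of-mass centering step described above is straightforward to reproduce; here is a sketch assuming SciPy is available, using a toy off-center glyph in place of a real NIST digit.

```python
# Sketch of the centering described above: compute the pixel center of
# mass and translate the image so that point lands at the center of a
# 28x28 field. The "digit" here is a toy filled square, not real data.
import numpy as np
from scipy import ndimage

digit = np.zeros((28, 28))
digit[2:10, 3:11] = 1.0                  # toy glyph, off-center

cy, cx = ndimage.center_of_mass(digit)   # current center of mass
shift = (13.5 - cy, 13.5 - cx)           # move it to the field's center
centered = ndimage.shift(digit, shift, order=1)

print(ndimage.center_of_mass(centered))  # ~ (13.5, 13.5)
```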
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Recording of the MNIST dataset displayed on a screen as viewed by a dynamic vision sensor moving through a fixed trajectory on a pan-tilt unit. Details are in the listed paper.
MIT License: https://opensource.org/licenses/MIT
The dataset consists of 90 000 color videos that show a planar robot manipulator executing articulated manipulation tasks. More precisely, the manipulator grasps a circular object of random color and size and places it on top of a square object/platform, again of random color and size. The initial configurations (location, size, and color) of the objects were randomly sampled during generation. Unlike other datasets such as the Moving MNIST dataset, the samples comprise a goal-oriented task as described, making the dataset more suitable for testing the prediction capabilities of an ML model. For instance, one can use it as a toy dataset to investigate the capacity and output behavior of a deep neural network before testing it on real-world data.