27 datasets found
  1. Tensorflow Tfrecord Dataset

    • universe.roboflow.com
    zip
    Updated Jan 23, 2022
    + more versions
    Cite
    TensorflowTfrecord (2022). Tensorflow Tfrecord Dataset [Dataset]. https://universe.roboflow.com/tensorflowtfrecord/tensorflow-tfrecord-w5cw6/dataset/1
    Available download formats: zip
    Dataset updated
    Jan 23, 2022
    Dataset authored and provided by
    TensorflowTfrecord
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Licenseplate Bounding Boxes
    Description

    Tensorflow Tfrecord

    ## Overview
    
    Tensorflow Tfrecord is a dataset for object detection tasks - it contains Licenseplate annotations for 4,181 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Retina OCT TFRecord Dataset

    • kaggle.com
    zip
    Updated Apr 9, 2021
    + more versions
    Cite
    Harsh Soni (2021). Retina OCT TFRecord Dataset [Dataset]. https://www.kaggle.com/datasets/harshsoni/retina-oct-tfrecord-dataset/code
    Available download formats: zip (5811856844 bytes)
    Dataset updated
    Apr 9, 2021
    Authors
    Harsh Soni
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    [Source of description]

    Context

    http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5

    Retinal optical coherence tomography (OCT) is an imaging technique used to capture high-resolution cross sections of the retinas of living patients. Approximately 30 million OCT scans are performed each year, and the analysis and interpretation of these images takes up a significant amount of time (Swanson and Fujimoto, 2017).

    Figure 2. Representative Optical Coherence Tomography Images and the Workflow Diagram [Kermany et al. 2018] (image: https://i.imgur.com/fSTeZMd.png; source: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5)

    (A) (Far left) choroidal neovascularization (CNV) with neovascular membrane (white arrowheads) and associated subretinal fluid (arrows). (Middle left) Diabetic macular edema (DME) with retinal-thickening-associated intraretinal fluid (arrows). (Middle right) Multiple drusen (arrowheads) present in early AMD. (Far right) Normal retina with preserved foveal contour and absence of any retinal fluid/edema.

    Content

    The dataset is organized into 3 folders (train, test, val) and contains subfolders for each image category (NORMAL, CNV, DME, DRUSEN). There are more than 84K OCT examples (TFRecord).

    Examples are labeled as (disease)-(s.no.). The entire dataset is split into 3 parts, namely Train, Val and Test. To create the validation set, for each class, the original training set was grouped by patient ID and 10% was taken at random. This ensured that all images for each patient remained in only one of the sets.
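
    A patient-grouped split of the kind described above can be sketched in plain Python. The record layout (pairs of patient ID and file name), the function name `split_by_patient`, the fixed seed, and the choice to hold out 10% of patients rather than 10% of images are all assumptions made for illustration; the dataset author's procedure may differ in detail.

```python
import random
from collections import defaultdict

def split_by_patient(records, val_fraction=0.1, seed=0):
    """Split (patient_id, filename) records so that no patient spans both sets."""
    by_patient = defaultdict(list)
    for patient_id, filename in records:
        by_patient[patient_id].append(filename)

    patients = sorted(by_patient)
    rng = random.Random(seed)
    rng.shuffle(patients)

    # Hold out a fraction of the patients (at least one) for validation.
    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])

    train = [f for p in patients if p not in val_patients for f in by_patient[p]]
    val = [f for p in patients if p in val_patients for f in by_patient[p]]
    return train, val, val_patients

# Hypothetical records: 20 patients with 3 images each.
records = [(f"patient{p:02d}", f"CNV-{p:02d}-{i}.jpeg") for p in range(20) for i in range(3)]
train_files, val_files, val_patients = split_by_patient(records)
```

    Splitting on patient IDs instead of individual images is what keeps every patient's scans on one side of the split.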

    Optical coherence tomography (OCT) images (Spectralis OCT, Heidelberg Engineering, Germany) were selected from retrospective cohorts of adult patients from the Shiley Eye Institute of the University of California San Diego, the California Retinal Research Foundation, Medical Center Ophthalmology Associates, the Shanghai First People’s Hospital, and Beijing Tongren Eye Center between July 1, 2013 and March 1, 2017.

    Before training, each image went through a tiered grading system consisting of multiple layers of trained graders of increasing expertise for verification and correction of image labels. Each image imported into the database started with a label matching the most recent diagnosis of the patient. The first tier of graders consisted of undergraduate and medical students who had taken and passed an OCT interpretation course review. This first tier of graders conducted initial quality control and excluded OCT images containing severe artifacts or significant image resolution reductions. The second tier of graders consisted of four ophthalmologists who independently graded each image that had passed the first tier. The presence or absence of choroidal neovascularization (active or in the form of subretinal fibrosis), macular edema, drusen, and other pathologies visible on the OCT scan were recorded. Finally, a third tier of two senior independent retinal specialists, each with over 20 years of clinical retina experience, verified the true labels for each image. The dataset selection and stratification process is displayed in a CONSORT-style diagram in Figure 2B. To account for human error in grading, a validation subset of 993 scans was graded separately by two ophthalmologist graders, with disagreement in clinical labels arbitrated by a senior retinal specialist.

    For additional information: see http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5

    TFRecord Info [TensorFlow]

    The feature descriptor to decode each example is:

    ```python3
    import tensorflow as tf

    FEATURE_DESCRIPTOR = {
        'image': tf.io.FixedLenFeature([], tf.string),      # image encoded as a binary string
        'label': tf.io.FixedLenFeature([], tf.int64),       # integer-encoded label
        'label_name': tf.io.FixedLenFeature([], tf.string)  # name of class [NORMAL, CNV, DME, DRUSEN]
    }

    code2label = {
        'CNV': 0,
        'DME': 1,
        'DRUSEN': 2,
        'NORMAL': 3
    }
    ```
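
    When inspecting predictions it can help to go from the integer label back to the class name. The sketch below inverts the `code2label` mapping given in the description; `label2code` and `decode_labels` are hypothetical helper names, not part of the dataset.

```python
# Mapping copied from the dataset description.
code2label = {'CNV': 0, 'DME': 1, 'DRUSEN': 2, 'NORMAL': 3}

# Inverse lookup: integer label -> class name.
label2code = {label: code for code, label in code2label.items()}

def decode_labels(labels):
    """Translate a sequence of integer labels into class names."""
    return [label2code[int(label)] for label in labels]

names = decode_labels([3, 0, 2])
```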

    Acknowledgements

    Data: https://data.mendeley.com/datasets/rscbjbr9sj/2

    Citation: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5

    Description Source: https://www.kaggle.com/paultimothymooney/kermany2018

    Inspiration

    Deep learning based methods to detect and classify retinal abnormalities using OCT images.

  3. Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 4, 2024
    Cite
    Gong Chengjuan; Yin Ranyu; Long Tengfei; He Guojin; Jiao Weili; Wang Guizhou (2024). Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A Spatial-Temporal Approach and Dataset" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8419699
    Dataset updated
    Feb 4, 2024
    Dataset provided by
    Aerospace Information Research Institute, Chinese Academy of Sciences
    Authors
    Gong Chengjuan; Yin Ranyu; Long Tengfei; He Guojin; Jiao Weili; Wang Guizhou
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset is built for time-series Sentinel-2 cloud detection and stored in Tensorflow TFRecord (refer to https://www.tensorflow.org/tutorials/load_data/tfrecord).

    Each file is compressed in 7z format and can be decompressed using Bandizip or 7-Zip software.

    Dataset Structure:

    Each filename can be split into three parts using underscores. The first part indicates whether it is designated for training or validation ('train' or 'val'); the second part indicates the Sentinel-2 tile name, and the last part indicates the number of samples in this file.
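
    The three-part naming convention can be unpacked with a few lines of plain Python. This is a minimal sketch: the example file name is made up, tile names are assumed not to contain underscores, and the sample count is assumed to be the last part of the name with no extension after it.

```python
import os
from typing import NamedTuple

class ShardName(NamedTuple):
    split: str      # 'train' or 'val'
    tile: str       # Sentinel-2 tile name
    n_samples: int  # number of samples stored in the file

def parse_shard_name(path: str) -> ShardName:
    """Split a file name of the form <split>_<tile>_<count> into its parts."""
    name = os.path.basename(path)
    split, tile, count = name.split('_')
    if split not in ('train', 'val'):
        raise ValueError(f"unexpected split prefix: {split!r}")
    return ShardName(split, tile, int(count))

# Hypothetical file name following the three-part convention.
info = parse_shard_name('/data/train_T50TMK_128')
```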

    For each sample, it includes:

    Sample ID;

    Array of time series 4 band image patches in 10m resolution, shaped as (n_timestamps, 4, 42, 42);

    Label list indicating cloud cover status for the center 6 × 6 pixels of each timestamp;

    Ordinal list for each timestamp;

    Sample weight list (reserved);
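
    To relate a label to the pixels it describes, the position of the 6 × 6 window inside a 42 × 42 patch is needed. The sketch below assumes the window is exactly centered (the description says "center" without giving indices) and uses a nested list as a stand-in for one band of one timestamp.

```python
PATCH_SIZE = 42   # patch height/width in 10 m pixels
LABEL_SIZE = 6    # labeled window height/width in 10 m pixels

def center_window(patch_size: int, window_size: int) -> slice:
    """Index range of a window centered in a patch along one axis."""
    start = (patch_size - window_size) // 2
    return slice(start, start + window_size)

rows = cols = center_window(PATCH_SIZE, LABEL_SIZE)

# Plain nested list standing in for one band of one timestamp.
band = [[r * PATCH_SIZE + c for c in range(PATCH_SIZE)] for r in range(PATCH_SIZE)]
center = [row[cols] for row in band[rows]]
```

    Under that assumption the labeled window covers rows and columns 18 to 23 of each patch.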

    Here is a demonstration function for parsing the TFRecord file:

    import tensorflow as tf

    # Init a TensorFlow Dataset from a file name
    def parseRecordDirect(fname):
        sep = '/'
        parts = tf.strings.split(fname, sep)
        # File name layout: <train|val>_<tile name>_<number of samples>
        tn = tf.strings.split(parts[-1], sep='_')[-2]
        nn = tf.strings.to_number(tf.strings.split(parts[-1], sep='_')[-1], tf.dtypes.int64)
        # Repeat the tile name once per sample so it can be zipped with the records
        t = tf.data.Dataset.from_tensors(tn).repeat().take(nn)
        t1 = tf.data.TFRecordDataset(fname)
        ds = tf.data.Dataset.zip((t, t1))
        return ds

    keys_to_features_direct = {
        'localid': tf.io.FixedLenFeature([], tf.int64, -1),
        'image_raw_ldseries': tf.io.FixedLenFeature((), tf.string, ''),
        'labels': tf.io.FixedLenFeature((), tf.string, ''),
        'dates': tf.io.FixedLenFeature((), tf.string, ''),
        'weights': tf.io.FixedLenFeature((), tf.string, '')
    }

    # The Decoder (Optional)
    # This class needs a `decoder` module that the original snippet does not import.
    # It appears to be the TensorFlow Model Garden one (an assumption):
    #     from official.vision.dataloaders import decoder
    # Skip the class if you only need preprocessDirect below.
    class SeriesClassificationDirectDecorder(decoder.Decoder):
        """A tf.Example decoder for tfds classification datasets."""

        def __init__(self) -> None:
            super().__init__()

        def decode(self, tid, ds):
            parsed = tf.io.parse_single_example(ds, keys_to_features_direct)
            encoded = parsed['image_raw_ldseries']
            labels_encoded = parsed['labels']
            decoded = tf.io.decode_raw(encoded, tf.uint16)
            label = tf.io.decode_raw(labels_encoded, tf.int8)
            dates = tf.io.decode_raw(parsed['dates'], tf.int64)
            weight = tf.io.decode_raw(parsed['weights'], tf.float32)
            decoded = tf.reshape(decoded, [-1, 4, 42, 42])
            sample_dict = {
                'tid': tid,                    # tile ID
                'dates': dates,                # Date list
                'localid': parsed['localid'],  # sample ID
                'imgs': decoded,               # image array
                'labels': label,               # label list
                'weights': weight
            }
            return sample_dict

    # Simple function
    def preprocessDirect(tid, record):
        parsed = tf.io.parse_single_example(record, keys_to_features_direct)
        encoded = parsed['image_raw_ldseries']
        labels_encoded = parsed['labels']
        decoded = tf.io.decode_raw(encoded, tf.uint16)
        label = tf.io.decode_raw(labels_encoded, tf.int8)
        dates = tf.io.decode_raw(parsed['dates'], tf.int64)
        weight = tf.io.decode_raw(parsed['weights'], tf.float32)
        decoded = tf.reshape(decoded, [-1, 4, 42, 42])
        return tid, dates, parsed['localid'], decoded, label, weight

    t1 = parseRecordDirect('filename here')
    dataset = t1.map(preprocessDirect, num_parallel_calls=tf.data.experimental.AUTOTUNE)

    Class Definition:

    0: clear

    1: opaque cloud

    2: thin cloud

    3: haze

    4: cloud shadow

    5: snow

    Dataset Construction:

    First, we randomly generate 500 points for each tile, and all these points are aligned to the pixel grid center of the subdatasets in 60m resolution (e.g., B10) for consistency when comparing with other products. This is because other cloud detection methods may use the cirrus band, which is in 60m resolution, as a feature.

    Then, the time series image patches of two shapes are cropped with each point as the center. The patches of shape 42 × 42 are cropped from the bands in 10m resolution (B2, B3, B4, B8) and are used to construct this dataset. The patches of shape 348 × 348 are cropped from the True Colour Image (TCI; for details, see the Sentinel-2 user guide) file and are used to interpret class labels.

    The samples with a large number of timestamps could be time-consuming in the IO stage, thus the time series patches are divided into different groups with timestamps not exceeding 100 for every group.
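
    The grouping rule can be illustrated with a small chunking helper. The description does not say how the groups are formed, so this sketch assumes consecutive chunks in time order; `group_timestamps` is a hypothetical name.

```python
def group_timestamps(n_timestamps: int, max_per_group: int = 100):
    """Return (start, stop) index ranges that cover n_timestamps in order."""
    return [(start, min(start + max_per_group, n_timestamps))
            for start in range(0, n_timestamps, max_per_group)]

# A made-up series with 250 timestamps is cut into three groups.
groups = group_timestamps(250)
```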

  4. Tensorflow Malaria Classification Dataset

    • kaggle.com
    zip
    Updated Jun 16, 2023
    Cite
    Prajesh Sanghvi (2023). Tensorflow Malaria Classification Dataset [Dataset]. https://www.kaggle.com/datasets/prajeshsanghvi/tensorflow-malaria-classification-dataset
    Available download formats: zip (152255961 bytes)
    Dataset updated
    Jun 16, 2023
    Authors
    Prajesh Sanghvi
    Description

    Dataset is in .tfrecords format. Use the following code to parse the data into a TensorFlow-usable format:

    import tensorflow as tf
    FPATH = '/kaggle/working/tf_malaria.tfrecord'
    
    full_data = tf.data.TFRecordDataset(
      filenames = [FPATH]
    )
    
    def parse_tfrecords(example):
      
      feature_description = {
        "images": tf.io.FixedLenFeature([], tf.string),
        "labels": tf.io.FixedLenFeature([], tf.int64),
      }
      
      example = tf.io.parse_single_example(example, feature_description)
      example["images"] = tf.io.decode_jpeg(example["images"], channels = 3)
      
      return example["images"], example["labels"]
    
    parsed_full_data = (
       full_data
      .map(parse_tfrecords)
    )
    
  5. Crown of thorns starfish dataset in TFRecord

    • kaggle.com
    zip
    Updated Nov 23, 2021
    Cite
    Khanh (2021). Crown of thorns starfish dataset in TFRecord [Dataset]. https://www.kaggle.com/datasets/khanhlvg/crown-of-thorns-starfish-dataset-in-tfrecord
    Available download formats: zip (3192917187 bytes)
    Dataset updated
    Nov 23, 2021
    Authors
    Khanh
    Description

    This dataset is converted from the Tensorflow - Help Protect the Great Barrier Reef competition dataset.

  6. wikitext-v1-tfrecords

    • huggingface.co
    Updated Apr 28, 2023
    Cite
    TensorFlow TPU (2023). wikitext-v1-tfrecords [Dataset]. https://huggingface.co/datasets/tf-tpu/wikitext-v1-tfrecords
    Dataset updated
    Apr 28, 2023
    Dataset authored and provided by
    TensorFlow TPU
    Description

    This dataset repository contains the TFRecord shards of the [WikiText (v1) dataset](https://huggingface.co/datasets/wikitext). We used this script to prepare these TFRecord shards. For more details on how these TFRecord shards should be used, refer to the following tutorial: Training a masked language model end-to-end from scratch on TPUs.

  7. my tf record

    • kaggle.com
    zip
    Updated Feb 14, 2020
    + more versions
    Cite
    huiqin (2020). my tf record [Dataset]. https://www.kaggle.com/qinhui1999/my-tf-record
    Available download formats: zip (3260451 bytes)
    Dataset updated
    Feb 14, 2020
    Authors
    huiqin
    Description

    Dataset

    This dataset was created by huiqin

    Contents

  8. VinBig TFRecords for Object Detection

    • kaggle.com
    zip
    Updated Jan 30, 2021
    Cite
    Akshit Bhalla (2021). VinBig TFRecords for Object Detection [Dataset]. https://www.kaggle.com/bhallaakshit/vinbig-tfrecords-for-object-detection
    Available download formats: zip (670450876 bytes)
    Dataset updated
    Jan 30, 2021
    Authors
    Akshit Bhalla
    Description

    Context

    The VinBigData Chest X-ray Abnormalities Detection competition involves building an object detection model to classify and localize thoracic abnormalities. This dataset was created to support building TensorFlow models.

    Content

    The TFRecord format is a simple format for storing a sequence of binary records. This format is efficient in terms of storage and retrieval. This Kaggle dataset comprises 25 TFRecords (shards) created from the chest x-rays and their annotations from the VinBigData competition. TFRecord is the desired input format for building object detection models with TensorFlow 2. This Kaggle dataset was created keeping in mind that it can be cumbersome to create them. Anyone willing to build object detection models with TensorFlow may use these data to train and evaluate their own models with ease.

    NOTE

    1. The original x-rays were resized (max dimension = 500px) before TFRecord creation.
    2. The original x-rays were in DICOM format. They were converted to JPEG for TFRecord creation.
    3. The images were preprocessed (by applying CLAHE) before TFRecord creation.
    4. The original metadata had many annotations from different radiologists for the same thoracic abnormality. These annotations were simplified using Weighted Boxes Fusion. These data have also been made available in a CSV named data.csv.
    5. The TensorFlow 2 Object Detection API requires the classes to be from 1 to n and outputs 0 when no class is found. Since the labels start with 0 in the original metadata, a unit increment was made to them. The new mapping has also been made available as LabelMap.pbtxt.
    6. The TFRecord shards were created keeping in mind the distribution of abnormalities and that each image should belong to exactly one shard.
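
    The unit increment from note 5 can be illustrated as follows. This is only a sketch: the class names are placeholders (the real ones ship in LabelMap.pbtxt), and `to_pbtxt` is a hypothetical helper that renders the `item { id, name }` text layout commonly used for label maps.

```python
# Hypothetical zero-based class names; the real names ship in LabelMap.pbtxt.
original_classes = {0: 'abnormality_a', 1: 'abnormality_b', 2: 'abnormality_c'}

# Shift every id by one so that 0 stays free for "no class found".
shifted = {class_id + 1: name for class_id, name in original_classes.items()}

def to_pbtxt(label_map):
    """Render a label map in the item { id, name } text layout."""
    items = []
    for class_id, name in sorted(label_map.items()):
        items.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}")
    return "\n".join(items) + "\n"

pbtxt = to_pbtxt(shifted)
```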

    The entire pre-processing procedure can be found here.

    Acknowledgements

    The preprocessing and TFRecord creation would not be possible without this notebook.

  9. Cifar10 Dataset

    • universe.roboflow.com
    • opendatalab.com
    • +3more
    zip
    Updated Aug 10, 2022
    + more versions
    Cite
    Popular Benchmarks (2022). Cifar10 Dataset [Dataset]. https://universe.roboflow.com/popular-benchmarks/cifar10-uml7g/model/3
    Available download formats: zip
    Dataset updated
    Aug 10, 2022
    Dataset authored and provided by
    Popular Benchmarks
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Animals People
    Description

    CIFAR-10

    The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

    • More info on CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html
    • TensorFlow listing of the dataset: https://www.tensorflow.org/datasets/catalog/cifar10
    • GitHub repo for converting CIFAR-10 tarball files to png format: https://github.com/knjcode/cifar2png

    All images were sized 32x32 in the original dataset

    The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images [in the original dataset].

    The dataset is divided into five training batches and one test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5,000 images from each class.

    Here are the classes in the dataset, as well as 10 random images from each (image: Visualized CIFAR-10 Dataset Subset, https://i.imgur.com/EGA4Bbf.png).

    The classes are completely mutually exclusive. There is no overlap between automobiles and trucks. Automobile includes sedans, SUVs, things of that sort. Truck includes only big trucks. Neither includes pickup trucks.

    Version 1 (original-images_Original-CIFAR10-Splits):

    • Original images, with the original splits for CIFAR-10: train (83.33% of images - 50,000 images) set and test (16.67% of images - 10,000 images) set only.
    • This version was not trained

    Version 3 (original-images_trainSetSplitBy80_20):

    • Original, raw images, with the train set split to provide 80% of its images to the training set (approximately 40,000 images) and 20% of its images to the validation set (approximately 10,000 images)
    • Train/Valid/Test Split Rebalancing: https://blog.roboflow.com/train-test-split/ (image: https://i.imgur.com/kSPeKGn.png)
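
    The split sizes quoted for Version 1 and Version 3 follow from simple arithmetic on the original CIFAR-10 counts. The variable names below are illustrative only.

```python
TOTAL_IMAGES = 60_000
TRAIN_IMAGES = 50_000
TEST_IMAGES = 10_000

# Version 1: original train/test split as a share of all images.
train_share = round(100 * TRAIN_IMAGES / TOTAL_IMAGES, 2)
test_share = round(100 * TEST_IMAGES / TOTAL_IMAGES, 2)

# Version 3: the original train set divided 80/20 into train/validation.
v3_train = TRAIN_IMAGES * 80 // 100
v3_valid = TRAIN_IMAGES - v3_train
```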

    Citation:

    @TECHREPORT{Krizhevsky09learningmultiple,
      author = {Alex Krizhevsky},
      title = {Learning multiple layers of features from tiny images},
      institution = {},
      year = {2009}
    }
    
  10. Google Landmarks 2020 - Triplet Loss tfrecords

    • kaggle.com
    zip
    Updated Aug 5, 2020
    Cite
    Matt (2020). Google Landmarks 2020 - Triplet Loss tfrecords [Dataset]. https://www.kaggle.com/datasets/mattbast/google-landmarks-2020-tfrecords/code
    Available download formats: zip (4022222301 bytes)
    Dataset updated
    Aug 5, 2020
    Authors
    Matt
    Description

    Context

    The latest Google Landmark Retrieval competition contains a crazy large dataset (1.5 million images) and asks participants to only use notebooks. TPUs are a great way to quickly train models on large volumes of this data. To realise the full potential of a TPU while using Tensorflow it is worth feeding the data into it as tfrecords.

    Content

    This dataset contains a sample of the total dataset but transformed into tfrecords. As I created this for use with a model that uses triplet loss you will find three images inside each example (i.e. a triplet). If you'd like to find out more about how the dataset is formed you can check out the notebook I used to create it here.
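
    For readers unfamiliar with triplet loss, here is a minimal sketch of one common formulation on an (anchor, positive, negative) triplet of embedding vectors. The Euclidean distance, the margin of 1.0, and the toy vectors are assumptions for illustration; the model this dataset was built for may use a different variant.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on three embedding vectors."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy 2-D embeddings (made up).
# Positive close to the anchor, negative far away: the loss vanishes.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Positive far from the anchor, negative close: the loss is large.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```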

    Acknowledgements

    The notebook I used to create this dataset was largely inspired by Chris Deotte's notebook so this is me saying thanks 😁.

  11. Data from: RockNet: Rockfall and earthquake detection and association via...

    • search.dataone.org
    • data.niaid.nih.gov
    • +2more
    Updated Jul 15, 2025
    Cite
    Wu-Yu Liao; En-Jui Lee; Chung-Ching Wang; Po Chen; Floriane Provost; Clément Hibert; Jean-Philippe Malet (2025). Data from: RockNet: Rockfall and earthquake detection and association via multitask learning and transfer learning [Dataset]. http://doi.org/10.5061/dryad.tx95x6b2f
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Wu-Yu Liao; En-Jui Lee; Chung-Ching Wang; Po Chen; Floriane Provost; Clément Hibert; Jean-Philippe Malet
    Time period covered
    Jan 1, 2022
    Description

    Seismological data can provide timely information for slope failure hazard assessments, among which rockfall waveform identification is challenging for its high waveform variations across different events and stations. A rockfall waveform does not have typical body waves as earthquakes do, so researchers have made enormous efforts to explore characteristic function parameters for automatic rockfall waveform detection. With recent advances in deep learning, algorithms can learn to automatically map the input data to target functions. We develop RockNet via multitask and transfer learning; the network consists of a single-station detection model and an association model. The former discriminates rockfall and earthquake waveforms. The latter determines the local occurrences of rockfall and earthquake events by assembling the single-station detection model representations with multiple station recordings. RockNet achieves macro F1 scores of 0.990 and 0.981 in terms of discriminating earthqu...

    The raw seismic waveforms (.sac files) were recorded by the Geophones and DATA-CUBE (https://digos.eu/wp-content/uploads/2020/11/2020-10-21-Broschure.pdf) and converted to mseed format with the cub2mseed command (https://digos.eu/CUBE/DATA-CUBE-Download-Data-2017-06.pdf) of the CubeTools utility package (https://digos.eu/seismology/). The .tfrecord files are generated using the scripts hosted on GitHub, with a permanent identifier on Zenodo.

    Please clone the RockNet project on GitHub (https://github.com/tso1257771/RockNet) and put the downloaded dataset under the cloned directory.

    • The SAC software (Seismic Analysis Code, http://ds.iris.edu/ds/nodes/dmc/software/downloads/sac/102-0/) is used to process and visualize SAC files.
    • The ObsPy (https://docs.obspy.org/) package is used to process and manipulate SAC files in the Python interface.
    • The h5py package (https://docs.h5py.org/en/stable/) is used to store seismic data and header information (i.e., metadata, including station and labeled information) in HDF5 (https://hdfgroup.org/) format for broader usages.
    • The ObsPy and TensorFlow packages (https://www.tensorflow.org/) are collaboratively used to convert the SAC files into the TFRecord format (https://www.tensorflow.org/tutorials/load_data/tfrecord) for TensorFlow applications.

  12. 4 Bars Monophonic Melodies Dataset (Pitch Sequence)

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 1, 2025
    Cite
    Matteo Pettenò (2025). 4 Bars Monophonic Melodies Dataset (Pitch Sequence) [Dataset]. http://doi.org/10.5281/zenodo.13369389
    Available download formats: zip
    Dataset updated
    Sep 1, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Matteo Pettenò
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is designed for applications in music information retrieval, algorithmic composition, and machine learning tasks involving symbolic music data. It consists of a collection of unique 4-bar monophonic melodies represented as MIDI pitch sequences, each accompanied by thirteen attributes obtained with computational methods. The dataset has been generated using the Resolv system's pipelines starting from the full version of the Lakh MIDI Dataset (a collection of 176,581 unique MIDI files).

    The dataset has been used to train the models described in the papers:

    M. Pettenò, A. I. Mezza, and A. Bernardini, "Conditional Diffusion As Latent Constraints for Controllable Symbolic Music Generation", in Proc. of the 26th International Society for Music Information Retrieval Conference (ISMIR 2025), Daejeon, Korea, Sept. 21-25, 2025.

    M. Pettenò, A. I. Mezza, and A. Bernardini, "On the Joint Minimization of Regularization Loss Functions in Deep Variational Bayesian Methods for Attribute-Controlled Symbolic Music Generation", in Proc. of the 33rd European Signal Processing Conference (EUSIPCO 2025), Palermo, Italy, Sept. 8-12, 2025.

    The full article of this work also contains all the details on how the attributes have been obtained and on the implementation of the pipelines used for the generation. It is worth pointing out that melodies have been quantized to 4 steps per quarter and only 4/4 time signatures have been considered; hence each melody consists of N = 64 steps, where each step is either a number in the range [21-108] (the MIDI pitches available on a standard piano) or a token in the set {128, 129} for hold note and note off events, respectively. No additional performance features (e.g., dynamics, duration, or timing) are included, making this dataset a purely pitch-based collection.
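
    The step encoding can be checked with a few lines of plain Python. This is a sketch based only on the numbers stated above; the helper names and the example melody are made up.

```python
N_STEPS = 64                     # 4 bars x 4 quarters x 4 steps per quarter
PITCH_MIN, PITCH_MAX = 21, 108   # MIDI pitches of a standard piano
HOLD_NOTE, NOTE_OFF = 128, 129   # special tokens

def is_valid_melody(steps):
    """Check length and that every step is a piano pitch or a special token."""
    return len(steps) == N_STEPS and all(
        PITCH_MIN <= s <= PITCH_MAX or s in (HOLD_NOTE, NOTE_OFF) for s in steps
    )

def describe_step(step):
    """Human-readable name for one step value."""
    if step == HOLD_NOTE:
        return "hold"
    if step == NOTE_OFF:
        return "off"
    return f"pitch {step}"

# Made-up melody: pitch 60 held for one quarter note (4 steps), then silence.
melody = [60, HOLD_NOTE, HOLD_NOTE, HOLD_NOTE] + [NOTE_OFF] * 60
valid = is_valid_melody(melody)
```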

    Three datasets (train, validation and test) are provided as TFRecord files divided into 8 shards. They contain the data in TensorFlow's SequenceExample format, in which the feature_lists field contains the pitch sequence as a list of integers and the context field contains its attributes.

    The table below shows the numbers of unique melodies contained in the three datasets.

    | | Train | Validation | Test |
    |---|---|---|---|
    | Total unique melodies | 10,126,676 | 70,908 | 22,265 |

    And here is the list of computed attributes for each melody:

    | Attribute Name | SequenceExample Context Key | Description |
    |---|---|---|
    | Toussaint Metrical Complexity | toussaint | A metric that measures the degree of syncopation in rhythm patterns. |
    | Note Density | note_density | Measures the density of note onsets within the melody. |
    | Pitch Range | pitch_range | An indicator of how wide or narrow the melody is in terms of its pitch content. |
    | Contour | contour | Measures the degree to which the melody moves up or down. |
    | Note Change Ratio | note_change_ratio | The number of note changes normalized to the total number of steps N. |
    | Dynamic Range | dynamic_range | The difference between the maximum and minimum note velocities. |
    | Longest Repetitive Section | len_longest_rep_section | The length of the longest repetitive section in the melody normalized to the total number of steps N. A repetitive section is defined as a note that consecutively repeats at least r = 4 times. |
    | Repetitive Section Ratio | repetitive_section_ratio | The ratio between the total number of repetitive sections and a normalization factor N/r = 64/4 = 16. |
    | Hold Note Steps Ratio | ratio_hold_note_steps | The ratio between the number of steps where a note is held and the total steps N. |
    | Note Off Steps Ratio | ratio_note_off_steps | The ratio between the number of steps where no note is played and the total steps N. |
    | Unique Notes Ratio | unique_notes_ratio | The ratio of unique notes, defined with respect to the total number of MIDI pitches considered (88) and the total number of steps N. |
    | Unique Bigrams Ratio | unique_bigrams_ratio | The ratio of the unique bigrams in the melody with respect to the total number of steps N. |
    | Unique Trigrams Ratio | unique_trigrams_ratio | The ratio of the unique trigrams in the melody with respect to the total number of steps N. |
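
    Two of the simpler attributes can be recomputed directly from a pitch sequence, which is a quick way to sanity-check a decoded example. The sketch below follows the definitions of the hold-note and note-off ratios as worded in the list above; the dataset's own values come from the Resolv pipelines, which may differ in detail, and the melody is made up.

```python
HOLD_NOTE, NOTE_OFF = 128, 129   # special tokens from the dataset description

def hold_note_steps_ratio(steps):
    """Share of steps in which a note is held."""
    return sum(s == HOLD_NOTE for s in steps) / len(steps)

def note_off_steps_ratio(steps):
    """Share of steps in which no note is played."""
    return sum(s == NOTE_OFF for s in steps) / len(steps)

# Made-up 64-step melody: 16 onsets, 32 hold steps, 16 note-off steps.
melody = [60, HOLD_NOTE, HOLD_NOTE, NOTE_OFF] * 16
hold_ratio = hold_note_steps_ratio(melody)
off_ratio = note_off_steps_ratio(melody)
```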

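    To access the content of a SequenceExample, use tf.io.parse_single_sequence_example, for instance:

    import tensorflow as tf

    # serialized_example is one serialized record read from a shard
    context, sequence = tf.io.parse_single_sequence_example(
        serialized_example,
        context_features={
            "toussaint": tf.io.FixedLenFeature([], dtype=tf.float32, default_value=0.0),
            "note_density": tf.io.FixedLenFeature([], dtype=tf.float32, default_value=0.0),
        },
        # sequence_features must be a dict; the key follows the original snippet and the
        # integer dtype follows the description (assumed, not verified against the files)
        sequence_features={"pitch_seq": tf.io.FixedLenSequenceFeature([], dtype=tf.int64)},
    )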
  13. Dataset for OrbID: Identifying Orbcomm Satellite RF Fingerprints

    • zenodo.org
    application/gzip
    Updated Sep 20, 2025
    Cite
    Cedric Solenthaler; Joshua Smailes; Martin Strohmeier (2025). Dataset for OrbID: Identifying Orbcomm Satellite RF Fingerprints [Dataset]. http://doi.org/10.5281/zenodo.17123102
    Available download formats: application/gzip
    Dataset updated
    Sep 20, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cedric Solenthaler; Joshua Smailes; Martin Strohmeier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 2024 - Nov 2024
    Description

    OrbID dataset

    Overview and Context

    This is the dataset collected and used for training of the RF identification paper "OrbID: Identifying Orbcomm Satellite RF Fingerprints".
    It is in the tfrecord format.
    The dataset contains recordings of ORBCOMM messages that were parsed and downsampled.
    Only recordings where the Fletcher checksum was correct were retained.
    The oversampling ratio is 2.

    The paper can be found here: https://doi.org/10.14722/spacesec.2025.23031

    Structure

    This record contains three folders. The training_testing_data folder contains the data used for training and testing. The validation_data folder contains data used for generating the final figures and statistics used in the paper. The analysis_spoofed_dataset folder contains spoofed data that was replayed with a UHDmini SDR and fed into the collection system via coax. This data was also only used in the final analysis and not during training.


    Features


    - id: a number representing the transmitting satellite
    - sample_I: Array of floats that represent the in-phase part of the signal
    - sample_Q: Array of floats that represent the quadrature part of the signal
    - snrdb: the estimated SNR of the signal (in decibels), calculated as {energy in bandwidth of signal}/{energy in adjacent frequencies}
    - loc: An ID encoding the receiver location
    - ant: An ID encoding the used antenna type
    - sdr: An ID encoding the used SDR
    - timestamp: A Unix epoch timestamp of the recording time
    - idencoded: A bool that states if the signal contains identifying information (e.g. satellite orbital data, used frequencies, id, ...)
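The snrdb feature is described only as a ratio of two energies expressed in decibels. As a minimal sketch, the standard conversion of a power or energy ratio to decibels looks like the following. How the authors estimate the two energies is not stated on this page, so the function name and its inputs are assumptions made purely for illustration.

```python
import math


def snr_db(energy_in_band: float, energy_adjacent: float) -> float:
    """Convert an in-band / adjacent-band energy ratio to decibels.

    Hypothetical helper: the dataset page gives only the ratio, not the
    estimator used to measure either energy.
    """
    if energy_in_band <= 0 or energy_adjacent <= 0:
        raise ValueError("energies must be positive")
    # Standard decibel conversion for a ratio of energies (or powers).
    return 10.0 * math.log10(energy_in_band / energy_adjacent)
```

For example, an in-band energy one hundred times the adjacent-band energy corresponds to 20 dB.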


    Used Encoding IDs

    Locations: 1 St. Gallen, 2 Zurich

    Antennas: 1 QFH, 2 Half Wave Dipole, 3 V-Dipole, 4 Turnstile

    SDRs: 1 RTLSDR (sample rate 1.2288MS/s), 2 HackRF one (sample rate 2MS/s), 3 UHDmini (only used for replaying)
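The encoding IDs listed above can be turned into readable labels with plain lookup tables. This is a small sketch: the dictionary and function names are illustrative and not part of the dataset, while the values are copied from the list above.

```python
# Lookup tables copied from the "Used Encoding IDs" list.
LOCATIONS = {1: "St. Gallen", 2: "Zurich"}
ANTENNAS = {1: "QFH", 2: "Half Wave Dipole", 3: "V-Dipole", 4: "Turnstile"}
SDRS = {1: "RTLSDR", 2: "HackRF one", 3: "UHDmini"}
# Sample rates in samples per second; the UHDmini was only used for replaying,
# and no sample rate is given for it.
SDR_SAMPLE_RATES_HZ = {1: 1_228_800, 2: 2_000_000}


def describe_receiver(loc: int, ant: int, sdr: int) -> dict:
    """Map the integer loc/ant/sdr features of one record to readable labels."""
    return {
        "location": LOCATIONS.get(loc, f"unknown ({loc})"),
        "antenna": ANTENNAS.get(ant, f"unknown ({ant})"),
        "sdr": SDRS.get(sdr, f"unknown ({sdr})"),
        "sample_rate_hz": SDR_SAMPLE_RATES_HZ.get(sdr),  # None if not listed
    }
```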

    Example usage (python)

    import os

    import tensorflow as tf  # tested with tensorflow[and-cuda]==2.17.1 and python 3.10.0

    # Each message holds a fixed number of I/Q samples.
    samples_per_message = 196
    feature_description = {
        'id': tf.io.FixedLenFeature([], tf.int64),
        'sample_I': tf.io.FixedLenFeature([samples_per_message], tf.float32),
        'sample_Q': tf.io.FixedLenFeature([samples_per_message], tf.float32),
        'snrdb': tf.io.FixedLenFeature([], tf.float32),
        'loc': tf.io.FixedLenFeature([], tf.int64),
        'ant': tf.io.FixedLenFeature([], tf.int64),
        "sdr": tf.io.FixedLenFeature([], tf.int64),
        "timestamp": tf.io.FixedLenFeature([], tf.int64),
        "idencoded": tf.io.FixedLenFeature([], tf.int64),
    }

    data_dir = ...  # path to the folder that holds the tfrecord files
    # Collect every .tfrecord file in the folder.
    files = [entry.path for entry in os.scandir(data_dir) if entry.path.endswith(".tfrecord")]

    dataset = tf.data.TFRecordDataset(files)

    def _parse_ds_function(example_proto):
        # Parse the input tf.train.Example proto using the dictionary above.
        return tf.io.parse_single_example(example_proto, feature_description)

    dataset = dataset.map(_parse_ds_function)
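Once a record has been parsed, the I and Q arrays are usually combined into complex baseband samples, and the timestamp is converted into a readable time. The sketch below uses plain Python lists so it runs without TensorFlow. It assumes the tensors have already been converted (for example with `.numpy().tolist()`) and that the timestamp is in seconds, which the page does not state explicitly. Both helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_complex_samples(sample_i, sample_q):
    """Combine in-phase and quadrature values into complex samples I + jQ."""
    if len(sample_i) != len(sample_q):
        raise ValueError("sample_I and sample_Q must have the same length")
    return [complex(i, q) for i, q in zip(sample_i, sample_q)]


def recording_time_utc(timestamp):
    """Interpret the timestamp feature as seconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```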

    Citation

    When using this dataset, please cite our paper "OrbID: Identifying Orbcomm Satellite RF Fingerprints". The BibTeX citation is given below.

    @inproceedings{solenthaler_orbid_2025,
    address = {San Diego, CA, USA},
    title = {{OrbID}: {Identifying} {Orbcomm} {Satellite} {RF} {Fingerprints}},
    isbn = {979-8-9919276-1-1},
    shorttitle = {{OrbID}},
    url = {https://www.ndss-symposium.org/wp-content/uploads/spacesec25-final31.pdf},
    doi = {10.14722/spacesec.2025.23031},
    booktitle = {Proceedings 2025 {Workshop} on {Security} of {Space} and {Satellite} {Systems}},
    publisher = {Internet Society},
    author = {Solenthaler, Cédric and Smailes, Joshua and Strohmeier, Martin},
    year = {2025},
    }

  14. Raccoon Dataset

    • universe.roboflow.com
    zip
    Updated May 5, 2021
    Cite
    Roboflow (2021). Raccoon Dataset [Dataset]. https://universe.roboflow.com/roboflow-gw7yv/raccoon/model/6
    Explore at:
    Available download formats: zip
    Dataset updated
    May 5, 2021
    Dataset authored and provided by
    Roboflow https://roboflow.com/
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Raccoons
    Description

    Overview

    This dataset contains 196 images of raccoons and 213 bounding boxes (some images have two raccoons). This is a single class problem, and images vary in dimensions. It's a great first dataset for getting started with object detection.

    This dataset was originally collected by Dat Tran, released with MIT license, and posted here with his permission.

    Raccoon Example: https://i.imgur.com/cRQJ1PB.png

    Per Roboflow's Dataset Health Check, here's how images vary in size:

    Raccoon Aspect Ratio: https://i.imgur.com/sXc3iAF.png

    Use Cases

    Find raccoons!

    This dataset is a great starter dataset for building an object detection model. Dat has written a comprehensive tutorial here.

    Getting Started

    Fork or download this dataset and follow Dat's tutorial for more.

  15. Udacity Self Driving Car Dataset

    • universe.roboflow.com
    • kaggle.com
    zip
    Updated Mar 24, 2025
    Cite
    Roboflow (2025). Udacity Self Driving Car Dataset [Dataset]. https://universe.roboflow.com/roboflow-gw7yv/self-driving-car/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 24, 2025
    Dataset authored and provided by
    Roboflow https://roboflow.com/
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Obstacles
    Description

    Overview

    The original Udacity Self Driving Car Dataset is missing labels for thousands of pedestrians, bikers, cars, and traffic lights. This will result in poor model performance. When used in the context of self driving cars, this could even lead to human fatalities.

    We re-labeled the dataset to correct errors and omissions. We have provided convenient downloads in many formats including VOC XML, COCO JSON, Tensorflow Object Detection TFRecords, and more.

    Some examples of labels missing from the original dataset: https://i.imgur.com/A5J3qSt.jpg

    Stats

    The dataset contains 97,942 labels across 11 classes and 15,000 images. There are 1,720 null examples (images with no labels).

    All images are 1920x1200 (download size ~3.1 GB). We have also provided a version downsampled to 512x512 (download size ~580 MB) that is suitable for most common machine learning models (including YOLO v3, Mask R-CNN, SSD, and mobilenet).

    Annotations have been hand-checked for accuracy by Roboflow.

    Class Balance: https://i.imgur.com/bOFkueI.png

    Annotation Distribution: https://i.imgur.com/NwcrQKK.png

    Use Cases

    Udacity is building an open source self driving car! You might also try using this dataset to do person-detection and tracking.

    Using this Dataset

    Our updates to the dataset are released under the MIT License (the same license as the original annotations and images).

    Note: the dataset contains many duplicated bounding boxes for the same subject, which we have not corrected. You will probably want to filter them out by computing the IoU between boxes of the same class and dropping those that overlap 100%; otherwise they could affect your model performance (especially in stoplight detection, which seems to suffer from an especially severe case of duplicated bounding boxes).
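The duplicate filtering suggested in the note can be sketched as follows. This is an illustration, not Roboflow's own tooling: it assumes boxes are given as `(x_min, y_min, x_max, y_max)` tuples paired with a class name, and it keeps the first box of each duplicate group. The default threshold of 1.0 matches the "100% overlapping" wording. A slightly lower value for near-duplicates is a choice the note does not prescribe.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_duplicate_boxes(annotations, threshold=1.0):
    """Keep one box per group of same-class boxes whose IoU reaches the threshold.

    annotations: list of (class_name, box) pairs for a single image.
    """
    kept = []
    for cls, box in annotations:
        is_duplicate = any(
            cls == kept_cls and iou(box, kept_box) >= threshold
            for kept_cls, kept_box in kept
        )
        if not is_duplicate:
            kept.append((cls, box))
    return kept
```

The comparison is quadratic in the number of boxes per image, which is usually acceptable for the handful of annotations a single frame carries.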

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.


  16. wikisum

    • github.com
    • opendatalab.com
    Updated Jul 7, 2023
    Cite
    Google (2023). wikisum [Dataset]. https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/wikisum/README.md
    Explore at:
    Dataset updated
    Jul 7, 2023
    Dataset provided by
    Google http://google.com/
    Description

    The dataset from the paper Generating Wikipedia by Summarizing Long Sequences. The task is to generate a Wikipedia article based on the contents of the cited references in that article and the top 10 Google search results for the article's title.

    There are 2 sources for the reference URLs used:

    1. CommonCrawl, an open-source crawl of the web. The advantage of using CommonCrawl is that the dataset is perfectly reproducible. However, there is limited coverage of the reference URLs.
    2. Live web fetches. Coverage is considerably increased, but the content is subject to change.

    The dataset includes:

    URLs: The dataset contains ~90M URLs total (~2.3M Wikipedia articles, each with ~40 reference URLs). The URLs in the dataset are available in sharded JSON files.

    Wikipedia Articles: We have processed the Wikipedia articles slightly to extract the title, section breaks, and section headings. The processed Wikipedia content is available in sharded TFRecord files containing serialized tensorflow.Example protocol buffers.

    CommonCrawl References Index: To enable efficiently extracting the reference URLs from CommonCrawl, we provide a JSON file per CommonCrawl file which maps a reference URL contained in that CommonCrawl file to a list of shard ids. These shards are the ones that contain one or more Wikipedia articles that cite this reference.
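To illustrate how such an index might be used, the sketch below inverts a URL-to-shards mapping so that each shard id lists the reference URLs it needs. The exact JSON layout is not given here, so a flat object of the form `{reference_url: [shard_id, ...]}` is assumed. The function name and the example URLs are hypothetical.

```python
import json
from collections import defaultdict


def urls_by_shard(index_json: str) -> dict:
    """Invert an assumed {reference_url: [shard_id, ...]} mapping into {shard_id: [url, ...]}."""
    url_to_shards = json.loads(index_json)
    shard_to_urls = defaultdict(list)
    for url, shard_ids in url_to_shards.items():
        for shard_id in shard_ids:
            shard_to_urls[shard_id].append(url)
    return dict(shard_to_urls)


# Hypothetical index content for a single CommonCrawl file.
example_index = '{"http://example.com/a": [0, 3], "http://example.com/b": [3]}'
grouped = urls_by_shard(example_index)
```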

  17. WIDER FACE

    • kaggle.com
    • tensorflow.org
    • +4more
    zip
    Updated Jun 14, 2022
    Cite
    Bosco Yung (2022). WIDER FACE [Dataset]. https://www.kaggle.com/datasets/boscoyung/wider-face
    Explore at:
    Available download formats: zip (3674709308 bytes)
    Dataset updated
    Jun 14, 2022
    Authors
    Bosco Yung
    Description

    Dataset

    This dataset was created by Bosco Yung

    Contents

    Source: http://shuoyang1213.me/WIDERFACE/

  18. One Piece character detection with TFrecords

    • kaggle.com
    zip
    Updated Apr 25, 2022
    Cite
    Ibrahim Serouis 99 (2022). One Piece character detection with TFrecords [Dataset]. https://www.kaggle.com/ibrahimserouis99/one-piece-character-detection-with-tfrecords
    Explore at:
    Available download formats: zip (555182807 bytes)
    Dataset updated
    Apr 25, 2022
    Authors
    Ibrahim Serouis 99
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    I wanted to build an object detection model for an unusual task, with my own dataset, for learning purposes.

    Content

    Data

    This folder contains the training and validation datasets, as TFRecord files.

    TFRecords

    This folder contains approximately 300 image annotations stored as TFRecord files.

    Test images

    As the name suggests, this folder contains some images that will be used for testing purposes (post training).

    Workspace

    This folder contains a ready-to-use workspace, including:
    - The pipeline configuration file
    - The model that will be used for fine-tuning (in our case, an SSD Resnet50 Deep Learning model)

    Acknowledgements

    Creating this dataset and training the model would have been impossible without:
    - Parthbkgadoya's notebook on how to set up the Tensorflow Object Detection API on Kaggle
    - Microsoft Visual Object Tagging Tool (VoTT) for image annotation
    - Fatkun Batch Download Image for bulk image download
    - This tutorial by Tensorflow Object Detection API on how to train a custom object detector

  19. Self Driving Car Re Encode Dataset

    • universe.roboflow.com
    zip
    Updated Feb 7, 2020
    Cite
    Brad Dwyer (2020). Self Driving Car Re Encode Dataset [Dataset]. https://universe.roboflow.com/brad-dwyer/self-driving-car-re-encode/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 7, 2020
    Dataset authored and provided by
    Brad Dwyer
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Obstacles
    Description

    Overview

    The original Udacity Self Driving Car Dataset is missing labels for thousands of pedestrians, bikers, cars, and traffic lights. This will result in poor model performance. When used in the context of self driving cars, this could even lead to human fatalities.

    We re-labeled the dataset to correct errors and omissions. We have provided convenient downloads in many formats including VOC XML, COCO JSON, Tensorflow Object Detection TFRecords, and more.

    Some examples of labels missing from the original dataset: https://i.imgur.com/A5J3qSt.jpg

    Use Cases

    Udacity is building an open source self driving car! You might also try using this dataset to do person-detection and tracking.

    Using this Dataset

    Our updates to the dataset are released under the same license as the original.

    Note: the dataset contains many duplicated bounding boxes for the same subject, which we have not corrected. You will probably want to filter them out by computing the IoU between boxes of the same class and dropping those that overlap 100%; otherwise they could affect your model performance (especially in stoplight detection, which seems to suffer from an especially severe case of duplicated bounding boxes).

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.


  20. Data from: The QCML dataset, Quantum chemistry reference data from 33.5M DFT...

    • zenodo.org
    bin, text/x-python
    Updated Mar 5, 2025
    Cite
    Stefan Ganscha; Oliver T. Unke; Daniel Ahlin; Hartmut Maennel; Sergii Kashubin; Klaus-Robert Mueller (2025). Data from: The QCML dataset, Quantum chemistry reference data from 33.5M DFT and 14.7B semi-empirical calculations [Dataset]. http://doi.org/10.5281/zenodo.14859804
    Explore at:
    Available download formats: text/x-python, bin
    Dataset updated
    Mar 5, 2025
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Stefan Ganscha; Oliver T. Unke; Daniel Ahlin; Hartmut Maennel; Sergii Kashubin; Klaus-Robert Mueller
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    2024
    Description

    Machine learning (ML) methods enable prediction of the properties of chemical structures without computationally expensive ab initio calculations. The quality of such predictions depends on the reference data that was used to train the model. In this work, we introduce the QCML dataset: A comprehensive dataset for training ML models for quantum chemistry. The QCML dataset systematically covers chemical space with small molecules consisting of up to 8 heavy atoms and includes elements from a large fraction of the periodic table, as well as different electronic states. Starting from chemical graphs, conformer search and normal mode sampling are used to generate both equilibrium and off-equilibrium 3D structures, for which various properties are calculated with semi-empirical methods (14.7 billion entries) and density functional theory (33.5 million entries). The covered properties include energies, forces, multipole moments, and other quantities, e.g. Kohn-Sham matrices. We provide a first demonstration of the utility of our dataset by training ML-based force fields on the data and applying them to run molecular dynamics simulations.

    The data is available as a TensorFlow Datasets (TFDS) dataset and can be accessed from the publicly available Google Cloud Storage bucket at gs://qcml-datasets/tfds/. (See "Directory structure" below.)

    For information on different access options (command-line tools, client libraries, etc), please see https://cloud.google.com/storage/docs/access-public-data.

    Directory structure

    • gs://qcml-datasets (GCS Bucket)
      • tfds (TFDS data directory)
        • qcml (TFDS dataset name)
          • dft_atomic_numbers (TFDS builder config name)
            • 1.0.0 (Current version)
              • dataset_info.json
              • features.json
              • qcml-full.tfrecord-X-of-Y (TFDS data shards, see below)
          • ...
          • dft_positions
          • xtb_all
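Following the layout above, the directory for a given builder config can be assembled from its parts. The helper below is a minimal sketch with an illustrative name. The commented-out lines show how such a directory could be handed to `tfds.builder_from_directory`, assuming `tensorflow_datasets` is installed and the bucket is reachable. The split name `full` is inferred from the shard file name `qcml-full.tfrecord-X-of-Y` and is an assumption.

```python
GCS_TFDS_ROOT = "gs://qcml-datasets/tfds"


def builder_dir(config_name: str, version: str = "1.0.0", dataset_name: str = "qcml") -> str:
    """Build the TFDS directory path for one builder config, per the listed layout."""
    return f"{GCS_TFDS_ROOT}/{dataset_name}/{config_name}/{version}"


# Optional, requires tensorflow_datasets and network access to the bucket:
# import tensorflow_datasets as tfds
# builder = tfds.builder_from_directory(builder_dir("dft_positions"))
# ds = builder.as_dataset(split="full")  # split name assumed from the shard names
```

Starting with one of the small configs, such as `dft_positions` at about 7 GB, avoids accidentally streaming one of the multi-terabyte configs listed below.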

    Builder configurations

    Format: Builder config name: number of shards (rounded total size)

    Semi-empirical calculations:

    • xtb_all: 85000 (69 TB)

    DFT calculations:

    • dft_atomic_numbers: 11 (3 GB)
    • dft_d4_atomic_charges: 11 (4 GB)
    • dft_d4_c6_coefficients: 11 (4 GB)
    • dft_d4_correction: 11 (8 GB)
    • dft_d4_energy: 11 (2 GB)
    • dft_d4_forces: 11 (7 GB)
    • dft_d4_polarizabilities: 11 (4 GB)
    • dft_force_field: 11 (18 GB)
    • dft_force_field_d4: 110 (24 GB)
    • dft_force_field_mbd: 110 (24 GB)
    • dft_gfn0_dipole: 11 (3 GB)
    • dft_gfn0_eeq_charges: 11 (4 GB)
    • dft_gfn0_energy: 11 (2 GB)
    • dft_gfn0_forces: 11 (7 GB)
    • dft_gfn0_formation_energy: 11 (3 GB)
    • dft_gfn0_orbital_energies_a: 11 (8 GB)
    • dft_gfn0_orbital_occupations_a: 11 (8 GB)
    • dft_gfn0_wiberg_bond_orders: 110 (29 GB)
    • dft_gfn2_dipole: 11 (3 GB)
    • dft_gfn2_energy: 11 (2 GB)
    • dft_gfn2_forces: 11 (7 GB)
    • dft_gfn2_formation_energy: 11 (3 GB)
    • dft_gfn2_mulliken_charges: 11 (4 GB)
    • dft_gfn2_orbital_energies_a: 11 (7 GB)
    • dft_gfn2_orbital_occupations_a: 11 (7 GB)
    • dft_gfn2_wiberg_bond_orders: 110 (29 GB)
    • dft_is_outlier: 11 (2 GB)
    • dft_mbd_c6_coefficients: 11 (4 GB)
    • dft_mbd_correction: 11 (8 GB)
    • dft_mbd_energy: 11 (2 GB)
    • dft_mbd_forces: 11 (7 GB)
    • dft_mbd_polarizabilities: 11 (4 GB)
    • dft_metadata: 11 (11 GB)
    • dft_multipole_moments: 11 (8 GB)
    • dft_pbe0_core_hamiltonian_matrix: 110000 (30 TB)
    • dft_pbe0_density_matrix_a: 110000 (30 TB)
    • dft_pbe0_density_matrix_b: 110000 (3 TB)
    • dft_pbe0_dipole: 11 (3 GB)
    • dft_pbe0_electronic_free_energy: 11 (3 GB)
    • dft_pbe0_energy: 11 (2 GB)
    • dft_pbe0_forces: 11 (7 GB)
    • dft_pbe0_formation_energy: 11 (3 GB)
    • dft_pbe0_grid_density_a: 110000 (27 TB)
    • dft_pbe0_grid_density_b: 110000 (3 TB)
    • dft_pbe0_grid_density_gradient_a: 110000 (81 TB)
    • dft_pbe0_grid_density_gradient_b: 110000 (10 TB)
    • dft_pbe0_grid_density_laplacian_a: 110000 (27 TB)
    • dft_pbe0_grid_density_laplacian_b: 110000 (3 TB)
    • dft_pbe0_grid_kinetic_energy_density_a: 110000 (27 TB)
    • dft_pbe0_grid_kinetic_energy_density_b: 110000 (3 TB)
    • dft_pbe0_grid_points: 110000 (81 TB)
    • dft_pbe0_grid_weight: 110000 (27 TB)
    • dft_pbe0_guid: 11 (3 GB)
    • dft_pbe0_hamiltonian_matrix_a: 110000 (30 TB)
    • dft_pbe0_hamiltonian_matrix_b: 110000 (3 TB)
    • dft_pbe0_has_equal_a_b_electrons: 11 (3 GB)
    • dft_pbe0_hexadecapole: 11 (3 GB)
    • dft_pbe0_hirshfeld_charges: 11 (4 GB)
    • dft_pbe0_hirshfeld_dipoles: 11 (8 GB)
    • dft_pbe0_hirshfeld_quadrupoles: 11 (11 GB)
    • dft_pbe0_hirshfeld_spins: 11 (3 GB)
    • dft_pbe0_hirshfeld_volume_ratios: 11 (4 GB)
    • dft_pbe0_hirshfeld_volumes: 11 (4 GB)
    • dft_pbe0_loewdin_charges: 11 (4 GB)
    • dft_pbe0_loewdin_spins: 11 (3 GB)
    • dft_pbe0_mulliken_charges: 11 (4 GB)
    • dft_pbe0_mulliken_spins: 11 (3 GB)
    • dft_pbe0_num_scf_iterations: 11 (3 GB)
    • dft_pbe0_octupole: 11 (3 GB)
    • dft_pbe0_orbital_coefficients_a: 110000 (30 TB)
    • dft_pbe0_orbital_coefficients_b: 110000 (3 TB)
    • dft_pbe0_orbital_energies_a: 110 (44 GB)
    • dft_pbe0_orbital_energies_b: 11 (8 GB)
    • dft_pbe0_orbital_occupations_a: 110 (44 GB)
    • dft_pbe0_orbital_occupations_b: 11 (8 GB)
    • dft_pbe0_overlap_matrix: 110000 (30 TB)
    • dft_pbe0_quadrupole: 11 (3 GB)
    • dft_pbe0_zero_broadening_corrected_energy: 11 (3 GB)
    • dft_population_analysis: 11 (19 GB)
    • dft_positions: 11 (7 GB)
Cite
TensorflowTfrecord (2022). Tensorflow Tfrecord Dataset [Dataset]. https://universe.roboflow.com/tensorflowtfrecord/tensorflow-tfrecord-w5cw6/dataset/1

Tensorflow Tfrecord Dataset

tensorflow-tfrecord-w5cw6

tensorflow-tfrecord-dataset

Explore at:
69 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip
Dataset updated
Jan 23, 2022
Dataset authored and provided by
TensorflowTfrecord
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Variables measured
Licenseplate Bounding Boxes
Description

Tensorflow Tfrecord

## Overview

Tensorflow Tfrecord is a dataset for object detection tasks - it contains Licenseplate annotations for 4,181 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).