11 datasets found

fmb
tensorflow.org
huggingface.co
Updated May 31, 2024
Cite
(2024). fmb [Dataset]. https://www.tensorflow.org/datasets/catalog/fmb
Description
Our dataset consists of objects of diverse appearance and geometry. It requires multi-stage and multi-modal fine motor skills to successfully assemble the pegs onto an unfixed board in a randomized scene. We collected a total of 22,550 trajectories across two different tasks on a Franka Panda arm. We record the trajectories from 2 global views and 2 wrist views. Each view contains both an RGB image and a depth map.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('fmb', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
forest_fires
tensorflow.org
Updated Nov 23, 2022
Cite
(2022). forest_fires [Dataset]. https://www.tensorflow.org/datasets/catalog/forest_fires
Description
This is a regression task, where the aim is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data.

Data Set Information:

In [Cortez and Morais, 2007], the output 'area' was first transformed with a ln(x+1) function. Then, several data mining methods were applied. After fitting the models, the outputs were post-processed with the inverse of the ln(x+1) transform. Four different input setups were used. The experiments were conducted using 30 runs of 10-fold cross-validation. Two regression metrics were measured: MAD and RMSE. A Gaussian support vector machine (SVM) fed with only 4 direct weather conditions (temp, RH, wind and rain) obtained the best MAD value: 12.71 +- 0.01 (mean and 95% confidence interval using a Student's t-distribution). The best RMSE was attained by the naive mean predictor. An analysis of the regression error curve (REC) shows that the SVM model predicts more examples within a lower admitted error. In effect, the SVM model is better at predicting small fires, which are the majority.
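
The ln(x+1) transform, its inverse, and the two metrics named above can be sketched with the Python standard library alone. This is only an illustration: the function names are hypothetical, MAD is taken here to mean the mean absolute deviation between targets and predictions, and the area values are made up rather than taken from the dataset.

```python
import math

def to_log_area(area_ha):
    """Apply the ln(x + 1) transform to a burned area in hectares."""
    return math.log1p(area_ha)

def from_log_area(log_area):
    """Invert the ln(x + 1) transform to recover hectares."""
    return math.expm1(log_area)

def mad(y_true, y_pred):
    """Mean absolute deviation between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Made-up burned areas (ha): mostly zero or small, mimicking the skewed target.
areas = [0.0, 0.0, 0.5, 2.0, 1090.84]
transformed = [to_log_area(a) for a in areas]
restored = [from_log_area(t) for t in transformed]
```

Because RMSE squares each error, a single very large fire dominates it far more than it dominates MAD, which is one plausible reading of why the two metrics can favour different models on a skewed target.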

Attribute Information:

For more information, read [Cortez and Morais, 2007].

X - x-axis spatial coordinate within the Montesinho park map: 1 to 9

Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9

month - month of the year: 'jan' to 'dec'

day - day of the week: 'mon' to 'sun'

FFMC - FFMC index from the FWI system: 18.7 to 96.20

DMC - DMC index from the FWI system: 1.1 to 291.3

DC - DC index from the FWI system: 7.9 to 860.6

ISI - ISI index from the FWI system: 0.0 to 56.10

temp - temperature in Celsius degrees: 2.2 to 33.30

RH - relative humidity in %: 15.0 to 100

wind - wind speed in km/h: 0.40 to 9.40

rain - outside rain in mm/m2: 0.0 to 6.4

area - the burned area of the forest (in ha): 0.00 to 1090.84 (this output variable is very skewed towards 0.0, thus it may make sense to model with the logarithm transform).

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('forest_fires', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A...
data.niaid.nih.gov
zenodo.org
Updated Feb 4, 2024
Cite
Yin Ranyu (2024). Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A Spatial-Temporal Approach and Dataset" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8419699
Dataset provided by
He Guojin
Gong Chengjuan
Jiao Weili
Yin Ranyu
Wang Guizhou
Long Tengfei
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset is built for time-series Sentinel-2 cloud detection and is stored in TensorFlow TFRecord format (refer to https://www.tensorflow.org/tutorials/load_data/tfrecord).

Each file is compressed in 7z format and can be decompressed using Bandizip or 7-Zip.

Dataset Structure:

Each filename can be split into three parts using underscores. The first part indicates whether it is designated for training or validation ('train' or 'val'); the second part indicates the Sentinel-2 tile name, and the last part indicates the number of samples in this file.
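
The naming scheme above can be unpacked with a plain string split. The sketch below is an assumption-laden illustration: the helper name and the example file name are hypothetical, and it assumes that the tile name contains no underscore and that no file extension follows the sample count.

```python
def parse_sample_filename(name):
    """Split '<split>_<tile>_<count>' into its three parts."""
    split, tile, count = name.split("_")
    if split not in ("train", "val"):
        raise ValueError(f"unexpected split prefix: {split!r}")
    return split, tile, int(count)

# Hypothetical file name; real names in the archive may differ.
print(parse_sample_filename("train_T50TMK_120"))
```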

Each sample includes:

Sample ID;

Array of time series 4 band image patches in 10m resolution, shaped as (n_timestamps, 4, 42, 42);

Label list indicating cloud cover status for the center 6 × 6 pixels of each timestamp;

Ordinal list for each timestamp;

Sample weight list (reserved);
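
To make the relationship between the 42 × 42 patch and the labelled 6 × 6 center concrete, the index range of the center window can be computed as below. This is a sketch that assumes the window is exactly centered in the patch; the constant and function names are hypothetical.

```python
PATCH_SIZE = 42  # patch height and width in pixels, from the description
LABEL_SIZE = 6   # the labelled center window is 6 x 6 pixels

def center_window(patch_size=PATCH_SIZE, label_size=LABEL_SIZE):
    """Return (start, stop) indices of the centered window along one axis."""
    start = (patch_size - label_size) // 2
    return start, start + label_size

start, stop = center_window()
# For an array shaped (n_timestamps, 4, 42, 42), the labelled region would
# then be the slice [:, :, start:stop, start:stop].
```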

Here is a demonstration function for parsing the TFRecord file:

import tensorflow as tf
# Assumption: decoder.Decoder is the base class from the TensorFlow Model Garden
# (tf-models-official); the original snippet does not show this import.
from official.vision.dataloaders import decoder

# init Tensorflow Dataset from file name
def parseRecordDirect(fname):
    sep = '/'
    parts = tf.strings.split(fname, sep)
    # tile name: second-to-last underscore-separated part of the file name
    tn = tf.strings.split(parts[-1], sep='_')[-2]
    # number of samples: last underscore-separated part of the file name
    nn = tf.strings.to_number(tf.strings.split(parts[-1], sep='_')[-1], tf.dtypes.int64)
    # repeat the tile name once per sample so it can be zipped with the records
    t = tf.data.Dataset.from_tensors(tn).repeat().take(nn)
    t1 = tf.data.TFRecordDataset(fname)
    ds = tf.data.Dataset.zip((t, t1))
    return ds

keys_to_features_direct = {
    'localid': tf.io.FixedLenFeature([], tf.int64, -1),
    'image_raw_ldseries': tf.io.FixedLenFeature((), tf.string, ''),
    'labels': tf.io.FixedLenFeature((), tf.string, ''),
    'dates': tf.io.FixedLenFeature((), tf.string, ''),
    'weights': tf.io.FixedLenFeature((), tf.string, '')
}

# The Decoder (Optional)
class SeriesClassificationDirectDecorder(decoder.Decoder):
    """A tf.Example decoder for tfds classification datasets."""
    def __init__(self) -> None:
        super().__init__()

    def decode(self, tid, ds):
        parsed = tf.io.parse_single_example(ds, keys_to_features_direct)
        encoded = parsed['image_raw_ldseries']
        labels_encoded = parsed['labels']
        decoded = tf.io.decode_raw(encoded, tf.uint16)
        label = tf.io.decode_raw(labels_encoded, tf.int8)
        dates = tf.io.decode_raw(parsed['dates'], tf.int64)
        weight = tf.io.decode_raw(parsed['weights'], tf.float32)
        decoded = tf.reshape(decoded, [-1, 4, 42, 42])
        sample_dict = {
            'tid': tid,                    # tile ID
            'dates': dates,                # Date list
            'localid': parsed['localid'],  # sample ID
            'imgs': decoded,               # image array
            'labels': label,               # label list
            'weights': weight
        }
        return sample_dict

# simple function
def preprocessDirect(tid, record):
    parsed = tf.io.parse_single_example(record, keys_to_features_direct)
    encoded = parsed['image_raw_ldseries']
    labels_encoded = parsed['labels']
    decoded = tf.io.decode_raw(encoded, tf.uint16)
    label = tf.io.decode_raw(labels_encoded, tf.int8)
    dates = tf.io.decode_raw(parsed['dates'], tf.int64)
    weight = tf.io.decode_raw(parsed['weights'], tf.float32)
    decoded = tf.reshape(decoded, [-1, 4, 42, 42])
    return tid, dates, parsed['localid'], decoded, label, weight

t1 = parseRecordDirect('filename here')
dataset = t1.map(preprocessDirect, num_parallel_calls=tf.data.experimental.AUTOTUNE)

Class Definition:

0: clear

1: opaque cloud

2: thin cloud

3: haze

4: cloud shadow

5: snow
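
When inspecting decoded labels it can help to keep the class table above as a lookup. The snippet below is a small illustrative sketch; the dictionary and helper names are hypothetical, and the label values passed in are made up.

```python
# Class IDs and names copied from the class definition above.
CLASS_NAMES = {
    0: "clear",
    1: "opaque cloud",
    2: "thin cloud",
    3: "haze",
    4: "cloud shadow",
    5: "snow",
}

def label_histogram(labels):
    """Count how often each class name occurs in a list of integer labels."""
    counts = {name: 0 for name in CLASS_NAMES.values()}
    for label in labels:
        counts[CLASS_NAMES[label]] += 1
    return counts

# Made-up labels for one sample's time series.
print(label_histogram([0, 0, 1, 4, 0, 2]))
```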

Dataset Construction:

First, we randomly generate 500 points for each tile, and all these points are aligned to the pixel grid centers of the 60 m resolution subdatasets (e.g., B10) for consistency when comparing with other products. This is because other cloud detection methods may use the cirrus band, which has 60 m resolution, as a feature.
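
Aligning a point to a pixel grid center can be sketched as follows. This is not the authors' code: the function name and the coordinates are hypothetical, and it assumes a north-up raster whose upper-left corner is known in projected coordinates, with y decreasing downwards.

```python
import math

def snap_to_pixel_center(x, y, origin_x, origin_y, pixel_size=60.0):
    """Snap a projected (x, y) point to the center of the pixel containing it.

    origin_x and origin_y are the upper-left corner of the raster.
    """
    col = math.floor((x - origin_x) / pixel_size)
    row = math.floor((origin_y - y) / pixel_size)
    center_x = origin_x + (col + 0.5) * pixel_size
    center_y = origin_y - (row + 0.5) * pixel_size
    return center_x, center_y

# Hypothetical tile origin and random point, in metres.
print(snap_to_pixel_center(600125.0, 4499890.0, 600000.0, 4500000.0))
```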

Then, time series image patches of two shapes are cropped with each point as the center. Patches of shape 42 × 42 are cropped from the 10 m resolution bands (B2, B3, B4, B8) and are used to construct this dataset. Patches of shape 348 × 348 are cropped from the True Colour Image (TCI; for details, see the Sentinel-2 user guide) file and are used to interpret class labels.

Samples with a large number of timestamps could be time-consuming in the IO stage, so the time series patches are divided into groups of no more than 100 timestamps each.
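
One simple way to keep every group at 100 timestamps or fewer is to cut the series into consecutive chunks. The description does not say how the authors formed their groups, so the sketch below is only one possible scheme, with a hypothetical function name.

```python
def split_into_groups(timestamps, max_len=100):
    """Split a time-ordered list into consecutive groups of at most max_len."""
    return [timestamps[i:i + max_len] for i in range(0, len(timestamps), max_len)]

# A made-up series of 250 timestamps yields groups of 100, 100 and 50.
groups = split_into_groups(list(range(250)))
print([len(g) for g in groups])
```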
math_qa
tensorflow.org
Updated Dec 14, 2022
Cite
(2022). math_qa [Dataset]. https://www.tensorflow.org/datasets/catalog/math_qa
Description
A large-scale dataset of math word problems and an interpretable neural math problem solver that learns to map problems to operation programs.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('math_qa', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Programs and Code for Geothermal Exploration Artificial Intelligence
data.amerigeoss.org
gdr.openei.org
md, py, sh, zip
Updated Jun 9, 2021
Cite
United States (2021). Programs and Code for Geothermal Exploration Artificial Intelligence [Dataset]. https://data.amerigeoss.org/dataset/programs-and-code-for-geothermal-exploration-artificial-intelligence-fac4c
Dataset provided by
United States
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The scripts below are used to run the Geothermal Exploration Artificial Intelligence developed within the "Detection of Potential Geothermal Exploration Sites from Hyperspectral Images via Deep Learning" project. It includes all scripts for pre-processing and processing, including:
- Land Surface Temperature K-Means classifier
- Labeling AI using Self Organizing Maps (SOM)
- Post-processing for Permanent Scatterer InSAR (PSInSAR) analysis with SOM
- Mineral marker summarizing
- Artificial Intelligence (AI) Data splitting: creates data set from a single raster file
- Artificial Intelligence Model: creates AI from a single data set, after splitting in Train, Validation and Test subsets
- AI Mapper: creates a classification map based on a raster file
Data from: Automatic extraction of road intersection points from USGS...
figshare.com
zip
Updated Nov 11, 2019
Cite
Mahmoud Saeedimoghaddam; Tomasz Stepinski (2019). Automatic extraction of road intersection points from USGS historical map series using deep convolutional neural networks [Dataset]. http://doi.org/10.6084/m9.figshare.10282085.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.10282085.v1
Dataset provided by
Figshare (http://figshare.com/)
Authors
Mahmoud Saeedimoghaddam; Tomasz Stepinski
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Tagged image tiles as well as the Faster-RCNN framework for automatic extraction of road intersection points from USGS historical maps of the United States of America. The data and code have been prepared for the paper entitled "Automatic extraction of road intersection points from USGS historical map series using deep convolutional neural networks" submitted to "International Journal of Geographic Information Science". The image tiles have been tagged manually. The Faster RCNN framework (see https://arxiv.org/abs/1611.10012) was taken from: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
mobilenet-v1-ssd300.tensorflow (8bit symmetrically quantized and fine-tuned)...
explore.openaire.eu
Updated Sep 6, 2019
Cite
Itay Hubara (2019). mobilenet-v1-ssd300.tensorflow (8bit symmetrically quantized and fine-tuned) [Dataset]. http://doi.org/10.5281/zenodo.3401713
Unique identifier
https://doi.org/10.5281/zenodo.3401713
Authors
Itay Hubara
Description
Application: Single-stage Object Detection
Base model: MobileNet-v1
Framework: tensorflow1.1
Training Information: weights were fine-tuned using TF fake quantization nodes
Quality: The COCO mAP (IoU=0.50:0.95) on 5000 validation images is 23.4%
Precision: 8-bit precision
Is Quantized: Yes, using fake quantization with symmetric=True, i.e., weights appear in float32 but have only 256 unique values and no zero point.
Dataset: COCO val-2017
Additional information in the README file.
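
The idea of weights that "appear in float32 but have only 256 unique values and no zero point" can be illustrated with a generic symmetric quantize-dequantize step. The description does not give the exact scheme used for these weights, so this is a sketch under assumptions (one scale per tensor, integer levels clipped to [-128, 127]) and not TensorFlow's fake-quantization op; the function name is hypothetical.

```python
def fake_quantize_symmetric(weights, num_bits=8):
    """Quantize then dequantize weights on a symmetric grid with no zero point."""
    qmin = -(2 ** (num_bits - 1))   # -128 for 8 bits
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax          # float step between adjacent integer levels
    levels = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    # The result is float again, but restricted to at most 2**num_bits values.
    return [q * scale for q in levels]

# Made-up weights spread over [-0.5, 0.5).
weights = [i / 1000 - 0.5 for i in range(1000)]
quantized = fake_quantize_symmetric(weights)
print(len(set(quantized)))
```

Because the grid is symmetric around zero, a weight of exactly 0.0 maps back to 0.0 without any offset, which is what "no zero point" refers to.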
cityscapes
tensorflow.org
Updated Dec 6, 2022
Cite
(2022). cityscapes [Dataset]. https://www.tensorflow.org/datasets/catalog/cityscapes
Description
Cityscapes is a dataset consisting of diverse urban street scenes across 50 different cities at varying times of the year as well as ground truths for several vision tasks including semantic segmentation, instance level segmentation (TODO), and stereo pair disparity inference.

For segmentation tasks (default split, accessible via 'cityscapes/semantic_segmentation'), Cityscapes provides dense pixel level annotations for 5000 images at 1024 * 2048 resolution pre-split into training (2975), validation (500) and test (1525) sets. Label annotations for segmentation tasks span across 30+ classes commonly encountered during driving scene perception. Detailed label information may be found here: https://github.com/mcordts/cityscapesScripts/blob/master/cityscapesscripts/helpers/labels.py#L52-L99

Cityscapes also provides coarse grain segmentation annotations (accessible via 'cityscapes/semantic_segmentation_extra') for 19998 images in a 'train_extra' split which may prove useful for pretraining / data-heavy models.

Besides segmentation, cityscapes also provides stereo image pairs and ground truths for disparity inference tasks on both the normal and extra splits (accessible via 'cityscapes/stereo_disparity' and 'cityscapes/stereo_disparity_extra' respectively).

Ignored examples:

For 'cityscapes/stereo_disparity_extra':

troisdorf_000000_000073_{*} images (no disparity map present)

WARNING: this dataset requires users to set up a login and password in order to get the files.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('cityscapes', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
uc_merced
tensorflow.org
huggingface.co
Updated Dec 6, 2022
Cite
(2022). uc_merced [Dataset]. https://www.tensorflow.org/datasets/catalog/uc_merced
Area covered
Merced
Description
UC Merced is a 21 class land use remote sensing image dataset, with 100 images per class. The images were manually extracted from large images from the USGS National Map Urban Area Imagery collection for various urban areas around the country. The pixel resolution of this public domain imagery is 0.3 m.

While most images are 256x256 pixels, there are 44 images with a different shape.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('uc_merced', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

tiny_shakespeare
tensorflow.org
huggingface.co
Updated Feb 11, 2023
Cite
(2023). tiny_shakespeare [Dataset]. https://www.tensorflow.org/datasets/catalog/tiny_shakespeare
Description
40,000 lines of Shakespeare from a variety of Shakespeare's plays. Featured in Andrej Karpathy's blog post 'The Unreasonable Effectiveness of Recurrent Neural Networks': http://karpathy.github.io/2015/05/21/rnn-effectiveness/.

To use for e.g. character modelling:

import tensorflow as tf
import tensorflow_datasets as tfds

d = tfds.load(name='tiny_shakespeare')['train']
d = d.map(lambda x: tf.strings.unicode_split(x['text'], 'UTF-8'))
# train split includes vocabulary for other splits
vocabulary = sorted(set(next(iter(d)).numpy()))
d = d.map(lambda x: {'cur_char': x[:-1], 'next_char': x[1:]})
d = d.unbatch()
seq_len = 100
batch_size = 2
d = d.batch(seq_len)
d = d.batch(batch_size)
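
The shift-by-one pairing used for character modelling is easier to see on a short string. The following pure-Python sketch mirrors the cur_char / next_char idea without TensorFlow; the helper name and the sample text are illustrative assumptions, not part of the catalog entry.

```python
def make_char_pairs(text, seq_len):
    """Build (cur_char, next_char) sequences shifted by one character."""
    cur, nxt = text[:-1], text[1:]
    pairs = []
    for i in range(0, len(cur), seq_len):
        pairs.append((cur[i:i + seq_len], nxt[i:i + seq_len]))
    return pairs

sample = "First Citizen"
vocabulary = sorted(set(sample))
pairs = make_char_pairs(sample, seq_len=4)
for cur_seq, next_seq in pairs:
    print(repr(cur_seq), "->", repr(next_seq))
```

Each target sequence is the input sequence moved one character to the right, so a model trained on these pairs learns to predict the next character at every position.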

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('tiny_shakespeare', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
dsprites
tensorflow.org
library.toponeai.link
Updated Jun 1, 2024
Cite
(2024). dsprites [Dataset]. https://www.tensorflow.org/datasets/catalog/dsprites
Description
dSprites is a dataset of 2D shapes procedurally generated from 6 ground truth independent latent factors. These factors are color, shape, scale, rotation, x and y positions of a sprite.

All possible combinations of these latents are present exactly once, generating N = 737280 total images.

Latent factor values

Color: white

Shape: square, ellipse, heart

Scale: 6 values linearly spaced in [0.5, 1]

Orientation: 40 values in [0, 2 pi]

Position X: 32 values in [0, 1]

Position Y: 32 values in [0, 1]

We varied one latent at a time (starting from Position Y, then Position X, etc), and sequentially stored the images in fixed order. Hence the order along the first dimension is fixed and allows you to map back to the value of the latents corresponding to that image.
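
Because the images are stored in a fixed order, a flat image index can be converted to per-factor indices with repeated division. The sketch below assumes that Position Y varies fastest and Color slowest, which is how the description above reads; the names are hypothetical and the factor sizes come from the list of latent factor values.

```python
# Number of values per latent factor, slowest-varying first.
LATENT_NAMES = ("color", "shape", "scale", "orientation", "pos_x", "pos_y")
LATENT_SIZES = (1, 3, 6, 40, 32, 32)

def index_to_latents(index):
    """Convert a flat image index into one class index per latent factor."""
    result = {}
    # Peel off the fastest-varying factor first (pos_y), then pos_x, and so on.
    for name, size in zip(reversed(LATENT_NAMES), reversed(LATENT_SIZES)):
        index, result[name] = divmod(index, size)
    return {name: result[name] for name in LATENT_NAMES}

def latents_to_index(latents):
    """Inverse mapping: per-factor class indices back to the flat image index."""
    index = 0
    for name, size in zip(LATENT_NAMES, LATENT_SIZES):
        index = index * size + latents[name]
    return index

print(index_to_latents(33))
```

The product of the factor sizes, 1 × 3 × 6 × 40 × 32 × 32, equals the 737280 images stated above.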

We chose the latent values deliberately to have the smallest step changes while ensuring that all pixel outputs were different. No noise was added.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('dsprites', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
