In February of last year, the TensorFlow team introduced TensorFlow Datasets (TFDS). It gives the machine learning community access to public research datasets as **tf.data.Datasets** and as NumPy arrays. TFDS does all the tedious work of fetching the source data and preparing it in a common format on disk. It uses the **tf.data** API to build high-performance input pipelines, which are TensorFlow 2.0-ready and can be used with tf.keras models.

TensorFlow Datasets provides many public datasets as tf.data.Datasets.

Installation:

pip install tensorflow-datasets

## Snippet:

import tensorflow as tf

import tensorflow_datasets as tfds

mnist_data = tfds.load("mnist")

mnist_train, mnist_test = mnist_data["train"], mnist_data["test"]

assert isinstance(mnist_train, tf.data.Dataset)

In the next section, we take a look at a few important datasets (h/t Lionbridge) that TensorFlow lets you access with a single line of code:

Lsun

tfds.image.Lsun

[LSUN](https://www.tensorflow.org/datasets/catalog/lsun) contains around one million labeled images for each of 10 scene categories and 20 object categories. Its authors report that popular convolutional networks achieve substantial performance gains when trained on this dataset.



Top 10 Ready To Use Datasets For ML on TensorFlow