A TensorFlow Modeling Pipeline using TensorFlow Datasets and TensorBoard



While completing a highly informative AICamp online class taught by Tyler Elliot Bettilyon (TEB) called Deep Learning for Developers, I got interested in creating a more structured way for machine-learning model builders — like me as the student — to understand and evaluate various models and observe their performance when applied to new datasets. Since this particular class focused on TensorFlow (TF), I started to investigate TF components for building a toolset to make this type of modeling evaluation more efficient. In doing so, I learned about two components, TensorFlow Datasets (TFDS) and TensorBoard (TB), that can be quite helpful and this blog post discusses their application in this task. See the References section for links to AICamp, TEB and other useful resources.

Objective

While the term ‘pipeline’ may have several meanings when used in a data science context, I use it here to mean a modeling pipeline or set of programmatic components that can automatically complete end-to-end modeling from loading data, applying a pre-determined model and logging performance results. The goal is to set up a number of modeling tests and to automatically run the pipeline for each test. Once the models are trained, each test result can be easily compared to the others. In summary, the objective is to establish an efficient, organized and methodical mechanism for model testing.
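The orchestration described above can be sketched in a few lines of plain Python: each "test" pairs a dataset with a model, and running the pipeline loops over the tests and records one result per test. The names and the stub `run_test` function here are hypothetical placeholders, not the article's actual code.

```python
def run_test(dataset_name, model_name):
    # Stand-in for the real pipeline steps: load the dataset,
    # build the named model, train it, and capture key metrics.
    return {"dataset": dataset_name, "model": model_name, "accuracy": None}

# Each tuple is one dataset-model test to run through the pipeline.
tests = [
    ("mnist", "dense_baseline"),
    ("mnist", "small_cnn"),
    ("cifar10", "small_cnn"),
]

# Run every test and collect the results for later comparison.
results = [run_test(d, m) for d, m in tests]
for r in results:
    print(r["dataset"], r["model"])
```

With the stub replaced by real training code, `results` becomes the basis for the methodical comparison step.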

Figure 1: The logical flow of the modeling pipeline

This approach is depicted in Figure 1. The pipeline consists of three steps:

  1. Data: Loading and processing a dataset,
  2. Analysis: Building predefined models and applying them to this dataset,
  3. Results: Capturing key metrics for each dataset-model test for methodical comparison later.
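The three steps above map naturally onto TensorFlow's APIs: `tf.data` for the Data step, Keras for the Analysis step, and the `TensorBoard` callback for the Results step. The sketch below is a minimal, hedged illustration of that wiring; it uses a small synthetic array as a stand-in for a TFDS image dataset so it runs without downloading anything, and the layer sizes are arbitrary.

```python
import tempfile

import numpy as np
import tensorflow as tf

# Step 1 (Data): synthetic 8x8 grayscale "images" standing in for a
# TFDS dataset; in the real pipeline this would come from tfds.load().
x = np.random.rand(64, 8, 8, 1).astype("float32")
y = np.random.randint(0, 2, size=(64,))
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(16)

# Step 2 (Analysis): a tiny predefined classification model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Step 3 (Results): the TensorBoard callback logs metrics for each
# dataset-model test so runs can be compared side by side.
log_dir = tempfile.mkdtemp()
tb_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
history = model.fit(ds, epochs=1, callbacks=[tb_cb], verbose=0)
print(sorted(history.history))
```

Pointing `tensorboard --logdir` at a parent directory of per-test log folders is what makes the later side-by-side comparison of tests possible.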

Any analyst who has studied or even dabbled in deep-learning neural networks has probably experienced the seemingly boundless array of modeling choices. Many layer types, each with a multitude of configuration options, can be interconnected in any number of ways, and once stacked, the model can be trained using multiple optimization routines and numerous hyperparameters. Then there is the question of data: it may be desirable to apply promising models to new datasets, either to observe their performance on unseen data or to provide a foundation for further model iterations.

For this application, I worked exclusively with image-classification data and models. TFDS includes audio, image, object-detection, structured, summarization, text, translation and video datasets, and deep-learning models can be constructed specifically for these problem types. While the out-of-the-box code presented here will require some modification and testing before it can be applied to other dataset types, its foundational framework will still be helpful.
