This article investigates TensorFlow components for building a toolset to make modeling evaluation more efficient. Specifically, TensorFlow Datasets (TFDS) and TensorBoard (TB) can be quite helpful in this task.
While completing a highly informative AICamp online class taught by Tyler Elliot Bettilyon (TEB) called Deep Learning for Developers, I became interested in creating a more structured way for machine-learning model builders (like me, the student) to understand and evaluate various models and to observe their performance when applied to new datasets. Since the class focused on TensorFlow (TF), I investigated TF components for building a toolset to make this type of modeling evaluation more efficient. In doing so, I learned about two components, TensorFlow Datasets (TFDS) and TensorBoard (TB), that can be quite helpful, and this blog post discusses their application to the task. See the References section for links to AICamp, TEB and other useful resources.
While the term ‘pipeline’ can mean several things in a data-science context, I use it here to mean a modeling pipeline: a set of programmatic components that completes end-to-end modeling automatically, from loading data, through applying a pre-determined model, to logging performance results. The goal is to set up a number of modeling tests and to run the pipeline automatically for each one. Once the models are trained, each test result can easily be compared to the others. In short, the objective is an efficient, organized and methodical mechanism for model testing.
Figure 1: The logical flow of the modeling pipeline
This approach is depicted in Figure 1. The pipeline consists of three steps:

1. Load a dataset with TFDS.
2. Train a pre-determined model on that dataset.
3. Log performance results with TB so the tests can be compared.
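One pipeline run can be sketched as a single function. This is only a sketch, assuming a compiled Keras classifier and `(image, label)` datasets; the `run_test` name, `prepare` helper and log-directory layout are my own conventions, not part of any library:

```python
import tensorflow as tf

def prepare(ds, batch_size=32):
    """Batch and prefetch a tf.data pipeline for efficient input."""
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

def run_test(name, model, ds_train, ds_test, log_base="logs", epochs=5):
    """Run one pipeline test: train a compiled Keras model and log its
    metrics under a per-test directory so TensorBoard can compare runs."""
    tb = tf.keras.callbacks.TensorBoard(log_dir=f"{log_base}/{name}")
    return model.fit(prepare(ds_train),
                     validation_data=prepare(ds_test),
                     epochs=epochs, callbacks=[tb], verbose=0)
```

Giving each test its own subdirectory under `log_base` is what lets TensorBoard overlay the runs for side-by-side comparison.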
Any analyst who has studied or even dabbled with deep learning neural networks has probably experienced the seemingly boundless array of modeling choices. Any number of layer types, each with a multitude of configuration options, can be interconnected, and once stacked, the model can be trained with any of several optimization routines and numerous hyperparameters. And then there is the question of data, since it may be desirable to apply promising models to new datasets to observe their performance on unseen data or to gain a foundation for further model iterations.
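One way to keep that configuration space manageable is to treat architecture choices as parameters of a builder function, so each test varies arguments rather than copy-pasted model code. A minimal sketch, assuming a dense classifier; the `build_classifier` name and its defaults are illustrative choices of mine:

```python
import tensorflow as tf

def build_classifier(input_shape, num_classes, hidden=(128, 64),
                     activation="relu", optimizer="adam"):
    """Build a simple dense image classifier whose layer widths,
    activation and optimizer are test parameters, not hard-coded."""
    layers = [tf.keras.Input(shape=input_shape),
              tf.keras.layers.Flatten()]
    for units in hidden:
        layers.append(tf.keras.layers.Dense(units, activation=activation))
    layers.append(tf.keras.layers.Dense(num_classes))  # logits output
    model = tf.keras.Sequential(layers)
    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"])
    return model
```

A test sweep then reduces to calling the builder with different `hidden` tuples or optimizers and handing each result to the pipeline.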
For this application, I worked exclusively with image-classification data and models. TFDS includes audio, image, object-detection, structured, summarization, text, translate and video datasets, and deep-learning models can be constructed specifically for each of these problem types. While the out-of-the-box code presented here will require some modification and testing before it can be applied to other dataset types, its foundational framework will still be helpful.