Practical Deep Learning: Image Search Engine

Artificial intelligence is one of the fastest growing fields of computer science today and the demand for excellent AI Engineers is increasing day in and day out. This course will help you stay competitive in the AI job market by teaching you how to create a Deep Learning End-to-End product on your own.

Most courses focus on the basics of Deep Learning and on the fundamentals of different models. In this course, however, you will learn how to write a whole End-to-End pipeline, from data pre-processing, through choosing the right hyper-parameters, to showing your users results in a browser.

The case that we will tackle in this course is an engine for Image to Image Search.

Why should you take this course?

This course is not focused on teaching you Neural Networks (ANNs, CNNs, RNNs…), but teaching you how to apply them in real world cases.

If you haven’t worked on a product that uses Deep Learning before, this is the perfect course for you! Throughout the course we will work together on the Image to Image Search engine, starting from ground zero - image pre-processing, creating a model, training it, then testing. After that we will create a simple web application and use it to serve our model in production.
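The retrieval step at the heart of an image-to-image search engine can be sketched in a few lines. This is only a minimal sketch, assuming every image has already been turned into a feature vector by a trained network; the random vectors, the `top_k_similar` helper, and the choice of cosine similarity are illustrative assumptions, not the course's actual implementation.

```python
import numpy as np

def top_k_similar(query_vec, index_vecs, k=3):
    """Return the indices and scores of the k index vectors closest to the query."""
    # Normalize so that a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    idx = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = idx @ q
    # Sort by descending similarity and keep the best k.
    order = np.argsort(-scores)[:k]
    return order, scores[order]

rng = np.random.default_rng(0)
# Hypothetical feature vectors for 100 indexed images (stand-ins for CNN features).
index_vecs = rng.normal(size=(100, 64))
# The query is a slightly perturbed copy of image 42, so image 42 should rank first.
query_vec = index_vecs[42] + 0.01 * rng.normal(size=64)

best, best_scores = top_k_similar(query_vec, index_vecs, k=3)
print(best, best_scores)
```

In a real application the feature vectors would come from the trained model, and the web layer would only need to call a function like this one for each uploaded image.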

Another cool thing about this course is that we will use multiple programming languages to create the whole application around the model itself. This will make you not only a better AI Engineer but also get you on the path towards becoming a Full stack AI Engineer.

After taking this course you will be one step closer to landing your dream job as an AI/ML Engineer, with your own AI product/project in your portfolio.

Libraries/Tools used in the course:

The whole Deep Learning back-end of our pipeline will be built using TensorFlow 1.10.0. For some image pre-processing tasks we will use basic functionality from OpenCV, one of the most widely used libraries for image processing tasks!

For the app's back-end (model handling, image uploading, page navigation, etc.) we will use the Flask Python framework.

And for our interactive front-end we are going to use HTML, CSS, JavaScript and the Jinja templating language. So at the end of the course you will have a working full-stack application.

Who is this course for?

As you can see the course is meant to teach you how to create your own Deep Learning product from scratch.

If you are just starting out with Deep Learning, this course might be too hard for you. But if you like challenges, I do recommend following it. Although I will not be explaining the meat of Neural Networks (ANNs, CNNs), I will explain most concepts in great detail, so even if you are a total beginner you should be able to follow with the help of your peers or my help through the comments section.

If you have Deep Learning experience and want to move it to the next level you will find this course very useful! You can consider it as a level up for your skills by putting your already great skills to new use. At the end of the course you will not only have learned how to create a working End-to-End pipeline, but also hold proof of your skills for potential employers!


The conclusion is this: this is a very rare opportunity, not only to learn Deep Learning concepts, but also to learn how to apply that knowledge and create your own web application (as a complete product) from scratch.

I hope to see you in class!


Basic knowledge

  • Python programming
  • Basic conceptual understanding of Convolutional Neural Networks (CNN)
  • (optional) Previous coding experience with TensorFlow

What will you learn

  • What Image-to-Image Search engines are
  • How to build your AI-based Image-to-Image Search engine
  • How to create a simple web-based interface for your Deep Learning models using the Python framework Flask
  • Coding a Convolutional Neural Network (CNN) from scratch in TensorFlow 1.10.0
  • Using the Python framework Flask to serve a Deep Learning model in production
  • How to create an End-to-End pipeline for any Deep Learning model using TensorFlow
  • How to create a Flask application from scratch

Deep Learning for Image Segmentation with U-Net Architecture

U-Net, an architecture built from convolutional neural network layers, is more successful than conventional models at pixel-based image segmentation. In this post, we learn about using the U-Net architecture for image segmentation in deep learning.

Why is segmentation needed, and what does U-Net offer?

Basically, segmentation is a process that partitions an image into regions. It is an image processing approach that allows us to separate objects and textures in images. Segmentation is especially preferred in applications such as remote sensing or tumor detection in biomedicine.

There are many traditional ways of doing this, for example point, line, and edge detection methods, thresholding, region-based methods, pixel-based clustering, morphological approaches, etc. Various methods have also been developed for segmentation with convolutional neural networks (a common deep learning architecture), which have become indispensable in tackling more advanced challenges with image segmentation. In this post, we’ll take a closer look at one such architecture: U-Net.

In Deep learning, it’s known that we need large datasets for model training. But there are some problems we run into at this point! We often cannot afford the amount of data that needs to be collected for an object classification problem. In this context, affordability means time, money, and most importantly, hardware.

For example, it isn’t possible to collect many biomedical images with the camera on your mobile phone. So we need more systematic ways to collect data. There’s also the data labeling process, for which a single developer/engineer will not suffice—this will require a lot of expertise and experience in classifying the relevant images. This is especially true with highly-specialized areas such as medical diagnostics.

Another critical point is that classical convolutional neural networks learn about the image as a whole through class labels. However, some problems require knowledge of localization/positioning with pixel-based approaches. In areas that require sensitive approaches, such as biomedical or defense, we need class information for each pixel.

U-Net, an architecture built from convolutional neural network layers, is more successful than conventional models at pixel-based image segmentation. It’s even effective with limited dataset images. The architecture was first presented for the analysis of biomedical images.

Differences that make U-Net special!

As is commonly known, the pooling layers of a convolutional neural network reduce the height and width of their input. In U-Net, the second half of the model applies the opposite operation, a dimension increase (upsampling).

The pooling layer reduces height and width information while keeping the number of channels of the input matrix constant. It is a step used to reduce complexity (each element of the image matrix is called a pixel). In summary, the pooling layer replaces each group of pixels with a single representative pixel.

Note: Pooling layers can work with different approaches, including maximum, average, or median pooling.
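The pooling step described above can be illustrated with plain NumPy. This is a minimal sketch of 2x2 max pooling, assuming an array laid out as (height, width, channels); the `max_pool_2x2` helper is a name chosen here for illustration, not part of any framework.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an array shaped (height, width, channels)."""
    h, w, c = x.shape
    # Drop a trailing row or column if the height or width is odd.
    x = x[: h - h % 2, : w - w % 2, :]
    # Group pixels into non-overlapping 2x2 blocks, then keep the maximum of each block.
    blocks = x.reshape(h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(1, 3))

image = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
pooled = max_pool_2x2(image)
# Height and width are halved; the number of channels stays the same.
print(image.shape, "->", pooled.shape)
```

Replacing `max` with `mean` in the last line of the helper would give average pooling instead.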

The upsampling layers in the second half are intended to increase the resolution of the output. For localization, the upsampled output is combined with high-resolution features from earlier in the model. A subsequent convolution layer then aims to produce a more precise output based on this information.

U-Net takes its name from the architecture, which, when visualized, appears similar to the letter U, as shown in the figure above. An input image is turned into a segmented output map. The most distinctive aspect of the architecture is its second half. The network does not have a fully-connected layer; only convolution layers are used. Each standard convolution is followed by a ReLU activation function.

U-Net consists of a contracting path (left side) and an expansive path (right side)!

Representation of a convolution and deconvolution process in U-Net

Pixels from the border region are mirrored around the image so that images can be segmented continuously. With this strategy, the image is segmented completely. The padding (pixel adding) method is important for applying the U-Net model to large images; otherwise, the resolution will be limited by the capacity of the GPU memory. The result of padding and segmenting with this mirroring is shown in the figure below.
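Mirror padding of this kind is easy to try out with NumPy. The snippet below is a small sketch, assuming `np.pad` with `mode="reflect"` is an acceptable stand-in for the mirroring described here; the tile values and the padding width of 2 are arbitrary.

```python
import numpy as np

tile = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Extend the tile by 2 pixels on every side, filling the new border
# with a mirror image of the pixels next to the edge.
padded = np.pad(tile, pad_width=2, mode="reflect")
print(padded)
```

The original tile sits untouched in the centre of `padded`, and the border gives the network context for pixels near the edge.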

Overlap-tile strategy

The difference between U-Net and the autoencoder architecture

To help highlight what makes U-Net unique, it might be helpful to quickly compare it to a different traditional approach to image segmentation: the autoencoder architecture.

In a classical autoencoder architecture, the size of the input information is reduced step by step through the successive layers.

At this point, the encoder part of the architecture is completed and the decoder part begins. Linear feature representation is learned in this section, and the size gradually increases. At the end of the architecture, the output size is equal to the input size.

This architecture is ideal in preserving the output size, but one problem is that it compresses the input linearly, which results in a bottleneck in which all features cannot be transmitted.

Autoencoders Model

This is where U-Net differs. U-Net performs deconvolution on the decoder side (i.e. in the second half) and, in addition, overcomes the bottleneck problem and the loss of features it causes through connections from the encoder side of the architecture.
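The effect of such a connection can be shown on array shapes alone. This is a rough sketch, not the real network: nearest-neighbour upsampling stands in for the learned deconvolution, and the feature map sizes and the `upsample_2x` helper are made up for illustration.

```python
import numpy as np

def upsample_2x(x):
    """Nearest-neighbour 2x upsampling of an array shaped (height, width, channels)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Hypothetical feature maps: high-resolution features saved on the encoder side,
# and low-resolution features coming out of the bottleneck.
encoder_features = np.ones((8, 8, 16))
bottleneck = np.zeros((4, 4, 32))

upsampled = upsample_2x(bottleneck)
# The connection from the encoder side: concatenate along the channel axis,
# so the decoder sees both the upsampled and the high-resolution features.
merged = np.concatenate([encoder_features, upsampled], axis=-1)
print(upsampled.shape, merged.shape)
```

A plain autoencoder would pass only `upsampled` on to the next layer, which is exactly where the fine detail gets lost.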

Let’s continue with U-Net!

Let’s return to our specific use case at hand—biomedical image segmentation. The most common variation in tissue in a biomedical image is deformation, and realistic deformations can be efficiently simulated. In this way, the learning process is more successful with the elastic deformation approach, which helps us increase the size of our dataset.

Representation of Elastic Deformation

In addition, it’s difficult to determine the boundaries when parts of the same class touch each other. For this purpose, it’s recommended to first separate the information to be segmented from the background, and to give the background pixels that separate touching objects a large weight in the loss function.
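The original U-Net paper (Ronneberger et al., 2015) expresses this weighting as a per-pixel weight map. The formula below comes from that paper, not from this post, so the symbols follow the paper's notation:

```latex
w(\mathbf{x}) = w_c(\mathbf{x}) + w_0 \cdot \exp\!\left( -\frac{\bigl( d_1(\mathbf{x}) + d_2(\mathbf{x}) \bigr)^2}{2\sigma^2} \right)
```

Here w_c is a weight map that balances the class frequencies, d_1 is the distance from the pixel to the border of the nearest cell, and d_2 is the distance to the border of the second nearest cell. The paper reports using w_0 = 10 and σ ≈ 5 pixels, so pixels in the narrow gaps between touching cells receive the largest weights.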

HeLa cells recorded by DIC (differential interference contrast) microscopy. a) Raw image. b) Ground truth segmentation; different colors show different instances of HeLa cells. c) Generated segmentation mask (black and white). d) Map with a pixel-wise loss weight that forces the network to learn the border pixels.

Loss Approaches

Loss can be calculated with standard binary cross-entropy and with Dice loss, which is based on the Dice coefficient, a frequently used performance criterion for assessing success in biomedical images.
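Both losses can be written in a few lines of NumPy. This is a minimal sketch, assuming binary masks with predictions given as probabilities; the `eps` and `smooth` constants and the equal-weight sum of the two losses are common choices assumed here, not something this post specifies.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between a 0/1 mask and predicted probabilities."""
    # Clip to avoid log(0).
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)))

def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - Dice coefficient; the smooth term keeps the ratio defined for empty masks."""
    intersection = np.sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)
    return float(1.0 - dice)

y_true = np.array([[1, 1, 0, 0]], dtype=float)
y_pred = np.array([[0.9, 0.6, 0.2, 0.1]])

bce = binary_cross_entropy(y_true, y_pred)
dl = dice_loss(y_true, y_pred)
combined = bce + dl  # assumed equal weighting of the two terms
print(bce, dl, combined)
```

Cross-entropy scores every pixel independently, while Dice looks at the overlap as a whole, which is why the two are often combined when the foreground is small.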

Loss: Binary cross-entropy and Dice

Intersection over Union (IoU)

IoU is a pixel-based criterion that is often used when evaluating segmentation performance.

It considers the ratio between the pixels that the target matrix and the resulting matrix have in common and the pixels covered by either of them. This metric is also closely related to the Dice calculation.
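A short NumPy sketch makes the metric and its link to Dice concrete. The two 3x3 masks and the helper names `iou` and `dice` are made up for illustration; returning 1.0 for two empty masks is a convention assumed here.

```python
import numpy as np

def iou(target, prediction):
    """Intersection over Union of two binary masks."""
    target, prediction = target.astype(bool), prediction.astype(bool)
    intersection = np.logical_and(target, prediction).sum()
    union = np.logical_or(target, prediction).sum()
    return float(intersection / union) if union > 0 else 1.0

def dice(target, prediction):
    """Dice coefficient of two binary masks."""
    target, prediction = target.astype(bool), prediction.astype(bool)
    intersection = np.logical_and(target, prediction).sum()
    total = target.sum() + prediction.sum()
    return float(2.0 * intersection / total) if total > 0 else 1.0

target = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [0, 0, 0]])
prediction = np.array([[0, 1, 1],
                       [0, 1, 1],
                       [0, 0, 0]])

iou_value = iou(target, prediction)    # 2 shared pixels out of 6 covered pixels
dice_value = dice(target, prediction)  # 2 * 2 shared pixels out of 4 + 4 mask pixels
print(iou_value, dice_value)
```

For binary masks the two are tied together by Dice = 2 · IoU / (1 + IoU), so they always rank predictions in the same order.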

Visualization of IoU expression

Input and image labeled by input

Here’s a look at how U-Net performs on EM image segmentation, as compared to other approaches:

Here is the U-Net

Results from PhC-U373 and DIC-HeLa datasets and comparison with previous studies:

U-net’s segmentation success on PhC-U373 (a-b) and DIC-HeLa (c-d) datasets

Of course, segmentation isn’t only used for medical images; earth sciences or remote sensing systems from satellite imagery also use segmentation, as do autonomous vehicle systems. After all, there are patterns everywhere.

TGS Salt Identification Challenge

There are large deposits of oil and gas and large deposits of salt beneath the surface in various areas of the Earth. Unfortunately, it’s very difficult to know where the large salt deposits are.

Professional seismic imaging requires expert interpretation of salt bodies. This leads to very subjective, variable predictions. To generate the most accurate seismic images and 3D imaging, TGS (a geology data company) hopes that Kaggle’s machine learning community can create an algorithm that automatically and accurately determines whether an underground target is salt.

Here are some examples of successful u-net approaches:

Salt Identification Challenge

Mapping Challenge — Building Missing Maps with Segmentation

The determination of map regions by using satellite imagery is another u-net application area. In fact, it can be said that the applications that will emerge with the development of this field will greatly facilitate the work of mapping and environmental engineers.

We can use this method not only for defense industry applications but also for urban district planning applications. For example, in the competition for the detection of buildings, a mean accuracy of 0.943 and a mean sensitivity of 0.954 were reached. You can see the u-net model of this successful study here.

Result of the Mapping Challenge — Neptune.ML

U-net’s inspiration for other deep learning approaches

U-net inspired the combination of different architectures as well as other computer vision deep learning models.

For example, the ResNet of ResNet (RoR) concept is one of them. The structure, which can be defined as the second half of the u-net architecture, is applied to the skip connections in classical residual networks.

Original ResNet (left) — RoR approach (right)

As can be seen from the classic ResNet model architecture, each blue block has a skip connection. In the RoR approach, new connections are added from the input to the output via the previous connections. There are different versions of RoR as in ResNet. Take a look at the various references at the end of this post if you want to examine the details.

RoR-3: RoR applied to the original ResNet, with m = 3

Pre-RoR-3: RoR applied to the pre-activation ResNet, with m = 3

RoR-3-WRN: RoR applied to WRN, with m = 3


Segmenting images can be a challenging problem, especially when lacking enough high- and low-resolution data. It’s an area where new approaches can be developed by evaluating different, current, and old approaches.

Remember, biomedical imaging isn’t the only use case!

Other areas of application for segmentation include geology, geophysics, environmental engineering, mapping, and remote sensing, including various autonomous tools.

Deep Learning vs. Conventional Machine Learning

Over the past few years, deep learning has given rise to a massive collection of ideas and techniques which are disruptive to conventional machine learning practices. However, are those ideas totally different from the traditional methods? Where are the connections and differences? What are the advantages and disadvantages? How practical are the deep learning methods for business applications? Chao will share her thoughts on those questions based on her readings and hands-on experiments in the areas of text analytics (question answering system, sentiment analysis) and healthcare image classification.

The best machine learning and deep learning libraries

Here is why TensorFlow, Spark MLlib, Scikit-learn, PyTorch, MXNet, and Keras shine for building and training machine learning and deep learning models. If you’re starting a new machine learning or deep learning project, you may be confused about which framework to choose...

There is a difference between a machine learning framework and a deep learning framework. Essentially, a machine learning framework covers a variety of learning methods for classification, regression, clustering, anomaly detection, and data preparation, and may or may not include neural network methods.

A deep learning or deep neural network framework covers a variety of neural network topologies with many hidden layers. Keras, MXNet, PyTorch, and TensorFlow are deep learning frameworks. Scikit-learn and Spark MLlib are machine learning frameworks.

In general, deep neural network computations run much faster on a GPU (specifically an Nvidia CUDA general-purpose GPU), TPU, or FPGA, rather than on a CPU. Simpler machine learning methods generally don’t benefit from a GPU.

While you can train deep neural networks on one or more CPUs, the training tends to be slow, and by slow I’m not talking about seconds or minutes. The more neurons and layers that need to be trained, and the more data available for training, the longer it takes. When the Google Brain team trained its language translation models for the new version of Google Translate in 2016, they ran their training sessions for a week at a time, on multiple GPUs. Without GPUs, each model training experiment would have taken months.

Since then, the Intel Math Kernel Library (MKL) has made it possible to train some neural networks on CPUs in a reasonable amount of time. Meanwhile GPUs, TPUs, and FPGAs have gotten even faster.

The training speed of all of the deep learning packages running on the same GPUs is nearly identical. That’s because the training inner loops spend most of their time in the Nvidia cuDNN package.

Apart from training speed, each of the deep learning libraries has its own set of pros and cons, and the same is true of Scikit-learn and Spark MLlib. Let’s dive in.


Keras

Keras is a high-level, front-end specification and implementation for building neural network models that ships with support for three back-end deep learning frameworks: TensorFlow, CNTK, and Theano. Amazon is currently working on developing an MXNet back-end for Keras. It’s also possible to use PlaidML (an independent project) as a back-end for Keras to take advantage of PlaidML’s OpenCL support for all GPUs.

TensorFlow is the default back-end for Keras, and the one recommended for many use cases involving GPU acceleration on Nvidia hardware via CUDA and cuDNN, as well as for TPU acceleration in Google Cloud. TensorFlow also contains an internal tf.keras class, separate from an external Keras installation.

Keras has a high-level environment that makes adding a layer to a neural network as easy as one line of code in its Sequential model, and requires only one function call each for compiling and training a model. Keras lets you work at a lower level if you want, with its Model or functional API.

Keras allows you to drop down even farther, to the Python coding level, by subclassing keras.Model, but prefers the functional API when possible. Keras also has a scikit-learn API, so that you can use the Scikit-learn grid search to perform hyperparameter optimization in Keras models.

Cost: Free open source.

Platform: Linux, MacOS, Windows, or Raspbian; TensorFlow, Theano, or CNTK back-end.


MXNet

MXNet has evolved and improved quite a bit since moving under the Apache Software Foundation umbrella early in 2017. While there has been work on Keras with an MXNet back-end, a different high-level interface has become much more important: Gluon. Prior to the incorporation of Gluon, you could either write easy imperative code or fast symbolic code in MXNet, but not both at once. With Gluon, you can combine the best of both worlds, in a way that competes with both Keras and PyTorch.

The advantages claimed for Gluon include:

  • Simple, easy-to-understand code: Gluon offers a full set of plug-and-play neural network building blocks, including predefined layers, optimizers, and initializers.
  • Flexible, imperative structure: Gluon does not require the neural network model to be rigidly defined, but rather brings the training algorithm and model closer together to provide flexibility in the development process.
  • Dynamic graphs: Gluon enables developers to define neural network models that are dynamic, meaning they can be built on the fly, with any structure, and using any of Python’s native control flow.
  • High performance: Gluon provides all of the above benefits without impacting the training speed that the underlying engine provides.

These four advantages, along with a vastly expanded collection of model examples, bring Gluon/MXNet to rough parity with Keras/TensorFlow and PyTorch for ease of development and training speed. You can see code examples for each of these on the main Gluon page and repeated on the overview page for the Gluon API.

The Gluon API includes functionality for neural network layers, recurrent neural networks, loss functions, dataset methods and vision datasets, a model zoo, and a set of contributed experimental neural network methods. You can freely combine Gluon with standard MXNet and NumPy modules, for example module, autograd, and ndarray, as well as with Python control flows.

Gluon has a good selection of layers for building models, including basic layers (Dense, Dropout, etc.), convolutional layers, pooling layers, and activation layers. Each of these is a one-line call. These can be used, among other places, inside of network containers such as gluon.nn.Sequential().

Cost: Free open source.

Platform: Linux, MacOS, Windows, Docker, Raspbian, and Nvidia Jetson; Python, R, Scala, Julia, Perl, C++, and Clojure (experimental). MXNet is included in the AWS Deep Learning AMI.


PyTorch

PyTorch builds on the old Torch and the new Caffe2 framework. As you might guess from the name, PyTorch uses Python as its scripting language, and it uses an evolved Torch C/CUDA back-end. The production features of Caffe2 are being incorporated into the PyTorch project.

PyTorch is billed as “Tensors and dynamic neural networks in Python with strong GPU acceleration.” What does that mean?

Tensors are a mathematical construct that is used heavily in physics and engineering. A tensor of rank two is a special kind of matrix; taking the inner product of a vector with the tensor yields another vector with a new magnitude and a new direction. TensorFlow takes its name from the way tensors (of synapse weights) flow around its network model. NumPy also uses tensors, but calls them an ndarray.
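The statement about inner products is easy to check with a NumPy ndarray. In this small sketch the 2x2 array is an arbitrary example chosen here (a 90-degree rotation combined with a scaling by 2), not anything specific to PyTorch or TensorFlow.

```python
import numpy as np

# A rank-two tensor stored as a 2x2 ndarray: rotate by 90 degrees and scale by 2.
tensor = np.array([[0.0, -2.0],
                   [2.0,  0.0]])
vector = np.array([1.0, 0.0])

# The inner product yields a new vector with a new magnitude and a new direction.
result = tensor @ vector
print(result, np.linalg.norm(vector), np.linalg.norm(result))
```

The input points along the x axis with length 1; the result points along the y axis with length 2.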

GPU acceleration is a given for most modern deep neural network frameworks. A dynamic neural network is one that can change from iteration to iteration, for example allowing a PyTorch model to add and remove hidden layers during training to improve its accuracy and generality. PyTorch recreates the graph on the fly at each iteration step. In contrast, TensorFlow by default creates a single dataflow graph, optimizes the graph code for performance, and then trains the model.

While eager execution mode is a fairly new option in TensorFlow, it’s the only way PyTorch runs: API calls execute when invoked, rather than being added to a graph to be run later. That might seem like it would be less computationally efficient, but PyTorch was designed to work that way, and it is no slouch when it comes to training or prediction speed.

PyTorch integrates acceleration libraries such as Intel MKL and Nvidia cuDNN and NCCL (Nvidia Collective Communications Library) to maximize speed. Its core CPU and GPU Tensor and neural network back-ends—TH (Torch), THC (Torch CUDA), THNN (Torch Neural Network), and THCUNN (Torch CUDA Neural Network)—are written as independent libraries with a C99 API. At the same time, PyTorch is not a Python binding into a monolithic C++ framework—the intention is for it to be deeply integrated with Python and to allow the use of other Python libraries.

Cost: Free open source.

Platform: Linux, MacOS, Windows; CPUs and Nvidia GPUs.


Scikit-learn

The Scikit-learn Python framework has a wide selection of robust machine learning algorithms, but no deep learning. If you’re a Python fan, Scikit-learn may well be the best option for you among the plain machine learning libraries.

Scikit-learn is a robust and well-proven machine learning library for Python with a wide assortment of well-established algorithms and integrated graphics. It is relatively easy to install, learn, and use, and it has good examples and tutorials.

On the con side, Scikit-learn does not cover deep learning or reinforcement learning, lacks graphical models and sequence prediction, and it can’t really be used from languages other than Python. It doesn’t support PyPy, the Python just-in-time compiler, or GPUs. That said, except for its minor foray into neural networks, it doesn’t really have speed problems. It uses Cython (the Python to C compiler) for functions that need to be fast, such as inner loops.

Scikit-learn has a good selection of algorithms for classification, regression, clustering, dimensionality reduction, model selection, and preprocessing. It has good documentation and examples for all of these, but lacks any kind of guided workflow for accomplishing these tasks.

Scikit-learn earns top marks for ease of development, mostly because the algorithms all work as documented, the APIs are consistent and well-designed, and there are few “impedance mismatches” between data structures. It’s a pleasure to work with a library whose features have been thoroughly fleshed out and whose bugs have been thoroughly flushed out.

On the other hand, the library does not cover deep learning or reinforcement learning, which leaves out the current hard but important problems, such as accurate image classification and reliable real-time language parsing and translation. Clearly, if you’re interested in deep learning, you should look elsewhere.

Nevertheless, there are many problems—ranging from building a prediction function linking different observations, to classifying observations, to learning the structure of an unlabeled dataset—that lend themselves to plain old machine learning without needing dozens of layers of neurons, and for those areas Scikit-learn is very good indeed.

Cost: Free open source.

Platform: Requires Python, NumPy, SciPy, and Matplotlib. Releases are available for MacOS, Linux, and Windows.

Spark MLlib

Spark MLlib, the open source machine learning library for Apache Spark, provides common machine learning algorithms such as classification, regression, clustering, and collaborative filtering (but not deep neural networks). It also includes tools for feature extraction, transformation, dimensionality reduction, and selection; tools for constructing, evaluating, and tuning machine learning pipelines; and utilities for saving and loading algorithms, models, and pipelines, for data handling, and for doing linear algebra and statistics.

Spark MLlib is written in Scala, and uses the linear algebra package Breeze. Breeze depends on netlib-java for optimized numerical processing, although in the open source distribution that means optimized use of the CPU. Databricks offers customized Spark clusters that use GPUs, which can potentially get you another 10x speed improvement for training complex machine learning models with big data.

Spark MLlib implements a truckload of common algorithms and models for classification and regression, to the point where a novice could become confused, but an expert would be likely to find a good choice of model for the data to be analyzed, eventually. To this plethora of models Spark 2.x adds the important feature of hyperparameter tuning, also known as model selection. Hyperparameter tuning allows the analyst to set up a parameter grid, an estimator, and an evaluator, and let the cross-validation method (time-consuming but accurate) or train validation split method (faster but less accurate) find the best model for the data.
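The idea behind hyperparameter tuning with a parameter grid, an estimator, and an evaluator can be shown without any framework. The sketch below is plain NumPy and does not use the Spark MLlib API: ridge regression plays the estimator, mean squared error the evaluator, and the data, the grid values, and the helper names are all made up for illustration.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Estimator: closed-form ridge regression, solving (X^T X + alpha I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cross_val_mse(X, y, alpha, n_folds=5):
    """Evaluator: mean squared error averaged over n_folds validation folds."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    errors = []
    for val_idx in folds:
        # Train on everything except the current validation fold.
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        w = fit_ridge(X[train_mask], y[train_mask], alpha)
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))

# Synthetic regression data, made up for this sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)

# Parameter grid: candidate regularization strengths.
param_grid = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {alpha: cross_val_mse(X, y, alpha) for alpha in param_grid}
best_alpha = min(scores, key=scores.get)
print(scores, best_alpha)
```

A train-validation split corresponds to evaluating each grid point on a single held-out fold instead of all five, which is why it is faster but less accurate.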

Spark MLlib has full APIs for Scala and Java, mostly-full APIs for Python, and sketchy partial APIs for R. You can get a good feel for the coverage by counting the samples: 54 Java and 60 Scala machine learning examples, 52 Python machine learning examples, and only five R examples. In my experience Spark MLlib is easiest to work with using Jupyter notebooks, but you can certainly run it in a console if you tame the verbose Spark status messages.

Spark MLlib supplies pretty much anything you’d want in the way of basic machine learning, feature selection, pipelines, and persistence. It does a pretty good job with classification, regression, clustering, and filtering. Given that it is part of Spark, it has great access to databases, streams, and other data sources. On the other hand, Spark MLlib is not really set up to model and train deep neural networks in the same way as TensorFlow, PyTorch, MXNet, and Keras.

Cost: Free open source.

Platform: Spark runs on both Windows and Unix-like systems (e.g. Linux, MacOS), with Java 7 or later, Python 2.6/3.4 or later, and R 3.1 or later. For the Scala API, Spark 2.0.1 uses Scala 2.11. Spark requires Hadoop/HDFS.


TensorFlow

TensorFlow is probably the gold standard for deep neural network development, although it is not without its defects. Two of the biggest issues with TensorFlow historically were that it was too hard to learn and that it took too much code to create a model. Both issues have been addressed over the last few years.

To make TensorFlow easier to learn, the TensorFlow team has produced more learning materials as well as clarifying the existing “getting started” tutorials. A number of third parties have produced their own tutorial materials (including InfoWorld). There are now multiple TensorFlow books in print, and several online TensorFlow courses. You can even follow the CS20 course at Stanford, TensorFlow for Deep Learning Research, which posts all the slides and lecture notes online.

There are several new sections of the TensorFlow library that offer interfaces that require less programming to create and train models. These include tf.keras, which provides a TensorFlow-only version of the otherwise engine-neutral Keras package, and tf.estimator, which provides a number of high-level facilities for working with models. These include both regressors and classifiers for linear, deep neural networks, and combined linear and deep neural networks, plus a base class from which you can build your own estimators. In addition, the Dataset API enables you to build complex input pipelines from simple, reusable pieces. You don’t have to choose just one. As this tutorial shows, you can usefully make tf.keras and tf.estimator work together.

TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices, which enables on-device machine learning inference (but not training) with low latency and a small binary size. TensorFlow Lite also supports hardware acceleration with the Android Neural Networks API. TensorFlow Lite models are small enough to run on mobile devices, and can serve the offline use case.

The basic idea of TensorFlow Lite is that you train a full-blown TensorFlow model and convert it to the TensorFlow Lite model format. Then you can use the converted file in your mobile application on Android or iOS.

Alternatively, you can use one of the pre-trained TensorFlow Lite models for image classification or smart replies. Smart replies are contextually relevant messages that can be offered as response options; this essentially provides the same reply prediction functionality as found in Google’s Gmail clients.

Yet another option is to retrain an existing TensorFlow model against a new tagged dataset, an important technique called transfer learning, which reduces training times significantly. A hands-on tutorial on this process is called TensorFlow for Poets.

Cost: Free open source.

Platform: Ubuntu 14.04 or later, MacOS 10.11 or later, Windows 7 or later; Nvidia GPU and CUDA recommended. Most clouds now support TensorFlow with Nvidia GPUs. TensorFlow Lite runs trained models on Android and iOS.

Machine learning or deep learning?

Sometimes you know that you’ll need a deep neural network to solve a particular problem effectively, for example to classify images, recognize speech, or translate languages. Other times, you don’t know whether that’s necessary, for example to predict next month’s sales figures or to detect outliers in your data.

If you do need a deep neural network, then Keras, MXNet with Gluon, PyTorch, and TensorFlow with Keras or Estimators are all good choices. If you aren’t sure, then start with Scikit-learn or Spark MLlib and try all the relevant algorithms. If you get satisfactory results from the best model or an ensemble of several models, you can stop.

If you need better results, then try to perform transfer learning on a trained deep neural network. If you still don’t get what you need, then try building and training a deep neural network from scratch. To refine your model, try hyperparameter tuning.

No matter what method you use to train a model, remember that the model is only as good as the data you use for training. Remember to clean it, to standardize it, and to balance the sizes of your training classes.
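Two of these preparation steps, standardizing and balancing, can be sketched in NumPy. This is a minimal sketch on made-up data; random undersampling of the larger class is just one assumed way of balancing, and the helper names are chosen here for illustration.

```python
import numpy as np

def standardize(X):
    """Rescale every feature column to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (X - mean) / std

def undersample_to_balance(X, y, rng):
    """Randomly drop samples so that every class keeps as many rows as the smallest class."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

# Synthetic, imbalanced data: 80 samples of class 0 and 20 of class 1.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
y = np.array([0] * 80 + [1] * 20)

X_std = standardize(X)
X_bal, y_bal = undersample_to_balance(X_std, y, rng)
print(X_bal.shape, np.bincount(y_bal))
```

Undersampling throws data away, so with small datasets it may be preferable to oversample the rare class or to weight the classes in the loss instead.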