Spreadsheet Powered AI

TL;DR: A multilayer perceptron that solves XOR problems using only spreadsheet formulas.

How To Use the AI Sheet

Get Your Own Copy

Straight to business: here's a link that automatically makes a copy of the sheet. I've had issues with this method, so I'm also including the manual steps below.

Get Your Own Copy (Manual Method)

In case that link doesn't work, here's the main sheet (not a copy). For obvious reasons, the main version is read-only. To play around with it, select File->Make a Copy.

That will give you your own version of the sheet.

Iteration Issues

Sometimes the iterations setting for sheets won't carry over to copies. To enable this setting, go to File->Spreadsheet settings.

Next, navigate to the Calculation tab and pick settings similar to these (feel free to play with max iterations, but be prepared for potential lag).

Running the Network

To get the network working, no ML knowledge is needed. The sheet works like a for loop: the "Current Iteration" cell D1 increments after each iteration of the network. Before incrementing, it checks the value of the "Target Iterations" cell B5. This is equivalent to the following code.

for (int current_iterations = 0; current_iterations < target_iterations; current_iterations++) {
  // run the network
}

So if you want the network to run, simply increase the value of target iterations. I would recommend doing this in small increments (10 - 50) or the network can get out of whack. If the network does get out of whack, I've found copying cells B69:E78, then deleting and immediately pasting the cells back, to be particularly effective.

I've also provided an "Update" button which simply increases the "Target Iterations" cell value by 10. You might need to enable the script on your copy of the sheet.

"Success" is when the "Current Predictions" G9:G15 match the "target result" E9:E15. At that point it has 100% accuracy with any XOR input (try moving the inputs around).

Explanation/Precursor Knowledge

If You Already Know About Neural Networks, Feel Free to Skip Ahead

Linear Separability

Data is said to be ["linearly separable"](https://en.wikipedia.org/wiki/Linear_separability) if a single straight line can cleanly separate that data's features on a graph. For example, here is a graph of two distinct groups (red and blue) that are linearly separable (green line).

To separate data linearly, you would normally turn to a linear model such as linear regression. But what if we wanted to work with data that can't be separated linearly (shown below)?

Algorithms such as SVM handle this problem by artificially adding dimensions to your data. This is done in the hope that, at these higher dimensions, there exists a single line (or plane) that can separate the data. But there's also another option: neural networks!
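
To make the "adding dimensions" idea concrete, here's a minimal sketch (in Python rather than sheet formulas): the four points of plain two-input XOR can't be split by any single line in two dimensions, but once the product of the two inputs is added as a third feature, a hand-picked plane separates them cleanly. The weights below are illustrative values, not anything taken from the sheet.

# Plain two-input XOR, plus an extra "dimension" (x1 * x2) that makes it
# linearly separable. The separating weights are chosen by hand.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                      # XOR of the two inputs

X3 = np.column_stack([X, X[:, 0] * X[:, 1]])    # add the product feature
w, b = np.array([1.0, 1.0, -2.0]), -0.5         # a separating plane in 3D
print((X3 @ w + b > 0).astype(int))             # prints [0 1 1 0], matching y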

Perceptron

The simplest form of a neural network is the Perceptron (AKA the "artificial neuron"). Perceptrons take in n inputs, and give a single binary output, indicating whether the provided input was below or above a predefined threshold. Because of these principles, perceptrons are often used as binary linear separators.
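
As a rough sketch (not part of the sheet), a perceptron is just a weighted sum of its inputs followed by a threshold; the weights and threshold below are arbitrary illustrative values.

# Minimal perceptron: weighted sum of n inputs, then a binary threshold.
import numpy as np

def perceptron(inputs, weights, threshold=0.5):
    return 1 if np.dot(inputs, weights) > threshold else 0

print(perceptron([1, 0, 0], [0.4, 0.4, 0.4]))  # 0: weighted sum 0.4 <= 0.5
print(perceptron([1, 1, 0], [0.4, 0.4, 0.4]))  # 1: weighted sum 0.8 >  0.5

The decision boundary of such a unit is a straight line (or plane), which is exactly why a single perceptron acts as a linear separator.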

But didn't we just say that we wanted to solve the separation problem for non-linear data? How can a perceptron be of use to us?

Multilayer Perceptron

Although a single perceptron can only separate data in a linear fashion, a few groups of perceptrons working together are able to accomplish the task of separating non-linear data. We refer to this collection of perceptrons working together as an MLP (multilayer perceptron). A multilayer perceptron consists of at least three layers (sets of perceptrons): the input layer, the hidden layer, and the output layer.

MLP is a supervised algorithm, which means it needs to be "told" when it correctly or incorrectly predicts something. MLP accomplishes this by having two distinct phases.

Feed-forward Stage: During this stage, the input perceptrons of the network are fed training input data. The input passes through each layer of perceptrons, eventually resulting in a single binary output that represents the "prediction" made by the network.

Back Propagation Stage: The output prediction from the feed-forward stage is compared to the expected output defined by the training set. If the predictions are correct, there is very little to be done. But if the predictions are incorrect, the network "propagates" the difference between the expected and actual results back through the network. This process gives the network a chance to "correct" the mistakes that caused the failed prediction, hopefully increasing the accuracy of subsequent runs. Usually, this is accomplished through the use of an optimization function such as the SGD (Stochastic Gradient Descent) algorithm.

XOR Problem

We need a non-linear problem that we can try and use a network to solve. To keep things simple, we'll attempt to build an "AI" that can correctly predict XOR when given a set of inputs. For those unfamiliar, XOR is a simple bitwise operator that is defined by the following truth table.

| Input Bit 0 | Input Bit 1 | Input Bit 2 | Expected Output |
| --- | --- | --- | --- |
| 1 | 0 | 0 | 1 |
| 0 | 1 | 0 | 1 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 0 |

If more than one bit in a row of the XOR table is set, the expression is false. If exactly one bit is set (as with rows 1, 2, and 3), the expression is true.

Getting the Sheet Ready

To start building the spreadsheet XOR predictor, we need to first copy the XOR training set from our table (above).

B9:E15

We also need to define some stateful variables to make sure our network can persist and learn. Specifically, these variables represent the perceptrons' "weights" at each layer of the network.

B74:E78

Here's the formula from one of the above cells.

=IF(ISNUMBER(B69), B69, (RAND() * 2) - 1)

The weights of the perceptrons need to start out as random values in a defined range of -1 to +1. This is easily accomplished using the (RAND() * 2) - 1 formula (RAND() generates a random number from 0 - 1).

The problem arises once we need to update the weights. We don't want them to be randomly reset, overriding the progress made by our network. The IF lets us start with random values, but once cell B69 has content, the formula uses the value of that cell instead of generating a new random number.
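
In code, the same idea is simpler: initialize the weight blocks once with uniform random values in [-1, 1], then keep reusing (and updating) the same variables on every iteration instead of re-randomizing them. A rough NumPy sketch, with the shapes taken from the sheet's ranges (3 rows x 4 columns for B76:E78, 1 row x 4 columns for B74:E74):

# Equivalent of (RAND() * 2) - 1 applied to whole weight blocks.
import numpy as np

w_hidden = np.random.rand(3, 4) * 2 - 1   # hidden layer weights (B76:E78)
w_output = np.random.rand(1, 4) * 2 - 1   # output perceptron weights (B74:E74)
# In a program these are initialized once and then only updated in place,
# which is what the IF(ISNUMBER(...)) trick simulates in the sheet.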

Feed Forward Stage

Input Activation Potentials B17:E23

Next, we will begin implementing the feed-forward stage of our network. The first step is to calculate "Input Activation Potentials" by taking the dot product of each row of the training input data B9:D15 and each column of the hidden perceptron weights B76:E78.

=ARRAYFORMULA(
 IF(
 AND(current_iteration < target_iterations,
  ISNUMBER(current_iteration)),
    { 
      MMULT({training_inputs, ones},
        TRANSPOSE(hidden_perceptrons)),
      ones
    },
    INDIRECT("RC", FALSE)))

The IF portion simply stops the network from looping infinitely, by checking the cell holding the current number of iterations.

IF(
AND(current_iteration < target_iterations,
 ISNUMBER(current_iteration)), 

The arguments following the IF are evaluated based on the result of the branch. If...

current_iterations >= target_iterations

we use INDIRECT("RC", FALSE), which is comparable to using this in modern programming languages: the cell simply refers to its own current value, so nothing changes. Otherwise, we calculate the dot products using MMULT (matrix multiplication is just multiple dot products).
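
For reference, here is a rough NumPy analogue of this MMULT step (a sketch, not a cell-for-cell port of the sheet): append a bias column of ones to the training inputs, then take the dot products with the hidden weights.

# Feed-forward potentials: {training_inputs, ones} times the transposed
# hidden weights. The weights are random stand-ins; only the shapes matter here.
import numpy as np

X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [0, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=float)  # B9:D15
w_hidden = np.random.rand(3, 4) * 2 - 1                                  # B76:E78

Xb = np.hstack([X, np.ones((7, 1))])   # {training_inputs, ones}
potentials = Xb @ w_hidden.T           # MMULT(...): 7 rows, one column per hidden neuron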

Hidden Layer Activators B25:E31

Next we need to calculate the activators of the hidden layer of the network. To compute the activators, we run an ["activation function"](https://en.wikipedia.org/wiki/Activation_function) on the potentials B17:E23. In practice there are many choices for your activator; in this case we will use the sigmoid function.

Sigmoid function: 1 / (1 + exp(-x))

={
  ARRAYFORMULA(1 / ( 1 + EXP(-potentials))),
  potentials_bias
 }
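
The same step in NumPy terms (a sketch with stand-in values): apply the sigmoid element-wise to the potentials, then append the bias column of ones.

# Hidden activators: sigmoid of the potentials plus a bias column of ones.
import numpy as np

potentials = np.random.rand(7, 3)                      # stand-in for B17:D23
hidden_act = np.hstack([1 / (1 + np.exp(-potentials)),
                        np.ones((7, 1))])              # B25:E31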

Output Layer Activators B33:B39

Calculating the activations of our outputs is next. Unlike the hidden activations, we don't need to run our output potentials through the activator function. This makes calculating our output activations much easier.

=MMULT(
  hidden_activators, TRANSPOSE(output_perceptron)
 )

Just like we did for the hidden potentials, we take the dot product of the output perceptrons B74:E74 with the "inputs" (hidden activators B25:E31) to generate the initial output of the network.
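
In NumPy terms this step is just one more matrix product, with no activation function applied afterwards (stand-in values, shapes matching the sheet's ranges):

# Output activators: hidden activators times the output perceptron weights.
import numpy as np

hidden_act = np.random.rand(7, 4)         # stand-in for B25:E31
w_output = np.random.rand(1, 4) * 2 - 1   # B74:E74
output = hidden_act @ w_output.T          # B33:B39, one raw value per training row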

Output Binary D33:D39

We now need to take the "raw" values emitted from the output layer B33:B39 and convert them into a binary representation. This is required because we are trying to solve an XOR problem, where only boolean values are considered valid.

=IF(B33 < 0.5, 0, 1)

Each cell of the "Output Binary" D33:D39 checks its partner cell in the "Output Activations" (what was calculated in the last step, cells B33:B39). If the output activation value is below 0.5, the binary result is 0. Otherwise, the binary result is 1.
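
The equivalent thresholding in code (the raw outputs below are made-up stand-ins):

# Binary predictions: anything below 0.5 becomes 0, everything else becomes 1.
import numpy as np

output = np.array([[0.12], [0.91], [0.48], [0.73]])  # stand-in raw outputs (B33:B39)
binary = (output >= 0.5).astype(int)                  # D33:D39 -> [[0], [1], [0], [1]]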

Congratulations! Even though the current prediction is almost statistically guaranteed to be wrong, you can at least say you got to the point where a prediction was made!

Backward Propagation Stage

This stage is where real "learning" takes place. Using the predictions and expected results, we calculate derivatives to adjust our weights.

Output Layer Delta F33:F39

To continue, we need to calculate the amount the output differed from the target results. This is accomplished by subtracting each target result (E9:E15) from its corresponding output activator (B33:B39). The 2 is simply a scalar used to increase the significance of the difference.

=(B33 - E9) * 2
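
A NumPy sketch of the same delta calculation, with made-up stand-in values:

# Output delta: (raw output - target) scaled by 2.
import numpy as np

output = np.array([[0.12], [0.91], [0.48]])  # stand-in for B33:B39
target = np.array([[1.0], [1.0], [0.0]])     # stand-in for E9:E15
output_delta = (output - target) * 2          # F33:F39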

Hidden Layer Sum B41:E47

This next step is very similar to a portion of the feed-forward stage. We're just calculating dot products between the output deltas previously calculated and the current "Output Perceptron" weight values.

=MMULT(
  output_delta, output_perceptron
 )
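
In NumPy terms (stand-in values), this distributes each row's output delta back across the output perceptron's weights:

# Hidden layer sum: output deltas times the output perceptron weights.
import numpy as np

output_delta = np.random.rand(7, 1)       # stand-in for F33:F39
w_output = np.random.rand(1, 4) * 2 - 1   # B74:E74
hidden_sum = output_delta @ w_output      # B41:E47, 7 rows x 4 columns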

Hidden Layer Deltas B49:E55

Next, we determine how much the hidden layer weights should be updated. This is calculated using our hidden activators B25:E31 and the derivative of the sigmoid function (the derivative of our activator in the feed forward stage).

Derivative of the sigmoid function: x * (1 - x)

=ARRAYFORMULA(
  (hidden_activators * (1 - hidden_activators)) * hidden_sum
 )
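
The same calculation as a NumPy sketch (stand-in values):

# Hidden deltas: sigmoid derivative of the hidden activators, scaled by the
# hidden layer sums.
import numpy as np

hidden_act = np.random.rand(7, 4)   # stand-in for B25:E31
hidden_sum = np.random.rand(7, 4)   # stand-in for B41:E47
hidden_delta = hidden_act * (1 - hidden_act) * hidden_sum   # B49:E55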

Output Change B57:E63, Output Change Average B65:E65 and Updated Output B67:E67

We want to calculate how much our output should change based on "Output Delta" F33:F39, our "Hidden Activators" B25:E31 and the "Learning Rate" B3. Due to a limitation of sheets, we have to do this in three discrete steps.

Output Change B57:E63

We want to calculate how much the output perceptrons should be changed. The "Output Delta" represents how much our prediction differed from the expected results. We multiply by the "Hidden Activators", because they represent the state that produced the most recent predictions.

"Learning rate" B3 is a scalar that controls how much each iteration influences the "learning" of the network. A high learning rate may sound enticing, but it introduces the possibility of overstepping a minimum and missing a convergence. Feel free to play around with learning rate, but take small steps.

=ARRAYFORMULA(
  ((output_delta * hidden_activators) * learning_rate)
 )

Output Change Avg B65:E65

For both clarity and practicality, it was easier to separate out the averaging of our changes into a separate step. Averaging is necessary because we are feeding the network 7 inputs at once B9:E15. That means we're dealing with 7 outputs at once, and averaging provides a way to consolidate them into a single value per weight.

=SUM(dim0_output_change)

Updated Output B67:E67

Finally, subtract the calculated "Output Change Average" B65:E65 from the current values of the "Output Perceptron" B74:E74.

=ARRAYFORMULA(
  output_perceptron - output_weights_avg_change
 )
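
The three spreadsheet steps above collapse into a few lines of NumPy (a sketch with stand-in values; the learning rate here is an illustrative number, not the actual contents of B3). Note that, like the sheet, this consolidates the per-row changes column by column:

# Output change, per-column consolidation, and the updated output weights.
import numpy as np

output_delta = np.random.rand(7, 1)       # F33:F39
hidden_act = np.random.rand(7, 4)         # B25:E31
w_output = np.random.rand(1, 4) * 2 - 1   # B74:E74
lr = 0.05                                 # illustrative "Learning Rate" (B3)

output_change = output_delta * hidden_act * lr   # B57:E63
change_per_weight = output_change.sum(axis=0)    # B65:E65
w_output = w_output - change_per_weight          # B67:E67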

Updated Hidden B69:E72

We just updated the weights of our "Output Perceptrons", so now we need to update the weights for our "Hidden Perceptrons". Mirroring the structure used in the previous step, we use the "Hidden Deltas" B49:E55, the "Learning Rate" B3, and the "Test Input" B9:E15 to calculate how much our "Hidden Perceptrons" should change, based on the previous iteration of the network.

The TRANSPOSE({1,1,1,1,1,1,1}) represents our bias neurons, which function similarly to b in y = mx + b.

=ARRAYFORMULA(
  hidden_perceptrons - ARRAYFORMULA(
    (learning_rate * (
        MMULT(
          TRANSPOSE(hidden_delta),
          { 
            training_inputs, TRANSPOSE({1,1,1,1,1,1,1})
          }
        )
      )
    )
  )
 )

Instead of the 3 distinct steps we used to calculate the "Output Change", we condense this into a single step.
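
A NumPy sketch of the condensed hidden update (stand-in values; the learning rate is illustrative). One small difference from the sheet: the bias column of the hidden deltas is dropped here, since the constant bias "activator" has no incoming weights to update.

# Hidden update: gradient = transposed hidden deltas times the
# bias-augmented inputs, scaled by the learning rate.
import numpy as np

X = np.random.randint(0, 2, size=(7, 3)).astype(float)   # stand-in for B9:D15
Xb = np.hstack([X, np.ones((7, 1))])                      # {training_inputs, ones}
hidden_delta = np.random.rand(7, 4)                       # stand-in for B49:E55
w_hidden = np.random.rand(3, 4) * 2 - 1                   # B76:E78
lr = 0.05                                                 # illustrative learning rate

w_hidden = w_hidden - lr * (hidden_delta[:, :3].T @ Xb)   # updated hidden weights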

Loop

At the beginning of the explanation, we described how our weights are initialized to random values (using the RAND formula). Specifically, the IF conditional referred to a set of cells, and if those cells were empty, RAND was the fallback. Now that we have calculated valid values for the "Updated Output" and "Updated Hidden", the IF no longer evaluates the RAND branch. Instead, it proxies those "Updated" values as its own.

Output Perceptron

=IF(ISNUMBER(B67), B67, (RAND() * 2) - 1)

Hidden Perceptron

=IF(ISNUMBER(B69), B69, (RAND() * 2) - 1)

And because our "Activation Potentials" B17:E23 depend on the "Hidden Perceptrons", any updates to the "Hidden Perceptrons" force a recalculation of the "Activation Potentials". This means, as long as "Current Iteration" D1 is less than "Target Iterations" B5, the network will loop.
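
Putting it all together, here is a rough NumPy analogue of the whole loop: feed-forward, back propagation, and weight updates, run for a fixed number of iterations instead of the sheet's iterative recalculation. It is a sketch, not a cell-for-cell port: the learning rate and iteration count are illustrative, and the bias handling is simplified as noted above.

import numpy as np

# Training set from the XOR table (B9:E15).
X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [0, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[1], [1], [1], [0], [0], [0], [0]], dtype=float)

rng = np.random.default_rng(0)
w_hidden = rng.uniform(-1, 1, size=(3, 4))   # 3 hidden neurons, 3 inputs + bias
w_output = rng.uniform(-1, 1, size=(1, 4))   # 1 output neuron, 3 hidden + bias
lr = 0.05                                    # illustrative "Learning Rate" (B3)

Xb = np.hstack([X, np.ones((len(X), 1))])    # {training_inputs, ones}

for iteration in range(500):                 # illustrative "Target Iterations" (B5)
    # Feed-forward stage
    potentials = Xb @ w_hidden.T                               # B17:D23
    hidden_act = np.hstack([1 / (1 + np.exp(-potentials)),     # B25:E31
                            np.ones((len(X), 1))])
    output = hidden_act @ w_output.T                           # B33:B39

    # Backward propagation stage
    output_delta = (output - y) * 2                            # F33:F39
    hidden_sum = output_delta @ w_output                       # B41:E47
    hidden_delta = hidden_act * (1 - hidden_act) * hidden_sum  # B49:E55

    w_output -= lr * (output_delta * hidden_act).sum(axis=0)   # B57:E67
    w_hidden -= lr * (hidden_delta[:, :3].T @ Xb)              # B69:E72

predictions = (output >= 0.5).astype(int)                      # D33:D39
print("predictions:", predictions.ravel(), "targets:", y.ravel().astype(int))

Running it is roughly equivalent to bumping "Target Iterations" in the sheet and letting it recalculate.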

Idea Origin

At a previous company, I led a technical team implementing high-performance neural networks on top of our core product (a scale-out distributed compute platform). During this time, I quickly learned that for most people (including upper-level management), neural networks are indistinguishable from alien technology.

It became clear that I needed a way to explain the basic mechanics behind deep learning without explaining the intricacies of implementing a high-performance neural network. At first I tried using simple network engines written in Python, but this only made sense to those who were already Python programmers. But then, I had an epiphany! I realized that it might be possible to build a simple neural network entirely with spreadsheet formulas.

The best machine learning and deep learning libraries

Why TensorFlow, Spark MLlib, Scikit-learn, PyTorch, MXNet, and Keras shine for building and training machine learning and deep learning models. If you’re starting a new machine learning or deep learning project, you may be confused about which framework to choose...

There is a difference between a machine learning framework and a deep learning framework. Essentially, a machine learning framework covers a variety of learning methods for classification, regression, clustering, anomaly detection, and data preparation, and may or may not include neural network methods.

A deep learning or deep neural network framework covers a variety of neural network topologies with many hidden layers. Keras, MXNet, PyTorch, and TensorFlow are deep learning frameworks. Scikit-learn and Spark MLlib are machine learning frameworks. (Click any of the previous links to read my stand-alone review of the product.)

In general, deep neural network computations run much faster on a GPU (specifically an Nvidia CUDA general-purpose GPU), TPU, or FPGA, rather than on a CPU. In general, simpler machine learning methods don’t benefit from a GPU.

While you can train deep neural networks on one or more CPUs, the training tends to be slow, and by slow I’m not talking about seconds or minutes. The more neurons and layers that need to be trained, and the more data available for training, the longer it takes. When the Google Brain team trained its language translation models for the new version of Google Translate in 2016, they ran their training sessions for a week at a time, on multiple GPUs. Without GPUs, each model training experiment would have taken months.

Since then, the Intel Math Kernel Library (MKL) has made it possible to train some neural networks on CPUs in a reasonable amount of time. Meanwhile GPUs, TPUs, and FPGAs have gotten even faster.

The training speed of all of the deep learning packages running on the same GPUs is nearly identical. That’s because the training inner loops spend most of their time in the Nvidia CuDNN package.

Apart from training speed, each of the deep learning libraries has its own set of pros and cons, and the same is true of Scikit-learn and Spark MLlib. Let’s dive in.

Keras

Keras is a high-level, front-end specification and implementation for building neural network models that ships with support for three back-end deep learning frameworks: TensorFlow, CNTK, and Theano. Amazon is currently working on developing an MXNet back-end for Keras. It’s also possible to use PlaidML (an independent project) as a back-end for Keras to take advantage of PlaidML’s OpenCL support for all GPUs.

TensorFlow is the default back-end for Keras, and the one recommended for many use cases involving GPU acceleration on Nvidia hardware via CUDA and cuDNN, as well as for TPU acceleration in Google Cloud. TensorFlow also contains an internal tf.keras class, separate from an external Keras installation.

Keras has a high-level environment that makes adding a layer to a neural network as easy as one line of code in its Sequential model, and requires only one function call each for compiling and training a model. Keras lets you work at a lower level if you want, with its Model or functional API.

Keras allows you to drop down even farther, to the Python coding level, by subclassing keras.Model, but prefers the functional API when possible. Keras also has a scikit-learn API, so that you can use the Scikit-learn grid search to perform hyperparameter optimization in Keras models.
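
As a rough illustration of that workflow (a sketch, not code from the original article; layer sizes and data are placeholders), the Sequential API really is one line per layer plus one call each to compile and train:

# Minimal tf.keras Sequential model: one line per layer, one compile, one fit.
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(3,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.randint(0, 2, size=(32, 3)).astype("float32")   # toy data
y = (X.sum(axis=1) == 1).astype("float32")                    # toy labels
model.fit(X, y, epochs=5, verbose=0)                          # one call to train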

Cost: Free open source.

Platform: Linux, MacOS, Windows, or Raspbian; TensorFlow, Theano, or CNTK back-end.

MXNet

MXNet has evolved and improved quite a bit since moving under the Apache Software Foundation umbrella early in 2017. While there has been work on Keras with an MXNet back-end, a different high-level interface has become much more important: Gluon. Prior to the incorporation of Gluon, you could either write easy imperative code or fast symbolic code in MXNet, but not both at once. With Gluon, you can combine the best of both worlds, in a way that competes with both Keras and PyTorch.

The advantages claimed for Gluon include:

  • Simple, easy-to-understand code: Gluon offers a full set of plug-and-play neural network building blocks, including predefined layers, optimizers, and initializers.
  • Flexible, imperative structure: Gluon does not require the neural network model to be rigidly defined, but rather brings the training algorithm and model closer together to provide flexibility in the development process.
  • Dynamic graphs: Gluon enables developers to define neural network models that are dynamic, meaning they can be built on the fly, with any structure, and using any of Python’s native control flow.
  • High performance: Gluon provides all of the above benefits without impacting the training speed that the underlying engine provides.

These four advantages, along with a vastly expanded collection of model examples, bring Gluon/MXNet to rough parity with Keras/TensorFlow and PyTorch for ease of development and training speed. You can see code examples for each of these on the main Gluon page and repeated on the overview page for the Gluon API.

The Gluon API includes functionality for neural network layers, recurrent neural networks, loss functions, dataset methods and vision datasets, a model zoo, and a set of contributed experimental neural network methods. You can freely combine Gluon with standard MXNet and NumPy modules, for example module, autograd, and ndarray, as well as with Python control flows.

Gluon has a good selection of layers for building models, including basic layers (Dense, Dropout, etc.), convolutional layers, pooling layers, and activation layers. Each of these is a one-line call. These can be used, among other places, inside of network containers such as gluon.nn.Sequential().
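
A small Gluon sketch of that style (layer sizes are illustrative; this assumes an MXNet installation and is not code from the original article):

# Gluon: one-line layers added to a Sequential container, then a forward pass.
from mxnet import gluon, nd

net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(64, activation="relu"))
net.add(gluon.nn.Dense(1))
net.initialize()                       # shapes are inferred on the first call

x = nd.random.uniform(shape=(4, 3))    # a toy batch: 4 samples, 3 features
print(net(x).shape)                    # (4, 1)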

Cost: Free open source.

Platform: Linux, MacOS, Windows, Docker, Raspbian, and Nvidia Jetson; Python, R, Scala, Julia, Perl, C++, and Clojure (experimental). MXNet is included in the AWS Deep Learning AMI.

PyTorch

PyTorch builds on the old Torch and the new Caffe2 framework. As you might guess from the name, PyTorch uses Python as its scripting language, and it uses an evolved Torch C/CUDA back-end. The production features of Caffe2 are being incorporated into the PyTorch project.

PyTorch is billed as “Tensors and dynamic neural networks in Python with strong GPU acceleration.” What does that mean?

Tensors are a mathematical construct that is used heavily in physics and engineering. A tensor of rank two is a special kind of matrix; taking the inner product of a vector with the tensor yields another vector with a new magnitude and a new direction. TensorFlow takes its name from the way tensors (of synapse weights) flow around its network model. NumPy also uses tensors, but calls them an ndarray.

GPU acceleration is a given for most modern deep neural network frameworks. A dynamic neural network is one that can change from iteration to iteration, for example allowing a PyTorch model to add and remove hidden layers during training to improve its accuracy and generality. PyTorch recreates the graph on the fly at each iteration step. In contrast, TensorFlow by default creates a single dataflow graph, optimizes the graph code for performance, and then trains the model.

While eager execution mode is a fairly new option in TensorFlow, it’s the only way PyTorch runs: API calls execute when invoked, rather than being added to a graph to be run later. That might seem like it would be less computationally efficient, but PyTorch was designed to work that way, and it is no slouch when it comes to training or prediction speed.
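
A tiny sketch of what "dynamic" means in practice (illustrative sizes, not code from the original article): ordinary Python control flow inside forward() is re-executed on every call, so the graph can differ from iteration to iteration.

# PyTorch eager execution: the network structure can change on each call.
import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(3, 3)
        self.out = nn.Linear(3, 1)

    def forward(self, x):
        # Apply the hidden layer a random number of times (1 to 3).
        for _ in range(torch.randint(1, 4, (1,)).item()):
            x = torch.relu(self.hidden(x))
        return self.out(x)

net = DynamicNet()
print(net(torch.rand(4, 3)).shape)   # torch.Size([4, 1])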

PyTorch integrates acceleration libraries such as Intel MKL and Nvidia cuDNN and NCCL (Nvidia Collective Communications Library) to maximize speed. Its core CPU and GPU Tensor and neural network back-ends—TH (Torch), THC (Torch CUDA), THNN (Torch Neural Network), and THCUNN (Torch CUDA Neural Network)—are written as independent libraries with a C99 API. At the same time, PyTorch is not a Python binding into a monolithic C++ framework—the intention is for it to be deeply integrated with Python and to allow the use of other Python libraries.

Cost: Free open source.

Platform: Linux, MacOS, Windows; CPUs and Nvidia GPUs.

Scikit-learn

The Scikit-learn Python framework has a wide selection of robust machine learning algorithms, but no deep learning. If you’re a Python fan, Scikit-learn may well be the best option for you among the plain machine learning libraries.

Scikit-learn is a robust and well-proven machine learning library for Python with a wide assortment of well-established algorithms and integrated graphics. It is relatively easy to install, learn, and use, and it has good examples and tutorials.

On the con side, Scikit-learn does not cover deep learning or reinforcement learning, lacks graphical models and sequence prediction, and it can’t really be used from languages other than Python. It doesn’t support PyPy, the Python just-in-time compiler, or GPUs. That said, except for its minor foray into neural networks, it doesn’t really have speed problems. It uses Cython (the Python to C compiler) for functions that need to be fast, such as inner loops.

Scikit-learn has a good selection of algorithms for classification, regression, clustering, dimensionality reduction, model selection, and preprocessing. It has good documentation and examples for all of these, but lacks any kind of guided workflow for accomplishing these tasks.

Scikit-learn earns top marks for ease of development, mostly because the algorithms all work as documented, the APIs are consistent and well-designed, and there are few “impedance mismatches” between data structures. It’s a pleasure to work with a library whose features have been thoroughly fleshed out and whose bugs have been thoroughly flushed out.
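
That consistency is easy to see in practice: every estimator follows the same fit/predict (or fit/score) pattern. A small sketch (the dataset and classifier are arbitrary choices, not from the original article):

# The standard scikit-learn workflow: split, fit, score.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)            # every estimator trains this way
print(clf.score(X_test, y_test))     # and evaluates this way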

On the other hand, the library does not cover deep learning or reinforcement learning, which leaves out the current hard but important problems, such as accurate image classification and reliable real-time language parsing and translation. Clearly, if you’re interested in deep learning, you should look elsewhere.

Nevertheless, there are many problems—ranging from building a prediction function linking different observations, to classifying observations, to learning the structure of an unlabeled dataset—that lend themselves to plain old machine learning without needing dozens of layers of neurons, and for those areas Scikit-learn is very good indeed.

Cost: Free open source.

Platform: Requires Python, NumPy, SciPy, and Matplotlib. Releases are available for MacOS, Linux, and Windows.

Spark MLlib

Spark MLlib, the open source machine learning library for Apache Spark, provides common machine learning algorithms such as classification, regression, clustering, and collaborative filtering (but not deep neural networks). It also includes tools for feature extraction, transformation, dimensionality reduction, and selection; tools for constructing, evaluating, and tuning machine learning pipelines; and utilities for saving and loading algorithms, models, and pipelines, for data handling, and for doing linear algebra and statistics.

Spark MLlib is written in Scala, and uses the linear algebra package Breeze. Breeze depends on netlib-java for optimized numerical processing, although in the open source distribution that means optimized use of the CPU. Databricks offers customized Spark clusters that use GPUs, which can potentially get you another 10x speed improvement for training complex machine learning models with big data.

Spark MLlib implements a truckload of common algorithms and models for classification and regression, to the point where a novice could become confused, but an expert would be likely to find a good choice of model for the data to be analyzed, eventually. To this plethora of models Spark 2.x adds the important feature of hyperparameter tuning, also known as model selection. Hyperparameter tuning allows the analyst to set up a parameter grid, an estimator, and an evaluator, and let the cross-validation method (time-consuming but accurate) or train validation split method (faster but less accurate) find the best model for the data.
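
A PySpark sketch of that grid / estimator / evaluator / cross-validation pattern (the tiny toy DataFrame and parameter values are placeholders, not from the original article):

# Hyperparameter tuning in Spark MLlib: parameter grid + estimator +
# evaluator, searched with cross-validation.
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

spark = SparkSession.builder.getOrCreate()
train = spark.createDataFrame(
    [(Vectors.dense([0.0, 1.0]), 1.0), (Vectors.dense([1.0, 0.0]), 1.0),
     (Vectors.dense([1.0, 1.0]), 0.0), (Vectors.dense([0.0, 0.0]), 0.0)] * 10,
    ["features", "label"])

lr = LogisticRegression(featuresCol="features", labelCol="label")
grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.01, 0.1])
        .addGrid(lr.maxIter, [10, 50])
        .build())

cv = CrossValidator(estimator=lr, estimatorParamMaps=grid,
                    evaluator=BinaryClassificationEvaluator(), numFolds=3)
best_model = cv.fit(train).bestModel   # the best model found over the grid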

Spark MLlib has full APIs for Scala and Java, mostly-full APIs for Python, and sketchy partial APIs for R. You can get a good feel for the coverage by counting the samples: 54 Java and 60 Scala machine learning examples, 52 Python machine learning examples, and only five R examples. In my experience Spark MLlib is easiest to work with using Jupyter notebooks, but you can certainly run it in a console if you tame the verbose Spark status messages.

Spark MLlib supplies pretty much anything you’d want in the way of basic machine learning, feature selection, pipelines, and persistence. It does a pretty good job with classification, regression, clustering, and filtering. Given that it is part of Spark, it has great access to databases, streams, and other data sources. On the other hand, Spark MLlib is not really set up to model and train deep neural networks in the same way as TensorFlow, PyTorch, MXNet, and Keras.

Cost: Free open source.

Platform: Spark runs on both Windows and Unix-like systems (e.g. Linux, MacOS), with Java 7 or later, Python 2.6/3.4 or later, and R 3.1 or later. For the Scala API, Spark 2.0.1 uses Scala 2.11. Spark requires Hadoop/HDFS.

TensorFlow

TensorFlow is probably the gold standard for deep neural network development, although it is not without its defects. Two of the biggest issues with TensorFlow historically were that it was too hard to learn and that it took too much code to create a model. Both issues have been addressed over the last few years.

To make TensorFlow easier to learn, the TensorFlow team has produced more learning materials as well as clarifying the existing “getting started” tutorials. A number of third parties have produced their own tutorial materials (including InfoWorld). There are now multiple TensorFlow books in print, and several online TensorFlow courses. You can even follow the CS20 course at Stanford, TensorFlow for Deep Learning Research, which posts all the slides and lecture notes online.

There are several new sections of the TensorFlow library that offer interfaces that require less programming to create and train models. These include tf.keras, which provides a TensorFlow-only version of the otherwise engine-neutral Keras package, and tf.estimator, which provides a number of high-level facilities for working with models. These include both regressors and classifiers for linear, deep neural networks, and combined linear and deep neural networks, plus a base class from which you can build your own estimators. In addition, the Dataset API enables you to build complex input pipelines from simple, reusable pieces. You don’t have to choose just one. As this tutorial shows, you can usefully make tf.keras, tf.data.Dataset, and tf.estimator work together.
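
A small sketch of tf.keras and the Dataset API working together (model, data, and hyperparameters are placeholders, not from the original article):

# Build an input pipeline with tf.data and feed it straight to tf.keras.
import numpy as np
import tensorflow as tf

X = np.random.rand(64, 3).astype("float32")              # toy features
y = (X.sum(axis=1) > 1.5).astype("float32")              # toy labels
dataset = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(64).batch(8)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(dataset, epochs=3, verbose=0)   # tf.keras consumes the Dataset directly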

TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices, which enables on-device machine learning inference (but not training) with low latency and a small binary size. TensorFlow Lite also supports hardware acceleration with the Android Neural Networks API. TensorFlow Lite models are small enough to run on mobile devices, and can serve the offline use case.

The basic idea of TensorFlow Lite is that you train a full-blown TensorFlow model and convert it to the TensorFlow Lite model format. Then you can use the converted file in your mobile application on Android or iOS.

Alternatively, you can use one of the pre-trained TensorFlow Lite models for image classification or smart replies. Smart replies are contextually relevant messages that can be offered as response options; this essentially provides the same reply prediction functionality as found in Google’s Gmail clients.

Yet another option is to retrain an existing TensorFlow model against a new tagged dataset, an important technique called transfer learning, which reduces training times significantly. A hands-on tutorial on this process is called TensorFlow for Poets.

Cost: Free open source.

Platform: Ubuntu 14.04 or later, MacOS 10.11 or later, Windows 7 or later; Nvidia GPU and CUDA recommended. Most clouds now support TensorFlow with Nvidia GPUs. TensorFlow Lite runs trained models on Android and iOS.

Machine learning or deep learning?

Sometimes you know that you’ll need a deep neural network to solve a particular problem effectively, for example to classify images, recognize speech, or translate languages. Other times, you don’t know whether that’s necessary, for example to predict next month’s sales figures or to detect outliers in your data.

If you do need a deep neural network, then Keras, MXNet with Gluon, PyTorch, and TensorFlow with Keras or Estimators are all good choices. If you aren’t sure, then start with Scikit-learn or Spark MLlib and try all the relevant algorithms. If you get satisfactory results from the best model or an ensemble of several models, you can stop.

If you need better results, then try to perform transfer learning on a trained deep neural network. If you still don’t get what you need, then try building and training a deep neural network from scratch. To refine your model, try hyperparameter tuning.

No matter what method you use to train a model, remember that the model is only as good as the data you use for training. Remember to clean it, to standardize it, and to balance the sizes of your training classes.

Deep Learning vs. Conventional Machine Learning

Over the past few years, deep learning has given rise to a massive collection of ideas and techniques which are disruptive to conventional machine learning practices. However, are those ideas totally different from the traditional methods? Where are the connections and differences? What are the advantages and disadvantages? How practical are the deep learning methods for business applications? Chao will share her thoughts on those questions based on her readings and hands on experiments in the areas of text analytics (question answering system, sentiment analysis) and healthcare image classification.

Thanks for reading


Further reading about Deep Learning and Machine Learning

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning for Front-End Developers

The Future of Machine Learning and JavaScript

Deep Learning With TensorFlow 2.0

How to get started with Python for Deep Learning and Data Science

Machine Learning Full Course - Learn Machine Learning

This complete Machine Learning full course video covers all the topics that you need to know to become a master in the field of Machine Learning.

Machine Learning Full Course | Learn Machine Learning | Machine Learning Tutorial

It covers all the basics of Machine Learning (01:46), the different types of Machine Learning (18:32), and the various applications of Machine Learning used in different industries (04:54:48). This video will help you learn different Machine Learning algorithms in Python. Linear Regression, Logistic Regression (23:38), K Means Clustering (01:26:20), Decision Tree (02:15:15), and Support Vector Machines (03:48:31) are some of the important algorithms you will understand with a hands-on demo. Finally, you will see the essential skills required to become a Machine Learning Engineer (04:59:46) and come across a few important Machine Learning interview questions (05:09:03). Now, let's get started with Machine Learning.

Below topics are explained in this Machine Learning course for beginners:

  1. Basics of Machine Learning - 01:46

  2. Why Machine Learning - 09:18

  3. What is Machine Learning - 13:25

  4. Types of Machine Learning - 18:32

  5. Supervised Learning - 18:44

  6. Reinforcement Learning - 21:06

  7. Supervised VS Unsupervised - 22:26

  8. Linear Regression - 23:38

  9. Introduction to Machine Learning - 25:08

  10. Application of Linear Regression - 26:40

  11. Understanding Linear Regression - 27:19

  12. Regression Equation - 28:00

  13. Multiple Linear Regression - 35:57

  14. Logistic Regression - 55:45

  15. What is Logistic Regression - 56:04

  16. What is Linear Regression - 59:35

  17. Comparing Linear & Logistic Regression - 01:05:28

  18. What is K-Means Clustering - 01:26:20

  19. How does K-Means Clustering work - 01:38:00

  20. What is Decision Tree - 02:15:15

  21. How does Decision Tree work - 02:25:15 

  22. Random Forest Tutorial - 02:39:56

  23. Why Random Forest - 02:41:52

  24. What is Random Forest - 02:43:21

  25. How does Decision Tree work - 02:52:02

  26. K-Nearest Neighbors Algorithm Tutorial - 03:22:02

  27. Why KNN - 03:24:11

  28. What is KNN - 03:24:24

  29. How do we choose 'K' - 03:25:38

  30. When do we use KNN - 03:27:37

  31. Applications of Support Vector Machine - 03:48:31

  32. Why Support Vector Machine - 03:48:55

  33. What Support Vector Machine - 03:50:34

  34. Advantages of Support Vector Machine - 03:54:54

  35. What is Naive Bayes - 04:13:06

  36. Where is Naive Bayes used - 04:17:45

  37. Top 10 Application of Machine Learning - 04:54:48

  38. How to become a Machine Learning Engineer - 04:59:46

  39. Machine Learning Interview Questions - 05:09:03