PyTorch vs TensorFlow: Which Framework Is Best?

The pros and cons of using PyTorch or TensorFlow for deep learning in Python projects.

If you are reading this, you've probably already started your journey into deep learning. If you are new to the field, deep learning is, in simple terms, an approach to building human-like computers that solve real-world problems using special brain-like architectures called artificial neural networks. To help develop these architectures, tech giants like Google, Facebook and Uber have released various frameworks for the Python deep learning environment, making it easier to learn, build and train diverse neural networks. In this article, we’ll take a brief look at the two most used and relied-upon Python frameworks and compare them: PyTorch vs. TensorFlow.


TensorFlow is an open source deep learning framework created by developers at Google and released in 2015. The official research is published in the paper “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.”

TensorFlow is now widely used by companies, startups, and business firms to automate things and develop new systems. It draws its reputation from its distributed training support, scalable production and deployment options, and support for various devices like Android.


PyTorch is one of the latest deep learning frameworks. It was developed by the team at Facebook and open sourced on GitHub in 2017. You can read more about its development in the research paper "Automatic Differentiation in PyTorch."

PyTorch is gaining popularity for its simplicity, ease of use, dynamic computational graph and efficient memory usage, which we'll discuss in more detail later.


Initially, neural networks were used to solve simple classification problems like handwritten digit recognition or identifying a car’s registration number using cameras. But thanks to the latest frameworks and NVIDIA’s high-performance graphics processing units (GPUs), we can train neural networks on terabytes of data and solve far more complex problems. A few notable achievements include reaching state-of-the-art performance on the ImageNet dataset using convolutional neural networks implemented in both TensorFlow and PyTorch. The trained model can be used in different applications, such as object detection, semantic image segmentation and more.

Although the architecture of a neural network can be implemented in either of these frameworks, the results will not be identical. The training process has many framework-dependent parameters. For example, both PyTorch and TensorFlow can speed up training on NVIDIA GPUs through CUDA, but each framework has its own GPU kernels, memory management and way of placing operations on devices, so the time to train a model will vary with the framework you choose.


Magenta: An open source research project exploring the role of machine learning as a tool in the creative process.

Sonnet: A library built on top of TensorFlow for building complex neural networks.

Ludwig: A toolbox to train and test deep learning models without the need to write code.


CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning.

Pyro: A universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend.

Horizon: A platform for applied reinforcement learning (Applied RL).

These are a few of the frameworks and projects built on top of TensorFlow and PyTorch. You can find more on GitHub and on the official websites of TensorFlow and PyTorch.


The key difference between PyTorch and TensorFlow is the way they execute code. Both frameworks operate on the same fundamental data type, the tensor. You can think of a tensor as a multi-dimensional array, as shown in the picture below.
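As a rough illustration of the "multi-dimensional array" idea, the sketch below uses NumPy arrays as a stand-in (an assumption made here for portability; PyTorch and TensorFlow tensors expose comparable shape and rank information on their own types).

```python
import numpy as np

scalar = np.array(5.0)                        # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])            # rank 1: shape (3,)
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # rank 2: shape (2, 2)
cube = np.zeros((2, 3, 4))                    # rank 3: shape (2, 3, 4)

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "rank:", t.ndim, "shape:", t.shape)
```

The rank is simply the number of axes; a batch of color images, for instance, would typically be a rank-4 tensor.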



TensorFlow is a framework composed of two core building blocks:

  1. A library for defining computational graphs and runtime for executing such graphs on a variety of different hardware.
  2. A computational graph which has many advantages (but more on that in just a moment).

A computational graph is an abstract way of describing computations as a directed graph. A graph is a data structure consisting of nodes (vertices) and edges. It’s a set of vertices connected pairwise by directed edges. 

When you run code in TensorFlow, the computation graphs are defined statically. All communication with the outer world is performed via the tf.Session object and tf.placeholder, which are tensors that will be substituted by external data at runtime. For example, consider the following code snippet.
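The define-then-run idea can be sketched without TensorFlow at all. The following is a minimal pure-Python illustration, not TensorFlow's implementation; the names `Node`, `placeholder`, `const`, `add`, `mul` and `run` are hypothetical stand-ins for the roles played by graph operations, tf.placeholder and a session.

```python
class Node:
    """One vertex of the graph: an operation plus the nodes it depends on."""
    def __init__(self, op, inputs=(), name=None, value=None):
        self.op = op          # "placeholder", "const", "add" or "mul"
        self.inputs = inputs  # incoming edges
        self.name = name      # only used by placeholders
        self.value = value    # only used by constants

def placeholder(name):
    return Node("placeholder", name=name)

def const(value):
    return Node("const", value=value)

def add(a, b):
    return Node("add", (a, b))

def mul(a, b):
    return Node("mul", (a, b))

def run(node, feed):
    """Evaluate `node` by recursively evaluating the nodes it depends on."""
    if node.op == "placeholder":
        return feed[node.name]   # external data substituted at run time
    if node.op == "const":
        return node.value
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right

# 1) Define the graph: nothing is computed here.
x = placeholder("x")
y = add(mul(x, const(3.0)), const(1.0))   # y = 3x + 1

# 2) Run it, feeding the placeholder different values.
print(run(y, {"x": 2.0}))    # 7.0
print(run(y, {"x": 10.0}))   # 31.0
```

The same graph object is reused for every call to `run`; only the fed values change.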

This is how a computational graph is generated in a static way before the code is run in TensorFlow. The core advantage of having a computational graph is that it allows parallelism or dependency-driven scheduling, which makes training faster and more efficient.



Similar to TensorFlow, PyTorch has two core building blocks: 

  • Imperative and dynamic building of computational graphs.
  • Autograd: Performs automatic differentiation of the dynamic graphs.
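To make these two building blocks concrete, here is a minimal pure-Python sketch of a dynamic graph with reverse-mode automatic differentiation. It is an illustration of the idea only, not PyTorch's autograd; the `Value` class and its attributes are hypothetical names introduced for this example.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""
    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the nodes so each one is processed after everything that uses it.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += local * node.grad   # chain rule

x = Value(3.0)
w = Value(2.0)
b = Value(1.0)
y = w * x + b    # the graph is recorded while this line executes
y.backward()
print(y.data, w.grad, x.grad, b.grad)   # 7.0 3.0 2.0 1.0
```

Because the graph is built as ordinary Python runs, loops and conditionals can change its shape from one call to the next.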

As you can see in the animation below, the graphs change and execute nodes as you go, with no special session interfaces or placeholders. Overall, the framework is more tightly integrated with the Python language and feels more native most of the time. Hence, PyTorch feels like a Pythonic framework, while TensorFlow can feel like a completely new language.

How the graph is built differs a lot depending on the framework you use. TensorFlow can implement dynamic graphs through a library called TensorFlow Fold, whereas PyTorch has them built in.



One main feature that distinguishes PyTorch from TensorFlow is data parallelism. PyTorch optimizes performance by taking advantage of native support for asynchronous execution from Python. In TensorFlow, you have to manually code and fine-tune every operation to run on a specific device to allow distributed training. You can replicate in TensorFlow everything PyTorch offers here, but it takes more effort. Below is the code snippet explaining how simple it is to implement distributed training for a model in PyTorch.
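A minimal sketch of data parallelism in PyTorch, assuming the `torch.nn.DataParallel` wrapper is what is meant here (the original snippet is not available). The model, layer sizes and batch shape below are placeholders chosen for illustration. Note that `DataParallel` covers multiple GPUs on a single machine; the PyTorch documentation recommends `DistributedDataParallel` for multi-process training.

```python
import torch
import torch.nn as nn

# A small placeholder model; any nn.Module could be wrapped the same way.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

# DataParallel splits each input batch across the visible GPUs and gathers
# the outputs. With zero or one GPU it simply calls the wrapped module.
parallel_model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
parallel_model.to(device)

batch = torch.randn(64, 10, device=device)
output = parallel_model(batch)
print(output.shape)
```

The training loop itself does not change: the wrapped model is called exactly like the original one.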



When it comes to visualization of the training process, TensorFlow takes the lead. Visualization helps the developer track the training process and debug in a more convenient way. TensorFlow’s visualization library is called TensorBoard. PyTorch developers use Visdom; however, the features provided by Visdom are minimal and limited, so TensorBoard scores a point in visualizing the training process.

Features of TensorBoard

Visualizing training in TensorBoard.

Features of Visdom 

Visualizing training in Visdom.



When it comes to deploying trained models to production, TensorFlow is the clear winner. We can deploy models directly with TensorFlow Serving, a serving framework that exposes a REST client API.

In PyTorch, production deployment became easier to handle with the latest 1.0 stable version, but the framework doesn't provide any way to deploy models directly to the web. You'll have to use either Flask or Django as the backend server. So, TensorFlow Serving may be a better option if performance is a concern.



Let's compare how we declare the neural network in PyTorch and TensorFlow.

In PyTorch, your neural network is a class, and the layers needed to build the architecture are imported from the torch.nn package. All the layers are first declared in the __init__() method, and then the forward() method defines how the input x passes through the layers of the network. Lastly, we declare a variable model and assign it an instance of the defined architecture (model = NeuralNet()).
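A minimal sketch of that pattern follows. The class name `NeuralNet` comes from the text; the layer sizes (784 inputs, 128 hidden units, 10 outputs) are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers are declared once, here.
        self.fc1 = nn.Linear(784, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # forward() defines how the input flows through the layers.
        x = self.relu(self.fc1(x))
        return self.fc2(x)

model = NeuralNet()
logits = model(torch.randn(32, 784))   # a batch of 32 flattened inputs
print(logits.shape)
```

Calling `model(...)` invokes `forward()` under the hood, so the class behaves like an ordinary Python callable.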

Recently Keras, a neural network framework that uses TensorFlow as its backend, was merged into the TF repository. Since then, the syntax for declaring layers in TensorFlow has been similar to that of Keras. First, we declare the variable and assign it the type of architecture we will be declaring, in this case a “Sequential()” architecture. Next, we add layers one after another using the model.add() method. The layer types can be imported from tf.keras.layers, as shown in the code snippet below.
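A minimal sketch of the Sequential pattern, assuming TensorFlow 2.x with its bundled Keras API. The layer sizes mirror nothing in the original article and are placeholders for illustration.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(784,)))                     # input size
model.add(tf.keras.layers.Dense(128, activation="relu"))    # hidden layer
model.add(tf.keras.layers.Dense(10, activation="softmax"))  # output layer

probs = model(np.zeros((32, 784), dtype="float32"))
print(probs.shape)
```

Here the order of the `model.add()` calls is the architecture; there is no separate forward method to write.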







Recently PyTorch and TensorFlow released new versions, PyTorch 1.0 (the first stable version) and TensorFlow 2.0 (running on beta). Both these versions have major updates and new features that make the training process more efficient, smooth and powerful.

To install the latest version of these frameworks on your machine, you can either build from source or install with pip:


●  macOS and Linux

pip3 install torch torchvision

●  Windows

pip3 install <URL of the torch wheel for your Python version>

pip3 install <URL of the torchvision wheel for your Python version>



●  macOS, Linux, and Windows

# Current stable release for CPU-only

pip install tensorflow

# Install TensorFlow 2.0 Beta

pip install tensorflow==2.0.0-beta1

To check if your installation was successful, go to your command prompt or terminal and follow the steps below.
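One way to run that check is a small script like the sketch below, which looks up installed package metadata without importing the frameworks. The helper name `installed_version` is hypothetical, and the distribution names `torch` and `tensorflow` are assumptions (some installs use variants such as a CPU-only package name).

```python
from importlib import metadata

def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None

for name in ("torch", "tensorflow"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

Save it to a file and run it with the same Python interpreter that pip installed into.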


TensorFlow is a very powerful and mature deep learning library with strong visualization capabilities and several options to use for high-level model development. It has production-ready deployment options and support for mobile platforms. PyTorch, on the other hand, is still a young framework with stronger community movement and it's more Python friendly.

If you want to move fast and build AI-related products, I would recommend TensorFlow. PyTorch is mostly recommended for research-oriented developers, as it supports fast and dynamic training.

Learn TensorFlow.js - Deep Learning and Neural Networks with JavaScript

This full course introduces the concept of client-side artificial neural networks. We will learn how to deploy and run models along with full deep learning applications in the browser! To implement this cool capability, we’ll be using TensorFlow.js (TFJS), TensorFlow’s JavaScript library.

By the end of this video tutorial, you will have built and deployed a web application that runs a neural network in the browser to classify images! To get there, we'll learn about client-server deep learning architectures, converting Keras models to TFJS models, serving models with Node.js, tensor operations, and more!

⭐️Course Sections⭐️

⌨️ 0:00 - Intro to deep learning with client-side neural networks

⌨️ 6:06 - Convert Keras model to Layers API format

⌨️ 11:16 - Serve deep learning models with Node.js and Express

⌨️ 19:22 - Building UI for neural network web app

⌨️ 27:08 - Loading model into a neural network web app

⌨️ 36:55 - Explore tensor operations with VGG16 preprocessing

⌨️ 45:16 - Examining tensors with the debugger

⌨️ 1:00:37 - Broadcasting with tensors

⌨️ 1:11:30 - Running MobileNet in the browser

Deep Learning Using TensorFlow

This TensorFlow tutorial is for professionals and enthusiasts who are interested in applying deep learning algorithms using TensorFlow to solve various problems.

TensorFlow is an open source deep learning library that is based on the concept of data flow graphs for building models. It allows you to create large-scale neural networks with many layers. Learning the use of this library is also a fundamental part of the AI & Deep Learning course curriculum. Following are the topics that will be discussed in this TensorFlow tutorial:

  • What is TensorFlow
  • TensorFlow Code Basics
  • TensorFlow UseCase

What are Tensors?

In this TensorFlow tutorial, before talking about TensorFlow, let us first understand what tensors are. **Tensors** are the de facto standard for representing data in deep learning.

As shown in the image above, tensors are just multidimensional arrays that allow you to represent data having higher dimensions. In general, in deep learning you deal with high-dimensional data sets, where dimensions refer to the different features present in the data set. In fact, the name “TensorFlow” has been derived from the operations which neural networks perform on tensors. It’s literally a flow of tensors. Now that you have understood what tensors are, let us move ahead in this **TensorFlow** tutorial and understand what TensorFlow is.

What is TensorFlow?

**TensorFlow** is a library based on Python that provides different types of functionality for implementing deep learning models. As discussed earlier, the term TensorFlow is made up of two terms – Tensor & Flow:

In TensorFlow, the term tensor refers to the representation of data as multi-dimensional array whereas the term flow refers to the series of operations that one performs on tensors as shown in the above image.

Now we have covered enough background about TensorFlow.

Next up, in this TensorFlow tutorial we will be discussing TensorFlow code basics.

TensorFlow Tutorial: Code Basics

Basically, the overall process of writing a TensorFlow program involves two steps:

  1. Building a Computational Graph
  2. Running a Computational Graph

Let me explain the above two steps one by one:

1. Building a Computational Graph

So, what is a computational graph? Well, a computational graph is a series of TensorFlow operations arranged as nodes in the graph. Each node takes 0 or more tensors as input and produces a tensor as output. Let me give you an example of a simple computational graph which consists of three nodes – a, b & c as shown below:

Explanation of the Above Computational Graph:

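  • Constant nodes store constant values: they take zero inputs and output the value they hold. In this example, a and b are constant nodes with values 5 and 6, respectively.
  • Node c represents the operation of multiplying constant nodes a and b, so executing node c yields the product of a and b.
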
Basically, one can think of a computational graph as an alternative way of conceptualizing mathematical calculations that takes place in a TensorFlow program. The operations assigned to different nodes of a Computational Graph can be performed in parallel, thus, providing a better performance in terms of computations.

At this stage we only describe the computation: the graph does not compute anything and does not hold any values; it just defines the operations specified in your code.

2. Running a Computational Graph

Let us take the previous example of computational graph and understand how to execute it. Following is the code from previous example:

Example 1:

import tensorflow as tf
# Build a graph
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b

Now, in order to get the output of node c, we need to run the computational graph within a session. Session places the graph operations onto Devices, such as CPUs or GPUs, and provides methods to execute them.

A session encapsulates the control and state of the TensorFlow runtime, i.e. it stores the information about the order in which all the operations will be performed and passes the result of an already computed operation to the next operation in the pipeline. Let me show you how to run the above computational graph within a session (an explanation of each line of code has been added as a comment):

# Create the session object (TensorFlow 1.x API)
sess = tf.Session()
# Run the graph within a session and store the output to a variable
output_c = sess.run(c)
# Print the output of node c
print(output_c)
# Close the session to free up some resources
sess.close()

So, this was all about session and running a computational graph within it. Now, let us talk about variables and placeholders that we will be using extensively while building deep learning model using TensorFlow.

Constants, Placeholder and Variables

In TensorFlow, constants, placeholders and variables are used to represent different parameters of a deep learning model. Since, I have already discussed constants earlier, I will start with placeholders.


A TensorFlow constant allows you to store a value but, what if, you want your nodes to take inputs on the run? For this kind of functionality, placeholders are used which allows your graph to take external inputs as parameters. Basically, a placeholder is a promise to provide a value later or during runtime. Let me give you an example to make things simpler:

import tensorflow as tf
# Creating placeholders
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
# Assigning multiplication operation w.r.t. a & b to node mul
mul = a * b
# Create session object
sess = tf.Session()
# Executing mul by passing the values [1, 3] [2, 4] for a and b respectively
output = sess.run(mul, {a: [1, 3], b: [2, 4]})
print('Multiplying a b:', output)
[2. 12.]

Points to Remember about placeholders:

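  • Placeholders are not initialized and contain no data.
  • You must feed values to a placeholder at runtime; executing an operation that depends on an unfed placeholder raises an error.
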
Now, let us move ahead and understand – what are variables?


In deep learning, placeholders are used to take arbitrary inputs in your model or graph. Apart from taking input, you also need to modify the graph such that it can produce new outputs w.r.t. the same inputs. For this you will be using variables. In a nutshell, a variable allows you to add trainable parameters or nodes to the graph, i.e. values that can be modified over time. Variables are defined by providing their initial value and type as shown below:

var = tf.Variable( [0.4], dtype = tf.float32 )

**Note:** Constants are initialized when you call tf.constant, and their value can never change. On the contrary, variables are not initialized when you call tf.Variable. To initialize all the variables in a TensorFlow program, you must explicitly call a special operation as shown below:

init = tf.global_variables_initializer()

Always remember that a variable must be initialized before a graph is used for the first time.

Note: TensorFlow variables are in-memory buffers that contain tensors, but unlike normal tensors that are only instantiated when a graph is run and are immediately deleted afterwards, variables survive across multiple executions of a graph.

Now that we have covered enough basics of TensorFlow, let us go ahead and understand how to implement a linear regression model using TensorFlow.

Linear Regression Model Using TensorFlow

A linear regression model is used for predicting the unknown value of a variable (the dependent variable) from the known value of another variable (the independent variable) using the linear regression equation y = W * x + b, where W is the slope and b is the bias.

Therefore, for creating a linear model, you need:

  1. A dependent or output variable (y)
  2. A slope variable (W)
  3. A bias or y-intercept (b)
  4. An independent or input variable (x)

So, let us begin building linear model using TensorFlow:

import tensorflow as tf  # TensorFlow 1.x API
# Creating variable for parameter slope (W) with initial value as 0.4
W = tf.Variable([0.4], dtype=tf.float32)
# Creating variable for parameter bias (b) with initial value as -0.4
b = tf.Variable([-0.4], dtype=tf.float32)
# Creating placeholders for providing input or independent variable, denoted by x
x = tf.placeholder(tf.float32)
# Equation of Linear Regression
linear_model = W * x + b
# Initializing all the variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
# Running regression model to calculate the output w.r.t. to provided x values
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))


[ 0.     0.40000001 0.80000007 1.20000005]

The code above only shows the basic idea behind implementing a regression model, i.e. how you follow the equation of the regression line to get the output for a set of input values. But two more things need to be added to make it a complete regression model:

  • First, a mechanism to validate the model's output against the desired or target output (a loss function).
  • Second, a way to train the model by adjusting its variables based on that loss (the tf.train API).

Now let us see how to incorporate these into the code for the regression model.

Loss Function – Model Validation

A loss function measures how far apart the current output of the model is from the desired or target output. I’ll use one of the most commonly used loss functions for my linear regression model, the Sum of Squared Errors (SSE). SSE is calculated w.r.t. the model output (represented by linear_model) and the desired or target output (y) as:

y = tf.placeholder(tf.float32)
error = linear_model - y
squared_errors = tf.square(error)
loss = tf.reduce_sum(squared_errors)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]}))


As you can see, we are getting a high loss value. Therefore, we need to adjust our weights (W) and bias (b) so as to reduce the error that we are receiving.
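The size of that loss can be checked by hand. The sketch below repeats the SSE arithmetic in NumPy (used here instead of TensorFlow so it runs without a session), with the initial W = 0.4 and b = -0.4 and the same inputs and targets as above.

```python
import numpy as np

W, b = 0.4, -0.4                       # initial slope and bias from the model
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])     # target outputs

predictions = W * x + b                # [0.0, 0.4, 0.8, 1.2]
errors = predictions - y               # [-2.0, -3.6, -5.2, -6.8]
loss = np.sum(errors ** 2)             # 4 + 12.96 + 27.04 + 46.24
print(round(float(loss), 2))           # 90.24
```

Most of the loss comes from the larger inputs, where the too-small slope leaves the prediction furthest from the target.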

tf.train API – Training the Model

TensorFlow provides optimizers that slowly change each variable in order to minimize the loss function or error. The simplest optimizer is gradient descent. It modifies each variable according to the magnitude of the derivative of loss with respect to that variable.

# Creating an instance of gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]})
print(sess.run([W, b]))

 [array([ 1.99999964], dtype=float32), array([ 9.86305167e-07], dtype=float32)]

So, this is how you create a linear model using TensorFlow and train it to get the desired output.
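To see what the optimizer is doing on each step, the same training can be sketched by hand in NumPy. This is an illustration of plain gradient descent on the SSE loss with the same learning rate (0.01) and number of steps (1000), not the internals of tf.train.GradientDescentOptimizer; the gradient formulas are derived directly from loss = sum((W * x + b - y)^2).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
W, b = 0.4, -0.4
learning_rate = 0.01

for _ in range(1000):
    error = W * x + b - y
    grad_W = 2.0 * np.sum(error * x)   # d(loss)/dW
    grad_b = 2.0 * np.sum(error)       # d(loss)/db
    W -= learning_rate * grad_W        # step against the gradient
    b -= learning_rate * grad_b

print(W, b)   # approaches W = 2, b = 0
```

The targets were generated by y = 2x, so a slope near 2 and a bias near 0 is the expected outcome.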

Deep Learning from Scratch and Using Tensorflow in Python

In this article, we will learn how deep learning works and get familiar with its terminology — such as backpropagation and batch size

Originally published by Milad Toutounchian.

Deep learning is one of the most popular models currently being used in real-world data science applications. It’s been an effective model in areas that range from image to text to voice/music. With the increase in its use, the ability to implement deep learning quickly and scalably becomes paramount. The rise of deep learning platforms such as Tensorflow helps developers implement what they need in easier ways.

In this article, we will learn how deep learning works and get familiar with its terminology — such as backpropagation and batch size. We will implement a simple deep learning model — from theory to scratch implementation — for a predefined input and output in Python, and then do the same using deep learning platforms such as Keras and Tensorflow. We have written this simple deep learning model using Keras and Tensorflow version 1.x and version 2.0 with three different levels of complexity and ease of coding.

Deep Learning Implementation from Scratch

Consider a simple multi-layer-perceptron with four input neurons, one hidden layer with three neurons and an output layer with one neuron. We have three data-samples for the input denoted as X, and three data-samples for the desired output denoted as yt. So, each input data-sample has four features.

# Inputs and outputs of the neural net:
import numpy as np

X=np.array([[1.0, 0.0, 1.0, 0.0],[1.0, 0.0, 1.0, 1.0],[0.0, 1.0, 0.0, 1.0]])

The x(m) in this figure is one sample of X, h(m) is the output of the hidden layer for input x(m), and Wi and Wh are the weights.

The goal of a neural net (NN) is to obtain weights and biases such that for a given input, the NN provides the desired output. But, we do not know the appropriate weights and biases in advance, so we update the weights and biases such that the error between the output of NN, yp(m), and desired ones, yt(m), is minimized. This iterative minimization process is called the NN training.

Assume the activation functions for both hidden and output layers are sigmoid functions. Therefore,

The size of weights, biases and the relationships between input and outputs of the neural net

Where activation function is the sigmoid, m is the mth data-sample and yp(m) is the NN output.
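As a sketch of these relationships, the forward pass for one sample can be written in NumPy as follows. The weight shapes follow from the 4-3-1 architecture described above; the bias names `bh` and `bo` and the random initial values are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

X = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]])   # 3 samples, 4 features each

# Assumed shapes: 4 inputs -> 3 hidden neurons -> 1 output neuron.
Wi = rng.normal(size=(3, 4))   # input-to-hidden weights
bh = np.zeros((3, 1))          # hidden-layer biases
Wh = rng.normal(size=(1, 3))   # hidden-to-output weights
bo = np.zeros((1, 1))          # output bias

x_m = X[0].reshape(4, 1)       # one sample x(m) as a column vector
h_m = sigmoid(Wi @ x_m + bh)   # hidden-layer output h(m), shape (3, 1)
yp_m = sigmoid(Wh @ h_m + bo)  # network output yp(m), shape (1, 1)
print(h_m.shape, yp_m.shape)
```

Because the output activation is a sigmoid, yp(m) always lies strictly between 0 and 1.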

The error function, which measures the difference between the output of NN with the desired one, can be expressed mathematically as:

The Error defined for the neural net which is squared error
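One way to write this squared error, assuming it matches the `tf.reduce_sum(tf.pow(ypred - Y, 2))` expression used in the TensorFlow code later in this article (the original figure may include an additional constant factor such as 1/2), is:

```latex
E = \sum_{m=1}^{N} \left( y_p^{(m)} - y_t^{(m)} \right)^2
```

Here N is the number of data samples, and y_p(m) and y_t(m) are the predicted and target outputs for the m-th sample.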

The pseudocode for the above NN has been summarized below:

pseudocode for the neural net training

From our pseudocode, we realize that the partial derivative of Error (E) with respect to parameters (weights and biases) should be computed. Using the chain rule from calculus we can write:
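As a sketch of what the chain rule gives for this network, assume a single sample with squared error $E = (y_p - y_t)^2$, sigmoid activations, and the forward pass $h = \sigma(W_i x + b_h)$, $y_p = \sigma(W_h h + b_o)$. The bias symbols $b_h$, $b_o$ and the intermediate terms $\delta_o$, $\delta_h$ are notation introduced here, not taken from the original figures. The derivation uses $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$.

```latex
\begin{aligned}
\delta_o &= \frac{\partial E}{\partial y_p}\,
            \frac{\partial y_p}{\partial (W_h h + b_o)}
          = 2\,(y_p - y_t)\; y_p\,(1 - y_p) \\
\frac{\partial E}{\partial W_h} &= \delta_o\, h^{\top},
  \qquad \frac{\partial E}{\partial b_o} = \delta_o \\
\delta_h &= \left( W_h^{\top} \delta_o \right) \odot h \odot (1 - h) \\
\frac{\partial E}{\partial W_i} &= \delta_h\, x^{\top},
  \qquad \frac{\partial E}{\partial b_h} = \delta_h
\end{aligned}
```

Each line reuses the result of the previous one, which is why the computation is carried out from the output layer backward.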

We have two options here for updating the weights and biases in backward path (backward path means updating weights and biases such that error is minimized):

  1. Use all *N * samples of the training data
  2. Use one sample (or a couple of samples)

For the first one, we say the batch size is N. For the second one, we say the batch size is 1 if we use one sample to update the parameters. So batch size means how many data samples are used for each update of the weights and biases.
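The effect of the batch size on the update loop can be sketched in NumPy as below. This is a minimal illustration, not the implementation referred to in this article: the target values `yt`, the bias names, the learning rate, the epoch count and the random initialization are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, yt, batch_size, epochs=2000, lr=0.1, seed=0):
    """Gradient descent on a 4-3-1 sigmoid network; returns the final error."""
    rng = np.random.default_rng(seed)
    Wi, bh = rng.normal(size=(3, 4)), np.zeros((3, 1))
    Wh, bo = rng.normal(size=(1, 3)), np.zeros((1, 1))
    n = X.shape[0]
    for _ in range(epochs):
        for start in range(0, n, batch_size):
            xb = X[start:start + batch_size].T     # shape (4, batch)
            yb = yt[start:start + batch_size].T    # shape (1, batch)
            h = sigmoid(Wi @ xb + bh)              # forward pass
            yp = sigmoid(Wh @ h + bo)
            delta_o = 2.0 * (yp - yb) * yp * (1.0 - yp)   # backward pass
            delta_h = (Wh.T @ delta_o) * h * (1.0 - h)
            Wh -= lr * (delta_o @ h.T)             # one update per batch
            bo -= lr * delta_o.sum(axis=1, keepdims=True)
            Wi -= lr * (delta_h @ xb.T)
            bh -= lr * delta_h.sum(axis=1, keepdims=True)
    yp = sigmoid(Wh @ sigmoid(Wi @ X.T + bh) + bo)
    return float(np.sum((yp - yt.T) ** 2))

X = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]])
yt = np.array([[1.0], [1.0], [0.0]])   # hypothetical targets for illustration

print("batch size N:", train(X, yt, batch_size=3))
print("batch size 1:", train(X, yt, batch_size=1))
```

With batch size N the parameters change once per pass over the data; with batch size 1 they change after every sample, so the same number of epochs performs N times as many updates.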

You can find the implementation of the above neural net, in which the gradient of the error with respect to parameters is calculated Symbolically, with different batch sizes here.

As you can see with the above example, creating a simple deep learning model from scratch involves methods that are very complex. In the next section, we will see how deep learning frameworks can assist in introducing scalability and greater ease of implementation to our model.

Deep Learning implementation using Keras, Tensorflow 1.x and 2.0

In the previous section, we computed the gradient of the error w.r.t. the parameters using the chain rule. We saw first-hand that it is not an easy or scalable approach. Also, keep in mind that we evaluate the partial derivatives at each iteration, so what we actually need is the value of the gradient, not its symbolic expression. This is where deep-learning frameworks such as Keras and Tensorflow come in. They use an AutoDiff method to compute the numerical values of the partial derivatives. If you’re not familiar with AutoDiff, StackExchange has a great example to walk through.

The AutoDiff decomposes the complex expression into a set of primitive ones, i.e. expressions consisting of at most a single function call. As the differentiation rules for each separate expression are already known, the final results can be computed in an efficient way.
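The decomposition into primitives can be sketched with forward-mode automatic differentiation using dual numbers. This is one simple variant chosen for illustration (deep learning frameworks generally rely on reverse-mode differentiation for training); the `Dual` class and the example function `f` are hypothetical.

```python
import math

class Dual:
    """A value paired with its derivative with respect to the input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        # Sum rule for one primitive addition.
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # Product rule for one primitive multiplication.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def sin(d):
    # Differentiation rule for one primitive function call.
    return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

def f(x):
    # f(x) = x * sin(x) + x, built only from primitives with known rules.
    return x * sin(x) + x

x = Dual(2.0, 1.0)    # seed derivative dx/dx = 1
result = f(x)
print(result.value, result.deriv)
```

No symbolic expression for f'(x) is ever formed; only numeric values are propagated, one primitive at a time.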

We have implemented the NN model with three different levels in Keras, Tensorflow 1.x and Tensorflow 2.0:

1- High-Level (Keras and Tensorflow 2.0): High-Level Tensorflow 2.0 with Batch Size 1

2- Medium-Level (Tensorflow 1.x and 2.0): Medium-Level Tensorflow 1.x with Batch Size 1, Medium-Level Tensorflow 1.x with Batch Size N, Medium-Level Tensorflow 2.0 with Batch Size 1, Medium-Level Tensorflow 2.0 with Batch Size N

3- Low-Level (Tensorflow 1.x): Low-Level Tensorflow 1.x with Batch Size N

Code Snippets:

For the High-Level, we have accomplished the implementation using Keras and Tensorflow v 2.0 with model.train_on_batch:

# High-Level implementation of the neural net in Tensorflow:
model.compile(loss=mse, optimizer=optimizer)
for _ in range(2000):
    for step, (x, y) in enumerate(zip(X_data, y_data)):
        model.train_on_batch(np.array([x]), np.array([y]))

In the Medium-Level using Tensorflow 1.x, we have defined:

E = tf.reduce_sum(tf.pow(ypred - Y, 2))
optimizer = tf.train.GradientDescentOptimizer(0.1)
grads = optimizer.compute_gradients(E, [W_h, b_h, W_o, b_o])
updates = optimizer.apply_gradients(grads)

For the Medium-Level, the gradients and their update operation are defined once, outside the for loop; inside the loop, the updates operation is run repeatedly to adjust the parameters. In the Medium-Level using Tensorflow v 2.x, we have used:

# Medium-Level implementation of the neural net in Tensorflow

# In for_loop
with tf.GradientTape() as tape:
   x = tf.convert_to_tensor(np.array([x]), dtype=tf.float64)
   y = tf.convert_to_tensor(np.array([y]), dtype=tf.float64)
   ypred = model(x)
   loss = mse(y, ypred)
gradients = tape.gradient(loss, model.trainable_weights)
optimizer.apply_gradients(zip(gradients, model.trainable_weights))

In Low-Level implementation, each weight and bias is updated separately. In the Low-Level using Tensorflow v 1.x, we have defined:

# Low-Level implementation of the neural net in Tensorflow:
E = tf.reduce_sum(tf.pow(ypred - Y, 2))
dE_dW_h = tf.gradients(E, [W_h])[0]
dE_db_h = tf.gradients(E, [b_h])[0]
dE_dW_o = tf.gradients(E, [W_o])[0]
dE_db_o = tf.gradients(E, [b_o])[0]
# Inside the for loop, each gradient is evaluated with the current parameter
# values and the corresponding parameter is updated with a step size of 0.1:
evaluated_dE_dW_h = sess.run(dE_dW_h,
                             feed_dict={W_h: W_h_i, b_h: b_h_i, W_o: W_o_i, b_o: b_o_i, X: X_data.T, Y: y_data.T})
W_h_i = W_h_i - 0.1 * evaluated_dE_dW_h
evaluated_dE_db_h = sess.run(dE_db_h,
                             feed_dict={W_h: W_h_i, b_h: b_h_i, W_o: W_o_i, b_o: b_o_i, X: X_data.T, Y: y_data.T})
b_h_i = b_h_i - 0.1 * evaluated_dE_db_h
evaluated_dE_dW_o = sess.run(dE_dW_o,
                             feed_dict={W_h: W_h_i, b_h: b_h_i, W_o: W_o_i, b_o: b_o_i, X: X_data.T, Y: y_data.T})
W_o_i = W_o_i - 0.1 * evaluated_dE_dW_o
evaluated_dE_db_o = sess.run(dE_db_o,
                             feed_dict={W_h: W_h_i, b_h: b_h_i, W_o: W_o_i, b_o: b_o_i, X: X_data.T, Y: y_data.T})
b_o_i = b_o_i - 0.1 * evaluated_dE_db_o

As you can see with the above low level implementation, the developer has more control over every single step of numerical operations and calculations.


We have now shown that implementing from scratch even a simple deep learning model by using Symbolic gradient computation for weight and bias updates is not an easy or scalable approach. Using deep learning frameworks accelerates this process as a result of using AutoDiff, which is basically a stable numerical gradient computation for updating weights and biases.
