Deep Learning with PyTorch: Guide for Beginners

In 2019, the war for ML frameworks has two main contenders: PyTorch and TensorFlow. There is a growing adoption of PyTorch by researchers and students due to its ease of use, while in industry TensorFlow is currently still the platform of choice.

Some of the key advantages of PyTorch are:

  • Simplicity: It is very pythonic and integrates easily with the rest of the Python ecosystem. It is easy to learn, use, extend, and debug.
  • Great API: PyTorch shines in terms of usability thanks to well-designed object-oriented classes that encapsulate all of the important data choices along with the choice of model architecture. The PyTorch documentation is also excellent and helpful for beginners.
  • Dynamic Graphs: PyTorch builds dynamic computational graphs, which means the network can change behavior as it is being run, with little or no overhead. This is extremely helpful for debugging and for constructing sophisticated models with minimal effort, while still allowing PyTorch expressions to be automatically differentiated.


PyTorch's popularity in research is growing. The plot below shows the monthly number of mentions of the word “PyTorch” as a percentage of all mentions among deep learning frameworks. We can see a steep upward trend for PyTorch on arXiv in 2019, reaching almost 50%.

arXiv papers mentioning PyTorch are growing

Dynamic graph generation, tight Python language integration, and a relatively simple API makes PyTorch an excellent platform for research and experimentation.


PyTorch provides a very clean interface for choosing the right combination of tools to install. Below is a snapshot of the options and the corresponding install command. Stable represents the most recently tested and supported version of PyTorch, which should suit most users. Preview is available if you want the latest, not fully tested and supported, build. You can choose between Anaconda (recommended) and pip installation packages, with support for various CUDA versions as well.

PyTorch Modules

Now we will discuss the key PyTorch library modules, such as Tensors, Autograd, Optimizers, and Neural Networks (NN), which are essential for creating and training neural networks.


Tensors are the workhorse of PyTorch. We can think of tensors as multi-dimensional arrays, and PyTorch provides an extensive library of operations on them through the torch module. PyTorch tensors are very close to the very popular NumPy arrays; in fact, PyTorch features seamless interoperability with NumPy. Compared with NumPy arrays, PyTorch tensors have the added advantage that both the tensors and the related operations can run on the CPU or GPU. The second important thing PyTorch provides is that tensors can keep track of the operations performed on them, which helps to compute gradients or derivatives of an output with respect to any of its inputs.

Basic Tensor Operations

Tensor refers to the generalization of vectors and matrices to an arbitrary number of dimensions. The dimensionality of a tensor coincides with the number of indexes used to refer to scalar values within the tensor. A tensor of order zero (0D tensor) is just a number, or a scalar. A tensor of order one (1D tensor) is an array of numbers, or a vector. Similarly, a 2nd-order tensor (2D) is an array of vectors, or a matrix.

Now let us create a tensor in PyTorch.

After importing the torch module, we call the function torch.ones, which creates a 2D tensor holding nine values, all filled with 1.0.
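The snippet referenced above was shown as an image in the original; a minimal reconstruction (the 3×3 shape is an assumption, chosen so the 2D tensor holds nine values):

```python
import torch

# create a 2D tensor holding nine values, all 1.0
t = torch.ones(3, 3)
print(t)
print(t.shape)  # torch.Size([3, 3])
```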

Other ways include **_torch.zeros_** for a zero-filled tensor and **_torch.randn_** for a tensor filled with values drawn from a standard normal distribution (use **_torch.rand_** for a uniform distribution).

Type and Size

Each tensor has an associated type and size. The default tensor type when you use the **torch.Tensor** constructor is **torch.FloatTensor**. However, you can convert a tensor to a different type (**float**, **long**, **double**, etc.) by specifying it at initialization or later using one of the typecasting methods. There are two ways to specify the initialization type: either by directly calling the constructor of a specific tensor type, such as **FloatTensor** or **LongTensor**, or by using a special method, **torch.tensor()**, and providing the **dtype**.
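For example, the two initialization styles and a later typecast look like this (the values are arbitrary):

```python
import torch

a = torch.FloatTensor([1, 2, 3])               # typed constructor
b = torch.tensor([1, 2, 3], dtype=torch.long)  # dtype argument
c = a.double()                                 # typecast after creation
print(a.dtype, b.dtype, c.dtype)  # torch.float32 torch.int64 torch.float64
```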

Some useful tensor operations:

To find the maximum item in a tensor, as well as the index that contains the maximum value, use the **_max()_** and **_argmax()_** functions. We can also use **item()** to extract a standard Python value from a tensor containing a single value.
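A quick sketch of these three calls on a toy tensor:

```python
import torch

t = torch.tensor([2.0, 7.0, 5.0])
print(t.max())         # tensor(7.) — the largest value
print(t.argmax())      # tensor(1) — index of the largest value
print(t.max().item())  # 7.0 — extracted as a plain Python float
```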

Most functions that operate on a tensor and return a tensor create a new tensor to store the result. If you need an in-place version, look for a function name with an appended underscore (`_`), e.g. **_Tensor.transpose_()_** performs an in-place transpose of a tensor.

Converting between tensors and NumPy arrays is very simple using **_torch.from_numpy()_** and **_Tensor.numpy()_**.
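For instance, round-tripping a small array (note that both directions share the same underlying memory, so mutating one side changes the other):

```python
import numpy as np
import torch

arr = np.ones((2, 3))
t = torch.from_numpy(arr)  # NumPy array -> PyTorch tensor
back = t.numpy()           # PyTorch tensor -> NumPy array
print(t.dtype)             # torch.float64, inherited from NumPy's default
```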

Another common and very useful operation is reshaping a tensor. We can do this with either **_view()_** or **_reshape()_**:

**_Tensor.reshape()_** and **_Tensor.view()_**, though, are not the same:

  • **_Tensor.view()_** works only on contiguous tensors and will never copy memory. It will raise an error on a non-contiguous tensor. But you can make the tensor contiguous by calling **_contiguous()_** and then you can call **_view()_**.
  • **_Tensor.reshape()_** will work on any tensor and can make a clone if it is needed.
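The difference shows up with a non-contiguous tensor, e.g. after a transpose (a small sketch):

```python
import torch

t = torch.arange(12)
m = t.view(3, 4)        # fine: t is contiguous, no memory is copied
nc = m.t()              # transpose produces a non-contiguous view
# nc.view(12) would raise a RuntimeError here
flat = nc.contiguous().view(12)  # make it contiguous, then view()
r = nc.reshape(12)               # reshape() copies when it has to
```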

Tensor Broadcasting

PyTorch supports broadcasting similar to NumPy. Broadcasting allows you to perform element-wise operations between two tensors of different but compatible shapes; the smaller tensor is automatically expanded to match the larger one. Refer to the broadcasting semantics documentation for details.
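For example, adding a (3, 1) tensor and a (3,) tensor broadcasts both to shape (3, 3):

```python
import torch

a = torch.ones(3, 1)
b = torch.tensor([10.0, 20.0, 30.0])  # shape (3,)
c = a + b                             # broadcast to shape (3, 3)
print(c)
```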

Tensor in a nutshell: Where, How & What

Three attributes which uniquely define a tensor are:

device: Where the tensor’s physical memory is actually stored, e.g., on a CPU or a GPU. The **torch.device** contains a device type ('**cpu**' or '**cuda**') and an optional device ordinal for the device type.

layout: How we logically interpret this physical memory. The most common layout is a strided tensor. Strides are a list of integers: the k-th stride represents the jump in the memory necessary to go from one element to the next one in the k-th dimension of the Tensor.

dtype: What is actually stored in each element of the tensor? This could be floats, integers, etc. PyTorch has nine different data types.
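We can inspect all three attributes (plus the strides) on any tensor:

```python
import torch

t = torch.zeros(2, 3)
print(t.device)    # cpu — where the memory lives
print(t.layout)    # torch.strided — how the memory is interpreted
print(t.dtype)     # torch.float32 — what each element stores
print(t.stride())  # (3, 1) — jump per dimension to reach the next element
```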


Autograd is PyTorch's automatic differentiation system. What does automatic differentiation do? Given a network, it calculates the gradients automatically. While computing the forward pass, autograd simultaneously performs the requested computations and builds up a graph representing the function that computes the gradients.

How is this achieved?

PyTorch tensors can remember where they come from, in terms of the operations and parent tensors that originated them, and they can automatically provide the chain of derivatives of such operations with respect to their inputs. This is achieved through the **requires_grad** attribute, if set to True.

**t = torch.tensor([1.0, 0.0], requires_grad=True)**

After calculating the gradient, the value of the derivative is automatically populated as the **grad** attribute of the tensor. For any composition of functions with any number of tensors with **requires_grad=True**, PyTorch computes derivatives throughout the chain of functions and accumulates their values in the **grad** attribute of those tensors.
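A minimal end-to-end sketch, differentiating y = sum(x**2) so the expected gradient is 2*x:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = (x ** 2).sum()  # forward pass; autograd records the graph
y.backward()        # compute dy/dx through the recorded graph
print(x.grad)       # tensor([2., 4.]) — the derivative 2*x
```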


Optimizers are used to update the weights and biases, i.e. the internal parameters of a model, in order to reduce the error. Please refer to another of my articles for more details.

PyTorch has a **torch.optim** package with various optimization algorithms such as SGD (Stochastic Gradient Descent), Adam, RMSprop, etc.

Let us see how we can create one of the provided optimizers, such as SGD or Adam.

import torch.optim as optim
params = torch.tensor([1.0, 0.0], requires_grad=True)
learning_rate = 1e-3
## SGD
optimizer = optim.SGD([params], lr=learning_rate)
## Adam
optimizer = optim.Adam([params], lr=learning_rate)

Without using optimizers, we would need to manually update the model parameters by something like:

with torch.no_grad():
    for params in model.parameters():
        params -= params.grad * learning_rate

We can use the **step()** method of our optimizer to update the parameters, instead of manually updating each one.


The value of params is updated when step() is called. The optimizer looks into **params.grad** and updates **params** by subtracting **learning_rate** times **grad** from it, exactly as in the manual update above.
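A tiny worked example of step() (the loss here, sum(params**2), is arbitrary and only chosen to produce a simple gradient):

```python
import torch
import torch.optim as optim

params = torch.tensor([1.0, 0.0], requires_grad=True)
optimizer = optim.SGD([params], lr=0.1)

loss = (params ** 2).sum()
loss.backward()   # params.grad is now 2 * params = [2.0, 0.0]
optimizer.step()  # params <- params - lr * grad = [0.8, 0.0]
print(params)
```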

The **torch.optim** module lets us abstract away the specific optimization scheme by just passing it a list of params. Since there are multiple optimization schemes to choose from, we only need to pick one for our problem and the underlying PyTorch library does the rest of the magic for us.


In PyTorch the **torch.nn** package defines a set of modules which are similar to the layers of a neural network. A module receives input tensors and computes output tensors. The **torch.nn** package also defines a set of useful loss functions that are commonly used when training neural networks.

Steps of building a neural network are:

  • Neural Network Construction: Create the neural network layers and set up the parameters (weights, biases).
  • Forward Propagation: Calculate the predicted output and measure the error.
  • Back-propagation: After finding the error, we backward-propagate the error gradient to update our weight parameters. We do this by taking the derivative of the error function with respect to the parameters of our NN.
  • Iterative Optimization: We want to minimize the error as much as possible, so we keep updating the parameters iteratively by gradient descent.

Build a Neural Network

Let us follow the above steps and create a simple neural network in PyTorch.

Step 1: Neural Network Construction

We call our NN **Net** here. We inherit from **nn.Module**; combined with super().__init__(), this creates a class that tracks the architecture and provides a lot of useful methods and attributes.

Our neural network **Net** has one hidden layer **self.hl** and one output layer **self.ol**.

self.hl = nn.Linear(1, 10)

This line creates a module for a linear transformation with 1 input and 10 outputs. It also automatically creates the weight and bias tensors. You can access them once the network **net** is created, via **net.hl.weight** and **net.hl.bias**.

We define the activation using **self.relu = nn.ReLU()**.
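The original class definition was shown as an image; here is a sketch consistent with the text (the single output unit in self.ol is an assumption, and the forward method is added in the next step):

```python
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.hl = nn.Linear(1, 10)  # hidden layer: 1 input, 10 outputs
        self.relu = nn.ReLU()       # activation
        self.ol = nn.Linear(10, 1)  # output layer: 10 inputs, 1 output

net = Net()
print(net.hl.weight.shape)  # torch.Size([10, 1])
print(net.hl.bias.shape)    # torch.Size([10])
```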

Step 2: Forward propagation

PyTorch networks created with **nn.Module** must have a **forward()** method defined. It takes in a tensor **x** and passes it through the operations you defined in the **__init__** method.

    def forward(self, x):
        hidden = self.hl(x)
        activation = self.relu(hidden)
        output = self.ol(activation)
        return output

We can see that the input tensor goes through the hidden layer, then activation function (relu), then the output layer.

Step 3: Back-propagation

Here we have to calculate error or loss and backward propagate our error gradient to update our weight parameters.

A loss function takes an (output, target) pair and computes a value that estimates how far the **output** is from the **target**. There are several different loss functions under the **torch.nn** package. A simple one is **nn.MSELoss**, which computes the mean squared error between the output and the target.

output = net(input)
loss_fn = nn.MSELoss()
loss = loss_fn(output, target)

A simple function call, **loss.backward()**, propagates the error. Don’t forget to clear the existing gradients first, though, or the new gradients will be accumulated onto the existing ones. After calling **loss.backward()**, have a look at the hidden layer bias gradients before and after the backward call.

So after calling backward(), we see that the gradients have been calculated for the hidden layer.
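A self-contained sketch of that before-and-after check (using a hypothetical stand-in network and random data, since the original net and inputs were shown as images):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
hl = nn.Linear(1, 10)  # stands in for net.hl
ol = nn.Linear(10, 1)  # stands in for net.ol

x = torch.randn(4, 1)
target = torch.randn(4, 1)

output = ol(torch.relu(hl(x)))
loss = nn.MSELoss()(output, target)

print(hl.bias.grad)        # None — no backward call yet
loss.backward()
print(hl.bias.grad.shape)  # torch.Size([10]) — gradients now populated
```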

Step 4: Iterative Optimization

We have already seen how optimizer helps us to update the parameters of the model.

# create your optimizer
optimizer = optim.Adam(net.parameters(), lr=1e-2)
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)     # calculate output
loss = loss_fn(output, target) #calculate loss
loss.backward()      # calculate gradient
optimizer.step()     # update parameters

Now with our basic steps (1, 2, 3) complete, we just need to train our neural network iteratively to find the minimum loss. So we run the **training_loop** for many epochs until the loss is minimized.

Let us run our neural network to train on input **x_t** and target **y_t**.

We call **training_loop** for 1500 epochs and pass all the other arguments: **optimizer**, **model**, **loss_fn**, **inputs**, and **target**. Every 300 epochs we print the loss, and we can see it decreasing with every iteration. It looks like our very basic neural network is learning.
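The original **training_loop** was shown as an image; here is a sketch of what it likely contained, with hypothetical **x_t**/**y_t** data (y = 2x + 1 plus a little noise) standing in for the original:

```python
import torch
import torch.nn as nn
import torch.optim as optim

def training_loop(n_epochs, optimizer, model, loss_fn, inputs, target):
    for epoch in range(1, n_epochs + 1):
        optimizer.zero_grad()           # clear gradients from the last step
        output = model(inputs)          # forward pass
        loss = loss_fn(output, target)  # measure the error
        loss.backward()                 # back-propagate the gradients
        optimizer.step()                # update the parameters
        if epoch % 300 == 0:
            print(f"Epoch {epoch}, Loss {loss.item():.4f}")
    return loss.item()

torch.manual_seed(0)
x_t = torch.linspace(-1, 1, 50).unsqueeze(1)
y_t = 2 * x_t + 1 + 0.05 * torch.randn(50, 1)

model = nn.Sequential(nn.Linear(1, 10), nn.ReLU(), nn.Linear(10, 1))
optimizer = optim.Adam(model.parameters(), lr=1e-2)
final_loss = training_loop(1500, optimizer, model, nn.MSELoss(), x_t, y_t)
```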

When we plot the model output (black crosses) against the target data (red circles), the model seems to learn quickly.

So far we have discussed the basic or essential elements of PyTorch to get you started. Creating machine-learning-based solutions for real problems involves significant effort in data preparation. However, the PyTorch library provides many tools to make data loading easier and more readable, like **torchvision**, **torchtext** and **torchaudio** to work with image, text and audio data respectively.

Before I finish the article, I also want to mention a very important tool called TensorBoard. Training machine learning models is often very hard, and a tool that can help visualize our model and track the training progress is always needed when we encounter problems.

TensorBoard helps to log events from our model training, including various scalars (e.g. accuracy, loss), images, histograms, etc. Since the release of PyTorch 1.2.0, TensorBoard is a PyTorch built-in feature. Please follow the official tutorials for installation and use of TensorBoard in PyTorch.

PyTorch Tutorial - Deep Learning Using PyTorch - Learn PyTorch from Basics to Advanced

Learn PyTorch from the very basics to advanced models like Generative Adversarial Networks and Image Captioning

"PyTorch: Zero to GANs" is an online course and series of tutorials on building deep learning models with PyTorch, an open source neural networks library. Here are the concepts covered in this course:

  • PyTorch Basics: Tensors & Gradients
  • Linear Regression & Gradient Descent
  • Classification using Logistic Regression
  • Feedforward Neural Networks & Training on GPUs

What you'll learn

  • A coding-focused introduction to Deep Learning using PyTorch, starting from the very basics and going all the way up to advanced topics like Generative Adversarial Networks

Thanks for reading


Further reading about Python, PyTorch and Deep Learning

Complete Python Bootcamp: Go from zero to hero in Python 3

A Complete Machine Learning Project Walk-Through in Python

Deep Learning With TensorFlow 2.0

Introduction to PyTorch and Machine Learning

Machine Learning A-Z™: Hands-On Python & R In Data Science

Deep Learning A-Z™: Hands-On Artificial Neural Networks

PyTorch Tutorial for Beginners

Implementing Deep Learning Papers - Deep Deterministic Policy Gradients (using Python)

Implementing Deep Learning Papers - Deep Deterministic Policy Gradients (using Python)

In this intermediate deep learning tutorial, you will learn how to go from reading a paper on deep deterministic policy gradients to implementing the concepts in TensorFlow. This process can be applied to any deep learning paper, not just deep reinforcement learning.

In the second part, you will learn how to code a deep deterministic policy gradient (DDPG) agent using Python and PyTorch, to beat the continuous lunar lander environment (a classic machine learning problem).

DDPG combines the best of Deep Q Learning and Actor Critic Methods into an algorithm that can solve environments with continuous action spaces. We will have an actor network that learns the (deterministic) policy, coupled with a critic network to learn the action-value functions. We will make use of a replay buffer to maximize sample efficiency, as well as target networks to assist in algorithm convergence and stability.

Thanks for watching


Further reading about Deep Learning and Python

Machine Learning A-Z™: Hands-On Python & R In Data Science

Python for Data Science and Machine Learning Bootcamp

Machine Learning, Data Science and Deep Learning with Python

Deep Learning A-Z™: Hands-On Artificial Neural Networks

Artificial Intelligence A-Z™: Learn How To Build An AI

A Complete Machine Learning Project Walk-Through in Python

Machine Learning: how to go from Zero to Hero

Top 18 Machine Learning Platforms For Developers

10 Amazing Articles On Python Programming And Machine Learning

100+ Basic Machine Learning Interview Questions and Answers

Machine Learning for Front-End Developers

Top 30 Python Libraries for Machine Learning

Learn Python Tutorial from Basic to Advance

Become a Python Programmer and learn one of employer's most requested skills of 21st century!

This is the most comprehensive, yet straight-forward, course for the Python programming language on Simpliv! Whether you have never programmed before, already know basic syntax, or want to learn about the advanced features of Python, this course is for you! In this course we will teach you Python 3. (Note, we also provide older Python 2 notes in case you need them)

With over 40 lectures and more than 3 hours of video, this comprehensive course leaves no stone unturned! This course includes tests and homework assignments, as well as 3 major projects to create a Python project portfolio!

This course will teach you Python in a practical manner; every lecture comes with a full coding screencast and a corresponding code notebook! Learn in whatever manner is best for you!

We will start by helping you get Python installed on your computer, regardless of your operating system. Whether it's Linux, MacOS, or Windows, we've got you covered!

We cover a wide variety of topics, including:

Command Line Basics
Installing Python
Running Python Code
Number Data Types
Print Formatting
Built-in Functions
Debugging and Error Handling
External Modules
Object Oriented Programming
File I/O
Web scraping
Database Connection
Email sending
and much more!
Projects that we will complete:

Guess the number
Guess the word using speech recognition
Love Calculator
Google search in Python
Image download from a link
Click and save image using OpenCV
Ludo game dice simulator
Open Wikipedia on command prompt
Password generator
QR code reader and generator
You will get lifetime access to over 40 lectures.

So what are you waiting for? Learn Python in a way that will advance your career and increase your knowledge, all in a fun and practical way!

Basic knowledge
Basic programming concepts in any language will help, but are not required to attend this tutorial
What will you learn
Learn to use Python professionally, learning both Python 2 and Python 3!
Create games with Python, like Tic Tac Toe and Blackjack!
Learn advanced Python features, like the collections module and how to work with timestamps!
Learn to use Object Oriented Programming with classes!
Understand complex topics, like decorators.
Understand how to use PyCharm and how to create .py files
Get an understanding of how to create GUIs in PyCharm!
Build a complete understanding of Python from the ground up!