Elvis Miranda

Elvis Miranda


Deep Learning with PyTorch: Guide for Beginners

In 2019, the war for ML frameworks has two main contenders: PyTorch and TensorFlow. There is a growing adoption of PyTorch by researchers and students due to ease of use, while in industry, Tensorflow is currently still the platform of choice.

Some of the key advantages of PyTorch are:

  • Simplicity: It is very pythonic and integrates easily with the rest of the Python ecosystem. It is easy to learn, use, extend, and debug.
  • Great API: PyTorch shines in term of usability due to better designed Object Oriented classes which encapsulate all of the important data choices along with the choice of model architecture. The documentation of PyTorch is also very brilliant and helpful for beginners.
  • Dynamic Graphs: PyTorch implements dynamic computational graphs. Which means that the network can change behavior as it is being run, with little or no overhead. This is extremely helpful for debugging and also for constructing sophisticated models with minimal effort. allowing PyTorch expressions to be automatically differentiated.

This is image title

There is a growing popularity of PyTorch in research. Below plot showing monthly number of mentions of the word “PyTorch” as a percentage of all mentions among other deep learning frameworks. We can see there is an steep upward trend of PyTorch in arXiv in 2019 reaching almost 50%.

This is image title
arXiv papers mentioning PyTorch is growing

Dynamic graph generation, tight Python language integration, and a relatively simple API makes PyTorch an excellent platform for research and experimentation.


PyTorch provides a very clean interface to get the right combination of tools to be installed. Below a snapshot to choose and the corresponding command. Stable represents the most currently tested and supported version of PyTorch. This should be suitable for many users. Preview is available if you want the latest version, not fully tested and supported. You can choose from Anaconda (recommended) and Pip installation packages and supporting various CUDA versions as well.

This is image title

PyTorch Modules

Now we will discuss key PyTorch Library modules like Tensors, Autograd, Optimizers and Neural Networks (NN ) which are essential to create and train neural networks.


Tensors are the workhorse of PyTorch. We can think of tensors as multi-dimensional arrays. PyTorch has an extensive library of operations on them provided by the torch module. PyTorch Tensors are very close to the very popular NumPy arrays . In fact, PyTorch features seamless interoperability with NumPy. Compared with NumPy arrays, PyTorch tensors have added advantage that both tensors and related operations can run on the CPU or GPU. The second important thing that PyTorch provides allows tensors to keep track of the operations performed on them that helps to compute gradients or derivatives of an output with respect to any of its inputs.

Basic Tensor Operations

Tensor refers to the generalization of vectors and matrices to an arbitrary number of dimensions. The dimensionality of a tensor coincides with the number of indexes used to refer to scalar values within the tensor. A tensor of order zero (0D tensor) is just a number or a scalar. A tensor of order one (1D tensor) is an array of numbers or a vector. Similarly a 2nd-order tensor (2D)is an array of vectors or a matrix.

This is image title

Now let us create a tensor in PyTorch.

This is image title

After importing the torch module, we called a function torch.ones that creates a (2D) tensor of size 9 filled with the values 1.0.

Other ways include using **t_orch.zeros_**; zero filled tensor, **_torch.randn_**; from random uniform distribution.

Type and Size

Each tensor has an associated type and size. The default tensor type when you use the **torch.Tensor** constructor is **torch.FloatTensor**. However, you can convert a tensor to a different type (**float**, **long**, **double**, etc.) by specifying it at initialization or later using one of the typecasting methods. There are two ways to specify the initialization type: either by directly calling the constructor of a specific tensor type, such as **FloatTensor**or **LongTensor**, or using a special method, **torch.tensor()**, and providing the **dtype**.

This is image title

Some useful tensor operations:

This is image title

To find the maximum item in a tensor as well as the index that contains the maximum value. These can be done with the **_max()_** and **_argmax()_** functions. We can also use **item()** to extract a standard Python value from a 1D tensor.

Most functions that operate on a tensor and return a tensor create a new tensor to store the result. If you need an in-place function look for a function with an appended underscore (___) e.g **_torch.transpose__** will do in-place transpose of a tensor.

Converting between tensors and Numpy is very simple using **_torch.from_numpy_** & **_torch.numpy()_**.

Another common operation is reshaping a tensor. This is one of the frequently used operations and very useful too. We can do this with either **_view()_** or **_reshape()_**:

This is image title

**_Tensor.reshape()_** and **_Tensor.view()_** though are not the same.

  • **_Tensor.view()_** works only on contiguous tensors and will never copy memory. It will raise an error on a non-contiguous tensor. But you can make the tensor contiguous by calling **_contiguous()_** and then you can call **_view()_**.
  • **_Tensor.reshape()_** will work on any tensor and can make a clone if it is needed.

Tensor Broadcasting

PyTorch supports broadcasting similar to NumPy. Broadcasting allows you to perform operations between two tensors. Refer here for the broadcasting semantics.

Tensor in a nutshell: Where, How & What

Three attributes which uniquely define a tensor are:

device: Where the tensor’s physical memory is actually stored, e.g., on a CPU, or a GPU. The **torch.device** contains a device type ('**cpu**' or '**cuda**') and optional device ordinal for the device type.

layout: How we logically interpret this physical memory. The most common layout is a strided tensor. Strides are a list of integers: the k-th stride represents the jump in the memory necessary to go from one element to the next one in the k-th dimension of the Tensor.

dtype: What is actually stored in each element of the tensor? This could be floats or integers etc. PyTorch has nine different data types.


Autograd is automatic differentiation system. What does automatic differentiation do? Given a network, it calculates the gradients automatically. When computing the forwards pass, autograd simultaneously performs the requested computations and builds up a graph representing the function that computes the gradient.

How is this achieved?

PyTorch tensors can remember where they come from in terms of the operations and parent tensors that originated them, and they can provide the chain of derivatives of such operations with respect to their inputs automatically. This is achieved through **requires_grad**, if set to True.

**t= torch.tensor([1.0, 0.0], requires_grad=True)**

After calculating the gradient, the value of the derivative is automatically populated as a **grad** attribute of the tensor. For any composition of functions with any number of tensors with **requires_grad= True**; PyTorch would compute derivatives throughout the chain of functions and accumulate their values in the **grad** attribute of those tensors.


Optimizers are used to update weights and biases i.e. the internal parameters of a model to reduce the error. Please refer to my another article for more details.

PyTorch has an **torch.optim** package with various optimization algorithms like SGD (Stochastic Gradient Descent), Adam, RMSprop etc .

Let us see how we can create one of the provided optimizers SGD or Adam.

import torch.optim as optim
params = torch.tensor([1.0, 0.0], requires_grad=True)
learning_rate = 1e-3
## SGD
optimizer = optim.SGD([params], lr=learning_rate)
## Adam
optimizer = optim.Adam([params], lr=learning_rate)

Without using optimizers, we would need to manually update the model parameters by something like:

for params in model.parameters(): 
       params -= params.grad * learning_rate

We can use the **step()** method from our optimizer to take a forward step, instead of manually updating each parameter.


The value of params is updated when step is called. The optimizer looks into **params.grad** and updates **params** by subtracting **learning_rate**times **grad**from it, exactly as we did in without using optimizer.

**torch.optim**module helps us to abstract away the specific optimization scheme with just passing a list of params. Since there are multiple optimization schemes to choose from, we just need to choose one for our problem and rest the underlying PyTorch library does the magic for us.


In PyTorch the **torch.nn** package defines a set of modules which are similar to the layers of a neural network. A module receives input tensors and computes output tensors. The **torch.nn** package also defines a set of useful loss functions that are commonly used when training neural networks.

Steps of building a neural network are:

  • Neural Network Construction: Create the neural network layers. setting up parameters (weights, biases)
  • Forward Propagation: Calculate the predicted output. Measure error.
  • Back-propagation: After finding the error, webackward propagate our error gradient to update our weight parameters. We do this by taking the derivative of the errorfunctionwith respect to the parameters of our NN.
  • Iterative Optimization: We want to minimize error as much as possible. We keep updating the parameters iteratively by gradient descent.

Build a Neural Network

Let us follow the above steps and create a simple neural network in PyTorch.

Step 1: Neural Network Construction

We call our NN**Net**here**.** We’re inheriting from **nn.Module**. Combined with super().__init__() this creates a class that tracks the architecture and provides a lot of useful methods and attributes.

This is image title

Our neural network **Net** has one hidden layer **self.hl** and one output layer **self.ol**.

self.hl = nn.Linear(1, 10)

This line creates a module for a linear transformation with 1 inputs and 10 outputs. It also automatically creates the weight and bias tensors. You can access the weight and bias tensors once the network **net** is created with **net.hl.weight** and **net.hl.bias**.

This is image title

We have defined activation using **self.relu = nn.ReLU()** .

Step 2: Forward propagation

PyTorch networks created with **nn.Module** must have a **forward()** method defined. It takes in a tensor **x** and passes it through the operations you defined in the **__init__** method.

   def forward(self, x):
   hidden = self.hl(x)
   activation = self.relu(hidden)
   output = self.ol(activation)

We can see that the input tensor goes through the hidden layer, then activation function (relu), then the output layer.

Step 3: Back-propagation

Here we have to calculate error or loss and backward propagate our error gradient to update our weight parameters.

A loss function takes the (output, target) and computes a value that estimates how far away the **output** is from the **target**.There are several different loss functions under the **torch.nn** package . A simple loss is **nn.MSELoss** which computes the mean-squared error between the input and the target.

output = net(input)
loss_fn = nn.MSELoss()
loss = loss_fn(output, target)


A simple function call**loss.backward()**propagates the error. Don’t forget to clear the existing gradients though else gradients will be accumulated to existing gradients. After calling**loss.backward()** have a look at hidden layer bias gradients before and after the backward call.

This is image title

So after calling the backward(), we see the gradients are calculated for the hidden layer.

Step 4: Iterative Optimization

We have already seen how optimizer helps us to update the parameters of the model.

# create your optimizer
optimizer = optim.Adam(net.parameters(), lr=1e-2)
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)     # calculate output
loss = loss_fn(output, target) #calculate loss
loss.backward()      # calculate gradient
optimizer.step()     # update parameters

Now with our basic steps (1,2,3) complete, we just need to iteratively train our neural network to find the minimum loss. So we run the **training_loop**for many epochs until we minimize the loss.

This is image title

Let us run our neural network to train for input **x_t** and target**y_t**.

This is image title

We call **training_loop** for 1500 epochs an pass all other arguments like **optimizer**, **model**, **loss_fn**``**inputs**, and **target**. After every 300 epochs we print the loss and we can see the loss decreasing after every iteration. Looks like our very basic neural network is learning.

This is image title

We plot the model output (black crosses) and target data (red circles), the model seems to learn quickly.

This is image title

So far we discussed the basic or essential elements of PyTorch to get you started. Creating machine learning based solutions for real problems involves significant effort into data preparation. However, PyTorch library provides many tools to make data loading easy and more readable like **torchvision**, **torchtext** and **torchaudio**to work with image, text and audio data respectively.

Before I finish the article, I also want to mention a very important tool called TensorBoard. Training machine learning models is often very hard.A tool that can help in visualizing our model and understanding the training progress is always needed, when we encounter some problems.

TensorBoard helps to log events from our model training, including various scalars (e.g. accuracy, loss), images, histograms etc. Since release of PyTorch 1.2.0, TensorBoard is now a PyTorch built-in feature. Please follow this and this tutorials for installation and use of TensorBoard in Pytorch.

#python #PyTorch #Deep Learning

What is GEEK

Buddha Community

Deep Learning with PyTorch: Guide for Beginners
Marget D

Marget D


Top Deep Learning Development Services | Hire Deep Learning Developer

View more: https://www.inexture.com/services/deep-learning-development/

We at Inexture, strategically work on every project we are associated with. We propose a robust set of AI, ML, and DL consulting services. Our virtuoso team of data scientists and developers meticulously work on every project and add a personalized touch to it. Because we keep our clientele aware of everything being done associated with their project so there’s a sense of transparency being maintained. Leverage our services for your next AI project for end-to-end optimum services.

#deep learning development #deep learning framework #deep learning expert #deep learning ai #deep learning services

Mikel  Okuneva

Mikel Okuneva


Top 10 Deep Learning Sessions To Look Forward To At DVDC 2020

The Deep Learning DevCon 2020, DLDC 2020, has exciting talks and sessions around the latest developments in the field of deep learning, that will not only be interesting for professionals of this field but also for the enthusiasts who are willing to make a career in the field of deep learning. The two-day conference scheduled for 29th and 30th October will host paper presentations, tech talks, workshops that will uncover some interesting developments as well as the latest research and advancement of this area. Further to this, with deep learning gaining massive traction, this conference will highlight some fascinating use cases across the world.

Here are ten interesting talks and sessions of DLDC 2020 that one should definitely attend:

Also Read: Why Deep Learning DevCon Comes At The Right Time

Adversarial Robustness in Deep Learning

By Dipanjan Sarkar

**About: **Adversarial Robustness in Deep Learning is a session presented by Dipanjan Sarkar, a Data Science Lead at Applied Materials, as well as a Google Developer Expert in Machine Learning. In this session, he will focus on the adversarial robustness in the field of deep learning, where he talks about its importance, different types of adversarial attacks, and will showcase some ways to train the neural networks with adversarial realisation. Considering abstract deep learning has brought us tremendous achievements in the fields of computer vision and natural language processing, this talk will be really interesting for people working in this area. With this session, the attendees will have a comprehensive understanding of adversarial perturbations in the field of deep learning and ways to deal with them with common recipes.

Read an interview with Dipanjan Sarkar.

Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER

By Divye Singh

**About: **Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER is a paper presentation by Divye Singh, who has a masters in technology degree in Mathematical Modeling and Simulation and has the interest to research in the field of artificial intelligence, learning-based systems, machine learning, etc. In this paper presentation, he will talk about the common problem of class imbalance in medical diagnosis and anomaly detection, and how the problem can be solved with a deep learning framework. The talk focuses on the paper, where he has proposed a synergistic over-sampling method generating informative synthetic minority class data by filtering the noise from the over-sampled examples. Further, he will also showcase the experimental results on several real-life imbalanced datasets to prove the effectiveness of the proposed method for binary classification problems.

Default Rate Prediction Models for Self-Employment in Korea using Ridge, Random Forest & Deep Neural Network

By Dongsuk Hong

About: This is a paper presentation given by Dongsuk Hong, who is a PhD in Computer Science, and works in the big data centre of Korea Credit Information Services. This talk will introduce the attendees with machine learning and deep learning models for predicting self-employment default rates using credit information. He will talk about the study, where the DNN model is implemented for two purposes — a sub-model for the selection of credit information variables; and works for cascading to the final model that predicts default rates. Hong’s main research area is data analysis of credit information, where she is particularly interested in evaluating the performance of prediction models based on machine learning and deep learning. This talk will be interesting for the deep learning practitioners who are willing to make a career in this field.

#opinions #attend dldc 2020 #deep learning #deep learning sessions #deep learning talks #dldc 2020 #top deep learning sessions at dldc 2020 #top deep learning talks at dldc 2020

PyTorch For Deep Learning 

What is Pytorch ?

Pytorch is a Deep Learning Library Devoloped by Facebook. it can be used for various purposes such as Natural Language Processing , Computer Vision, etc


Python, Numpy, Pandas and Matplotlib

Tensor Basics

What is a tensor ?

A Tensor is a n-dimensional array of elements. In pytorch, everything is a defined as a tensor.

#pytorch #pytorch-tutorial #pytorch-course #deep-learning-course #deep-learning

Tia  Gottlieb

Tia Gottlieb


Beginners Guide to Machine Learning on GCP

Introduction to Machine Learning

  • Machine Learning is a way to use some set of algorithms to derive predictive analytics from data. It is different than Business Intelligence and Data Analytics in a sense that In BI and Data analytics Businesses make decision based on historical data, but In case of Machine Learning , Businesses predict the future based on the historical data. Example, It’s a difference between what happened to the business vs what will happen to the business.Its like making BI much smarter and scalable so that it can predict future rather than just showing the state of the business.
  • **ML is based on Standard algorithms which are used to create use case specific model based on the data **. For example we can build the model to predict delivery time of the food, or we can build the model to predict the Delinquency rate in Finance business , but to build these model algorithm might be similar but the training would be different.Model training requires tones of examples (data).
  • Basically you train your standard algorithm with your Input data. So algorithms are always same but trained models are different based on use cases. Your trained model will be as good as your data.

ML, AI , Deep learning ? What is the difference?

Image for post

ML is type of AI

AI is a discipline , Machine Learning is tool set to achieve AI. DL is type of ML when data is unstructured like image, speech , video etc.

Barrier to Entry Has Fallen

AI & ML was daunting and with high barrier to entry until cloud become more robust and natural AI platform. Entry barrier to AI & ML has fallen significantly due to

  • Increasing availability in data (big data).
  • Increase in sophistication in algorithm.
  • And availability of hardware and software due to cloud computing.

GCP Machine Learning Spectrum

Image for post

  • For Data scientist and ML experts , TensorFlow on AI platform is more natural choice since they will build their own custom ML models.
  • But for the users who are not experts will potentially use Cloud AutoML or Pre-trained ready to go model.
  • In case of AutoML we can trained our custom model with Google taking care of much of the operational tasks.
  • Pre-trained models are the one which are already trained with tones of data and ready to be used by users to predict on their test data.

Prebuilt ML Models (No ML Expertise Needed)

  • As discuss earlier , GCP has lot of Prebuilt models that are ready to use to solve common ML task . Such as image classification, Sentiment analysis.
  • Most of the businesses are having many unstructured data sources such as e-mail, logs, web pages, ppt, documents, chat, comments etc.( 90% or more as per various studies)
  • Now to process these unstructured data in the form of text, we should use Cloud Natural Language API.
  • Similarly For common ML problems in the form of speech, video, vision we should use respective Prebuilt models.

#ml-guide-on-gcp #ml-for-beginners-on-gcp #beginner-ml-guide-on-gcp #machine-learning #machine-learning-gcp #deep learning

Few Shot Learning — A Case Study (2)

In the previous blog, we looked into the fact why Few Shot Learning is essential and what are the applications of it. In this article, I will be explaining the Relation Network for Few-Shot Classification (especially for image classification) in the simplest way possible. Moreover, I will be analyzing the Relation Network in terms of:

  1. Effectiveness of different architectures such as Residual and Inception Networks
  2. Effects of transfer learning via using pre-trained classifier on ImageNet dataset

Moreover, effectiveness will be evaluated on the accuracy, time required for training, and the number of required training parameters.

Please watch the GitHub repository to check out the implementations and keep updated with further experiments.

Introduction to Few-Shot Classification

In few shot classification, our objective is to design a method which can identify any object images by analyzing few sample images of the same class. Let’s the take one example to understand this. Suppose Bob has a client project to design a 5 class classifier, where 5 classes can be anything and these 5 classes can even change with time. As discussed in previous blog, collecting the huge amount of data is very tedious task. Hence, in such cases, Bob will rely upon few shot classification methods where his client can give few set of example images for each classes and after that his system can perform classification young these examples with or without the need of additional training.

In general, in few shot classification four terminologies (N way, K shot, support set, and query set) are used.

  1. N way: It means that there will be total N classes which we will be using for training/testing, like 5 classes in above example.
  2. K shot: Here, K means we have only K example images available for each classes during training/testing.
  3. Support set: It represents a collection of all available K examples images from each classes. Therefore, in support set we have total N*K images.
  4. Query set: This set will have all the images for which we want to predict the respective classes.

At this point, someone new to this concept will have doubt regarding the need of support and query set. So, let’s understand it intuitively. Whenever humans sees any object for the first time, we get the rough idea about that object. Now, in future if we see the same object second time then we will compare it with the image stored in memory from the when we see it for the first time. This applied to all of our surroundings things whether we see, read, or hear. Similarly, to recognise new images from query set, we will provide our model a set of examples i.e., support set to compare.

And this is the basic concept behind Relation Network as well. In next sections, I will be giving the rough idea behind Relation Network and I will be performing different experiments on 102-flower dataset.

About Relation Network

The Core idea behind Relation Network is to learn the generalized image representations for each classes using support set such that we can compare lower dimensional representation of query images with each of the class representations. And based on this comparison decide the class of each query images. Relation Network has two modules which allows us to perform above two tasks:

  1. Embedding module: This module will extract the required underlying representations from each input images irrespective of the their classes.
  2. Relation Module: This module will score the relation of embedding of query image with each class embedding.

Training/Testing procedure:

We can define the whole procedure in just 5 steps.

  1. Use the support set and get underlying representations of each images using embedding module.
  2. Take the average of between each class images and get the single underlying representation for each class.
  3. Then get the embedding for each query images and concatenate them with each class’ embedding.
  4. Use the relation module to get the scores. And class with highest score will be the label of respective query image.
  5. [Only during training] Use MSE loss functions to train both (embedding + relation) modules.

Few things to know during the training is that we will use only images from the set of selective class, and during the testing, we will be using images from unseen classes. For example, from the 102-flower dataset, we will use 50% classes for training, and rest will be used for validation and testing. Moreover, in each episode, we will randomly select 5 classes to create the support and query set and follow the above 5 steps.

That is all need to know about the implementation point of view. Although the whole process is simple and easy to understand, I’ll recommend reading the published research paper, Learning to Compare: Relation Network for Few-Shot Learning, for better understanding.

#deep-learning #few-shot-learning #computer-vision #machine-learning #deep learning #deep learning