Pytorch : Everything you need to know in 10 mins

Pytorch : Everything you need to know in 10 mins
<p class="ql-align-center"> Originally published by Aman Juneja  at</p><p>It is increasingly making it easier for developers to build Machine Learning capabilities into their applications while testing their code is real time. In this piece about Pytorch Tutorial, I talk about the new platform in Deep Learning.</p><p>The latest version of the platform brings a lot of new capabilities to the table and is clocking vibrant support from the whole industry. It is remarkable how Pytorch is being touted as a serious contender to Google’s Tensorflow just within a couple of years of its release. Its popularity is mainly being driven by a smoother learning curve and a cleaner interface, which is providing developers with a more intuitive approach to build neural networks.</p><p>In fact, according to Soumith Chintala, AI Research Engineer at Facebook and creator of Pytorch, the inability to leverage the available Machine Learning libraries in an ideal development scenario is what prompted them to build something specific. “Google’s TensorFlow was released in 2015. We tried using it but were not super happy with it. Before this, we tried Caffe1, Theano, and Torch. At the time, we were using Torch and Caffe1 for research and production. The field has changed a lot, and we felt a new tool was needed. Looked like nobody else was building it, not the way we thought will be needed in the future. So, we felt we should build it.”</p><p>So what’s new in Pytorch 1.0? Here are the highlights of the new release:</p>
    <li>Torch Script</li><li>PyTorch 1.0 introduces JIT for model graphs that revolve around the concept of Torch Script which is a restricted subset of the Python language. It has its very own compiler and transform passes, optimizations, etc. Class and method annotations are used to indicate the scripts as a part of the Python code. Examples include @torch.jit.script, and @torch.jit.script_method. Annotations help to preserve elements such as loops, print statements, and control flow.</li>
    <li>Although it should be noted that you need to remove the annotations in the case you want to:</li><li>Debug the scripts using standard Python tools</li>
<p>Switch to eager execution modeSubsets that are valid Torch Scripts include:</p>
    <li>Tensors and numeric primitives</li><li>If statements</li><li>Simple Loops</li><li>Code organizations with nn.Module</li><li>Tuples, lists, strings, and print</li><li>Gradient propagation through script functions</li><li>In-place updates to tensors and lists</li><li>Direct use of standard nn.Modules such as nn/ConvCalling functions like grad() or backwards() within @scripts</li><li class="ql-indent-1">You can work with Torch Scripts in two ways:Tracing Mode</li><li class="ql-indent-1">Scripting Mode</li>
    <li>Tracing Mode</li><li>The Pytorch tracer (torch.jit.trace) records native Pytorch operations that are executed in a code region. Along with this, it also records the data dependencies between them. Although it has had a tracer since version 0.3, it can now re-execute the tracer for you by leveraging a high-performance C++ runtime environment. The trace no longer needs to be executed elsewhere as the latest version has made it possible to integrate the optimizations and hardware integrations of Caffe2.You can also create Torch Scripts by using a tracing JIT. Here the computational graph nodes will be visited and the final script will be produced after recording the operations.</li><li>Scripting Mode</li><li>The Scripting Mode makes it possible for you to write regular Python functions without the need to use complicated language functions. The @script decorator can be used to compile a function once the desired functionality has been isolated.Such an annotation would directly transform the Python function into a C++ runtime for higher performance.Torch Scripts can be created by providing custom scripts where you provide the description of your model. Though it is necessary to take the limitations of Torch Scripts into account for this purpose.</li><li>Integration of Research and Production</li><li>PyTorch 1.0 integrates research with production very intuitively. Although past versions quickly rose to popularity for the flexibility they provided in Artificial Intelligence development and research, performance at production scale remained a challenge. Developers had to translate the research code into a graph model representation in Caffe2 for production purposes. Such a migration used to be manual and time-consuming.But now PyTorch 1.0 integrates immediate and graph execution modes to help developers handle research and production simultaneously. With the help of a hybrid front-end, you can now share code between both the modes for seamless prototyping and production.</li><li>C++ API and Frontend</li><li>Python has not been a popular option for deployment in C++ due to factors such as high overheads on small models, multi-threading services bottleneck on GIL, etc. PyTorch 1.0 provides developers with a two-way pathway from Python to C++ and vice versa. This helps in functions such as debugging and refactoring. The C++ API allows you to write custom implementations such as calls to third-party functions.In addition to this, the beta version of C++ Frontend was also announced. Though it is currently marked as ‘API Unstable’. This makes it ready to be used for building research applications but its utilization for production purposes will take some time to stabilize.</li><li>Optimization and Export</li><li>It does not matter whether you are using the Tracing mode or the Scripting mode. With PyTorch 1.0, the result is always a Python free representation of your model which can be used in two ways – to optimize the model or export the model – in the production environments.</li>
<p>Whole program optimizations become possible with the ability to extract bigger segments of the model into an intermediate representation. Computations can also be offloaded to specialized AI accelerators. PyTorch 1.0 also includes passes to fuse GPU operations together and improve the performance of smaller RNN models.</p>

Support From the Ecosystem

<p>The tech world has been quick to respond to the added capabilities of PyTorch with major market players announcing extended support to create a thriving ecosystem around the Deep Learning platform. Here is a wrap up of the major announcements that the release of PyTorch 1.0 has attracted:</p>
    <li>Amazon Web Services (AWS):Amazon SageMaker now features PyTorch 1.0 images. Developers can even train and deploy their Deep Learning models of PyTorch in SageMarker. Once the PyTorch script has been written, Amazon SageMaker training can handle subsequent tasks such as setting up the distributed training cluster, hyperparameter tuning and transferring of data. This makes Amazon SageMaker as a managed online endpoint with a coveted ability to scale up automatically whenever the need arises.Using PyTorch in Amazon SageMarker starts with the developer providing the relevant script and then use the PyTorch estimator from Amazon SageMarker Python SDK as follows:</li>
<p>estimator = PyTorch(entry_point=””,</p><p>role=role,</p><p>train_instance_count=2,</p><p>train_instance_type=’ml.p2.xlarge’,</p><p>hyperparameters={‘epochs’: 10,</p><p>‘lr’: 0.01})</p><p>Developers can package their code as a Docker container to host it or deploy it for inference.</p>
    <li>Microsoft:</li><li>Microsoft has also been quick in announcing major support for PyTorch. The highlights include:Setting up extensive Windows support for PyTorch. Actively contributing to the GitHub code.Allocating a dedicated team of Developers to improve PyTorch.Closely working with the community.Integration of PyTorch in all Machine Learning products of Microsoft which includes:</li><li>VS Code</li><li>Azure</li><li>Data Science VM</li><li>Azure ML</li><li>Google Cloud Platform (GCP):</li><li>Google has not held back in any way by jumping into the mix with a few major announcements of its own in partnership with the PyTorch 1.0 release:Although Kubeflow already supported PyTorch, Google has extended the TensorRT package in Kubeflow to support serving PyTorch models.A collaboration of Tenserboard with PyTorch. This includes Cloud TPU and TPU pods with support for easy scaling.Broadened support for PyTorch throughout the AI platforms and services of Google Cloud Support.Fully hybrid Python and C/C++ front-end support and native distribution execution support for production environments.</li><li>Nvidia:</li><li>Nvidia and Facebook also have a healthy collaboration history with the companies joining hands together in 2017 to create large-scale distributed training scenarios to develop Machine Learning based applications for Edge devices. Collaborative efforts continue today with Nvidia actively working to integrate Pytorch into their current offerings:</li>
    <li>A PyTorch Extension Tools (APEX) for easy Mixed Precision and Distributed Training.</li><li>Support for PyTorch framework across the inference workflow. Developers can:</li><li class="ql-indent-1">Import PyTorch models with the ONNX format</li><li class="ql-indent-1">Apply INT8 and FP16 optimizations</li><li class="ql-indent-1">Calibrate for lower precision with high accuracy</li><li class="ql-indent-1">Generate runtimes for production deployment</li><li>Availability of PyTorch container from the Nvidia GPU Cloud container registry to help developers get started quickly with the platform.</li>

Key Capabilities and Features 

<p>Having started out just 2 years ago, it is incredible how quickly it has matured to add new capabilities and functionalities. A host of improved abilities have been introduced in PyTorch 1.0. Here is a quick look at what the open source Deep Learning platform is capable of today:</p>
    <li>Hybrid Front-end: Provides ease of use and better flexibility in eager mode. Provides graph mode for speed, optimization, and functionality in C++ runtime environments.</li><li>Distributed Training: Optimized performance for both research and production. Provides asynchronous execution of collective operations and peer to peer communication.</li><li>Python First: PyTorch has been built to be deeply integrated with Python and can be actively used with popular libraries and packages such as Cython and Numba.</li><li>Tools and Libraries: The community of PyTorch is highly active, which has led to the development of a rich ecosystem of tools and libraries. This has extended the reach and supported development in numerous areas.</li><li>Native ONNX Support: PyTorch also offers export models in the standard Open Neural Network Exchange format. This provides developers with direct access to ONNX-compatible platforms, runtimes, visualizers, etc.</li><li>C++ Front-end: A C++ interface that is intended to enable research in high performance or low latency C++ applications.</li><li>Cloud Partners: As established by the support provided from the ecosystem, all major cloud computing platforms are supporting PyTorch today. This paves way for a smooth development process, easy scaling, large-scale training on GPUs, etc.</li>

Main Elements in PyTorch

<p>If you are planning to fuel your development process by leveraging the phenomenal capabilities, there are some main elements that you should know about before starting out to plan your development process in the most optimum way. Let’s take a look:</p>
    <li>PyTorch Tensors</li>
Tensors are multidimensional arrays. Tensors are similar to numpy’s ndarrays, though they can also be used on GPUs. A simple one-dimensional matrix can be defined as:# import pytorch
//import torch
# define a tensor
2[torch.FloatTensor of size 1]
    <li>Mathematical Operations</li>
PyTorch provides you with 200+ mathematical operators to work with. This meets the need of a scientific computing library making efficient implementations of mathematical functions. Here is how addition works out :a = torch.FloatTensor([2])
b = torch.FloatTensor([1])
a + b
3[torch.FloatTensor of size 1]
    <li>Various functions on matrices can also be performed on the defined PyTorch Tensors. Here’s an example:</li>
matrix = torch.randn(3, 3)
0.4182 2.1159 8.3576
-0.4563 -0.2357 -2.5800
-0.5081 -2.1937 -0.0291
[torch.FloatTensor of size 3×3]
0.4182 -0.4563 -0.5081
2.1159 -0.2357 -2.1937
8.3576 -2.5800 -0.0291
[torch.FloatTensor of size 3×3]
    <li>Autograd Module</li><li>PyTorch makes use of Automatic differentiation. A recorder records all the performed operations and then plays it back to compute the gradients. This technique finds extensive usage when neural networks are built.from torch.autograd import Variable</li>
x = Variable(train_x)
y = Variable(train_y, requires_grad=False)
    <li>Optim Module</li><li>The torch.optim module helps you to implement optimization algorithms to build neural networks. The best feature is the support for most of the commonly used methods. This eliminates the need to build them from scratch.</li>
For instance, here is how you can use the Adam.optimizer:optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    <li>nn Module</li><li>It can be difficult to define complex neural networks with raw autograd. The nn Module helps in this regard by allowing you to define a set of modules that can be considered as a neural network layer.</li>
import torch
# define model
model = torch.nn.Sequential(
torch.nn.Linear(input_num_units, hidden_num_units),
torch.nn.Linear(hidden_num_units, output_num_units),
loss_fn = torch.nn.CrossEntropyLoss()

Important Things to Keep in Mind

<p>With the basics covered, you can now kickstart building your very own neural network with PyTorch and make use of the maturing ecosystem to bring your ideas to life.</p><p>But before you begin, here are some important details to keep in mind to avoid certain pitfalls that might give you trouble at a later stage.</p>
    <li>Data Types</li><li>Data types matter a lot in PyTorch. For example, not all NumPy arrays can be converted to torch Tensor. Only certain NumPy data types can be converted to torch Tensor type such as numpy.uint8 to torch.ByteTensor, numpy.int16 to torch.ShortTensor, and numpy.int32 to torch.IntTensor.</li><li>Numerical Stability</li>
The thumb rule is – if it can overflow or underflow, it probably will. For instance, let’s say that you have to relate samples with tags in both positive and negative ways. You use the classical sigmoid + log loss for this purpose.sigmoid = torch.nn.functional.sigmoiddot_p =, tag_p)
loss_pos = -torch.log(sigmoid(dot_p)) #(1)
dot_n =, tag_n)
loss_neg = -torch.log(1 – sigmoid(dot_n)) #(2)
    <li>Log(0) is the critical point here. Since Log is undefined for this input, there are two ways in which this situation can go down:</li>
sigmoid(x) = 0, which means x is a “large” negative value.
sigmoid(x) = 1, which means x is a “large” positive value.
    <li>Regardless of the case, -log(y) evaluates to zero. This leads to a numerical instability which hinders with further optimization steps.</li><li>A workaround here can be to bound the values of sigmoid to be slightly below one and slightly above zero.</li>
value = torch.nn.functional.sigmoid(x)
value = torch.clamp(torch.clamp(value, min=eps), max=1-eps)
    <li>This makes sigmoid(dot_p) to be always positive and (1 – sigmoid(dot_n)) to never amount to zero. Although this is not rocket science, you need to keep such evaluations in mind to ensure numerical stability while you code.</li><li>Gradients</li><li>In PyTorch, Gradients accumulate by default. To understand this, consider a scenario in which you run a computation once, both forward and backward, and everything seems to be working correctly. But when you run it for the second time, new gradients get added to the gradients from the first operation. This is easy to forget, especially for developers who are dealing with a Machine Learning platform/library for the first time.A quick solution in such a scenario would be to manually set the gradients to zero between every two runs. This can be done with:</li>
<p>Conclusion</p><p>PyTorch is taking the world of Deep Learning by storm by paving way for better innovation in the whole ecosystem that even includes the likes of education providers such as Udacity and All this and more makes the future of PyTorch quite promising and provides huge incentives to developers to start depending on the platform confidently. Subscribe to the blog for further tutorials and updates on PyTorch</p><p>
</p><p class="ql-align-center">Originally published by Aman Juneja  at</p><p>============================================</p><p>Thanks for reading :heart: If you liked this post, share it with all of your programming buddies! Follow me on Facebook | Twitter</p>

Learn More

<p>☞ Data Science, Deep Learning, & Machine Learning with Python</p><p>☞ Deep Learning A-Z™: Hands-On Artificial Neural Networks</p><p>☞ Machine Learning A-Z™: Hands-On Python & R In Data Science</p><p>☞ Python for Data Science and Machine Learning Bootcamp</p><p>☞ Machine Learning, Data Science and Deep Learning with Python</p><p>☞ [2019] Machine Learning Classification Bootcamp in Python</p><p>☞ Introduction to Machine Learning & Deep Learning in Python</p><p>☞ Machine Learning Career Guide – Technical Interview</p><p>☞ Machine Learning Guide: Learn Machine Learning Algorithms</p><p>☞ Machine Learning Basics: Building Regression Model in Python</p><p class="ql-align-justify">☞ Machine Learning using Python - A Beginner’s Guide</p><p>

How to Using Python Pandas for Log Analysis

How to Using Python Pandas for Log Analysis

In this quick article, I'll walk you through how to use the Pandas library for Python in order to analyze a CSV log file for offload analysis.


Python Pandas is a library that provides data science capabilities to Python. Using this library, you can use data structures like DataFrames. This data structure allows you to model the data like an in-memory database. By doing so, you will get query-like capabilities over the data set.

Use Case

Suppose we have a URL report from taken from either the Akamai Edge server logs or the Akamai Portal report. In this case, I am using the Akamai Portal report. In this workflow, I am trying to find the top URLs that have a volume offload less than 50%. I've attached the code at the end. I am going to walk through the code line-by-line. Here are the column names within the CSV file for reference.

Offloaded Hits,Origin Hits,Origin OK Volume (MB),Origin Error Volume (MB)

Initialize the Library

The first step is to initialize the Pandas library. In almost all the references, this library is imported as pd. We'll follow the same convention.

import pandas as pd

Read the CSV as a DataFrame

The next step is to read the whole CSV file into a DataFrame. Note that this function to read CSV data also has options to ignore leading rows, trailing rows, handling missing values, and a lot more. I am not using these options for now.

urls_df = pd.read_csv('urls_report.csv')

Pandas automatically detects the right data formats for the columns. So the URL is treated as a string and all the other values are considered floating point values.

Compute Volume Offload

The default URL report does not have a column for Offload by Volume. So we need to compute this new column.

urls_df['Volume Offload'] = (urls_df['OK Volume']*100) / (urls_df[

We are using the columns named OK Volume and Origin OK Volumn (MB) to arrive at the percent offloads.

Filter the Data

At this point, we need to have the entire data set with the offload percentage computed. Since we are interested in URLs that have a low offload, we add two filters:

  • Consider the rows having a volume offload of less than 50% and it should have at least some traffic (we don't want rows that have zero traffic).

  • We will also remove some known patterns. This is based on the customer context but essentially indicates URLs that can never be cached.

Sort Data

At this point, we have the right set of URLs but they are unsorted. We need the rows to be sorted by URLs that have the most volume and least offload. We can achieve this sorting by columns using the sort command.

low_offload_urls.sort_values(by=['OK Volume','Volume Offload'],inplace

Print the Data

For simplicity, I am just listing the URLs. We can export the result to CSV or Excel as well.

First, we project the URL (i.e., extract just one column) from the dataframe. We then list the URLs with a simple for loop as the projection results in an array.

for each_url in low_offload_urls['URL']:
print (each_url)

I hope you found this useful and get inspired to pick up Pandas for your analytics as well!


I was able to pick up Pandas after going through an excellent course on Coursera titled Introduction to Data Science in Python. During this course, I realized that Pandas has excellent documentation.

  • Pandas Documentation:

Full Code

import pandas as pd

urls_df = pd.read_csv('urls_report.csv')

#now convert to right types
urls_df['Volume Offload'] = (urls_df['OK Volume']*100) / (urls_df['OK Volume'] + urls_df['Origin OK Volume (MB)'])

low_offload_urls = urls_df[(urls_df['OK Volume'] > 0) & (urls_df['Volume Offload']<50.0)]
low_offload_urls = low_offload_urls[(~low_offload_urls.URL.str.contains("")) & (~low_offload_urls.URL.str.contains("stateful-apis")) ]

low_offload_urls.sort_values(by=['OK Volume','Volume Offload'],inplace=True, ascending=['True','False'])

for each_url in low_offload_urls['URL']:
print (each_url)

Thanks for reading.

30 Helpful Python Snippets You Should Learn Today

30 Helpful Python Snippets You Should Learn Today

Python is a no-BS programming language. Readability and simplicity of design are two of the biggest reasons for its immense popularity.

Part of the reason for this popularity is its simplicity and easiness to learn it.

If you are reading this, then it is highly likely that you already use Python or at least have an interest in it.

In this article, we will briefly see 30 short code snippets that you can understand and learn in 30 seconds or less.

The following tricks will prove handy in your day-to-day coding exercises.

1. All unique

The following method checks whether the given list has duplicate elements. It uses the property of set() which removes duplicate elements from the list.

def all_unique(lst):
    return len(lst) == len(set(lst))

x = [1,1,2,2,3,2,3,4,5,6]
y = [1,2,3,4,5]
all_unique(x) # False
all_unique(y) # True

2. Anagrams

This method can be used to check if two strings are anagrams. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

from collections import Counter

def anagram(first, second):
    return Counter(first) == Counter(second)

anagram("abcd3", "3acdb") # True

3. Memory

This snippet can be used to check the memory usage of an object.

import sys 

variable = 30 
print(sys.getsizeof(variable)) # 24

4. Byte size

This method returns the length of a string in bytes.

def byte_size(string):
byte_size('😀') # 4
byte_size('Hello World') # 11 

5. Print a string N times

This snippet can be used to print a string n times without having to use loops to do it.

n = 2; 
s ="Programming"; 

print(s * n); # ProgrammingProgramming

6. Capitalize first letters

This snippet simply uses the method title() to capitalize first letters of every word in a string.

s = "programming is awesome"

print(s.title()) # Programming Is Awesome

7. Chunk

This method chunks a list into smaller lists of a specified size.

def chunk(list, size):
    return [list[i:i+size] for i in range(0,len(list), size)]

8. Compact

This method removes falsy values (False, None, 0 and “”) from a list by using filter().

def compact(lst):
    return list(filter(None, lst))
compact([0, 1, False, 2, '', 3, 'a', 's', 34]) # [ 1, 2, 3, 'a', 's', 34 ]

9. Count by

This snippet can be used to transpose a 2D array.

array = [['a', 'b'], ['c', 'd'], ['e', 'f']]
transposed = zip(*array)
print(transposed) # [('a', 'c', 'e'), ('b', 'd', 'f')]

10. Chained comparison

You can do multiple comparisons with all kinds of operators in a single line.

a = 3
print( 2 < a < 8) # True
print(1 == a < 2) # False

11. Comma-separated

This snippet can be used to turn a list of strings into a single string with each element from the list separated by commas.

hobbies = ["basketball", "football", "swimming"]

print("My hobbies are:") # My hobbies are:
print(", ".join(hobbies)) # basketball, football, swimming

12. Get vowels

This method gets vowels (‘a’, ‘e’, ‘i’, ‘o’, ‘u’) found in a string.

def get_vowels(string):
    return [each for each in string if each in 'aeiou'] 

get_vowels('foobar') # ['o', 'o', 'a']
get_vowels('gym') # []

13. Decapitalize

This method can be used to turn the first letter of the given string into lowercase.

def decapitalize(str):
    return str[:1].lower() + str[1:]
decapitalize('FooBar') # 'fooBar'
decapitalize('FooBar') # 'fooBar'

14. Flatten

The following methods flatten a potentially deep list using recursion.

def spread(arg):
    ret = []
    for i in arg:
        if isinstance(i, list):
    return ret

def deep_flatten(xs):
    flat_list = []
    [flat_list.extend(deep_flatten(x)) for x in xs] if isinstance(xs, list) else flat_list.append(xs)
    return flat_list

deep_flatten([1, [2], [[3], 4], 5]) # [1,2,3,4,5]

15. Difference

This method finds the difference between two iterables by keeping only the values that are in the first one.

def difference(a, b):
    set_a = set(a)
    set_b = set(b)
    comparison = set_a.difference(set_b)
    return list(comparison)

difference([1,2,3], [1,2,4]) # [3]

16. Difference by

The following method returns the difference between two lists after applying a given function to each element of both lists.

def difference_by(a, b, fn):
    b = set(map(fn, b))
    return [item for item in a if fn(item) not in b]

from math import floor
difference_by([2.1, 1.2], [2.3, 3.4], floor) # [1.2]
difference_by([{ 'x': 2 }, { 'x': 1 }], [{ 'x': 1 }], lambda v : v['x']) # [ { x: 2 } ]

17. Chained function call

You can call multiple functions inside a single line.

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

a, b = 4, 5
print((subtract if a > b else add)(a, b)) # 9   

18. Has duplicates

The following method checks whether a list has duplicate values by using the fact that set() contains only unique elements.

def has_duplicates(lst):
    return len(lst) != len(set(lst))
x = [1,2,3,4,5,5]
y = [1,2,3,4,5]
has_duplicates(x) # True
has_duplicates(y) # False

19. Merge two dictionaries

The following method can be used to merge two dictionaries.

def merge_two_dicts(a, b):
    c = a.copy()   # make a copy of a 
    c.update(b)    # modify keys and values of a with the ones from b
    return c

a = { 'x': 1, 'y': 2}
b = { 'y': 3, 'z': 4}
print(merge_two_dicts(a, b)) # {'y': 3, 'x': 1, 'z': 4}

In Python 3.5 and above, you can also do it like the following:

def merge_dictionaries(a, b)
   return {**a, **b}

a = { 'x': 1, 'y': 2}
b = { 'y': 3, 'z': 4}
print(merge_dictionaries(a, b)) # {'y': 3, 'x': 1, 'z': 4}

20. Convert two lists into a dictionary

The following method can be used to convert two lists into a dictionary.

def to_dictionary(keys, values):
    return dict(zip(keys, values))

keys = ["a", "b", "c"]    
values = [2, 3, 4]
print(to_dictionary(keys, values)) # {'a': 2, 'c': 4, 'b': 3}

21. Use enumerate

This snippet shows that you can use enumerate to get both the values and the indexes of lists.

list = ["a", "b", "c", "d"]
for index, element in enumerate(list): 
    print("Value", element, "Index ", index, )
# ('Value', 'a', 'Index ', 0)
# ('Value', 'b', 'Index ', 1)
#('Value', 'c', 'Index ', 2)
# ('Value', 'd', 'Index ', 3)   

22. Time spent

This snippet can be used to calculate the time it takes to execute a particular code.

import time

start_time = time.time()

a = 1
b = 2
c = a + b
print(c) #3

end_time = time.time()
total_time = end_time - start_time
print("Time: ", total_time)

# ('Time: ', 1.1205673217773438e-05)

23. Try else

You can have an else clause as part of a try/except block, which is executed if no exception is thrown.

except TypeError:
    print("An exception was raised")
    print("Thank God, no exceptions were raised.")

#Thank God, no exceptions were raised.

24. Most frequent

This method returns the most frequent element that appears in a list.

def most_frequent(list):
    return max(set(list), key = list.count)

numbers = [1,2,1,2,3,2,1,4,2]

25. Palindrome

This method checks whether a given string is a palindrome.

def palindrome(a):
    return a == a[::-1]

palindrome('mom') # True

26. Calculator without if-else

The following snippet shows how you can write a simple calculator without the need to use if-else conditions.

import operator
action = {
    "+": operator.add,
    "-": operator.sub,
    "/": operator.truediv,
    "*": operator.mul,
    "**": pow
print(action['-'](50, 25)) # 25

27. Shuffle

This snippet can be used to randomize the order of the elements in a list. Note that shuffle works in place, and returns None.

from random import shuffle

foo = [1, 2, 3, 4]
print(foo) # [1, 4, 3, 2] , foo = [1, 2, 3, 4]

28. Spread

This method flattens a list similarly like [].concat(…arr) in JavaScript.

def spread(arg):
    ret = []
    for i in arg:
        if isinstance(i, list):
    return ret

spread([1,2,3,[4,5,6],[7],8,9]) # [1,2,3,4,5,6,7,8,9]

29. Swap values

A really quick way for swapping two variables without having to use an additional one.

a, b = -1, 14
a, b = b, a

print(a) # 14
print(b) # -1

30. Get default value for missing keys

This snippet shows how you can get a default value in case a key you are looking for is not included in the dictionary.

d = {'a': 1, 'b': 2}

print(d.get('c', 3)) # 3

How to building Python Data Science Container using Docker

How to building Python Data Science Container using Docker

Artificial Intelligence(AI) and Machine Learning(ML) are literally on fire these days. Powering a wide spectrum of use-cases ranging from self-driving cars to drug discovery and to God knows what. AI and ML have a bright and thriving future ahead of them.

On the other hand, Docker revolutionized the computing world through the introduction of ephemeral lightweight containers. Containers basically package all the software required to run inside an image(a bunch of read-only layers) with a COW(Copy On Write) layer to persist the data.

Python Data Science Packages

Our Python data science container makes use of the following super cool python packages:

  1. NumPy: NumPy or Numeric Python supports large, multi-dimensional arrays and matrices. It provides fast precompiled functions for mathematical and numerical routines. In addition, NumPy optimizes Python programming with powerful data structures for efficient computation of multi-dimensional arrays and matrices.

  2. SciPy: SciPy provides useful functions for regression, minimization, Fourier-transformation, and many more. Based on NumPy, SciPy extends its capabilities. SciPy’s main data structure is again a multidimensional array, implemented by Numpy. The package contains tools that help with solving linear algebra, probability theory, integral calculus, and many more tasks.

  3. Pandas: Pandas offer versatile and powerful tools for manipulating data structures and performing extensive data analysis. It works well with incomplete, unstructured, and unordered real-world data — and comes with tools for shaping, aggregating, analyzing, and visualizing datasets.

  4. SciKit-Learn: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. It is one of the best-known machine-learning libraries for python. The Scikit-learn package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. The primary emphasis is upon ease of use, performance, documentation, and API consistency. With minimal dependencies and easy distribution under the simplified BSD license, SciKit-Learn is widely used in academic and commercial settings. Scikit-learn exposes a concise and consistent interface to the common machine learning algorithms, making it simple to bring ML into production systems.

  5. Matplotlib: Matplotlib is a Python 2D plotting library, capable of producing publication quality figures in a wide variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shell, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

  6. NLTK: NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

Building the Data Science Container

Python is fast becoming the go-to language for data scientists and for this reason we are going to use Python as the language of choice for building our data science container.

The Base Alpine Linux Image

Alpine Linux is a tiny Linux distribution designed for power users who appreciate security, simplicity and resource efficiency.

As claimed by Alpine:

Small. Simple. Secure. Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.

The Alpine image is surprisingly tiny with a size of no more than 8MB for containers. With minimal packages installed to reduce the attack surface on the underlying container. This makes Alpine an image of choice for our data science container.

Downloading and Running an Alpine Linux container is as simple as:

$ docker container run --rm alpine:latest cat /etc/os-release

In our, Dockerfile we can simply use the Alpine base image as:

FROM alpine:latest

Talk is cheap let’s build the Dockerfile

Now let’s work our way through the Dockerfile.

The FROM directive is used to set alpine:latest as the base image. Using the WORKDIR directive we set the /var/www as the working directory for our container. The ENV PACKAGES lists the software packages required for our container like git, blas and libgfortran. The python packages for our data science container are defined in the ENV PACKAGES.

We have combined all the commands under a single Dockerfile RUN directive to reduce the number of layers which in turn helps in reducing the resultant image size.

Building and tagging the image

Now that we have our Dockerfile defined, navigate to the folder with the Dockerfile using the terminal and build the image using the following command:

$ docker build -t faizanbashir/python-datascience:2.7 -f Dockerfile .

The -t flag is used to name a tag in the 'name:tag' format. The -f tag is used to define the name of the Dockerfile (Default is 'PATH/Dockerfile').

Running the container

We have successfully built and tagged the docker image, now we can run the container using the following command:

$ docker container run --rm -it faizanbashir/python-datascience:2.7 python

Voila, we are greeted by the sight of a python shell ready to perform all kinds of cool data science stuff.

Python 2.7.15 (default, Aug 16 2018, 14:17:09) [GCC 6.4.0] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>>

Our container comes with Python 2.7, but don’t be sad if you wanna work with Python 3.6. Lo, behold the Dockerfile for Python 3.6:

Build and tag the image like so:

$ docker build -t faizanbashir/python-datascience:3.6 -f Dockerfile .

Run the container like so:

$ docker container run --rm -it faizanbashir/python-datascience:3.6 python

With this, you have a ready to use container for doing all kinds of cool data science stuff.

Serving Puddin’

Figures, you have the time and resources to set up all this stuff. In case you don’t, you can pull the existing images that I have already built and pushed to Docker’s registry Docker Hub using:

# For Python 2.7 pull
$ docker pull faizanbashir/python-datascience:2.7# For Python 3.6 pull
$ docker pull faizanbashir/python-datascience:3.6

After pulling the images you can use the image or extend the same in your Dockerfile file or use it as an image in your docker-compose or stack file.


The world of AI, ML is getting pretty exciting these days and will continue to become even more exciting. Big players are investing heavily in these domains. About time you start to harness the power of data, who knows it might lead to something wonderful.

You can check out the code here.

Go Top Programming Languages in 2020 from Authentic Surveys

Go Top Programming Languages in 2020 from Authentic Surveys

Comparing Programming Languages is a very complex thing and so there are many graphical illustration/jokes trying to symbolize Programming language. I found few and I am starting this article with those.

This is image title

In simple words, Programming Language empowers human to instruct and control machine. So, it is natural that there will be so many languages which try to make this process more powerful and simple. For this very reason there are hundreds of programming languages, many of those programming languages are now out of active use, few are going to be obsolete in coming years and then there are languages which is going to continue and prove its usage in coming years and then there are new programming language fighting for it acceptance.

This article going to present the trends of top Programming Languages which will continue in the coming year 2020. To predict the trend of the programming language in 2020 this article uses data from authentic surveys, various collected statistics, search results and salary trends according to programming languages. This article will help the new learner to pick a programming language to learn and for expert, it will help to decide either to switch to another language or to continue with his expertise language.

In the next section, I have prepared two tables which summarize the popularity trend of Programming Languages in the last five years (2015-19). The data is taken from Stackoverflow popularity survey 2015-2019. For a clear and accurate understanding, the programming languages are divided into two groups, first, languages which have origin before 2000 and the second group has languages which came after 2000. The selection of 2000 as the boundary is just random but very helpful to understand the programming trend under these two groups. The table also lists origin year and main or documented purpose of these programming/scripting languages.

This is image title
This is image title


There is a decrease in the popularity of all languages from 2018 to 2019 except Python.


Python is the only language continuously on rising since last five years. It is a general-purpose language, so someone wants to learn just one programming in 2020 and want to cover more area of software development then Python could be chosen**.**


Java was on rising but fall in 2019, the reason could Kotlin gaining popularity on the Android platform. Java is a good choice for a programming language but now it is under Oracle and Google is promoting Kotlin so it is in the conflicted zone. As a matter of fact still, the large number of the company is using Java and going to continue with Java due to its developers base, framework, and legacy application.


C and C++ are still holding with approx 20% and it will be there due to its inherent features and legacy system.


JavaScript popularity can be attributed to the growth of popular JavaScript library and framework like node.js, etc. JS is the language for the dynamic website and this going to be top for coming years because of its active development, support from Mozilla and penalty of libraries and frameworks. So, if someone wants to be web development, javascript is a must.


R is gaining popularity in recent years and reason would be growth and popularity of data analysis. It is used by data scientist but again much behind in comparison to Python which has established as general-purpose languages and enjoy active developers with lots of data science libraries and modules. So, one can prefer Python over R if they have to choose only one otherwise if wanted carrier in Data Sciences then learning both will a good option.

<div class="5tjAASx1"><script type="bfa163966612c2c2fba9ece2-text/javascript">(adsbygoogle = window.adsbygoogle || []).push({});</script></div>


Like PHP, Ruby is also facing tough competition from JavaScript and even Python to establish as back-end web development programming language. So, again for web development javascript and Python (server-side (Flask, Django, etc.) would be a good choice and will offer more domain flexibility than Ruby.


There is a sharp decline in PHP popularity in 2019 and it can be traced back to server-side acceptance of javascript and Python. So, if someone wants to go to server-side web development then still PHP is a good choice with a large number of popular framework like CakePHP, Codeigniter, etc. otherwise choosing a general-purpose programming language would be better.


Objective-C was the main language for Apple’s software such as macOS, iOS, etc. before Apple moved to Swift language. So this transition is reflected in the popularity of both languages, i.e. there is a fall in popularity for Objective-C and the popularity of Swift is rising. So, again if someone wants to be a developer for Apple’s products then Swift should be the language of choice.

This is image title



Swift has replaced the Objective-C as the main language for Apple-related software and application. Since it is supported and promoted by Apple so there is an increase in popularity since its inception and as Apple is going to continue with it so if someone is looking for Apple-specific development platform then Swift is going to be a must-know programming language. This is mostly vendor and product-specific language with very less usage outside Apple’s eco-system.


Go (Golang) is getting popularity as maintain, use and promoted by Google. The motivation of Go development was to address criticism of some of the popular languages and keeping the best of them in one place. Since 2017, Go is moving upward in popularity and with Google support, it is going to enjoy this in coming years. Google is also making Go as a primary language for new projects and replacing other languages with Go, this trend going to make useful and important to learn in coming years so one can pick Go as a new programming language.


Kotlin is being offered as an alternative of Java for Android development and again it is supported and promoted by Google so it is also picking up by developers and gaining popularity in recent years. So, with the growth of Android, support of Google and with clean and short syntax it is going to be a choice of Android application developers and is a good choice to learn for Android App developer. Kotlin going to be shine as a prominent programming environment for Android development.


Scala tries to establish as an alternative to Java but didn’t get very well among developers. It doesn’t have big support from any multi-national company, perceive as functional languages and dependency on JVM doesn’t provide much scope to rise in popularity. There could be steady growth but very slow and surely not a language to learn as a beginner.


Julia aims to bring the speed of ‘C’ and simplicity of Python but it is strange that didn’t found any popularity in Stackoverflow survey but gaining popularity among data science domain and being seen as a challenger for R and Python in long run. Surely, there will be growth in Julia but still, Python or R is better for job and growth.


C# is the language for the .NET framework and developed by Microsoft. Its popularity is approx constant over past years and going to continue with a similar trend. This is vendor-specific language so one can pick this language if want to work in the Microsoft development environment. Recently, Microsoft has open-sourced the .NET so there would be some upward trend but again it is vendor-specific so there won’t be much affected.


Rust, Clojure, etc. are languages which have a user-base but not so popular so surely not going to have an upward swing in popularity.

A Picture Says a Thousand Words

To understand a clear trend and picture of top programming language growth let keep a picture of it by the various chart. The figure 1 and figure2 gives a very clear picture that in old language stack JavaScript is far ahead than others and credit goes to boom in web development, then C and C++ together competing very closer to Java. Python is moving upward in popularity and only language which popularity is constantly increasing in the last 5 years. New languages are getting popularity and most of them are supported by the multi-national company and bit IT giant like Microsoft, Google and Apple.

This is image title

This is image title

Loved and Wanted Languages

This is image title

This is image title

From above Table and Figure, few observations are very obvious, Love of Rust is growing in last five years whereas Swift is loosing love from developers and Python is in between these two and last two years have gain for Python. One more unique observation is that out of 5 loved languages 4 are from post 2000 group while only Python is the older language and Kotlin love started with addition of Kotlin for Android development post 2017.

This is image title

From above table, wish of developing in javascript and Python is growing in last years and this reflect in popularity and love for the language. There is a sharp decline in Java and this is due to the addition of Kotlin as alternative for Android app development and also change of policy by Oracle who own Java now.

This is image title

Technologies and Programming Languages

This is image title

In this figure, one can see that the largest cluster is for Web development and JavaScript and its various framework is dominating the cluster this is USP of JavaScript growth. The second-largest cluster is of Microsoft technologies and Python technologies which again clear the popularity and love for the language. Python cluster is linked with data science technologies which highlight the growth story of Python.


TIOBE index rank programming language based on search engine search result. The selection of search engines and programming language is defined in its page. The ratings are calculated by counting hits of the most popular search engines. The search query that is used is +”<language> programming”. In TIOBE index Java is dominating the ranking in the last two decades where C is holding the 1st and 2nd rank for the last 30 years. Python has come a long way in the last two decades i.e. 24th in 1999 to 3rd in 2019. If someone merges the C and C++ then it would hold the 1st positions forever.

This is image title

In the new languages (post-2000), Rust moved up in ranking i.e. from 33rd to 28th, and Julia from 50th to 39th. It is also interesting to note that Kotlin doesn’t seem to come closer to the top 20.

Popularity of Programming Language (PYPL) Index

The PYPL index is created by analyzing how often language tutorials are searched on Google. The more a language tutorial is searched, the more popular the language is assumed to be. It is a leading indicator. The raw data comes from Google Trends.

Below Figure verifies that the top 3 languages are Python, Java, and JavaScript. C#, PHP, C/C++ also secure top position, this trend is again similar to stack-overflow, and TIOBE index.

This is image title

Above Figure indicates that among new programming Language i.e. post 2000 Kotlin, Go, Rust, and Julia is moving up in the ranking.

This is image title

Job Market and Salary

Salary depends upon the geographical area and demand of the products, a programming language based salary comparison is just a tool to predict or estimate the salary trend. We have summarized salary based on programming language from popular survey i.e. Dice salary survey 2018 and Stack-overflow survey 2018 and 2019.

This is image title

From the above table, it is very clear from both survey that Go/Golang is a very high paid job in the market and even stands 1st rank in a high paid job in stack-overflow 2019 survey and Dice Salary Survey 2018.

Language Predictability

So, as closing remarks, It is easy to predict a language trend but choosing only one language to learn is a really difficult choice and totally depend upon the individual choice and their future plans, for example, if you want to work in Web Development you can’t afford neglecting Javascript, if you want to work with Apple’s products you can’t neglect Swift now, if your taste is in system-level programming then C and C++ is your friend, Python makes you run faster in many domains and currently darling in Data science. You see each language takes you on a different journey. Choose your destination and then drive with the language of that path.

You may also like: Programming Languages - Trend Predictions in 2020 and Next Years.

We’ll love to know your opinion. What is the Best Programming Language for you?

Thank for reading! If you enjoyed this article, please share it with others who may enjoy it as well.!

18 Python scripts that help you write code faster

18 Python scripts that help you write code faster

Python is a no-BS programming language. Readability and simplicity of design are two of the biggest reasons for its immense popularity.

This is why it is worthwhile to remember some common Python tricks to help improve your code design. These will save you the trouble of surfing Stack Overflow every time you need to do something.
The following tricks will prove handy in your day-to-day coding exercises.

1. Finding Unique Elements in a String

The following snippet can be used to find all the unique elements in a string. We use the property that all elements in a set are unique.

my_string = "aavvccccddddeee"

# converting the string to a set
temp_set = set(my_string)

# stitching set into a string using join
new_string = ''.join(temp_set)


2. Using rhe Title Case (First Letter Caps)

The following snippet can be used to convert a string to title case. This is done using the title() method of the string class.

my_string = "my name is chaitanya baweja"

# using the title() function of string class
new_string = my_string.title()


# Output
# My Name Is Chaitanya Baweja

3. Reversing a String

The following snippet reverses a string using the Python slicing operation.

# Reversing a string using slicing

my_string = "ABCDE"
reversed_string = my_string[::-1]


# Output

You can read more about this here.

4. Printing a String or a List n Times

You can use multiplication (*) with strings or lists. This allows us to multiply them as many times as we like.

n = 3 # number of repetitions

my_string = "abcd"
my_list = [1,2,3]

# abcdabcdabcd

# [1,2,3,1,2,3,1,2,3]

An interesting use case of this could be to define a list with a constant value — let’s say zero.

n = 4
my_list = [0]*n # n denotes the length of the required list
# [0, 0, 0, 0]

5. Combining a List of Strings Into a Single String

The join() method combines a list of strings passed as an argument into a single string. In our case, we separate them using the comma separator.

list_of_strings = ['My', 'name', 'is', 'Chaitanya', 'Baweja']

# Using join with the comma separator

# Output
# My,name,is,Chaitanya,Baweja

6. Swap Values Between Two Variables

Python makes it quite simple to swap values between two variables without using another variable.

a = 1
b = 2

a, b = b, a

print(a) # 2
print(b) # 1

7. Split a String Into a List of Substrings

We can split a string into a list of substrings using the .split() method in the string class. You can also pass as an argument the separator on which you wish to split.

string_1 = "My name is Chaitanya Baweja"
string_2 = "sample/ string 2"

# default separator ' '
# ['My', 'name', 'is', 'Chaitanya', 'Baweja']

# defining separator as '/'
# ['sample', ' string 2']

8. List Comprehension

List comprehension provides us with an elegant way of creating lists based on other lists.
The following snippet creates a new list by multiplying each element of the old list by two.

# Multiplying each element in a list by 2

original_list = [1,2,3,4]

new_list = [2*x for x in original_list]

# [2,4,6,8]

You can read more about it here.

9. Check If a Given String Is a Palindrome or Not

We have already discussed how to reverse a string. So palindromes become a straightforward program in Python.

my_string = "abcba"

if my_string == my_string[::-1]:
    print("not palindrome")

# Output
# palindrome

10. Using Enumerate to Get Index/Value Pairs

The following script uses enumerate to iterate through values in a list along with their indices.

my_list = ['a', 'b', 'c', 'd', 'e']

for index, value in enumerate(my_list):
    print('{0}: {1}'.format(index, value))

# 0: a
# 1: b
# 2: c
# 3: d
# 4: e

11. Find Whether Two Strings are Anagrams

An interesting application of the Counter class is to find anagrams.

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase.

If the Counter objects of two strings are equal, then they are anagrams.

from collections import Counter

str_1, str_2, str_3 = "acbde", "abced", "abcda"
cnt_1, cnt_2, cnt_3  = Counter(str_1), Counter(str_2), Counter(str_3)

if cnt_1 == cnt_2:
    print('1 and 2 anagram')
if cnt_1 == cnt_3:
    print('1 and 3 anagram')

12. Using the try-except-else Block

Error handling in Python can be done easily using the try/except block. Adding an else statement to this block might be useful. It’s run when there is no exception raised in the try block.
If you need to run something irrespective of exception, use finally.

a, b = 1,0

    # exception raised when b is 0
except ZeroDivisionError:
    print("division by zero")
    print("no exceptions raised")
    print("Run this always")

13. Frequency of Elements in a List

There are multiple ways of doing this, but my favorite is using the Python Counter class.

Python counter keeps track of the frequency of each element in the container. Counter() returns a dictionary with elements as keys and frequency as values.

We also use the most_common() function to get themost_frequentelement in the list.

# finding frequency of each element in a list
from collections import Counter

my_list = ['a','a','b','b','b','c','d','d','d','d','d']
count = Counter(my_list) # defining a counter object

print(count) # Of all elements
# Counter({'d': 5, 'b': 3, 'a': 2, 'c': 1})

print(count['b']) # of individual element
# 3

print(count.most_common(1)) # most frequent element
# [('d', 5)]

14. Check the Memory Usage of an Object

The following script can be used to check the memory usage of an object. Read more about it here.

import sys

num = 21


# In Python 2, 24
# In Python 3, 28

15. Sampling From a List

The following snippet generates n number of random samples from a given list using the random library.

import random

my_list = ['a', 'b', 'c', 'd', 'e']
num_samples = 2

samples = random.sample(my_list,num_samples)
# [ 'a', 'e'] this will have any 2 random values

I have been recommended the secrets library for generating random samples for cryptography purposes. The following snippet will work
only on Python 3.

import secrets                              # imports secure module.
secure_random = secrets.SystemRandom()      # creates a secure random object.

my_list = ['a','b','c','d','e']
num_samples = 2

samples = secure_random.sample(my_list, num_samples)

# [ 'e', 'd'] this will have any 2 random values

16. Time Taken to Execute a Piece of Code

The following snippet uses the time library to calculate the time taken to execute a piece of code.

import time

start_time = time.time()
# Code to check follows
a, b = 1,2
c = a+ b
# Code to check ends
end_time = time.time()
time_taken_in_micro = (end_time- start_time)*(10**6)

print(" Time taken in micro_seconds: {0} ms").format(time_taken_in_micro)

17. Flattening a List of Lists

Sometimes you’re not sure about the nesting depth of your list, and you simply want all the elements in a single flat list.
Here’s how you can get that:

from iteration_utilities import deepflatten

# if you only have one depth nested_list, use this
def flatten(l):
  return [item for sublist in l for item in sublist]

l = [[1,2,3],[3]]
# [1, 2, 3, 3]

# if you don't know how deep the list is nested
l = [[1,2,3],[4,[5],[6,7]],[8,[9,[10]]]]

print(list(deepflatten(l, depth=3)))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Numpy flatten is a better way to do this if you have a properly formatted array.

18. Merging Two Dictionaries

While in Python 2, we used the update() method to merge two dictionaries; Python 3.5 made the process even simpler.
In the script given below, two dictionaries are merged. Values from the second dictionary are used in case of intersections.

dict_1 = {'apple': 9, 'banana': 6}
dict_2 = {'banana': 4, 'orange': 8}

combined_dict = {**dict_1, **dict_2}

# Output
# {'apple': 9, 'banana': 4, 'orange': 8}

Thanks for reading !