In this article, I am going to explain how to use Pandas in Python. Pandas is one of the most popular Python modules for data manipulation and analysis. It provides an easy interface for working with data and applying transformations to it on the go. The module is released under the BSD license and is free to use. You can download it from the project website or install it with Python's package manager, pip.

Pandas provides a range of data analysis options: reading data from files and databases, applying transformations within data frames, slicing and dicing the data, and then writing the results back to a database or preparing them for a visualization tool. Pandas can also visualize data within the Python environment with the help of another module, Matplotlib. However, for the scope of this article, we will stick to Pandas itself. As per the definition provided by Wikipedia, “The name Pandas is derived from the term ‘panel data’, an econometrics term for data sets that include observations over multiple time periods for the same individuals”. Over the last few years, this module has been gaining popularity, as the question trends on Stack Overflow show.

Figure 1 – Pandas popularity from Stack Overflow

As the graph above shows, the use of Pandas has grown sharply in recent years, and it is now one of the most commonly used modules in the data science community.
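To make the workflow described earlier more concrete (read, transform, slice, write back), here is a minimal sketch. The sample data, the column names, and the `summarize_revenue` helper are made up for illustration; only the Pandas calls themselves (`read_csv`, boolean filtering, `groupby`, `to_csv`) are real API. An in-memory string stands in for a file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = """city,year,units,price
Pune,2019,10,2.5
Pune,2020,12,2.5
Delhi,2019,20,3.0
Delhi,2020,25,3.0
"""


def summarize_revenue(csv_text: str, year: int) -> pd.DataFrame:
    """Read CSV text, derive a revenue column, and total it per city for one year."""
    # Read: read_csv accepts a file path or any file-like object.
    df = pd.read_csv(io.StringIO(csv_text))

    # Transform: add a derived column computed from two existing ones.
    df["revenue"] = df["units"] * df["price"]

    # Slice: keep only the rows for the requested year using a boolean mask.
    selected = df[df["year"] == year]

    # Aggregate: total revenue per city (groupby sorts the cities by name).
    return selected.groupby("city", as_index=False)["revenue"].sum()


summary = summarize_revenue(CSV_TEXT, 2020)

# Write: with no path given, to_csv returns the CSV text as a string.
csv_out = summary.to_csv(index=False)
print(csv_out)
```

In a real project you would likely pass a file path to `read_csv` and `to_csv` instead of in-memory text; the rest of the steps stay the same.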

#python #pandas #dataframes
