Data scientists are magicians. You give them data, and they turn it into stories. Visuals included.

A crucial part of my journey to becoming a Pythonista for data science is meeting new friends called Python libraries. A library is a collection of prewritten, reusable code (modules, functions, and classes) that can make your work a lot easier. Some libraries are general-purpose, and others serve more specific needs. Each library is installed as a package and then imported into a script so we can call its contents from our program.

Preparing the Story Outline

pandas is a Python library that provides labeled data structures for working with structured (tabular) and time-series data. Its two main structures are the Series (one-dimensional) and the DataFrame (two-dimensional). It is especially handy for the initial analysis of our data.
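To make the two structures concrete, here is a minimal sketch. The names `sales` and `df` and all of the values are made up for illustration; they are not from the dataset used in this article.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
# (Hypothetical numbers, for illustration only.)
sales = pd.Series([120, 95, 143], index=["Mon", "Tue", "Wed"], name="sales")

# A DataFrame is a two-dimensional table; each column is itself a Series.
df = pd.DataFrame(
    {
        "day": ["Mon", "Tue", "Wed"],
        "sales": [120, 95, 143],
        "visitors": [30, 22, 41],
    }
)

print(sales.ndim)         # number of dimensions of the Series
print(df.ndim)            # number of dimensions of the DataFrame
print(type(df["sales"]))  # selecting one column gives back a Series
```

Selecting a single column from a DataFrame returns a Series, which is why the two structures are usually learned together.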

We always import the needed libraries first so we can easily call them later.
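For the libraries named in this article's title, the imports might look like the sketch below. The aliases `pd`, `plt`, and `sns` are community conventions rather than requirements, and this assumes all three packages are already installed in your environment (they come preinstalled in Google Colab).

```python
# Conventional aliases; any alias works, but these are the ones
# you will see in most tutorials and documentation.
import pandas as pd               # data structures and analysis
import matplotlib.pyplot as plt   # plotting interface
import seaborn as sns             # statistical plots built on Matplotlib

# Quick sanity check that the imports worked.
print(pd.__version__)
print(sns.__version__)
```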

As a first step, we can inspect how much data we have and which data types we are dealing with, using a few attributes and methods of the DataFrame object, here named df.
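A first inspection along these lines could look like the sketch below. The small `df` built here is a hypothetical stand-in with made-up values, since the actual dataset is not reproduced in the text; the attributes and methods shown are common choices and not necessarily the exact ones in the original notebook.

```python
import pandas as pd

# Hypothetical stand-in data, for illustration only.
df = pd.DataFrame(
    {
        "title": ["Book A", "Book B", "Book C", "Book D"],
        "pages": [210, 340, 125, 480],
        "rating": [4.2, 3.8, 4.9, 4.0],
    }
)

print(df.shape)       # (number of rows, number of columns)
print(df.dtypes)      # data type of each column
df.info()             # column names, non-null counts, dtypes, memory usage
print(df.head(2))     # first two rows
print(df.describe())  # summary statistics for the numeric columns
```

`shape` and `dtypes` are attributes, so they take no parentheses, while `info()`, `head()`, and `describe()` are methods.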

A screenshot of a cell in Google Colab, showing the code for inspecting data.


Adventures with Python: Storytelling with pandas and Matplotlib (ft. Seaborn)