Given the popularity of Python as a language for data analysis, this tutorial focuses on creating graphs using a popular Python library — Matplotlib.
Matplotlib is a huge library, which can be a bit overwhelming for a beginner — even if one is fairly comfortable with Python. While it is easy to generate a plot using a few lines of code, it may be difficult to comprehend what actually goes on in the back-end of this library. This tutorial explains the core concepts of Matplotlib so that one can explore its full potential.
Let’s get started!
The library that we will use in this tutorial to create graphs is Python’s matplotlib. This post assumes you are using version 3.0.3. To install it, run the following pip command in the terminal.
pip install matplotlib==3.0.3
To verify the version of the library that you have installed, run the following commands in the Python interpreter.
>>> import matplotlib
>>> print(matplotlib.__version__)
3.0.3
If you are using Jupyter notebooks, you can display Matplotlib graphs inline using the following magic command.
%matplotlib inline
During the initial phases of its development, MathWorks’ MATLAB influenced John Hunter, the creator of Matplotlib. There is one key difference between the use of commands in MATLAB and Python. In MATLAB, all functions are available at the top level. Similarly, if you imported everything from matplotlib.pylab, functions such as plot() would be available to use at the top level.
This feature was convenient for those who were accustomed to MATLAB. In Python, though, this could potentially create a conflict with other functions.
Therefore, it is good practice to import the pyplot module under its own name.
from matplotlib import pyplot as plt
All functions such as plot() are available within pyplot. After the import above, you can call the same plot() function as plt.plot().
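To illustrate why the namespaced import avoids conflicts, here is a minimal sketch. The helper named plot below is hypothetical and exists only for this example, and the Agg backend line is an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt


def plot(values):
    """Hypothetical user-defined helper that happens to share the name plot."""
    return "my own plot of {} values".format(len(values))


plt.figure()
message = plot([1, 2, 3])     # calls our own function
lines = plt.plot([1, 2, 3])   # calls Matplotlib's function through the plt prefix

print(message)
print(len(lines))             # plt.plot returns a list with one line object
```

Because Matplotlib's function is only reachable as plt.plot(), the two names can coexist in the same script.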
The Matplotlib documentation describes the anatomy of a plot, which is essential in building an understanding of various features of the library.
The major parts of a Matplotlib plot are as follows:
Each element of a plot can be manipulated in Matplotlib, as we will see later.
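To make the idea of plot elements more concrete, the following sketch builds an empty figure and inspects a few of its parts without plotting any data. It assumes the Figure, Axes and Axis objects described in the Matplotlib documentation, and it uses the Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

# plt.subplots() returns a Figure (the whole canvas) and one Axes (the plotting area).
fig, ax = plt.subplots()

# Each element can be changed through the object that owns it.
ax.set_title("Empty axes")
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")

print(type(fig).__name__)                 # the container object
print(len(fig.axes))                      # number of Axes inside the figure
print(ax.xaxis.get_label().get_text())    # the X Axis object holds its own label
```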
Without further delay, let’s create our first plot!
Creating a plot is not a difficult task. First, import the pyplot module. Although it is not required, it is conventionally imported under the shorter name plt. Call the .plot() method with a list of numbers to create a plot. Then, use the .show() method to display the plot.
from matplotlib import pyplot as plt
plt.plot([0,1,2,3,4])
plt.show()
Notice that Matplotlib creates a line plot by default. The numbers provided to the .plot() method are interpreted as the y-values to create the plot. Here is the documentation of the .plot() method for you to further explore.
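When only y-values are given, the x-values default to the positions 0, 1, 2 and so on. The sketch below checks this by reading the data back from the line object that .plot() returns. The sample values are made up, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

plt.figure()
y_values = [0, 1, 4, 9, 16]       # made-up sample data
(line,) = plt.plot(y_values)      # plt.plot returns a list of line objects

# Read back what Matplotlib stored for each axis.
x_data = [float(v) for v in line.get_xdata()]
y_data = [float(v) for v in line.get_ydata()]

print(x_data)   # the index positions of the values
print(y_data)   # the values we passed in
```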
Now that you have successfully created your first plot, let us explore various ways to customize your plots in Matplotlib.
Let us discuss the most popular customizations in your Matplotlib plot. Each of the options discussed here is a method of pyplot that you can invoke to set the parameters.
title: Sets the title of the chart, which is passed as an argument.
ylabel: Sets the label of the Y axis. xlabel can be used to set the label of the X axis.
yticks: Sets which ticks to show on the Y axis. xticks is the corresponding option for showing ticks on the X axis.
legend: Displays the legend on the plot. The loc argument of the .legend() method sets the position of the legend on the graph. The best option for the loc argument lets Matplotlib decide the least intrusive position of the legend on the figure.
Let us use these options in our plot.
plt.plot([0,1,2,3,4], label='y = x')
plt.title('Y = X Straight Line')
plt.ylabel('Y Axis')
plt.yticks([1,2,3,4])
plt.legend(loc = 'best')
plt.show()
Here is the output of the code above. Notice that a title has appeared in the figure, the Y axis is labelled, there are fewer ticks on the Y axis than on the X axis, and a legend is shown in the top left corner.
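The X axis counterparts mentioned above, xlabel and xticks, work the same way. Here is a small sketch that sets them alongside the Y axis options and pins the legend to a fixed position instead of best. The tick choices are arbitrary, and the Agg backend is an assumption so the snippet runs without a display; in an interactive session you would finish with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

plt.figure()
plt.plot([0, 1, 2, 3, 4], label="y = x")
plt.title("Y = X Straight Line")
plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.xticks([0, 2, 4])            # show only every second tick on the X axis
plt.yticks([1, 2, 3, 4])
plt.legend(loc="upper left")     # a fixed position instead of 'best'

# Read the settings back from the current axes to confirm they were applied.
ax = plt.gca()
x_ticks = [float(t) for t in ax.get_xticks()]
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(ax.get_xlabel(), x_ticks, legend_labels)
```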
After tinkering with the basic options of a plot, let’s create multiple plots in the same figure. Let us try to create two straight lines in our plot.
To achieve this, use the .plot() method twice with different data sets. You can set the label for each line plot using the label argument of the .plot() method to make the code shorter.
plt.plot([0,1,2,3,4], label='y = x')
plt.plot([0,2,4,6,8], label='y = 2x')
plt.title('Two Straight Lines')
plt.legend(loc = 'best')
plt.show()
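As a variation on the snippet above, you can pass explicit x-values and a format string to tell the lines apart. This is only a sketch: the format strings 'b-' (solid blue) and 'r--' (dashed red) are arbitrary choices, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

plt.figure()
x = [0, 1, 2, 3, 4]
plt.plot(x, x, "b-", label="y = x")                       # solid blue line
plt.plot(x, [2 * v for v in x], "r--", label="y = 2x")    # dashed red line
plt.title("Two Straight Lines")
plt.legend(loc="best")

# Both lines live on the same axes of the same figure.
ax = plt.gca()
labels = [line.get_label() for line in ax.get_lines()]
print(labels)
```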
Next, let’s try to create a different type of plot. To create a scatter plot of points on the XY plane, use the .scatter() method.
plt.scatter([1,2,3,4], [5,1,4,2])
plt.show()
Here is what the scatter plot looks like.
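The appearance of the points can be adjusted as well. The following sketch uses the s, c and marker arguments of .scatter() to change the size, color and shape of the markers. The specific values are arbitrary, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

plt.figure()
# s sets the marker size, c the color and marker the shape (here, triangles).
points = plt.scatter([1, 2, 3, 4], [5, 1, 4, 2], s=80, c="green", marker="^")

# The returned collection stores one (x, y) pair per point.
offsets = points.get_offsets()
print(offsets.shape)
```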
A number of other plots can be created with Matplotlib. You can use the .hist() method to create a histogram. You can add multiple plots to a figure using the .subplot() method. You can even create a vector path using the path module of matplotlib.
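As a brief sketch of two of these options, the code below places a histogram and a line plot side by side in one figure using .subplot() and .hist(). The data values are made up for illustration, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # made-up sample values

plt.figure()

# subplot(rows, columns, index): a grid of 1 row and 2 columns, first cell.
plt.subplot(1, 2, 1)
counts, bin_edges, _ = plt.hist(data, bins=5)
plt.title("Histogram")

# Second cell of the same grid.
plt.subplot(1, 2, 2)
plt.plot(data)
plt.title("Line plot")

fig = plt.gcf()
print(len(fig.axes))      # one Axes per subplot
print(len(bin_edges))     # 5 bins are delimited by 6 edges
```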
After exploring various options while creating plots with Matplotlib, the next step is to export the plots that you have created. To save a figure as an image, you can use the .savefig() method. The filename with the filepath should be provided as an argument to this method.
plt.savefig('my_figure.png')
While the documentation for savefig lists various arguments, the two most important ones are listed below:
dpi: This argument is used to set the resolution of the resulting image in DPI (dots per inch).
transparent: If set to True, the background of the figure is transparent.
While the code above saves a single figure, you may need to save multiple figures in the same file. Matplotlib allows you to save multiple figures to a single PDF file using the PdfPages class. The steps to create a PDF file with multiple plots are listed below:
1. Import the PdfPages class from matplotlib.backends.backend_pdf and initialize it with the name of the PDF file to create.
2. Create a figure using the .figure() method and create the plot. Once the plot is created, use the .savefig() method of the PdfPages class to save the figure.
3. Once all the figures have been saved, close the PDF file using the .close() method.
To summarize the process, the following code snippet creates a PDF with the two figures that we created above.
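from matplotlib import pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
# Open the PDF file that will collect the pages
pdf = PdfPages('multipage.pdf')
# First page: save the figure, then release it
fig1 = plt.figure()
plt.plot([0,1,2,3,4])
pdf.savefig(fig1)
plt.close(fig1)
# Second page
fig2 = plt.figure()
plt.plot([0,2,4,6,8])
pdf.savefig(fig2)
plt.close(fig2)
# Write the file to disk
pdf.close()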
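The export options from this section can be combined in one short sketch. It saves a PNG with the dpi and transparent arguments, then writes a two-page PDF using PdfPages in a with statement, which closes the file automatically. The temporary directory and the figure size are assumptions made for illustration, as is the Agg backend that lets the snippet run without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
from matplotlib import pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

out_dir = tempfile.mkdtemp()   # hypothetical scratch directory for the output files
png_path = os.path.join(out_dir, "my_figure.png")
pdf_path = os.path.join(out_dir, "multipage.pdf")

# A 4 x 3 inch figure saved at 100 DPI gives a 400 x 300 pixel image.
fig = plt.figure(figsize=(4, 3))
plt.plot([0, 1, 2, 3, 4])
plt.savefig(png_path, dpi=100, transparent=True)
plt.close(fig)

# The with statement closes the PDF file even if an error occurs midway.
with PdfPages(pdf_path) as pdf:
    for y_values in ([0, 1, 2, 3, 4], [0, 2, 4, 6, 8]):
        page = plt.figure()
        plt.plot(y_values)
        pdf.savefig(page)
        plt.close(page)

# Read the PNG back to confirm its pixel dimensions (rows, columns).
image = plt.imread(png_path)
print(image.shape[:2])
```

Raising dpi increases the pixel dimensions of the image without changing the figure size in inches.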
In this tutorial, we created plots in Python with the matplotlib library. We discussed the concepts you need to know to understand how Matplotlib works, and set about creating and customizing real plots. And we showed you how to export your plots for use in real-world scenarios, like reports and presentations.
How do you create plots with Python? Let us know in the comments below.