The Complete Guide to Python Openpyxl for Beginners in 2020

In this tutorial, we will learn about the following topics:

Openpyxl Introduction
Openpyxl Working Process
Openpyxl Installation
Openpyxl Write Data to Cell
Openpyxl Reading Excel File
Openpyxl Read Data from Cell
Openpyxl Read Multiple Cells
Openpyxl Sheets
Openpyxl Filter and sort data
Openpyxl Merging cells
Openpyxl Append Values
Openpyxl Iterate by Column
Openpyxl Iterate by rows
Openpyxl Adding chart to Excel file
Openpyxl Adding Image

Python Openpyxl Introduction

Python provides the Openpyxl module, which is used to deal with Excel files without involving third-party Microsoft application software. By using this module, we can have control over excel without open the application. It is used to perform excel tasks such as read data from excel file, or write data to the excel file, draw some charts, accessing excel sheet, renaming sheet, modification (adding and deleting) in excel sheet, formatting, styling in the sheet, and any other task. Openpyxl is very efficient to perform these tasks for you.

Data scientists often use the Openpyxl to perform different operations such as data copying to data mining as well as data analysis.

Openpyxl Working Process

The Openpyxl library is used to write or read the data in the excel file and many other tasks. An excel file that we use for operation is called Workbook that contains a minimum of one Sheet and a maximum of tens of sheets.

Sheets consist of Rows (horizontal series) starting from 1 and Columns (vertical series) starting from A.
Row and column together make a grid and form a cell that may store some data. Data can be of any type, such as numeric, string.
Openpyxl provides flexibility to read data from the individual cell or write data to it.

Installation of Openpyxl

In the above section, we have discussed the openpyxl briefly and its working process. In order to use Openpyxl, one should have Python 3.7 and openpyxl 2.6.2 installed in the system. Let’s start working with openpyxl by installing openpyxl using the following command:

pip install openpyxl

The xlsx is the extension of the XML spreadsheet file. The xlsx file supports macros. Let’s understand the basic operation related to the excel file. Consider the following code:

from openpyxl import Workbook  
import time  
  
wb = Workbook()  
sheet = wb.active  
  
sheet['A1'] = 87  
sheet['A2'] = "Devansh"  
sheet['A3'] = 41.80  
sheet['A4'] = 10  
  
now = time.strftime("%x")  
sheet['A5'] = now  
  
wb.save("sample_file.xlsx")

Output:

Openpyxl Installation

In the above code, we have written data into the five cells A1, A2, A3, A4, and A5. These cells consist of different types of values. We have imported Workbook class from the openpyxl module. A workbook class is a container that contains all parts of the document.

Here we have defined a new workbook. At least one sheet is always made with a workbook.

wb = Workbook()

We get the location of the active sheet.

sheet['A1'] = 87  
sheet['A2'] = 'Devansh'

We have saved all data to the sample_file.xlsx file using the save() method.

Openpyxl Write Data to Cell

We can add data to the excel file using the following Python code. First, we will import the load_workbook function from the openpyxl module, then create the object of the file and pass filepath as an argument. Consider the following code:

from openpyxl import load_workbook  
wb = load_workbook(r'C:\Users\DEVANSH SHARMA\Desktop\demo.xlsx')  
  
sheet = wb.active  
sheet['A1'] = 'Devansh Sharma'  
  
sheet.cell(row=2, column=2).value = 5  
wb.save(r'C:\Users\DEVANSH SHARMA\Desktop\demo.xlsx')

Output:

Openpyxl Write Data to Cell

Openpyxl Append values

Openpyxl provides an append() method, which is used to append the group of values. We can append any type of value. These values are appended at the bottom of the current working sheet. Consider the following code:

from openpyxl import Workbook  
  
wb = Workbook()  
sheet = wb.active  
  
data = (  
    (11, 48, 50),  
    (81, 30, 82),  
    (20, 51, 72),  
    (21, 14, 60),  
    (28, 41, 49),  
    (74, 65, 53),  
    ("Peter", 'Andrew',45.63)  
)  
  
for i in data:  
    sheet.append(i)  
wb.save('appending_values.xlsx')

Output:

Openpyxl Append values

Openpyxl Read Data from cell

We can read the data that we have previously written in the cell. There are two methods to read a cell, firstly we can access it by cell name, and secondly, we can access it by the cell() function. For example, we are reading the data from the sample_file.xlrs file.

import openpyxl  
  
wb = openpyxl.load_workbook('sample_file.xlsx')  
  
sheet = wb.active  
  
x1 = sheet['A1']  
x2 = sheet['A2']  
#using cell() function  
x3 = sheet.cell(row=3, column=1)  
  
print("The first cell value:",x1.value)  
print("The second cell value:",x2.value)  
print("The third cell value:",x3.value)

Output:

The first cell value: 87
The second cell value: Devansh
The third cell value: 41.8

Openpyxl Read multiple cells

We can read the values from the multiple cells. In the following example, we have marks.xlsx named excel file and we will read each cell of file using the range operator. Let’s have a look at the following program:

import openpyxl  
  
wb = openpyxl.load_workbook('marks.xlsx')  
  
sheet = wb.active  
#  
cells = sheet['A1','B7']  
# cells behave like range operator  
for i1,i2 in cells:  
    print("{0:8} {1:8}".format(i1.value,i2.value))

Output:

Student_name        Marks
Tony Stark           47
Loki                 59
Oddin                73
Nick Fury            62
Samaul               75
Peter Parkar         80

Openpyxl Iterate by rows

The openpyxl provides the iter_row() function, which is used to read data corresponding to rows. Consider the following example:

from openpyxl import Workbook  
  
wb = Workbook()  
sheet = wb.active  
  
rows = (  
    (90, 46, 48, 44),  
    (81, 30, 32, 16),  
    (23, 95, 87,27),  
    (65, 12, 89, 53),  
    (42, 81, 40, 44),  
    (34, 51, 76, 42)  
)  
  
for row in rows:  
    sheet.append(row)  
  
for row in sheet.iter_rows(min_row=1, min_col=1, max_row=6, max_col=4):  
    for cell in row:  
        print(cell.value, end=" ")  
    print()  
  
book.save('iter_rows.xlsx')

Output:

90 46 48 44 
81 30 32 16 
23 95 87 27 
65 12 89 53 
42 81 40 44 
34 51 76 42

Openpyxl Iterate by Column

The openpyxl provides iter_col() method which return cells from the worksheet as columns. Consider the following example:

from openpyxl import Workbook  
  
book = Workbook()  
sheet = book.active  
  
rows = (  
     (90, 46, 48, 44),  
     (81, 30, 32, 16),  
     (23, 95, 87, 27),  
     (65, 12, 89, 53),  
     (42, 81, 40, 44),  
     (34, 51, 76, 42)  
)  
for row in rows:  
    sheet.append(row)  
  
for row in sheet.iter_cols(min_row=1, min_col=1, max_row=6, max_col=3):  
    for cell in row:  
        print(cell.value, end=" ")  
    print()  
  
book.save('iterbycols.xlsx')

Output:

90 81 23 65 42 34 
46 30 95 12 81 51 
48 32 87 89 40 76

Openpyxl Sheets

As we know that each workbook can have multiple sheets. First, we need to create more than one sheet in a single workbook then we can access those excel sheets using Python. In the following example, we have created a workbook with three sheets:

import openpyxl  
wb = openpyxl.load_workbook('dimension_1.xlsx')  
  
#Getting list of all sheet available in workbook  
print(wb.get_sheet_names())  
  
# Returning object  
active_sheet = wb.active  
print(type(active_sheet))  
  
# Title of sheet  
sheet = wb.get_sheet_by_name("Monday")  
print(sheet.title)

Output:

['Sheet', 'Sunday', 'Monday', 'Tuesday', 'Wednesday']
<class 'openpyxl.worksheet.worksheet.Worksheet'>
Monday

It will look like the following image.

Openpyxl Sheets

Openpyxl filter and sort data

The auto_filter attribute is used to set filtering and sorting conditions. Consider the following code:

from openpyxl import Workbook  
  
wb = Workbook()  
sheet = wb.active  
  
sheet['A3'] = 40  
sheet['B3'] = 26  
  
row_count = [  
    (93,45),  
    (23,54),  
    (80,43),  
    (21,12),  
    (63,29),  
    (34,15),  
    (80,68),  
    (20,41)  
]  
  
for row in row_count:  
    sheet.append(row)  
  
print(sheet.dimensions)  
  
for a1,a2 in sheet[sheet.dimensions]:  
    print(a1.value, a2.value)  
  
sheet.auto_filter.add_sort_condition('B2:B8')  
sheet.auto_filter.add_filter_column(1, ['40', '26'])  
  
wb.save('dimension_1.xlsx')

Output:

Openpyxl Merging cell

We can merge the cell using the merge_cells() method. When we merge the cells, the top-left one is removed from the worksheet. The openpyxl also provides the unmerged_cells() method to unmerge the cell. Consider the following code:

from openpyxl.styles import Alignment  
  
wb = Workbook()  
sheet = wb.active  
  
sheet.merge_cells('A1:B2')  
  
cell = sheet.cell(row=1, column=1)  
cell.value = 'Devansh Sharma'  
cell.alignment = Alignment(horizontal='center', vertical='center')  
  
wb.save('merging.xlsx')

Output:

Openpyxl Merging cell

The freezing panes are simply mean to freeze an area of worksheet as visible while scrolling to other parts of the worksheet. It is useful feature where we can display the top row or leftmost column on the screen. We can do this by passing the cell name to the freeze_panes variable. To unfreeze all panes, set freeze_panes to ‘None’. Consider the following code:

from openpyxl import Workbook  
from openpyxl.styles import Alignment  
  
wb = Workbook()  
sheet = wb.active  
  
sheet.freeze_panes = 'A1'  
  
wb.save('freez_cells.xlsx')

Output:

Run the above code and scroll the worksheet.

Openpyxl formulas

We can write formula into the cell. These formulas are used to perform operations in excel file. After writing in the cell execute it from the workbook. Consider the following example:

from openpyxl import Workbook  
wb = Workbook()  
sheet = wb.active  
  
rows_count = (  
    (14, 27),  
    (22, 30),  
    (42, 92),  
    (51, 32),  
    (16, 60),  
    (63, 13)  
)  
  
for i in rows_count:  
    sheet.append(i)  
  
cell = sheet.cell(row=7, column=3)  
cell.value = "=SUM(A1:B6)"  
cell.font = cell.font.copy(bold=True)  
  
wb.save('formulas_book.xlsx')

Output:

Openpyxl formulas

Openpyxl Cell Inverter

The openpyxl cell inverter is used to invert the row and column of the cell in the spreadsheet. For example, the value at 3 rows will be inverted into 5 columns and row at row 5 will invert to column 3 (vice versa). You can see in the following images:

Openpyxl Cell Inverter

This program is written with the help of nested for loop. First the data structure writes in the sheetData[x][y] for the cell at column x and row y then the newly created spreadsheet in the spreadData[y][x] for the cell at column y and row x.

Adding Chart to Excel File

Charts are effective way to represent the data. Using the charts, it visualizes the data which can be easily untestable. There are various types of chart: pie chart, line chart, bar chart, and so on. We can draw a chart on a spreadsheet using an openpyxl module.

For building any chart on the spreadsheet, we need to define the chart type like BarChart, LineChart, and so on. We also import reference which represents data that is used for the chart. It is important to define what data we want to represent on the chart. Let’s understand by the following example:

from openpyxl import Workbook  
from openpyxl.chart import BarChart, Reference  
  
wb = Workbook()  
sheet = wb.active  
  
# Let's create some sample student data  
rows = [  
    ["Serial_no", "Roll no", "Marks"],  
    [1, "0090011", 75],  
    [2, "0090012", 60],  
    [3, "0090013", 43],  
    [4, "0090014", 97],  
    [5, "0090015", 63],  
    [6, "0090016", 54],  
    [7, "0090017", 86],  
]  
  
for i in rows:  
    sheet.append(i)  
  
chart = BarChart()  
values = Reference(worksheet=sheet,  
                 min_row=1,  
                 max_row=8,  
                 min_col=2,  
                 max_col=3)  
  
chart.add_data(values, titles_from_data=True)  
sheet.add_chart(chart, "E2")  
  
wb.save("student_chart.xlsx")

Output:

Openpyxl Adding Chart to Excel File

In the above code, we have created the sample data and drawn the bar chart corresponding to sample data.

Now we will create the line chart. Consider the following code:

import random  
from openpyxl import Workbook  
from openpyxl.chart import LineChart, Reference  
  
wb = Workbook()  
spreadsheet = wb.active  
  
# Let's create some sample data  
rows = [  
    ["", "January", "February", "March", "April",  
    "May", "June", "July", "August", "September",  
     "October", "November", "December"],  
    [1, ],  
    [2, ],  
    [3, ],  
]  
  
for row in rows:  
    spreadsheet.append(row)  
  
for row in spreadsheet.iter_rows(min_row=2,  
                           max_row=4,  
                           min_col=2,  
                           max_col=13):  
    for cell in row:  
        cell.value = random.randrange(5, 100)  
  
chart = LineChart()  
data = Reference(worksheet=spreadsheet,  
                 min_row=2,  
                 max_row=4,  
                 min_col=1,  
                 max_col=13)  
  
chart.add_data(data, from_rows=True, titles_from_data=True)  
spreadsheet.add_chart(chart, "C6")  
  
wb.save("line_chart1.xlsx")

Output:

Openpyxl Adding Chart to Excel File

In the above code, we are used from_rows = True as a parameter, it denotes chart plot row by row instead of the column by column.

Adding Image

Images are not generally used in a spreadsheet but sometimes we can use as per our requirement. We can use an image for the branding purposes or to make the spreadsheet more personal and attractive. For loading an image to spreadsheet, we need to install an additional module called pillow by the following command.

pip install pillow

In the following program, we are importing the image into the excel file.

from openpyxl import load_workbook  
from openpyxl.drawing.image import Image  
  
# Let's use the hello_world spreadsheet since it has less data  
workbook = load_workbook(filename="student_chart1.xlsx")  
sheet = workbook.active  
  
logo = Image(r"C:\Users\DEVANSH SHARMA\Pictures\Screenshots\image.png")  
  
# A bit of resizing to not fill the whole spreadsheet with the logo  
logo.height = 150  
logo.width = 150  
  
sheet.add_image(logo, "E2")  
workbook.save(filename="hello_world_logo1.xlsx")

In this tutorial, we have covered all basic and advance concept of openpyxl.

Thank you for reading! Please share if you liked it!

#python #openpyxl