Introduction

Pandas is an open-source Python library for data analysis. It is designed for efficient and intuitive handling and processing of structured data.

The two main data structures in Pandas are Series and DataFrame. Series are essentially one-dimensional labeled arrays of any type of data, while DataFrames are two-dimensional labeled arrays with potentially heterogeneous data types. Heterogeneous means that the columns don't all need to share one data type: one column can hold strings while another holds integers.
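As a minimal sketch of the difference between the two structures, the snippet below builds one of each. The variable names and sample values are made up for illustration, and the dictionary-based constructor is just one of several ways to build a DataFrame:

```python
import pandas as pd

# A Series: one-dimensional, with a label (index entry) for every value
heat = pd.Series([50, 5000, 500000], index=["bell", "espelette", "habanero"])

# A DataFrame: two-dimensional, and each column can have its own data type
peppers = pd.DataFrame({
    "name": ["Bell pepper", "Espelette pepper"],
    "scoville": [50, 5000],
})

print(heat["espelette"])  # look a value up by its label
print(peppers.dtypes)     # one data type per column
```

Printing `dtypes` shows that the text column and the numeric column are stored with different types, which is what "heterogeneous" refers to.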

In this article, we will go through the most common ways of creating DataFrames and the methods used to change their structure.

We’ll be using Jupyter Notebook since it renders DataFrames as nicely formatted tables. Any other IDE will also do the job; you'll just need to call print() on the DataFrame object.

Creating DataFrames

Whenever you create a DataFrame, whether manually or from a data source such as a file, the data has to be ordered in a tabular fashion, as a sequence of rows containing data.

This implies that the rows share the same order of fields, i.e. if you want to have a DataFrame with information about a person’s name and age, you want to make sure that all your rows hold the information in the same way.

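A discrepancy will leave the DataFrame faulty, though not always with a visible error: Pandas may silently fill gaps with missing values or mix different kinds of data in one column, while in other cases, such as columns of unequal length, it raises an error outright.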
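As a rough sketch of what such discrepancies look like in practice, the snippet below feeds Pandas three kinds of inconsistent input. The names and ages are made-up sample data. Note that some inconsistencies are accepted silently, while others raise an exception:

```python
import pandas as pd

# Rows of unequal length: no exception, the gap becomes a missing value
ragged = pd.DataFrame([["Alice", 30], ["Bob"]])
print(ragged)

# Rows with the fields in a different order: no exception either,
# but each column now mixes names and ages
swapped = pd.DataFrame([["Alice", 30], [25, "Bob"]])
print(swapped)

# Columns of unequal length (dictionary input): Pandas raises a ValueError
try:
    pd.DataFrame({"name": ["Alice", "Bob"], "age": [30]})
    error_raised = False
except ValueError as err:
    error_raised = True
    print("ValueError:", err)
```

The silent cases are arguably the more dangerous ones, since the faulty data only surfaces later, when you try to compute with it.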

Creating an Empty DataFrame

Creating an empty DataFrame is as simple as:

import pandas as pd
dataFrame1 = pd.DataFrame()
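If you want to confirm that a DataFrame really is empty, one option is to inspect its `empty` and `shape` attributes. This is a small sketch; the variable name `empty_df` is only illustrative:

```python
import pandas as pd

empty_df = pd.DataFrame()

print(empty_df.empty)  # True, since the DataFrame holds no data
print(empty_df.shape)  # (0, 0), i.e. zero rows and zero columns
```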

Later on, we'll take a look at how you can add rows and columns to this empty DataFrame while manipulating its structure.

Creating a DataFrame From Lists

Following the “sequence of rows with the same order of fields” principle, you can create a DataFrame from a list that contains such a sequence, or from multiple lists zip()-ed together in such a way that they provide a sequence like that:

import pandas as pd

listPepper = [
    [50, "Bell pepper", "Not even spicy"],
    [5000, "Espelette pepper", "Uncomfortable"],
    [500000, "Chocolate habanero", "Practically ate pepper spray"],
]

dataFrame1 = pd.DataFrame(listPepper)

dataFrame1
## If you aren't using Jupyter, you'll have to call `print()`
## print(dataFrame1) 
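The zip() approach mentioned above can be sketched as follows. It assumes the same pepper data is stored as three separate lists, one per field; the list names and `dataFrame2` are made up for this example:

```python
import pandas as pd

scoville = [50, 5000, 500000]
names = ["Bell pepper", "Espelette pepper", "Chocolate habanero"]
feelings = ["Not even spicy", "Uncomfortable", "Practically ate pepper spray"]

# zip() groups the i-th element of each list into a tuple,
# which gives one row per pepper
rows = list(zip(scoville, names, feelings))

dataFrame2 = pd.DataFrame(rows)
print(dataFrame2)
```

Since no column names are passed, Pandas labels the columns 0, 1 and 2, just as it does for the list-of-lists version.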


Creating and Manipulating DataFrames in Python with Pandas