This article compares the performance of the well-known pandas library with pypolars, a rising DataFrame library written in Rust. See how they compare.

pandas was first released in 2008 and is written in Python, Cython, and C. Here we compare its performance with that of pypolars, a rising DataFrame library written in Rust, on three tasks: sorting and concatenating a dataset of about 25 million records, and joining two CSV files.

Downloading Reddit Usernames data

Let’s first download a CSV file from Kaggle that contains ~26 million Reddit usernames: https://www.kaggle.com/colinmorris/reddit-usernames
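Once the file is downloaded, the pandas side of the sorting and concatenation comparison could be timed roughly as follows. This is only a minimal sketch, not the benchmark used in this article: the helper names `timed` and `sort_and_concat` are hypothetical, and the column names `author` and `n` are assumptions about the layout of the Kaggle file. A tiny in-memory CSV stands in for the real data so the snippet runs on its own.

```python
import io
import time

import pandas as pd


def timed(label, func, *args, **kwargs):
    """Run func, print the elapsed wall-clock time, and return its result."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result


def sort_and_concat(source):
    """Load a usernames CSV, sort it by username, and stack it on itself."""
    df = timed("read_csv", pd.read_csv, source)
    # Sort by the username column (assumed to be called "author").
    sorted_df = timed("sort_values", df.sort_values, "author")
    # Concatenate the frame with itself to double the row count.
    doubled = timed("concat", pd.concat, [df, df], ignore_index=True)
    return sorted_df, doubled


# Tiny stand-in for the real file; the column names are assumptions.
sample = io.StringIO("author,n\nzoe,3\nadam,10\nmia,1\n")
sorted_df, doubled = sort_and_concat(sample)
```

To time the real data, pass the path of the downloaded CSV to `sort_and_concat` in place of the in-memory sample.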

Let’s also create a second CSV file to use in the join. You can create it with your favorite text editor or from the command line:


A Rising Library Beating Pandas in Performance