This article compares the performance of the well-known pandas library with pypolars, a rising DataFrame library written in Rust. See how they compare.
pandas was first released in 2008 and is written in Python, Cython, and C. Today, we’re comparing its performance with pypolars, a rising DataFrame library written in Rust. We compare the two on three tasks: sorting and concatenating a dataset of roughly 25 million records, and joining two CSV files.
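Comparisons like this come down to timing the same operation in each library. Below is a minimal sketch of a timing helper, assuming best-of-N wall-clock time is an acceptable measure. The name `time_call` is hypothetical and not part of pandas or pypolars, and the built-in `sorted` stands in for a DataFrame operation so the sketch runs without any third-party library.

```python
import time


def time_call(fn, *args, repeat=3, **kwargs):
    """Call fn(*args, **kwargs) `repeat` times.

    Returns (best_seconds, last_result). Taking the best run is one
    common way to reduce noise from other processes on the machine.
    """
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


# Stand-in workload: sort a small list instead of a DataFrame.
seconds, ordered = time_call(sorted, [3, 1, 2])
print(f"best of 3 runs: {seconds:.6f} s")
```

The same helper could wrap a DataFrame method, for example `time_call(df.sort_values, by=some_column)`, where `df` and `some_column` are placeholders for your own data.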
Let’s first download a CSV file from Kaggle that contains ~26 million Reddit usernames: https://www.kaggle.com/colinmorris/reddit-usernames
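Before benchmarking, it can help to confirm that the download is intact. The sketch below uses only Python's standard `csv` module to read the header, a few sample rows, and the total row count without loading the whole file into memory. The helper name `peek_csv` is hypothetical, and the sketch assumes a UTF-8 file with a header row.

```python
import csv


def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows, total number of data rows)."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no lines
        preview = []
        total = 0
        for row in reader:
            if total < n_rows:
                preview.append(row)
            total += 1
    return header, preview, total
```

Calling `peek_csv(path)` with the path to the extracted Kaggle file should report a row count in the region of 26 million if the download completed. The exact file name inside the archive is not given here, so substitute whatever name the archive contains.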
Let’s also create a second CSV file that we will use later. You can create it with your favorite text editor or from the command line: