When Pandas is not enough, use PandaSQL

Non-Equi Joins with Pandas and PandaSQL

Pandas is one of the best data manipulation libraries in recent times. It lets you slice and dice, group by, join, and do any arbitrary data transformation. You can take a look at this post, which covers most data manipulation cases in a straightforward, simple, matter-of-fact way using Pandas.

But as awesome as pandas generally is, there are moments when you would like to have just a bit more. Say you come from a SQL background, where the same operation was very easy. Or you want more readable code. Or you just want to run an ad-hoc SQL query on your data frame. Or maybe you come from R and want a replacement for sqldf.

For example, one operation that Pandas has no direct equivalent for is the non-equi join, which is quite trivial in SQL.

In this series of posts named Python Shorts, I will explain some simple but very useful constructs provided by Python, some essential tips, and some use cases I come across regularly in my Data Science work.

This post is essentially about using SQL with pandas DataFrames.


But, what are non-equi joins, and why would I need them?

Let’s say you have to join two data frames. The first shows the periods in which we offer promotions on some items. The second is our transaction DataFrame. I want to know the sales that were driven by promotions, i.e., the sales that happen for an item during its promotion period.

We can do this by joining on the item column and adding a join condition (TransactionDt ≥ StartDt and TransactionDt ≤ EndDt). Since our join conditions now include greater-than and less-than comparisons in addition to equality, such joins are called non-equi joins. Do think about how you would do such a thing in Pandas before moving on.
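To make the join condition concrete, here is a minimal sketch under a few assumptions. The sample rows, the `sales` column, and the variable names are made up for illustration. Only `item`, `StartDt`, `EndDt`, and `TransactionDt` come from the description above. Dates are kept as ISO-formatted strings so that the same comparison works in both Pandas and SQL. The SQL part runs through Python's built-in `sqlite3` module as a stand-in for PandaSQL, so treat it as an approximation of the idea rather than the PandaSQL API itself.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data; only the date and item column names come from the text.
promotions = pd.DataFrame({
    "item": ["A", "B"],
    "StartDt": ["2020-01-01", "2020-02-01"],
    "EndDt": ["2020-01-10", "2020-02-10"],
})
transactions = pd.DataFrame({
    "item": ["A", "A", "B", "B"],
    "TransactionDt": ["2020-01-05", "2020-01-20", "2020-02-01", "2020-03-01"],
    "sales": [100, 200, 300, 400],
})

# Pandas workaround: equi-join on item first, then filter on the date range.
merged = transactions.merge(promotions, on="item")
in_promo = merged[
    (merged["TransactionDt"] >= merged["StartDt"])
    & (merged["TransactionDt"] <= merged["EndDt"])
]
pandas_result = in_promo[["item", "TransactionDt", "sales"]].reset_index(drop=True)

# SQL version: the range condition sits directly in the ON clause.
query = """
SELECT t.item, t.TransactionDt, t.sales
FROM transactions AS t
JOIN promotions AS p
  ON t.item = p.item
 AND t.TransactionDt >= p.StartDt
 AND t.TransactionDt <= p.EndDt
ORDER BY t.item, t.TransactionDt
"""

conn = sqlite3.connect(":memory:")
try:
    # Copy both data frames into an in-memory SQLite database and run the query.
    promotions.to_sql("promotions", conn, index=False)
    transactions.to_sql("transactions", conn, index=False)
    sql_result = pd.read_sql_query(query, conn)
finally:
    conn.close()

print(pandas_result)
print(sql_result)
```

Both approaches keep only the transactions that fall inside a promotion window for the same item. The Pandas version first builds every item-level match and filters afterwards, so the intermediate frame can grow large when items have many promotions and transactions. The SQL version states the whole condition in one place.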



