Pandas is an open-source data analysis and manipulation library built on Python. It is generally used for manipulating numerical and time-series data through data structures such as the DataFrame. Pandas is one of the most widely used Python libraries, but it has certain drawbacks: many of its operations become slow on larger datasets, and it can only handle data that fits in memory, which fills up quickly.

To overcome these drawbacks, let us explore Vaex, a high-performance Python library for lazy out-of-core DataFrames that is used to visualize and manipulate big tabular datasets. It can compute statistics and produce visualizations on very large datasets within seconds. Vaex uses lazy computation and memory mapping, so data is read from disk only when it is needed rather than copied into memory up front. This lets it open a dataset with billions of rows in a few seconds.
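To make the idea of memory mapping concrete, here is a minimal sketch that uses only Python's standard library. It is not how Vaex is implemented internally; it only illustrates the general technique of reading one value from a file on disk without loading the whole file. The file name `column.bin` and the helper `read_value` are made up for this illustration.

```python
import mmap
import os
import struct
import tempfile

N_VALUES = 100_000
ITEM = struct.Struct("<d")  # one little-endian float64, 8 bytes

# Write a binary "column" of float64 values 0.0, 1.0, 2.0, ... to disk.
# The path is a throwaway temp file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "column.bin")
with open(path, "wb") as f:
    for i in range(N_VALUES):
        f.write(ITEM.pack(float(i)))


def read_value(path, index):
    """Return the float64 stored at position `index` using a memory map.

    Only the pages that are actually touched get read from disk;
    the file as a whole is never copied into a Python object.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (value,) = ITEM.unpack_from(mm, index * ITEM.size)
            return value


print(read_value(path, 12345))
```

The same principle is what allows a memory-mapped dataset to be "opened" almost instantly: opening only sets up the mapping, and the data is fetched when a computation asks for it.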

In this article we will explore:


  1. How to use Vaex in Python
  2. Visualization using Vaex
  3. Comparing Vaex and Pandas

Implementation of Vaex in Python

Before we start exploring Vaex, we need to install it using pip install vaex.
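Once the install finishes, a quick check like the following can confirm that both libraries are importable. This is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two libraries used in this article.
for lib in ("pandas", "vaex"):
    status = "installed" if is_installed(lib) else "missing"
    print(f"{lib}: {status}")
```

If either library is reported as missing, rerun the pip command in the same environment that runs your Python interpreter.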

  1. Importing libraries

We will import both the pandas and vaex libraries, since we need to compare their performance.



Hands-On Guide to Vaex - Tool to Overcome Drawbacks of Pandas