Alternative Python libraries for Data Science

Machine learning is progressing by leaps and bounds, and new libraries are joining the data science arsenal at the same pace. Today a single task can be performed with more than one library and in more than one way. Among this plethora of new libraries, a few stand out for their ease of use and out-of-the-box implementations. In this article, I will cover five such libraries that can speed up traditional machine learning workflows and thereby lower the barrier to entry.

1. Dabl (Data Analysis Baseline Library)

Dabl was created by Andreas Mueller, one of the core developers and maintainers of the scikit-learn machine learning library. The idea behind dabl is to make supervised machine learning more accessible to beginners and to reduce boilerplate for common tasks. Dabl takes inspiration from scikit-learn and auto-sklearn. The library is under active development and hence isn’t recommended for production use. Refer to the official website for more info and examples.

Installation

## Installing the library
!pip install dabl

Usage

Dabl can be used for automated preprocessing of a dataset, quick EDA, and initial model building as part of a typical machine learning pipeline. Let’s demo some of these use cases with the help of the Titanic dataset. We’ll start by importing the library and loading the dataset.

# Import the basic libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Import dabl
import dabl

# Load the Titanic training set (path as on Kaggle) and inspect its columns
titanic_df = pd.read_csv('../input/titanic/train.csv')
titanic_df.info()
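
The three uses mentioned above (preprocessing, quick EDA, and an initial model) could be strung together roughly as follows. This is a minimal sketch rather than the article's own walkthrough: the helper names make_toy_passengers and quick_baseline are hypothetical, and the synthetic table is made up so the snippet can be tried without the Kaggle file. Only dabl.clean, dabl.plot and dabl.SimpleClassifier are dabl calls, and since the library is still changing, their behaviour may differ between versions.

```python
import numpy as np
import pandas as pd


def make_toy_passengers(n_rows=200, seed=0):
    """Build a small synthetic, Titanic-like table (made-up data, not the real set)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "Survived": rng.integers(0, 2, size=n_rows),
        "Pclass": rng.integers(1, 4, size=n_rows),
        "Sex": rng.choice(["male", "female"], size=n_rows),
        "Age": rng.uniform(1, 80, size=n_rows).round(1),
        "Fare": rng.uniform(5, 250, size=n_rows).round(2),
    })


def quick_baseline(df, target_col="Survived"):
    """Run dabl's clean, plot and SimpleClassifier steps on a DataFrame."""
    import dabl  # imported lazily so the toy-data helper works without dabl installed

    # 1. Automated preprocessing: detect column types and tidy the data
    clean_df = dabl.clean(df, verbose=0)

    # 2. Quick EDA: plot the features against the target column
    dabl.plot(clean_df, target_col=target_col)

    # 3. Initial model building: fit a set of simple baseline models
    model = dabl.SimpleClassifier(random_state=0).fit(clean_df, target_col=target_col)
    return clean_df, model
```

With the real data, quick_baseline(titanic_df) would run the same three steps on the DataFrame loaded above. In a notebook, the plots from step 2 appear inline.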




towardsdatascience.com
