Machine learning is advancing by leaps and bounds, and new libraries are joining the data science arsenal at a similar pace. Today a single task can often be performed with more than one library and in more than one way. Among this plethora of new libraries, a few stand out for their ease of use and out-of-the-box implementations. In this article, I will cover five such libraries that can speed up traditional machine learning workflows and thereby lower the barrier to entry.
The dabl library was created by Andreas Mueller, one of the core developers and maintainers of the scikit-learn machine learning library. The idea behind dabl is to make supervised machine learning more accessible to beginners and to reduce boilerplate for common tasks. Dabl takes inspiration from scikit-learn and auto-sklearn. The library is under active development and is therefore not yet recommended for production use. Refer to the official website for more information and examples.
## Installing the library
!pip install dabl
Dabl can be used for automated preprocessing of a dataset, quick EDA, and initial model building as part of a typical machine learning pipeline. Let’s demo some of these use cases with the help of the Titanic dataset. We’ll start by importing the library and the dataset.
#import the basic libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#importing dabl
import dabl
#import the dataset
titanic_df = pd.read_csv('../input/titanic/train.csv')
titanic_df.info()
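To make the preprocessing step mentioned earlier a little more concrete, here is a minimal sketch built on `dabl.clean` and `dabl.detect_types`. It runs on a small hand-made frame, `toy_df`, which is a hypothetical stand-in for a few Titanic columns rather than the real dataset, and the helper `quick_look` is likewise just an illustrative name. The dabl import is guarded so the snippet still runs where dabl is not installed.

```python
import pandas as pd

try:
    import dabl
except ImportError:  # assumption: dabl may be missing; the sketch then skips it
    dabl = None

# Hypothetical toy data that loosely mimics a few Titanic columns.
toy_df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1] * 5,
    "Pclass": [3, 1, 3, 1, 3, 3, 1, 2] * 5,
    "Sex": ["male", "female", "female", "female",
            "male", "male", "male", "female"] * 5,
    "Age": [22.0, 38.0, 26.0, 35.0, None, 54.0, 2.0, 27.0] * 5,
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 21.07, 11.13] * 5,
})


def quick_look(df):
    """Clean the frame with dabl and report the column types it detects.

    Returns None when dabl is not installed.
    """
    if dabl is None:
        return None
    cleaned = dabl.clean(df, verbose=0)   # automated preprocessing
    types = dabl.detect_types(cleaned)    # one row per column, one flag per type
    return cleaned, types


result = quick_look(toy_df)
if result is not None:
    cleaned_df, detected_types = result
    print(detected_types)
```

With the real `titanic_df`, the same two calls would apply unchanged. For the EDA and modelling steps, dabl's documentation describes `dabl.plot(df, target_col=...)` and `dabl.SimpleClassifier().fit(df, target_col=...)`, where the target column for the Kaggle Titanic file would be `Survived`.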