Feature Engineering — deep dive into Encoding and Binning techniques

Illustration of feature encoding and feature binning techniques.

Feature engineering is one of the most important aspects of developing a data science model. A raw dataset can contain several kinds of features: text, date/time, categorical, and continuous variables. Before an ML algorithm can be trained on it, the dataset needs to be converted into numerical vectors.

The objective of this article is to demonstrate feature engineering techniques that transform a continuous feature into a categorical feature and vice versa.

  • Feature Binning: Conversion of a continuous variable to categorical.
  • Feature Encoding: Conversion of a categorical variable to numerical features.

Feature Binning:

Binning, or discretization, transforms a continuous or numerical variable into a categorical feature. Binning a continuous variable introduces non-linearity and can improve the performance of the model. It can also be used to identify missing values or outliers.

There are two types of binning:

  • Unsupervised Binning: Equal width binning, Equal frequency binning
  • Supervised Binning: Entropy-based binning

Unsupervised Binning:

Unsupervised binning transforms a numerical or continuous variable into categorical bins without taking the target class label into account. There are two kinds of unsupervised binning:

1. Equal Width Binning:

This algorithm divides the range of the continuous variable into several bins (categories) that all have the same width.

Notation:
x = number of categories
w = width of a category
max, min = maximum and minimum of the list
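
With this notation, equal width binning can be sketched in a few lines of Python. The sketch below assumes the usual definition of the width, w = (max - min) / x, and assigns each value to the bin whose range contains it, with the maximum placed in the last bin. The function name `equal_width_bins` and the sample `ages` list are hypothetical and used only for illustration.

```python
def equal_width_bins(values, x):
    """Assign each value to one of x equal-width bins (labels 0 .. x-1).

    Returns (labels, edges), where edges holds the x + 1 bin boundaries.
    """
    if x <= 0:
        raise ValueError("x (number of categories) must be positive")

    lo, hi = min(values), max(values)
    w = (hi - lo) / x  # assumed width formula: w = (max - min) / x

    if w == 0:
        # All values are identical, so everything falls into a single bin.
        return [0] * len(values), [lo, hi]

    # Bin i covers [lo + i*w, lo + (i+1)*w); the last bin also includes max.
    edges = [lo + i * w for i in range(x + 1)]
    labels = [min(int((v - lo) / w), x - 1) for v in values]
    return labels, edges


# Hypothetical sample data: min = 18, max = 78, x = 3, so w = 20.
ages = [18, 22, 25, 31, 40, 47, 52, 60, 78]
labels, edges = equal_width_bins(ages, 3)
print(edges)
print(labels)
```

Because the bin edges depend only on the minimum and maximum, a single extreme value stretches every bin, which can leave some bins nearly empty. That sensitivity is one reason equal frequency binning is often considered as an alternative.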
