Clearly explained: What, why and how of feature scaling

Why Normalization?

You might be surprised at the choice of the cover image for this post but this is how we can understand Normalization! This mighty concept helps us when we have data that has a variety of features having different measurement scales and thus leaving us in a lurch when we try to derive insights from such data or try to fit a model on such data.

Much like we can’t compare the different fruits shown in the above picture on a common scale, we can’t work efficiently with data that has too many scales.

For example: See the image below and observe the scales of salary Vs Work experience Vs Band level. Due to the higher scale range of the attribute Salary, it can take precedence over the other two attributes while training the model, despite whether or not it actually holds more weight in predicting the dependent variable.

Image for post

Thus, in the data pre-processing stage of data mining and model development (Statistical or Machine learning), it’s a good practice to normalize all the variables to bring them down to a similar scale — If they are of different ranges.

Normalization is not required for every dataset, you have to sift through it and make sure if your data requires it and only then continue to incorporate this step in your procedure. Also, you should apply Normalization if you are not very sure if the data distribution is Gaussian/ Normal/ bell-curve in nature. Normalization will help in reducing the impact of non-gaussian attributes on your model.

What is Normalization?

We’ll talk about two case scenarios here:

1. Your data doesn’t follow Normal/ Gaussian distribution (Prefer this in case of doubt also)

Data normalization, in this case, is the process of rescaling one or more attributes to the range of 0 to 1. This means that the largest value for each attribute is 1 and the smallest value is 0.

#data-science #machine-learning #statistics #data analysis

What is GEEK

Buddha Community

Clearly explained: What, why and how of feature scaling

Feature Scaling. Why we go for Feature Scaling ?

What is Feature Scaling ?
Feature Scaling is done on the dataset to bring all the different types of data to a Single Format. Done on Independent Variable.
Some Algorithm, uses Euclideam Distance to calculate the target. If the data varies in Magnitude and Units, Distance between the Independent Variables will be more. SO,bring the data in such a way that Independent variables looks same and does not vary much in terms of magnitude.

#standardscalar #scaling-pandas #minmaxscalar #feature-scaling

Nat  Kutch

Nat Kutch


Feature Transformation and Scaling Techniques


  1. Understand the requirement of feature transformation and training techniques
  2. Get to know different feature transformation and scaling techniques including-
  • MinMax Scaler
  • Standard Scaler
  • Power Transformer Scaler
  • Unit Vector Scaler/Normalizer


In my machine learning journey, more often than not, I have found that feature preprocessing is a more effective technique in improving my evaluation metric than any other step, like choosing a model algorithm, hyperparameter tuning, etc.

Feature preprocessing is one of the most crucial steps in building a Machine learning model. Too few features and your model won’t have much to learn from. Too many features and we might be feeding unnecessary information to the model. Not only this, but the values in each of the features need to be considered as well.

We know that there are some set rules of dealing with categorical data, as in, encoding them in different ways. However, a large chunk of the process involves dealing with continuous variables. There are various methods of dealing with continuous variables. Some of them include converting them to a normal distribution or converting them to categorical variables, etc.

Image for post

There are a couple of go-to techniques I always use regardless of the model I am using, or whether it is a classification task or regression task, or even an unsupervised learning model. These techniques are:

  • Feature Transformation and
  • Feature Scaling.

_To get started with Data Science and Machine Learning, check out our course — _Applied Machine Learning — Beginner to Professional

Table of Contents

  1. Why do we need Feature Transformation and Scaling?
  2. MinMax Scaler
  3. Standard Scaler
  4. MaxAbsScaler
  5. Robust Scaler
  6. Quantile Transformer Scaler
  7. Log Transformation
  8. Power Transformer Scaler
  9. Unit Vector Scaler/Normalizer

Why do we need Feature Transformation and Scaling?

Oftentimes, we have datasets in which different columns have different units — like one column can be in kilograms, while another column can be in centimeters. Furthermore, we can have columns like income which can range from 20,000 to 100,000, and even more; while an age column which can range from 0 to 100(at the most). Thus, Income is about 1,000 times larger than age.

But how can we be sure that the model treats both these variables equally? When we feed these features to the model as is, there is every chance that the income will influence the result more due to its larger value. But this doesn’t necessarily mean it is more important as a predictor. So, to give importance to both Age, and Income, we need feature scaling.

In most examples of machine learning models, you would have observed either the Standard Scaler or MinMax Scaler. However, the powerful sklearn library offers many other scaling techniques and feature transformations as well, which we can leverage depending on the data we are dealing with. So, what are you waiting for?

Let us explore them one by one with Python code.

We will work with a simple dataframe:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
%matplotlib inline
df = pd.DataFrame({ 'Income': [15000, 1800, 120000, 10000], 
'Age': [25, 18, 42, 51], 
'Department': ['HR','Legal','Marketing','Management'] })

Before directly applying any feature transformation or scaling technique, we need to remember the categorical column: Department and first deal with it. This is because we cannot scale non-numeric values.

For that, we 1st create a copy of our dataframe and store the numerical feature names in a list, and their values as well:

df_scaled = df.copy() col_names = ['Income', 'Age']
features = df_scaled[col_names]

We will execute this snippet before using a new scaler every time.

MinMax Scaler

The MinMax scaler is one of the simplest scalers to understand. It just scales all the data between 0 and 1. The formula for calculating the scaled value is-

x_scaled = (x — x_min)/(x_max — x_min)

Thus, a point to note is that it does so for every feature separately. Though (0, 1) is the default range, we can define our range of max and min values as well. How to implement the MinMax scaler?

1 — We will first need to import it

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

2 — Apply it on only the values of the features:

df_scaled[col_names] = scaler.fit_transform(features.values)

How do the scaled values look like?

Image for post

Image for post

You can see how the values were scaled. The minimum value among the columns became 0, and the maximum value was changed to 1, with other values in between. However, suppose we don’t want the income or age to have values like 0. Let us take the range to be (5, 10)

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(5, 10))

df_scaled[col_names] = scaler.fit_transform(features.values)

This is what the output looks like:

Image for post

Image for post

Amazing, right? The min-max scaler lets you set the range in which you want the variables to be.

Standard Scaler

Just like the MinMax Scaler, the Standard Scaler is another popular scaler that is very easy to understand and implement.

For each feature, the Standard Scaler scales the values such that the mean is 0 and the standard deviation is 1(or the variance).

x_scaled = x — mean/std_dev

However, Standard Scaler assumes that the distribution of the variable is normal. Thus, in case, the variables are not normally distributed, we

  1. either choose a different scaler
  2. or first, convert the variables to a normal distribution and then apply this scaler

Implementing the standard scaler is much similar to implementing a min-max scaler. Just like before, we will first import StandardScaler and then use it to transform our variable.

#feature-engineering #feature-scaling #scikit-learn #deep learning

James Clooney

James Clooney


Architect Scale Rulers with Your LOGO |

These architect scales, and rulers are used in construction and engineering. Your company logo will look great on these tapes when you give them to your clients or employees. Not only architects use these tapes. Painting estimators, HVAC, electrical, structural steel, construction and carpentry estimators call them a daily necessity. We have a variety of styles and price ranges, to suit your individual needs.

#architect scale rulers #architect scale #architect 6" pocket scale #hollow triangular architect 12" scale

Loma  Baumbach

Loma Baumbach


Getting Started With Feature Flags


As any developer can tell you, deploying any code carries technical risk. Software might crash or bugs might emerge. Deploying features carries additional user-related risk. Users might hate the new features or run into account management issues. With traditional deployments, all of this risk is absorbed at once.

Feature flags give developers the ability to separate these risks, dealing with one at a time. They can put the new code into production, see how that goes, and then turn on the features later once it’s clear the code is working as expected.

What is a Feature Flag?

Simply put, a feature flag is a way to change a piece of software’s functionality without changing and re-deploying its code. Feature flags involve creating a powerful “if statement” surrounding some chunk of functionality in software (pockets of source code).

The History of Feature Flags

Leading Web 2.0 companies with platforms and services that must maintain performance among high traffic levels led the way in regard to developing and popularizing new deployment techniques. Facebook, in particular, is known as a pioneer of feature flags and for releasing massive amounts of code at scale. While building its massive social network more than a decade ago, the company realized that its uptime and scale requirements could not be met with traditional site maintenance approaches. (A message saying the site was down while they deployed version 3.0 was not going to cut it).

Instead, Facebook just quietly rolled out a never-ending stream of updates without fanfare. Day to day, the site changed in subtle ways, adding and refining functionality. At the time, this was a mean feat of engineering. Other tech titans such as Uber and Netflix developed similar deployment capabilities as well.

The feature flag was philosophically fundamental to this development and set the standard for modern deployment maturity used by leading organizations everywhere today. Recently, feature flags have been used in tandem with continuous delivery (CD) tools to help forward-looking organizations bring features, rather than releases, to market more quickly.

#devops #continuous integration #ci/cd #continous delivery #feature flags #flags #feature branching #feature delivery

Explaining the Explainable AI: A 2-Stage Approach

As artificial intelligence (AI) models, especially those using deep learning, have gained prominence over the last eight or so years [8], they are now significantly impacting society, ranging from loan decisions to self-driving cars. Inherently though, a majority of these models are opaque, and hence following their recommendations blindly in human critical applications can raise issues such as fairness, safety, reliability, along with many others. This has led to the emergence of a subfield in AI called explainable AI (XAI) [7]. XAI is primarily concerned with understanding or interpreting the decisions made by these opaque or black-box models so that one can appropriate trust, and in some cases, have even better performance through human-machine collaboration [5].

While there are multiple views on what XAI is [12] and how explainability can be formalized [4, 6], it is still unclear as to what XAI truly is and why it is hard to formalize mathematically. The reason for this lack of clarity is that not only must the model and/or data be considered but also the final consumer of the explanation. Most XAI methods [11, 9, 3], given this intermingled view, try to meet all these requirements at the same time. For example, many methods try to identify a sparse set of features that replicate the decision of the model. The sparsity is a proxy for the consumer’s mental model. An important question asks whether we can disentangle the steps that XAI methods are trying to accomplish? This may help us better understand the truly challenging parts as well as the simpler parts of XAI, not to mention it may motivate different types of methods.

Two-Stages of XAI

We conjecture that the XAI process can be broadly disentangled into two parts, as depicted in Figure 1. The first part is uncovering what is truly happening in the model that we want to understand, while the second part is about conveying that information to the user in a consumable way. The first part is relatively easy to formalize as it mainly deals with analyzing how well a simple proxy model might generalize either locally or globally with respect to (w.r.t.) data that is generated using the black-box model. Rather than having generalization guarantees w.r.t. the underlying distribution, we now want them w.r.t. the (conditional) output distribution of the model. Once we have some way of figuring out what is truly important, a second step is to communicate this information. This second part is much less clear as we do not have an objective way of characterizing an individual’s mind. This part, we believe, is what makes explainability as a whole so challenging to formalize. A mainstay for a lot of XAI research over the last year or so has been to conduct user studies to evaluate new XAI methods.

#overviews #ai #explainability #explainable ai #xai