Correlation measures the strength of the linear relationship between two quantitative variables. It is a statistical measure that is easy to calculate and interpret, and it is generally represented by ‘r’, known as the coefficient of correlation.
Because it is so easy to use, it is also widely misused by professionals, most often by treating correlation as causation. A correlation between two variables does not mean that one depends on the other, and, conversely, the absence of correlation does not mean the variables are unrelated, since r only captures linear relationships. This is where PPS (Predictive Power Score) comes into play.
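To make the definition concrete, here is a minimal sketch of how r can be computed by hand with NumPy, assuming two equal-length numeric sequences. The function name `pearson_r` and the sample values are illustrative only and do not come from the advertising dataset used later.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson r: co-variation of x and y, scaled by the spread of each."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical values, made up for illustration
spend = [10, 20, 30, 40, 50]
sales = [12, 25, 31, 48, 55]

r = pearson_r(spend, sales)
print(round(r, 3))
```

The result should agree with `np.corrcoef(spend, sales)[0, 1]`, which is the usual way to get r in practice.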
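A small synthetic example illustrates the second point: a variable can be fully determined by another and still show a correlation of roughly zero. The data below is invented for illustration (y is simply x squared), and NumPy's `corrcoef` is used to compute r.

```python
import numpy as np

x = np.linspace(-1, 1, 201)  # values symmetric around zero
y = x ** 2                   # y is fully determined by x, but not linearly

r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.4f}")
```

Here r comes out at essentially zero even though knowing x tells us y exactly, which is the kind of relationship a linear measure cannot see.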
Predictive Power Score works similarly to the coefficient of correlation but offers some additional functionality.
In this article, we will explore how we can use the Predictive Power Score to replace correlation.
PPS is available as an open-source Python library, ppscore, so we will install it like any other Python library using pip install ppscore.
We will import ppscore along with pandas to load a dataset that we will work on.
import ppscore as pps
import pandas as pd
We will be using different datasets to explore different functionalities of PPS. We will first load the advertising dataset of an MNC, in which ‘Sales’ is the target variable and ‘TV’, ‘Radio’, etc. are the features.
df = pd.read_csv('advertising.csv')
df.head()
We will use some basic functions defined in ppscore.
The PPS lies between 0 (no predictive power) and 1 (perfect predictive power). In this step, we will compute the PPS between the target variable and one of the features in the dataset.
pps.score(df, "TV", "Sales")  # arguments are (df, x, y): how well the feature "TV" predicts the target "Sales"
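The idea behind the 0 to 1 range can be sketched without the library: compare the error of a model that predicts the target from the feature against a naive baseline that always predicts the median. This is not ppscore's implementation. To our understanding of its documentation, the library fits a cross-validated decision tree, whereas the hypothetical `pps_sketch` below substitutes a simple binned-median predictor and a single train/test split, and the data is synthetic.

```python
import numpy as np

def pps_sketch(x, y, n_bins=10, seed=0):
    """Rough predictive-power estimate of x for a numeric y, in [0, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Hold out a quarter of the rows for evaluation
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    cut = int(0.75 * len(x))
    train, test = idx[:cut], idx[cut:]

    # Baseline: always predict the median of the training target
    naive_pred = np.median(y[train])
    mae_naive = np.abs(y[test] - naive_pred).mean()
    if mae_naive == 0:
        return 0.0  # hold-out target equals the baseline everywhere

    # Stand-in model: bin x by training quantiles, predict the median y per bin
    edges = np.quantile(x[train], np.linspace(0, 1, n_bins + 1)[1:-1])
    train_bins = np.digitize(x[train], edges)
    test_bins = np.digitize(x[test], edges)
    bin_pred = np.full(n_bins, naive_pred)
    for b in range(n_bins):
        in_bin = train_bins == b
        if in_bin.any():
            bin_pred[b] = np.median(y[train][in_bin])
    mae_model = np.abs(y[test] - bin_pred[test_bins]).mean()

    # 1 means perfect prediction; anything worse than the baseline is clipped to 0
    return float(max(0.0, 1.0 - mae_model / mae_naive))

# Synthetic data for illustration only
rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, 2000)
y = x ** 2
noise = rng.normal(size=2000)

score_related = pps_sketch(x, y)        # x predicting a target that depends on it
score_unrelated = pps_sketch(x, noise)  # x predicting unrelated noise
print(score_related, score_unrelated)
```

The first score should land well above zero even though the relationship is non-linear, while the second should stay close to zero because the feature does no better than the median baseline.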