Many events in our daily life follow the binomial distribution, which describes the number of successes in a fixed sequence of independent Bernoulli experiments. For example, assuming that the probability of James Harden making a shot is constant and each shot is independent, the number of field goals he makes in a given number of attempts follows the binomial distribution.

If we want to find the relationship between the success probability (p) of a binomially distributed variable Y and a list of independent variables _x_s, the binomial regression model is among our top choices.

The link function is the major difference between a binomial regression and a linear regression model. Specifically, the linear regression model uses p directly as the response variable.
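Returning to the shooting example at the start of this section, here is a minimal Python sketch of the binomial distribution itself. The number of attempts and the per-shot probability are made-up values for illustration, not real statistics.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical numbers: 20 shot attempts, 45% chance of making each shot.
n_shots, p_make = 20, 0.45
pmf = [binomial_pmf(k, n_shots, p_make) for k in range(n_shots + 1)]

# The probabilities over all possible counts (0..n) sum to 1.
total = sum(pmf)
# The expected number of made shots equals n * p.
mean = sum(k * prob for k, prob in enumerate(pmf))

print(f"total probability = {total:.6f}, expected makes = {mean:.2f}")
```

The only inputs are the number of trials and the success probability p, which is why a regression model for binomial data focuses on p.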

Figure: Linear regression

The problem with linear regression is that its response value is not bounded, whereas a probability p must lie between 0 and 1. Binomial regression instead uses a link function (l) of p as the response variable.

Figure: Binomial regression with a link function
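The contrast between the two forms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients `beta0` and `beta1` are hypothetical, there is a single predictor x, and the logit is used as the example link.

```python
import math

# Hypothetical coefficients for a single predictor x (illustration only).
beta0, beta1 = 0.2, 0.15


def linear_p(x: float) -> float:
    """Linear regression form: p is set equal to the linear predictor."""
    return beta0 + beta1 * x


def logit_link_p(x: float) -> float:
    """Binomial regression form with a logit link: logit(p) = beta0 + beta1 * x."""
    eta = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-eta))  # inverse of the logit


xs = [-20, -5, 0, 5, 20]
linear_values = [linear_p(x) for x in xs]
link_values = [logit_link_p(x) for x in xs]

for x, lin, lnk in zip(xs, linear_values, link_values):
    print(f"x={x:>4}: linear p={lin:6.2f}, logit-link p={lnk:.4f}")
```

With these values the linear form produces "probabilities" below 0 and above 1 at the extremes, while the link-based form always stays strictly inside the unit interval.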

The link function maps p, which lies strictly between 0 and 1, onto the whole real line, so its inverse maps the linear combination of _x_s back to a value that is between 0 and 1 but never reaches 0 or 1. Based on such criteria, there are mainly three common choices:

Figure: Binomial regression link functions
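Assuming the three choices are the logit, probit, and complementary log-log links (the ones most commonly offered for binomial models), a minimal Python sketch of each link and its inverse might look like the following. The function names are illustrative and use only the standard library.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def probit(p: float) -> float:
    return _std_normal.inv_cdf(p)


def cloglog(p: float) -> float:
    return math.log(-math.log(1 - p))


def inv_logit(eta: float) -> float:
    return 1 / (1 + math.exp(-eta))


def inv_probit(eta: float) -> float:
    return _std_normal.cdf(eta)


def inv_cloglog(eta: float) -> float:
    return 1 - math.exp(-math.exp(eta))


links = {
    "logit": (logit, inv_logit),
    "probit": (probit, inv_probit),
    "cloglog": (cloglog, inv_cloglog),
}

# Each link sends p in (0, 1) to the real line; its inverse sends it back.
round_trip_errors = {}
for name, (link, inverse) in links.items():
    for p in (0.1, 0.5, 0.9):
        eta = link(p)
        round_trip_errors[(name, p)] = abs(inverse(eta) - p)
        print(f"{name:>8}: p={p:.1f} -> eta={eta:+.4f} -> p={inverse(eta):.4f}")
```

The logit and probit links are symmetric around p = 0.5, where both equal 0, while the complementary log-log link is not.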

When the link function is the logit function, the binomial regression becomes the well-known _logistic regression_. As one of the first classifiers introduced in data science books, logistic regression has undoubtedly become the spokesperson of binomial regression models. There are mainly three reasons for that.

1\. Applicable to more general cases.
2\. Easy interpretation.
3\. Works in retrospective studies.

Let’s go over them in detail.


Why Is Logistic Regression the Spokesperson of Binomial Regression Models?