1596247980

When we start studying probability and statistics, a few topics require a logical leap and often leave us confused. In my earlier post, I talked about one such topic, the confidence interval. In this post, I will try to explain another confusing topic, the P-value. (Spoiler alert: no, it is not the probability that a hypothesis is true, but it is related to probability.)

After this post, you will understand the correct interpretation of the P-value and how it leads us to reject, or fail to reject, the null hypothesis in a hypothesis test. Any mathematical equation or concept becomes beautiful if we learn how to read it in simple English, and that is exactly what I am trying to do here.

I would assume that you are aware of the basics of hypothesis testing and conditional probability.

Conditional probability: P(it will rain today | sky is grey) = 0.4 is read as "the probability that it will rain today, given that the sky is grey, is 0.4". It means there is a 40% chance of rain today if the sky has turned grey.
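To make the notation concrete, here is a minimal sketch of how such a conditional probability could be estimated from counts. The numbers (50 grey-sky days, 20 of them rainy) are hypothetical and chosen only so that the result matches the 0.4 in the example; the function name is also just illustrative.

```python
def conditional_probability(count_a_and_b: int, count_b: int) -> float:
    """Estimate P(A | B) as (times A and B both happened) / (times B happened)."""
    if count_b <= 0:
        raise ValueError("B must have been observed at least once")
    return count_a_and_b / count_b


# Hypothetical record: 50 days with a grey sky, and it rained on 20 of them.
grey_days = 50
rainy_grey_days = 20

p_rain_given_grey = conditional_probability(rainy_grey_days, grey_days)
print(f"P(rain | grey sky) = {p_rain_given_grey:.1f}")
```

Note that only the grey-sky days appear in the denominator: conditioning on "sky is grey" means we ignore every other day.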

Imagine you are a data scientist at Expedia. You need to analyze which factors drive new users to become loyal customers. You do some initial exploratory analysis of the data and find that when a new user books through a promotional offer or deal, they tend to return to the website. In other words, new users who use a promotional offer on their first visit are more likely to come back than first-time users who book directly without any deal.

Having studied a little art, you notice that these deals and offers are not eye-catching because they are displayed in blue. You think that redesigning the deals and offers and highlighting them in red might increase the retention rate of first-time users. You recommend this to the business. But rather than simply narrating the idea, you want to show statistically how confident you are in your claim, so you perform the following steps:

- Redesign the promotional content of the website in red color, and show this new version only to half of the new traffic to the website
- The rest of the new users see the old version of the website, i.e., blue color.

Suppose 100,000 new users visit the website in a day: 50,000 see the old version and the other 50,000 see the new version.

In the new version, the number of new users who used the promo and returned after their first use is 7,700.

In the old version, the number of new users who used the promo and returned after their first use is 7,000.

Here, you observe a **10% increase in the number of retained users among those who saw the new version of the website** (7,700 versus 7,000). But before you conclude that changing the promo colors to red can increase retention by 10%, you must **be sure that this increase is not due to random chance**. The increase might be due to the way you sampled, or because the users who happened to see the new version that day did not know of any other travel booking website, or because of some other random reason. To settle this, you run a hypothesis test.
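The arithmetic behind that 10% figure can be checked with a few lines of Python. This is only a sketch: it assumes the retention percentage is computed over all 50,000 visitors of each version, which the example does not state explicitly, and the variable names are illustrative.

```python
visitors_per_version = 50_000   # each version was shown to half of 100,000 users
retained_new = 7_700            # returned after using the promo, red version
retained_old = 7_000            # returned after using the promo, blue version

# Retention rate per version (assumed denominator: all visitors of that version).
rate_new = retained_new / visitors_per_version
rate_old = retained_old / visitors_per_version

# Absolute lift is the gap between the two rates, in percentage points.
absolute_lift = rate_new - rate_old
# Relative lift is the gap measured against the old version's count.
relative_lift = (retained_new - retained_old) / retained_old

print(f"new version retention: {rate_new:.1%}")
print(f"old version retention: {rate_old:.1%}")
print(f"absolute lift: {absolute_lift * 100:.1f} percentage points")
print(f"relative lift: {relative_lift:.1%}")
```

The 10% is a relative increase in retained users; in absolute terms, under the assumed denominator, the retention rate moves by 1.4 percentage points.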

- Choose one test statistic (retention percentage here)
- Formulate hypothesis

**_Null hypothesis (always the status quo):_** There is no difference in the retention percentage of new users between the two versions. Retention percentage of the new version = retention percentage of the old version. The difference observed is just random.

**_Alternative hypothesis:_** There is a difference between the retention percentage of the new version and the retention percentage of the old version.
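One common way to test a pair of hypotheses like this is a two-sided two-proportion z-test. The sketch below is an assumption on my part, not necessarily the test this post goes on to use; it uses only the standard library, takes the 50,000 visitors per version as the denominator, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Under the null hypothesis both versions share one retention rate,
    # so the two groups are pooled to estimate it.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# New (red) version versus old (blue) version, numbers from the example.
z, p_value = two_proportion_z_test(7_700, 50_000, 7_000, 50_000)
print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```

The p-value returned here is a conditional probability: the chance of seeing a gap at least this large between the two versions, given that the null hypothesis is true.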

#hypothesis-testing #data-science #statistics #p-value #machine-learning #deep learning

1617327135

The garrison (armies of militia) of libraries worldwide offers millions of books. For example, the Library of Congress in D.C. has over 162 million books, and the New York Public Library carries around 53 million. So many books, so little time in a human’s life.

A number of people have asked me through several of my channels and conferences — how to find time to read books, and what can be done to read more books each month. Some audiences even feel that 43 machine learning books in a year are insufficient, and want more.

I keep discovering new material every day on top of the antiquated books, which still offer good concepts. To get started, I would suggest disconnecting from Netflix, Amazon Video, and regular TV channels. The more of this you watch, the less time you will find to read books.

In 2020, I managed to read more than 96,120 books, eBooks, research papers, and articles, averaging 267 pieces per day. By comparison, most people might read 10 to 30 machine learning books in a year.

#free machine learning books #garrison platoon of machine learning books #machine learning #machine learning books made free #reading machine learning books

1598891580

Recently, researchers from Google proposed an answer to a very fundamental question in the machine learning community: what is being transferred in transfer learning? They presented various tools and analyses to address this question.

The ability to transfer knowledge from the domain in which a model was trained to another domain where data is usually scarce is one of the desired capabilities for machines. Researchers around the globe have been using transfer learning in various deep learning applications, including object detection, image classification, and medical imaging tasks, among others.

#developers corner #learn transfer learning #machine learning #transfer learning #transfer learning methods #transfer learning resources

1620898103

Check out the 5 latest machine learning technology trends to boost business growth in 2021 using the best digital development tools. It is the right time to improve the user experience by bringing advancement to users' lifestyles.

#machinelearningapps #machinelearningdevelopers #machinelearningexpert #machinelearningexperts #expertmachinelearningservices #topmachinelearningcompanies #machinelearningdevelopmentcompany

Visit Blog- https://www.xplace.com/article/8743

#machine learning companies #top machine learning companies #machine learning development company #expert machine learning services #machine learning experts #machine learning expert

1617331066

Reinforcement learning (RL) is surely a rising field, heavily influenced by the performance of AlphaZero (one of the strongest chess engines as of this writing). RL is a subfield of machine learning that teaches agents to act in an environment so as to maximize rewards over time.

Among RL’s model-free methods is temporal difference (TD) learning, with SARSA and Q-learning (QL) being two of the most used algorithms. I chose to explore SARSA and QL to highlight a subtle difference between on-policy learning and off-policy learning, which we will discuss later in the post.
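As a preview of that difference, here is a minimal tabular sketch of the two standard update rules. It is not the code linked from this post: the learning rate, discount factor, state names, and function names are all illustrative assumptions, and terminal states are not handled.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def sarsa_update(q, state, action, reward, next_state, next_action):
    """On-policy: bootstrap from the action the agent will actually take next."""
    td_target = reward + GAMMA * q[(next_state, next_action)]
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])


def q_learning_update(q, state, action, reward, next_state, actions):
    """Off-policy: bootstrap from the best next action, whatever is taken next."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])


def toy_q_table():
    # Hypothetical Q-table where "right" looks better than "left" in state s1.
    q = defaultdict(float)
    q[("s1", "left")] = 1.0
    q[("s1", "right")] = 2.0
    return q


actions = ["left", "right"]

# Same transition for both: s0 --right--> s1 with reward 1.0.
# Suppose the exploring agent then picks the worse action "left" in s1.
q_sarsa = toy_q_table()
sarsa_update(q_sarsa, "s0", "right", 1.0, "s1", "left")

q_ql = toy_q_table()
q_learning_update(q_ql, "s0", "right", 1.0, "s1", actions)

print(f"SARSA      Q(s0, right) = {q_sarsa[('s0', 'right')]:.2f}")
print(f"Q-learning Q(s0, right) = {q_ql[('s0', 'right')]:.2f}")
```

The only difference between the two functions is the bootstrap term: SARSA uses the value of the next action actually chosen, while Q-learning uses the maximum over the next actions, so the two estimates diverge whenever the agent explores.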

This post assumes you have basic knowledge of the agent, environment, action, and rewards within RL’s scope. A brief introduction can be found here.

The outline of this post includes:

- Temporal difference learning (TD learning)
- Parameters
- QL & SARSA
- Comparison
- Implementation
- Conclusion

We will compare these two algorithms via the CartPole game implementation. **This post’s code can be found here: QL code, SARSA code, and the fully functioning code.** (The fully functioning code has both algorithms implemented and trained on the CartPole game.)

The TD learning will be a bit mathematical, but feel free to skim through and jump directly to QL and SARSA.

#reinforcement-learning #artificial-intelligence #machine-learning #deep-learning #learning