Conditional Probabilities, Seattle Rain, And Tricky Friends

My least favorite kind of data science interview question is probability. It’s just not something that I think about every day, so my probability muscles always feel rusty whenever I am forced to exercise them. But if you are hunting for a data job, you will inevitably run into one at some point, so let’s keep our probability skills fresh with some practice. As usual, we will use simulation (and Python code) to better visualize what’s going on.


The Question

You’re headed to Seattle. You want to know if you should bring an umbrella so you call 3 random friends who live there and ask each independently whether it’s raining or not. Each friend has a 2/3 chance of telling you the truth and a 1/3 chance of lying (so mean!). All 3 friends tell you “Yes, it’s raining”. What is the probability that it’s actually raining in Seattle?

The first time I saw this problem, I thought, “It can only be dry in Seattle if all 3 of my friends lied to me.” As long as at least one of my friends was not lying, one of the yeses would be true (implying rain).

probability rain = 1 - probability all lying

= 1 - (1/3)^3 ≈ 0.963

But then I thought this seems too simple. How could this question gain such notoriety if it were this simple? Sadly, my instincts were right, and this question is more complicated than it first appears.


It’s A Conditional Probability

My previous approach ignored the given condition. What condition, you ask? The interviewer told us that all 3 of our friends answered yes. That’s relevant information that needs to be factored into our solution.


To see why, imagine that it’s raining in Seattle. Our friends answered [yes, yes, yes] when we asked them whether it was raining. Let’s think through the possible outcomes in this state of the world. If they all told the truth, then that is consistent with the world state that we’re in (because it IS raining in Seattle).

What if they all lied? That’s impossible. In a state of the world where it is raining, our friends can’t have answered “yes” and lied (to lie, they would have had to have replied “no”). Is it possible that just one of them lied? That’s not possible either — all of them said “yes” and to lie in the state of the world where it’s raining requires a “no”. So the only possible result if it IS raining in Seattle is that our friends are all telling the truth.
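The reasoning above covers the world where it is raining. To turn it into a number we also need the world where it is not raining (where three yeses mean three lies) and a prior probability of rain, which the question as quoted here does not give. Here is a minimal sketch under an assumed prior of 1/2; that prior and the helper names `posterior_rain` and `simulate` are mine, purely for illustration.

```python
import random
from fractions import Fraction

P_TRUTH = Fraction(2, 3)  # from the question: each friend tells the truth 2/3 of the time


def posterior_rain(prior_rain, n_friends=3):
    """Exact P(rain | every friend says yes), using Bayes' rule."""
    # If it is raining, every "yes" is a truthful answer.
    yes_given_rain = P_TRUTH ** n_friends
    # If it is not raining, every "yes" is a lie.
    yes_given_dry = (1 - P_TRUTH) ** n_friends
    numerator = prior_rain * yes_given_rain
    return numerator / (numerator + (1 - prior_rain) * yes_given_dry)


def simulate(prior_rain, n_trials=200_000, n_friends=3, seed=0):
    """Monte Carlo estimate of the same conditional probability."""
    rng = random.Random(seed)
    all_yes = 0
    all_yes_and_rain = 0
    for _ in range(n_trials):
        raining = rng.random() < prior_rain
        # A friend says "yes" when truthful in rain, or when lying in dry weather.
        answers = [
            (rng.random() < float(P_TRUTH)) == raining
            for _ in range(n_friends)
        ]
        if all(answers):  # keep only the trials that match the given condition
            all_yes += 1
            all_yes_and_rain += raining
    return all_yes_and_rain / all_yes


# ASSUMPTION: a 50% prior chance of rain; the question does not state one.
print(posterior_rain(Fraction(1, 2)))  # 8/9
print(simulate(0.5))                   # simulated estimate of the same quantity
```

The simulation throws away every trial in which the friends did not all say yes, which is exactly what conditioning on the given information means. Changing the assumed prior changes the answer, so treat 8/9 as the result for this particular assumption only.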

#data-science #statistics #technology #data analysis


Roberta Ward

1596822540

Famous Probability Distribution in Data Science

Data scientists are modern-day statisticians who take on complex business problems and unravel them with the help of data. Probability distributions are like a microscope: they allow a data scientist or data analyst to recognize patterns in variables that otherwise look completely random.


Normal Distribution

A normal distribution is commonly described as a bell-shaped curve. It depicts the frequency of whatever you are measuring, such as class scores. The center of the curve is the mean, and the width of the curve is set by the standard deviation. The wider the curve, the greater the spread. The score that occurs most frequently is the mean, and scores farther from the mean occur less often.

The normal distribution applies to many situations where the variation in a measure has several causes. For example, scores can vary because of differences in study time, IQ, and school quality.

As another example, take some sand in your hand and drop it gradually onto the ground. What do you see? A small mound that resembles a normal distribution. Most of the sand tends to land in the centre, with thinner tails on either side. This tendency to cluster in the centre is called central tendency.

So the main thing to remember is that as the sample size increases, the distribution of the sample mean gets closer to a normal distribution.
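To see the sample-size effect in numbers, here is a small sketch. It assumes the underlying data are uniform on [0, 1), which is my choice rather than the article's, and the helper names `sample_means` and `share_within_one_sd` are hypothetical. For a normal distribution roughly 68% of values lie within one standard deviation of the mean, so that share is a simple check on how bell-shaped the sample means have become.

```python
import random
import statistics


def sample_means(sample_size, n_samples=5_000, seed=0):
    """Means of n_samples samples, each made of sample_size uniform draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


def share_within_one_sd(values):
    """Fraction of values within one standard deviation of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)


# ASSUMPTION: uniform data and these sample sizes are illustrative choices.
for size in (1, 5, 50):
    means = sample_means(size)
    print(size, round(statistics.stdev(means), 3), round(share_within_one_sd(means), 3))
```

With one draw per sample the share should stay near the value for a flat distribution (about 0.58). As the sample size grows, the spread of the means shrinks and the share should move toward 0.68.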

In a normal distribution, the most likely value is in the middle, and you never need to worry about when the events take place.

Exponential Distribution

Exponential random variables are often used to model waiting times between events. For example, one student went to the Help Room with a stopwatch and recorded the times at which students showed up at the centre for help. The distribution of the times between arrivals looked close to an exponential distribution. Another example is the time between successive hits on a website.

The assumption behind the exponential distribution is that events happen independently, at random times, and at a constant average rate. The time between consecutive events then follows an exponential distribution.
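The assumption can be checked with a short simulation. This is a sketch, not a rigorous derivation: time is cut into small steps, an event happens in each step independently with the same small probability, and the gaps between events are compared with what an exponential distribution predicts (mean 1/rate, and P(gap > t) = exp(-rate * t)). The rate of 0.5 events per minute and the function name `simulate_gaps` are assumptions made for illustration.

```python
import math
import random


def simulate_gaps(rate, total_time=20_000.0, dt=0.01, seed=0):
    """Gaps between events that occur independently at a constant average rate.

    Each step of length dt contains an event with probability rate * dt,
    independently of every other step.
    """
    rng = random.Random(seed)
    gaps = []
    last_event = 0.0
    for step in range(1, int(total_time / dt) + 1):
        if rng.random() < rate * dt:
            now = step * dt
            gaps.append(now - last_event)
            last_event = now
    return gaps


RATE = 0.5  # ASSUMPTION: 0.5 events per minute, chosen only for this example
gaps = simulate_gaps(RATE)

mean_gap = sum(gaps) / len(gaps)
share_longer_than_3 = sum(g > 3.0 for g in gaps) / len(gaps)

print(round(mean_gap, 2), "vs", 1 / RATE)
print(round(share_longer_than_3, 3), "vs", round(math.exp(-RATE * 3.0), 3))
```

Both printed pairs should be close to each other, which is what we expect if the gaps follow an exponential distribution.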

#machine-learning #data-science #probability-distributions #probability

Understanding and Choosing the Right Probability Distributions with Examples

Probability Distributions

A probability distribution is a mathematical function that describes how likely each possible value of a random variable is. A probability distribution may be either discrete or continuous. A discrete distribution is one in which the data can only take on certain values, while a continuous distribution is one in which the data can take on any value within a specified range (which may be infinite).

There are a variety of discrete probability distributions. Which one to use depends on the properties of your data. For example, use the:

  • Binomial distribution to calculate probabilities for a process where only one of two possible outcomes may occur on each trial, such as coin tosses.
  • Hypergeometric distribution to find the probability of k successes in n draws without replacement.
  • Poisson distribution to measure the probability that a given number of events will occur during a given time frame, such as the count of library book checkouts per hour.
  • Geometric distribution to determine the probability that a specified number of trials will take place before the first success occurs.
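As a rough illustration of the four distributions in the list, here are their probability mass functions written with the Python standard library. The function names and the example numbers are assumptions of mine. For the geometric case I count the failed trials before the first success; other texts count the trial on which the success occurs.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, population, successes, draws):
    """P(k successes in `draws` draws without replacement)."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


def poisson_pmf(k, rate):
    """P(k events in a time frame with `rate` events expected on average)."""
    return rate**k * exp(-rate) / factorial(k)


def geometric_pmf(failures, p):
    """P(`failures` failed trials happen before the first success)."""
    return (1 - p) ** failures * p


# Illustrative inputs only; none of these numbers come from the article.
print(binomial_pmf(5, 10, 0.5))         # 5 heads in 10 fair coin tosses
print(hypergeometric_pmf(2, 52, 4, 5))  # 2 aces in a 5-card hand
print(poisson_pmf(3, 2.0))              # 3 checkouts in an hour that averages 2
print(geometric_pmf(2, 0.5))            # 2 tails before the first head
```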

Binomial Distribution

The binomial distribution is probably the most widely known of all discrete distributions. It describes the number of successes in a fixed number of independent trials, each of which has only two possible outcomes. One typical example is flipping coins. A toss of a fair coin has only two possible outcomes, heads or tails, and each outcome has the same probability of 1/2. Let’s take a look at when the binomial distribution can be used!
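The coin-flip example can be simulated directly. The sketch below assumes 10 tosses of a fair coin per experiment (my choice of numbers) and compares how often each head count shows up with the binomial formula C(n, k) * (1/2)^n. The name `simulate_heads` is hypothetical.

```python
import random
from collections import Counter
from math import comb


def simulate_heads(n_flips=10, n_experiments=100_000, seed=0):
    """Observed frequency of each possible number of heads in n_flips tosses."""
    rng = random.Random(seed)
    counts = Counter(
        sum(rng.random() < 0.5 for _ in range(n_flips))
        for _ in range(n_experiments)
    )
    return {k: counts[k] / n_experiments for k in range(n_flips + 1)}


observed = simulate_heads()
for k in range(11):
    expected = comb(10, k) * 0.5**10  # binomial probability of k heads
    print(k, round(observed[k], 4), round(expected, 4))
```

The two columns should agree to within sampling noise, with 5 heads the most common result.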

#probability #probability-distributions #statistics #data-science #math

New Python Statistics Course: Conditional Probability

Take your data science and statistics knowledge to the next level with the latest addition to our data science course offerings: Conditional Probability.


In this course, you’ll learn about the basics of conditional probability and then dig into more advanced concepts like Bayes’s Theorem and Naive Bayes algorithms. As you learn, you’ll be using your Python skills to put theory into practice and build a working knowledge of these critical statistics concepts.


What’s Covered in Conditional Probability?

Conditional Probability is an area of probability theory that’s concerned with — as the name suggests — measuring the probability of a particular event occurring based on certain conditions.

In this course, which builds on the Probability Fundamentals course that precedes it, we’ll start with some lessons on foundational concepts like the conditional probability formula, the multiplication rule, statistical dependence and independence, and more.

From there, we’ll look at Bayes’ Theorem and how it can be used to calculate probabilities. We’ll examine prior and posterior probability distributions. Then we’ll dig in and apply some of these statistical concepts by learning about the Naive Bayes algorithm, a common statistical tool employed by data scientists.

Finally, you’ll put all your new knowledge into practice in a new guided project that challenges you to build an SMS spam filter using a data set of over 5,000 messages by employing a Naive Bayes algorithm.

By the end of the course, you’ll feel comfortable assigning probabilities to events based on conditions using the rules of conditional probability. You’ll know when these events have statistical dependence (or not) on other events. You’ll be able to assign probabilities based on prior knowledge using Bayes’s theorem.

And of course you’ll have built a cool SMS spam filter that makes use of a Naive Bayes algorithm (and your Python programming skills)!

#dataquest updates #announcements #course launches #naive-bayes #probability #statistics

Wanda Huel

1602968400

Statistics and Probability: Introduction to Probability

This is the famous Monty Hall Problem. What if I tell you that by learning just the basics of probability, you have a higher chance of winning this contest and going home with a brand new Audi?

Statistics and Probability are subjects that are widely overlooked when it comes to Machine Learning. Many people tend to ignore them because they come across as difficult and perhaps not as cool as Machine Learning. But in order to grasp the core concepts behind some of the most widely used Machine Learning algorithms, it is important to be at least familiar with the basics of Statistics and Probability. The aim of this article is to give you an introduction to Probability and its various types. Along the way, we also need to figure out the Monty Hall problem, so let’s go over a few important things.
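The setup of the Monty Hall problem is not spelled out in this excerpt, so the sketch below assumes the standard version: three doors, a car behind one of them, and a host who always opens a door that is neither your pick nor the car before offering you the chance to switch. The function names `play` and `win_rate` are hypothetical.

```python
import random


def play(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, n_games=100_000, seed=0):
    """Share of games won under a fixed strategy (always or never switch)."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_games)) / n_games


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Under these assumed rules, staying should win about one game in three and switching about two in three.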

Probability

Probability, as the name suggests, is a measure of how likely it is that an event will take place. Also known as Marginal Probability, it is simply a number that reflects the likelihood of an event. It is a number between 0 and 1, and it can also be expressed as a percentage. Let us take it step by step.

Experiment

We will define an Experiment within the context of Probability Theory, the branch of mathematics that deals with probability. An Experiment is a procedure that can be repeated an infinite number of times and has a well-defined set of possible outcomes.

#probability #statistics #machine-learning #data-science #data-analysis