_Note from the editors: Towards Data Science is a Medium publication primarily based on the study of data science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. To learn more about the coronavirus pandemic, you can click here._

No test detects the novel coronavirus with 100% accuracy. Still, we tend to feel reassured when we are told that a test is 98.5% accurate in detecting COVID-19 infections. But what does this accuracy actually mean?

Before we answer this question, let’s review some basic concepts.

  1. True positive: A person with COVID-19 tests positive for COVID-19
  2. False positive: A person without COVID-19 tests positive for COVID-19
  3. False negative: A person with COVID-19 tests negative for COVID-19
  4. True negative: A person without COVID-19 tests negative for COVID-19

Now let’s review what we mean by accuracy, precision, sensitivity and specificity.

Accuracy = (true positives + true negatives) / all results

Precision = true positives / (true positives + false positives)

Sensitivity = true positives / (true positives + false negatives)

Specificity = true negatives / (true negatives + false positives)


Now suppose there are 200 patients. Let's say 100 patients are infected and 100 are not. Suppose 99 of the 100 infected patients tested positive and 2 healthy patients also tested positive. In other words, 99 people were true positives and 2 people were false positives. 1 infected person was a false negative, and 98 uninfected patients were true negatives. Now let's calculate the quantities we just defined.

Accuracy = (99 + 98) / 200 = 0.985

Precision = 99 / (99 + 2) = 0.98
Sensitivity = 99 / (99 + 1) = 0.99
Specificity = 98 / (98 + 2) = 0.98
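
The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `diagnostic_metrics` and its argument names are illustrative choices, not part of any library.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, sensitivity and specificity from the four counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts from the 200-patient example: 99 true positives, 2 false positives,
# 1 false negative, 98 true negatives.
metrics = diagnostic_metrics(tp=99, fp=2, fn=1, tn=98)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.985
# precision: 0.980
# sensitivity: 0.990
# specificity: 0.980
```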

Looks good so far, right? Now let's look more closely at specificity. It means that out of 100 patients who did not have COVID-19, 98 tested negative. So, given that a person doesn't have the virus, the probability that they will test negative is 0.98.


Now let’s dive deeper! The probability that a person who has COVID-19 tests positive is given by the conditional probability

P(+|C⁺) = sensitivity = 0.99.

Similarly, the probability that a healthy patient tests negative is

P(-|C⁻) = specificity = 0.98.

We have the conditional probabilities, but what use are they? Let's say we pick a random person from the population and test them. They test positive, but what is the probability that they actually have the novel coronavirus? In other words, we want to find P(C⁺|+).

Now according to Bayes’ theorem,

P(A|B) = P(A)P(B|A) / (P(A)P(B|A) + P(not A)P(B|not A))
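
Applied to the test, A is "the person is infected" (C⁺) and B is "the test is positive" (+). As a sketch of that substitution, written in LaTeX and assuming C⁻ denotes "not infected", so that P(C⁻) = 1 − P(C⁺) and P(+|C⁻) = 1 − specificity = 0.02:

```latex
P(C^{+} \mid +)
  = \frac{P(C^{+})\,P(+ \mid C^{+})}
         {P(C^{+})\,P(+ \mid C^{+}) + P(C^{-})\,P(+ \mid C^{-})}
  = \frac{P(C^{+}) \cdot 0.99}
         {P(C^{+}) \cdot 0.99 + \bigl(1 - P(C^{+})\bigr) \cdot 0.02}
```

The only quantity still missing is the prior P(C⁺).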

So, if we know P(C⁺), we can calculate P(C⁺|+) by plugging the values into the above equation. To estimate P(C⁺), you can divide the number of cases in your country by the total population. In my case, it is less than 0.0001, which is quite rare. So let's calculate P(C⁺|+).
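
Here is a minimal Python sketch of that calculation. It assumes a prior of exactly 0.0001 (the text only says "less than 0.0001", so this is an upper bound chosen for illustration) and uses the sensitivity and specificity from the 200-patient example. The function name `posterior_positive` is hypothetical.

```python
def posterior_positive(prior, sensitivity, specificity):
    """P(C+ | +) from Bayes' theorem, given prior P(C+), sensitivity and specificity."""
    true_positive_mass = prior * sensitivity                # P(C+) * P(+|C+)
    false_positive_mass = (1 - prior) * (1 - specificity)   # P(C-) * P(+|C-)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


prior = 0.0001  # assumed value; the article only states "less than 0.0001"
p = posterior_positive(prior, sensitivity=0.99, specificity=0.98)
print(f"P(C+|+) = {p:.4f}")
# P(C+|+) = 0.0049
```

Under these assumptions, a randomly chosen person who tests positive has only about a 0.5% chance of actually being infected, because false positives from the large healthy population far outnumber true positives. With a prior of 0.5, as in the balanced 200-patient example, the same function returns 99/101 ≈ 0.98, which is the precision computed earlier.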

