Calcady

Bayes' Theorem Calculator

Calculate the posterior probability of an event using Bayes' Theorem. Enter prior probability, sensitivity (true positive rate), and false positive rate to find the true probability given a positive result.

Bayes' Theorem

Bayes' Theorem provides a way to revise existing predictions or theories (update probabilities) given new or additional evidence. It is foundational in statistics, machine learning, and medical testing.

The Formula

P(A|B) = [P(B|A) × P(A)] / P(B)

Where P(B) can be expanded using the law of total probability:
P(B) = P(B|A)P(A) + P(B|not A)P(not A)

Variables Explained

  • P(A) (Prior Probability): The initial probability of event A occurring before new evidence is considered.
  • P(B|A) (True Positive Rate): The probability of observing evidence B given that A is true.
  • P(B) (Marginal Probability): The total probability of observing evidence B under all possible scenarios.
  • P(B|not A) (False Positive Rate): The probability of observing evidence B even when A is false.
  • P(A|B) (Posterior Probability): The updated probability of event A given that evidence B has been observed.

Quick Answer: What does Bayes' Theorem actually calculate?

Bayes' Theorem calculates the posterior probability: the updated probability that a condition is true after accounting for new evidence. The formula is P(A|B) = P(B|A) × P(A) ÷ P(B). The counterintuitive result that surprises most people: a positive result on a highly accurate test does not mean you probably have the disease if the disease is rare. Example: 1% disease prevalence, 90% sensitivity, 95% specificity → P(Disease | Positive Test) = 15.4%. Even though the test is 90% sensitive, 84.6% of positive results in this low-prevalence population are false positives. This is why public health agencies do not recommend mass screening for rare diseases with imperfect tests: the avalanche of false positives causes more harm (unnecessary treatment, anxiety, follow-up procedures) than the true positives prevent.
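
The worked example above can be reproduced in a few lines. This is an illustrative sketch (the `posterior` helper is not part of the calculator):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(A|B) via Bayes' theorem, with P(B) from the law of total probability."""
    p_b = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_b

# The example above: 1% prevalence, 90% sensitivity, 95% specificity (FPR = 5%)
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"P(Disease | Positive) = {p:.1%}")  # 15.4%
```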

Base Rate Effect on Positive Predictive Value (PPV)

All rows use a test with 90% sensitivity and 95% specificity. Only the disease prevalence changes:

| Disease Prevalence | True Positives per 100k | False Positives per 100k | PPV (Bayesian Posterior) | Verdict |
|---|---|---|---|---|
| 0.01% (1 in 10,000) | 9 | 4,999 | 0.18% | Mass screening harmful: overwhelming false positives |
| 0.1% (1 in 1,000) | 90 | 4,995 | 1.77% | Not suitable for general screening |
| 1% (high-risk group) | 900 | 4,950 | 15.4% | Confirmatory test recommended on positives |
| 5% (symptomatic patients) | 4,500 | 4,750 | 48.6% | Near coin-flip: needs clinical correlation |
| 20% (symptomatic + risk factors) | 18,000 | 4,000 | 81.8% | Strong positive evidence; treatment-level confidence |
| 50% (confirmed exposure) | 45,000 | 2,500 | 94.7% | High confidence; false positive unlikely |
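
Every row of this table can be regenerated from the same two test parameters. A minimal sketch (the `screening_counts` function is illustrative):

```python
def screening_counts(prevalence, sensitivity=0.90, specificity=0.95, population=100_000):
    """True/false positive counts per `population`, and the resulting PPV."""
    sick = prevalence * population
    healthy = population - sick
    true_pos = sensitivity * sick
    false_pos = (1 - specificity) * healthy
    ppv = true_pos / (true_pos + false_pos)
    return int(true_pos), int(false_pos), ppv

# Reproduce the table: only prevalence changes, the test stays the same
for prev in (0.0001, 0.001, 0.01, 0.05, 0.20, 0.50):
    tp, fp, ppv = screening_counts(prev)
    print(f"prevalence {prev:>6.2%}: TP={tp:>6,} FP={fp:>6,} PPV={ppv:.2%}")
```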

Pro Tips & Bayesian Reasoning Errors

Do This

  • Always identify your prior (P(A)) before interpreting any test result: the prior is the most impactful input and the one most often ignored. In clinical medicine, the prior is the pre-test probability: the likelihood of disease given the patient's presentation, risk factors, and population. A prior of 5% (symptomatic patient, moderate risk) and a prior of 0.01% (random population screening) produce drastically different positive predictive values from identical tests. The posterior is only as valid as the prior is accurate. Sources for priors include epidemiological prevalence data, clinical decision rules (Wells Score, HEART Score), base rates from published studies, or prior test results (sequential updating).
  • Use sequential Bayesian updating when multiple independent tests are available: the posterior from test 1 becomes the prior for test 2. If a patient tests positive on a rapid antigen test (posterior = 15.4%), and then tests positive on a PCR confirmatory test (sensitivity 99%, specificity 99%), re-run Bayes with prior = 0.154: the posterior jumps to ~95%. A third independent positive test would push it above 99.9%. This is mathematically equivalent to running Bayes once with all tests combined (if independent), but the sequential form is more intuitive and allows for test-specific sensitivity/specificity values.
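
The chained updates described above can be sketched as follows (the `update` helper and the 90%/95% antigen-test figures are illustrative):

```python
def update(prior, sensitivity, false_positive_rate):
    """One Bayesian update: P(Disease | one more positive result)."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Rapid antigen test first (1% baseline prevalence, 90% sens / 95% spec)
p = update(0.01, 0.90, 0.05)   # ≈ 0.154
# PCR confirmation: yesterday's posterior is today's prior
p = update(p, 0.99, 0.01)      # ≈ 0.95
print(f"After two independent positives: {p:.1%}")
```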

Avoid This

  • Don't confuse P(B|A) with P(A|B) — this is called the “Prosecutor's Fallacy” and has sent innocent people to prison. P(B|A) is the probability of observing the evidence B given hypothesis A is true (e.g., probability of matching DNA if the suspect is the perpetrator). P(A|B) is the probability the suspect is the perpetrator given the DNA match. These are not the same. Claiming “there's a 1-in-a-million chance of a random DNA match, therefore the probability of innocence is 1-in-a-million” confuses these two quantities. Without accounting for the prior (how many people could plausibly be the perpetrator), the posterior probability of guilt is not deducible from the likelihood alone. This fallacy has figured in several high-profile trials and contributed to multiple wrongful convictions, most famously the Sally Clark case in the UK.
  • Don't use Bayes' theorem with non-independent tests as if they were independent sequential updates — the posterior will be falsely inflated. Sequential Bayesian updating is valid only when test 2 provides independent new evidence beyond test 1. If two tests use the same biological mechanism (e.g., two ELISA tests measuring the same antibody from the same blood draw), they are not independent — their results are correlated by the same underlying measurement error. Treating them as independent updates overcounts the evidence. True independence requires tests that measure different signals: e.g., antigen test + antibody test + PCR all measure different aspects of infection and can be treated as approximately independent for sequential updating.
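
A minimal sketch of how double-counting correlated evidence inflates the posterior, using the 90%-sensitive / 95%-specific test from the table above (the `update` helper is illustrative):

```python
def update(prior, sens, fpr):
    """One Bayesian update for a positive result."""
    evidence = sens * prior + fpr * (1 - prior)
    return sens * prior / evidence

prior, sens, fpr = 0.01, 0.90, 0.05

# WRONG: rerunning the same assay and pretending it is independent evidence
p1 = update(prior, sens, fpr)      # ≈ 15.4% after the first positive
p_wrong = update(p1, sens, fpr)    # ≈ 76.6%: falsely inflated
# RIGHT: a perfectly correlated duplicate adds nothing; the honest posterior stays ≈ 15.4%
print(f"naive double-count: {p_wrong:.1%}, honest posterior: {p1:.1%}")
```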

Frequently Asked Questions

What is the difference between sensitivity, specificity, PPV, and NPV?

  • Sensitivity (True Positive Rate) = P(Test+ | Disease+): the probability a sick person tests positive. A 90% sensitive test misses 10% of true cases. Sensitivity is a fixed property of the test, independent of prevalence.
  • Specificity (True Negative Rate) = P(Test− | Disease−): the probability a healthy person tests negative. A 95% specific test generates a 5% false positive rate. Also fixed.
  • Positive Predictive Value (PPV) = P(Disease+ | Test+): the probability that a positive result is a true positive. This is Bayes' posterior. PPV depends on prevalence; it changes with the population screened.
  • Negative Predictive Value (NPV) = P(Disease− | Test−): the probability that a negative result is a true negative. Also prevalence-dependent.

The key insight: sensitivity and specificity are test properties (fixed); PPV and NPV are population-specific outcomes (change with prevalence). This is why a test with excellent sensitivity/specificity can still have poor PPV in a low-prevalence population.
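
The four quantities relate through one calculation. A sketch, assuming a 2×2 breakdown of a population by disease and test result (the `predictive_values` function is illustrative):

```python
def predictive_values(prevalence, sensitivity, specificity):
    """PPV and NPV for a given population prevalence, via Bayes' theorem."""
    tp = sensitivity * prevalence              # sick and test positive
    fp = (1 - specificity) * (1 - prevalence)  # healthy but test positive
    tn = specificity * (1 - prevalence)        # healthy and test negative
    fn = (1 - sensitivity) * prevalence        # sick but test negative
    return tp / (tp + fp), tn / (tn + fn)

# Same test, low prevalence: poor PPV, excellent NPV
ppv, npv = predictive_values(0.01, 0.90, 0.95)
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```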

Why does a 99% accurate test only tell you 50% probability when the disease is rare?

This is the base rate fallacy in its most striking form. Imagine a disease affecting 1 in 1,000 people (0.1% prevalence) and a test that is 99% sensitive and 99% specific. Testing 100,000 people: 100 truly have the disease; 99 test positive (true positives from 99% sensitivity, 1 missed). Of the 99,900 without the disease, 999 test positive (false positives from 1% false positive rate). Total positives: 99 + 999 = 1,098. Of those, only 99 are true cases. PPV = 99/1,098 ≈ 9% — a positive result is 91% likely to be wrong! The 1% false positive rate, applied to 99,900 healthy people, generates far more false positives than the 0.1% prevalence rate generates true cases. The only way to improve PPV without changing the test is to test higher-prevalence populations (symptomatic patients, exposed contacts) where the prior is higher.
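
The 100,000-person count from this answer, written out as arithmetic:

```python
population = 100_000
prevalence, sens, spec = 0.001, 0.99, 0.99       # 1 in 1,000; 99%/99% test

sick = int(population * prevalence)              # 100 people truly have the disease
true_pos = sens * sick                           # 99 of them test positive
false_pos = (1 - spec) * (population - sick)     # ~999 healthy people also test positive
ppv = true_pos / (true_pos + false_pos)
print(f"{true_pos:.0f} true / {true_pos + false_pos:.0f} total positives -> PPV {ppv:.0%}")
```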

How is Bayes' Theorem used in AI and machine learning?

Bayes' theorem is embedded in multiple foundational ML techniques: (1) Naïve Bayes classifiers — the earliest spam filters calculated P(Spam | word “Viagra”) using Bayes' theorem with word frequencies as likelihoods and the prior probability of spam. Still used in text classification for its speed and interpretability. (2) Bayesian neural networks — instead of point estimates for weights, maintain probability distributions, enabling uncertainty quantification in predictions. (3) Bayesian hyperparameter optimization (Gaussian processes) — treats the loss function as an unknown function and uses Bayes to update beliefs about which hyperparameters are optimal, efficiently exploring the search space. (4) Probabilistic graphical models (Bayesian networks / belief networks) — encode conditional dependencies between variables using directed acyclic graphs where each node's CPT is a Bayesian conditional probability. Used in medical diagnosis, risk assessment, and causal inference.
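
Point (1) in its simplest single-word form. All frequencies below are made-up illustrative numbers, not real corpus statistics:

```python
# Toy naive Bayes spam score for one word, using Bayes' theorem directly.
p_spam = 0.4               # prior: assumed fraction of mail that is spam
p_word_given_spam = 0.10   # assumed: the word appears in 10% of spam
p_word_given_ham = 0.001   # assumed: ...and in 0.1% of legitimate mail

p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(Spam | word) = {p_spam_given_word:.1%}")
```

A real classifier multiplies such per-word likelihoods across the whole message (the "naive" independence assumption), but each factor is exactly this update.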

What is the difference between Bayesian and frequentist probability?

Frequentist probability defines probability as the long-run frequency of outcomes in infinitely repeated experiments. It produces p-values, confidence intervals, and null hypothesis significance testing (NHST). A 95% CI means: “If we repeated this experiment 100 times, 95 of the resulting intervals would contain the true value.” The parameter itself is treated as fixed (not a random variable).

Bayesian probability treats probability as a degree of belief that can be assigned to any uncertain event, including one-time events. Parameters are random variables with probability distributions. Bayes' theorem updates a prior belief to a posterior belief given data. A 95% Credible Interval means: “Given the data, there is a 95% probability the true parameter lies within this range.”

The philosophical difference: frequentism rejects prior beliefs as unscientific subjectivity; Bayesian inference embraces prior knowledge as legitimate information. In practice, Bayesian methods are more natural for decision-making (what is the probability this specific patient has cancer?), while frequentist methods dominate traditional hypothesis testing in academic research.
