Weighing Evidence: Bayes' Theorem for Model Selection

Choosing Between Alternatives

In data science we often face several competing hypotheses, or "models", that could explain the data we observe. Bayes' Theorem provides a principled way to update our belief in each hypothesis after seeing the evidence. We start with a prior probability for each hypothesis (how likely we thought it was before seeing the data), calculate how well each hypothesis explains the observed data (the likelihood), and combine the two to obtain a posterior probability for each hypothesis.

The Role of Evidence

The evidence (data) helps us discriminate between these hypotheses. A hypothesis under which the observed evidence is more probable than it is overall, that is, P(E | H) > P(E), will see its posterior probability rise above its prior, while a hypothesis that makes the evidence comparatively unlikely will see its posterior probability fall.

The total probability of the evidence, P(E), acts as a normalizing constant. Assuming the hypotheses are mutually exclusive and together cover every possibility, P(E) is calculated by summing the probability of the evidence under each hypothesis, weighted by that hypothesis's prior probability. Dividing by P(E) ensures that the posterior probabilities sum to 1.
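
Written out, the update described above looks as follows. The symbols H_i (the i-th hypothesis) and E (the evidence) are notation chosen here for illustration, and the formula assumes the hypotheses are mutually exclusive and exhaustive:

```latex
P(H_i \mid E) = \frac{P(E \mid H_i)\, P(H_i)}{P(E)},
\qquad
P(E) = \sum_{j} P(E \mid H_j)\, P(H_j)
```

The numerator is the likelihood times the prior for one hypothesis; the denominator adds up that same product over all hypotheses.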

Fair vs. Biased Coin

Difficulty: Moderate

You have two coins. Coin A is a fair coin (P(Heads) = 0.5). Coin B is a biased coin (P(Heads) = 0.7). You randomly pick one of the coins (with equal probability) and flip it twice. You observe two Heads (HH).

What is the probability that you picked the biased coin (Coin B)?

Consider the Priors: How would the result P(Biased | Two Heads) change if your prior belief were that the biased coin is much rarer (e.g., P(Biased) = 0.1 and P(Fair) = 0.9)?
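
One way to check your answers to both questions is the minimal Python sketch below. It assumes the two flips are independent, so the likelihood of two Heads is P(Heads) squared for each coin. The function name `posterior` and the dictionary keys are hypothetical, chosen only for this example, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P(H | E) for each hypothesis H using Bayes' theorem."""
    # Unnormalized weight for each hypothesis: P(E | H) * P(H)
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Total probability of the evidence, P(E)
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Likelihood of HH under each coin (assumes independent flips)
likelihood_hh = {
    "fair": Fraction(1, 2) ** 2,     # 0.25
    "biased": Fraction(7, 10) ** 2,  # 0.49
}

# Scenario 1: either coin is equally likely to be picked
equal_priors = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
# Scenario 2: the biased coin is believed to be rare
rare_biased_priors = {"fair": Fraction(9, 10), "biased": Fraction(1, 10)}

post_equal = posterior(equal_priors, likelihood_hh)
post_rare = posterior(rare_biased_priors, likelihood_hh)

print(post_equal["biased"], float(post_equal["biased"]))
print(post_rare["biased"], float(post_rare["biased"]))
```

With equal priors the posterior for the biased coin is 49/74, roughly 0.66; with the 0.1 prior it is 49/274, roughly 0.18. The same evidence favors the biased coin in both cases, but a strongly skeptical prior keeps the fair coin the more probable explanation after only two flips.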

Nerchuko Academy · Free DS Interview Prep