Derivation of Bayes' Theorem

Author

Andy Grogan-Kaylor

Published

June 24, 2025

1 Some Definitions

\[\begin{matrix} & \color{blue}{P(A)} & P(!A) \\ \color{blue}{P(B)} & \color{red}{P(A,B)} & P(!A, B) \\ P(!B) & P(A,!B) & P(!A, !B) \end{matrix} \tag{1}\]

\(\color{blue}{P(A)}\) and \(\color{blue}{P(B)}\) are examples of marginal probabilities.
\(\color{red}{P(A,B)}\) is an example of a joint probability.
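A conditional probability, for example \(\color{purple}{P(A|B)}\), is defined as the corresponding joint probability divided by the marginal probability of the conditioning event, provided that marginal probability is greater than zero: \(\color{purple}{P(A|B)} = \frac{\color{red}{P(A,B)}}{\color{blue}{P(B)}}\)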
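The three definitions above can be illustrated with a small numerical sketch. The joint probabilities below are hypothetical values chosen only for illustration (they are not from the original text), and the variable names are assumptions. The layout follows the table in (1): marginals are sums of joint probabilities, and the conditional is a joint divided by a marginal.

```python
from fractions import Fraction

# Hypothetical joint probabilities for the four cells of the table in (1).
# These numbers are illustrative assumptions; they must sum to 1.
joint = {
    ("A", "B"): Fraction(3, 10),
    ("!A", "B"): Fraction(1, 10),
    ("A", "!B"): Fraction(2, 10),
    ("!A", "!B"): Fraction(4, 10),
}

# Marginal probabilities: sum the joint probabilities over the other event.
p_a = joint[("A", "B")] + joint[("A", "!B")]
p_b = joint[("A", "B")] + joint[("!A", "B")]

# Conditional probability: joint divided by the marginal of the
# conditioning event (requires P(B) > 0).
p_a_given_b = joint[("A", "B")] / p_b

print(p_a, p_b, p_a_given_b)  # 1/2 2/5 3/4
```

Exact fractions are used here instead of floats so that the equalities in the derivation can be checked without rounding error.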

2 From The Definition Of Conditional Probability:

\(P(A|B) = \frac{P(A,B)}{P(B)}\)

\(P(B|A) = \frac{P(A,B)}{P(A)}\)

3 Multiply Each Fraction By The Denominator:

\(P(A|B)P(B) = P(A,B)\)

\(P(B|A)P(A) = P(A,B)\)

4 Since Both Expressions Equal \(P(A,B)\), Set Them Equal To Each Other:

\(P(A|B)P(B) = P(B|A)P(A)\)
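As a quick check of this equality, the sketch below uses hypothetical probabilities (illustrative assumptions, not values from the original text) and confirms that both products recover the same joint probability.

```python
from fractions import Fraction

# Hypothetical values, chosen only for illustration.
p_ab = Fraction(3, 10)  # joint probability P(A,B)
p_a = Fraction(1, 2)    # marginal probability P(A)
p_b = Fraction(2, 5)    # marginal probability P(B)

# Conditional probabilities from the definition in section 2.
p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a

# Multiplying each conditional by its denominator gives back P(A,B).
left = p_a_given_b * p_b
right = p_b_given_a * p_a

print(left, right, left == right)  # 3/10 3/10 True
```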

5 Divide Both Sides By \(P(B)\), Assuming \(P(B) > 0\):

\(P(A|B) = \frac{P(B|A)P(A)}{P(B)}\)

This is Bayes' Theorem.
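The final formula can be written as a short function. This is a minimal sketch: the function name `bayes` and the input values are hypothetical, chosen for illustration. The result is compared against the direct definition \(P(A|B) = \frac{P(A,B)}{P(B)}\) to confirm that both routes agree.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    Hypothetical helper for illustration; requires P(B) > 0.
    """
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs: P(B|A) = 3/5, P(A) = 1/2, P(B) = 2/5.
posterior = bayes(Fraction(3, 5), Fraction(1, 2), Fraction(2, 5))

# Direct definition with the matching joint probability P(A,B) = 3/10.
direct = Fraction(3, 10) / Fraction(2, 5)

print(posterior, posterior == direct)  # 3/4 True
```

The guard against \(P(B) = 0\) mirrors the condition needed for the division in the last step of the derivation.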