Introduction

Conditional probability is introduced here through a study:

Study: Examining teens' beliefs about social class

Sample size: 48 working class and 50 upper middle class 16-year-olds

Study design:

  • "Objective" assignment to social class based on self-reported measures of both parents' occupation, education, and income.
  • "Subjective" assignment based on a survey question asking students which social class they think they belong to.

The study results are summarized in the contingency table shown below:

                                   Objective social class position
Subjective social class identity   Working Class   Upper Middle Class   Total
Poor                                     0                 0               0
Working Class                            8                 0               8
Middle Class                            32                13              45
Upper Middle Class                       8                37              45
Upper Class                              0                 0               0
Total                                   48                50              98

Marginal Probability

The term marginal probability comes from the fact that the counts come from the margins of the contingency table.

Example:

What is the probability that a student's objective social class position is upper middle class?

P(objective upper middle class)
= 50 / 98
≈ 0.51
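
The same calculation can be sketched in Python. This is a minimal illustration, not part of the study: the dictionary layout and the names `counts`, `WC`, and `UMC` are assumptions chosen for readability. It sums the upper middle class column (a margin of the table) and divides by the grand total.

```python
# Counts from the contingency table. Rows are subjective identity;
# "WC" and "UMC" are illustrative keys for the two objective positions.
counts = {
    "Poor":               {"WC": 0,  "UMC": 0},
    "Working Class":      {"WC": 8,  "UMC": 0},
    "Middle Class":       {"WC": 32, "UMC": 13},
    "Upper Middle Class": {"WC": 8,  "UMC": 37},
    "Upper Class":        {"WC": 0,  "UMC": 0},
}

# Grand total of students and the column margin for objective UMC.
total = sum(sum(row.values()) for row in counts.values())
objective_umc = sum(row["UMC"] for row in counts.values())

p_objective_umc = objective_umc / total
print(round(p_objective_umc, 2))  # 0.51
```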

Joint Probability

The term joint probability comes from the fact that we are considering the intersection of two events of interest.

Example:

What is the probability that a student's objective position and subjective identity are both upper middle class?

There are 37 students who meet this criterion (the Upper Middle Class row and Upper Middle Class column of the contingency table), out of 98 in total. So the probability is 37 over 98, or roughly 38%.

P(objective UMC and subjective UMC)
= 37 / 98
≈ 0.38
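
Dividing every cell of the table by the grand total gives the joint probability of each combination, and these probabilities sum to 1. The sketch below assumes an illustrative layout (a dictionary keyed by hypothetical `(subjective, objective)` pairs) and is only one way to organize the counts.

```python
# Cell counts keyed by (subjective identity, objective position).
# The key layout is an illustrative assumption.
counts = {
    ("Poor", "WC"): 0,                ("Poor", "UMC"): 0,
    ("Working Class", "WC"): 8,       ("Working Class", "UMC"): 0,
    ("Middle Class", "WC"): 32,       ("Middle Class", "UMC"): 13,
    ("Upper Middle Class", "WC"): 8,  ("Upper Middle Class", "UMC"): 37,
    ("Upper Class", "WC"): 0,         ("Upper Class", "UMC"): 0,
}

total = sum(counts.values())

# Joint probability of each cell: cell count divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

print(round(joint[("Upper Middle Class", "UMC")], 2))  # 0.38
```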

Conditional Probability

The term conditional means that we only count the cases that meet the given condition.

Example:

What is the probability that a student who is objectively in the working class associates with upper middle class?

Among the 48 students who are objectively working class, 8 identify as upper middle class, so the probability is 8 over 48, or roughly 17%.

P(subjective UMC | objective WC)
= 8 / 48
≈ 0.17
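
Conditioning amounts to restricting attention to one column of the table. The sketch below keeps only the objectively working class students (48 in total) and asks what share of them identify as upper middle class. The variable names are illustrative assumptions.

```python
# Only the objectively working class column of the contingency table.
working_class_column = {
    "Poor": 0,
    "Working Class": 8,
    "Middle Class": 32,
    "Upper Middle Class": 8,
    "Upper Class": 0,
}

# The condition defines the new denominator: 48 students, not 98.
n_condition = sum(working_class_column.values())

p_subj_umc_given_obj_wc = working_class_column["Upper Middle Class"] / n_condition
print(round(p_subj_umc_given_obj_wc, 2))  # 0.17
```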

Bayes' Theorem

More formally, we calculate a conditional probability using Bayes' Theorem, which states that the probability of A given B is the probability of A and B divided by the probability of B.

Bayes' Theorem: 
 
              P(A and B) 
P(A | B) =  --------------
                 P(B)

So the probability P(subjective UMC | objective WC) that we worked out earlier is equal to

P(subjective UMC | objective WC)
= P(subjective UMC and objective WC) / P(objective WC)
= (8/98) / (48/98)
= 8 / 48
≈ 0.17
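
To check that the formula agrees with counting directly from the table, the sketch below uses exact fractions from Python's standard `fractions` module, so the grand total of 98 cancels without rounding error. The variable names are illustrative assumptions.

```python
from fractions import Fraction

# Joint and marginal probabilities taken from the table.
p_subj_umc_and_obj_wc = Fraction(8, 98)
p_obj_wc = Fraction(48, 98)

# Bayes' Theorem as stated above: P(A | B) = P(A and B) / P(B).
p_conditional = p_subj_umc_and_obj_wc / p_obj_wc

# The grand total cancels, leaving the direct count ratio 8 / 48.
print(p_conditional == Fraction(8, 48))  # True
print(round(float(p_conditional), 2))    # 0.17
```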

Bayes' Theorem is a bit of overkill here, since we could simply reason out the answer from the contingency table. But when the counts are not neatly organized in a table, using Bayes' Theorem to calculate a conditional probability is much more straightforward.

Example:

The American Community Survey is an ongoing survey that provides data every year to give communities the current information they need to plan investments and services.

The 2010 American Community Survey estimates that 14.6% of Americans live below the poverty line, 20.7% speak a language other than English at home, and 4.2% fall into both categories.

Based on this information, what percent of Americans live below the poverty line given that they speak a language other than English at home?

We are asked for the probability of living below the poverty line given that the person speaks a language other than English at home. This question can be expressed as:

P(below poverty | speak non-English)

This is a conditional probability, so Bayes' Theorem can be used.

 P(below poverty | speak non-English) 
 
     P(below poverty and speak non-English)
=  ------------------------------------------
          P(speak non-English)
 
      0.042
=  ----------- ≈ 0.20
      0.207

The result shows that roughly 20% of Americans who speak a language other than English at home also live below the poverty line.

What do we do with this information? One use is to compare this group with the general public. According to the survey, 14.6% of all Americans live below the poverty line, so living below the poverty line appears to be more prevalent among people who speak a language other than English at home.
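
The calculation and the comparison with the general public can be sketched as follows. The helper `conditional_probability` is a hypothetical function written for this illustration; only the three percentages come from the survey figures quoted above.

```python
def conditional_probability(p_a_and_b: float, p_b: float) -> float:
    """Return P(A | B) = P(A and B) / P(B). Hypothetical helper."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b

# Figures quoted from the 2010 American Community Survey.
p_poverty = 0.146               # live below the poverty line
p_non_english = 0.207           # speak a language other than English at home
p_both = 0.042                  # fall into both categories

p_poverty_given_non_english = conditional_probability(p_both, p_non_english)
print(f"{p_poverty_given_non_english:.2f}")  # 0.20

# Ratio of the conditional rate to the overall rate of 14.6%.
ratio = p_poverty_given_non_english / p_poverty
print(f"{ratio:.2f}")  # 1.39
```

A ratio above 1 says the poverty rate in this group is higher than in the population as a whole.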

Calculating joint probability

The product rule for independent events is:

If A and B are independent, P(A and B) = P(A) × P(B)

Bayes' Theorem does not require A and B to be independent, so rearranging it gives a product rule that holds for any two events:

Bayes' Theorem: 
 
              P(A and B) 
P(A | B) =  --------------
                 P(B)
 
Rearrange to: 
 
P(A and B) = P(A | B) × P(B)

So the joint probability of A and B is the conditional probability of A given B, multiplied by the marginal probability of B.
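
As a quick check on the study data, the sketch below multiplies the conditional probability 8/48 by the marginal probability 48/98 and recovers the joint probability 8/98 read directly from the table. Exact fractions are used to avoid rounding; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# A = subjective upper middle class, B = objective working class.
p_a_given_b = Fraction(8, 48)   # conditional probability P(A | B)
p_b = Fraction(48, 98)          # marginal probability P(B)

# General product rule: P(A and B) = P(A | B) × P(B).
p_a_and_b = p_a_given_b * p_b

# Matches the cell count over the grand total, 8 / 98.
print(p_a_and_b == Fraction(8, 98))    # True
print(round(float(p_a_and_b), 3))      # 0.082
```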

Independence and Conditional Probability

In general, if P(A | B) = P(A), then the events A and B are said to be independent. This can be explained in two ways:

  • Conceptually: If B tells us nothing about A, then A and B are independent. That is, the probability of A is exactly the same whether or not B is given. In other words, knowing B provides no information about A.
  • Mathematically: If events A and B are independent, P(A and B) = P(A) × P(B). Then from Bayes' Theorem:
                  P(A and B)       P(A) × P(B)
    P(A | B) =  -------------- = -------------  = P(A)
                     P(B)             P(B)
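
Applied to the study data, the condition P(A | B) = P(A) can be checked by comparing a conditional probability with the corresponding marginal one. The sketch below is an illustration only: the variable names are assumptions, and exact equality is a strict criterion that sample counts will rarely satisfy even when two variables are unrelated.

```python
from fractions import Fraction

# A = subjective upper middle class, B = objective working class.
p_a_given_b = Fraction(8, 48)   # among the 48 objectively working class students
p_a = Fraction(45, 98)          # row margin for subjective upper middle class

# Independence would require P(A | B) to equal P(A).
independent = p_a_given_b == p_a
print(independent)  # False
print(f"{float(p_a_given_b):.2f} vs {float(p_a):.2f}")  # 0.17 vs 0.46
```

Here knowing the objective class changes the probability considerably, so in this sample the two classifications are not independent.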