STAT 7 - Statistical Methods for the Biological, Environmental & Health Sciences
10 Mar 2026
Poll Time!
PollEv.com/slugstats
How confident do you feel about the probability concepts we’ve started exploring?
| Day | Topics | To-Do |
|---|---|---|
| Tuesday | Probability foundations, conditional probability | Attend lecture, participate in activities |
| Thursday | Bayes’ Theorem, diagnostic testing | Attend lecture, participate in activities |
| Friday | - | HW2 due |
| Discussion | Practice with EDA | DSA 3 due after section; Section C is due before class on Thursday |
HIV Testing & Public Health
The Scenario:
A 23-year-old patient gets tested for HIV at a community health clinic. The test comes back positive.
Questions to Consider:

According to the CDC (2024 estimates):
Think about this:
If we test 10,000 people, what happens?
Your Task
Individual Think (2 min):
Imagine we test 10,000 people. Given:
How many people would test positive? How many of those actually have HIV?
Pair Discussion (3 min): Share your thinking with a neighbor
Share Out: Let’s hear some approaches!
To answer this question properly, we need to understand:
Let’s build these tools step by step.
Core concepts and rules
To understand statistical inference, we need probability!
Definition
We know what outcomes could happen, but we don’t know which particular values will be observed.
Examples:
Random phenomenon: Outcomes we cannot predict with certainty, but that have a regular distribution in many repetitions
Probability: The proportion of times an outcome occurs in many repeated trials
Independent trials: The outcome of one trial does not influence another
Sample Space (S): Set of all distinct possible outcomes
Event: An outcome or set of outcomes (subset of sample space)
Probability of an event (when all outcomes are equally likely): P(A) = |A| / |S|. For a die event A containing 3 of the 6 faces, P(A) = 3/6 = 1/2
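As a rough illustration of these definitions, the sketch below simulates many independent die rolls and compares the long-run proportion of an event with the counting formula |A| / |S|. The event A = {1, 2, 3}, the seed, and the number of trials are assumptions chosen for the example, not values from the lecture.

```python
import random
from fractions import Fraction

# Sample space for one fair die and an illustrative event (assumed: A = {1, 2, 3})
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}

# Counting formula, valid only when all outcomes are equally likely
p_exact = Fraction(len(A), len(S))

# Long-run proportion: simulate independent rolls and count how often A occurs
rng = random.Random(7)      # fixed seed so the run is repeatable
faces = sorted(S)
n_trials = 100_000
hits = sum(1 for _ in range(n_trials) if rng.choice(faces) in A)
p_simulated = hits / n_trials

print(f"Exact P(A)     = {p_exact} = {float(p_exact):.3f}")
print(f"Simulated P(A) = {p_simulated:.3f} from {n_trials} rolls")
```

The simulated proportion should land close to 0.5 and tends to get closer as the number of trials grows, which is the "many repeated trials" idea in the definition of probability.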
Three Key Properties
If A₁ and A₂ are disjoint (mutually exclusive):
\[P(A_1 \text{ or } A_2) = P(A_1) + P(A_2)\]
Example: Rolling a die
For any two events A and B:
\[P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)\]
Example: Rolling a die
The complement of event A is Aᶜ:
\[P(A) + P(A^c) = 1\]
Therefore: \[P(A) = 1 - P(A^c)\]
Example: Rolling a die
Your Task

Mark all the events present:
If A and B are independent:
\[P(A \text{ and } B) = P(A) \times P(B)\]
Warning
Important: Do not confuse disjoint and independent!
Two dice illustration
Rolling two dice:
The first roll doesn’t affect the second!
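The two-dice idea can be checked by listing all 36 equally likely outcomes. The sketch below uses events chosen for illustration ("first roll is 1", "second roll is 1", "first roll is 2"); they are assumptions, not the events from the slide. It also contrasts independent events with disjoint ones.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (first, second)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def first_is_one(o):
    return o[0] == 1

def second_is_one(o):
    return o[1] == 1

def first_is_two(o):
    return o[0] == 2

# Independent events: the product rule holds
p_a = prob(first_is_one)
p_b = prob(second_is_one)
p_a_and_b = prob(lambda o: first_is_one(o) and second_is_one(o))
print(p_a, p_b, p_a_and_b)            # 1/6 1/6 1/36
print(p_a_and_b == p_a * p_b)         # True

# Disjoint events: they can never happen together, so P(A and C) = 0,
# which is not P(A) * P(C). Disjoint events with positive probability
# are therefore not independent.
p_c = prob(first_is_two)
p_a_and_c = prob(lambda o: first_is_one(o) and first_is_two(o))
print(p_a_and_c, p_a * p_c)           # 0 1/36
```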
Organizing outcomes and probabilities
Definition
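A probability distribution lists all possible outcomes and their associated probabilities, satisfying three rules: the outcomes are disjoint, each probability is between 0 and 1, and the probabilities sum to 1.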
Hospital Survey Data
| | Very satisfied | Somewhat satisfied | Neither | Somewhat dissatisfied | Very dissatisfied |
|---|---|---|---|---|---|
| Probability | 0.32 | 0.35 | 0.13 | 0.07 | 0.13 |
Questions: 1. Can this be a probability distribution? 2. What is P(somewhat satisfied or very satisfied)?
Poll: PollEv.com/slugstats
Question 1: Can this be a probability distribution?
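Check the three rules:
1. Disjoint outcomes? ✓ Each patient gives exactly one response.
2. Between 0 and 1? ✓ All five values lie in that range.
3. Sum to 1? ✓ 0.32 + 0.35 + 0.13 + 0.07 + 0.13 = 1.00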
Yes, this is a valid probability distribution!
Question 2: P(somewhat satisfied or very satisfied)?
Since these events are disjoint, we can add:
P(somewhat satisfied or very satisfied) = P(somewhat) + P(very)
= 0.35 + 0.32 = 0.67
Interpretation: 67% of patients report being somewhat satisfied or very satisfied.
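The same two checks can be sketched in a few lines of Python. The dictionary holds the probabilities from the hospital survey table; the helper name `is_valid_distribution` and its tolerance are hypothetical choices made for this example.

```python
import math

# Probabilities copied from the hospital survey table
survey = {
    "Very satisfied": 0.32,
    "Somewhat satisfied": 0.35,
    "Neither": 0.13,
    "Somewhat dissatisfied": 0.07,
    "Very dissatisfied": 0.13,
}

def is_valid_distribution(dist, tol=1e-9):
    """Check that every probability is in [0, 1] and that they sum to 1.

    Disjointness depends on how the outcomes are defined, so it is
    assumed here rather than checked.
    """
    in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Addition rule for disjoint outcomes
p_satisfied = survey["Somewhat satisfied"] + survey["Very satisfied"]

print(is_valid_distribution(survey))   # True
print(round(p_satisfied, 2))           # 0.67
```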
Rolling One Die
S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3}, B = {2, 3, 4, 5, 6}
Calculate: P(A ∩ B), P(A ∪ B), and P(Bᶜ)
Poll: PollEv.com/slugstats
Given: S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3}, B = {2, 3, 4, 5, 6}
P(A ∩ B) = P(A and B) = P({2, 3}) = 2/6 = 1/3
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 3/6 + 5/6 - 2/6 = 6/6 = 1
P(Bᶜ) = 1 - P(B) = 1 - 5/6 = 1/6
(Note: Bᶜ = {1}, so P(Bᶜ) = 1/6 ✓)
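These answers can be double-checked with Python sets, where `&` is intersection, `|` is union, and `S - B` is the complement of B within S. This is a minimal sketch that assumes a fair die, so every face is equally likely; the helper `prob` is a name introduced for the example.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}
B = {2, 3, 4, 5, 6}

def prob(event):
    """P(event) for equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

p_and = prob(A & B)                       # intersection, counted directly
p_or = prob(A | B)                        # union, counted directly
p_or_rule = prob(A) + prob(B) - p_and     # general addition rule
p_not_b = prob(S - B)                     # complement, counted directly

# Counting and the probability rules give the same answers
assert p_or == p_or_rule
assert p_not_b == 1 - prob(B)

print(p_and, p_or, p_not_b)               # 1/3 1 1/6
```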
Break Time! ☕ 5-minute break
Stretch, grab water, chat with neighbors!
We’ll resume with conditional probability.
When prior information changes what we know
Let’s look at data on diabetes and age from a large health survey:
| Age group | Diabetes | No Diabetes | Total |
|---|---|---|---|
| Less than 20 years | 0.001 | 0.277 | 0.277 |
| 20 to 44 years | 0.014 | 0.315 | 0.329 |
| 45 to 64 years | 0.043 | 0.219 | 0.261 |
| Greater than 64 | 0.036 | 0.097 | 0.132 |
| Total | 0.093 | 0.907 | 1.000 |
Check the Rules
How can we verify this is a valid joint probability distribution?
Think for 30 seconds, then we’ll discuss!
1. Disjoint outcomes? ✓
Each person falls into exactly one cell (e.g., can’t be both “20-44” and “45-64”)
2. Between 0 and 1? ✓
All values are probabilities: 0.001, 0.014, …, 0.907
3. Sum to 1? ✓
All interior cells: 0.001 + 0.277 + 0.014 + … ≈ 1.000 (the rounded entries add to 1.002)
Yes! This is a valid probability distribution.
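The rule check above can be sketched in Python using the interior cells of the diabetes table. The tolerance of 0.005 is an assumption that allows for the table entries being rounded to three decimals.

```python
# Joint probabilities (age group, diabetes status) copied from the table
joint = {
    "Less than 20 years": {"Diabetes": 0.001, "No Diabetes": 0.277},
    "20 to 44 years":     {"Diabetes": 0.014, "No Diabetes": 0.315},
    "45 to 64 years":     {"Diabetes": 0.043, "No Diabetes": 0.219},
    "Greater than 64":    {"Diabetes": 0.036, "No Diabetes": 0.097},
}

cells = [p for row in joint.values() for p in row.values()]

# Rule 2: every cell is between 0 and 1
all_in_range = all(0 <= p <= 1 for p in cells)

# Rule 3: cells sum to 1, within a tolerance for rounded table entries
total = sum(cells)
sums_to_one = abs(total - 1.0) <= 0.005

# Marginal probabilities: row sums (age) and column sums (diabetes status)
age_marginal = {age: sum(row.values()) for age, row in joint.items()}
status_marginal = {
    status: sum(row[status] for row in joint.values())
    for status in ("Diabetes", "No Diabetes")
}

print(all_in_range, sums_to_one, round(total, 3))
print({k: round(v, 3) for k, v in age_marginal.items()})
print({k: round(v, 3) for k, v in status_marginal.items()})
```

Marginals recomputed this way can differ from the table's Total row and column in the last digit, because the published cells are already rounded.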
Conditional Probability is the probability of an event occurring, given that another event has already occurred.
Notation: P(A | B)
Read as: “Probability of A given B”
Example from daily life:
Biological examples:

Unconditional: P(A)
All of A divided by all of S

Conditional: P(A | B)
A and B divided by B
We’ve restricted our sample space!
Conditional Probability Formula
\[P(A|B) = \frac{P(A \text{ and } B)}{P(B)}\]
provided that P(B) > 0
Intuition:
Using our diabetes table:
| Age group | Diabetes | No Diabetes | Total |
|---|---|---|---|
| 45 to 64 years | 0.043 | 0.219 | 0.261 |
Question: What’s the probability someone has diabetes, given they’re 45-64 years old?
\[P(\text{Diabetes | Age 45-64}) = \frac{P(\text{Diabetes and Age 45-64})}{P(\text{Age 45-64})}\]
From the table: P(Diabetes and Age 45-64) = 0.043 and P(Age 45-64) = 0.261
Calculate: \[P(\text{Diabetes | Age 45-64}) = \frac{0.043}{0.261}\] \[= 0.1648 \approx 16.5\%\]
Interpretation: Among people aged 45-64, about 16.5% have diabetes.
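The calculation can be written as a small function. This is a minimal sketch: the function name `conditional` is hypothetical, and the two inputs are the values read from the 45 to 64 years row of the diabetes table.

```python
def conditional(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B); undefined when P(B) is not positive."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# Values read from the 45 to 64 years row of the diabetes table
p_diabetes_and_45_64 = 0.043
p_45_64 = 0.261

p_diabetes_given_45_64 = conditional(p_diabetes_and_45_64, p_45_64)
print(f"{p_diabetes_given_45_64:.1%}")   # 16.5%
```

Dividing by P(B) is what "restricting the sample space" means in practice: only the 45 to 64 row is kept, and the joint probability is rescaled by that row's total.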
Unconditional probability:
P(Diabetes) = 0.093 = 9.3%
In the general population
Conditional probability:
P(Diabetes | Age 45-64) = 0.1648 ≈ 16.5%
Among 45-64 year olds
Why the difference?
Age gives us information! Being in the 45-64 age group increases the probability of having diabetes compared to the general population.
Calculate and Compare
Using the same diabetes table:
Question: What’s P(Age 45-64 | Diabetes)?
Individual (2 min): Set up and solve
Pair (3 min):
Poll: PollEv.com/slugstats
\[P(\text{Age 45-64 | Diabetes}) = \frac{P(\text{Age 45-64 and Diabetes})}{P(\text{Diabetes})}\]
Calculate: \[= \frac{0.043}{0.093} = 0.4624 \approx 46.2\%\]
Interpretation: Among people with diabetes, about 46.2% are in the 45-64 age group.
Important
Remember: P(A|B) ≠ P(B|A) in general!
The order matters in conditional probability!
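One way to see why the order matters: both conditional probabilities share the same numerator, P(A and B), but divide by different totals. Rearranging the conditional probability formula in each direction gives

\[P(A \text{ and } B) = P(A|B) \times P(B) = P(B|A) \times P(A)\]

With the rounded diabetes table values (A = Diabetes, B = Age 45-64):

\[P(A|B) = \frac{0.043}{0.261} \approx 0.165 \qquad P(B|A) = \frac{0.043}{0.093} \approx 0.462\]

When P(A and B) > 0, the two conditional probabilities are equal only if P(A) = P(B).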
Recall: Events A and B are independent if
P(A and B) = P(A) × P(B)
New perspective with conditional probability:
Independence Definition
A and B are independent if and only if:
\[P(A|B) = P(A)\]
or equivalently:
\[P(B|A) = P(B)\]
Meaning: Knowing B occurred doesn’t change the probability of A
Are “having diabetes” and “being age 45-64” independent?
Check if P(Diabetes | Age 45-64) = P(Diabetes):
Since 16.5% ≠ 9.3%, these events are NOT independent.
Knowing someone’s age group gives us information about their diabetes status!
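The same comparison can be repeated for every age group. The sketch below uses the joint and total values from the diabetes table; the helper `looks_independent` and its tolerance of 0.005 are assumptions, included only so that rounding in the table is not mistaken for dependence.

```python
# Marginal P(Diabetes) and per-age values copied from the diabetes table
p_diabetes = 0.093
by_age = {
    # age group: (P(Diabetes and age group), P(age group))
    "Less than 20 years": (0.001, 0.277),
    "20 to 44 years":     (0.014, 0.329),
    "45 to 64 years":     (0.043, 0.261),
    "Greater than 64":    (0.036, 0.132),
}

def looks_independent(p_a_given_b, p_a, tol=0.005):
    """Treat A and B as independent if P(A | B) is within tol of P(A)."""
    return abs(p_a_given_b - p_a) <= tol

# P(Diabetes | age group) for each row of the table
p_given_age = {age: p_joint / p_age for age, (p_joint, p_age) in by_age.items()}

for age, p_given in p_given_age.items():
    verdict = "independent" if looks_independent(p_given, p_diabetes) else "NOT independent"
    print(f"P(Diabetes | {age}) = {p_given:.3f} vs P(Diabetes) = {p_diabetes:.3f}: {verdict}")
```

In this table every age group's conditional probability differs clearly from 9.3%, so diabetes status is not independent of any of the age groups.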
Key takeaways from today
Probability foundations: Sample space, events, rules
Probability distributions: Valid when outcomes are disjoint, each probability is between 0 and 1, and the probabilities sum to 1
Conditional Probability: P(A|B) = P(A and B) / P(B)
Independence: P(A|B) = P(A) when A and B are independent
Tree Diagrams & Tables: Two ways to organize the same information
Poll Time!
PollEv.com/slugstats
Question: If P(A) = 0.3, P(B) = 0.4, and P(A and B) = 0.12, are A and B independent?
A. Yes
B. No
C. Need more information
Post on Ed Discussion or come to office hours!
Office Hours:
I’ll be here after class :)
Next class: Thursday, January 23rd
Bayes’ Theorem & Diagnostic Testing
See you then! 🦠🔬📊
STAT 7 – Winter 2026