Week 9
Today’s Plan
The Physicians’ Health Study (1988) was a landmark randomized trial involving 22,071 male physicians.
Participants were randomly assigned to take aspirin or a placebo daily.
After 5 years:
| | Heart Attack | No Heart Attack | Total |
|---|---|---|---|
| Aspirin | 104 | 10,933 | 11,037 |
| Placebo | 189 | 10,845 | 11,034 |
| Total | 293 | 21,778 | 22,071 |
Is there a relationship between aspirin use and heart attack occurrence?
A two-way contingency table (also called a cross-tabulation) displays the distribution of two categorical variables simultaneously.
Rows → one categorical variable (e.g., treatment group)
Columns → another categorical variable (e.g., outcome)
Cells → joint counts (how many observations fall in each combination)
Margins → row totals and column totals
The table lets us ask: Are the two variables related, or are they independent?
| | Heart Attack | No Heart Attack | Total |
|---|---|---|---|
| Aspirin | 104 | 10,933 | 11,037 |
| Placebo | 189 | 10,845 | 11,034 |
| Total | 293 | 21,778 | 22,071 |
Row proportions (risk within each group):
- Aspirin: 104 / 11,037 ≈ 0.94%
- Placebo: 189 / 11,034 ≈ 1.71%
This is almost a 45% relative reduction in risk!
But we’ve been here before — we need a formal test to rule out chance.
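The course does its computing in R, but the risk arithmetic above is simple enough to sanity-check in a few lines of Python:

```python
# Risk of heart attack within each group (counts from the table above)
risk_aspirin = 104 / 11037   # about 0.94%
risk_placebo = 189 / 11034   # about 1.71%

# Relative risk reduction: how much lower is the aspirin risk, proportionally?
rrr = 1 - risk_aspirin / risk_placebo
print(f"aspirin risk: {risk_aspirin:.4f}")
print(f"placebo risk: {risk_placebo:.4f}")
print(f"relative risk reduction: {rrr:.1%}")  # close to 45%
```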
We already know how to test \(H_0: p_A = p_P\) using a two-proportion z-test.
But what if we had more than 2 categories?
The chi-square test (\(\chi^2\)) extends proportion tests to tables of any size.
It answers: Is there a statistically significant association between the row variable and the column variable?
Key idea: If the two variables are independent (no association), what would we expect to see in each cell?
Under independence:
\[E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}\]
If the observed counts (O) differ substantially from expected (E), there is evidence of an association.
\[E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}\]
Aspirin × Heart Attack: \[E_{11} = \frac{11037 \times 293}{22071} = \frac{3233841}{22071} = 146.5\]
Aspirin × No Heart Attack: \[E_{12} = \frac{11037 \times 21778}{22071} = \frac{240363786}{22071} = 10890.5\]
Placebo × Heart Attack: \[E_{21} = \frac{11034 \times 293}{22071} = 146.5\]
Placebo × No Heart Attack: \[E_{22} = \frac{11034 \times 21778}{22071} = 10887.5\]
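The four expected counts can be generated in one pass from the margins. A short Python sketch of the \(E_{ij}\) formula (the course itself uses R):

```python
observed = [[104, 10933],   # aspirin: heart attack, no heart attack
            [189, 10845]]   # placebo: heart attack, no heart attack

row_totals = [sum(row) for row in observed]        # [11037, 11034]
col_totals = [sum(col) for col in zip(*observed)]  # [293, 21778]
n = sum(row_totals)                                # 22071

# E_ij = (row i total) * (column j total) / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]
for row in expected:
    print([round(e, 1) for e in row])
```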
| | Heart Attack | No Heart Attack |
|---|---|---|
| Aspirin | O = 104, E = 146.5 | O = 10933, E = 10890.5 |
| Placebo | O = 189, E = 146.5 | O = 10845, E = 10887.5 |
The aspirin group had far fewer heart attacks than expected under independence.
The placebo group had far more than expected.
How do we turn this into a single test statistic?
\[\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}\]
For each cell: compute how far the observed count is from expected, square it (to make all deviations positive), and divide by E (to standardize by the expected magnitude).
Sum across all cells. Large \(\chi^2\) → strong evidence against independence.
\[\chi^2 = \frac{(104-146.5)^2}{146.5} + \frac{(10933-10890.5)^2}{10890.5} + \frac{(189-146.5)^2}{146.5} + \frac{(10845-10887.5)^2}{10887.5}\]
\[= \frac{1806.25}{146.5} + \frac{1806.25}{10890.5} + \frac{1806.25}{146.5} + \frac{1806.25}{10887.5}\]
\[= 12.33 + 0.166 + 12.33 + 0.166 = 24.99 \approx 25.0\]
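The cell-by-cell sum above is exactly what the formula says: square each deviation, scale by the expected count, add. As a quick Python check of the arithmetic:

```python
# Observed and (rounded) expected counts for the four cells of the 2x2 table
observed = [104, 10933, 189, 10845]
expected = [146.5, 10890.5, 146.5, 10887.5]

# chi-square = sum over cells of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))  # close to 25.0
```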
The chi-square statistic follows a \(\chi^2\) distribution with degrees of freedom:
\[df = (\text{rows} - 1) \times (\text{columns} - 1)\]
For a 2×2 table: \(df = (2-1)(2-1) = 1\)
Key properties of the \(\chi^2\) distribution:
- Takes only nonnegative values
- Right-skewed, with mean equal to its degrees of freedom
- As df increases, the distribution shifts right and becomes more symmetric
For our aspirin example: \(\chi^2 = 25.0\), \(df = 1\) → p < 0.0001
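For df = 1, the chi-square statistic is a squared standard normal, so the p-value can be checked with nothing but the standard library: \(P(\chi^2_1 > x) = P(|Z| > \sqrt{x}) = \operatorname{erfc}(\sqrt{x/2})\). A minimal sketch:

```python
import math

chi_sq = 25.0
# For df = 1: P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))
print(p_value)  # on the order of 1e-7, far below 0.0001
```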
[Poll Everywhere — respond now!]
A study examines whether blood type (A, B, AB, O) is associated with COVID-19 severity (mild, severe). Below is a partial table:
| | Mild | Severe | Total |
|---|---|---|---|
| A | 120 | 30 | 150 |
| B | 80 | 20 | 100 |
| AB | 40 | 10 | 50 |
| O | 200 | 50 | 250 |
| Total | 440 | 110 | 550 |
Discuss (2 min):
→ Poll Everywhere: What are the degrees of freedom for this table?
In R, the function is chisq.test(). Here is the output for the aspirin study:
Pearson's Chi-squared test with Yates' continuity correction
data: aspirin_table
X-squared = 24.429, df = 1, p-value = 7.71e-07
Reading the output:
- X-squared = 24.429: the chi-square test statistic (slightly different from our manual calculation because R applies Yates' continuity correction)
- df = 1: one degree of freedom (2×2 table)
- p-value = 7.71e-07: approximately 0.00000077, overwhelming evidence against independence

Conclusion: There is very strong evidence of an association between aspirin use and heart attack occurrence (\(\chi^2 = 24.4\), df = 1, p < 0.001).
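Yates' correction shrinks each \(|O - E|\) by 0.5 before squaring. To see where R's 24.429 comes from, the corrected statistic can be recomputed by hand (a Python sketch; the course uses R's `chisq.test()`):

```python
observed = [[104, 10933], [189, 10845]]
row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
n = sum(row)

chi_sq = 0.0
for i in range(2):
    for j in range(2):
        e = row[i] * col[j] / n
        # Yates correction: shrink each |O - E| by 0.5 before squaring
        chi_sq += (abs(observed[i][j] - e) - 0.5) ** 2 / e
print(round(chi_sq, 3))  # matches R's X-squared up to rounding
```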
For the blood type × COVID severity table:
Pearson's Chi-squared test
data: blood_covid
X-squared = 0.0, df = 3, p-value = 1
Wait — look at those row proportions again:
Every blood type has exactly the same severity rate! O = E everywhere → \(\chi^2 = 0\). Perfect independence.
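Because every row has the same 80/20 split, each observed count equals its expected count exactly, and the statistic collapses to zero. A quick check (Python sketch):

```python
observed = [[120, 30], [80, 20], [40, 10], [200, 50]]  # blood type x severity
row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
n = sum(row)

# Every E_ij equals O_ij here, so every term of the sum is zero
chi_sq = sum((observed[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
             for i in range(4) for j in range(2))
print(chi_sq)  # 0.0
```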
Coming up:
What exactly does the chi-square test detect? And how is “independence” different from “homogeneity”?
Before proceeding, check:
1. Independence: Observations are independent. → Satisfied by random sampling or random assignment.
2. Expected cell counts: All expected counts must be ≥ 5. \[E_{ij} = \frac{\text{row total} \times \text{column total}}{\text{grand total}} \geq 5 \text{ for every cell}\]
3. No sparse cells: Avoid tables where many cells have very small counts.
→ If conditions aren’t met, consider combining categories or using Fisher’s Exact Test (not covered in STAT 7).
All expected counts:
| | Heart Attack | No Heart Attack |
|---|---|---|
| Aspirin | E = 146.5 ✅ | E = 10890.5 ✅ |
| Placebo | E = 146.5 ✅ | E = 10887.5 ✅ |
All ≥ 5 — conditions are met! ✅
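The expected-count condition is mechanical enough to automate. A small helper (hypothetical function name, Python sketch) that returns whether every \(E_{ij} \geq 5\):

```python
def expected_counts_ok(observed, minimum=5):
    """Check the chi-square condition: every expected count >= minimum."""
    row = [sum(r) for r in observed]
    col = [sum(c) for c in zip(*observed)]
    n = sum(row)
    return all(r * c / n >= minimum for r in row for c in col)

print(expected_counts_ok([[104, 10933], [189, 10845]]))  # True for the aspirin table
```

A tiny table such as `[[2, 3], [1, 4]]` fails the check, since some expected counts fall below 5.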
What if a cell had E < 5?
Example: A rare disease study with very few cases. You might need to:
- Combine sparse categories into broader ones
- Use Fisher's Exact Test instead (not covered in STAT 7)
[Poll Everywhere — respond now!]
Here is R output from a study on smoking status and lung disease diagnosis:
Pearson's Chi-squared test
data: smoking_lung
X-squared = 18.72, df = 2, p-value = 0.000087
The table had 3 smoking categories (never, former, current) and 2 disease outcomes.
Discuss (2 min):
→ Poll Everywhere: What does df = 2 tell us about the table structure?
Two chi-square tests with the same formula, but different research designs and interpretations:
Chi-Square Test for Independence:
Chi-Square Test for Homogeneity:
| Feature | Independence | Homogeneity |
|---|---|---|
| Sampling | One sample | Multiple samples |
| Who controls? | Neither (both observed) | Researcher fixes row totals |
| Question | Are X and Y related? | Are distributions the same? |
| Calculation | Identical | Identical |
| Interpretation | Association | Distributional equality |
The math is identical. The interpretation differs based on the study design.
The aspirin study is actually a test of homogeneity — researchers fixed the sample sizes in each group and measured heart attack rates.
The chi-square statistic tells us if an association is significant. But how strong is it?
Cramér's V measures the strength of the association:
\[V = \sqrt{\frac{\chi^2}{n \cdot (k-1)}}\]
where \(k = \min(\text{rows}, \text{columns})\).
Rough guidelines:
| V | Strength |
|---|---|
| 0.1 | Weak |
| 0.3 | Moderate |
| 0.5 | Strong |
For aspirin: \(V = \sqrt{24.4 / (22071 \times 1)} = \sqrt{0.00111} = 0.033\) — statistically significant but very weak effect size.
Even tiny effects can be statistically significant with large n. Always look at both!
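The effect-size formula takes one line to compute. A Python sketch (hypothetical function name):

```python
import math

def cramers_v(chi_sq, n, rows, cols):
    # V = sqrt(chi^2 / (n * (k - 1))), where k = min(rows, cols)
    k = min(rows, cols)
    return math.sqrt(chi_sq / (n * (k - 1)))

v = cramers_v(24.4, 22071, 2, 2)
print(round(v, 3))  # a very weak association despite the tiny p-value
```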
Pearson's Chi-squared test
data: aspirin_table
X-squared = 25.014, df = 1, p-value = 5.66e-07
Expected counts:
Heart.Attack No.Heart.Attack
Aspirin 146.52 10890.48
Placebo 146.48 10887.52
R can also show standardized residuals — cells where \((O-E)/\sqrt{E}\) is large indicate the biggest contributors to the chi-square statistic.
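The residuals themselves are easy to compute directly. A Python sketch for the aspirin table:

```python
import math

observed = [[104, 10933], [189, 10845]]
row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
n = sum(row)

# Standardized residual for each cell: (O - E) / sqrt(E)
residuals = [[(observed[i][j] - row[i] * col[j] / n) / math.sqrt(row[i] * col[j] / n)
              for j in range(2)] for i in range(2)]
for r in residuals:
    print([round(x, 2) for x in r])
```

The two "Heart Attack" cells (residuals near ±3.5) contribute almost all of the chi-square statistic; the "No Heart Attack" cells contribute very little.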
[Poll Everywhere — respond now!]
A student collected data from 200 randomly selected UC Santa Cruz students, asking two questions: (1) Do you eat breakfast? (Yes/No) and (2) Do you exercise regularly? (Yes/No).
Pearson's Chi-squared test
data: breakfast_exercise
X-squared = 4.4576, df = 1, p-value = 0.03475
| | Exercise: Yes | Exercise: No | Total |
|---|---|---|---|
| Breakfast: Yes | 82 | 38 | 120 |
| Breakfast: No | 42 | 38 | 80 |
| Total | 124 | 76 | 200 |
Discuss (2 min):
→ Poll Everywhere: Is this independence or homogeneity?
For a 2×2 table, the chi-square test and the two-proportion z-test give equivalent results:
\[\chi^2 = z^2\]
When to use which:
- 2×2 table with a directional (one-sided) hypothesis → two-proportion z-test
- Larger tables, or no directional hypothesis → chi-square test
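The identity \(\chi^2 = z^2\) holds exactly for the pooled two-proportion z-test and the uncorrected chi-square statistic. A Python sketch verifying it on the aspirin data:

```python
import math

x1, n1 = 104, 11037   # aspirin: heart attacks, group size
x2, n2 = 189, 11034   # placebo

# Two-proportion z statistic with the pooled proportion estimate
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se

# Chi-square statistic (no continuity correction) from the same 2x2 table
observed = [[x1, n1 - x1], [x2, n2 - x2]]
row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
n = sum(row)
chi_sq = sum((observed[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
             for i in range(2) for j in range(2))

print(round(z ** 2, 3), round(chi_sq, 3))  # the two values agree
```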
Formula: \(\chi^2 = \sum \frac{(O-E)^2}{E}\)
Expected counts: \(E_{ij} = \frac{\text{row}_i \times \text{col}_j}{n}\)
Degrees of freedom: \((r-1)(c-1)\)
Conditions: All \(E_{ij} \geq 5\); independent observations
Hypotheses:
In R: chisq.test(table_name)
Tuesday: Inference for proportions
Thursday: Chi-square tests
This completes the new material for STAT 7! Next week, we will review formulas and exercises in both lectures and discussion sections.
You have been a wonderful group to work with! Keep up the great work and keep asking great questions.
STAT 7 · Winter 2026