Contingency Tables and Chi-Square

Week 9

Welcome Back!

Today’s Plan

  • Opening application: aspirin and heart attacks
  • Two-way contingency tables
  • Expected counts
  • Chi-square test for independence
  • Chi-square test for homogeneity (and the difference!)
  • Conditions and R output

Opening Application: Aspirin Study

The Physicians’ Health Study (1988) was a landmark randomized trial involving 22,071 male physicians.

Participants were randomly assigned to take aspirin or a placebo daily.

After 5 years:

             Heart Attack   No Heart Attack    Total
Aspirin           104            10,933       11,037
Placebo           189            10,845       11,034
Total             293            21,778       22,071

Is there a relationship between aspirin use and heart attack occurrence?

What Is a Contingency Table?

A two-way contingency table (also called a cross-tabulation) displays the distribution of two categorical variables simultaneously.

Rows → one categorical variable (e.g., treatment group)

Columns → another categorical variable (e.g., outcome)

Cells → joint counts (how many observations fall in each combination)

Margins → row totals and column totals

The table lets us ask: Are the two variables related, or are they independent?

Reading the Aspirin Table

             Heart Attack   No Heart Attack    Total
Aspirin           104            10,933       11,037
Placebo           189            10,845       11,034
Total             293            21,778       22,071

Row proportions (risk within each group):

  • Aspirin group: 104/11,037 = 0.94% heart attack rate
  • Placebo group: 189/11,034 = 1.71% heart attack rate

This is almost a 45% relative reduction in risk!

But we’ve been here before — we need a formal test to rule out chance.

The Chi-Square Approach

We already know how to test \(H_0: p_A = p_P\) using a two-proportion z-test.

But what if we had more than 2 categories?

  • 3 treatment groups?
  • Outcomes with 3+ levels?

The chi-square test (\(\chi^2\)) extends proportion tests to tables of any size.

It answers: Is there a statistically significant association between the row variable and the column variable?

The Logic: Observed vs. Expected

Key idea: If the two variables are independent (no association), what would we expect to see in each cell?

Under independence:

\[E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}\]

If the observed counts (O) differ substantially from expected (E), there is evidence of an association.

Calculating Expected Counts: Aspirin

\[E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}\]

Aspirin × Heart Attack: \[E_{11} = \frac{11037 \times 293}{22071} = \frac{3233841}{22071} = 146.5\]

Aspirin × No Heart Attack: \[E_{12} = \frac{11037 \times 21778}{22071} = \frac{240363786}{22071} = 10890.5\]

Placebo × Heart Attack: \[E_{21} = \frac{11034 \times 293}{22071} = 146.5\]

Placebo × No Heart Attack: \[E_{22} = \frac{11034 \times 21778}{22071} = 10887.5\]
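These four expected counts can be verified with a few lines of code (a quick Python sketch, separate from the R workflow used later in the lecture):

```python
# Marginal totals from the aspirin table above
row_totals = [11037, 11034]   # Aspirin, Placebo
col_totals = [293, 21778]     # Heart Attack, No Heart Attack
grand_total = 22071

# E_ij = (row i total) * (column j total) / (grand total)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

print([[round(e, 1) for e in row] for row in expected])
# → [[146.5, 10890.5], [146.5, 10887.5]]
```

As a sanity check, each row of expected counts sums back to the corresponding row total (146.5 + 10,890.5 = 11,037).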

Observed vs. Expected

          Heart Attack             No Heart Attack
Aspirin   O = 104, E = 146.5       O = 10,933, E = 10,890.5
Placebo   O = 189, E = 146.5       O = 10,845, E = 10,887.5

The aspirin group had far fewer heart attacks than expected under independence.

The placebo group had far more than expected.

How do we turn this into a single test statistic?

The Chi-Square Test Statistic

\[\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}\]

For each cell: compute how far the observed count is from expected, square it (to make all deviations positive), and divide by E (to standardize by the expected magnitude).

Sum across all cells. Large \(\chi^2\) → strong evidence against independence.

Chi-Square Calculation: Aspirin

\[\chi^2 = \frac{(104-146.5)^2}{146.5} + \frac{(10933-10890.5)^2}{10890.5} + \frac{(189-146.5)^2}{146.5} + \frac{(10845-10887.5)^2}{10887.5}\]

\[= \frac{1806.25}{146.5} + \frac{1806.25}{10890.5} + \frac{1806.25}{146.5} + \frac{1806.25}{10887.5}\]

\[= 12.33 + 0.166 + 12.33 + 0.166 = 24.99 \approx 25.0\]
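The same arithmetic can be checked programmatically (a Python sketch using the rounded observed and expected counts from the slides above):

```python
observed = [104, 10933, 189, 10845]
expected = [146.5, 10890.5, 146.5, 10887.5]

# chi^2 = sum over all cells of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(chi_sq, 2))  # → 24.99
```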

The Chi-Square Distribution

The chi-square statistic follows a \(\chi^2\) distribution with degrees of freedom:

\[df = (\text{rows} - 1) \times (\text{columns} - 1)\]

For a 2×2 table: \(df = (2-1)(2-1) = 1\)

Key properties of the \(\chi^2\) distribution:

  • Always non-negative (we squared the deviations)
  • Skewed right
  • Larger \(\chi^2\) values are in the right tail → correspond to smaller p-values

For our aspirin example: \(\chi^2 = 25.0\), \(df = 1\), p < 0.0001

Think-Pair-Share #1

[Poll Everywhere — respond now!]

A study examines whether blood type (A, B, AB, O) is associated with COVID-19 severity (mild, severe). Below is a partial table:

        Mild   Severe   Total
A        120       30     150
B         80       20     100
AB        40       10      50
O        200       50     250
Total    440      110     550

Discuss (2 min):

  1. What are the dimensions of this table? How many degrees of freedom?
  2. Calculate the expected count for Blood Type O / Severe.
  3. Looking at the row proportions: does there appear to be a relationship between blood type and severity?

→ Poll Everywhere: What are the degrees of freedom for this table?

R Output for Chi-Square Test

In R, the function is chisq.test(). Here is the output for the aspirin study:

    Pearson's Chi-squared test with Yates' continuity correction

    data:  aspirin_table
    X-squared = 24.429, df = 1, p-value = 7.71e-07

Reading the output:

  • X-squared = 24.429: The chi-square test statistic (slight difference from manual calc due to continuity correction)
  • df = 1: One degree of freedom (2×2 table)
  • p-value = 7.71e-07: Approximately 0.00000077 — overwhelming evidence against independence

Conclusion: There is very strong evidence of an association between aspirin use and heart attack occurrence (\(\chi^2 = 24.4\), df = 1, p < 0.001).

Larger Table R Output

For the blood type × COVID severity table:

    Pearson's Chi-squared test

    data:  blood_covid
    X-squared = 0.0, df = 3, p-value = 1

Wait — look at those row proportions again:

  • Type A: 30/150 = 20% severe
  • Type B: 20/100 = 20% severe
  • Type AB: 10/50 = 20% severe
  • Type O: 50/250 = 20% severe

Every blood type has exactly the same severity rate! O = E everywhere → \(\chi^2 = 0\). Perfect independence.

☕ BREAK — 10 minutes

Coming up:

What exactly does the chi-square test detect? And how is “independence” different from “homogeneity”?

Conditions for the Chi-Square Test

Before proceeding, check:

1. Independence: Observations are independent. → Satisfied by random sampling or random assignment.

2. Expected cell counts: All expected counts must be ≥ 5. \[E_{ij} = \frac{\text{row total} \times \text{column total}}{\text{grand total}} \geq 5 \text{ for every cell}\]

3. No sparse cells: Avoid tables where many cells have very small counts.

→ If conditions aren’t met, consider combining categories or using Fisher’s Exact Test (not covered in STAT 7).
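Because the expected-count condition is purely mechanical, it is easy to script (a Python sketch; the helper name is ours, and the threshold of 5 is the rule of thumb from this slide):

```python
def expected_counts_ok(table, threshold=5):
    """Return True if every expected count in a two-way table is >= threshold."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # E_ij = (row i total) * (column j total) / (grand total)
    return all(r * c / grand >= threshold
               for r in row_totals for c in col_totals)

aspirin = [[104, 10933], [189, 10845]]
print(expected_counts_ok(aspirin))  # → True
```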

Checking Conditions: Aspirin Study

All expected counts:

          Heart Attack   No Heart Attack
Aspirin    E = 146.5      E = 10,890.5
Placebo    E = 146.5      E = 10,887.5

All ≥ 5 — conditions are met! ✅

What if a cell had E < 5?

Example: A rare disease study with very few cases. You might need to:

  • Collect more data
  • Combine small categories
  • Use Fisher’s Exact Test

Think-Pair-Share #2

[Poll Everywhere — respond now!]

Here is R output from a study on smoking status and lung disease diagnosis:

    Pearson's Chi-squared test

    data:  smoking_lung
    X-squared = 18.72, df = 2, p-value = 0.000087

The table had 3 smoking categories (never, former, current) and 2 disease outcomes.

Discuss (2 min):

  1. How many rows and columns does this table have? Check: does df = 2 make sense?
  2. Write a complete conclusion in plain language (one statistical sentence, one biological sentence).
  3. The researcher concludes: “Smoking causes lung disease.” Is this conclusion justified from this study? What else would you need to know?

→ Poll Everywhere: What does df = 2 tell us about the table structure?

Independence vs. Homogeneity

Two chi-square tests with the same formula, but different research designs and interpretations:

Chi-Square Test for Independence:

  • One sample drawn from a population
  • Both variables measured on the same individuals
  • Question: Are two categorical variables associated?
  • Example: Survey 500 adults; ask about smoking AND lung disease.

Chi-Square Test for Homogeneity:

  • Multiple samples drawn separately from different populations
  • One categorical outcome measured on each group
  • Question: Are the distributions of the outcome the same across groups?
  • Example: Sample 200 smokers AND 300 non-smokers separately; measure lung disease in each.

Independence vs. Homogeneity: The Key Difference

Feature          Independence               Homogeneity
Sampling         One sample                 Multiple samples
Who controls?    Neither (both observed)    Researcher fixes row totals
Question         Are X and Y related?       Are distributions the same?
Calculation      Identical                  Identical
Interpretation   Association                Distributional equality

The math is identical. The interpretation differs based on the study design.

The aspirin study is actually a test of homogeneity — researchers fixed the sample sizes in each group and measured heart attack rates.

Effect Size: Cramér’s V

The chi-square statistic tells us if an association is significant. But how strong is it?

\[V = \sqrt{\frac{\chi^2}{n \cdot (k-1)}}\]

where \(k = \min(\text{rows}, \text{columns})\).

Rough guidelines:

  V    Strength
0.1    Weak
0.3    Moderate
0.5    Strong

For aspirin: \(V = \sqrt{24.4 / (22071 \times 1)} = \sqrt{0.00111} = 0.033\) — statistically significant but very weak effect size.

Even tiny effects can be statistically significant with large n. Always look at both!
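The Cramér's V figure for the aspirin study can be reproduced in a couple of lines (a Python sketch, using the Yates-corrected χ² quoted above):

```python
from math import sqrt

chi_sq = 24.4   # chi-square statistic (rounded, from the R output)
n = 22071       # total sample size
k = 2           # min(rows, columns) for a 2x2 table

# V = sqrt(chi^2 / (n * (k - 1)))
v = sqrt(chi_sq / (n * (k - 1)))
print(round(v, 3))  # → 0.033
```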

Full R Output with Expected Counts

    res <- chisq.test(aspirin_table, correct = FALSE)
    res

        Pearson's Chi-squared test

    data:  aspirin_table
    X-squared = 25.014, df = 1, p-value = 5.66e-07

    res$expected

             Heart.Attack No.Heart.Attack
    Aspirin        146.52        10890.48
    Placebo        146.48        10887.52

R can also show standardized residuals — cells where \((O-E)/\sqrt{E}\) is large indicate the biggest contributors to the chi-square statistic.
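Those standardized residuals are straightforward to compute by hand (a Python sketch with the aspirin study's rounded counts; R reports the same quantities via the fitted test object):

```python
from math import sqrt

observed = {"aspirin/HA": 104, "aspirin/noHA": 10933,
            "placebo/HA": 189, "placebo/noHA": 10845}
expected = {"aspirin/HA": 146.5, "aspirin/noHA": 10890.5,
            "placebo/HA": 146.5, "placebo/noHA": 10887.5}

# Standardized residual: (O - E) / sqrt(E); large |values| flag influential cells
residuals = {cell: (observed[cell] - expected[cell]) / sqrt(expected[cell])
             for cell in observed}

for cell, r in residuals.items():
    print(f"{cell}: {r:+.2f}")
```

The two heart-attack cells dominate (|residual| ≈ 3.5), while the no-heart-attack cells contribute almost nothing (≈ 0.4), matching the cell-by-cell terms in the χ² sum (12.33 ≈ 3.51², 0.166 ≈ 0.41²).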

Think-Pair-Share #3

[Poll Everywhere — respond now!]

A student collected data from 200 randomly selected UC Santa Cruz students, asking two questions: (1) Do you eat breakfast? (Yes/No) and (2) Do you exercise regularly? (Yes/No).

    Pearson's Chi-squared test

    data:  breakfast_exercise
    X-squared = 5.1075, df = 1, p-value = 0.0238

                  Exercise: Yes   Exercise: No   Total
Breakfast: Yes         82              38          120
Breakfast: No          42              38           80
Total                 124              76          200

Discuss (2 min):

  1. Is this a test of independence or homogeneity? How do you know?
  2. Write the hypotheses in plain language.
  3. Verify the expected count for the Breakfast Yes / Exercise Yes cell. Is the expected count condition met?
  4. Write a complete conclusion.

→ Poll Everywhere: Is this independence or homogeneity?

Connecting Chi-Square to the Z-Test

For a 2×2 table, the chi-square test and the two-proportion z-test give equivalent results:

\[\chi^2 = z^2\]

  • Aspirin z-test (from Tuesday): \(z = 5.01\), so \(z^2 = 25.1 \approx \chi^2 = 25.0\)
  • Both give p-value < 0.0001

When to use which:

  • Use the z-test when you want a one-sided alternative, or when you want a CI for the difference
  • Use chi-square when you have more than 2 categories, or when you want to describe the overall association without a directional hypothesis
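The χ² = z² identity is easy to confirm numerically (a Python sketch with the rounded statistics quoted above):

```python
z = 5.01        # two-proportion z statistic from Tuesday
chi_sq = 25.0   # uncorrected chi-square statistic for the same table

# For a 2x2 table, the chi-square statistic equals the squared z statistic
print(round(z ** 2, 1))            # → 25.1
print(abs(z ** 2 - chi_sq) < 0.2)  # → True (agreement up to rounding)
```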

Summary: Chi-Square Tests

Formula: \(\chi^2 = \sum \frac{(O-E)^2}{E}\)

Expected counts: \(E_{ij} = \frac{\text{row}_i \times \text{col}_j}{n}\)

Degrees of freedom: \((r-1)(c-1)\)

Conditions: All \(E_{ij} \geq 5\); independent observations

Hypotheses:

  • \(H_0\): No association between the two variables (independence) OR distributions are the same (homogeneity)
  • \(H_a\): There is an association OR distributions differ

In R: chisq.test(table_name)

What We’ve Covered in Week 9

Tuesday: Inference for proportions

  • Single proportion CI and test
  • Two-proportion CI and test
  • Conditions: success/failure, pooling, SE formulas

Thursday: Chi-square tests

  • Contingency tables and expected counts
  • Chi-square statistic and distribution
  • Independence vs. homogeneity
  • Conditions and R output

This completes the new material for STAT 7!

Thank you!

This concludes the material for the course. Next week, we will review formulas and exercises in both lectures and discussion sections.

You have been a wonderful group to work with! Keep up the great work and keep asking great questions.