Paired Tests & Statistical vs Practical Significance

STAT 7

Welcome!

STAT 7 - Winter 2026

Today’s Plan

  • Motivating Example: The DASH Diet Study
  • Statistical vs. Practical Significance
  • Paired t-tests (continued from last week)
  • Conditions for t-tests
  • Practice with real data

Our Motivating Example: The DASH Diet

The DASH Study (Appel et al., 1997, New England Journal of Medicine)

Background: Hypertension (high blood pressure) affects ~1 in 3 US adults and increases risk of heart disease and stroke.

Question: Can dietary changes reduce blood pressure without medication?

The DASH Diet:

  • Rich in fruits, vegetables, low-fat dairy
  • Reduced saturated fat and total fat
  • Emphasis on whole grains, poultry, fish, nuts

The DASH Study Design

Participants: 459 adults with high-normal or stage 1 hypertension

Three dietary groups (independent samples):

  1. Control diet - typical American diet
  2. Fruits & Vegetables diet - control + more produce
  3. DASH diet - full intervention

Duration: 8 weeks on assigned diet

Primary outcome: Change in systolic blood pressure (mmHg)

Key Question for Today

We’ll focus on comparing participants before and after the DASH diet intervention.

This is a paired design:

  • Same people measured twice
  • Before starting diet (baseline)
  • After 8 weeks on diet

Research question: Does the DASH diet reduce systolic blood pressure?

Think-Pair-Share 1

Scenario: In a preliminary analysis, researchers find that DASH diet participants had a mean reduction of 5.5 mmHg with p = 0.001.

  1. Think (1 min): Is this statistically significant at α = 0.05? Separately, does a 5.5 mmHg reduction matter clinically?
  2. Pair (2 min): Discuss with your neighbor - what else would you want to know?
  3. Share: What did you conclude?

Statistical vs. Practical Significance

Statistical Significance

  • Based on p-value vs. α
  • Answers: “Is the effect distinguishable from chance?”
  • Affected by sample size
  • p-value < \(\alpha\) → “significant”

Practical Significance

  • Based on effect size
  • Answers: “Does it matter?”
  • Independent of sample size
  • Requires domain knowledge
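A small sketch of why sample size drives statistical significance while leaving the effect size untouched. The numbers are hypothetical (a 1.0 mmHg mean reduction with an SD of 8.0 mmHg), chosen only to keep the arithmetic clean, and the helper name `t_statistic` is ours rather than anything from the study.

```python
import math

def t_statistic(mean_diff, sd_diff, n):
    """One-sample t statistic for H0: mean difference = 0."""
    return mean_diff / (sd_diff / math.sqrt(n))

# Hypothetical values: the effect size stays fixed, only n changes
mean_diff, sd_diff = 1.0, 8.0
results = {n: t_statistic(mean_diff, sd_diff, n) for n in (25, 400, 10_000)}

for n, t in results.items():
    print(f"n = {n:>6}: mean reduction = {mean_diff} mmHg, t = {t:.3f}")
```

The same 1.0 mmHg reduction gives t = 0.625 with 25 participants and t = 12.5 with 10,000, so a large enough study can make even a clinically minor effect "significant."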

Clinical Context Matters

For blood pressure reduction:

  • 2-3 mmHg reduction: Detectable with large samples, but minimal clinical benefit
  • 5-6 mmHg reduction: Associated with ~10-15% lower stroke risk
  • 10+ mmHg reduction: Substantial cardiovascular benefit

The DASH study found: 5.5 mmHg reduction in systolic BP

Both statistically AND practically significant!

This magnitude of reduction, if sustained, translates to meaningful health benefits at the population level.

Paired vs. Independent Samples

| | Paired | Independent |
|---|---|---|
| Data Structure | Two measurements per subject | Separate subjects in each group |
| Examples | Before/after, Left/right, Matched pairs | Treatment vs. control (different people) |
| What We Analyze | Differences within pairs (\(d_i\)) | Difference between group means |
| Key Advantage | Controls for individual variation | Simpler study design |

Paired t-Test: DASH Diet Example

Research Question: Does the DASH diet reduce systolic blood pressure?

Study design:

  • Measured blood pressure at baseline and after 8 weeks
  • Each person serves as their own control

Data: Systolic BP (mmHg)

Why paired? Same individuals measured under both conditions (before/after)

The DASH Study: Scale and Impact

Full study enrollment: 459 adults with high-normal or stage 1 hypertension

Three diet groups:

  • Control diet (n = 154)
  • Fruits & Vegetables diet (n = 154)
  • DASH diet (n = 151) ← our focus today

Duration: 8 weeks of dietary intervention

Publication: Appel et al. (1997), New England Journal of Medicine

DASH Diet Group Results (n = 151)

Published summary statistics from Appel et al. (1997):

  • Mean baseline systolic BP: 131.3 mmHg
  • Mean at week 8: 125.8 mmHg
  • Mean reduction (Baseline - Week 8): 5.5 mmHg
  • SD of reductions: 7.8 mmHg
  • Sample size: n = 151 participants

This is paired data - same 151 people measured at two time points

Calculating the Differences

For each participant i: \(d_i = \text{Baseline}_i - \text{Week 8}_i\)

A positive difference = blood pressure decreased (improvement!)

The sample mean difference: \(\bar{d} = \frac{\sum d_i}{n}\)

From the published study:

  • Mean reduction: \(\bar{d} = 5.5\) mmHg
  • SD of differences: \(s_d = 7.8\) mmHg
  • Sample size: \(n = 151\) participants

Key insight: We now have ONE sample (of differences) to analyze!
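To make the "one sample of differences" idea concrete, here is a minimal sketch with made-up readings for five participants. These values are hypothetical and are not from the DASH data, which we only have as summary statistics.

```python
from statistics import mean, stdev

# Hypothetical systolic BP readings (mmHg), one pair per participant
baseline = [132, 128, 135, 130, 127]
week8 = [126, 125, 127, 129, 121]

# d_i = Baseline_i - Week 8_i, so a positive value means BP went down
diffs = [b - w for b, w in zip(baseline, week8)]

d_bar = mean(diffs)   # sample mean of the differences
s_d = stdev(diffs)    # sample SD of the differences (n - 1 denominator)

print(f"differences = {diffs}")
print(f"d_bar = {d_bar:.2f} mmHg, s_d = {s_d:.2f} mmHg")
```

Once the two columns are collapsed into `diffs`, everything that follows is an ordinary one-sample analysis of that single list.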

Hypotheses for Paired Test

Let δ = population mean reduction in systolic BP (Baseline - Week 8)

  • H₀: δ = 0 (no change in blood pressure)
    • The DASH diet has no effect on BP
  • Hₐ: δ > 0 (blood pressure decreases)
    • The DASH diet reduces BP
  • This is a one-sided test (we expect reduction)

Why one-sided? The researchers hypothesized a decrease in BP based on prior evidence.

Paired t-Test Formula

\[t = \frac{\bar{d} - \delta_0}{s_d/\sqrt{n}}\]

Where:

  • \(\bar{d}\) = sample mean of differences
  • \(\delta_0\) = hypothesized population mean difference (usually 0)
  • \(s_d\) = standard deviation of differences
  • \(n\) = number of pairs
  • \(df = n - 1\)

Calculating the Test Statistic

Using the published statistics:

  • \(\bar{d} = 5.5\) mmHg
  • \(s_d = 7.8\) mmHg
  • \(n = 151\) participants
  • \(df = 150\)

\[t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{5.5 - 0}{7.8/\sqrt{151}} = \frac{5.5}{0.635} = 8.66\]

Interpretation: The observed mean reduction is about 8.66 standard errors above 0, the value we’d expect if the diet had no effect.
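The arithmetic above can be checked in a few lines. This sketch only plugs the summary statistics quoted in these slides into the paired t formula; the variable names are ours.

```python
import math

# Summary statistics quoted in the slides
d_bar, s_d, n = 5.5, 7.8, 151

se = s_d / math.sqrt(n)     # standard error of the mean difference
t_stat = (d_bar - 0) / se   # null value delta_0 = 0
df = n - 1

print(f"SE = {se:.3f}, t = {t_stat:.2f}, df = {df}")
# prints: SE = 0.635, t = 8.66, df = 150
```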

Finding the p-value

  • One-sided test (right tail) with df = 150
  • We’re asking: If there’s truly no effect, what’s the probability of seeing a reduction this large or larger?

p-value < 0.001 (extremely small!)

Decision: Since p < 0.001 << 0.05, we strongly reject H₀

Conclusion: There is overwhelming evidence that the DASH diet reduces systolic blood pressure.

The observed reduction of 5.5 mmHg is both:

  • Statistically significant (p < 0.001)
  • Clinically meaningful (5+ mmHg reduces cardiovascular risk)
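For anyone curious how small the p-value actually is, here is a rough standard-library sketch that integrates the t density numerically over the right tail. The function names `t_pdf` and `upper_tail` are ours, and the midpoint-rule integration is just one simple way to do it. In practice a statistics library would normally be used instead (for example `scipy.stats.t.sf`, assuming SciPy is installed).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=20_000):
    """Approximate P(T > t) with the midpoint rule.

    The substitution x = t + u / (1 - u) maps the infinite range
    [t, infinity) onto u in [0, 1), so a finite grid covers the whole tail.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        x = t + u / (1.0 - u)
        total += t_pdf(x, df) / (1.0 - u) ** 2
    return total * h

# One-sided (right-tail) p-value for the DASH test statistic
p_value = upper_tail(8.66, 150)
print(f"one-sided p-value = {p_value:.2e}")
```

The result is many orders of magnitude below 0.001, which is why the slides simply report p < 0.001.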

Confidence Interval for Paired Data

A 95% confidence interval for the mean reduction δ:

\[\bar{d} \pm t^* \times \frac{s_d}{\sqrt{n}}\]

Where \(t^*\) is the critical value with area 0.025 in each tail

For our DASH data (df = 150), \(t^* \approx 1.976\):

\[5.5 \pm 1.976 \times \frac{7.8}{\sqrt{151}} = 5.5 \pm 1.976 \times 0.635 = 5.5 \pm 1.25\]

95% CI: (4.25, 6.75) mmHg

Interpretation: We’re 95% confident that the DASH diet reduces systolic blood pressure by 4.25 to 6.75 mmHg on average.

Clinical significance: This entire interval represents clinically meaningful reductions that reduce cardiovascular disease risk!
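The interval can be reproduced the same way. This sketch takes the critical value \(t^* \approx 1.976\) directly from the slide rather than computing it, and the variable names are ours.

```python
import math

# Summary statistics and critical value as quoted in the slides
d_bar, s_d, n = 5.5, 7.8, 151
t_star = 1.976  # two-sided 95% critical value for df = 150

margin = t_star * s_d / math.sqrt(n)
ci_low, ci_high = d_bar - margin, d_bar + margin

print(f"margin of error = {margin:.2f} mmHg")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}) mmHg")
# prints a margin of 1.25 and the interval (4.25, 6.75)
```

Because the whole interval sits above 0, it agrees with the decision to reject H₀.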

Think-Pair-Share 2

Scenario: A physical therapist measures knee flexibility (degrees) in 8 patients before and after a 6-week stretching program.

  1. Think (1 min): Is this paired or independent samples? Why?
  2. Pair (2 min): What would be the null and alternative hypotheses?
  3. Share: How would you set up the analysis?

Conditions for Paired t-Tests

Before conducting a paired t-test, check:

  1. Independence: Pairs are independent of each other
    • Random sampling or random assignment
    • One pair’s difference doesn’t affect another’s
  2. Normality: The differences are approximately normal
    • Check with histogram or Q-Q plot of differences
    • Less important for larger samples (n ≥ 30)

Checking Conditions: DASH Study

Before conducting a paired t-test, we need to check:

1. Independence:

  • ✓ Random assignment to diet groups
  • ✓ One participant’s BP change doesn’t affect another’s

2. Normality of differences:

  • We don’t have individual data to plot, but…
  • With n = 151, the Central Limit Theorem applies!
  • The sampling distribution of \(\bar{d}\) will be approximately normal even if individual differences aren’t perfectly normal

Conclusion: Conditions are satisfied. The paired t-test is appropriate.

Key advantage of large n: The t-test is robust to moderate violations of normality when sample size is large (n ≥ 30, and we have 151!).

When Conditions Aren’t Met

If normality is violated (especially with small samples):

  • Check for outliers (one extreme value can skew results)
  • Consider a non-parametric test (Wilcoxon signed-rank test)
  • Collect more data if possible
  • Transform the data (e.g., log transformation)

Remember: With large samples (n ≥ 30), the Central Limit Theorem helps!

Break Time! ☕ 5-minute break

Stretch, grab water, chat with neighbors!

We’ll resume with a practice problem.

Practice Problem: Exercise and Depression

A clinical psychologist studies whether regular exercise reduces depression symptoms. Fifteen patients with mild depression complete a standardized depression inventory (scale 0-50, higher = more depressed), then exercise 30 minutes daily for 8 weeks, then retake the inventory.

Data: Reductions in depression score (Before - After):

8, 12, 5, 15, 10, 7, 13, 9, 11, 8, 14, 6, 10, 12, 9

Question: Does exercise significantly reduce depression scores?

Setting Up the Analysis

  • Type: Paired t-test (same patients, before/after)
  • Hypotheses:
    • H₀: δ = 0 (no change in depression)
    • Hₐ: δ > 0 (depression decreases) - one-sided
  • Significance level: α = 0.05
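With only 15 patients, the normality condition deserves a quick look before computing anything. One rough check, sketched below under our own choices, is to pair the sorted differences with theoretical normal quantiles (the points of a Q-Q plot) and see how close to a straight line they fall. The plotting positions \((i - 0.5)/n\) and the helper `pearson_r` are our assumptions, not part of the problem statement.

```python
import math
from statistics import NormalDist, mean

# Reductions in depression score (Before - After) from the problem
diffs = [8, 12, 5, 15, 10, 7, 13, 9, 11, 8, 14, 6, 10, 12, 9]
n = len(diffs)

# Q-Q plot coordinates: theoretical normal quantiles vs. sorted data
theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
observed = sorted(diffs)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A correlation near 1 means the Q-Q points lie close to a straight line
qq_r = pearson_r(theoretical, observed)
print(f"Q-Q correlation = {qq_r:.3f}")
```

A correlation close to 1, with no single point far off the line, suggests the differences are roughly normal. This is an informal screen, not a formal test, and a plotted Q-Q plot or histogram is still worth a look.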

Computational Steps

To perform paired t-test:

  1. Calculate the mean difference: \(\bar{d} = \frac{\sum d_i}{n}\)
  2. Calculate the SD of differences: \(s_d\)
  3. Calculate t-statistic: \(t = \frac{\bar{d} - 0}{s_d/\sqrt{n}}\)
  4. Find critical value or p-value with df = n - 1
  5. Make decision and interpret

Key: We’re analyzing the differences, not the original scores!

Solution

  • \(\bar{d} = 9.93\) points
  • \(s_d = 2.91\) points
  • \(n = 15\), \(df = 14\)
  • \(t = 13.2\)
  • p-value (one-sided) < 0.001

Decision: p < 0.001, strongly reject H₀

Conclusion: Exercise significantly reduces depression scores (mean reduction = 9.93 points, p < 0.001).

Clinical context: A reduction of ~10 points on a 50-point scale represents a meaningful improvement in depressive symptoms.
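The solution values can be reproduced from the 15 differences with the standard library. The critical value 1.761 (one-sided, α = 0.05, df = 14) is an assumption taken from a standard t-table, since it does not appear in these slides.

```python
import math
from statistics import mean, stdev

# Reductions in depression score (Before - After) from the problem
diffs = [8, 12, 5, 15, 10, 7, 13, 9, 11, 8, 14, 6, 10, 12, 9]

n = len(diffs)
d_bar = mean(diffs)                      # step 1: mean difference
s_d = stdev(diffs)                       # step 2: sample SD of differences
t_stat = d_bar / (s_d / math.sqrt(n))    # step 3: t statistic with delta_0 = 0
df = n - 1

# Step 4: compare with the one-sided 5% critical value.
# 1.761 for df = 14 comes from a standard t-table (not from the slides).
t_crit = 1.761
reject = t_stat > t_crit

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, df = {df}, t = {t_stat:.1f}")
print(f"reject H0 at alpha = 0.05: {reject}")
```

Note that the code works only with `diffs`, never with the original before and after scores.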

Think-Pair-Share 3

Back to the DASH Study: The full DASH trial found that the DASH diet reduced systolic BP by 5.5 mmHg (p < 0.001) compared to the control diet.

Meanwhile, a different diet intervention study with 5000 participants found a 1.2 mmHg reduction (p = 0.03).

  1. Think (1 min): Both are statistically significant. Which is more important clinically?
  2. Pair (2 min): Discuss: How does sample size affect what we can detect?
  3. Share: What’s the key lesson about statistical vs. practical significance?

Key Takeaways

  1. Paired tests analyze differences within subjects/pairs - more powerful when appropriate
  2. Statistical significance (p-value) ≠ practical significance (effect size)
  3. Always check conditions before running tests:
    • Independence of pairs
    • Normality of differences (or large n)
  4. Context matters - a 5 mmHg BP reduction is meaningful; a 0.5 mmHg reduction is not
  5. Report effect sizes and confidence intervals, not just p-values!

The DASH Study Impact

The original DASH trial (Appel et al., 1997):

  • 459 participants across three diet groups
  • Found clinically meaningful BP reductions
  • Led to national dietary guidelines
  • Influenced treatment of hypertension

Key lesson: Well-designed studies with appropriate analysis can change public health policy!

The DASH diet remains a cornerstone of dietary recommendations for cardiovascular health.

Looking Ahead

Next class (Thursday):

  • Continue with DASH study: comparing diet groups (independent samples)
  • Statistical power
  • Sample size calculations for study planning

For Thursday:

  • Review: How do independent samples differ from paired designs?
  • Think about: How was the full DASH study designed?
  • Bring questions about hypothesis testing!