Independent Samples & Statistical Power

STAT 7

Welcome!

STAT 7 - Winter 2026

Today’s Plan

Continue: The DASH Diet Study
Independent samples t-tests
Introduction to statistical power
Calculating power for a study
Sample size determination
Planning future research

Recap: Tuesday’s Analysis

The DASH Diet Study (Appel et al., 1997, NEJM)

We analyzed a paired design:

Same participants before and after DASH diet
Mean reduction: 5.5 mmHg systolic BP
p < 0.001
Both statistically and clinically significant!

Today: How do we compare the DASH diet to OTHER diets?

The Full DASH Study Design

Three independent groups (different people in each):

Control diet (n = 154) - typical American diet
Fruits & Vegetables (n = 154) - control + more produce
DASH diet (n = 151) - full intervention

Key difference from Tuesday:

Not the same people measured twice
Different participants in each diet group
This is an independent samples design

Comparing Two Diet Groups

Research Question: Does the DASH diet reduce blood pressure more than the Fruits & Vegetables diet?

Summary statistics (change in systolic BP from baseline):

DASH diet: mean = -5.5 mmHg, SD = 7.8 mmHg, n = 151
F&V diet: mean = -2.8 mmHg, SD = 7.5 mmHg, n = 154

Note: Negative values indicate BP decreased (good!)

Independent samples - Different people in each diet group

Independent Samples: Hypotheses

Let μ₁ = mean BP change for DASH diet
Let μ₂ = mean BP change for F&V diet

H₀: μ₁ = μ₂ (no difference between diets)
- Equivalently: μ₁ - μ₂ = 0
Hₐ: μ₁ ≠ μ₂ (there is a difference)
- Equivalently: μ₁ - μ₂ ≠ 0
Two-sided test at α = 0.05

Why two-sided? Though we expect DASH to be better, we test for any difference.

Independent Samples t-Test Formula

\[t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\]

Where:

\(\bar{x}_1, \bar{x}_2\) = sample means
\(s_1, s_2\) = sample standard deviations
\(n_1, n_2\) = sample sizes
Under H₀: μ₁ - μ₂ = 0
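
As a rough check, the formula above can be applied to the DASH and F&V summary statistics from the previous slide. This is a minimal Python sketch using only the standard library; the function name `welch_t` is a label chosen here for illustration and is not part of the course materials.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Return (difference, standard error, t) for two independent samples."""
    diff = mean1 - mean2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # unpooled standard error
    t = (diff - null_diff) / se                # null_diff is mu1 - mu2 under H0
    return diff, se, t

# Summary statistics from the slides (change in systolic BP, mmHg)
diff, se, t = welch_t(-5.5, 7.8, 151, -2.8, 7.5, 154)
print(f"difference = {diff:.1f} mmHg, SE = {se:.2f} mmHg, t = {t:.2f}")
```

The standard error is left unpooled, which matches the denominator of the formula and does not assume equal variances.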

Degrees of Freedom for Independent Samples

Simple approximation: \(df = \min(n_1 - 1, n_2 - 1)\)

Better approximation (Welch’s):

\[df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1-1) + (s_2^2/n_2)^2/(n_2-1)}\]

Most statistical software offers Welch’s method, and some tools (for example, R’s t.test) use it by default.

For our example: df ≈ 150 (using simpler method)
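
To see how far apart the two approximations are, both can be evaluated on the DASH and F&V summary statistics. The sketch below is illustrative only; `welch_df` is a hypothetical helper name, and the inputs are the SDs and sample sizes quoted earlier.

```python
def welch_df(sd1, n1, sd2, n2):
    """Welch approximation to the degrees of freedom for two independent samples."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

simple_df = min(151 - 1, 154 - 1)      # the simple rule
df = welch_df(7.8, 151, 7.5, 154)      # the Welch formula
print(f"simple df = {simple_df}, Welch df = {df:.1f}")
```

With these inputs the Welch value comes out at roughly 300, so the simple rule (150) is the more conservative of the two. At sample sizes this large the choice barely changes the critical value.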

DASH vs. F&V: Calculation

Difference in means: -2.7 mmHg

Standard error: 0.88 mmHg

t-statistic: -3.08

df: 150

p-value: ≈ 0.002

Decision: p < 0.05, reject H₀

Conclusion: The DASH diet reduces blood pressure significantly more than the Fruits & Vegetables diet alone (additional reduction of 2.7 mmHg, p ≈ 0.002).

Clinical and Practical Significance

Statistical result: p ≈ 0.002 - statistically significant at α = 0.05

Practical significance:

DASH reduces BP 2.7 mmHg more than F&V diet
This additional reduction, while statistically significant, is modest
Both diets show benefit compared to control

The bigger picture:

DASH: 5.5 mmHg reduction from baseline
F&V: 2.8 mmHg reduction from baseline
Control: 0.9 mmHg reduction from baseline

All three differ significantly from each other!

Confidence Interval: Independent Samples

\[(\bar{x}_1 - \bar{x}_2) \pm t^* \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\]

For 95% CI with large df, \(t^* \approx 1.96\):

95% CI: (-4.42, -0.98) mmHg

Interpretation: We’re 95% confident that the DASH diet reduces systolic blood pressure by between 0.98 and 4.42 mmHg more than the F&V diet.

Note: Entire interval is negative (favoring DASH)
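
The interval can be reproduced from the summary statistics with a few lines of Python. This sketch assumes the large-df shortcut from the slide, so it uses the standard normal quantile in place of t*. The variable names are illustrative.

```python
import math
from statistics import NormalDist

diff = -5.5 - (-2.8)                           # DASH minus F&V, mmHg
se = math.sqrt(7.8**2 / 151 + 7.5**2 / 154)    # unpooled standard error
z_star = NormalDist().inv_cdf(0.975)           # about 1.96, standing in for t*

lower, upper = diff - z_star * se, diff + z_star * se
print(f"95% CI: ({lower:.2f}, {upper:.2f}) mmHg")
```

Using an exact t critical value instead would give a marginally wider interval.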

Conditions for Independent Samples t-Test

Independence:
- Observations within each group are independent
- The two groups are independent of each other
- DASH study: Random assignment to diet groups ✓
Normality:
- Data in each group should be approximately normal
- OR large enough samples (n₁, n₂ ≥ 30)
- DASH study: n > 150 in each group ✓
Note: We DON’T assume equal variances (use Welch’s t-test)

The DASH study meets all conditions!

Break Time! ☕ 5-minute break

Stretch, grab water, chat with neighbors!

We’ll resume with statistical power.

Planning the Next Study

Based on DASH results, researchers want to design a new study:

Question: Can a modified DASH diet be effective in adolescents with pre-hypertension?

Before collecting data, need to determine sample size
How many adolescents to enroll?
This requires power analysis

Statistical Power: Introduction

Imagine planning this new dietary intervention study…

The diet truly does work (unknown to you)
You run your study
Will you detect the effect?

Power = Probability of correctly rejecting H₀ when Hₐ is true

In other words: Probability of detecting a real effect when it exists

Type I and Type II Errors (Review)

	H₀ is TRUE	H₀ is FALSE (Hₐ is TRUE)
Reject H₀	Type I Error (False Positive) - Probability = α	Correct! (True Positive) - Probability = 1-β (Power)
Fail to Reject H₀	Correct (True Negative) - Probability = 1-α	Type II Error (False Negative) - Probability = β

Power = 1 - β = Probability of detecting a true effect
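
One way to make the table concrete is to simulate many two-group studies and count how often H₀ is rejected. The sketch below is a simplified illustration, not the course’s method. It assumes a z-test with known σ, and it borrows σ = 8, n = 50 per group, and a true difference of -4 mmHg from the dietary planning example in this lecture. The function name, trial count, and seed are arbitrary choices made here.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_diff, n=50, sigma=8.0, alpha=0.05, trials=2000, seed=7):
    """Fraction of simulated two-group studies that reject H0 (z-test, known sigma)."""
    rng = random.Random(seed)
    se = math.sqrt(2 * sigma**2 / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        treated = [rng.gauss(true_diff, sigma) for _ in range(n)]
        control = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_i_rate = rejection_rate(true_diff=0.0)   # H0 true: every rejection is a false positive
power = rejection_rate(true_diff=-4.0)        # H0 false: every rejection is a true positive
print(f"Type I rate = {type_i_rate:.3f}, power = {power:.3f}, beta = {1 - power:.3f}")
```

The first rate should land near α and the second near the study’s power. Both are estimates and will vary slightly with the seed and number of trials.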

Why Does Power Matter?

Low power → High chance of missing real effects
- Wastes resources (time, money, participants)
- Misleading null results
- Ethical concerns (participants bear the burden of a study that is unlikely to give a clear answer)
High power → Good chance of detecting real effects
- More confidence in study design
- Better use of resources

Typical goal: Power ≥ 80% (sometimes 90%)

What Affects Power?

Power depends on:

Effect size (Δ) - How large is the true difference?
- Larger effects → easier to detect → higher power
Variability (σ) - How much do measurements vary?
- Less variability → easier to detect effects → higher power
Sample size (n) - How many subjects?
- Larger samples → more precise estimates → higher power
Significance level (α) - How strict is our threshold?
- Larger α → easier to reject H₀ → higher power (but more Type I errors!)
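
The four factors can be explored numerically with a small power function. This is a sketch under a normal approximation with equal σ and equal n in both groups. The baseline values (Δ = 4, σ = 8, n = 50, α = 0.05) come from the planning example in this lecture, while the alternative values are hypothetical and chosen only to show the direction of each effect.

```python
import math
from statistics import NormalDist

def power_two_means(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means (equal n, equal sigma)."""
    z = NormalDist()
    se = math.sqrt(2 * sigma**2 / n)
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    # Probability of landing in either tail of the rejection region when Ha is true
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

baseline = power_two_means(delta=4, sigma=8, n=50)
scenarios = {
    "baseline (delta=4, sigma=8, n=50, alpha=0.05)": baseline,
    "larger effect (delta=6)": power_two_means(6, 8, 50),
    "less variability (sigma=6)": power_two_means(4, 6, 50),
    "larger sample (n=100)": power_two_means(4, 8, 100),
    "larger alpha (alpha=0.10)": power_two_means(4, 8, 50, alpha=0.10),
}
for label, value in scenarios.items():
    print(f"{label}: power = {value:.2f}")
```

Each change moves power above the baseline, in line with the four bullets above.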

Example: Planning a Dietary Study

Scenario: Researchers want to test a simplified DASH-style diet in adolescents.

Previous studies (like DASH): σ ≈ 8 mmHg for BP changes
Effect of interest: Δ = 4 mmHg reduction (clinically meaningful)
Significance level: α = 0.05 (two-sided)
Question: With n = 50 per group, what’s the power?

Why n = 50? Budget constraints for this pilot study

Visualizing the Problem

Null distribution (H₀: no difference):

Center: 0
SE = \(\sqrt{8^2/50 + 8^2/50} = 1.60\) mmHg

Alternative distribution (Hₐ: difference = -4):

Center: -4
Same SE = 1.60 mmHg

Rejection region: |difference| > 1.96 × 1.60 = 3.14 mmHg

The Power Calculation

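Power = P(Reject H₀ | Hₐ is true) ≈ 71%: the area under the alternative distribution that falls in the rejection region (shaded green in the slide figure)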

Computing Power

Step 1: Find rejection region for H₀

For α = 0.05, two-sided: critical values at ±1.96 SE
Rejection region: difference < -3.14 or > 3.14 mmHg

Step 2: Calculate probability under Hₐ (when true difference = -4)

Convert to z-score: \(z = \frac{-3.14 - (-4)}{1.60} = 0.54\)
P(Z < 0.54) ≈ 0.71

Power ≈ 71% - Below the typical 80% goal, so it could be higher.

Interpretation: If the diet truly reduces BP by 4 mmHg, there’s a 71% chance this study will detect it.
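
The two steps above can be sketched in Python with the standard library’s `NormalDist`. The sketch uses the same normal approximation as the slides and also adds the upper tail of the rejection region, which the hand calculation drops because it is negligible here. The variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma, n, delta, alpha = 8.0, 50, -4.0, 0.05
z = NormalDist()

# Step 1: rejection region under H0
se = math.sqrt(sigma**2 / n + sigma**2 / n)     # 1.60 mmHg
cutoff = z.inv_cdf(1 - alpha / 2) * se          # about 3.14 mmHg

# Step 2: probability of each tail when the true difference is -4
lower_tail = z.cdf((-cutoff - delta) / se)      # P(difference < -cutoff)
upper_tail = 1 - z.cdf((cutoff - delta) / se)   # P(difference > +cutoff), negligible
power = lower_tail + upper_tail
print(f"SE = {se:.2f}, cutoff = {cutoff:.2f}, power = {power:.3f}")
```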

Increasing Power to 80%

Target: 80% power to detect Δ = 4 mmHg reduction

Key insight: The rejection region always begins 1.96 SE from 0. We need the alternative distribution far enough to the left that 80% of it falls in the rejection region.

This requires: \(0.84 \times SE + 1.96 \times SE = 4\)

Where 0.84 is the z-score for 80th percentile (for 80% power)
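
The step from the 80% requirement to this equation can be written out as follows. This is a sketch under the same normal approximation, ignoring the negligible upper tail; \(\Phi\) denotes the standard normal CDF, a symbol introduced here for convenience.

\[\text{Power} = P(\text{difference} < -1.96 \times SE \mid \text{true difference} = -4) = \Phi\left(\frac{-1.96 \times SE - (-4)}{SE}\right) = 0.80\]

\[\frac{4 - 1.96 \times SE}{SE} = 0.84 \quad\Longrightarrow\quad 4 = 0.84 \times SE + 1.96 \times SE\]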

Sample Size Calculation

\[2.8 \times SE = 4\]

\[SE = \frac{4}{2.8} = 1.43\]

Since \(SE = \sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{n}} = \sqrt{\frac{2\sigma^2}{n}}\) with σ = 8:

\[\sqrt{\frac{2 \times 8^2}{n}} = 1.43\]

\[n = \frac{2 \times 8^2}{1.43^2} \approx 62.6\]

Rounded up: 63 participants per group

Conclusion: Need 63 adolescents in each diet group for 80% power.

Sample Size Formula

For comparing two means with equal n per group:

\[n = \frac{(\sigma_1^2 + \sigma_2^2)(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}\]

Where:

σ₁, σ₂ = population standard deviations (often assumed equal)
Δ = minimum effect size of interest
\(z_{1-\alpha/2}\) = critical value for significance (1.96 for α = 0.05)
\(z_{1-\beta}\) = critical value for power (0.84 for 80% power, 1.28 for 90%)

Always round UP!

Verify Our Calculation

Using the formula for our dietary study:

\[n = \frac{(8^2 + 8^2)(1.96 + 0.84)^2}{4^2}\]

\[n = \frac{128 \times (2.8)^2}{16} = \frac{128 \times 7.84}{16} = 62.7\]

Round up to n = 63 per group ✓

This matches our earlier calculation!
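
The same check can be scripted. The function below is a minimal sketch of the sample size formula, with `n_per_group` as a hypothetical helper name. It uses unrounded z-values from `NormalDist`, so the raw result differs slightly from the hand calculation, which rounds to 1.96 and 0.84.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma1, sigma2, alpha=0.05, power=0.80):
    """Participants per group for a two-sided comparison of two means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # about 0.84 for 80% power
    raw = (sigma1**2 + sigma2**2) * (z_alpha + z_beta) ** 2 / delta**2
    return raw, math.ceil(raw)           # always round up

raw, n = n_per_group(delta=4, sigma1=8, sigma2=8)
print(f"raw n = {raw:.1f}, enroll {n} per group")
```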

Practice: Omega-3 Supplementation Study

A nutrition researcher wants to test omega-3 supplementation on inflammation markers.

Previous data: σ ≈ 12 mg/L for C-reactive protein (CRP)
Target reduction: Δ = 8 mg/L (clinically meaningful)
Desired power: 90%
Significance: α = 0.05 (two-sided)

Calculate: How many participants needed per group?

Solution

\[n = \frac{(12^2 + 12^2)(1.96 + 1.28)^2}{8^2}\]

Where:

\(z_{1-\alpha/2}\) = 1.96 (for α = 0.05)
\(z_{1-\beta}\) = 1.28 (for 90% power)

\[n = \frac{2 \times 144 \times (3.24)^2}{64} = \frac{288 \times 10.50}{64} = 47.25\]

Answer: Need 48 participants per group (supplement vs. placebo)

This is a modest sample size - omega-3 studies are feasible!
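
A quick way to see why the answer rounds up is to compute the approximate power at n = 47 and at n = 48. This sketch assumes the same normal approximation as the formula and counts only the dominant tail of the rejection region; `approx_power` is an illustrative name.

```python
import math
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided z-test for two means with equal n and sigma."""
    z = NormalDist()
    se = math.sqrt(2 * sigma**2 / n)
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Dominant tail only; the opposite tail is negligible for these inputs
    return z.cdf(abs(delta) / se - z_crit)

power_47 = approx_power(delta=8, sigma=12, n=47)
power_48 = approx_power(delta=8, sigma=12, n=48)
print(f"n = 47: power = {power_47:.3f}; n = 48: power = {power_48:.3f}")
```

Under these assumptions, 47 per group falls just short of the 90% target and 48 per group just clears it.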

The DASH Study’s Legacy

Original DASH trial (1997):

Carefully planned with adequate sample size
Multiple diet groups for comparisons
Detected clinically meaningful effects
Published in top medical journal

Impact:

Changed dietary guidelines nationwide
Led to DASH eating plan recommendations
Spawned follow-up studies (DASH-Sodium, DASH-Plus)
Continues to influence public health policy

Key lesson: Good study design with proper power analysis leads to impactful science!

Power Analysis: Key Principles

Do power analysis BEFORE collecting data
- Determines appropriate sample size
- Justifies study to reviewers/funders
Balance competing factors:
- Higher power → need more participants → more cost/time
- Lower power → risk missing real effects
Be realistic about effect sizes
- Use previous studies
- Consider minimum clinically important difference
Remember: Power = 1 - P(Type II Error)

Real-World Considerations

Why not always use huge samples for 99% power?

Cost - Each participant costs money
Time - Recruitment takes time
Ethics - Don’t expose more participants than necessary
Diminishing returns - Going from 80% to 90% power requires a much larger increase in n than going from 60% to 70%

Standard practice: Target 80-90% power
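
The diminishing-returns point can be illustrated with the dietary study inputs (Δ = 4 mmHg, σ = 8 mmHg, α = 0.05). This is a sketch using the normal-approximation sample size formula from this lecture; the list of power targets is chosen here for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(power, delta=4.0, sigma=8.0, alpha=0.05):
    """Per-group sample size for two means (normal approximation, equal sigma)."""
    z = NormalDist()
    total_z = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * sigma**2 * total_z**2 / delta**2)

targets = [0.60, 0.70, 0.80, 0.90, 0.99]
sizes = {p: n_per_group(p) for p in targets}
for p, n in sizes.items():
    print(f"power {p:.0%}: n = {n} per group")
```

With these inputs, moving from 60% to 70% power adds about 10 participants per group, while moving from 80% to 90% adds about 22. Reaching 99% takes more than twice the 80% sample size.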

Summary: Power and Sample Size

To Increase Power:	Effect on Study:
Increase sample size (n)	More expensive, takes longer
Study larger effect sizes (Δ)	May not match research question
Reduce variability (σ)	Better measurement, stricter inclusion
Increase α	More Type I errors - usually not done

Bottom line: Sample size is usually the most practical lever

Key Takeaways

Independent samples t-tests compare means from two separate groups (like DASH vs. F&V diet)
Power = probability of detecting a real effect when it exists
Power depends on: effect size (Δ), variability (σ), sample size (n), and α
Sample size planning ensures adequate power - do this BEFORE collecting data!
Standard target: 80-90% power
The DASH study exemplifies well-designed research with lasting impact

Bottom line: Invest time in planning. The DASH researchers did, and it changed nutrition guidelines!

Looking Ahead

Next week (Week 8):

Correlation between two quantitative variables
Simple linear regression
Moving from comparing groups to studying relationships

Before next class:

Complete DSA 6 (due after your DS, or in Thursday’s class for those in the Monday DS)
Complete HW 5 (due Friday)
Review: paired vs. independent designs
