Stat 17: Introduction to Statistical Methods for Business and Economics – STAT 17: Confidence Intervals & Hypothesis Testing

Case Study: The Drug Approval Decision

Meet Dr. Chen, a medical researcher testing a new drug to lower blood pressure.

Her challenge: The pharmaceutical company claims the drug lowers blood pressure by at least 10 mmHg. Dr. Chen must:

Test whether this claim is supported by data
Balance two types of errors: approving ineffective drugs vs rejecting effective ones
Make a decision with statistical evidence
Communicate findings to the FDA

The stakes: Approving an ineffective drug wastes money and gives false hope. Rejecting an effective drug denies patients a helpful treatment.

The tool: Hypothesis testing - the scientific method in statistical form!

Understanding hypothesis testing helps Dr. Chen (and you!) make evidence-based decisions.

Quick Review: Confidence Intervals

What we learned last time:

Central Limit Theorem:

\(\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\) for large n
Standard Error: SE = σ/√n

Three types of CIs:

Mean, σ known: \(\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\)
Mean, σ unknown: \(\bar{x} \pm t_{\alpha/2,df} \times \frac{s}{\sqrt{n}}\)
Proportion: \(\hat{p} \pm z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Key insight: CIs provide a range of plausible values for a parameter

What We’ll Accomplish Today

By the end of this lecture, you will be able to:

Formulate appropriate null and alternative hypotheses
Understand Type I and Type II errors and their probabilities
Choose appropriate test statistics and distributions
Calculate p-values using Google Sheets
Conduct complete hypothesis tests with proper conclusions
Interpret statistical significance in context
Understand the relationship between CIs and hypothesis tests

From Estimation to Testing

So far: Using data to estimate parameters

Point estimate: x̄ = 24.3 hours
Interval estimate: 95% CI = (24.0, 24.6) hours

Now: Using data to test claims about parameters

Claim: “The average battery life is 24 hours”
Question: Do our data support or refute this claim?

This is hypothesis testing!

The Logic of Hypothesis Testing

The scientific method:

Start with a claim (hypothesis)
Collect data
See if data are consistent with the claim
Make a decision: reject or fail to reject the claim

Key principle: Proof by contradiction

Assume the claim is true
If data are very unlikely under this assumption, reject the claim
If data are reasonably likely, we don’t reject the claim

Important: We never “prove” hypotheses, we only gather evidence for or against them!

Example: Drug lowers BP by 10 mmHg

If we observe only 2 mmHg reduction with large sample → reject claim
If we observe 9 mmHg reduction → not enough evidence to reject

But how do we find the cutoff between "reject" and "fail to reject"?

The Null and Alternative Hypotheses

Null Hypothesis (H₀):

The “status quo” or “nothing interesting” claim
What we assume is true initially
Always has =, ≤, or ≥
The hypothesis we try to find evidence AGAINST

Alternative Hypothesis (H₁ or Hₐ):

The “research” hypothesis
What we’re trying to find evidence FOR
Has ≠, <, or >
Determines the type of test (two-tailed, left-tailed, right-tailed)

Key rule: H₀ and H₁ must be:

Mutually exclusive (can’t both be true)
Exhaustive (one must be true)
Statements about POPULATION parameters, not sample statistics

Types of Alternative Hypotheses

Three types based on research question:

1. Two-tailed (≠):

H₀: μ = μ₀
H₁: μ ≠ μ₀
Use when: Interested in detecting any difference (either direction)
Example: “Is the mean different from 24?”

2. Right-tailed (>):

H₀: μ ≤ μ₀
H₁: μ > μ₀
Use when: Want to show parameter is greater
Example: “Has the new process increased battery life?”

3. Left-tailed (<):

H₀: μ ≥ μ₀
H₁: μ < μ₀
Use when: Want to show parameter is less
Example: “Has the drug lowered blood pressure?”

The alternative hypothesis determines which tail(s) we look at!

Formulating Hypotheses: Examples

Example 1: Drug testing

Claim: Drug lowers BP by at least 10 mmHg (mean reduction μ ≥ 10)

H₀: μ ≥ 10 (drug is effective)
H₁: μ < 10 (drug is not effective enough)
Type: Left-tailed

Example 2: Quality control

Standard: Battery life should be 24 hours (μ = 24)

H₀: μ = 24 (meeting standard)
H₁: μ ≠ 24 (not meeting standard)
Type: Two-tailed

Example 3: Process improvement

Question: Has training improved customer satisfaction above 75%?

H₀: p ≤ 0.75 (no improvement)
H₁: p > 0.75 (improvement occurred)
Type: Right-tailed

Key: The research question determines H₁, and H₀ is the complement!

THINK-PAIR-SHARE 1 (7 minutes)

Formulating Hypotheses

For each scenario, write H₀ and H₁, and identify the test type:

A company claims their phone battery lasts at least 48 hours. You want to test this claim.
Historical average GPA at UCSC is 3.2. Has it changed?
A website claims their ads have a 5% click-through rate. You think it’s lower.
A manufacturer wants to know if a new process produces parts with mean weight different from the current 50 grams.
A hospital wants to show that their new treatment reduces recovery time below the current 7 days.

For each: Identify the parameter, write hypotheses, name test type

Post on Ed Discussion with partner’s name!

Test Statistics: The Evidence Measure

Test statistic: A single number that measures how far the sample data are from H₀

For means (σ known or large sample):

\[z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\]

For means (σ unknown, small sample):

\[t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\]

For proportions:

\[z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\]

Interpretation:

How many standard errors is our sample statistic from the null value?
Large |test statistic| → data inconsistent with H₀
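The formulas above can be sketched in Python as a cross-check on hand or spreadsheet calculations. This is only an illustration: the function names are hypothetical, and the input numbers are example values rather than results from a real study.

```python
import math

def mean_test_statistic(x_bar, mu_0, sd, n):
    """(x_bar - mu_0) / (sd / sqrt(n)).

    This is a z statistic when sd is the known population sigma,
    and a t statistic (df = n - 1) when sd is the sample s.
    """
    return (x_bar - mu_0) / (sd / math.sqrt(n))

def proportion_test_statistic(p_hat, p_0, n):
    """z statistic for a proportion; the standard error uses the null value p_0."""
    return (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)

# Illustrative inputs (assumed for this sketch)
t_stat = mean_test_statistic(x_bar=23.5, mu_0=24, sd=4, n=100)
z_stat = proportion_test_statistic(p_hat=0.74, p_0=0.80, n=200)
print(round(t_stat, 2), round(z_stat, 2))
```

Both results are negative, meaning the sample statistic sits below the null value; the magnitude counts how many standard errors separate them.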

The P-Value: Measuring Evidence

P-value definition:

The probability of observing data as extreme or more extreme than what we got, assuming H₀ is true

Interpretation:

Small p-value → Data unlikely under H₀ → Evidence against H₀
Large p-value → Data consistent with H₀ → Insufficient evidence against H₀

Important: P-value is NOT:

The probability H₀ is true
The probability H₁ is true
The probability we made the wrong decision

Calculating P-Values by Test Type

The p-value depends on the alternative hypothesis:

Two-tailed test (H₁: μ ≠ μ₀):

p-value = 2 × P(Z > |test statistic|)
Look at both tails

Right-tailed test (H₁: μ > μ₀):

p-value = P(Z > test statistic)
Look at right tail only

Left-tailed test (H₁: μ < μ₀):

p-value = P(Z < test statistic)
Look at left tail only
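As a rough sketch of the three rules above for a z statistic, the following uses Python's standard-library `statistics.NormalDist`. The function name `z_p_value` and the tail labels are assumptions made for this illustration.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """P-value for a z statistic; tail is 'two', 'right', or 'left'."""
    cdf = NormalDist().cdf  # standard normal CDF, P(Z <= z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))   # both tails
    if tail == "right":
        return 1 - cdf(z)              # area to the right of z
    if tail == "left":
        return cdf(z)                  # area to the left of z
    raise ValueError("tail must be 'two', 'right', or 'left'")

# The same statistic gives very different p-values depending on H1
z = 1.5  # hypothetical test statistic
for tail in ("two", "right", "left"):
    print(tail, round(z_p_value(z, tail), 4))
```

Note how a positive z yields a large left-tailed p-value: the data point in the opposite direction from that alternative.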

Google Sheets formulas coming up!

Significance Level (alpha α)

Significance level (α): The threshold for rejecting H₀

Common choices:

α = 0.05 (most common)
α = 0.01 (more conservative)
α = 0.10 (more liberal)

Decision rule:

If p-value ≤ α → Reject H₀ (statistically significant)
If p-value > α → Fail to reject H₀ (not statistically significant)

Relationship to confidence intervals:

α = 0.05 corresponds to 95% CI
α = 0.01 corresponds to 99% CI
α = 0.10 corresponds to 90% CI

Note: α is chosen BEFORE seeing the data!

The Four-Step Hypothesis Test

Step 1: STATE

State H₀ and H₁
Define parameters clearly
Choose significance level α

Step 2: PLAN

Check conditions (random sample, sample size, etc.)
Choose test statistic (z or t)
Identify distribution

Step 3: SOLVE

Calculate test statistic
Find p-value
Use Google Sheets for calculations

Step 4: CONCLUDE

Compare p-value to α
Make decision (reject or fail to reject H₀)
State conclusion in context

Always follow all four steps for complete tests!

Example: Battery Life Test

Scenario: Sarah’s company claims μ = 24 hours. She tests n = 100 phones, finds x̄ = 23.5 hours, s = 4 hours. Test at α = 0.05.

STEP 1: STATE

H₀: μ = 24 hours
H₁: μ ≠ 24 hours (two-tailed)
α = 0.05
Parameter: μ = true mean battery life

STEP 2: PLAN

Conditions: Random sample ✓, n = 100 ≥ 30 ✓, σ unknown (so we use s and the t distribution)
Test statistic: \(t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\)
Distribution: t with df = 99

STEP 3: SOLVE

t = (23.5 - 24)/(4/SQRT(100)) = -0.5/0.4 = -1.25
p-value = 2 × P(T < -1.25) with df = 99
=2*T.DIST(-1.25, 99, TRUE) ≈ 0.214

STEP 4: CONCLUDE

p-value = 0.214 > α = 0.05
Fail to reject H₀
Conclusion: There is insufficient (statistical) evidence at the 0.05 significance level to conclude that the mean battery life is different from 24 hours.
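For readers who want to cross-check the spreadsheet result, here is a minimal Python sketch of the same test. Python's standard library has no t distribution, so the helper `t_cdf` below (a hypothetical name) approximates the CDF by integrating the t density with Simpson's rule; it is meant to play the role of `T.DIST(x, df, TRUE)`, not to replace a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

# Battery life example: H0: mu = 24 vs H1: mu != 24
x_bar, mu_0, s, n, alpha = 23.5, 24, 4, 100, 0.05
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = 2 * t_cdf(-abs(t_stat), n - 1)      # two-tailed
decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
print(round(t_stat, 2), round(p_value, 3), decision)
```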

Google Sheets: P-Value Calculations

z-tests

For proportions or means with σ known:

Two-tailed:

=2*NORM.S.DIST(-ABS(z), TRUE)

Right-tailed:

=1-NORM.S.DIST(z, TRUE)

Left-tailed:

=NORM.S.DIST(z, TRUE)

t-tests

For means with σ unknown:

Two-tailed:

=2*T.DIST(-ABS(t), df, TRUE)

Right-tailed:

=1-T.DIST(t, df, TRUE)

Left-tailed:

=T.DIST(t, df, TRUE)

Pro tip: Use ABS() for absolute value in two-tailed tests!

THINK-PAIR-SHARE 2 (7 minutes)

Complete Hypothesis Test

A coffee shop claims the average wait time is 5 minutes. You sample n = 36 customers and find x̄ = 5.8 minutes with s = 2.4 minutes. Test at α = 0.05 whether the true mean wait time is different from 5 minutes.

Follow the four-step process:

STATE: Write H₀, H₁, define α and parameter
PLAN: Check conditions, identify test statistic and distribution
SOLVE: Calculate test statistic and p-value (use Google Sheets)
CONCLUDE: Make decision and write conclusion in context

Bonus: What would change if this were a right-tailed test (want to show wait time exceeds 5 minutes)?

Post on Ed Discussion with partner’s name!

Share your answers in Poll Everywhere!

What is the p-value for this test?

🧘‍♀️ STRETCH BREAK

Time to move! (5 minutes)

Stand up and stretch 🤸‍♀️
Chat with neighbors about inference 💬
Grab some water 💧

Type I and Type II Errors

Four possible outcomes in hypothesis testing:

	H₀ True	H₀ False
Reject H₀	Type I Error (α)	Correct Decision (Power)
Fail to Reject H₀	Correct Decision	Type II Error (β)

Type I Error (False Positive):

Reject H₀ when H₀ is actually true
Probability = α (significance level)
Example (taking H₀ as “the drug has no effect”): Approve an ineffective drug

Type II Error (False Negative):

Fail to reject H₀ when H₀ is actually false
Probability = β
Example (taking H₀ as “the drug has no effect”): Reject an effective drug

Power = 1 - β:

Probability of correctly rejecting false H₀
Higher power is better!

Understanding Errors in Context

Dr. Chen’s drug trial:

H₀: Drug reduces BP by ≥ 10 mmHg (effective)
H₁: Drug reduces BP by < 10 mmHg (ineffective)

Type I Error (α):

Reject H₀ when true
Conclude drug ineffective when it actually works
Consequence: Deny patients effective treatment
Controlled by significance level α

Type II Error (β):

Fail to reject H₀ when false
Conclude drug effective when it doesn’t work well enough
Consequence: Approve ineffective drug
Affected by sample size, effect size, α

The trade-off:

Decrease α → increase β (more conservative)
Increase α → decrease β (more liberal)
Increase n → decrease β at a fixed α (α is chosen by the researcher, so a larger sample is the way to lower both error rates at once)

Calculating Type I Error Probability

Type I error rate = α (by definition!)

We CHOOSE α, which directly sets the Type I error rate

Example scenarios:

Life-or-death medical decision:

Use α = 0.01 (very conservative)
Only a 1% chance of a Type I error when H₀ is true (e.g., approving an ineffective treatment when H₀ is “no effect”)

Preliminary research:

Use α = 0.10 (more liberal)
10% chance of false positive acceptable

Standard research:

Use α = 0.05 (balanced)
5% chance of Type I error

Key insight: We control Type I error directly by choosing α!
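One way to see that α really is the Type I error rate is to simulate many samples from a population where H₀ is true and count how often a test rejects. The sketch below assumes a two-tailed z-test with σ known; the population values (μ₀ = 24, σ = 4), sample size, number of trials, and seed are all made up for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=17):
    """Fraction of simulated samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    mu_0, sigma = 24.0, 4.0                        # hypothetical population; H0 holds
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                        # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)  # should land near alpha = 0.05, up to simulation noise
```

Changing `alpha` in the call shifts the long-run rejection rate with it, which is the sense in which we "choose" the Type I error rate.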

Calculating Type II Error Probability

Type II error rate (β) depends on:

Significance level (α): Lower α → higher β
Sample size (n): Larger n → lower β
Effect size: Larger true difference → lower β
Population variability (σ): Higher σ → higher β

Calculating β requires:

Specifying alternative value of parameter
Finding probability of not rejecting H₀ when that alternative is true
More complex calculation (often use software)
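A minimal sketch of that calculation for the simplest case, a left-tailed z-test with σ known, is shown below. The function name and the numbers (μ₀ = 24, a specific alternative of 23, σ = 4, n = 100) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def beta_left_tailed(mu_0, mu_alt, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu >= mu_0) when the true mean is mu_alt < mu_0."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    # We reject H0 when the sample mean falls below this cutoff
    critical_mean = mu_0 - std_normal.inv_cdf(1 - alpha) * se
    # Beta is the chance the sample mean stays at or above the cutoff
    # when the sample mean is centered at mu_alt
    return 1 - std_normal.cdf((critical_mean - mu_alt) / se)

beta = beta_left_tailed(mu_0=24, mu_alt=23, sigma=4, n=100)
power = 1 - beta
print(round(beta, 3), round(power, 3))

# Larger n shrinks beta for the same alpha and alternative
for n in (25, 50, 100, 200):
    print(n, round(beta_left_tailed(24, 23, 4, n), 3))
```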

Power = 1 - β:

Probability of detecting a true effect
Researchers often aim for power ≥ 0.80 (β ≤ 0.20)

Sample size planning:

Choose n to achieve desired power for expected effect size

Power Analysis Example

Dr. Chen’s power analysis:

Setup:

H₀: μ ≥ 10 mmHg reduction
H₁: μ < 10 mmHg reduction
Want to detect if true reduction is only 8 mmHg
α = 0.05, σ = 5 mmHg

Question: What sample size gives 80% power?

Answer: Use power analysis (Google Sheets add-on or statistical software)

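For this scenario (one-sided z-test, σ treated as known): n = ((z_{0.05} + z_{0.20}) × σ / (10 − 8))² = ((1.645 + 0.842) × 5 / 2)² ≈ 38.6, so n ≈ 39 patients needed
With n = 39: Power ≈ 0.80, β ≈ 0.20

Interpretation:

80% chance of detecting that the drug falls short of the claim when μ = 8
20% chance of Type II error (failing to reject H₀, i.e., approving a drug that only reduces BP by 8)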

This is BEFORE conducting the study!
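The sample-size question can be sketched with the standard normal-approximation formula n = ((z_α + z_β) σ / δ)², where δ is the gap between the null value and the alternative we want to detect. This assumes a one-sided z-test with σ known; dedicated power-analysis software may use different assumptions (for example a t-test) and report a somewhat different n. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_one_sided(delta, sigma, alpha=0.05, power=0.80):
    """Smallest n giving the requested power for a one-sided z-test."""
    z = NormalDist().inv_cdf
    return math.ceil(((z(1 - alpha) + z(power)) * sigma / delta) ** 2)

def power_one_sided(delta, sigma, n, alpha=0.05):
    """Power of a one-sided z-test when the true mean is delta away from mu_0."""
    std_normal = NormalDist()
    return std_normal.cdf(delta * math.sqrt(n) / sigma - std_normal.inv_cdf(1 - alpha))

# Dr. Chen's setup: mu_0 = 10, alternative of interest 8, sigma = 5, alpha = 0.05
delta = 10 - 8
n_needed = sample_size_one_sided(delta, sigma=5)
print(n_needed, round(power_one_sided(delta, 5, n_needed), 3))
```

Because the required n scales with (σ/δ)², halving the detectable difference roughly quadruples the sample size.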

Example: Testing a Proportion

Scenario: Company claims 80% customer satisfaction (p = 0.80). You survey n = 200 customers, find 148 satisfied (p̂ = 0.74). Test at α = 0.05.

STEP 1: STATE

H₀: p = 0.80
H₁: p ≠ 0.80 (two-tailed)
α = 0.05

STEP 2: PLAN

Check: np₀ = 200(0.80) = 160 ≥ 10 ✓, n(1-p₀) = 40 ≥ 10 ✓
Test statistic: \(z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\)

STEP 3: SOLVE

SE = SQRT(0.80*0.20/200) = 0.0283
z = (0.74 - 0.80)/0.0283 = -2.12
p-value = 2*NORM.S.DIST(-2.12, TRUE) ≈ 0.034

STEP 4: CONCLUDE

p-value = 0.034 < α = 0.05
Reject H₀
Conclusion: There is sufficient evidence at the 0.05 level to conclude that the true customer satisfaction rate is different from 80%.
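The four steps of this proportion test can also be sketched end to end in Python. The function name `one_proportion_z_test` is hypothetical, and the test is two-tailed, matching H₁: p ≠ 0.80.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p_0, alpha=0.05):
    """Two-tailed one-proportion z-test; returns (z, p_value, reject_h0)."""
    # PLAN: large-sample conditions are checked with the null value p_0
    if n * p_0 < 10 or n * (1 - p_0) < 10:
        raise ValueError("need n*p_0 >= 10 and n*(1 - p_0) >= 10")
    # SOLVE: the standard error also uses p_0, because we assume H0 is true
    p_hat = successes / n
    se = math.sqrt(p_0 * (1 - p_0) / n)
    z = (p_hat - p_0) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    # CONCLUDE: compare the p-value to alpha
    return z, p_value, p_value <= alpha

z, p_value, reject = one_proportion_z_test(successes=148, n=200, p_0=0.80)
print(round(z, 2), round(p_value, 3), reject)
```

Rerunning with `alpha=0.01` flips the decision to "fail to reject", which is why α must be fixed before looking at the data.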

One-Tailed vs Two-Tailed Tests

Two-tailed test:

Use when: Interested in ANY difference
Critical regions in BOTH tails
p-value calculation: multiply by 2
More conservative (harder to reject)

One-tailed test:

Use when: Interested in specific direction only
Critical region in ONE tail
p-value calculation: use one tail only
More powerful for detecting effect in specified direction

Example: Battery life (μ₀ = 24 hours)

Two-tailed:

Is it different? (could be better OR worse) - H₁: μ ≠ 24

One-tailed:

Is it worse? (only care about decrease) - H₁: μ < 24

Important: Choose BEFORE seeing data based on research question!

CIs and Hypothesis Tests

Key connection: CIs and two-tailed tests give same conclusion!

For a two-tailed test at α:

If (1-α)×100% CI contains μ₀ → Fail to reject H₀
If (1-α)×100% CI does NOT contain μ₀ → Reject H₀

Example: Battery life

95% CI: (22.7, 24.3) hours
Test H₀: μ = 24 vs H₁: μ ≠ 24
Since 24 is IN the CI → Fail to reject H₀ at α = 0.05
Test H₀: μ = 22 vs H₁: μ ≠ 22
Since 22 is NOT in the CI → Reject H₀ at α = 0.05
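The correspondence can be checked numerically. The sketch below simplifies by treating σ = 4 as known and using z instead of t, so its interval differs slightly from the t-based one quoted above; the candidate null values are chosen only for illustration.

```python
import math
from statistics import NormalDist

x_bar, sigma, n, alpha = 23.5, 4, 100, 0.05   # sigma treated as known (assumption)
std_normal = NormalDist()
se = sigma / math.sqrt(n)
z_crit = std_normal.inv_cdf(1 - alpha / 2)
ci = (x_bar - z_crit * se, x_bar + z_crit * se)

def two_tailed_p(mu_0):
    """P-value for H0: mu = mu_0 vs H1: mu != mu_0."""
    z = (x_bar - mu_0) / se
    return 2 * std_normal.cdf(-abs(z))

# For every candidate mu_0: inside the CI  <=>  fail to reject at alpha
results = []
for mu_0 in (22, 23, 24, 25):
    in_ci = ci[0] <= mu_0 <= ci[1]
    fail_to_reject = two_tailed_p(mu_0) > alpha
    results.append((mu_0, in_ci, fail_to_reject))
    print(mu_0, in_ci, fail_to_reject)
```

In every row the two booleans agree, which is the duality at work: the CI is exactly the set of null values the two-tailed test would not reject.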

Why this works:

CI shows plausible values for parameter
If μ₀ is plausible (in CI), we don’t reject it

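Note: This correspondence holds for two-tailed tests at matching levels (a test at α paired with a (1−α)×100% CI built from the same independent random sample). For proportions it is only approximate, because the CI uses p̂ in its standard error while the test uses p₀.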

THINK-PAIR-SHARE 3 (7 minutes)

Hypothesis Testing with Proportions

A politician claims to have 55% support. A poll of n = 400 voters finds 200 support the politician (p̂ = 0.50). Test at α = 0.01 whether the true support differs from 55%.

Complete the test:

STATE: Write hypotheses, α, parameter
PLAN: Check conditions, identify test statistic
SOLVE: Calculate z and p-value
CONCLUDE: Decision and interpretation

Additional questions:

Construct a 99% CI for p. Does it contain 0.55?
Does the CI conclusion match the hypothesis test? Why?
What would be the conclusion at α = 0.05 instead?

Work in pairs, and then answer on Poll Everywhere individually.

Share your answers in Poll Everywhere!

What is the test statistic (z-value)?

Statistical vs Practical Significance

Statistical significance: p-value ≤ α

Says: A result this extreme would be unlikely if H₀ were true
Affected by sample size

Practical significance: Effect size matters in real world

Says: Effect is large enough to care about
Independent of sample size

Example: Large study (n = 10,000)

Drug lowers BP by 1 mmHg
p-value < 0.001 (highly statistically significant!)
But 1 mmHg clinically meaningless (not practically significant)

Example: Small study (n = 20)

Drug lowers BP by 15 mmHg
p-value = 0.08 (not statistically significant)
But 15 mmHg very important clinically (practically significant!)

Always report: p-value AND effect size (like difference in means)
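To see why sample size drives statistical significance, the sketch below holds a small effect fixed and varies only n. It assumes a two-tailed z-test with σ known; the effect (1 mmHg) echoes the large-study example, while σ = 15 mmHg and the sample sizes are hypothetical values picked for illustration.

```python
import math
from statistics import NormalDist

def two_tailed_p(effect, sigma, n):
    """Two-tailed z-test p-value for an observed mean difference `effect`."""
    z = effect / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))

effect, sigma = 1.0, 15.0   # hypothetical: 1 mmHg effect, sigma = 15 mmHg
p_by_n = {n: two_tailed_p(effect, sigma, n) for n in (20, 200, 10_000)}
for n, p in p_by_n.items():
    print(n, p)
```

The effect size never changes, yet the p-value falls from clearly non-significant to far below 0.001 as n grows, so the p-value alone says nothing about whether 1 mmHg matters clinically.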

Common Mistakes in Hypothesis Testing

Mistake 1: “Accepting” H₀

Wrong: “We accept H₀”
Right: “We fail to reject H₀” or “insufficient evidence against H₀”

Mistake 2: Wrong interpretation of p-value

Wrong: “p = 0.03 means there’s 3% chance H₀ is true”
Right: “p = 0.03 means data this extreme occur 3% of time if H₀ true”

Mistake 3: Changing α after seeing results

Must choose α BEFORE analyzing data
“P-hacking” or “data fishing” invalidates test

Mistake 4: Confusing significance with importance

Small p-value doesn’t mean large or important effect

Mistake 5: Wrong conclusion statement

Always state conclusion in context of problem
Not just “reject H₀” but what this means practically

Complete Test: Dr. Chen’s Drug Trial

Setup: Drug should reduce BP by ≥10 mmHg. Test n = 50 patients, find x̄ = 8.5 mmHg reduction, s = 6 mmHg. Test at α = 0.05.

STEP 1: STATE

H₀: μ ≥ 10 mmHg (drug effective as claimed)
H₁: μ < 10 mmHg (drug not effective enough)
α = 0.05, μ = true mean BP reduction

STEP 2: PLAN

Random sample ✓, n = 50 ≥ 30 ✓
Test statistic: \(t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\), df = 49

STEP 3: SOLVE

t = (8.5 - 10)/(6/SQRT(50)) = -1.5/0.849 = -1.77
p-value = T.DIST(-1.77, 49, TRUE) ≈ 0.042

STEP 4: CONCLUDE

p-value = 0.042 < α = 0.05 → Reject H₀
Conclusion: There is sufficient evidence at the 0.05 level to conclude that the drug reduces blood pressure by less than 10 mmHg. The drug does not meet the effectiveness claim.

Practical implication: FDA should not approve based on this evidence.

Choosing the Right Test

Scenario	Parameter	Conditions	Test Statistic	Distribution
Mean, σ known	μ	Random, any n if normal, n≥30 if not	\(z = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\)	z
Mean, σ unknown	μ	Random, n≥30 or normal	\(t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}\)	t (df=n-1)
Proportion	p	Random, np₀≥10, n(1-p₀)≥10	\(z = \frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\)	z

Decision tree:

Testing mean or proportion?
If mean: Do you know σ?
Check conditions
Calculate test statistic
Find p-value
Make decision

THINK-PAIR-SHARE 4 (7 minutes)

Complete Hypothesis Test with Errors

A factory produces bolts with specified length of 5 cm. Quality control samples n = 40 bolts, finds x̄ = 5.15 cm, s = 0.4 cm.

Test at α = 0.05 whether mean length differs from 5 cm (complete 4 steps)
What type of error could you have made? Describe it in context.
If you failed to reject H₀, what would the Type II error be in context?
How would you reduce the probability of Type II error?
Construct a 95% CI for μ. Does it support your hypothesis test conclusion?
If α = 0.01 instead, would your conclusion change?

Use Google Sheets! Post on Ed Discussion!

Summary: Hypothesis Testing Framework

The Four-Step Process:

STATE: H₀, H₁, α, parameter
PLAN: Conditions, test statistic, distribution
SOLVE: Calculate test statistic and p-value
CONCLUDE: Decision and interpretation in context

Key Concepts:

Hypotheses: H₀ (assume true), H₁ (trying to show)
Test statistic: Measures deviation from H₀
P-value: Probability of data as extreme or more extreme than observed, given H₀
α: Significance level (Type I error rate)
Errors: Type I (reject true H₀), Type II (fail to reject false H₀)
Power: 1 - β, probability of rejecting a false H₀

Google Sheets: T.DIST, NORM.S.DIST for p-values

Connecting Everything: CIs and Hypothesis Tests

Both methods use:

Same conditions (random sampling, sample size)
Same distributions (z or t)
Same standard errors

Key differences:

Feature	Confidence Interval	Hypothesis Test
Purpose	Estimate parameter	Test claim about parameter
Output	Range of values	Decision (reject or not)
Information	Shows precision	Shows significance
Flexibility	Test many values	Test one value

When to use each:

CI: When you want to estimate parameter
Test: When you want to test specific claim
Both: Often reported together in research!

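Example: Drug trial

CI: 90% CI of about (7.1, 9.9) mmHg reduction (8.5 ± 1.677 × 0.849; a 90% two-sided interval matches a one-tailed test at α = 0.05)
Test: Reject H₀: μ ≥ 10 (p = 0.042)
Together: Drug reduces BP by roughly 7 to 10 mmHg, short of the claimed 10+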

Back to Dr. Chen’s Story

Dr. Chen now understands:

✅ How to formulate hypotheses about drug effectiveness

✅ The trade-off between Type I and Type II errors

✅ How to calculate and interpret p-values

✅ That rejecting H₀: μ ≥ 10 means there is sufficient evidence that the drug falls short of the claimed 10 mmHg reduction

✅ How to communicate findings: “Drug reduces BP by 8.5 mmHg (90% CI: 7.1 to 9.9), which is significantly less than the claimed 10 mmHg (p = 0.042)”

Her decision:

Reject the pharmaceutical company’s claim
Recommend against FDA approval
Suggest company reformulate or revise claims

This is evidence-based decision making!

Real-World Applications

Medicine:

Clinical trials for new treatments
Comparing treatment effectiveness

Business:

A/B testing for website designs
Quality control in manufacturing

Science:

Testing scientific theories
Comparing experimental conditions

Public Policy:

Evaluating program effectiveness
Testing policy impacts

Sports:

Player performance analysis
Strategy effectiveness

All use the same hypothesis testing framework we learned today!

Quick Knowledge Check ✅

Rate your confidence (1-5) on Ed Discussion:

Formulating null and alternative hypotheses ⭐⭐⭐⭐⭐
Understanding Type I and Type II errors ⭐⭐⭐⭐⭐
Calculating and interpreting p-values ⭐⭐⭐⭐⭐
Conducting complete hypothesis tests (4 steps) ⭐⭐⭐⭐⭐
Choosing appropriate test statistics ⭐⭐⭐⭐⭐
Understanding relationship between CIs and tests ⭐⭐⭐⭐⭐
Interpreting results in context ⭐⭐⭐⭐⭐

If you rated anything 3 or below, visit office hours!

Thank you! 📊✨