Lecture 8: Normal Distribution & Intro to Inference

STAT 7 - Statistical Methods for the Biological, Environmental & Health Sciences

10 Mar 2026

Welcome! Today’s Agenda

  1. Normal Distribution
  2. Normal Approximation to Binomial
  3. Sampling and Samples
  4. Parameters vs. Statistics

This is where probability meets inference!

Normal Distribution

The Normal Distribution

The most important continuous distribution in statistics!

Properties:

  • Bell-shaped, symmetric curve
  • Defined by two parameters: mean (\(\mu\)) and standard deviation (\(\sigma\))
  • Notation: \(X \sim N(\mu, \sigma)\)
  • Total area under curve = 1
  • Many natural phenomena are approximately normal

Why Is It Called “Normal”?

History: The name comes from early statisticians who thought this distribution was the “norm” or typical pattern in nature.

Reality: It’s common but not universal!

  • Heights, weights (within groups)
  • Measurement errors
  • Test scores (often designed to be)
  • Averages of large samples (Central Limit Theorem!)

Normal Distribution Notation

\[X \sim N(\mu, \sigma)\]

\(\mu = 0, \sigma = 1\)

\(\mu = 20, \sigma = 3\)

Same shape, different location and spread!

The 68-95-99.7 Rule

Empirical Rule: For any normal distribution:

  • 68% of data within 1 SD of mean
  • 95% within 2 SD
  • 99.7% within 3 SD

This is incredibly useful for quick estimates!

Example: Student Heights

Heights in this class are approximately \(N(168, 7)\) (cm)

Using the 68-95-99.7 rule:

  1. What percentage are between 161 and 175 cm?
    • 161 = 168 - 7 and 175 = 168 + 7
    • This is \(\mu \pm 1\sigma\), so ≈68%
  2. What percentage are between 154 and 182 cm?
    • 154 = 168 - 14 and 182 = 168 + 14
    • This is \(\mu \pm 2\sigma\), so ≈95%
  3. What percentage are taller than 168 cm?
    • 168 = \(\mu\), and normal is symmetric
    • 50%

Z-Scores: Standardizing

Z-score = number of standard deviations from the mean

\[z = \frac{x - \mu}{\sigma}\]

Properties:

  • Positive z → above mean
  • Negative z → below mean
  • \(|z| > 2\) is “unusual”

Purpose: Allows comparison across different normal distributions!

Z-Score Example: Test Scores

SAT: \(N(1500, 300)\), ACT: \(N(21, 5)\)

  • Pam scored 1800 on SAT
  • Jim scored 24 on ACT

Who did better relative to their test?

Pam’s z-score: \[z = \frac{1800 - 1500}{300} = 1.0\]

Jim’s z-score: \[z = \frac{24 - 21}{5} = 0.6\]

Pam scored better (1 SD vs 0.6 SD above mean)

Using Google Sheets for Normal

NORM.DIST(x, mean, sd, cumulative)

  • Returns P(X ≤ x) when cumulative = TRUE
  • We almost always use cumulative = TRUE

NORM.INV(probability, mean, sd)

  • Returns the x value with that cumulative probability
  • Useful for finding percentiles

Example: For \(N(168, 7)\), what’s P(X ≤ 175)?

=NORM.DIST(175, 168, 7, TRUE) = 0.841

Practice: Normal Calculations

Heights: \(N(168, 7)\) cm

Calculate using Google Sheets:

  1. P(height > 168)
    • =1 - NORM.DIST(168, 168, 7, TRUE) = 0.50
  2. P(height between 161 and 175)
    • =NORM.DIST(175, 168, 7, TRUE) - NORM.DIST(161, 168, 7, TRUE)
    • = 0.841 - 0.159 = 0.682 ≈ 68%
  3. Find height of 70th percentile
    • =NORM.INV(0.70, 168, 7) = 171.7 cm

Interactive Demo

Let’s explore with the Normal Distribution App:

https://istats.shinyapps.io/NormalDist/

Try different values of \(\mu\) and \(\sigma\)!

Normal Distribution: More Examples

Example: Test scores are \(N(100, 15)\)

Using Google Sheets:

  1. What percentage have Test Score > 130?
    • =1 - NORM.DIST(130, 100, 15, TRUE) = 0.0228 ≈ 2.3%
  2. What Test Score score is the 90th percentile?
    • =NORM.INV(0.90, 100, 15) = 119.2
  3. What percentage have Test Score between 85 and 115?
    • =NORM.DIST(115, 100, 15, TRUE) - NORM.DIST(85, 100, 15, TRUE)
    • = 0.841 - 0.159 = 0.682 ≈ 68%

Checking for Normality

How do we know if data are approximately normal?

  1. Histogram - Is it roughly bell-shaped and symmetric?
  2. Q-Q Plot (Quantile-Quantile) - Do points fall on a straight line?
  3. 68-95-99.7 Rule - Does the data follow this pattern?
  4. Statistical tests - Shapiro-Wilk, Kolmogorov-Smirnov (later!)

Remember: Real data are never exactly normal, we look for “close enough”

Not Everything is Normal!

Examples of NON-normal distributions:

Right-skewed:

  • Income
  • House prices
  • Time until failure

Left-skewed:

  • Age at death (developed countries)
  • Test scores (easy test)

Bimodal, uniform, etc.

Don’t force normality where it doesn’t exist!

Stretch Break!

Stand up, stretch, take a deep breath!

Normal Approximation to Binomial

Recall: Binomial Distribution

\(X \sim Bin(n, p)\) counts successes in n trials

  • Discrete (whole numbers only)
  • Can calculate exact probabilities
  • But cumbersome for large n!

Question: Can we use the normal distribution to approximate?

The Normal-Binomial Connection

When n is large, the binomial distribution starts to look normal!

When Can We Approximate?

Rule of thumb: Normal approximation is good when:

\[np \geq 10 \quad \text{AND} \quad n(1-p) \geq 10\]

Why these conditions?

  • Ensures enough “successes” and “failures” expected
  • Distribution is sufficiently symmetric
  • Not too skewed to one side

If conditions met: Use \(N(\mu, \sigma)\) where:

\[\mu = np \qquad \sigma = \sqrt{np(1-p)}\]

Example: Should We Approximate?

Check if normal approximation is appropriate:

  1. n = 30, p = 0.5
    • \(np = 30(0.5) = 15 \geq 10\)
    • \(n(1-p) = 30(0.5) = 15 \geq 10\)
    • YES, use approximation
  2. n = 100, p = 0.05
    • \(np = 100(0.05) = 5 < 10\)
    • NO, don’t use approximation
  3. n = 500, p = 0.03
    • \(np = 500(0.03) = 15 \geq 10\)
    • \(n(1-p) = 500(0.97) = 485 \geq 10\)
    • YES, use approximation

Continuity Correction

Problem: Binomial is discrete, Normal is continuous

Solution: Continuity correction

  • For P(X ≤ k), use P(X < k + 0.5)
  • For P(X ≥ k), use P(X > k - 0.5)
  • For P(X = k), use P(k - 0.5 < X < k + 0.5)

Why? The bar for X = k extends from k - 0.5 to k + 0.5

Example: Normal Approximation

Setup: Flip a fair coin 100 times. Find P(X ≥ 60) where X = # heads.

Check conditions:

  • \(np = 100(0.5) = 50 \geq 10\)
  • \(n(1-p) = 100(0.5) = 50 \geq 10\)

Using normal approximation:

\[\mu = np = 50, \quad \sigma = \sqrt{np(1-p)} = \sqrt{25} = 5\]

With continuity correction: P(X ≥ 60) ≈ P(Z > 59.5)

In Google Sheets: =1 - NORM.DIST(59.5, 50, 5, TRUE) = 0.0287

About 2.9% chance of 60+ heads

Comparison: Exact vs Approximate

For the coin example (n=100, p=0.5):

Exact (using Binomial):

=1 - BINOM.DIST(59, 100, 0.5, TRUE)

Result: 0.0284

Approximate (using Normal):

=1 - NORM.DIST(59.5, 50, 5, TRUE)

Result: 0.0287

Very close! Normal approximation works well here.

From Probability to Inference

The Big Picture

So far (Weeks 1-4):

  • Described data
  • Calculated probabilities
  • Modeled random variables

Descriptive statistics & Probability

Coming up (Weeks 5-10):

  • Make inferences about populations
  • Test hypotheses
  • Estimate parameters

Inferential statistics

Bridge: Sampling distributions!

Populations vs Samples

Population

  • All individuals of interest
  • Usually large or infinite
  • Parameters describe it
  • Often unknown

Examples:

  • μ (population mean)
  • σ (population SD)
  • p (population proportion)

Sample

  • Subset of population
  • Practical to collect
  • Statistics describe it
  • We calculate these

Examples:

  • \(\bar{x}\) (sample mean)
  • s (sample SD)
  • \(\hat{p}\) (sample proportion)

Parameters vs Statistics

Key distinction:

Parameter → Population → Unknown (usually)

Statistic → Sample → Known (calculated)

Goal of inference: Use statistics to estimate parameters!

Example: Parameters vs Statistics

Scenario: Study average sleep hours of UCSC students

  • Population: All UCSC students Winter 2026 (~20,000)
  • Parameter of interest: μ = true average sleep hours
  • Sample: 100 randomly selected students
  • Statistic: \(\bar{x}\) = 6.8 hours (from our sample)

Question: Is \(\bar{x} = 6.8\) exactly equal to μ?

Answer: Probably not! There’s sampling variability

Sampling Variability

Key insight: Different samples give different statistics!

This is natural and expected!

The Path Forward

What we’ve built:

  1. Probability foundations ✓
  2. Random variables & distributions ✓
  3. Sampling distributions ✓

What’s next (Week 5+):

  • Review for the midterm (Tuesday) come with questions!
  • Thursday: midterm
  • Week 6: Sampling Distributions and Central Limit Theorem

Exit Ticket

Before you leave:

  1. Post the solutions to these on Ed Discussion:
  1. Name one new thing you learned today

  2. Check approximation: n=50, p=0.15. Can we use normal approximation?

  1. Due Friday:
    • HW 3
  2. Complete during DS:
    • DSA 3

Questions?

Office hours: I’ll be in the classroom nextdoor after class if you need me. If you can’t stay, check our office hours on Ed Discussion and look for a spot that works with your schedule.

Practice is essential: Don’t forget to think deeply when solving your assignments. Don’t do it as a check list, but rather as an opportunity to think about the concepts that we have learned.

Have a nice weekend!

See you next week!

Sampling Distributions

What is a Sampling Distribution?

Definition: The probability distribution of a statistic across all possible samples of size n

  • Not the distribution of the data
  • Distribution of the statistic (e.g., \(\bar{x}\))
  • Shows how the statistic varies from sample to sample
  • Theoretical concept (we imagine all possible samples)

Building Intuition

Thought experiment:

  1. Take a random sample of n=30 students, calculate \(\bar{x}\)
  2. Take another sample of n=30, calculate \(\bar{x}\) again
  3. Repeat thousands of times
  4. Make a histogram of all the \(\bar{x}\) values

That histogram approximates the sampling distribution of \(\bar{x}\)

Sampling Distribution Simulation

Notice: It’s approximately normal and centered at μ!

Properties of Sampling Distribution of \(\bar{x}\)

For samples of size n from a population with mean μ and SD σ:

1. Center: \[E(\bar{x}) = \mu\]

The mean of the sampling distribution equals the population mean

2. Spread: \[SD(\bar{x}) = \frac{\sigma}{\sqrt{n}}\]

Called the standard error (SE)

3. Shape: Depends on population and sample size (more soon!)

Standard Error vs Standard Deviation

Important distinction!

Standard Deviation (σ)

  • Measures spread of individuals in population
  • Does NOT depend on sample size
  • Describes population variability

Standard Error (SE)

  • Measures spread of sample means
  • \(SE = \frac{\sigma}{\sqrt{n}}\)
  • Decreases as n increases
  • Describes sampling variability

Why Does SE Decrease with n?

\[SE = \frac{\sigma}{\sqrt{n}}\]

Intuition: Larger samples give more stable estimates

  • Sample of n=1: Very unreliable, high variability
  • Sample of n=100: Much more reliable, low variability
  • Sample of n=10,000: Very reliable, very low variability

The Central Limit Theorem (CLT)

One of the most important theorems in statistics!

Central Limit Theorem Statement

For samples of size n from ANY population with mean μ and SD σ:

As n increases, the sampling distribution of \(\bar{x}\) becomes approximately normal with:

  • Mean = μ
  • Standard deviation = \(\frac{\sigma}{\sqrt{n}}\)

In notation:

\[\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately, for large } n\]

What Makes CLT Amazing?

  1. Works for ANY population distribution (skewed, bimodal, anything!)
  2. Only needs n to be “large enough” (usually n ≥ 30)
  3. Explains why normal distribution is so common
  4. Foundation for most inferential statistics

CLT Visualization

Original population: Right-skewed (NOT normal)

Notice: As n increases, sampling distribution becomes normal!

When Does CLT Apply?

General rule: n ≥ 30 is usually sufficient

BUT this depends on population shape:

  • Population is normal: CLT applies for ANY n (even n=1!)
  • Population is symmetric: n ≥ 15 usually fine
  • Population is moderately skewed: n ≥ 30 needed
  • Population is heavily skewed: May need n ≥ 50 or more

Key point: More skewed population → need larger n

Example: Using the CLT

Problem: Average adult height μ = 170 cm, σ = 10 cm.

For a random sample of n = 36 adults, find P(\(\bar{x}\) > 172).

Solution:

By CLT, \(\bar{x} \sim N(170, \frac{10}{\sqrt{36}}) = N(170, 1.67)\)

In Google Sheets:

=1 - NORM.DIST(172, 170, 1.67, TRUE)

Result: 0.115

About 11.5% chance the sample mean exceeds 172 cm

CLT Application: Quality Control

Scenario: Light bulb manufacturer

  • Population: μ = 1000 hours, σ = 100 hours
  • Take sample of n = 25 bulbs
  • Want P(990 < \(\bar{x}\) < 1010)

Solution:

\(\bar{x} \sim N(1000, \frac{100}{\sqrt{25}}) = N(1000, 20)\)

=NORM.DIST(1010, 1000, 20, TRUE) - NORM.DIST(990, 1000, 20, TRUE)

Result: 0.383

38.3% of samples will have mean between 990-1010 hours

Summary: Key Distributions

Distribution Type When to Use Parameters
Binomial Discrete Count successes in n trials n, p
Normal Continuous Symmetric, bell-shaped data μ, σ
Sampling Dist of \(\bar{x}\) Theoretical Distribution of sample means μ, σ/√n

Remember: CLT tells us the third one is approximately Normal!

Common Misconceptions

  1. “CLT makes the population distribution normal”
    • NO! It makes the sampling distribution of \(\bar{x}\) normal
  2. “Larger samples reduce bias”
    • NO! They reduce variability (SE decreases)
    • Bias is a systematic error, not fixed by sample size
  3. “Standard error and standard deviation are the same”
    • NO! SD measures individual variability, SE measures sampling variability
  4. “Normal approximation always works for binomial”
    • NO! Only when np ≥ 10 AND n(1-p) ≥ 10

Summary:

  • Normal approximation to binomial (when np ≥ 10 and n(1-p) ≥ 10)
  • Parameters (population) vs Statistics (sample)
  • Sampling distributions show how statistics vary
  • Standard Error = SD of sampling distribution = σ/√n
  • Central Limit Theorem: Sampling distribution of \(\bar{x}\) is approximately N(μ, σ/√n) for large enough n

This is the foundation for all of inferential statistics!