Statistics - UCSC
23 Oct 2025
Meet Maria, a UCSC student working part-time while pursuing her degree.
Her challenge: She commutes from San Jose and wants to understand:
The pattern: All these questions involve measurements that follow a bell curve - the Normal Distribution!
Understanding the normal distribution helps Maria (and you!) make informed decisions about time management, goal-setting, and career planning.
What we learned last time:
Key Difference from Discrete:
X ~ Uniform(a, b): All values equally likely
PDF: f(x) = 1/(b-a) for a ≤ x ≤ b
Properties:
Example: Bus arrival time X ~ Uniform(0, 20) minutes - Flat probability across the interval - Probabilities are simple rectangular areas
By the end of this lecture, you will be able to:
The most important distribution in statistics!
Where do we see it?
Why so common? Central Limit Theorem (coming soon!)
Notation: X ~ Normal(μ, σ²) or X ~ N(μ, σ²)
Key Parameters:
Properties:
For ANY normal distribution:
Example: Commute time X ~ N(45, 100) minutes
μ = 45 minutes, σ = 10 minutes
68% of commutes: 35 to 55 minutes
95% of commutes: 25 to 65 minutes
99.7% of commutes: 15 to 75 minutes
This helps Maria plan - only about 2.5% of commutes run longer than μ + 2σ = 65 minutes, so she should budget 65 minutes to be about 97.5% confident of arriving on time!
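The rule can be checked numerically. The sketch below is not part of the lecture's Google Sheets workflow; it assumes Python 3.8+ and its standard-library `statistics.NormalDist`, and the helper name `within_k_sigma` is made up for illustration.

```python
from statistics import NormalDist

# Maria's commute: mean 45 minutes, standard deviation 10 minutes.
# NormalDist takes the standard deviation (sigma), not the variance.
commute = NormalDist(mu=45, sigma=10)

def within_k_sigma(dist, k):
    """Probability that a value lands within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    print(f"within {k} sigma: {within_k_sigma(commute, k):.4f}")

# Chance that a commute takes 65 minutes (mean + 2 sigma) or less.
p_under_65 = commute.cdf(65)
print(f"P(X <= 65) = {p_under_65:.4f}")
```

The exact areas are close to, but not exactly, 68%, 95%, and 99.7%, so the 97.5% figure for 65 minutes is a rounded value (the computed probability is slightly higher).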
The normal PDF is complex:
\[f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\]
Problems:
Solution: Use technology (Google Sheets) or standardize with z-scores!
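To see what "use technology" replaces, here is a rough sketch that evaluates the PDF formula directly and approximates the area under it with a midpoint Riemann sum. It assumes Python rather than Google Sheets; `normal_pdf` and `area_left_of` are hypothetical helper names, and starting the sum 10 standard deviations below the mean is an arbitrary cutoff chosen because the density is negligible there.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density, written out term by term from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_left_of(x, mu, sigma, steps=100_000):
    """Approximate P(X <= x) with a midpoint Riemann sum.

    The sum starts 10 standard deviations below the mean, where the
    density is negligibly small.
    """
    start = mu - 10 * sigma
    width = (x - start) / steps
    total = 0.0
    for i in range(steps):
        midpoint = start + (i + 0.5) * width
        total += normal_pdf(midpoint, mu, sigma)
    return total * width

approx = area_left_of(50, 45, 10)
exact = NormalDist(45, 10).cdf(50)
print(f"Riemann sum: {approx:.4f}, NormalDist.cdf: {exact:.4f}")
```

The two numbers agree to many decimal places, but the sum needs 100,000 function evaluations to get there, which is why a built-in cumulative function is the practical choice.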
Definition: Z ~ N(0, 1)
Why standardize?
Instead of infinite tables for every μ and σ, we use ONE standard table and transform any normal variable to match it.
The transformation: z = (x - μ)/σ
This is called a z-score!
Z-score formula: z = (x - μ)/σ
Interpretation: Number of standard deviations from the mean
Properties:
Example: Maria’s commute X ~ N(45, 100)
Today’s commute: 60 minutes
z = (60 - 45)/10 = 1.5
Interpretation: 1.5 standard deviations above average (slower than usual!)
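The same calculation as a small Python sketch (Python is an assumption here, and `z_score` and `from_z` are illustrative names, not functions from the lecture):

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Invert the transformation: recover x from a z-score."""
    return mu + z * sigma

# Maria's commute: X ~ N(45, 100), so sigma = sqrt(100) = 10.
z_today = z_score(60, mu=45, sigma=10)
print(z_today)  # 1.5

# Going back from the z-score recovers the original commute time.
print(from_z(z_today, mu=45, sigma=10))  # 60.0
```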
Heights of UCSC students: X ~ N(66, 9) inches, so σ = 3
Calculate and interpret z-scores:
Understanding Z-Scores
GPA at UCSC: X ~ N(3.2, 0.16), so σ = 0.4
Calculate z-scores and interpret:
Bonus: What GPA corresponds to z = -1?
Answer question 4: Who is further from the mean?
Two essential functions:
=NORM.DIST(x, mean, std_dev, TRUE)
Returns P(X ≤ x) : cumulative probability
=NORM.INV(probability, mean, std_dev)
Returns x-value for given cumulative probability
For Standard Normal (Z):
=NORM.S.DIST(z, TRUE)
=NORM.S.INV(probability)
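For readers working outside a spreadsheet, the four functions can be mirrored in Python. This is only a sketch under assumptions: it relies on the standard-library `statistics.NormalDist` (Python 3.8+), and the wrapper names below are invented to echo the Sheets names; they are not an existing library API.

```python
from statistics import NormalDist

def norm_dist(x, mean, std_dev):
    """P(X <= x), analogous to =NORM.DIST(x, mean, std_dev, TRUE)."""
    return NormalDist(mean, std_dev).cdf(x)

def norm_inv(probability, mean, std_dev):
    """x-value with the given cumulative probability, analogous to =NORM.INV."""
    return NormalDist(mean, std_dev).inv_cdf(probability)

def norm_s_dist(z):
    """P(Z <= z) for the standard normal, analogous to =NORM.S.DIST(z, TRUE)."""
    return NormalDist().cdf(z)

def norm_s_inv(probability):
    """z-value with the given cumulative probability, analogous to =NORM.S.INV."""
    return NormalDist().inv_cdf(probability)

# Half of the standard normal lies below z = 0.
print(norm_s_dist(0))  # 0.5

# norm_inv undoes norm_dist: the round trip returns (approximately) 50.
p = norm_dist(50, 45, 10)
print(round(norm_inv(p, 45, 10), 6))
```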
Maria’s commute: X ~ N(45, 100) minutes
Question: What’s the probability her commute is 50 minutes or less?
Solution in Google Sheets:
=NORM.DIST(50, 45, 10, TRUE)
Answer: ≈ 0.6915 or 69.15%
Interpretation: About 69% of the time, Maria’s commute takes 50 minutes or less.
Exam-style question: Given this information, what is the probability her commute is 50 minutes or more?
Question: Probability Maria’s commute exceeds 55 minutes?
Remember: P(X > x) = 1 - P(X ≤ x)
Solution in Google Sheets:
=1 - NORM.DIST(55, 45, 10, TRUE)
Answer: ≈ 0.1587 or 15.87%
Interpretation: About 16% of days, she should expect a commute longer than 55 minutes.
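A Python cross-check of the complement rule, assuming the standard-library `statistics.NormalDist` in place of Google Sheets. The symmetry comparison at the end is an extra sanity check added here, not a step from the lecture.

```python
from statistics import NormalDist

commute = NormalDist(mu=45, sigma=10)

# Upper tail via the complement: P(X > 55) = 1 - P(X <= 55).
p_over_55 = 1 - commute.cdf(55)
print(f"P(X > 55) = {p_over_55:.4f}")  # 0.1587

# The bell curve is symmetric about the mean, so the area above
# mean + 10 should match the area below mean - 10.
p_under_35 = commute.cdf(35)
print(f"P(X <= 35) = {p_under_35:.4f}")  # 0.1587
```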
Exam-style question: Given this information, what is the probability that Maria’s commute equals or exceeds 55 minutes?
Question: Probability Maria’s commute is between 40 and 50 minutes?
Formula: P(a ≤ X ≤ b) = P(X ≤ b) - P(X ≤ a)
Solution in Google Sheets:
=NORM.DIST(50, 45, 10, TRUE) - NORM.DIST(40, 45, 10, TRUE)
Step by step:
P(X ≤ 50) ≈ 0.6915
P(X ≤ 40) ≈ 0.3085
P(40 ≤ X ≤ 50) ≈ 0.3830
Interpretation: About 38% of her commutes fall in this range.
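The subtraction of two cumulative probabilities can be wrapped in a small helper. A minimal sketch, assuming Python's `statistics.NormalDist`; `prob_between` is a hypothetical name.

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """P(a <= X <= b): area left of b minus area left of a."""
    return dist.cdf(b) - dist.cdf(a)

commute = NormalDist(mu=45, sigma=10)
p_40_to_50 = prob_between(commute, 40, 50)
print(f"P(40 <= X <= 50) = {p_40_to_50:.4f}")
```

The result is about 0.383; the last digit can differ slightly from a by-hand subtraction because the two cumulative probabilities are not rounded first.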
Exam-style question: Given this information, what is the probability that Maria’s commute is more than 50 minutes?
Question: Maria wants to leave early enough so she arrives on time 90% of days. How much time should she budget?
We need: x such that P(X ≤ x) = 0.90
Solution in Google Sheets:
=NORM.INV(0.90, 45, 10)
Answer: ≈ 57.8 minutes
Interpretation: Budget 58 minutes to be 90% confident of arriving on time!
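The inverse lookup, sketched in Python under the assumption that `statistics.NormalDist.inv_cdf` stands in for `NORM.INV`. Rounding up to whole minutes is a choice made here so the budget never falls short of the target.

```python
import math
from statistics import NormalDist

commute = NormalDist(mu=45, sigma=10)

# Find x such that P(X <= x) = 0.90.
budget = commute.inv_cdf(0.90)
print(f"Exact cutoff: {budget:.1f} minutes")  # 57.8

# Round up to whole minutes, then confirm the on-time probability
# is still at least 90%.
budget_minutes = math.ceil(budget)
on_time = commute.cdf(budget_minutes)
print(f"Budget {budget_minutes} minutes -> on time {on_time:.1%} of days")
```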
For X ~ N(μ, σ²):
| Type | Formula | Google Sheets |
|---|---|---|
| P(X ≤ x) | Direct | =NORM.DIST(x, μ, σ, TRUE) |
| P(X < x) | Same as ≤ | =NORM.DIST(x, μ, σ, TRUE) |
| P(X ≥ x) | 1 - P(X ≤ x) | =1-NORM.DIST(x, μ, σ, TRUE) |
| P(X > x) | Same as ≥ | =1-NORM.DIST(x, μ, σ, TRUE) |
| P(a ≤ X ≤ b) | P(X≤b)-P(X≤a) | =NORM.DIST(b,μ,σ,TRUE)-NORM.DIST(a,μ,σ,TRUE) |
Key: For continuous distributions, < and ≤ give the same result!
Time to recharge!
Successful UCSC students study X ~ N(15, 16) hours per week
Scenario: Maria wants to understand study patterns.
Calculate:
P(X ≤ 12) - students studying 12 hours or less
=NORM.DIST(12, 15, 4, TRUE) ≈ 0.2266
About 23% of students
P(12 ≤ X ≤ 18) - students in the “typical” range
=NORM.DIST(18, 15, 4, TRUE) - NORM.DIST(12, 15, 4, TRUE) ≈ 0.5467
About 55% fall in this range
Normal Distribution Applications
Starting salary for UCSC graduates in Maria’s field: X ~ N(65000, 100000000) dollars (σ = $10,000)
Calculate using Google Sheets:
Bonus: If Maria wants to be in the top 10% of earners, what salary does she need?
What salary represents the 75th percentile?
When to use Z ~ N(0, 1):
If you’ve already calculated z-scores, use standard normal functions!
Example: z = 1.5
=NORM.S.DIST(1.5, TRUE) ≈ 0.9332
Interpretation: 93.32% of data falls below z = 1.5
Inverse:
=NORM.S.INV(0.95) ≈ 1.645
Interpretation: z = 1.645 is the 95th percentile
Two approaches for the same problem:
Approach 1: Work directly with X
=NORM.DIST(x, μ, σ, TRUE)
Approach 2: Convert to Z first, then use standard normal
z = (x - μ)/σ
=NORM.S.DIST(z, TRUE)
Both give the same answer! Use whichever feels more comfortable.
Tip: Approach 1 is usually faster and less prone to arithmetic errors.
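A quick way to convince yourself that the two approaches agree is to run both. The sketch below assumes Python's `statistics.NormalDist` rather than Google Sheets and reuses the 60-minute commute from the z-score example.

```python
import math
from statistics import NormalDist

mu, sigma, x = 45, 10, 60

# Approach 1: work directly with X ~ N(45, 100).
direct = NormalDist(mu, sigma).cdf(x)

# Approach 2: standardize first, then use Z ~ N(0, 1).
z = (x - mu) / sigma
via_z = NormalDist().cdf(z)

print(f"direct: {direct:.4f}, via z: {via_z:.4f}")  # both 0.9332
print(math.isclose(direct, via_z))  # True
```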
Midterm scores: X ~ N(75, 64), so σ = 8
Questions for the class:
What percentage scored below 70?
=NORM.DIST(70, 75, 8, TRUE) ≈ 0.2660
About 27%
Professor A curves: top 15% get A’s. What’s the cutoff?
=NORM.INV(0.85, 75, 8) ≈ 83.29
Need 84+ for an A
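The curve cutoff takes one extra step: "top 15%" has to be converted to a cumulative probability of 0.85 before the inverse lookup. A sketch in Python, assuming `statistics.NormalDist` and assuming scores are whole numbers (which is why the cutoff is rounded up):

```python
import math
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=8)

# Share of the class scoring below 70.
pct_below_70 = scores.cdf(70)
print(f"Below 70: {pct_below_70:.4f}")

# Top 15% means 85% of scores fall at or below the cutoff.
top_share = 0.15
cutoff = scores.inv_cdf(1 - top_share)
print(f"Cutoff: {cutoff:.2f}")

# With whole-number scores, round up to stay inside the top 15%.
min_score_for_a = math.ceil(cutoff)
print(f"Minimum whole score for an A: {min_score_for_a}")
```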
Percentile: The value below which a percentage of data falls
Common percentiles:
Formula in Google Sheets:
=NORM.INV(percentile/100, mean, std_dev)
Example: 80th percentile of Maria’s commute
=NORM.INV(0.80, 45, 10) ≈ 53.4 minutes
IQR = Q3 - Q1 (middle 50% of data)
For normal distributions:
Q1 = =NORM.INV(0.25, μ, σ)
Q3 = =NORM.INV(0.75, μ, σ)
IQR = Q3 - Q1
Example: Study hours X ~ N(15, 16)
Q1 = =NORM.INV(0.25, 15, 4) ≈ 12.3 hours
Q3 = =NORM.INV(0.75, 15, 4) ≈ 17.7 hours
IQR ≈ 5.4 hours
Interpretation: The middle 50% of students study between about 12.3 and 17.7 hours weekly.
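The quartiles can also be obtained in one call. This sketch assumes Python's `statistics.NormalDist`, whose `quantiles(n=4)` method returns the three cut points that split the distribution into four equal-probability parts.

```python
from statistics import NormalDist

# Study hours: X ~ N(15, 16), so sigma = 4.
study = NormalDist(mu=15, sigma=4)

# Three cut points: Q1, the median, and Q3.
q1, median, q3 = study.quantiles(n=4)
iqr = q3 - q1

print(f"Q1 = {q1:.1f}, median = {median:.1f}, Q3 = {q3:.1f}")
print(f"IQR = {iqr:.1f} hours")
```

For a normal distribution the median equals the mean, and Q1 and Q3 sit the same distance on either side of it.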
Comprehensive Normal Distribution Problem
Credit hours per quarter at UCSC: X ~ N(16, 4), so σ = 2
Answer these questions:
Use Google Sheets for all calculations!
Between what two values do the middle 80% of students fall?
Mistake 1: Confusing σ and σ² - The notation N(μ, σ²) gives the variance, but Google Sheets expects the standard deviation σ, so take the square root first!
Mistake 2: Forgetting to use TRUE for cumulative
=NORM.DIST(x, μ, σ, FALSE) ← PDF (rarely needed)
=NORM.DIST(x, μ, σ, TRUE) ← CDF (what we want!)
Mistake 3: P(X > x) without the complement
Wrong: =NORM.DIST(x, μ, σ, TRUE)
Right: =1-NORM.DIST(x, μ, σ, TRUE)
Mistake 4: Using the wrong function for inverse problems - Finding values: use NORM.INV, not NORM.DIST
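Mistake 1 is easy to make silently, because the formula still returns a plausible-looking probability. The sketch below shows the size of the error using Maria's commute, assuming Python's `statistics.NormalDist` (which, like Google Sheets, expects the standard deviation):

```python
import math
from statistics import NormalDist

# X ~ N(45, 100): the second parameter in the notation is the variance.
mu, variance = 45, 100
sigma = math.sqrt(variance)  # 10.0

# Right: pass the standard deviation.
right = NormalDist(mu, sigma).cdf(50)

# Wrong: pass the variance where the standard deviation belongs.
wrong = NormalDist(mu, variance).cdf(50)

print(f"right: {right:.4f}")  # 0.6915
print(f"wrong: {wrong:.4f}")  # 0.5199
```

Both values are valid probabilities, so nothing flags the mistake; checking that σ is the square root of the second parameter is the only safeguard.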
Key Concepts:
Why we use Google Sheets: The normal PDF has no closed-form antiderivative, so these probabilities cannot be calculated exactly by hand!
Applications: Test scores, commute times, salaries, measurements - the normal distribution is everywhere!
Maria now understands:
✅ Budget 58 minutes for her commute (90% confidence)
✅ Her 3.6 GPA is above average (z = 1, top 16%)
✅ Studying 15-20 hours/week puts her in the successful range
The normal distribution helps her:
Set realistic expectations
Plan her schedule effectively
Make data-informed decisions
This is the power of understanding statistics!
Rate your confidence (1-5) on Ed Discussion:
If you rated anything 3 or below, visit office hours!
Questions? I have office hours right after class today!
Next up: More continuous distributions and the Central Limit Theorem
Remember:
STAT 17 – Fall 2025