28 Apr 2026
Foundations of Statistics & Data
Descriptive Statistics & Visualization
Probability Foundations
Marcus is a UCSC Economics major with entrepreneurial dreams…
His mission: Launch a student-run coffee cart at the base of campus during morning commutes
Key business questions:
The Challenge: Marcus needs to quantify uncertainty about customer counts and revenue to make smart business decisions! 🤔
Today’s goal: Use random variables and discrete distributions to model his business!
By the end of this lecture, you will be able to:
Key Insight: Random variables bridge the gap between random events and numerical analysis
NOT a random variable: “The next customer orders coffee” (this is an event)
IS a random variable: “The number of customers in the next hour” (assigns a number)
Marcus’s Example: Let X = number of customers during the 8-9am hour
X could be 0, 1, 2, 3, … (any non-negative integer)
Standard Notation:
Example: “Let X = number of customers”
Example: “x = 5 customers”
Example: “P(X = 5) = 0.15” means 15% chance of exactly 5 customers
Marcus’s Coffee Cart Example:
Discrete Random Variables
Marcus’s Examples:
Continuous Random Variables
Marcus’s Examples:
Today’s Focus: DISCRETE Random Variables (Binomial and Poisson distributions)
Marcus’s Customer Distribution (8-9am hour):
| Number of Customers (x) | Probability P(X = x) |
|---|---|
| 0 | 0.05 |
| 1 | 0.15 |
| 2 | 0.30 |
| 3 | 0.25 |
| 4 | 0.15 |
| 5 | 0.10 |
| Total | 1.00 |
Two Requirements for Valid Probability Distributions:
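The standard pair of requirements is that every probability lies between 0 and 1 and that the probabilities sum to 1. As a minimal sketch, the check below applies both to Marcus's 8-9am table. The helper name `is_valid_distribution` and the tolerance are illustrative assumptions, not part of the course materials.

```python
# Marcus's 8-9am customer distribution, copied from the table above.
pmf = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.25, 4: 0.15, 5: 0.10}


def is_valid_distribution(pmf, tol=1e-9):
    """Hypothetical helper: True if each P is in [0, 1] and the total is 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    # Compare with a small tolerance because floats rarely sum to exactly 1.
    sums_to_one = abs(sum(pmf.values()) - 1.0) <= tol
    return in_range and sums_to_one


valid = is_valid_distribution(pmf)
print(valid)
```

A table whose probabilities add up to 1.10, for instance, would fail the second requirement.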
Formula:
\[E(X) = \mu = \sum [x \cdot P(X = x)]\]
“Sum of (each value × its probability)”
Calculating Marcus’s Expected Customers:
\[E(X) = 0(0.05) + 1(0.15) + 2(0.30) + 3(0.25) + 4(0.15) + 5(0.10)\]
\[E(X) = 0 + 0.15 + 0.60 + 0.75 + 0.60 + 0.50\]
\[E(X) = 2.6 \text{ customers}\]
Business Interpretation: On average, Marcus can expect 2.6 customers per hour during 8-9am. Over many days, the average will approach 2.6.
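The hand calculation above can be reproduced in a few lines. This is only a sketch in Python (the course itself uses Google Sheets), and the variable names are illustrative.

```python
# Marcus's 8-9am customer distribution: value x -> P(X = x).
pmf = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.25, 4: 0.15, 5: 0.10}

# E(X) = sum of (each value x its probability).
expected_value = sum(x * p for x, p in pmf.items())

print(round(expected_value, 2))  # 2.6
```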
\[\sigma^2 = \sum[(x - \mu)^2 \cdot P(X = x)]\]
\[\sigma = \sqrt{\sigma^2}\]
Marcus’s Customer Variability (with μ = 2.6):
\[\sigma^2 = (0-2.6)^2(0.05) + (1-2.6)^2(0.15) + (2-2.6)^2(0.30) + ...\]
\[\sigma^2 = 0.338 + 0.384 + 0.108 + 0.040 + 0.294 + 0.576 = 1.74\]
\[\sigma = \sqrt{1.74} \approx 1.32 \text{ customers}\]
Interpretation: The number of customers typically varies by about 1.32 from the average of 2.6. Most hours will see between 1 and 4 customers.
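To see every term of the variance sum, the sketch below evaluates it for Marcus's 8-9am distribution. It is an illustrative Python version of the formula, not the course's Google Sheets workflow. The sum comes out to 1.74, so σ is about 1.32.

```python
import math

# Marcus's 8-9am customer distribution: value x -> P(X = x).
pmf = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.25, 4: 0.15, 5: 0.10}

mu = sum(x * p for x, p in pmf.items())

# Variance = sum of (squared distance from the mean x probability).
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))  # 1.74 1.32
```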
Practice with Expected Value & Variance:
Marcus is considering a lunch hour (12-1pm) operation. Here’s his probability distribution:
| Customers (x) | P(X = x) |
|---|---|
| 3 | 0.10 |
| 4 | 0.20 |
| 5 | 0.35 |
| 6 | 0.25 |
| 7 | 0.10 |
Calculate:
Discuss with a partner, then post your calculations on Poll Everywhere!
When Do We Use Binomial Distribution?
The binomial distribution models situations where we have a fixed number of independent trials, each with only two possible outcomes (success/failure).
Four Conditions for Binomial Distribution (BINS):
Scenario: Marcus offers free samples to 10 random students. Based on past data, 30% of students who try a sample make a purchase.
Let X = number of students (out of 10) who make a purchase
Check BINS conditions:
Therefore: X ~ Binomial(10, 0.30)
Standard Notation:
\[X \sim \text{Binomial}(n, p)\]
“X follows a binomial distribution with parameters n and p”
Parameters:
Marcus’s Example: X ~ Binomial(10, 0.30)
The Formula:
\[P(X = x) = \binom{n}{x} \cdot p^x \cdot (1-p)^{n-x}\]
Where:
Example: X ~ Binomial(10, 0.30), find P(X = 3)
\[P(X = 3) = \binom{10}{3} \cdot (0.30)^3 \cdot (0.70)^7\]
\[P(X = 3) = 120 \cdot 0.027 \cdot 0.0824 \approx 0.267\]
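The same calculation as a short Python sketch, assuming Python 3.8 or later for `math.comb`. The function name `binomial_pmf` is an illustrative choice.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p), straight from the formula."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


# Marcus's free-sample example: 10 students, 30% purchase probability.
p_three = binomial_pmf(3, 10, 0.30)
print(round(p_three, 4))  # 0.2668
```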
Good news! We don’t usually calculate by hand - we use Google Sheets! 💻
BINOM.DIST Function:
=BINOM.DIST(x, n, p, cumulative)
Examples for X ~ Binomial(10, 0.30):
| Probability | Formula | Result |
|---|---|---|
| P(X = 3) | =BINOM.DIST(3, 10, 0.3, FALSE) | 0.2668 |
| P(X ≤ 3) | =BINOM.DIST(3, 10, 0.3, TRUE) | 0.6496 |
| P(X ≥ 4) | =1 - BINOM.DIST(3, 10, 0.3, TRUE) | 0.3504 |
| P(2 ≤ X ≤ 5) | =BINOM.DIST(5,10,0.3,TRUE) - BINOM.DIST(1,10,0.3,TRUE) | 0.8033 |
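For anyone who wants to cross-check the spreadsheet results, here is a rough Python equivalent of the cumulative calculations. `binomial_cdf` is a hypothetical helper that simply adds up the formula term by term. It is not how Google Sheets implements BINOM.DIST.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): add the exact probabilities for 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.30
p_at_most_3 = binomial_cdf(3, n, p)                           # P(X <= 3)
p_at_least_4 = 1 - binomial_cdf(3, n, p)                      # P(X >= 4)
p_2_to_5 = binomial_cdf(5, n, p) - binomial_cdf(1, n, p)      # P(2 <= X <= 5)

print(round(p_at_most_3, 4), round(p_at_least_4, 4), round(p_2_to_5, 4))
```

Note the pattern for "between" probabilities: subtract the cumulative value one below the lower bound (here 1, not 2), so that X = 2 stays included.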
Special Formulas for Binomial!
When X ~ Binomial(n, p):
Mean (Expected Value):
\[\mu = E(X) = n \cdot p\]
Variance:
\[\sigma^2 = n \cdot p \cdot (1-p)\]
Standard Deviation:
\[\sigma = \sqrt{n \cdot p \cdot (1-p)}\]
Marcus’s Example: X ~ Binomial(10, 0.30)
Business Insight: Expect 3 purchases ± 1.45, so typically 2-4 purchases per 10 samples
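The shortcut formulas agree with the general definitions of mean and variance. The sketch below checks this for X ~ Binomial(10, 0.30) by summing over all 11 possible values. Variable names are illustrative.

```python
import math

n, p = 10, 0.30

# Full distribution: P(X = x) for x = 0, 1, ..., n.
pmf = {x: math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

# General definitions for any discrete random variable.
mean_general = sum(x * prob for x, prob in pmf.items())
var_general = sum((x - mean_general) ** 2 * prob for x, prob in pmf.items())

# Binomial shortcuts.
mean_shortcut = n * p
var_shortcut = n * p * (1 - p)
sd_shortcut = math.sqrt(var_shortcut)

print(round(mean_general, 4), round(var_general, 4), round(sd_shortcut, 2))
```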
Binomial Distribution Practice:
Marcus runs a promotion: “Buy a coffee, spin the wheel for a free pastry!”
Each customer has a 25% chance of winning. During one morning, 12 customers buy coffee and spin.
Answer these questions:
Post your work on Ed Discussion with your group members’ names!
When Do We Use Poisson Distribution?
The Poisson distribution models the number of events occurring in a fixed interval of time or space, when events occur at a constant average rate and independently.
Key Characteristics:
Marcus’s Question: “How many customers will arrive per hour?”
From past data: average of λ = 4.5 customers per hour
Let X = number of customer arrivals in one hour → Poisson situation! ✅
BINOMIAL
POISSON
Key Insight:
Where Do We See Poisson Distributions?
NOT Poisson Examples:
Notation:
\[X \sim \text{Poisson}(\lambda)\]
λ (lambda) = average rate of events per interval
The Poisson Formula:
\[P(X = x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}\]
Where:
Example: X ~ Poisson(λ = 4.5), find P(X = 3)
\[P(X = 3) = \frac{e^{-4.5} \cdot 4.5^3}{3!} = \frac{0.0111 \times 91.125}{6} \approx 0.169\]
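A short Python sketch of the same formula, using only the standard library. The function name `poisson_pmf` is an illustrative choice.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), straight from the formula."""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Marcus's arrivals: an average of 4.5 customers per hour.
p_three = poisson_pmf(3, 4.5)
print(round(p_three, 4))  # 0.1687
```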
POISSON.DIST Function:
=POISSON.DIST(x, λ, cumulative)
Examples for X ~ Poisson(4.5):
| Probability | Formula | Result |
|---|---|---|
| P(X = 3) | =POISSON.DIST(3, 4.5, FALSE) | 0.1687 |
| P(X ≤ 5) | =POISSON.DIST(5, 4.5, TRUE) | 0.7029 |
| P(X ≥ 6) | =1 - POISSON.DIST(5, 4.5, TRUE) | 0.2971 |
| P(3 ≤ X ≤ 7) | =POISSON.DIST(7,4.5,TRUE) - POISSON.DIST(2,4.5,TRUE) | 0.7398 |
Remember: FALSE for exact (=), TRUE for cumulative (≤)
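The cumulative Poisson probabilities can also be cross-checked outside the spreadsheet. In this sketch, `poisson_cdf` is a hypothetical helper that adds up exact probabilities. It is not a description of how POISSON.DIST works internally.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x): add the exact probabilities for 0, 1, ..., x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam = 4.5
p_at_most_5 = poisson_cdf(5, lam)                        # P(X <= 5)
p_at_least_6 = 1 - poisson_cdf(5, lam)                   # P(X >= 6)
p_3_to_7 = poisson_cdf(7, lam) - poisson_cdf(2, lam)     # P(3 <= X <= 7)

print(round(p_at_most_5, 4), round(p_at_least_6, 4), round(p_3_to_7, 4))
```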
Unique Property of Poisson!
When X ~ Poisson(λ):
Mean (Expected Value):
\[\mu = E(X) = \lambda\]
Variance:
\[\sigma^2 = \lambda\]
Standard Deviation:
\[\sigma = \sqrt{\lambda}\]
Amazing fact: For Poisson, the mean EQUALS the variance! This is a key characteristic.
Marcus’s Example: X ~ Poisson(4.5 customers per hour)
Business Insight: Expect 4.5 customers ± 2.12, so typically 2-7 customers per hour
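The "mean equals variance" property can be checked numerically. Since a Poisson variable has no upper limit, the sketch below truncates the sum at 60 terms. That cutoff is an assumption, and it is more than enough for λ = 4.5 because the remaining probabilities are vanishingly small.

```python
import math

lam = 4.5
max_x = 60  # assumed cutoff; the tail beyond this is negligible for lam = 4.5

# Approximate distribution: P(X = x) for x = 0, 1, ..., max_x - 1.
pmf = {x: math.exp(-lam) * lam**x / math.factorial(x) for x in range(max_x)}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 2))  # 4.5 4.5 2.12
```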
Important Property: If events occur at rate λ per unit time, then:
Marcus’s Example:
Practice: What’s P(at least 10 customers in 2 hours)?
X ~ Poisson(9)
P(X ≥ 10) = 1 - P(X ≤ 9), computed in Google Sheets as =1 - POISSON.DIST(9, 9, TRUE) ≈ 0.413
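The rate-scaling step can be written out as a small sketch: multiply the hourly rate by the number of hours, then use the complement rule. Variable names here are illustrative.

```python
import math


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam), summed term by term."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(x + 1))


rate_per_hour = 4.5
hours = 2

# Scaling the interval scales the rate: 4.5 per hour -> 9 per 2 hours.
lam_two_hours = rate_per_hour * hours

# P(X >= 10) = 1 - P(X <= 9)
p_at_least_10 = 1 - poisson_cdf(9, lam_two_hours)
print(round(p_at_least_10, 3))  # 0.413
```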
Poisson Distribution Practice:
Marcus notices that on average, 3.2 customers per hour order specialty drinks (requiring extra prep time).
Let Y = number of specialty drink orders per hour
Answer these questions:
Post your work on Ed Discussion with your group members’ names!
Decision Framework:
Use BINOMIAL when:
Use POISSON when:
Quick Test: Can you list all possible outcomes in advance?
For each scenario, identify Binomial or Poisson:
How Marcus Uses These Distributions:
Binomial Applications:
Poisson Applications:
Combined Strategy:
Binomial: X ~ Binomial(n, p)
=BINOM.DIST(x, n, p, FALSE) # P(X = x)
=BINOM.DIST(x, n, p, TRUE) # P(X ≤ x)
=n*p # Mean
=n*p*(1-p) # Variance
=SQRT(n*p*(1-p)) # Standard deviation
Poisson: X ~ Poisson(λ)
=POISSON.DIST(x, lambda, FALSE) # P(X = x)
=POISSON.DIST(x, lambda, TRUE) # P(X ≤ x)
=lambda # Mean
=lambda # Variance
=SQRT(lambda) # Standard deviation
For P(X ≥ a): Use =1 - DIST(a-1, ..., TRUE) for both!
How random variables transformed his business:
✅ Inventory Management: Used Poisson to predict hourly demand → reduced waste by 40%
✅ Sampling Strategy: Used Binomial to calculate optimal sample size → 30% conversion rate
✅ Staffing Decisions: Expected value calculations → hired exactly the right number of helpers
✅ Pricing Strategy: Variance calculations → set prices to cover high-demand periods
✅ Impact: Increased profitability by 45% in first semester! Now expanding to lunch service! ☕💰
Industries Using These Distributions:
Key Insight: Understanding random variables and probability distributions is fundamental to data-driven decision making in ANY field! 📊
Rate your confidence (1-5) on Ed Discussion:
If you rated anything 3 or below, please visit office hours or post questions on Ed! 🤗
Try these before next class:
Problem 1 (Binomial): A multiple-choice exam has 20 questions, each with 4 options. If a student guesses randomly on all questions:
Problem 2 (Poisson): A website receives an average of 15 visits per hour.
Problem 3 (Choosing): For each scenario, identify which distribution to use and why:
Mastering discrete distributions:
Common pitfall to avoid: Reading a problem and immediately calculating without first identifying the distribution and checking conditions!
Beyond the classroom:
Understanding random variables and discrete distributions helps you:
Bottom line: These tools transform vague uncertainty into quantifiable, actionable information! 🚀
Remember:
Marcus started with uncertainty about his coffee cart business…
Now he makes data-driven decisions using probability distributions!
You can too! 📊☕✨
See you next class for continuous distributions! 🎲
Questions? Office hours information on Canvas.
Next up: The Normal Distribution & Continuous Random Variables!
Don’t forget: Post Think-Pair-Share responses on Ed Discussion!
| Feature | Binomial | Poisson |
|---|---|---|
| Type | Discrete | Discrete |
| Question | “How many out of n?” | “How many in interval?” |
| Parameters | n (trials), p (probability) | λ (rate) |
| Notation | X ~ Binomial(n, p) | X ~ Poisson(λ) |
| Mean | μ = np | μ = λ |
| Variance | σ² = np(1-p) | σ² = λ |
| Conditions | BINS | Fixed interval, constant rate, independence |
| Range | x = 0, 1, 2, …, n | x = 0, 1, 2, … (infinite) |
| Google Sheets | BINOM.DIST | POISSON.DIST |
| Example | Coin flips, surveys | Customer arrivals, defects |
Binomial Distribution Functions:
=BINOM.DIST(x, n, p, FALSE) # P(X = x)
=BINOM.DIST(x, n, p, TRUE) # P(X ≤ x)
=1 - BINOM.DIST(x-1, n, p, TRUE) # P(X ≥ x)
=n*p # Mean
=n*p*(1-p) # Variance
=SQRT(n*p*(1-p)) # Standard deviation
Poisson Distribution Functions:
=POISSON.DIST(x, lambda, FALSE) # P(X = x)
=POISSON.DIST(x, lambda, TRUE) # P(X ≤ x)
=1 - POISSON.DIST(x-1, lambda, TRUE) # P(X ≥ x)
=lambda # Mean
=lambda # Variance
=SQRT(lambda) # Standard deviation
Other Useful Functions:
=FACT(n) # n factorial
=COMBIN(n, x) # Combinations: n choose x
=EXP(x) # e^x
General Discrete Random Variable:
\(E(X) = \sum [x \cdot P(X = x)]\)
\(\sigma^2 = \sum[(x - \mu)^2 \cdot P(X = x)]\)
Binomial: X ~ Binomial(n, p):
\(P(X = x) = \binom{n}{x} \cdot p^x \cdot (1-p)^{n-x}\)
\(\mu = np, \quad \sigma^2 = np(1-p)\)
Poisson: X ~ Poisson(λ):
\(P(X = x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}\)
\(\mu = \lambda, \quad \sigma^2 = \lambda\)
STAT 17 – Spring 2026