STAT 17: Statistical Methods for Business and Economics

28 Apr 2026

Exam at a Glance

📝 Section 1

15 Multiple Choice
30 points
~3 min per question

✍️ Section 2

2 Free Response
20 points
~20 min per question

⏱️ Time

90 Minutes
DRC: extra time applies
Permitted: pen, ID, calculator

Remember: A number alone is rarely a complete answer. Always interpret in context.

Concept Map

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#1E3A5F', 'primaryTextColor': '#fff', 'primaryBorderColor': '#0D9488', 'lineColor': '#94A3B8', 'secondaryColor': '#CCFBF1', 'tertiaryColor': '#EFF6FF'}}}%%
graph TD
  M(["🎓 STAT 17<br/>Midterm"]):::center
  A(["① Sampling &<br/>Study Design"]):::node
  B(["② Descriptive<br/>Statistics"]):::node
  C(["③ Probability<br/>Rules"]):::node
  D(["④ Discrete<br/>Random Variables"]):::node
  E(["⑤ Continuous<br/>Uniform Dist."]):::node
  M --- A
  M --- B
  M --- C
  M --- D
  M --- E
  classDef center fill:#1E3A5F,color:#fff,stroke:#0D9488,stroke-width:3px,rx:10
  classDef node fill:#0D9488,color:#fff,stroke:#065F52,rx:8

All five areas appear on both Multiple Choice and Free Response.

① Sampling & Study Design

Key vocabulary

Population → all units of interest

Sample → subset we actually observe

Parameter → number describing the population

Statistic → number computed from the sample

Sampling Methods

| Method | How it works | Watch out for |
|---|---|---|
| Simple Random | Every unit equally likely to be selected | Requires a complete list of the population |
| Stratified | Random sample from each subgroup | Groups must be meaningful |
| Cluster | Select whole groups randomly | Less precise than stratified |
| Systematic | Select every nth unit | Bias if the list has a periodic pattern |
| Convenience | Whoever is easiest to reach | Biased: not representative |

Warning

Convenience and voluntary-response sampling systematically exclude parts of the population; always flag this as a limitation.
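The four random methods can be sketched in a few lines of Python. This is only an illustration: the customer IDs, the two "stores", and the seed are hypothetical, and the function names are made up for this sketch.

```python
import random

def simple_random_sample(units, k, rng):
    """Every unit has the same chance of selection (needs the full list)."""
    return rng.sample(units, k)

def systematic_sample(units, step, rng):
    """Pick a random start, then take every `step`-th unit."""
    start = rng.randrange(step)
    return units[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately within each subgroup."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole groups and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(17)              # fixed seed so the sketch is repeatable
customers = list(range(1, 101))      # hypothetical customer IDs 1..100
stores = {"north": customers[:50], "south": customers[50:]}  # hypothetical groups

srs = simple_random_sample(customers, 10, rng)
sys_sample = systematic_sample(customers, 10, rng)
strat = stratified_sample(stores, 5, rng)
clus = cluster_sample(stores, 1, rng)
```

Note the contrast: stratified sampling takes some units from every group, while cluster sampling takes every unit from some groups.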

Observational Studies vs. Experiments

🔍 Observational Study

  • No manipulation of variables
  • Can show association only
  • ❌ Cannot establish causation
  • Watch for confounding variables

Example: Cities with more Starbucks have higher housing prices. Is coffee causing expensive housing?

🧪 Experiment

  • Researcher randomly assigns treatments
  • Includes a control group
  • Uses replication
  • ✅ Can establish cause-and-effect

Example: Randomly assign 200 employees to training A or B; compare output.

Variable Types

Numerical (Quantitative)

  • Discrete: countable, has gaps
    e.g., # of sales calls, # of defects
  • Continuous: measured, infinitely many possible values
    e.g., revenue, processing time

Categorical (Qualitative)

  • Nominal: no natural order
    e.g., industry sector, product type
  • Ordinal: ordered categories
    e.g., credit rating AAA > AA > A

Tip

Why it matters: Variable type determines which summary statistics and methods are appropriate.

Practice ✏️: Sampling & Design

Q1. A researcher surveys every 10th customer who enters a store on a Monday morning. What sampling method is this? What is one limitation?

👉 Systematic sampling. Limitation: Monday-morning customers may not represent other days or times (potential day-of-week bias).

Q2. A study finds cities with more Starbucks have higher housing prices. A journalist concludes coffee shops cause high prices. What is wrong?

👉 Observational study: correlation ≠ causation. Both are plausibly driven by a confounder such as urban population or wealth.

Q3. Classify: (a) monthly revenue ($), (b) credit rating (AAA/AA/A), (c) # customer complaints per day.

👉 (a) Numerical, continuous. (b) Categorical, ordinal. (c) Numerical, discrete.

② Descriptive Statistics

Summarize data with numbers and graphs

Measures of Center

| Measure | Formula | Best for |
|---|---|---|
| Mean \(\bar{x}\) | \(\sum x_i / n\) | Symmetric data |
| Median | Middle value | Skewed data (e.g., income) |
| Mode | Most frequent | Categorical data |

Shape

  • Right-skewed: mean > median → use median
  • Left-skewed: mean < median → use median
  • Symmetric: mean ≈ median → either works

Classic Exam Scenario

Annual salaries: mean = $85,000, median = $62,000.

→ Right-skewed (a few very high earners pull the mean up)
→ Report the median as the “typical” salary
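A quick way to see the effect is to compare the two measures on a small data set. The salaries below are hypothetical, chosen so that one large value pulls the mean up; the mean-versus-median comparison is a rule of thumb for shape, not a formal test.

```python
from statistics import mean, median

# Hypothetical salaries in dollars: most are moderate, one is very high.
salaries = [48_000, 52_000, 55_000, 60_000, 62_000, 65_000, 70_000, 250_000]

avg = mean(salaries)
mid = median(salaries)

# Rule of thumb from the Shape list: compare mean and median.
if avg > mid:
    shape = "right-skewed"
elif avg < mid:
    shape = "left-skewed"
else:
    shape = "symmetric"

print(f"mean = {avg:,.0f}, median = {mid:,.0f}, shape = {shape}")
```

Dropping the 250,000 value moves the mean a lot and the median only slightly, which is why the median is the better "typical" value here.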

Measures of Spread

Standard Deviation & Variance

\[s^2 = \frac{\sum(x_i - \bar{x})^2}{n-1} \qquad s = \sqrt{s^2}\]

Larger \(s\) = more variability around the mean
\(s\) is in the same units as the data (variance is in squared units)
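The formula can be checked by hand in Python and compared with the standard library. The revenue figures are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical daily revenues in dollars.
revenue = [1200, 1500, 900, 1800, 1100]

n = len(revenue)
x_bar = sum(revenue) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in revenue) / (n - 1)
s = sqrt(s2)  # sample standard deviation, back in dollars

# statistics.variance and statistics.stdev also use the n - 1 divisor.
print(s2, variance(revenue))
print(s, stdev(revenue))
```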

IQR & Outlier Rule

\[\text{IQR} = Q_3 - Q_1\]

Outlier boundaries:

Lower: Q₁ − 1.5 × IQR

Upper: Q₃ + 1.5 × IQR

IQR is robust: it is largely unaffected by outliers
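A minimal sketch of the outlier rule follows. It takes the quartiles as inputs, because software packages compute quartiles with slightly different conventions. The function names and the example quartiles are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) boundaries from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(value, q1, q3):
    """A value is flagged if it falls outside either boundary."""
    lower, upper = outlier_fences(q1, q3)
    return value < lower or value > upper

# Hypothetical quartiles: Q1 = 40, Q3 = 70, so IQR = 30.
lower, upper = outlier_fences(40, 70)
print(lower, upper)             # boundaries
print(is_outlier(120, 40, 70))  # above the upper boundary
print(is_outlier(100, 40, 70))  # inside the boundaries
```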

Practice ✏️: Descriptive Statistics

Q1. Q1 = 20, Q3 = 50. Is the value 105 an outlier?

👉 IQR = 30. Upper fence = 50 + 1.5(30) = 95. Since 105 > 95 → yes, outlier.

Q2. A business owner sees daily revenue has SD = $1,260. Interpret.

👉 On a typical day, revenue deviates from the mean by about $1,260. Higher SD = more day-to-day variability.

Q3. A histogram of housing prices is right-skewed. Should you report mean or median to represent a “typical” price?

👉 Median: the mean is pulled upward by a few very expensive homes and overstates what is “typical.”

③ Probability Rules

The language of uncertainty in business decisions

Core Probability Rules

| Rule | Formula |
|---|---|
| Addition Rule | \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\) |
| Multiplication (independent) | \(P(A \cap B) = P(A) \cdot P(B)\) |
| Conditional Probability | \(P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}\) |
| Law of Total Probability | \(P(B) = P(B\mid A)\,P(A) + P(B\mid A^c)\,P(A^c)\) |
| Independence Check | \(P(A \mid B) = P(A)\) |

Warning

Independent ≠ Mutually Exclusive!
Mutually exclusive means \(P(A \cap B) = 0\): they cannot both happen.
If \(P(A) > 0\) and \(P(B) > 0\), they cannot be both independent AND mutually exclusive.
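The rules translate directly into small functions. This is a sketch with hypothetical probabilities and made-up function names; `isclose` is used because floating-point products are rarely exactly equal.

```python
from math import isclose

def union(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B)."""
    return p_a + p_b - p_a_and_b

def conditional(p_a_and_b, p_b):
    """Conditional probability: P(A | B)."""
    return p_a_and_b / p_b

def total_probability(p_b_given_a, p_a, p_b_given_not_a):
    """Law of total probability for P(B), splitting on A and not-A."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

def independent(p_a, p_b, p_a_and_b):
    """Independence check: P(A and B) equals P(A) * P(B)."""
    return isclose(p_a_and_b, p_a * p_b)

# Hypothetical events: P(A) = 0.5, P(B) = 0.2, P(A and B) = 0.1.
print(union(0.5, 0.2, 0.1))        # P(A or B)
print(conditional(0.1, 0.2))       # P(A | B), equal to P(A) here
print(independent(0.5, 0.2, 0.1))  # True

# Same marginals but mutually exclusive: P(A and B) = 0, so not independent.
print(independent(0.5, 0.2, 0.0))  # False

# Hypothetical split: P(B | A) = 0.9, P(A) = 0.25, P(B | not A) = 0.2.
print(total_probability(0.9, 0.25, 0.2))
```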

Contingency Tables

200 customers: Premium member × Made purchase

| | Purchased | Did Not | Total |
|---|---|---|---|
| Premium | 70 | 30 | 100 |
| Non-Prem. | 40 | 60 | 100 |
| Total | 110 | 90 | 200 |

Three probability types from one table:

  • Marginal: \(P(\text{Purchased}) = 110/200 = 0.55\)
  • Joint: \(P(\text{Premium} \cap \text{Purch.}) = 70/200 = 0.35\)
  • Conditional: \(P(\text{Purch.} \mid \text{Premium}) = 70/100 = 0.70\)

Independence Check

\(P(\text{Purchased}) = 0.55\)
\(P(\text{Purchased} \mid \text{Premium}) = 0.70\)

\(0.70 \neq 0.55\)
→ NOT independent
Knowing someone is Premium does change the probability they purchase.
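The same three probabilities and the independence check can be reproduced from the raw counts. The dictionary layout and variable names are just one possible way to store the table.

```python
from math import isclose

# Counts from the 200-customer table: (membership, outcome) -> count.
counts = {
    ("Premium", "Purchased"): 70,
    ("Premium", "Did Not"): 30,
    ("Non-Premium", "Purchased"): 40,
    ("Non-Premium", "Did Not"): 60,
}
total = sum(counts.values())

# Marginal probabilities: sum a whole column or row, divide by the grand total.
p_purchased = sum(v for (m, o), v in counts.items() if o == "Purchased") / total
p_premium = sum(v for (m, o), v in counts.items() if m == "Premium") / total

# Joint probability: one cell divided by the grand total.
p_joint = counts[("Premium", "Purchased")] / total

# Conditional probability: joint divided by the marginal of the condition.
p_purchased_given_premium = p_joint / p_premium

# Independent only if conditioning on Premium leaves P(Purchased) unchanged.
is_independent = isclose(p_purchased_given_premium, p_purchased)
print(p_purchased, p_joint, p_purchased_given_premium, is_independent)
```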

Practice ✏️: Probability

Q1. \(P(A) = 0.4\), \(P(B) = 0.3\), \(P(A \cap B) = 0.12\). Are A and B independent? Find \(P(A \cup B)\).

👉 \(P(A) \cdot P(B) = 0.12 = P(A\cap B)\) ✓ Independent. \(P(A\cup B) = 0.4+0.3-0.12 = \mathbf{0.58}\)

Q2. 60% of loan applicants are approved. Of approved, 80% earn >$50k. Of denied, 30% earn >$50k. Find \(P(\text{income} > \$50k)\).

👉 LOTP: \(0.80(0.60) + 0.30(0.40) = 0.48 + 0.12 = \mathbf{0.60}\)

Q3. Can two events be both independent and mutually exclusive (assuming \(P(A)>0\), \(P(B)>0\))?

👉 No. Mutually exclusive → \(P(A\cap B) = 0\). Independence requires \(P(A\cap B) = P(A)\cdot P(B) > 0\). Contradiction.

④ Discrete Random Variables

Countable outcomes with assigned probabilities

Expected Value & Variance

Core Formulas

\[E(X) = \sum x \cdot P(X = x)\]

\[\text{Var}(X) = \sum (x - E(X))^2 \cdot P(X=x)\]

\[\text{SD}(X) = \sqrt{\text{Var}(X)}\]

Linear Transformations

\[E(aX + b) = a\,E(X) + b\] \[\text{Var}(aX + b) = a^2\,\text{Var}(X)\]

\(b\) shifts the mean but has zero effect on variance!

Example: Profit per Contract

| Profit \(x\) | \(P(X=x)\) | \(x \cdot P\) |
|---|---|---|
| −$10,000 | 0.20 | −2,000 |
| $0 | 0.30 | 0 |
| $20,000 | 0.50 | 10,000 |

\[E(X) = \$8{,}000 \text{ per contract}\]

Interpretation: On average, each contract earns $8,000.
Valid? \(\sum P = 1.00\) ✓, all \(P \geq 0\) ✓
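The profit table can be run through the core formulas in Python. The table values come from the example; the transformation constants `a` and `b` at the end are hypothetical and only illustrate the linear-transformation rules.

```python
from math import isclose, sqrt

# Profit distribution from the table above: value -> probability.
pmf = {-10_000: 0.20, 0: 0.30, 20_000: 0.50}

def is_valid_pmf(pmf):
    """All probabilities are non-negative and they sum to 1."""
    return all(p >= 0 for p in pmf.values()) and isclose(sum(pmf.values()), 1.0)

def expected_value(pmf):
    """E(X): each value weighted by its probability."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X): squared deviations from E(X), weighted by probability."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

mu = expected_value(pmf)
var = variance(pmf)
sd = sqrt(var)
print(is_valid_pmf(pmf), mu, var, round(sd, 2))

# Hypothetical linear transformation Y = aX + b (a and b are made up here).
a, b = 0.5, 1_000
pmf_y = {a * x + b: p for x, p in pmf.items()}
print(expected_value(pmf_y), a * mu + b)  # the two should agree
print(variance(pmf_y), a ** 2 * var)      # b drops out of the variance
```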

Binomial Distribution

Use When (BINS)

  • Binary outcome (success/failure)
  • Independent trials
  • N is fixed (\(n\))
  • Same probability \(p\) each trial

Formulas

\[P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}\] \[E(X) = np \qquad SD(X) = \sqrt{np(1-p)}\]

Business Example

A sales rep has a 30% success rate and makes 20 calls per day.

\(n = 20\), \(p = 0.30\)

\[E(X) = 20 \times 0.30 = 6 \text{ sales}\] \[SD(X) = \sqrt{20 \times 0.30 \times 0.70} \approx 2.05\]

\[P(X=8) = \binom{20}{8}(0.30)^8(0.70)^{12} \approx 0.114\]

“On average 6 sales per day, varying by about 2.”
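The sales-rep numbers can be reproduced with the standard library alone. `binomial_pmf` is a hypothetical helper name; `math.comb` supplies the binomial coefficient.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 20, 0.30               # 20 calls, 30% success rate
mean_sales = n * p            # E(X) = np
sd_sales = sqrt(n * p * (1 - p))
p_eight = binomial_pmf(8, n, p)

print(mean_sales, round(sd_sales, 2), round(p_eight, 3))

# Sanity check: the probabilities over k = 0..n should add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
```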

Poisson Distribution

Use When

  • Counting events over a fixed interval of time or space
  • Events occur independently at a known, constant average rate \(\lambda\)
  • No fixed \(n\): events simply arrive

Formulas

\[P(X=k) = \frac{e^{-\lambda}\,\lambda^k}{k!}\] \[E(X) = \lambda \qquad \text{Var}(X) = \lambda\]

Example: Customer Arrivals

Coffee shop: 4 customers/hour → \(\lambda = 4\)

\[P(X=2) = \frac{e^{-4} \cdot 4^2}{2!} \approx 0.147\]

\[E(X) = 4 \quad SD(X) = \sqrt{4} = 2\]
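The coffee-shop calculation, sketched in Python. `poisson_pmf` is a hypothetical helper name built from the formula above.

```python
from math import exp, factorial, sqrt

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 4                       # average of 4 customers per hour
p_two = poisson_pmf(2, lam)   # exactly 2 customers in an hour
sd_arrivals = sqrt(lam)       # Var(X) = lam, so SD(X) = sqrt(lam)

print(round(p_two, 3), sd_arrivals)
```

If the interval changes, rescale the rate first: at 4 customers per hour, a 30-minute window would use a rate of 2.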

Binomial vs Poisson

| | Binomial | Poisson |
|---|---|---|
| Fixed \(n\)? | ✅ Yes | ❌ No |
| Two outcomes? | ✅ | ❌ (counts) |
| Parameters | \(n, p\) | \(\lambda\) |

Practice ✏️: Random Variables

Q1. Weekly sales: \(E(X) = 50\), \(SD(X) = 8\). Bonus \(B = 3X + 200\). Find \(E(B)\) and \(SD(B)\).

👉 \(E(B) = 3(50)+200 = \$350\). \(\text{Var}(B) = 9 \times 64 = 576\). \(SD(B) = \sqrt{576} = \mathbf{\$24}\).

Q2. Inspector checks 15 items; 10% defect rate. Find \(E(\text{# defective})\) and \(P(X=0)\).

👉 Binomial, \(n=15\), \(p=0.10\). \(E(X) = 1.5\). \(P(X=0) = 0.90^{15} \approx \mathbf{0.206}\).

Q3. Call center receives 3 complaints/hour on average. Which distribution? Find \(P(X=5)\).

👉 Poisson, \(\lambda=3\). \(P(X=5) = e^{-3} \cdot 3^5 / 5! \approx \mathbf{0.101}\).

⑤ Continuous: Uniform Distribution

Uniform Distribution

Warning

For continuous distributions, \(P(X = \text{exactly } c) = 0\).
Probabilities = areas under the curve (or rectangle for Uniform).

\(X \sim \text{Uniform}(a, b)\)

\[P(c \leq X \leq d) = \frac{d-c}{b-a}\]

\[E(X) = \frac{a+b}{2} \qquad \text{Var}(X) = \frac{(b-a)^2}{12}\]

The “rectangle” has width \((d-c)\) and height \(\frac{1}{b-a}\).
Area = probability.

Example: Delivery Times

\(X \sim U(20, 60)\) minutes

\[P(30 \leq X \leq 45) = \frac{45-30}{60-20} = \frac{15}{40} = \mathbf{0.375}\]

\[E(X) = \frac{20+60}{2} = 40 \text{ min}\]

\[\text{Var}(X) = \frac{(60-20)^2}{12} \approx 133.3\]

“37.5% of deliveries arrive in the 30–45 min window.”
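The delivery-time example as a short Python sketch. The function names are hypothetical. Clipping the interval to \([a, b]\) is an added convenience so that requests extending past the support still return the correct area.

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Uniform(a, b): width of the overlap times 1/(b - a)."""
    lo, hi = max(c, a), min(d, b)
    return max(hi - lo, 0) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_var(a, b):
    return (b - a) ** 2 / 12

a, b = 20, 60                       # delivery time in minutes
print(uniform_prob(30, 45, a, b))   # area of the 30-to-45 rectangle
print(uniform_mean(a, b))           # midpoint of the interval
print(round(uniform_var(a, b), 1))
print(uniform_prob(30, 30, a, b))   # a single point has zero width
```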

Practice ✏️: Uniform Distribution

Q1. \(X \sim U(5, 25)\). Find \(P(X > 18)\) and \(E(X)\).

👉 \(P(X>18) = (25-18)/(25-5) = 7/20 = \mathbf{0.35}\). \(E(X) = (5+25)/2 = \mathbf{15}\).

Q2. Bag weight \(X \sim U(490, 510)\) g. What fraction weigh between 495 and 505 g?

👉 \(P(495 \leq X \leq 505) = (505-495)/(510-490) = 10/20 = \mathbf{0.50}\), i.e., half of the bags.

Q3. \(X \sim U(0, 12)\). Find \(\text{Var}(X)\) and interpret for scheduling.

👉 \(\text{Var}(X) = (12-0)^2/12 = 12\), \(SD \approx 3.46\). If \(X\) is an appointment length in minutes, appointments vary by about 3.5 min from the average of 6 min.

Which Distribution? Quick Reference

| | Binomial | Poisson | Uniform | Custom Table |
|---|---|---|---|---|
| Data type | Discrete | Discrete | Continuous | Discrete |
| Fixed \(n\)? | ✅ | ❌ | ❌ | varies |
| Two outcomes? | ✅ | ❌ | ❌ | varies |
| Parameters | \(n, p\) | \(\lambda\) | \(a, b\) | table |
| \(E(X)\) | \(np\) | \(\lambda\) | \((a+b)/2\) | \(\sum x P(x)\) |
| Classic context | # successes in trials | counts/arrivals per interval | any value in \([a,b]\) equally likely | given probabilities |

Use context clues from the scenario: Does it mention “n trials”? → Binomial. “Per hour/day/km”? → Poisson. “Between a and b, equally likely”? → Uniform.

Free Response Strategy

20 points · 2 questions · ~20 min each

Attacking Free Response Questions

  1. Read carefully. Identify: What distribution? What quantity? What context?

  2. State your setup: name the distribution, define the variable, state the parameters
    e.g., “Let X = # defective items. X ~ Binomial, n = 15, p = 0.10”

  3. Show your work: write the formula first, then substitute values. Partial credit is available even if the final answer is wrong.

  4. Compute accurately: use the formula sheet. Double-check on your calculator.

  5. Always interpret: write 1–2 sentences (“This means that…”) in the context of the problem.

Warning

A number alone is rarely a complete answer.
Numerical answers without written interpretation receive partial credit only.

You’ve got this! 🎯

Last-Minute Checklist

  • ✅ Answer every question: no MC penalty for guessing
  • ✅ Manage time: ~3 min/MC, ~20 min/FR
  • ✅ Show ALL work: partial credit counts
  • ✅ Always interpret results in context
  • ✅ Use context clues to pick the right distribution
  • ✅ Write legibly: if we can’t read it, we can’t grade it

Office Hours Before the Exam

Come see us if any concept is unclear!


The goal is to demonstrate your understanding of statistical concepts and their applications in economics and business.