HW4: Practice Quiz

60 Questions Covering All Learning Objectives

Published: 28 April 2026

0.1 📚 How to Use This Practice Quiz

  • Answer each question before revealing the solution
  • Check the explanation to understand the concept
  • Track your progress — try to get at least 80% correct!
  • Focus on weak areas — multiple choice questions are organized by learning objective, and there is a section at the end to calculate and interpret probabilities.
  • Write your answers on paper — the exam will include a free response section where you will have to show your work.
  • Use the reference formulas provided in the Midterm Guide — they are the same as the ones you will have access to on the exam.
  • Take it multiple times — repetition helps retention!

1 Section 1: Basic Concepts & Definitions

1.1 Question 1

Learning Objective: Parameter vs. Statistic

A financial analyst surveys 300 publicly traded companies and calculates an average quarterly profit of $4.2 million. This value is best described as a:

  A. Parameter
  B. Statistic
  C. Population
  D. Variable

Correct Answer: B

Explanation: A statistic is a numerical summary calculated from a sample. Here, the 300 companies are a sample drawn from the population of all publicly traded companies. If the analyst had access to every publicly traded company, the average would be a parameter.


1.2 Question 2

Learning Objective: Sampling Methods

A national bank wants to study customer satisfaction. It divides all branches into four groups by size (small, medium, large, flagship) and randomly selects 30 branches from each group to survey. This is an example of:

  A. Cluster sampling
  B. Convenience sampling
  C. Stratified sampling
  D. Simple random sampling

Correct Answer: C

Explanation: Stratified sampling divides the population into subgroups (strata) based on a relevant characteristic — here, branch size — and randomly samples from each stratum. This ensures all branch sizes are represented in the study.


1.3 Question 3

Learning Objective: Types of Variables

Which of the following is a continuous numerical variable in a business context?

  A. Number of customer service tickets opened per day
  B. Credit score category (Poor, Fair, Good, Excellent)
  C. Number of quarterly earnings reports a company has filed
  D. Market capitalization of a firm measured in dollars

Correct Answer: D

Explanation: Market capitalization can take any non-negative value along a continuous scale (e.g., $4.217 billion), making it continuous. The other options are either discrete counts (A, C) or categorical (B).


1.4 Question 4

Learning Objective: Experimental Design

A company randomly assigns 120 employees to either a four-day workweek (treatment) or the standard five-day workweek (control) for three months, then compares productivity scores. The key feature that allows a cause-and-effect conclusion is:

  A. The large sample size of 120 employees
  B. The use of a control group
  C. The random assignment of employees to groups
  D. The three-month duration of the study

Correct Answer: C

Explanation: Random assignment is the critical feature. It distributes confounding variables (like seniority, department, or prior productivity) evenly across groups by chance, so any difference in outcomes can be attributed to the treatment itself. Without it, the study would be observational and could not establish causation.


1.5 Question 5

Learning Objective: Research Ethics

An e-commerce platform secretly changes the pricing display for half of its users to test whether a new layout increases purchases, without informing any users or giving them the option to opt out. This primarily violates:

  A. Random assignment
  B. Statistical significance
  C. Informed consent
  D. Control group requirements

Correct Answer: C

Explanation: Informed consent requires that study participants be told they are part of an experiment and given the opportunity to decline participation. Conducting an experiment on users without their knowledge violates this fundamental ethical principle, regardless of whether the study is scientifically well-designed.


1.6 Question 6

Learning Objective: Sources of Bias

An economist wants to estimate average household income in a large city. She surveys visitors at a high-end business hotel on weekday mornings. The primary concern with this method is:

  A. Response bias
  B. Nonresponse bias
  C. Sampling bias
  D. Measurement error

Correct Answer: C

Explanation: Sampling bias occurs when the sample is not representative of the population. Business hotel guests on weekday mornings are overwhelmingly business travelers with above-average incomes, so the sample systematically over-represents high earners and will likely overestimate the city’s average household income.


1.7 Question 7

Learning Objective: Parameter vs. Statistic

The true average annual return on all U.S. stock mutual funds over the past decade is a:

  A. Statistic
  B. Sample
  C. Variable
  D. Parameter

Correct Answer: D

Explanation: A parameter is a numerical summary of an entire population. The phrase “all U.S. stock mutual funds” refers to the complete population of such funds, so the resulting average is a parameter — a fixed (though often unknown) value.


1.8 Question 8

Learning Objective: Sampling Methods

A management consulting firm wants to study employee engagement. They randomly select 6 office locations from the company’s 40 locations, then survey every employee at those 6 locations. This is:

  A. Stratified sampling
  B. Simple random sampling
  C. Systematic sampling
  D. Cluster sampling

Correct Answer: D

Explanation: Cluster sampling randomly selects entire groups (clusters) — here, office locations — and then includes all members of the selected clusters. Unlike stratified sampling, the clusters are not subgroups defined by a characteristic; they are just naturally occurring groupings, and only some clusters are selected.


2 Section 2: Data Displays

2.1 Question 9

Learning Objective: Interpreting Histograms

A histogram of annual bonuses at an investment bank shows most values clustered between $10,000 and $40,000, with a long tail extending toward $500,000 and beyond. This distribution is best described as:

  A. Left-skewed
  B. Symmetric
  C. Right-skewed
  D. Bimodal

Correct Answer: C

Explanation: When the tail extends to the right (toward larger values), the distribution is right-skewed. Here, the few very large bonuses earned by senior executives pull the tail to the right, while most employees receive moderate bonuses clustered on the left.


2.2 Question 10

Learning Objective: Choosing Displays

A data analyst wants to visualize whether there is a relationship between advertising spending (in thousands of dollars) and monthly sales revenue (in thousands of dollars) across 50 stores. The best display is:

  A. Pie chart
  B. Histogram
  C. Bar chart
  D. Scatterplot

Correct Answer: D

Explanation: A scatterplot is the appropriate display for examining the relationship between two quantitative variables. Each store appears as one point, with advertising spending on one axis and sales revenue on the other, revealing any association or trend.


2.3 Question 11

Learning Objective: Boxplot Interpretation

A boxplot of quarterly revenue for a sample of retail companies shows that the whisker above the box is noticeably longer than the whisker below the box. This suggests the distribution of revenues is:

  A. Left-skewed
  B. Symmetric
  C. Uniform
  D. Right-skewed

Correct Answer: D

Explanation: A longer upper whisker indicates that the upper portion of the data (above Q3) spreads out farther than the lower portion (below Q1). This is a sign of right-skewness, consistent with revenue data where a few high-performing companies pull the distribution upward.


2.4 Question 12

Learning Objective: Choosing Displays

A business analyst wants to show the percentage of a company’s total revenue that comes from each of its five product lines. The most appropriate display is:

  A. Histogram
  B. Scatterplot
  C. Boxplot
  D. Pie chart or bar chart

Correct Answer: D

Explanation: Pie charts and bar charts are designed to display proportions or frequencies of categorical data. Here, the five product lines are categories, and we want to compare their share of total revenue. A histogram would be inappropriate since it is for continuous quantitative data.


3 Section 3: Measures of Central Tendency

3.1 Question 13

Learning Objective: Choosing the Right Measure

A dataset of annual salaries at a tech startup includes: $55,000; $60,000; $62,000; $65,000; $68,000; and $1,200,000 (the CEO). Which measure of central tendency best represents the typical employee salary?

  A. Mean
  B. Mode
  C. Range
  D. Median

Correct Answer: D

Explanation: The median is the better choice here because it is resistant to extreme values (outliers). The CEO’s salary of $1.2M would dramatically inflate the mean, making it unrepresentative of what a typical employee earns. The median focuses on the middle of the distribution, unaffected by the extreme value.
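
As an illustrative check, here is a minimal Python sketch that compares the two measures on the six salaries from the question. It uses only the standard library's `statistics` module; the variable names are chosen for illustration.

```python
from statistics import mean, median

# Salaries from the question; the last value is the CEO's.
salaries = [55_000, 60_000, 62_000, 65_000, 68_000, 1_200_000]

mean_salary = mean(salaries)      # pulled far upward by the single extreme value
median_salary = median(salaries)  # average of the two middle values (62,000 and 65,000)

print(f"mean   = ${mean_salary:,.2f}")
print(f"median = ${median_salary:,.2f}")
```

The mean lands near $251,667, far above what any non-CEO employee earns, while the median is $63,500.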


3.2 Question 14

Learning Objective: Mode

A retail store records the following number of daily transactions over a week: 142, 158, 158, 165, 172, 158, 180. What is the mode?

  A. 158
  B. 165
  C. 162
  D. There is no mode

Correct Answer: A

Explanation: The mode is the value that appears most frequently in the data. The value 158 appears three times, while all other values appear only once. Therefore, the mode is 158.


3.3 Question 15

Learning Objective: Median

A financial advisor records the following client portfolio returns (in %) for one quarter: −2, 3, 5, 8, 12. What is the median return?

  A. 3%
  B. 5%
  C. 6.5%
  D. 8%

Correct Answer: B

Explanation: With 5 values arranged in order (−2, 3, 5, 8, 12), the median is the 3rd value: 5%. For an odd number of observations, the median is always the middle value after sorting.


3.4 Question 16

Learning Objective: Properties of the Mean

A company discovers that all reported sales figures for last quarter were understated by $5,000. After correcting every entry by adding $5,000, what happens to the mean?

  A. The mean stays the same
  B. The mean increases by $5,000
  C. The mean is multiplied by $5,000
  D. The mean doubles

Correct Answer: B

Explanation: Adding a constant to every value in a dataset shifts the mean by that same constant. If every observation increases by $5,000, the average also increases by $5,000. This follows from the property: \(\bar{x}_{\text{new}} = \bar{x}_{\text{old}} + 5{,}000\).


3.5 Question 17

Learning Objective: Sigma Notation

For any dataset, what does \(\sum_{i=1}^{n}(x_i - \bar{x})\) always equal?

  A. The sample variance
  B. The standard deviation
  C. The sample size
  D. Zero

Correct Answer: D

Explanation: The sum of deviations from the mean always equals zero for any dataset. This is a fundamental property of the mean: positive deviations (values above the mean) and negative deviations (values below the mean) cancel each other out perfectly.
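
A short derivation shows why. It uses only the definition \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\), which implies \(\sum_{i=1}^{n} x_i = n\bar{x}\):

\[\sum_{i=1}^{n}(x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = n\bar{x} - n\bar{x} = 0\]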


4 Section 4: Measures of Spread

4.1 Question 18

Learning Objective: Interpreting Standard Deviation

Two investment funds both report an average annual return of 8%. Fund A has a standard deviation of 2%, and Fund B has a standard deviation of 12%. Which statement is correct?

  A. Fund A has more variable returns than Fund B
  B. Fund B has more consistent returns than Fund A
  C. Fund B has more variable returns than Fund A
  D. Both funds have equal variability

Correct Answer: C

Explanation: Standard deviation measures variability. A larger SD means returns fluctuate more widely around the average. Fund B’s SD of 12% indicates its annual returns swing dramatically — this is a riskier investment. Fund A’s SD of 2% indicates much more consistent, predictable returns.


4.2 Question 19

Learning Objective: Range

A financial analyst records daily closing prices for a stock over a week: $42, $45, $41, $49, $44. What is the range?

  A. $4
  B. $7
  C. $8
  D. $44

Correct Answer: C

Explanation: Range = Maximum − Minimum = $49 − $41 = $8. The range gives the total spread of the data but is sensitive to extreme values, since it only uses the two most extreme observations.


4.3 Question 20

Learning Objective: IQR

The interquartile range (IQR) captures:

  A. The middle 25% of the data
  B. The middle 50% of the data
  C. The middle 75% of the data
  D. The entire range of the data

Correct Answer: B

Explanation: IQR = Q3 − Q1. Since Q1 is the 25th percentile and Q3 is the 75th percentile, the IQR spans the middle 50% of the data. It is a resistant measure of spread, unaffected by extreme values.


4.4 Question 21

Learning Objective: Properties of Standard Deviation

A consulting firm converts all reported revenues from thousands of dollars to dollars (i.e., multiplies every value by 1,000). What happens to the standard deviation?

  A. It stays the same
  B. It is divided by 1,000
  C. It increases by 1,000
  D. It is multiplied by 1,000

Correct Answer: D

Explanation: Multiplying every value in a dataset by a constant \(c\) multiplies the standard deviation by \(|c|\). Here, \(c = 1{,}000\), so the standard deviation increases by a factor of 1,000. (By contrast, adding a constant to all values does not change the SD at all.)
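
The scaling and shifting rules can be checked numerically. The sketch below uses a small set of hypothetical revenue figures (invented for illustration, not taken from the quiz) and the standard library's `statistics` module.

```python
from statistics import mean, stdev

# Hypothetical revenues in thousands of dollars (illustrative values only).
revenues_k = [120, 95, 143, 110, 132]

scaled = [1_000 * x for x in revenues_k]  # convert thousands of dollars to dollars
shifted = [x + 5 for x in revenues_k]     # add the same constant to every value

sd_ratio = stdev(scaled) / stdev(revenues_k)        # factor by which the SD changed
sd_change = stdev(shifted) - stdev(revenues_k)      # change in SD after shifting
mean_change = mean(shifted) - mean(revenues_k)      # change in mean after shifting

print(sd_ratio, sd_change, mean_change)
```

Up to floating-point rounding, the SD ratio is 1,000, the SD is unchanged by the shift, and the mean moves by exactly the constant that was added.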


4.5 Question 22

Learning Objective: IQR Calculation

A sample of employee years-of-experience data has Q1 = 3 years and Q3 = 11 years. What is the IQR, and what is the upper fence for outlier detection?

  A. IQR = 8; upper fence = 23
  B. IQR = 8; upper fence = 19
  C. IQR = 14; upper fence = 32
  D. IQR = 8; upper fence = 27

Correct Answer: A

Explanation: IQR = Q3 − Q1 = 11 − 3 = 8. Upper fence = Q3 + 1.5 × IQR = 11 + 1.5 × 8 = 11 + 12 = 23. Any employee with more than 23 years of experience would be flagged as a potential outlier.
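
The fence calculation can be wrapped in a small helper. This is a sketch only: the function name `iqr_fences` is hypothetical, and the lower fence (Q1 − 1.5 × IQR) is included for completeness even though the question asks only for the upper one.

```python
def iqr_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float, float]:
    """Return (IQR, lower fence, upper fence) using the k * IQR rule."""
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr

# Quartiles from the question: Q1 = 3 years, Q3 = 11 years.
iqr, lower, upper = iqr_fences(q1=3, q3=11)
print(iqr, lower, upper)
```

Here the lower fence is negative, so no value of years of experience can fall below it; only unusually long tenures can be flagged.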


4.6 Question 23

Learning Objective: Skewness and Central Tendency

For a left-skewed distribution of employee performance scores, which relationship between mean and median typically holds?

  A. Mean > Median
  B. Mean = Median
  C. Mean < Median
  D. Cannot be determined

Correct Answer: C

Explanation: In a left-skewed distribution, the long tail extends toward lower values. These low outliers pull the mean downward more than the median, so Mean < Median. Think of a distribution where most employees score high, but a small group scores very poorly — the mean gets dragged down by those low scores.


5 Section 5: Probability Fundamentals

5.1 Question 24

Learning Objective: Probability Basics

A market research firm says the probability that a randomly selected adult prefers online shopping is 0.68. This is consistent with which rule of probability?

  A. The probability is negative, which is invalid
  B. The probability exceeds 1, which is invalid
  C. The probability is between 0 and 1, which is valid
  D. Probabilities must equal exactly 0.5

Correct Answer: C

Explanation: Any probability must satisfy \(0 \leq P(A) \leq 1\). A value of 0.68 falls within this range, so it is a valid probability. A probability of 0 means the event never occurs; a probability of 1 means it always occurs.


5.2 Question 25

Learning Objective: Complement Rule

The probability that a randomly selected business loan application is approved is 0.62. What is the probability that a randomly selected application is not approved?

  A. 0.62
  B. 0.38
  C. 1.62
  D. 0.50

Correct Answer: B

Explanation: The complement rule states \(P(A^c) = 1 - P(A)\). Here, \(P(\text{Not Approved}) = 1 - 0.62 = \mathbf{0.38}\).


5.3 Question 26

Learning Objective: Mutually Exclusive Events

For a given quarter, let event A = “the company reports a profit” and event B = “the company reports a loss.” These events are:

  A. Independent
  B. Mutually exclusive
  C. Both mutually exclusive and independent
  D. Neither mutually exclusive nor independent

Correct Answer: B

Explanation: Events are mutually exclusive if they cannot occur at the same time. A company cannot simultaneously report a profit and a loss for the same quarter — these are opposite outcomes, so \(P(A \text{ AND } B) = 0\). Note that mutually exclusive events with positive probabilities are not independent.


5.4 Question 27

Learning Objective: Independent Events

Two events A and B are independent. Which of the following must be true?

  A. \(P(A \text{ AND } B) = 0\)
  B. \(P(A \text{ OR } B) = P(A) + P(B)\)
  C. \(P(A \text{ AND } B) = P(A) \times P(B)\)
  D. \(P(A) = P(B)\)

Correct Answer: C

Explanation: Two events are independent if knowing one occurred gives no information about the other. Formally: \(P(A \text{ AND } B) = P(A) \times P(B)\). Option A describes mutually exclusive events. Option B holds only when the events are mutually exclusive, because the overlap term \(P(A \text{ AND } B)\) in the general addition rule is then zero. Option D is unrelated to independence: independent events can have different probabilities.


5.5 Question 28

Learning Objective: Addition Rule

A venture capital firm is reviewing two startup investments. The probability that Startup A succeeds is 0.40, the probability that Startup B succeeds is 0.55, and the probability that both succeed is 0.25. What is the probability that at least one startup succeeds?

  A. 0.40
  B. 0.55
  C. 0.70
  D. 0.95

Correct Answer: C

Explanation: By the general addition rule: \(P(A \text{ OR } B) = P(A) + P(B) - P(A \text{ AND } B) = 0.40 + 0.55 - 0.25 = \mathbf{0.70}\).


5.6 Question 29

Learning Objective: Multiplication Rule

A supply chain manager knows that the probability of a shipping delay on any given route is 0.15, and delays on different routes are independent. If two separate routes are used, what is the probability that both experience a delay?

  A. 0.30
  B. 0.15
  C. 0.0225
  D. 0.2775

Correct Answer: C

Explanation: For independent events, \(P(A \text{ AND } B) = P(A) \times P(B) = 0.15 \times 0.15 = \mathbf{0.0225}\). There is only a 2.25% chance both routes are delayed simultaneously.


5.7 Question 30

Learning Objective: Conditional Probability

At a financial firm, the probability that an employee is a senior analyst and holds a CFA certification is 0.18. The probability that an employee is a senior analyst is 0.30. What is the probability that an employee holds a CFA certification, given that they are a senior analyst?

  A. 0.18
  B. 0.30
  C. 0.54
  D. 0.60

Correct Answer: D

Explanation: \(P(\text{CFA} \mid \text{Senior}) = \dfrac{P(\text{CFA AND Senior})}{P(\text{Senior})} = \dfrac{0.18}{0.30} = \mathbf{0.60}\). Among senior analysts, 60% hold a CFA certification.


5.8 Question 31

Learning Objective: Law of Total Probability

A bank has two types of credit card customers: 45% are rewards card holders and 55% are standard card holders. Rewards card holders carry a balance (do not pay in full) 30% of the time, while standard card holders carry a balance 60% of the time. What is the probability that a randomly selected customer carries a balance?

  A. 0.33
  B. 0.45
  C. 0.465
  D. 0.60

Correct Answer: C

Explanation: By the Law of Total Probability: \(P(\text{Balance}) = P(\text{Balance} \mid \text{Rewards}) \times P(\text{Rewards}) + P(\text{Balance} \mid \text{Standard}) \times P(\text{Standard}) = 0.30 \times 0.45 + 0.60 \times 0.55 = 0.135 + 0.330 = \mathbf{0.465}\).
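
The four probability rules used in Questions 28 to 31 can be reproduced in a few lines of Python. This is only a sketch; the variable names and the `segments` dictionary are illustrative and not part of the course materials.

```python
# General addition rule (Question 28): P(A or B) = P(A) + P(B) - P(A and B)
p_a, p_b, p_both = 0.40, 0.55, 0.25
p_at_least_one = p_a + p_b - p_both

# Multiplication rule for independent events (Question 29)
p_delay = 0.15
p_both_delayed = p_delay * p_delay

# Conditional probability (Question 30): P(CFA | Senior) = P(CFA and Senior) / P(Senior)
p_cfa_given_senior = 0.18 / 0.30

# Law of total probability (Question 31): weight each group's rate by its share
segments = {"rewards": (0.45, 0.30), "standard": (0.55, 0.60)}  # (share, P(balance | group))
p_balance = sum(share * rate for share, rate in segments.values())

print(round(p_at_least_one, 4), round(p_both_delayed, 4),
      round(p_cfa_given_senior, 4), round(p_balance, 4))
```

Rounding is applied only for display, since floating-point sums such as 0.40 + 0.55 − 0.25 may differ from 0.70 in the last decimal places.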


6 Section 6: Contingency Tables

6.1 Question 32

Learning Objective: Reading Contingency Tables

A company surveys 400 employees about remote work. 220 employees work remotely; 180 work on-site. Of remote workers, 150 report high job satisfaction. Of on-site workers, 90 report high job satisfaction. How many employees in total report high job satisfaction?

  A. 150
  B. 240
  C. 90
  D. 400

Correct Answer: B

Explanation: Total high satisfaction = remote high satisfaction + on-site high satisfaction = 150 + 90 = 240 employees. Always check: marginal totals are found by summing across rows or columns.


6.2 Question 33

Learning Objective: Marginal Probability

Using the table from Question 32 (400 employees total, 220 remote, 180 on-site), what is the marginal probability that a randomly selected employee works remotely?

  A. 0.45
  B. 0.55
  C. 0.68
  D. 0.38

Correct Answer: B

Explanation: A marginal probability uses a row or column total divided by the grand total: \(P(\text{Remote}) = 220/400 = \mathbf{0.55}\). It is called “marginal” because it comes from the table’s margin (the row or column totals).


6.3 Question 34

Learning Objective: Joint Probability

Using the table from Questions 32–33, what is the joint probability of being a remote worker and reporting high job satisfaction?

  A. 0.375
  B. 0.55
  C. 0.60
  D. 0.682

Correct Answer: A

Explanation: A joint probability divides the cell count by the grand total: \(P(\text{Remote AND High Satisfaction}) = 150/400 = \mathbf{0.375}\). This is the probability that a randomly chosen employee is both remote and highly satisfied.


6.4 Question 35

Learning Objective: Conditional Probability from Tables

Using the table from Questions 32–34, what is \(P(\text{High Satisfaction} \mid \text{Remote})\)?

  A. 0.375
  B. 0.55
  C. 0.682
  D. 0.600

Correct Answer: C

Explanation: \(P(\text{High Satisfaction} \mid \text{Remote}) = \dfrac{\text{Remote AND High Satisfaction}}{\text{Remote Total}} = \dfrac{150}{220} \approx \mathbf{0.682}\). Among remote workers specifically, about 68% report high job satisfaction.
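
To see how the marginal, joint, and conditional probabilities in Questions 32 to 35 differ only in their denominators, the table can be rebuilt in Python. In this sketch the two "not high" cells are derived by subtraction (220 − 150 and 180 − 90), and the dictionary layout and labels are assumptions made for illustration.

```python
# Cell counts: (work location, satisfaction level) -> number of employees.
counts = {
    ("remote", "high"): 150,
    ("remote", "not_high"): 70,   # derived: 220 remote - 150 high
    ("onsite", "high"): 90,
    ("onsite", "not_high"): 90,   # derived: 180 on-site - 90 high
}

total = sum(counts.values())
remote_total = sum(n for (loc, _), n in counts.items() if loc == "remote")
high_total = sum(n for (_, sat), n in counts.items() if sat == "high")

p_remote = remote_total / total                                   # marginal: margin / grand total
p_remote_and_high = counts[("remote", "high")] / total            # joint: cell / grand total
p_high_given_remote = counts[("remote", "high")] / remote_total   # conditional: cell / row total

print(high_total, p_remote, p_remote_and_high, round(p_high_given_remote, 3))
```

The joint and conditional probabilities share the same numerator (150); only the denominator changes, from the grand total to the remote row total.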


7 Section 7: Binomial Distribution

7.1 Question 36

Learning Objective: Recognizing Binomial Scenarios

Which of the following scenarios is best modeled by a binomial distribution?

  A. The number of customer support calls received per hour
  B. The time (in minutes) until the next online order is placed
  C. Whether each of 25 independently reviewed loan applications is approved or denied
  D. The hourly revenue of a coffee shop throughout the day

Correct Answer: C

Explanation: A binomial distribution requires: (1) a fixed number of trials (here, \(n = 25\) applications); (2) two outcomes per trial (approved or denied); (3) constant probability of success; and (4) independence across trials. Option A is a count over a time interval with no fixed number of trials, which suggests a Poisson distribution; option B is a waiting time, which is better modeled by a continuous distribution; option D is a continuous monetary amount.


7.2 Question 37

Learning Objective: Expected Value — Binomial

A sales team knows that 35% of cold calls result in a scheduled meeting. The team makes 60 calls in one day. What is the expected number of meetings scheduled?

  A. 35
  B. 21
  C. 14
  D. 60

Correct Answer: B

Explanation: For \(X \sim \text{Binomial}(n = 60, p = 0.35)\): \(E(X) = np = 60 \times 0.35 = \mathbf{21}\) meetings. On average, 21 of the 60 calls are expected to lead to a scheduled meeting.


7.3 Question 38

Learning Objective: Standard Deviation — Binomial

For \(X \sim \text{Binomial}(n = 50, p = 0.40)\), what is \(\text{SD}(X)\)?

  A. 20
  B. 12
  C. 3.46
  D. 2.83

Correct Answer: C

Explanation: \(\text{SD}(X) = \sqrt{np(1-p)} = \sqrt{50 \times 0.40 \times 0.60} = \sqrt{12} \approx \mathbf{3.46}\).
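
The binomial mean and standard deviation formulas from Questions 37 and 38 can be bundled into one helper. This is a minimal sketch; the function name `binomial_mean_sd` is hypothetical.

```python
from math import sqrt

def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Return (E(X), SD(X)) for X ~ Binomial(n, p)."""
    return n * p, sqrt(n * p * (1 - p))

mean_calls, sd_calls = binomial_mean_sd(60, 0.35)  # Question 37: cold calls
mean_x, sd_x = binomial_mean_sd(50, 0.40)          # Question 38

print(round(mean_calls, 2), round(sd_calls, 2))
print(round(mean_x, 2), round(sd_x, 2))
```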


7.4 Question 39

Learning Objective: Identifying the Binomial

A quality control manager at a factory knows that 5% of products coming off the line are defective. She randomly inspects 80 products, and the inspection outcomes are independent. Which distribution models the number of defective items?

  A. Poisson(80)
  B. Binomial(80, 0.05)
  C. Uniform(0, 80)
  D. Binomial(5, 0.80)

Correct Answer: B

Explanation: The four binomial conditions are met: fixed \(n = 80\) trials, binary outcome (defective or not), constant \(p = 0.05\), and independence. Therefore \(X \sim \text{Binomial}(80,\ 0.05)\).


7.5 Question 40

Learning Objective: Binomial Properties

For a binomial distribution with fixed \(p\), what happens to the variance as \(n\) increases?

  A. The variance decreases
  B. The variance stays the same
  C. The variance increases
  D. The variance becomes 0

Correct Answer: C

Explanation: \(\text{Var}(X) = np(1-p)\). Since \(p(1-p)\) is a positive constant (for \(0 < p < 1\)), the variance increases proportionally with \(n\). More trials means more total variability in the outcome count, even though the proportion \(X/n\) becomes more stable.
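
The contrast between the count and the proportion can be made explicit. Applying the scaling rule \(\text{Var}(aX) = a^2\,\text{Var}(X)\) with \(a = 1/n\):

\[\text{Var}\!\left(\frac{X}{n}\right) = \frac{1}{n^2}\,\text{Var}(X) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}\]

So the variance of the count \(X\) grows in proportion to \(n\), while the variance of the proportion \(X/n\) shrinks toward zero as \(n\) increases.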


8 Section 8: Poisson Distribution

8.1 Question 41

Learning Objective: Recognizing Poisson Scenarios

Which of the following is best modeled by a Poisson distribution?

  A. Number of successes in 30 independent sales calls with a 20% success rate
  B. The continuous time between consecutive ATM withdrawals
  C. Whether an individual customer defaults on a loan or not
  D. Number of fraud alerts triggered on a bank’s system per hour

Correct Answer: D

Explanation: The Poisson distribution models counts of events occurring at a constant average rate over a fixed time or space interval, with events occurring independently. Fraud alerts per hour fit this perfectly. Option A is binomial (fixed trials, two outcomes). Option B is continuous. Option C is binary (Bernoulli).


8.2 Question 42

Learning Objective: Expected Value — Poisson

A corporate IT help desk receives an average of 7 support tickets per hour. If the number of tickets follows a Poisson distribution, what is \(E(X)\) and \(\text{Var}(X)\)?

  A. \(E(X) = 7\); \(\text{Var}(X) = 49\)
  B. \(E(X) = 7\); \(\text{Var}(X) = 7\)
  C. \(E(X) = \sqrt{7}\); \(\text{Var}(X) = 7\)
  D. \(E(X) = 49\); \(\text{Var}(X) = 7\)

Correct Answer: B

Explanation: A key property of the Poisson distribution is that \(E(X) = \text{Var}(X) = \lambda\). With \(\lambda = 7\): \(E(X) = 7\) and \(\text{Var}(X) = 7\). The standard deviation would be \(\sqrt{7} \approx 2.65\).
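
The equality of mean and variance can be checked numerically from the Poisson probability formula \(P(X = k) = e^{-\lambda}\lambda^k / k!\). The sketch below sums over the first 60 counts only; that cutoff is an assumption made for illustration, chosen because the probability beyond it is negligible when \(\lambda = 7\).

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 7.0
ks = range(60)  # assumption: truncate the infinite sum; the remaining tail is negligible here

# Mean and variance computed directly from the definition of each.
mean_x = sum(k * poisson_pmf(k, lam) for k in ks)
var_x = sum((k - mean_x) ** 2 * poisson_pmf(k, lam) for k in ks)

print(round(mean_x, 6), round(var_x, 6))
```

Both sums agree with \(\lambda = 7\) up to floating-point rounding.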


8.3 Question 43

Learning Objective: Poisson Rate Scaling

A stock brokerage firm receives an average of 4 trade order errors per day. Assuming a Poisson process, what is the expected number of errors during a 5-day trading week?

  A. 4
  B. 0.8
  C. 20
  D. 16

Correct Answer: C

Explanation: For a Poisson process, the rate scales linearly with time. Over \(t = 5\) days: \(\lambda_{\text{week}} = 4 \times 5 = \mathbf{20}\) errors. The distribution over the 5-day period is Poisson with \(\lambda = 20\).


8.4 Question 44

Learning Objective: Poisson Properties

A market analyst says: “For a Poisson random variable, the mean and variance are always equal.” This statement is:

  A. False: the variance is always less than the mean
  B. False: the variance is always the square of the mean
  C. True: both equal \(\lambda\)
  D. True: but only when \(\lambda > 1\)

Correct Answer: C

Explanation: For any Poisson distribution, \(E(X) = \text{Var}(X) = \lambda\), regardless of the value of \(\lambda\). This equality is a distinctive property of the Poisson distribution; for a binomial, by contrast, the variance \(np(1-p)\) is smaller than the mean \(np\) whenever \(0 < p < 1\).


9 Section 9: Expected Values and Variance

9.1 Question 45

Learning Objective: Valid Probability Distribution

A financial planner models the number of new clients \(X\) she gains per month with the following table:

\(x\)           0      1      2      3
\(P(X=x)\)    0.20   0.45   0.25   0.10

Is this a valid probability distribution?

  A. No, because the probabilities do not sum to 1
  B. No, because some probabilities exceed 0.5
  C. Yes, because all probabilities are non-negative and sum to 1
  D. Yes, because all probabilities are less than 1

Correct Answer: C

Explanation: A valid discrete probability distribution requires: (1) all probabilities \(\geq 0\) — satisfied, and (2) probabilities sum to 1 — \(0.20 + 0.45 + 0.25 + 0.10 = 1.00\) ✓. Both conditions hold, so this is a valid distribution.


9.2 Question 46

Learning Objective: Expected Value — Discrete

Using the distribution from Question 45, what is \(E(X)\), the expected number of new clients per month?

  A. 1.00
  B. 1.25
  C. 1.50
  D. 2.00

Correct Answer: B

Explanation: \(E(X) = 0(0.20) + 1(0.45) + 2(0.25) + 3(0.10) = 0 + 0.45 + 0.50 + 0.30 = \mathbf{1.25}\) clients per month. On average, she gains about 1.25 new clients per month over the long run.
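
The validity check from Question 45 and the expected value from Question 46 can both be computed from the same table. This is a minimal sketch that stores the distribution as a Python dictionary; the names `pmf`, `is_valid`, and `expected` are illustrative.

```python
from math import isclose

# Distribution from Question 45: value of X -> probability.
pmf = {0: 0.20, 1: 0.45, 2: 0.25, 3: 0.10}

# Valid if every probability is non-negative and they sum to 1
# (isclose guards against floating-point rounding in the sum).
is_valid = all(p >= 0 for p in pmf.values()) and isclose(sum(pmf.values()), 1.0)

# E(X) is the probability-weighted sum of the values.
expected = sum(x * p for x, p in pmf.items())

print(is_valid, round(expected, 2))
```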


9.3 Question 47

Learning Objective: Linear Functions — Expected Value

A delivery company models the number of packages \(X\) a driver delivers per day with \(E(X) = 40\). Each delivered package generates a revenue of $12, and the driver incurs a fixed daily cost of $50. Daily profit \(Y = 12X - 50\). What is \(E(Y)\)?

  A. $430
  B. $480
  C. $530
  D. $440

Correct Answer: A

Explanation: Using \(E(aX + b) = aE(X) + b\): \(E(Y) = 12 \times E(X) - 50 = 12 \times 40 - 50 = 480 - 50 = \mathbf{\$430}\) per day.


9.4 Question 48

Learning Objective: Linear Functions — Variance

Using the delivery driver from Question 47, if \(\text{Var}(X) = 25\), what is \(\text{Var}(Y)\) where \(Y = 12X - 50\)?

  A. 300
  B. 3,600
  C. 250
  D. 25

Correct Answer: B

Explanation: Using \(\text{Var}(aX + b) = a^2 \cdot \text{Var}(X)\): \(\text{Var}(Y) = 12^2 \times 25 = 144 \times 25 = \mathbf{3{,}600}\). Note that adding or subtracting a constant (the \(-50\)) does not affect the variance — only the multiplicative constant (\(a = 12\)) matters.
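
The two linear-transformation rules from Questions 47 and 48 fit in one small function. This is a sketch under the stated formulas; the name `linear_transform` is hypothetical, and the standard deviation is added only to express the spread in dollars.

```python
from math import sqrt

def linear_transform(mean_x: float, var_x: float, a: float, b: float) -> tuple[float, float]:
    """Return (E(Y), Var(Y)) for Y = a*X + b."""
    return a * mean_x + b, a**2 * var_x

# Daily profit Y = 12X - 50 with E(X) = 40 and Var(X) = 25.
mean_y, var_y = linear_transform(mean_x=40, var_x=25, a=12, b=-50)
sd_y = sqrt(var_y)  # standard deviation of daily profit, in dollars

print(mean_y, var_y, sd_y)
```

Note that `b` appears only in the mean: changing the fixed daily cost would shift the expected profit without changing its variance.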


10 Section 10: Continuous Distributions — Uniform

10.1 Question 49

Learning Objective: Uniform Distribution Probability

A tech company’s software build time is uniformly distributed between 8 and 20 minutes. What is the probability that a build takes between 10 and 14 minutes?

  A. 0.20
  B. 0.25
  C. 0.33
  D. 0.50

Correct Answer: C

Explanation: For a Uniform distribution on \([a, b]\): \(P(c \leq X \leq d) = \dfrac{d - c}{b - a} = \dfrac{14 - 10}{20 - 8} = \dfrac{4}{12} = \dfrac{1}{3} \approx \mathbf{0.33}\).


10.2 Question 50

Learning Objective: Uniform Distribution Properties

A logistics company’s truck loading time is uniformly distributed between 30 and 90 minutes. What is the expected loading time, and what does this value represent?

  A. E(X) = 30 min; the minimum time
  B. E(X) = 90 min; the maximum time
  C. E(X) = 60 min; the average loading time over many sessions
  D. E(X) = 45 min; the most likely loading time

Correct Answer: C

Explanation: \(E(X) = \dfrac{a + b}{2} = \dfrac{30 + 90}{2} = \mathbf{60}\) minutes. This is the long-run average loading time. For a uniform distribution, the expected value is always the midpoint of the interval, and all loading times between 30 and 90 minutes are equally likely — there is no single “most likely” time.
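
Both uniform-distribution formulas in this section reduce to simple interval arithmetic. The sketch below uses hypothetical helper names; clipping the requested interval to \([a, b]\) is an added safeguard so that ranges extending outside the distribution's support are handled sensibly.

```python
def uniform_prob(a: float, b: float, c: float, d: float) -> float:
    """P(c <= X <= d) for X ~ Uniform(a, b), clipping [c, d] to [a, b]."""
    lo, hi = max(a, c), min(b, d)
    return max(hi - lo, 0) / (b - a)

def uniform_mean(a: float, b: float) -> float:
    """E(X) for X ~ Uniform(a, b): the midpoint of the interval."""
    return (a + b) / 2

p_build = uniform_prob(a=8, b=20, c=10, d=14)  # Question 49: build time
mean_loading = uniform_mean(30, 90)            # Question 50: loading time

print(round(p_build, 2), mean_loading)
```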


11 Calculating and Interpreting Probabilities

11.1 Question 51

Scenario: A digital marketing agency runs email campaigns and knows that 22% of recipients open any given email. They send a campaign to 180 recipients today.

Questions:

  a. What is the expected number of recipients who will open the email, and what is the standard deviation?

  b. Interpret \(E(X)\) in practical terms for the campaign manager.

  c. A campaign is considered effective if at least 45 recipients open the email. Without calculating the exact probability, explain whether the expected value suggests this goal is likely or unlikely to be met.

  d. The campaign manager reports that 55 people opened the email. Is this a surprising result? Calculate how many standard deviations above the mean this is, and interpret.

Solution:

Given: \(X \sim \text{Binomial}(n = 180,\ p = 0.22)\)


Part a) Expected Value and Standard Deviation:

\[E(X) = np = 180 \times 0.22 = 39.6 \text{ recipients}\]

\[\text{SD}(X) = \sqrt{np(1-p)} = \sqrt{180 \times 0.22 \times 0.78} = \sqrt{30.888} \approx 5.56 \text{ recipients}\]

Answer: \(E(X) = 39.6\), \(\text{SD}(X) \approx 5.56\)


Part b) Interpretation of \(E(X)\):

The expected value of 39.6 means that if the agency sent campaigns of this size (180 recipients each) many times, the average number of opens per campaign would be about 39.6. It does not mean that exactly 39.6 people will open any particular campaign: the actual count is a whole number that varies from campaign to campaign, while the long-run average approaches 39.6.


Part c) Is the Goal of 45 Opens Likely?

The expected number of opens is 39.6, which falls below the goal of 45, so the goal is more likely to be missed than met. Since 45 is \(\frac{45 - 39.6}{5.56} \approx 0.97\) standard deviations above the expected value, it is somewhat above average but not highly unusual. The goal is achievable but far from guaranteed: a rough normal approximation suggests that about 1 in 5 or 6 campaigns of this size would reach 45 opens by chance alone. The campaign manager should not count on meeting the goal routinely without changes to the campaign strategy.


Part d) Was 55 Opens a Surprising Result?

\[z = \frac{55 - 39.6}{5.56} = \frac{15.4}{5.56} \approx 2.77 \text{ standard deviations above the mean}\]

A result at least 2.77 standard deviations above the mean would occur by chance alone only about 0.3% of the time (using a normal approximation) under the historical 22% open rate. This is a notably surprising result: it suggests a particularly effective campaign, a more receptive audience, or possibly a better subject line. The campaign manager has good reason to investigate what drove this higher-than-expected response and consider replicating the elements that may have contributed to it.
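The arithmetic in Parts a, c, and d can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `binomial_summary` and `z_score` are illustrative and do not come from the course materials.

```python
import math

def binomial_summary(n, p):
    """Return (mean, standard deviation) of a Binomial(n, p) count."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

def z_score(observed, mean, sd):
    """Signed number of standard deviations between observed and mean."""
    return (observed - mean) / sd

# Values from the scenario: 180 recipients, 22% open rate.
mean, sd = binomial_summary(180, 0.22)
z_goal = z_score(45, mean, sd)   # the "effective campaign" goal
z_seen = z_score(55, mean, sd)   # the reported result

print(f"E(X) = {mean:.1f}, SD(X) = {sd:.2f}")
print(f"z for 45 opens = {z_goal:.2f}")
print(f"z for 55 opens = {z_seen:.2f}")
```

The z-scores only measure distance from the mean; turning them into probabilities still requires a normal approximation or an exact binomial calculation.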


11.2 Question 52

Scenario: A bank’s fraud detection system generates an average of 3.2 alerts per hour during business hours, and alerts arrive independently at a constant rate.

Questions:

  1. What distribution models the number of alerts per hour? State the distribution name and parameter.

  2. Calculate the probability of receiving exactly 4 alerts in a given hour.

  3. Calculate the expected number of alerts during a 4-hour morning shift.

  4. What is the probability of receiving zero alerts during any single hour? Interpret this result in the context of bank operations.

Solution:

\(X\) = number of alerts per hour. \(X \sim \text{Poisson}(\lambda = 3.2)\)


Part a) Distribution:

Since alerts arrive independently at a constant average rate over a fixed time period, the appropriate model is the Poisson distribution with \(\lambda = 3.2\).


Part b) \(P(X = 4)\):

\[P(X = 4) = \frac{e^{-3.2} \times 3.2^4}{4!} = \frac{0.04076 \times 104.858}{24} = \frac{4.274}{24} \approx 0.1781\]

Answer: \(P(X = 4) \approx 0.178\) or 17.8%

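Receiving exactly 4 alerts in an hour has about an 18% probability. The single most likely count is 3 alerts (\(P(X = 3) \approx 0.223\)), the largest whole number below the mean of 3.2, so 4 alerts is a common outcome but not the most likely one.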


Part c) Expected Alerts in 4 Hours:

\[E(X_{4\text{ hr}}) = \lambda \times t = 3.2 \times 4 = 12.8 \text{ alerts}\]

The number of alerts over a 4-hour shift follows a Poisson distribution with \(\lambda = 12.8\).


Part d) \(P(X = 0)\) — One Hour with Zero Alerts:

\[P(X = 0) = \frac{e^{-3.2} \times 3.2^0}{0!} = e^{-3.2} \approx 0.0408\]

Answer: \(P(X = 0) \approx 0.041\) or 4.1%

Interpretation: There is only about a 4% chance that a given hour passes without any fraud alerts. In practice, fraud-alert-free hours are uncommon and represent either genuinely low fraud activity or a brief quiet period. The bank’s fraud team should expect alerts during virtually every hour of operation, and an unusually quiet hour might even warrant a system check to confirm the detection system is functioning properly.
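The Poisson calculations above can be reproduced directly from the formula \(P(X = k) = e^{-\lambda}\lambda^k / k!\). The sketch below uses only Python's standard library; `poisson_pmf` is an illustrative helper name, and the search range of 0 to 19 alerts for the most likely count is an arbitrary assumption that comfortably covers the plausible values for \(\lambda = 3.2\).

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

rate_per_hour = 3.2

p_four = poisson_pmf(4, rate_per_hour)   # Part b
p_zero = poisson_pmf(0, rate_per_hour)   # Part d
expected_shift = rate_per_hour * 4       # Part c: rate scales with time

# Most likely hourly count, searched over 0..19 (assumed wide enough).
most_likely = max(range(20), key=lambda k: poisson_pmf(k, rate_per_hour))

print(f"P(X = 4) = {p_four:.4f}")
print(f"P(X = 0) = {p_zero:.4f}")
print(f"Expected alerts in 4 hours = {expected_shift:.1f}")
print(f"Most likely hourly count = {most_likely}")
```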


11.3 Question 53

Scenario: A contract negotiation process takes a uniformly distributed amount of time between 10 and 40 business days from initiation to signing.

Questions:

  1. What is the probability that a negotiation concludes in fewer than 20 days?

  2. What is the expected negotiation time, and what does this mean for planning purposes?

  3. A client complains if negotiations exceed 35 days. What fraction of contracts will generate a complaint?

  4. Calculate \(\text{Var}(X)\) and \(\text{SD}(X)\). Interpret \(\text{SD}(X)\) in the context of contract negotiations.

Solution:

\(X \sim \text{Uniform}(a = 10,\ b = 40)\)


Part a) \(P(X < 20)\):

\[P(X < 20) = \frac{20 - 10}{40 - 10} = \frac{10}{30} = \frac{1}{3} \approx 0.333\]

Answer: About 33.3% of negotiations conclude in fewer than 20 days.


Part b) Expected Negotiation Time:

\[E(X) = \frac{a + b}{2} = \frac{10 + 40}{2} = 25 \text{ days}\]

Interpretation: On average, a contract takes 25 business days to negotiate. Project managers and legal teams can use this as a planning benchmark — roughly half of contracts will finish faster, and half will take longer. Since the distribution is uniform, there is no “typical” duration within the range; every duration from 10 to 40 days is equally likely.


Part c) \(P(X > 35)\) — Fraction Generating a Complaint:

\[P(X > 35) = \frac{40 - 35}{40 - 10} = \frac{5}{30} = \frac{1}{6} \approx 0.167\]

Answer: About 16.7% of contracts will exceed 35 days and generate a complaint. Roughly 1 in 6 negotiations can be expected to draw a complaint under the current process timeline.


Part d) Variance and Standard Deviation:

\[\text{Var}(X) = \frac{(b - a)^2}{12} = \frac{(40 - 10)^2}{12} = \frac{900}{12} = 75 \text{ days}^2\]

\[\text{SD}(X) = \sqrt{75} \approx 8.66 \text{ days}\]

Interpretation: A standard deviation of approximately 8.66 days means that negotiation durations typically deviate from the 25-day average by about 8 to 9 days. Some negotiations wrap up in 16 days; others may stretch to 34 days. This sizable variability is a direct consequence of the uniform distribution spanning a 30-day range — no “gravitational pull” toward a central value exists, so timelines are genuinely unpredictable within the [10, 40] window.
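For a uniform distribution every probability is a ratio of interval lengths, which makes the four parts easy to verify in code. The following is a small sketch under that assumption; `uniform_prob` is a hypothetical helper written for this example.

```python
import math

def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X ~ Uniform(a, b); the interval is clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

a, b = 10, 40  # business days, from the scenario

p_fast = uniform_prob(a, 20, a, b)        # Part a: fewer than 20 days
p_complaint = uniform_prob(35, b, a, b)   # Part c: more than 35 days
mean = (a + b) / 2                        # Part b
variance = (b - a) ** 2 / 12              # Part d
sd = math.sqrt(variance)

print(f"P(X < 20) = {p_fast:.3f}")
print(f"P(X > 35) = {p_complaint:.3f}")
print(f"E(X) = {mean}, Var(X) = {variance}, SD(X) = {sd:.2f}")
```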


11.4 Question 54

Scenario: A financial advisor tracks how many new clients \(X\) she signs per month. Based on three years of data, she has built this probability distribution:

| \(x\) | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| \(P(X=x)\) | 0.05 | 0.15 | 0.30 | 0.25 | 0.15 | 0.10 |

Questions:

  1. Verify the distribution is valid. Calculate \(E(X)\) and interpret it.

  2. A consultant tells her that \(\text{Var}(X) = 1.74\) and \(\text{SD}(X) \approx 1.32\). Without recalculating, interpret what the standard deviation tells her about month-to-month variability in new clients.

  3. Each new client generates an average first-year revenue of $3,000 for the advisor, and she incurs $500 in fixed monthly marketing costs regardless of how many clients she signs. Her monthly net revenue from new clients (after fixed costs) can be written as \(Y = 3{,}000X - 500\). Find \(E(Y)\) and interpret the result.

  4. Using \(\text{Var}(X) = 1.74\), find \(\text{Var}(Y)\) and \(\text{SD}(Y)\). What does \(\text{SD}(Y)\) tell the advisor?

Solution:


Part a) Validity and Expected Value:

Validity: All probabilities are between 0 and 1 ✓. Sum: \(0.05 + 0.15 + 0.30 + 0.25 + 0.15 + 0.10 = 1.00\) ✓.

\[E(X) = 0(0.05) + 1(0.15) + 2(0.30) + 3(0.25) + 4(0.15) + 5(0.10)\] \[= 0 + 0.15 + 0.60 + 0.75 + 0.60 + 0.50 = \mathbf{2.60} \text{ clients}\]

Interpretation: Over many months, the advisor signs an average of 2.6 new clients per month. This doesn’t mean every month will bring exactly 2 or 3 new clients. Some months she may sign none, and others she may sign 5, but 2.6 is the long-run average she should plan around.


Part b) Interpretation of SD:

A standard deviation of approximately 1.32 clients means that the actual number of new clients typically deviates from the average of 2.6 by about 1 to 1.5 clients per month. (As a check, \(\text{Var}(X) = E(X^2) - [E(X)]^2 = 8.50 - 6.76 = 1.74\).) The advisor should expect meaningful month-to-month fluctuation: in some months she may sign only 1 client, and in others as many as 4 or 5. The pattern is moderately variable: not so chaotic that planning is impossible, but variable enough that she should not rely on exactly 2.6 new clients every single month.


Part c) \(E(Y)\):

\[E(Y) = E(3{,}000X - 500) = 3{,}000 \times E(X) - 500 = 3{,}000 \times 2.60 - 500 = 7{,}800 - 500 = \mathbf{\$7{,}300}\]

Interpretation: On average, the advisor nets $7,300 per month from new client revenue after deducting fixed marketing costs. This is the long-run average monthly contribution from new clients, useful for setting annual income projections and budgeting.


Part d) \(\text{Var}(Y)\) and \(\text{SD}(Y)\):

\[\text{Var}(Y) = \text{Var}(3{,}000X - 500) = 3{,}000^2 \times \text{Var}(X) = 9{,}000{,}000 \times 1.74 = 15{,}660{,}000 \text{ dollars}^2\]

\[\text{SD}(Y) = \sqrt{15{,}660{,}000} \approx \mathbf{\$3{,}957}\]

The fixed $500 cost shifts every outcome by the same amount, so it does not affect the variance; only the multiplier 3,000 does.

Interpretation: The standard deviation of approximately $3,957 means that the advisor’s monthly net new-client revenue will typically vary from the expected $7,300 by roughly $4,000 in either direction. In good months she might clear over $11,000; in slower months she might bring in closer to $3,300. This variability highlights why maintaining a marketing pipeline and a financial cushion is important, even when her average monthly performance is solid.
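The linear-transformation rules \(E(aX + b) = aE(X) + b\) and \(\text{Var}(aX + b) = a^2\,\text{Var}(X)\) can be confirmed by enumerating the distribution of \(Y\) directly. The sketch below does this for the advisor's table; the function names `expectation` and `variance` are illustrative, and the dictionary representation of the distribution is an assumption made for this example.

```python
import math

# Probability distribution of X (new clients per month), from the table.
pmf_x = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.25, 4: 0.15, 5: 0.10}

def expectation(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum of (x - E(X))^2 * P(X = x)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

a, b = 3000, -500  # Y = 3000X - 500

# Distribution of Y obtained by transforming each outcome of X.
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

e_x = expectation(pmf_x)
e_y_rule, e_y_direct = a * e_x + b, expectation(pmf_y)
var_y_rule, var_y_direct = a**2 * variance(pmf_x), variance(pmf_y)

print(f"E(X) = {e_x:.2f}")
print(f"E(Y): rule = {e_y_rule:.2f}, direct = {e_y_direct:.2f}")
print("Var(Y) rule matches direct enumeration:",
      math.isclose(var_y_rule, var_y_direct, rel_tol=1e-9))
```

Computing the variance from the table rather than taking a reported value on trust is a useful habit when a problem supplies summary statistics.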


11.5 Question 55

Scenario: A regional bank approves auto loans, and 18% of approved borrowers miss at least one payment in the first year. A loan officer reviews a new batch of 45 recently approved loans.

Questions:

  1. State the distribution of \(X\), the number of borrowers who miss at least one payment. Justify your choice by referencing the conditions required.

  2. Calculate \(E(X)\) and \(\text{SD}(X)\). Interpret \(E(X)\) for the loan officer.

  3. A colleague reports that \(P(X \geq 12) \approx 0.10\). Interpret this probability. Does this result seem surprising given the distribution parameters? Explain.

  4. Suppose the delinquency rate differs between first-time borrowers (\(p = 0.28\)) and returning customers (\(p = 0.09\)), but both groups have \(n = 45\) loans reviewed. Compare the expected number of missed payments and the variability between the two groups. Which group is riskier from the bank’s perspective, and why?

Solution:


Part a) Distribution:

\(X \sim \text{Binomial}(n = 45,\ p = 0.18)\)

Justification:

  1. Fixed number of trials: \(n = 45\) loans are reviewed.
  2. Binary outcome per trial: Each borrower either misses a payment (success) or does not (failure).
  3. Constant probability: Each borrower has the same 18% probability of missing a payment.
  4. Independence: Whether one borrower misses a payment does not affect another (individual loan outcomes are independent).

All four conditions are met, so the binomial distribution is appropriate.


Part b) \(E(X)\) and \(\text{SD}(X)\):

\[E(X) = np = 45 \times 0.18 = \mathbf{8.1} \text{ loans}\]

\[\text{SD}(X) = \sqrt{np(1-p)} = \sqrt{45 \times 0.18 \times 0.82} = \sqrt{6.642} \approx \mathbf{2.58} \text{ loans}\]

Interpretation of \(E(X)\): On average, the loan officer should expect about 8 of the 45 loans in this batch to have at least one missed payment in the first year. This is the long-run average across many such batches of 45 loans; the actual count will vary from batch to batch.


Part c) Interpreting \(P(X \geq 12) \approx 0.10\):

There is approximately a 10% chance that 12 or more of the 45 loans will have a missed payment in the first year. This is a fairly uncommon outcome: it would happen by chance in about 1 out of every 10 batches of 45 loans.

This result is mildly surprising but not extreme: 12 loans represent \(12/45 \approx 26.7\%\) of the batch, considerably above the historical delinquency rate of 18%, and 12 is about 1.5 standard deviations above the expected 8.1. A probability of about 10% means that seeing 12 or more delinquencies would be an unusually high count, possibly warranting a review of the credit assessment process, yet it remains well within the reach of random variation alone.


Part d) Comparison: First-Time vs. Returning Borrowers:

| Measure | First-Time (\(p = 0.28\)) | Returning (\(p = 0.09\)) |
|---|---|---|
| \(E(X)\) | \(45 \times 0.28 = 12.6\) | \(45 \times 0.09 = 4.05\) |
| \(\text{Var}(X)\) | \(45 \times 0.28 \times 0.72 = 9.072\) | \(45 \times 0.09 \times 0.91 = 3.6855\) |
| \(\text{SD}(X)\) | \(\sqrt{9.072} \approx 3.01\) | \(\sqrt{3.6855} \approx 1.92\) |

First-time borrowers have a higher expected number of missed payments (12.6 vs. 4.05) and greater variability in outcomes (SD ≈ 3.01 vs. 1.92). From the bank’s perspective, first-time borrowers are riskier on both dimensions: the average loss exposure is higher, and there is more unpredictability in how many loans in a given batch will become delinquent. The bank may wish to apply stricter approval criteria or require larger down payments for first-time borrowers to offset this higher risk.
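The comparison can be extended with exact tail probabilities, which show how differently the two groups behave at the same threshold. This is a sketch using `math.comb` from the standard library; the helper names and the choice of 12 delinquencies as the threshold for both groups are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_upper_tail(k_min, n, p):
    """P(X >= k_min), summing exact binomial probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

n = 45
groups = {"first-time": 0.28, "returning": 0.09}

# For each group: (mean, standard deviation, P(X >= 12)).
summary = {}
for label, p in groups.items():
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    summary[label] = (mean, sd, binomial_upper_tail(12, n, p))

for label, (mean, sd, tail) in summary.items():
    print(f"{label}: E(X) = {mean:.2f}, SD(X) = {sd:.2f}, "
          f"P(X >= 12) = {tail:.4f}")
```

A count that is routine for first-time borrowers is extremely rare for returning customers, which is another way of seeing the difference in risk.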


11.6 Question 56

Scenario: A human resources team creates the following contingency table for 500 recent job applicants, recording their highest education level and whether they received a job offer:

| Education Level | Offer Received | No Offer | Total |
|---|---|---|---|
| Graduate Degree | 135 | 65 | 200 |
| Undergraduate Only | 105 | 195 | 300 |
| Total | 240 | 260 | 500 |

Questions:

  1. Calculate \(P(\text{Graduate})\), \(P(\text{Offer})\), and \(P(\text{Graduate AND Offer})\).

  2. Are “having a graduate degree” and “receiving an offer” independent? Show your work and interpret the result.

  3. Are “receiving an offer” and “not receiving an offer” mutually exclusive? Explain.

  4. Calculate and interpret \(P(\text{Offer} \mid \text{Graduate})\) and \(P(\text{Offer} \mid \text{Undergraduate Only})\). What does the comparison suggest about the value of a graduate degree in this hiring process?

Solution:


Part a) Basic Probabilities:

\[P(\text{Graduate}) = \frac{200}{500} = 0.40\]

\[P(\text{Offer}) = \frac{240}{500} = 0.48\]

\[P(\text{Graduate AND Offer}) = \frac{135}{500} = 0.27\]


Part b) Testing Independence:

For independence: \(P(\text{Graduate AND Offer}) = P(\text{Graduate}) \times P(\text{Offer})\)

\[P(\text{Graduate}) \times P(\text{Offer}) = 0.40 \times 0.48 = 0.192\]

But the actual joint probability is \(0.27 \neq 0.192\).

Conclusion: The events are NOT independent. The probability of being a graduate-degree holder who received an offer (27%) is much higher than the 19.2% we would expect if the two events were unrelated. This means that having a graduate degree is associated with a higher probability of receiving an offer.


Part c) Mutual Exclusivity:

Yes, “receiving an offer” and “not receiving an offer” are mutually exclusive. Every applicant receives exactly one outcome — either an offer or no offer. It is impossible for an applicant to be in both categories simultaneously, so \(P(\text{Offer AND No Offer}) = 0\).


Part d) Conditional Probabilities and Implications:

\[P(\text{Offer} \mid \text{Graduate}) = \frac{135}{200} = 0.675 \text{ or } 67.5\%\]

\[P(\text{Offer} \mid \text{Undergraduate Only}) = \frac{105}{300} = 0.35 \text{ or } 35\%\]

Interpretation: Among applicants with a graduate degree, 67.5% received a job offer, compared to only 35% of undergraduate-only applicants. Graduate-degree holders were nearly twice as likely to receive an offer. This suggests that, in this hiring process, a graduate degree is a meaningful differentiator. However, this is an observational comparison — other factors (field of study, work experience, interview performance) may also explain the difference. The HR team should be cautious about concluding that the graduate degree itself causes a higher offer rate without controlling for other variables.
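The marginal, joint, and conditional probabilities all come from the same four cell counts, so the independence check is straightforward to script. The sketch below stores the table as a dictionary keyed by (education, outcome); that layout and the helper name `marginal` are assumptions made for this example.

```python
import math

# Cell counts from the contingency table.
counts = {
    ("Graduate", "Offer"): 135,
    ("Graduate", "No Offer"): 65,
    ("Undergraduate Only", "Offer"): 105,
    ("Undergraduate Only", "No Offer"): 195,
}
total = sum(counts.values())

def marginal(level=None, outcome=None):
    """Probability of the cells matching the given education level and/or outcome."""
    matched = sum(
        c for (lv, oc), c in counts.items()
        if (level is None or lv == level) and (outcome is None or oc == outcome)
    )
    return matched / total

p_grad = marginal(level="Graduate")
p_offer = marginal(outcome="Offer")
p_joint = marginal(level="Graduate", outcome="Offer")

# Conditional probability: P(Offer | group) = P(group AND Offer) / P(group).
p_offer_given_grad = p_joint / p_grad
p_offer_given_ug = (marginal(level="Undergraduate Only", outcome="Offer")
                    / marginal(level="Undergraduate Only"))

# Independence would require P(A AND B) == P(A) * P(B).
independent = math.isclose(p_joint, p_grad * p_offer)

print(f"P(Graduate) = {p_grad:.2f}, P(Offer) = {p_offer:.2f}, joint = {p_joint:.2f}")
print(f"P(Offer | Graduate) = {p_offer_given_grad:.3f}")
print(f"P(Offer | Undergraduate Only) = {p_offer_given_ug:.3f}")
print(f"Independent? {independent}")
```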


11.7 Question 57

Scenario: A call center for a major retailer receives customer calls at an average rate of 6 calls per 15-minute interval during peak hours.

Questions:

  1. What distribution models the number of calls per 15-minute interval? Calculate \(E(X)\) and \(\text{SD}(X)\).

  2. Calculate the probability of receiving exactly 6 calls in a 15-minute interval.

  3. Calculate the expected number of calls during a 1-hour peak period.

  4. The call center can handle up to 8 calls per 15-minute interval without customers experiencing hold times. What is the probability that customers will experience hold times in a given 15-minute interval? Interpret this for staffing purposes.

Solution:

\(X \sim \text{Poisson}(\lambda = 6)\) per 15-minute interval.


Part a) Distribution and Summary Statistics:

\[E(X) = \lambda = 6 \text{ calls per 15 minutes}\]

\[\text{SD}(X) = \sqrt{\lambda} = \sqrt{6} \approx 2.45 \text{ calls}\]


Part b) \(P(X = 6)\):

\[P(X = 6) = \frac{e^{-6} \times 6^6}{6!} = \frac{0.002479 \times 46{,}656}{720} = \frac{115.66}{720} \approx 0.1606\]

Answer: \(P(X = 6) \approx 0.161\) or 16.1%.

This is tied with 5 calls as the single most likely outcome: because \(\lambda = 6\) is a whole number, \(P(X = 5) = P(X = 6)\). Even so, receiving exactly the expected number of calls has only about a 1-in-6 chance in any given interval.


Part c) Expected Calls in 1 Hour:

A 1-hour period contains four 15-minute intervals.

\[E(X_{1\text{ hr}}) = 6 \times 4 = 24 \text{ calls}\]

The distribution for the 1-hour period is Poisson(\(\lambda = 24\)).


Part d) \(P(X > 8)\) — Probability of Hold Times:

\[P(X > 8) = 1 - P(X \leq 8)\]

Computing \(P(X \leq 8)\) for Poisson(\(\lambda = 6\)):

| \(k\) | \(P(X = k)\) |
|---|---|
| 0 | 0.00248 |
| 1 | 0.01487 |
| 2 | 0.04462 |
| 3 | 0.08924 |
| 4 | 0.13385 |
| 5 | 0.16062 |
| 6 | 0.16062 |
| 7 | 0.13768 |
| 8 | 0.10326 |

\(P(X \leq 8) \approx 0.847\)

\[P(X > 8) = 1 - 0.847 = \mathbf{0.153}\]

Answer: \(P(X > 8) \approx 0.153\) or 15.3%

Staffing interpretation: There is approximately a 15% chance that any given 15-minute interval during peak hours will exceed the call center’s handling capacity of 8 calls, causing customer hold times. This means roughly 1 in every 6–7 intervals will have hold times. Assuming the four intervals in a 1-hour peak period are independent, the chance of hold times in at least one of them is \(1 - 0.847^4 \approx 0.49\), close to an even chance. Management should consider whether a second wave of agents scheduled during peak hours could reduce this probability to an acceptable level.
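Summing nine Poisson terms by hand is error-prone, so Part d is a natural candidate for a short script. The sketch below computes the cumulative probability and then, under the added assumption that the four 15-minute intervals in an hour are independent, the chance that at least one of them exceeds capacity. The helper names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k), summing the individual probabilities from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 6        # average calls per 15-minute interval
capacity = 8   # calls handled without hold times

p_within = poisson_cdf(capacity, lam)
p_hold = 1 - p_within

# Assumption: the four intervals in a peak hour are independent.
intervals_per_hour = 4
p_hold_in_hour = 1 - p_within ** intervals_per_hour

print(f"P(X <= 8) = {p_within:.3f}")
print(f"P(X > 8)  = {p_hold:.3f}")
print(f"P(at least one overloaded interval in an hour) = {p_hold_in_hour:.3f}")
```

Changing `capacity` shows how much each additional unit of handling capacity would reduce the hold-time probability, which is the quantity a staffing decision depends on.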


11.8 Question 58

Scenario: A warehouse manager models the number of shipping errors \(X\) per day with the following distribution:

| \(x\) | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| \(P(X=x)\) | 0.30 | 0.35 | 0.20 | 0.10 | 0.05 |

Each shipping error costs the company $150 to resolve (reship, customer credit, labor). Let \(Y = 150X\) be the daily error cost.

Questions:

  1. Verify the distribution is valid. Calculate \(E(X)\) and interpret it.

  2. Calculate \(\text{Var}(X)\). Show your work using the formula \(\text{Var}(X) = \sum(x - E(X))^2 P(X = x)\).

  3. Find \(E(Y)\), \(\text{Var}(Y)\), and \(\text{SD}(Y)\). Interpret \(E(Y)\) and \(\text{SD}(Y)\) for the warehouse manager.

  4. The manager sets a daily budget of $200 for error resolution. On what percentage of days can she expect to exceed this budget? (Hint: determine which values of \(X\) would cause \(Y > 200\), then use the distribution.)

Solution:


Part a) Validity and \(E(X)\):

Validity: All probabilities non-negative ✓. Sum: \(0.30 + 0.35 + 0.20 + 0.10 + 0.05 = 1.00\) ✓.

\[E(X) = 0(0.30) + 1(0.35) + 2(0.20) + 3(0.10) + 4(0.05)\] \[= 0 + 0.35 + 0.40 + 0.30 + 0.20 = \mathbf{1.25} \text{ errors per day}\]

Interpretation: On average, the warehouse experiences 1.25 shipping errors per day. No single day can have 1.25 errors; the figure is the long-run average of daily counts, which are most often 0, 1, or 2.


Part b) \(\text{Var}(X)\):

| \(x\) | \(P(X=x)\) | \((x - 1.25)\) | \((x - 1.25)^2\) | \((x - 1.25)^2 \cdot P(X=x)\) |
|---|---|---|---|---|
| 0 | 0.30 | −1.25 | 1.5625 | 0.4688 |
| 1 | 0.35 | −0.25 | 0.0625 | 0.0219 |
| 2 | 0.20 | 0.75 | 0.5625 | 0.1125 |
| 3 | 0.10 | 1.75 | 3.0625 | 0.3063 |
| 4 | 0.05 | 2.75 | 7.5625 | 0.3781 |

\[\text{Var}(X) = 0.4688 + 0.0219 + 0.1125 + 0.3063 + 0.3781 = \mathbf{1.2875}\]


Part c) \(E(Y)\), \(\text{Var}(Y)\), and \(\text{SD}(Y)\):

\[E(Y) = 150 \times E(X) = 150 \times 1.25 = \mathbf{\$187.50 \text{ per day}}\]

\[\text{Var}(Y) = 150^2 \times \text{Var}(X) = 22{,}500 \times 1.2875 = 28{,}968.75\]

\[\text{SD}(Y) = \sqrt{28{,}968.75} \approx \mathbf{\$170.20 \text{ per day}}\]

Interpretation of \(E(Y)\): The warehouse manager can expect error resolution to cost an average of $187.50 per day over the long run.

Interpretation of \(\text{SD}(Y)\): The daily cost of errors typically deviates from the $187.50 average by about $170 — a large standard deviation relative to the mean. This reflects the high variability: many days cost nothing or very little (0 or 1 errors), but occasional days with 3 or 4 errors generate significantly higher costs. The manager should maintain a flexible budget rather than expecting consistent daily costs.


Part d) Probability of Exceeding the $200 Budget:

\(Y > 200\) when \(150X > 200\), i.e., \(X > 200/150 = 1.33\), which means \(X \geq 2\).

\[P(X \geq 2) = P(X=2) + P(X=3) + P(X=4) = 0.20 + 0.10 + 0.05 = \mathbf{0.35}\]

Answer: The manager can expect to exceed her $200 daily budget on approximately 35% of days — more than 1 in every 3 days. Given that the expected daily cost of $187.50 is already close to her budget ceiling of $200, she has very little room for above-average error days. The manager should consider either raising the budget threshold, implementing process improvements to reduce the error rate, or negotiating lower per-error resolution costs.
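The budget question generalizes easily: for any budget, add up the probabilities of the error counts whose cost exceeds it. The sketch below wraps that in a small function. The function name `prob_cost_exceeds` and the alternative budgets of $300 and $450 are hypothetical, included only to show how the exceedance probability responds to the threshold.

```python
# Distribution of daily shipping errors, from the table.
pmf = {0: 0.30, 1: 0.35, 2: 0.20, 3: 0.10, 4: 0.05}
cost_per_error = 150  # dollars

def prob_cost_exceeds(budget):
    """P(Y > budget), where Y = cost_per_error * X."""
    return sum(p for x, p in pmf.items() if cost_per_error * x > budget)

expected_cost = sum(cost_per_error * x * p for x, p in pmf.items())
p_over_budget = prob_cost_exceeds(200)

print(f"E(Y) = ${expected_cost:.2f}")
print(f"P(Y > $200) = {p_over_budget:.2f}")

# Hypothetical alternative budgets, for comparison only.
for budget in (300, 450):
    print(f"P(Y > ${budget}) = {prob_cost_exceeds(budget):.2f}")
```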


11.9 Question 59

Scenario: An insurance company processes auto insurance claims. Historically, 14% of claims filed are later found to be fraudulent. An investigator is assigned a batch of 35 claims to review.

Questions:

  1. Identify the appropriate distribution and calculate \(E(X)\) and \(\text{SD}(X)\), where \(X\) is the number of fraudulent claims in the batch.

  2. A supervisor reports that \(P(X \leq 2) \approx 0.11\). Interpret this probability. What does it mean for the investigator’s workload?

  3. \(P(X \geq 8) \approx 0.11\). Interpret this probability in context. If the investigator finds 8 or more fraudulent claims in this batch, what might this suggest?

  4. Compare this batch (\(n = 35\), \(p = 0.14\)) with a batch from a higher-risk region where \(p = 0.30\), \(n = 35\). Calculate \(E(X)\) and \(\text{SD}(X)\) for each batch, and explain which batch poses a greater fraud risk to the company.

Solution:


Part a) Distribution, \(E(X)\), and \(\text{SD}(X)\):

\(X \sim \text{Binomial}(n = 35,\ p = 0.14)\)

\[E(X) = np = 35 \times 0.14 = \mathbf{4.9} \text{ fraudulent claims}\]

\[\text{SD}(X) = \sqrt{np(1-p)} = \sqrt{35 \times 0.14 \times 0.86} = \sqrt{4.214} \approx \mathbf{2.05} \text{ claims}\]


Part b) Interpreting \(P(X \leq 2) \approx 0.11\):

There is approximately an 11% chance that 2 or fewer of the 35 claims are fraudulent in a given batch. For the investigator, this means that in about 1 out of every 9 batches, the fraud workload will be light, with only 0, 1, or 2 fraudulent cases requiring detailed follow-up. However, this also means that the remaining 89% of batches will have 3 or more fraudulent claims, which is by far the more common situation.


Part c) Interpreting \(P(X \geq 8) \approx 0.11\):

There is approximately an 11% chance that 8 or more fraudulent claims appear in a single batch. This is an uncommon but not rare outcome, occurring in roughly 1 out of every 9 batches under normal conditions. If an investigator actually finds 8 or more fraudulent claims in a batch, random variation is a plausible explanation, but the result may also suggest:

  • The batch was drawn from a higher-risk pool of claimants
  • A coordinated fraud ring may be operating in the area
  • Claims review criteria may have been loosened, attracting more fraudulent submissions

A single high count is not conclusive, but if high counts recur across batches, escalation and further investigation of the claim intake process are warranted.

Part d) Comparison — Standard vs. High-Risk Region:

Standard batch (\(p = 0.14\)):

  • \(E(X) = 4.9\)
  • \(\text{SD}(X) \approx 2.05\)

High-risk batch (\(p = 0.30\)):

  • \(E(X) = 35 \times 0.30 = 10.5\)
  • \(\text{SD}(X) = \sqrt{35 \times 0.30 \times 0.70} = \sqrt{7.35} \approx 2.71\)

The high-risk region batch has more than twice the expected number of fraudulent claims (10.5 vs. 4.9) and greater variability (SD = 2.71 vs. 2.05). From the company’s perspective, the high-risk region poses substantially greater fraud exposure: the average investigator should expect about 10–11 fraudulent claims per 35-claim batch, and the higher standard deviation means the actual count could range widely from batch to batch. The company may want to allocate more experienced investigators, apply stricter claims processing, or consider pricing adjustments for policies in that region.


11.10 Question 60

Scenario: A supermarket chain analyzes three aspects of its operations:

  • Product demand: The number of premium olive oil bottles sold per hour follows a Poisson distribution with \(\lambda = 2.5\).
  • Supplier reliability: 70% of deliveries from a particular supplier arrive on time. Today, 20 deliveries are scheduled from this supplier.
  • Checkout time: Customer checkout time at the express lane is uniformly distributed between 1 and 7 minutes.

Questions:

  1. For each scenario, state the distribution, parameter(s), and expected value. Interpret each expected value in context.

  2. For the product demand scenario, calculate the probability that exactly 3 bottles are sold in a given hour.

  3. For the supplier reliability scenario, calculate the expected number of on-time deliveries and the standard deviation. If only 10 deliveries arrive on time, should the store manager be concerned? Explain using standard deviations.

  4. For the checkout time scenario, a customer complains if checkout takes more than 5 minutes. What fraction of express-lane customers will complain? What is the expected checkout time, and how does this compare to the complaint threshold?

Solution:


Part a) Distributions, Parameters, and Expected Values:

Olive oil demand: \(X \sim \text{Poisson}(\lambda = 2.5)\), so \(E(X) = 2.5\) bottles/hour. Interpretation: On average, the store sells 2.5 premium olive oil bottles per hour. In any given hour sales could vary: possibly 0, 1, 2, 3, or more bottles.

Supplier deliveries: \(Y \sim \text{Binomial}(n = 20,\ p = 0.70)\), so \(E(Y) = 20 \times 0.70 = 14\) on-time deliveries. Interpretation: Out of today’s 20 scheduled deliveries, the store can expect 14 to arrive on time on average.

Checkout time: \(Z \sim \text{Uniform}(a = 1,\ b = 7)\), so \(E(Z) = (1+7)/2 = 4\) minutes. Interpretation: The average express checkout takes 4 minutes. Every duration between 1 and 7 minutes is equally likely.


Part b) \(P(X = 3)\) for Poisson(\(\lambda = 2.5\)):

\[P(X = 3) = \frac{e^{-2.5} \times 2.5^3}{3!} = \frac{0.08208 \times 15.625}{6} = \frac{1.2825}{6} \approx \mathbf{0.2138}\]

Answer: \(P(X = 3) \approx 0.214\) or 21.4%. Selling exactly 3 bottles in an hour is among the most likely outcomes, since 3 is close to the mean of 2.5.


Part c) Supplier Reliability — \(E(Y)\), \(\text{SD}(Y)\), and 10 On-Time Deliveries:

\[E(Y) = 14 \quad \text{(from Part a)}\]

\[\text{SD}(Y) = \sqrt{20 \times 0.70 \times 0.30} = \sqrt{4.2} \approx 2.05 \text{ deliveries}\]

Should the manager be concerned about only 10 on-time deliveries?

\[z = \frac{10 - 14}{2.05} = \frac{-4}{2.05} \approx -1.95\]

Ten on-time deliveries is approximately 1.95 standard deviations below the expected value of 14, a moderately unusual outcome. An exact binomial calculation gives \(P(Y \leq 10) \approx 0.048\), so a day this poor would occur by chance only about 5% of the time. The manager has reason to be concerned: the result is far enough below average to suggest something may be wrong (supplier delays, route issues, staffing problems) rather than normal random variation. The manager should follow up with the supplier to determine the cause.


Part d) Checkout Time — Complaint Probability and Expected Value:

\[P(Z > 5) = \frac{7 - 5}{7 - 1} = \frac{2}{6} = \frac{1}{3} \approx 0.333\]

Approximately 33.3% of express-lane customers will experience a checkout of more than 5 minutes and may complain.

The expected checkout time is 4 minutes (from Part a), which is below the 5-minute complaint threshold. This might seem reassuring, but the uniform distribution means that checkouts near 5, 6, or 7 minutes are just as likely as checkouts near 1, 2, or 3 minutes: every 1-minute interval carries the same probability, with no extra weight near the mean. Even though the average customer checks out in 4 minutes, fully one-third of customers exceed the threshold, suggesting the express lane process has meaningful variability that the expected value alone does not capture. The store might consider redesigning the express lane or retraining cashiers to reduce the upper tail of checkout times.