HW7: ShopSmart Hypothesis Testing Challenge

Making Data-Driven Business Decisions

🛒 The Case

Welcome, Data Analyst!

ShopSmart, a rapidly growing e-commerce platform, has hired you as their lead statistical analyst. The company is facing critical business decisions: Should they implement a new checkout system? Is their premium membership program effective? Are customer satisfaction scores improving?

Maya Chen, the VP of Analytics, greets you: “We’re drowning in data but starving for decisions. We need you to help us test our hypotheses, compare different customer segments, and make confident recommendations. Every decision affects our revenue and customer experience. We can’t just guess—we need rigorous statistical testing. Ready to transform our data into actionable insights?”

Your mission: Apply hypothesis testing frameworks to help ShopSmart make evidence-based business decisions!

Question 1: Hypothesis Testing Foundations

a. For each business scenario, formulate appropriate null and alternative hypotheses:

ShopSmart wants to test if their new express checkout reduces average purchase time (currently μ = 4.5 minutes)
They want to test if conversion rate exceeds the industry standard of 2.5%
They want to test if mean order value differs from last year’s $78.50

b. For scenario #1 above:

Describe what a Type I error would mean for ShopSmart. What’s the consequence?
Describe what a Type II error would mean. What’s the consequence?
If ShopSmart uses α = 0.05, what is the probability of Type I error?
If you’re told the test has 80% power, what is the probability of Type II error?

Question 2: Hypothesis Test with Known σ

ShopSmart has extensive historical data showing that order values have σ = $22.40. After implementing a new product recommendation algorithm, they randomly sample n = 64 orders and find x̄ = $82.30.

Historical mean order value: μ₀ = $78.50

Test at α = 0.05 whether the new algorithm has changed mean order value.

a. Set up:

State H₀ and Hₐ (two-tailed test)
Verify conditions for using the normal distribution
What significance level are you using?

b. Calculate the test statistic:

Use: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

c. Find the p-value:

Sketch a normal curve and shade the relevant area
Calculate the p-value for this two-tailed test
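The calculation in parts b and c can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `one_sample_z` and `two_tailed_p` are illustrative, not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n):
    """z statistic for a one-sample test when sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def two_tailed_p(z):
    """Area in both tails beyond |z| under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values from the problem statement
z = one_sample_z(xbar=82.30, mu0=78.50, sigma=22.40, n=64)
p_value = two_tailed_p(z)
print(f"z = {z:.3f}, two-tailed p-value = {p_value:.4f}")
```

Use the printed values to check your hand calculation, then compare the p-value to α yourself in part d.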

d. Make your decision and conclusion:

Compare p-value to α and state your decision
Write a complete conclusion that Maya can present to leadership

Question 3: Hypothesis Test with Unknown σ (t-test)

ShopSmart tests a new mobile app interface with a random sample of n = 20 users. They measure time to complete a purchase (minutes):

3.2, 2.8, 3.5, 2.9, 3.1, 3.8, 2.7, 3.3, 2.9, 3.0, 3.4, 2.8, 3.2, 3.1, 2.9, 3.3, 3.0, 2.8, 3.4, 3.1

The old interface had a mean completion time of μ₀ = 3.5 minutes.

Test at α = 0.05 if the new interface has reduced completion time.

a. Calculate summary statistics:

Find x̄ and s (show your work or describe your process)

b. Set up your test:

State H₀ and Hₐ (one-tailed test)
What distribution should you use and why?
What are the degrees of freedom?
State any assumptions needed

c. Calculate test statistic:

Use: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

d. Find the critical value:

For a one-tailed test with df = 19 and α = 0.05, t* = -1.729
Make your decision by comparing test statistic to critical value
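If you want to verify your summary statistics and test statistic, the sketch below reproduces parts a, c, and d with Python's standard library. Variable names are illustrative; the critical value is the one quoted in part d, not computed here.

```python
from math import sqrt
from statistics import mean, stdev

# Completion times (minutes) from the problem statement
times = [3.2, 2.8, 3.5, 2.9, 3.1, 3.8, 2.7, 3.3, 2.9, 3.0,
         3.4, 2.8, 3.2, 3.1, 2.9, 3.3, 3.0, 2.8, 3.4, 3.1]
mu0 = 3.5            # mean completion time of the old interface
t_critical = -1.729  # left-tailed critical value for df = 19, alpha = 0.05

n = len(times)
xbar = mean(times)
s = stdev(times)     # sample standard deviation (divides by n - 1)
t_stat = (xbar - mu0) / (s / sqrt(n))

# Left-tailed test: reject H0 when t falls below the critical value
reject_h0 = t_stat < t_critical
print(f"n = {n}, xbar = {xbar:.3f}, s = {s:.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Note that `stdev` uses the n - 1 divisor, which matches the sample standard deviation s in the formula above.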

e. Conclusion:

State your statistical decision
Write a practical conclusion for ShopSmart’s development team

Question 4: Comparing Two Independent Means (Equal Variances)

ShopSmart wants to compare customer satisfaction scores between two customer segments: Premium members vs. Standard members.

Premium members (n₁ = 35): x̄₁ = 8.4, s₁ = 1.6

Standard members (n₂ = 40): x̄₂ = 7.8, s₂ = 1.5

Test at α = 0.05 if Premium members have higher satisfaction scores.

a. Hypotheses:

State H₀ and Hₐ for testing if μ₁ > μ₂
Define what μ₁ and μ₂ represent in context

b. Calculate the pooled standard deviation:

$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$

c. Calculate the test statistic:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

With df = n₁ + n₂ - 2

d. Decision:

For df = 73 and one-tailed test at α = 0.05, critical value ≈ 1.666
Make your decision and write a conclusion
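The pooled calculation in parts b through d can be sketched as follows. The function names `pooled_sd` and `pooled_t` are illustrative helpers, and the critical value is the one quoted in part d rather than computed.

```python
from math import sqrt


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation, assuming equal population variances."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_t(xbar1, xbar2, sp, n1, n2):
    """Two-sample t statistic for H0: mu1 - mu2 = 0, using pooled SD."""
    return (xbar1 - xbar2) / (sp * sqrt(1 / n1 + 1 / n2))


# Premium (group 1) and Standard (group 2) summaries from the problem
n1, xbar1, s1 = 35, 8.4, 1.6
n2, xbar2, s2 = 40, 7.8, 1.5

sp = pooled_sd(n1, s1, n2, s2)
t_stat = pooled_t(xbar1, xbar2, sp, n1, n2)
df = n1 + n2 - 2
t_critical = 1.666   # right-tailed, df = 73, alpha = 0.05 (from part d)

print(f"s_p = {sp:.4f}, t = {t_stat:.3f}, df = {df}")
print("Reject H0" if t_stat > t_critical else "Fail to reject H0")
```

The test statistic lands close to the critical value here, so carry enough decimal places in your hand calculation before making the comparison.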

e. Calculate Cohen’s d effect size:

$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}$

Interpret the effect size using Cohen’s standards:

Small: d = 0.2
Medium: d = 0.5
Large: d = 0.8

Is this difference practically significant?
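A short sketch of the effect-size calculation follows. The `label_effect` helper bins d using Cohen's benchmarks as lower cutoffs, which is one common reading of those standards and an assumption made here for illustration.

```python
from math import sqrt


def cohens_d(xbar1, xbar2, sp):
    """Standardized mean difference using the pooled standard deviation."""
    return (xbar1 - xbar2) / sp


def label_effect(d):
    """Bin |d| with Cohen's benchmarks treated as lower cutoffs (assumption)."""
    size = abs(d)
    if size < 0.2:
        return "below small"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Summaries from the problem statement
n1, xbar1, s1 = 35, 8.4, 1.6
n2, xbar2, s2 = 40, 7.8, 1.5
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

d = cohens_d(xbar1, xbar2, sp)
print(f"d = {d:.3f} ({label_effect(d)})")
```

The label is only a starting point; whether the difference matters in practice depends on what a satisfaction point is worth to ShopSmart.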

Question 5: Comparing Two Independent Proportions

ShopSmart runs an A/B test on their checkout page design:

Design A (current):

n₁ = 500 visitors
x₁ = 85 completed purchases
p̂₁ = ?

Design B (new):

n₂ = 500 visitors
x₂ = 110 completed purchases
p̂₂ = ?

Test at α = 0.05 if the two designs have different conversion rates.

a. Calculate sample proportions:

Find p̂₁ and p̂₂

b. Set up:

State H₀ and Hₐ (two-tailed test)
Check conditions: Are np̂ ≥ 10 and n(1-p̂) ≥ 10 for both groups?

c. Calculate pooled proportion:

$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$

d. Calculate test statistic:

$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
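The pooled two-proportion calculation can be sketched in Python as below, using only the standard library. The function name `two_proportion_z` is an illustrative helper, not something the assignment defines.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (p1_hat, p2_hat, pooled p_hat, z) for H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return p1_hat, p2_hat, p_pool, (p1_hat - p2_hat) / se


# Counts from the A/B test in the problem statement
p1_hat, p2_hat, p_pool, z = two_proportion_z(x1=85, n1=500, x2=110, n2=500)

# Two-tailed p-value: both tails beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"p1 = {p1_hat:.3f}, p2 = {p2_hat:.3f}, pooled = {p_pool:.3f}")
print(f"z = {z:.3f}, two-tailed p-value = {p_value:.4f}")
```

The sign of z depends only on the order of subtraction; the two-tailed p-value is the same either way.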

e. Find p-value and make decision:

Calculate the two-tailed p-value
State your conclusion
Should ShopSmart implement Design B?

Question 6: Understanding P-values and Statistical Decisions

a. ShopSmart conducts a hypothesis test and obtains p-value = 0.032.

What does this p-value mean in plain language?
What decision should they make at α = 0.05?
What decision should they make at α = 0.01?
Explain why the decision changes
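The decision rule itself is a single comparison, which the sketch below makes explicit. The `decide` function is an illustrative helper and uses the common convention of rejecting when the p-value is at most α.

```python
def decide(p_value, alpha):
    """Apply the p-value decision rule: reject H0 when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


p_value = 0.032
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

The same evidence leads to different decisions because α sets how much evidence you require before rejecting H₀.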

b. True or False (explain each):

“If p-value = 0.04, there’s a 4% chance the null hypothesis is true.”
“A p-value of 0.001 provides stronger evidence against H₀ than p-value = 0.04.”
“If we fail to reject H₀, we have proven H₀ is true.”
“Statistical significance always means practical significance.”

Question 7: Choosing the Right Test & Distribution

For each scenario, identify:

What are you testing? (mean or proportion, one group or two groups)
Which distribution should you use? (z or t)
One-tailed or two-tailed test?
What conditions must be checked?

Scenarios:

a. Testing if mean customer lifetime value exceeds $500; n = 45, x̄ = $523, s = $85

b. Comparing mean response times between two customer service teams; Team 1: n₁ = 28, Team 2: n₂ = 32, both have unknown population standard deviations

c. Testing if the proportion of mobile users is greater than 0.65; n = 250, x = 175

d. Testing if mean shipping time has changed from 3 days; n = 15, x̄ = 2.7 days, σ = 0.8 days (known from historical data), population approximately normal

Question 8: Type I and Type II Errors in Context

ShopSmart is testing a new fraud detection algorithm. They want to test if the algorithm can reduce fraudulent transactions below the current rate of 1.2% (H₀: p = 0.012 vs. Hₐ: p < 0.012).

a. Explain in business terms:

What is a Type I error in this context? What are the costs?
What is a Type II error in this context? What are the costs?

b. Trade-offs:

If ShopSmart uses α = 0.01 instead of α = 0.05, how does this affect Type I error probability?
How does this change affect Type II error probability?
Which error type should ShopSmart be more concerned about? Why?

c. Power analysis:

If the test has power = 0.75, what does this mean?
How could ShopSmart increase the power of their test?
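To build intuition for parts b and c, the sketch below approximates the power of the left-tailed test of H₀: p = 0.012 using the normal approximation. The true fraud rate (0.008) and the sample sizes are hypothetical values chosen only for illustration; the problem does not specify them.

```python
from math import sqrt
from statistics import NormalDist


def power_left_tailed_prop(p0, p_true, n, alpha):
    """Approximate power of a left-tailed one-proportion z test."""
    std_normal = NormalDist()
    se_null = sqrt(p0 * (1 - p0) / n)
    se_true = sqrt(p_true * (1 - p_true) / n)
    # Reject H0 when the sample proportion falls below this cutoff
    cutoff = p0 + std_normal.inv_cdf(alpha) * se_null
    # Probability of landing below the cutoff if the true rate is p_true
    return std_normal.cdf((cutoff - p_true) / se_true)


P0 = 0.012       # rate under H0 (from the problem)
P_TRUE = 0.008   # hypothetical true rate, assumed for illustration

for n in (5000, 20000):          # hypothetical sample sizes
    for alpha in (0.05, 0.01):
        pw = power_left_tailed_prop(P0, P_TRUE, n, alpha)
        print(f"n = {n:>6}, alpha = {alpha}: power ≈ {pw:.3f}")
```

Under these assumed values, compare the rows to see how power responds to a smaller α and to a larger sample.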

Question 9: Comprehensive Hypothesis Test

ShopSmart claims their new premium shipping option delivers orders faster than the standard 3.2 days. They know from extensive logistics data that σ = 0.85 days. A random sample of n = 50 premium shipments shows x̄ = 2.85 days.

Conduct a complete hypothesis test at α = 0.01.

a. Formulate hypotheses:

Write H₀ and Hₐ
Explain why this is a one-tailed test

b. Check conditions and identify distribution:

List all conditions needed
Which distribution will you use?

c. Calculate test statistic:

Show your calculation
Interpret what this value means

d. Find p-value:

Sketch the distribution and shade the relevant area
Calculate the p-value
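For a left-tailed test the p-value is the area below the observed z, which differs from the two-tailed case in Question 2. A minimal sketch with the values from this problem (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
xbar, mu0, sigma, n = 2.85, 3.2, 0.85, 50

z = (xbar - mu0) / (sigma / sqrt(n))

# Left-tailed test (Ha: mu < mu0): p-value is the area to the left of z
p_value = NormalDist().cdf(z)

print(f"z = {z:.3f}, left-tailed p-value = {p_value:.4f}")
```

Use the result to check your sketch: the shaded region should sit entirely in the left tail.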

e. Make decision at α = 0.01:

State your decision
Provide a complete business conclusion

f. What if α = 0.05?

Would your decision change? Explain without recalculating.

💭 Question 10: Critical Thinking

a. Multiple testing problem:

ShopSmart runs 50 different A/B tests in a month, each at α = 0.05, testing various website features. If none of the features actually have any effect, approximately how many tests would show “statistically significant” results just by chance? What does this suggest about how ShopSmart should interpret their results?
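The arithmetic behind this question can be sketched as below. The sketch assumes the 50 tests are independent, which the problem does not state, and it shows a Bonferroni-adjusted α as one common correction; that adjustment is an addition for illustration, not something the assignment prescribes.

```python
num_tests = 50
alpha = 0.05

# Expected number of false positives when every null hypothesis is true
expected_false_positives = num_tests * alpha

# Chance of at least one false positive, assuming independent tests
p_at_least_one = 1 - (1 - alpha) ** num_tests

# Bonferroni adjustment: per-test alpha that caps the family-wise error rate
bonferroni_alpha = alpha / num_tests

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive) ≈ {p_at_least_one:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha:.4f}")
```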

b. Statistical vs. Practical Significance:

ShopSmart conducts a massive study with n = 10,000 customers and finds that a new feature increases average order value from $78.50 to $79.10, with p-value < 0.001. The result is highly statistically significant.

Why is this result statistically significant?
Calculate the actual difference in dollars
Is this difference practically significant for business decisions?
What other factors should Maya consider before implementing this feature company-wide?
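To see why a small difference can be highly significant in a large sample, the sketch below holds the $0.60 difference fixed and varies n. The standard deviation (σ = $15) is a hypothetical value chosen for illustration, and the comparison is treated as a one-sample z test against a fixed baseline of $78.50; the problem specifies neither.

```python
from math import sqrt
from statistics import NormalDist

baseline, observed = 78.50, 79.10
difference = observed - baseline   # dollar increase per order
ASSUMED_SIGMA = 15.0               # hypothetical, not given in the problem

z_by_n = {}
p_by_n = {}
for n in (100, 1_000, 10_000):
    standard_error = ASSUMED_SIGMA / sqrt(n)   # shrinks as n grows
    z_by_n[n] = difference / standard_error
    p_by_n[n] = 1 - NormalDist().cdf(z_by_n[n])  # right-tailed p-value
    print(f"n = {n:>6}: SE = {standard_error:.3f}, "
          f"z = {z_by_n[n]:.2f}, p = {p_by_n[n]:.5f}")
```

The dollar difference never changes across rows; only the standard error does, which is why statistical significance alone cannot tell Maya whether the feature is worth deploying.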

c. Reflection (4-6 sentences):

How do hypothesis tests help businesses make decisions under uncertainty?
Why is understanding Type I and Type II errors crucial for business decisions?
What’s one important concept about hypothesis testing that you think business managers often misunderstand?
Describe a real business situation (not from this homework) where you would need to compare two means or two proportions

🎉 Excellent work, Data Analyst!

You’ve successfully helped ShopSmart navigate the complex world of hypothesis testing! From formulating hypotheses to comparing different customer segments and understanding the nuances of p-values, you’ve demonstrated the rigorous thinking needed for data-driven business decisions.

Remember: Statistical significance isn’t the same as business significance. Always consider the practical implications of your findings, not just the p-value. In the business world, the best analysts combine statistical rigor with business judgment!