STAT 17: Central Limit Theorem & Confidence Intervals

Prof. Marcela Alfaro Cordoba

Statistics - UCSC

04 Nov 2025

Case Study: Quality Control at a Tech Company

Meet Chloe, a quality control manager at a smartphone manufacturer.

Her challenge: She can’t test every phone (population of millions), so she must:

  • Sample 100 phones from each production batch
  • Estimate the average battery life for ALL phones
  • Make confident statements about product quality
  • Decide if the batch meets standards

The problem: How can conclusions from 100 phones apply to millions?

The answer: The Central Limit Theorem (CLT) - one of the most powerful ideas in statistics! Understanding the CLT helps Chloe (and you!) make reliable inferences from samples.

Quick Review: Normal Distribution

What we learned last time:

  • Normal distribution: X ~ N(μ, σ²), bell-shaped and symmetric
  • Empirical Rule: 68-95-99.7 for μ ± σ, μ ± 2σ, μ ± 3σ
  • Z-scores: z = (x - μ)/σ standardizes any value
  • Google Sheets: NORM.DIST for probabilities, NORM.INV for percentiles

Key formulas:

  • P(X ≤ x): =NORM.DIST(x, μ, σ, TRUE)
  • Finding x for given probability: =NORM.INV(probability, μ, σ)

Today: We extend this to samples and introduce statistical inference!

What We’ll Accomplish Today

By the end of this lecture, you will be able to:

  • Understand the Central Limit Theorem and why it’s fundamental
  • Recognize when CLT conditions are met
  • Apply CLT to calculate probabilities for sample means
  • Understand the concept of statistical inference
  • Construct confidence intervals for population means (σ known)
  • Construct confidence intervals for population means (σ unknown)
  • Interpret confidence intervals correctly in context

The Sampling Problem

Fundamental Challenge in Statistics:

  • Population: All items we care about (often huge or infinite)
  • Sample: Subset we actually observe (limited by time/cost)
  • Question: How do we learn about the population from a sample?

Example: Chloe’s battery life problem

  • Population: All phones manufactured (millions) in that company
  • Sample: 100 randomly selected phones
  • Goal: Estimate average battery life for ALL phones from just 100

This is statistical inference!

Population vs Sample: Key Notation

Concept | Population | Sample
Size | N (usually unknown) | n (we choose this)
Mean | μ (unknown parameter) | x̄ (known statistic)
Std Dev | σ (usually unknown) | s (calculated from data)
Proportion | p (unknown parameter) | p̂ (known statistic)

Key Terms:

  • Parameter: Fixed (but unknown) population characteristic
  • Statistic: Calculated value from sample data
  • Goal: Use statistics to estimate parameters!

ACTIVITY: Discovering the Sampling Distribution

Let’s experience the CLT with YOUR data!

PART 1: Individual Data (2 min)

Go to: [bit.ly/stat17-heights]

  1. Enter YOUR height (inches) in the “individual Heights” tab
  2. Watch the individual histogram (be patient :))
  3. Note the shape and spread

PART 2: Sample Means (3 min)

In groups of 5:

  1. Calculate your group average
  2. Enter in “Group Averages” tab
  3. Watch that histogram form
  4. Compare the two histograms!

DISCUSSION (2 min): What’s different about the two distributions?

Poll Everywhere: What Did You Notice?

Compare the distribution of individual heights to group averages:

  1. Group averages are more spread out
  2. Group averages are more bell-shaped
  3. Group averages have the same shape
  4. Group averages are more skewed

Introducing the Central Limit Theorem

The Central Limit Theorem (CLT) states:

For a random sample of size n from ANY population with mean μ and standard deviation σ:

As n gets large, the sampling distribution of x̄ is approximately normal:

\[\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)\]

Or equivalently, listing the standard deviation instead of the variance (this is the form Google Sheets expects): \(\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\)

This is AMAZING because:

  • Works for ANY population distribution (not just normal!)
  • Only need “large enough” sample size
  • Allows us to make probability statements about x̄

Understanding the CLT: Key Points

What the CLT tells us:

  1. Center: E(x̄) = μ
    • Sample mean is unbiased estimator of population mean
  2. Spread: SD(x̄) = σ/√n (called Standard Error)
    • Larger samples → less variability in x̄
    • Standard Error decreases as n increases
  3. Shape: Approximately normal (for large n)
    • Even if population is skewed or unusual!

The Standard Error (SE):

\[SE = \frac{\sigma}{\sqrt{n}}\]

Critical insight: Variability of x̄ decreases with √n, not n!
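
To make the √n effect concrete, here is a minimal Python sketch of the standard error formula. The helper name standard_error and the value σ = 4 are illustrative assumptions, not part of the lecture materials.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 4.0  # illustrative population standard deviation (assumed value)

# Each step multiplies n by 4, and each step cuts the SE in half.
for n in (25, 100, 400):
    print(n, standard_error(sigma, n))
# 25 0.8
# 100 0.4
# 400 0.2
```

Going from n = 25 to n = 100 takes four times the data but only halves the standard error.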

CLT Conditions: When Does It Apply?

Two scenarios where CLT works:

Scenario 1: Population is already normal

  • Then x̄ is exactly normal for ANY sample size n

  • No minimum n required!

Scenario 2: Population is not normal

  • Need “large enough” sample size

  • Rule of thumb: n ≥ 30 usually sufficient

  • More skewed population → need larger n

Additional requirements:

  • Random sampling from population
  • Observations should be independent

In practice: Check these conditions before using CLT!

Visualizing the Central Limit Theorem

Example: Population is strongly right-skewed

  • Population: Not normal at all
  • Sample means (n=5): Starting to look more normal
  • Sample means (n=30): Very close to normal
  • Sample means (n=100): Essentially normal

The magic: Regardless of population shape, the distribution of x̄ becomes approximately normal as n grows!

Standard Error gets smaller:

  • n=5: SE large, x̄ values spread out

  • n=30: SE smaller, x̄ values cluster around μ

  • n=100: SE even smaller, x̄ very close to μ

See Applet: https://istats.shinyapps.io/sampdist_cont/
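
For anyone who wants to reproduce the idea outside the applet, the following is a rough simulation sketch. It assumes an Exponential(1) population as the "strongly right-skewed" example (μ = 1, σ = 1); that choice, the seed, and the function name simulate_sample_means are assumptions made for illustration.

```python
import random
import statistics

random.seed(17)  # fixed seed so the run is repeatable

def simulate_sample_means(n, reps=5000):
    """Draw `reps` samples of size n from an Exponential(1) population
    (right-skewed, mu = 1, sigma = 1) and return the list of sample means."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

for n in (5, 30, 100):
    means = simulate_sample_means(n)
    observed_sd = statistics.stdev(means)   # spread of the simulated x-bar values
    predicted_se = 1.0 / n ** 0.5           # sigma / sqrt(n) with sigma = 1
    print(f"n={n:3d}  mean of x-bar={statistics.fmean(means):.3f}  "
          f"SD of x-bar={observed_sd:.3f}  sigma/sqrt(n)={predicted_se:.3f}")
```

The simulated spread of x̄ should track σ/√n closely at every n, and the center should stay near μ = 1, matching the Center and Spread statements of the CLT.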

Example: Battery Life Application

Chloe’s scenario: Battery life for all phones

  • Population mean: μ = 24 hours (unknown to Chloe)
  • Population std dev: σ = 4 hours (from historical data)
  • Sample size: n = 100 phones

Question: What is the probability that the sample mean is between 23.5 and 24.5 hours?

By CLT: \(\bar{x} \sim N\left(24, \frac{4^2}{100}\right) = N(24, 0.16)\), where 0.16 is the variance

So the standard error is √0.16 = 0.4, and in standard-deviation form: \(\bar{x} \sim N(24, 0.4)\)

Google Sheets:

=NORM.DIST(24.5, 24, 0.4, TRUE) - NORM.DIST(23.5, 24, 0.4, TRUE)

Answer: ≈ 0.7887 or about 79%
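
The same probability can be cross-checked by standardizing with z-scores, as in the review slide. This is a sketch using Python's standard-library NormalDist rather than Google Sheets; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 24.0, 4.0, 100
se = sigma / n ** 0.5            # standard error: 0.4 hours

# Standardize both endpoints: z = (x - mu) / SE
z_low = (23.5 - mu) / se         # -1.25
z_high = (24.5 - mu) / se        # +1.25

std_normal = NormalDist()        # standard normal, N(0, 1)
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(round(prob, 4))            # 0.7887
```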

THINK-PAIR-SHARE 1 (7 minutes)

Understanding the Central Limit Theorem

A coffee shop’s daily revenue has μ = $2400 and σ = $600. You track revenue for n = 36 randomly selected days.

Answer these questions:

  1. What is the mean of the sampling distribution of x̄?
  2. What is the standard error of x̄?
  3. Is the CLT applicable here? Why or why not?
  4. What is P(x̄ > $2500)?
  5. What is P($2300 ≤ x̄ ≤ $2500)?
  6. Between what two values will the middle 95% of sample means fall?

Use Google Sheets for probability calculations!

Share your answers in Poll Everywhere!

Calculating Probabilities with CLT

General approach for sample means:

Given: Population with μ and σ, sample size n

Step 1: Check CLT conditions

  • Random sample?
  • n ≥ 30 or population normal?

Step 2: Find sampling distribution

  • Mean: μ
  • Standard error: σ/√n

Step 3: Calculate probability using the normal distribution. For P(x̄ ≤ x):

=NORM.DIST(x, μ, σ/SQRT(n), TRUE)

For P(x̄ ≥ a):

=1 - NORM.DIST(a, μ, σ/SQRT(n), TRUE)

For P(a ≤ x̄ ≤ b):

=NORM.DIST(b, μ, σ/SQRT(n), TRUE) - NORM.DIST(a, μ, σ/SQRT(n), TRUE)
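
The three Google Sheets patterns above can be mirrored in Python as a sanity check. This is a minimal sketch: the function names are hypothetical, and the numbers in the usage lines (μ = 100, σ = 15, n = 36) are made up for illustration.

```python
from statistics import NormalDist

def sampling_dist(mu, sigma, n):
    """Approximate sampling distribution of x-bar under the CLT."""
    return NormalDist(mu, sigma / n ** 0.5)

def p_at_most(x, mu, sigma, n):
    """P(x-bar <= x), like =NORM.DIST(x, mu, sigma/SQRT(n), TRUE)."""
    return sampling_dist(mu, sigma, n).cdf(x)

def p_at_least(a, mu, sigma, n):
    """P(x-bar >= a), like =1 - NORM.DIST(a, ...)."""
    return 1 - p_at_most(a, mu, sigma, n)

def p_between(a, b, mu, sigma, n):
    """P(a <= x-bar <= b), like =NORM.DIST(b, ...) - NORM.DIST(a, ...)."""
    return p_at_most(b, mu, sigma, n) - p_at_most(a, mu, sigma, n)

# Made-up numbers: mu = 100, sigma = 15, n = 36, so SE = 2.5
print(round(p_at_most(95, 100, 15, 36), 4))        # 0.0228
print(round(p_at_least(105, 100, 15, 36), 4))      # 0.0228
print(round(p_between(95, 105, 100, 15, 36), 4))   # 0.9545
```

The helpers do not check the CLT conditions from Step 1; that judgment still has to be made before trusting the output.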

More practice with CLT

Example 1: Shipping Company Quality Control

The situation: A shipping company processes thousands of packages daily. Individual package weights vary considerably (σ = 10 lbs) around a mean of μ = 50 lbs.

The question: If a delivery truck can safely carry packages with an average weight up to 48 lbs, and we load n = 64 randomly selected packages, what’s the probability the truck is overloaded?

Solution:

  • Standard Error: SE = 10/√64 = 1.25 lbs

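  • We need: P(x̄ > 48), since the truck is overloaded when the average exceeds 48 lbs

  • =1 - NORM.DIST(48, 50, 1.25, TRUE) ≈ 0.9452

Interpretation: There is only about a 5.5% chance that 64 random packages average 48 lbs or less, so the truck is overloaded about 94.5% of the time. The limit sits 1.6 standard errors below the population mean of 50 lbs. Notice how much more predictable the average is compared to individual packages.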

More practice with CLT

Example 2: Standardized Testing Assessment

The situation: A state education department knows that individual student math scores vary widely (σ = 12 points) around a mean of μ = 75.

The question: A school’s performance is evaluated based on the average score of n = 100 randomly selected students. What’s the probability the school’s average falls within the “acceptable” range of 73-77 points?

Solution:

  • Standard Error: SE = 12/√100 = 1.2 points

  • We need: P(73 ≤ x̄ ≤ 77)

  • =NORM.DIST(77,75,1.2,TRUE)-NORM.DIST(73,75,1.2,TRUE) ≈ 0.9044

Interpretation: About 90% of the time, a school’s average will fall in this range. With 100 students, individual score variability (σ = 12) becomes much smaller at the average level (SE = 1.2)!

Comparing Examples: The Magic of √n

What do both examples teach us?

Feature | Package Weights | Test Scores | Key Insight
Individual Variability | σ = 10 lbs | σ = 12 points | High variability in individuals
Sample Size | n = 64 | n = 100 | Larger samples better
Standard Error | SE = 1.25 lbs | SE = 1.2 points | Much smaller than σ!
Predictability | 95% of truck loads: 47.6-52.4 lbs | 95% of schools: 72.6-77.4 points | Averages are stable!
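
The "Predictability" row comes from taking the middle 95% of the sampling distribution, μ ± 1.96 × SE. A short Python sketch of that calculation follows; the helper name middle_range is an assumption for illustration.

```python
from statistics import NormalDist

def middle_range(mu, sigma, n, coverage=0.95):
    """Interval holding the central `coverage` share of sample means."""
    dist = NormalDist(mu, sigma / n ** 0.5)   # sampling distribution of x-bar
    tail = (1 - coverage) / 2                 # probability left in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

weights = middle_range(50, 10, 64)    # package weights example
scores = middle_range(75, 12, 100)    # test scores example
print(tuple(round(v, 2) for v in weights))   # (47.55, 52.45)
print(tuple(round(v, 2) for v in scores))    # (72.65, 77.35)
```

These agree with the table's ranges up to rounding to one decimal place.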

🧘‍♀️ STRETCH BREAK

Time to move! (5 minutes)

  • Stand up and stretch 🤸‍♀️
  • Chat with neighbors about CLT 💬
  • Grab some water 💧

From CLT to Statistical Inference

The big transition: Using samples to learn about populations

Three main types of inference:

  1. Point Estimation: Best single guess for parameter
    • Example: x̄ = 24.3 hours estimates μ
  2. Interval Estimation (Confidence Intervals): Range of plausible values
    • Example: μ is between 23.5 and 25.1 hours (we’re 95% confident)
  3. Hypothesis Testing: Testing claims about parameters
    • Example: Is μ really 24 hours? (coming in next lecture!)

Today’s focus: Confidence Intervals!

What is a Confidence Interval?

Definition: A range of values that likely contains the true parameter

Why we need them:

  • Point estimates (like x̄) are rarely exactly correct
  • CI provides margin of error around our estimate
  • Quantifies our uncertainty

General form:

\[\text{Estimate} \pm \text{Margin of Error}\]

For a mean:

\[\bar{x} \pm \text{(critical value)} \times \text{SE}\]

Interpretation: We are [confidence level]% confident the true parameter falls in this interval

Confidence Level vs Margin of Error

Confidence Level: How confident we are that interval contains true parameter

  • Common choices: 90%, 95%, 99%
  • Higher confidence → wider interval
  • Trade-off: confidence vs precision

Margin of Error (ME): Half the width of the CI

  • Larger ME → wider interval → less precise
  • ME depends on: variability (σ), sample size (n), confidence level

The relationship:

  • Want high confidence AND small ME?
  • Need larger sample size!

Confidence Interval: σ Known, large n

Scenario: We know the population standard deviation σ

When to use:

  • Historical data provides reliable σ
  • Large population with known variability
  • Theoretical standard deviation

Formula:

\[\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\]

Where:

  • x̄ = sample mean
  • z_{α/2} = critical z-value (depends on confidence level)
  • σ/√n = standard error

Common z-values:

  • 90% confidence: z = 1.645
  • 95% confidence: z = 1.96
  • 99% confidence: z = 2.576

Finding Critical Values in Google Sheets

For confidence level C (like 0.95 for 95%):

Method 1: Direct calculation

For 95% confidence (α = 0.05):

=NORM.S.INV(1 - 0.05/2)
=NORM.S.INV(0.975) ≈ 1.96

Method 2: For any confidence level C

=NORM.S.INV(1 - (1-C)/2)

Example: 90% confidence

=NORM.S.INV(1 - (1-0.90)/2)
=NORM.S.INV(0.95) ≈ 1.645

Example: 99% confidence

=NORM.S.INV(1 - (1-0.99)/2)
=NORM.S.INV(0.995) ≈ 2.576
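
For comparison, the same critical values can be obtained in Python. This sketch assumes the standard-library NormalDist, whose inv_cdf plays the role of NORM.S.INV when applied to the standard normal; the function name z_critical is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Critical z-value for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)   # same idea as NORM.S.INV(1 - alpha/2)

for c in (0.90, 0.95, 0.99):
    print(c, round(z_critical(c), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```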

Example: Battery Life CI (σ known)

Chloe’s data:

  • Sample: n = 100 phones
  • Sample mean: x̄ = 23.8 hours
  • Known population std dev: σ = 4 hours
  • Want 95% confidence interval

Step 1: Check conditions: Random sample? ✓ and n = 100 ≥ 30? ✓

Step 2: Calculate critical value = NORM.S.INV(0.975) = 1.96

Step 3: Calculate standard error SE = 4/SQRT(100) = 0.4

Step 4: Calculate margin of error ME = 1.96 × 0.4 = 0.784

Step 5: Construct interval 23.8 ± 0.784 = (23.016, 24.584)

Interpretation: We are 95% confident the true average battery life for all phones is between 23.0 and 24.6 hours.
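
Steps 2 through 5 can be bundled into one small function. The following is a sketch in Python, not the course's Google Sheets workflow; z_interval is a hypothetical helper name, and it assumes the conditions in Step 1 have already been checked.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Step 2: critical value
    se = sigma / n ** 0.5                                # Step 3: standard error
    me = z * se                                          # Step 4: margin of error
    return xbar - me, xbar + me                          # Step 5: interval

low, high = z_interval(23.8, 4, 100)
print(round(low, 3), round(high, 3))   # 23.016 24.584
```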

THINK-PAIR-SHARE 2 (7 minutes)

Confidence Intervals with Known σ

A university measures commute times for students. Historical data shows σ = 15 minutes. A random sample of n = 225 students has x̄ = 42 minutes.

Calculate and interpret:

  1. A 90% confidence interval for mean commute time
  2. A 95% confidence interval for mean commute time
  3. A 99% confidence interval for mean commute time
  4. What happens to the interval width as confidence increases?
  5. What would happen to the 95% CI if we only sampled 100 students?

Use Google Sheets for all calculations!

Share your answers in Poll Everywhere!

Interpreting Confidence Intervals: Common Mistakes

CORRECT interpretation (95% CI: (23.0, 24.6)):

✓ “We are 95% confident that the true population mean battery life is between 23.0 and 24.6 hours.”

✓ “If we repeated the sampling many times, about 95% of the intervals constructed this way would contain the true mean.”

INCORRECT interpretations:

✗ “There is a 95% probability that μ is in (23.0, 24.6).” - μ is fixed, not random!

✗ “95% of phones have battery life between 23.0 and 24.6.” - That’s about individual phones, not the mean!

✗ “We are 95% confident that x̄ is between 23.0 and 24.6.” - We know x̄ = 23.8 exactly!

Remember: The interval either contains μ or it doesn’t. Our confidence is in the METHOD, not a specific interval!
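
One way to see that the confidence belongs to the method is to simulate it. The sketch below pretends we know the truth (μ = 24, σ = 4, normal population), builds many 95% intervals from fresh samples of n = 100, and counts how often they capture μ. The population shape, seed, and names are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

random.seed(2025)  # fixed seed so the run is repeatable

MU, SIGMA, N = 24.0, 4.0, 100          # "true" values, known only to the simulation
Z = NormalDist().inv_cdf(0.975)        # 95% critical value
ME = Z * SIGMA / N ** 0.5              # margin of error (same for every sample)

def interval_covers_mu():
    """Draw one sample, build its 95% interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = fmean(sample)
    return xbar - ME <= MU <= xbar + ME

REPS = 2000
coverage = sum(interval_covers_mu() for _ in range(REPS)) / REPS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

Each individual interval either contains μ or misses it; the share that contain it should land near 0.95.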

Sample Size and Margin of Error

Key relationship:

\[ME = z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\]

What affects margin of error:

  1. Confidence level (z_{α/2}): Higher confidence → larger ME
  2. Variability (σ): More spread → larger ME
  3. Sample size (n): Larger sample → smaller ME

The trade-off:

  • Want smaller ME? Need larger n!
  • ME is proportional to 1/√n, not 1/n
  • Doubling precision (halving ME) requires quadrupling the sample size

Example: ME = 0.784 with n = 100 (σ = 4, 95% confidence)

  • For ME = 0.392 (half as wide), need n = 400 (4 times as many)

Calculating Required Sample Size

Question: How large should n be for a desired margin of error?

Formula:

\[n = \left(\frac{z_{\alpha/2} \times \sigma}{ME}\right)^2\]

Example: Chloe wants ME = 0.5 hours with 95% confidence, σ = 4

=ROUNDUP((1.96 * 4 / 0.5)^2, 0)
=ROUNDUP(245.86, 0) = 246

She needs n = 246 phones for ME of 0.5 hours

Google Sheets template:

Cell A1: Desired ME        | Cell B1: 0.5
Cell A2: Confidence Level  | Cell B2: 0.95
Cell A3: Pop Std Dev (σ)   | Cell B3: 4
Cell A4: Critical Value    | Cell B4: =NORM.S.INV(1-(1-B2)/2)
Cell A5: Required n        | Cell B5: =ROUNDUP((B4*B3/B1)^2, 0)

Always round UP to ensure ME doesn’t exceed target!
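
The spreadsheet template translates directly to Python. This is a sketch under the same assumptions as the template (σ known, normal critical value); the function name required_n is hypothetical.

```python
import math
from statistics import NormalDist

def required_n(me, sigma, confidence=0.95):
    """Smallest n whose margin of error does not exceed `me` (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / me) ** 2)   # always round UP

print(required_n(0.5, 4))    # 246
print(required_n(0.25, 4))   # 984
```

Halving the target ME from 0.5 to 0.25 roughly quadruples the required sample size, from 246 to 984.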

Confidence Intervals: σ Unknown

Reality check: We usually don’t know σ!

When σ is unknown:

  • Use sample standard deviation s instead
  • Use t-distribution instead of z-distribution
  • Formula becomes:

\[\bar{x} \pm t_{\alpha/2, df} \times \frac{s}{\sqrt{n}}\]

Where:

  • s = sample standard deviation
  • df = degrees of freedom = n - 1
  • t_{α/2, df} = critical t-value (depends on df and confidence level)

When to use t instead of z:

  • σ unknown (most real situations!)
  • Small samples especially (n < 30)
  • For large n, t ≈ z anyway

The t-Distribution

Properties:

  • Bell-shaped and symmetric (like z)
  • Heavier tails than normal distribution
  • Family of curves (different for each df)
  • As df increases, t → z

Degrees of freedom (df = n - 1):

  • Small df: heavier tails, wider intervals
  • Large df: closer to normal
  • df = 30+: very close to z-distribution

Why heavier tails?

  • Using s instead of σ adds extra uncertainty
  • t-distribution accounts for this
  • Provides appropriate wider intervals for small samples

See Applet: https://istats.shinyapps.io/tdist/

Finding t Critical Values in Google Sheets

For confidence level C with df = n - 1:

=T.INV.2T(alpha, df)

Where alpha = 1 - C

Example: 95% CI with n = 20 (df = 19)

=T.INV.2T(0.05, 19) ≈ 2.093

Compare to z:

=NORM.S.INV(0.975) ≈ 1.96

Notice: t-value is larger (wider interval for small sample)!

As sample size increases:

  • n = 10, df = 9: t = 2.262

  • n = 30, df = 29: t = 2.045

  • n = 100, df = 99: t = 1.984

  • z = 1.96

Complete Example: t-Confidence Interval

Chloe tests a new production method:

  • Sample: n = 25 phones
  • Sample mean: x̄ = 24.5 hours
  • Sample std dev: s = 3.5 hours
  • Want 95% confidence interval

Step 1: Check conditions

  • Random sample? ✓

  • n = 25 < 30, but assume battery life roughly normal ✓

Step 2: Calculate critical value

df = 25 - 1 = 24
=T.INV.2T(0.05, 24) ≈ 2.064

Step 3: Calculate standard error

SE = 3.5/SQRT(25) = 0.7

Step 4: Calculate margin of error

ME = 2.064 × 0.7 = 1.445

Step 5: Construct interval

24.5 ± 1.445 = (23.055, 25.945)

Interpretation: We are 95% confident the true mean battery life with the new method is between 23.1 and 25.9 hours.
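
The arithmetic in Steps 3 through 5 can be checked with a few lines of Python. Python's standard library has no t quantile function, so this sketch takes the critical value as an input (2.064, from T.INV.2T above) instead of computing it; t_interval is a hypothetical helper name.

```python
def t_interval(xbar, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown.

    t_crit must be supplied by the caller, e.g. from =T.INV.2T(alpha, n - 1).
    """
    se = s / n ** 0.5          # Step 3: standard error from the sample std dev
    me = t_crit * se           # Step 4: margin of error
    return xbar - me, xbar + me

low, high = t_interval(24.5, 3.5, 25, t_crit=2.064)
print(round(low, 3), round(high, 3))   # 23.055 25.945
```

Using z = 1.96 here would give a margin of error of about 1.372 instead of 1.445, an interval that is too narrow for a sample of 25.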

When to Use z vs t

Scenario | Use | Why
σ known, n ≥ 30 or normal pop | z | Know population std dev
σ unknown, n ≥ 30 | t (or z) | CLT applies, t ≈ z
σ unknown, n < 30, normal pop | t | Account for extra uncertainty
σ unknown, n < 30, non-normal | Neither | Need non-parametric methods

In practice:

  • Almost always σ unknown → use t
  • Google Sheets makes t just as easy as z
  • When in doubt, use t (more conservative)

Example: If σ = 4 is historical but you calculate s = 3.5 from your sample, use t with s = 3.5!

THINK-PAIR-SHARE 3 (7 minutes)

Confidence Intervals with Unknown σ (t-distribution)

A nutritionist samples n = 16 students and measures daily protein intake (grams):

x̄ = 68, s = 12

Calculate and interpret:

  1. A 95% confidence interval for mean protein intake
  2. A 90% confidence interval
  3. Compare the widths - which is wider and why?
  4. If this had been n = 100 with same x̄ and s, calculate the 95% CI
  5. Compare the n=16 and n=100 intervals - what do you notice?
  6. What assumptions are required for this interval to be valid?

Use Google Sheets! Post on Ed Discussion with partner’s name!

Confidence Intervals for Proportions

New scenario: Estimating a population proportion p

Examples:

  • Proportion of defective products
  • Percentage of voters supporting a candidate
  • Proportion of customers who would buy a product

Sample proportion:

\[\hat{p} = \frac{x}{n}\]

where x = number of successes, n = sample size

Standard error for proportion:

\[SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

CI for Proportions: Formula

Confidence interval for proportion:

\[\hat{p} \pm z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

Conditions (check these!):

  1. Random sample
  2. np̂ ≥ 10 and n(1-p̂) ≥ 10 (success-failure condition)
  3. Population at least 10n (if sampling without replacement)

Why use z, not t?

  • For proportions, always use the z-distribution
  • There is no separate σ to estimate: the spread is determined by p itself
  • The SE comes directly from p̂ and n

Example: Proportion CI

Chloe surveys customers:

  • n = 200 customers sampled
  • x = 156 satisfied with phone
  • p̂ = 156/200 = 0.78
  • Want 95% CI for true satisfaction rate

Step 1: Check conditions

  • Random sample? ✓
  • np̂ = 200(0.78) = 156 ≥ 10 ✓
  • n(1-p̂) = 200(0.22) = 44 ≥ 10 ✓

Step 2: Calculate

SE = SQRT(0.78*0.22/200) = 0.0293
z = 1.96
ME = 1.96 × 0.0293 = 0.0574
CI: 0.78 ± 0.0574 = (0.723, 0.837)

Interpretation: We are 95% confident that between 72.3% and 83.7% of ALL customers are satisfied.
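
A compact Python version of the same calculation, including the success-failure check, might look like the sketch below. The function name proportion_interval is an assumption, and the random-sample condition cannot be checked in code.

```python
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """z-interval for a population proportion."""
    # Success-failure condition: n*p_hat and n*(1 - p_hat) are just the counts
    if successes < 10 or n - successes < 10:
        raise ValueError("success-failure condition not met")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - me, p_hat + me

low, high = proportion_interval(156, 200)
print(round(low, 3), round(high, 3))   # 0.723 0.837
```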

Sample Size for Proportion CI

Question: How large should n be for desired ME?

Formula:

\[n = \left(\frac{z_{\alpha/2}}{ME}\right)^2 \times \hat{p}(1-\hat{p})\]

Problem: We need p̂ to calculate n, but we need n to get p̂!

Solutions:

  1. Use prior estimate: If you have preliminary data or historical data
  2. Use p̂ = 0.5: Most conservative (gives largest n)
  3. Use educated guess: Based on similar studies

Example: Want ME = 0.03 with 95% confidence, use p̂ = 0.5

=ROUNDUP((1.96/0.03)^2 * 0.5 * 0.5, 0)
=ROUNDUP(1067.11, 0) = 1068

Need n = 1068 for ME of 3% with no prior estimate
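
To see how much the conservative choice p̂ = 0.5 costs, the formula can be sketched in Python and compared against a prior estimate. The function name required_n_proportion is hypothetical, and the prior value 0.78 is borrowed from Chloe's satisfaction survey purely as an illustration.

```python
import math
from statistics import NormalDist

def required_n_proportion(me, p_guess=0.5, confidence=0.95):
    """Sample size for a proportion CI with margin of error at most `me`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z / me) ** 2 * p_guess * (1 - p_guess))   # round UP

print(required_n_proportion(0.03))                 # 1068 (no prior estimate)
print(required_n_proportion(0.03, p_guess=0.78))   # 733  (prior estimate of 0.78)
```

A trustworthy prior estimate away from 0.5 reduces the required sample, but if the guess is wrong the realized margin of error can exceed the target.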

Summary: Confidence Intervals

Mean with σ known (or large n) \[\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\]

Mean with σ unknown (use s) \[\bar{x} \pm t_{\alpha/2, df} \times \frac{s}{\sqrt{n}}\]

Proportion \[\hat{p} \pm z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

Always check conditions before constructing CIs!

Common Mistakes to Avoid

Mistake 1: Wrong interpretation - CI is about the parameter (μ or p), not the statistic or individuals

Mistake 2: Using z when should use t - If σ unknown, use t (especially for small samples)

Mistake 3: Forgetting to check conditions - Random sampling, sample size requirements, normality

Mistake 4: Confusing confidence level and probability - “95% confident” ≠ “95% probability μ is in the interval”

Mistake 5: Not rounding sample size correctly - Always round UP to ensure ME doesn’t exceed target

Mistake 6: Using wrong formula for proportions - Don’t use s, use √[p̂(1-p̂)/n]

Back to Chloe’s Story

Chloe can now:

✅ Use CLT to understand sampling variability of x̄

✅ Construct 95% CI: (23.0, 24.6) hours for battery life

✅ Calculate required sample size for desired precision

✅ Make confidence statements: “95% confident true mean is in interval”

✅ Handle both known and unknown σ scenarios

✅ Build CIs for satisfaction rates (proportions)

This is the foundation of statistical inference!

Next time: Hypothesis testing - testing claims about parameters!

Quick Knowledge Check ✅

Rate your confidence (1-5) on Ed Discussion:

  1. Understanding the Central Limit Theorem ⭐⭐⭐⭐⭐
  2. Applying CLT to calculate probabilities ⭐⭐⭐⭐⭐
  3. Constructing CIs with σ known (z-interval) ⭐⭐⭐⭐⭐
  4. Constructing CIs with σ unknown (t-interval) ⭐⭐⭐⭐⭐
  5. Constructing CIs for proportions ⭐⭐⭐⭐⭐
  6. Calculating required sample sizes ⭐⭐⭐⭐⭐
  7. Interpreting CIs correctly ⭐⭐⭐⭐⭐

If you rated anything 3 or below, visit office hours!

Thank you! 📊✨

Questions? I have office hours right after class today!

Next up: Hypothesis Testing - testing claims about populations

Remember:

  • Post Think-Pair-Share on Ed Discussion and Poll Everywhere
  • Rate your confidence
  • Practice constructing different types of CIs
  • Review CLT conditions carefully