Confidence Intervals for Proportions
Estimating population proportions
What is a Confidence Interval?
Confidence Interval (CI): Range of plausible values for population parameter
Form: statistic ± margin of error
Interpretation: We are C% confident the interval contains the true parameter
Example: 95% CI for p: (0.52, 0.58)
We are 95% confident true population proportion is between 0.52 and 0.58
One-Sample CI for Proportion
Formula: p̂ ± z*√[p̂(1-p̂)/n]
Where:
- p̂ = sample proportion
- z* = critical value (from confidence level)
- n = sample size
Critical Values
Common confidence levels:
| Confidence Level | z* |
|------------------|-----|
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
Higher confidence → wider interval
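The table values come from the standard normal distribution: z* is the cutoff that leaves (1 - C)/2 in each tail. A minimal Python sketch, assuming only the standard library's `statistics.NormalDist`; the helper name `critical_value` is illustrative and not part of these notes.

```python
from statistics import NormalDist

def critical_value(confidence: float) -> float:
    """Return z* for a two-sided interval at the given confidence level (e.g. 0.95)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # The central area is `confidence`, so each tail holds (1 - confidence) / 2.
    upper_tail_cutoff = 1 - (1 - confidence) / 2
    return NormalDist().inv_cdf(upper_tail_cutoff)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
```

Rounded to three decimals, the results should agree with the table, and any other confidence level can be handled the same way.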
Example 1: Simple CI
Survey: 400 voters, 220 support candidate
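Sample proportion: p̂ = 220/400 = 0.55
95% CI: 0.55 ± 1.96√[0.55(0.45)/400] ≈ 0.55 ± 0.049 = (0.501, 0.599)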
Interpretation: We are 95% confident between 50.1% and 59.9% of voters support the candidate.
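The voter survey can be checked with a few lines of Python. This is a sketch of the normal-approximation interval only; the function name `one_prop_z_interval` is an assumption made for illustration.

```python
from math import sqrt

def one_prop_z_interval(successes: int, n: int, z_star: float) -> tuple[float, float]:
    """Normal-approximation interval: p_hat ± z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Example 1: 220 of 400 voters, 95% confidence (z* = 1.96).
low, high = one_prop_z_interval(220, 400, 1.96)
print(f"({low:.3f}, {high:.3f})")  # prints (0.501, 0.599)
```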
Conditions for CI
Random: Random sample
Normal: np̂ ≥ 10 and n(1-p̂) ≥ 10
Independent: n ≤ 10% of population (the 10% condition, needed when sampling without replacement)
Check ALL before proceeding!
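Two of the three conditions can be checked from the numbers alone; randomness depends on how the data were collected. A hedged sketch: `check_conditions` is a hypothetical helper, and the population size of 100,000 is an assumed value used only to exercise the 10% check.

```python
from typing import Optional

def check_conditions(successes: int, n: int,
                     population_size: Optional[int] = None) -> dict:
    """Report the Normal and Independent conditions.

    The Random condition is a property of the study design and
    cannot be verified from the counts.
    """
    failures = n - successes
    return {
        # n*p_hat and n*(1 - p_hat) are just the observed successes and failures.
        "normal": successes >= 10 and failures >= 10,
        # None means no population size was supplied, so the check was skipped.
        "independent": None if population_size is None
                       else n <= 0.10 * population_size,
    }

print(check_conditions(220, 400, population_size=100_000))  # assumed population
print(check_conditions(8, 200))  # only 8 successes: Normal condition fails
```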
Margin of Error
Margin of Error (ME): ME = z*√[p̂(1-p̂)/n]
Factors affecting ME:
- Larger z* (higher confidence) → larger ME
- Larger n → smaller ME
- p̂ near 0.5 → larger ME (maximum variability)
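Each of the three factors can be seen by changing one input at a time. The inputs below (n = 400, p̂ = 0.5, and so on) are illustrative values chosen for this sketch, not data from the notes.

```python
from math import sqrt

def margin_of_error(p_hat: float, n: int, z_star: float) -> float:
    """ME = z* * sqrt(p_hat(1 - p_hat)/n)."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

baseline = margin_of_error(0.5, 400, 1.96)
higher_confidence = margin_of_error(0.5, 400, 2.576)  # larger z*
larger_sample = margin_of_error(0.5, 1600, 1.96)      # four times the sample
extreme_p_hat = margin_of_error(0.1, 400, 1.96)       # p_hat far from 0.5

print(f"baseline           {baseline:.4f}")
print(f"99% instead of 95% {higher_confidence:.4f}")
print(f"n = 1600           {larger_sample:.4f}")
print(f"p_hat = 0.1        {extreme_p_hat:.4f}")
```

Because ME shrinks with 1/√n, quadrupling the sample size halves the margin of error.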
Sample Size for Desired ME
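To achieve margin of error m: n = (z*)²p̂(1-p̂)/m², rounded up to the next whole number
Conservative approach (if no estimate): Use p̂ = 0.5
Example: Want ME = 0.03 with 95% confidence
n = (1.96)²(0.5)(0.5)/(0.03)² = 0.9604/0.0009 ≈ 1067.1
Round up: need at least 1068 people!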
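The sample-size calculation is easy to script. A minimal sketch, assuming the conservative default p̂ = 0.5; `required_sample_size` is an illustrative name.

```python
from math import ceil

def required_sample_size(margin: float, z_star: float, p_guess: float = 0.5) -> int:
    """Smallest n with z* * sqrt(p(1 - p)/n) <= margin, i.e. n >= z*^2 p(1 - p) / margin^2."""
    raw = (z_star ** 2) * p_guess * (1 - p_guess) / margin ** 2
    return ceil(raw)  # always round up, never to the nearest integer

print(required_sample_size(0.03, 1.96))   # 95% confidence: prints 1068
print(required_sample_size(0.03, 1.645))  # 90% confidence: prints 752
```

Passing a prior estimate as `p_guess` (anything other than 0.5) gives a smaller required n.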
Interpreting Confidence Level
95% confidence means:
- If we repeated sampling many times and built 95% CI each time
- About 95% of intervals would contain true p
- About 5% would miss true p
NOT:
- "95% chance p is in our interval" (p is fixed!)
- "95% of data is in interval"
Our specific interval either contains p or it doesn't (we just don't know which)
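The repeated-sampling meaning of "95% confident" can be illustrated by simulation. This sketch assumes a true proportion of 0.55 and n = 400 (values picked for illustration) and counts how often the interval captures the true p; the seed is fixed only so the run is repeatable.

```python
import random
from math import sqrt

def coverage_rate(true_p: float, n: int, z_star: float,
                  repetitions: int, seed: int = 0) -> float:
    """Fraction of simulated intervals p_hat ± z* * SE that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / repetitions

rate = coverage_rate(true_p=0.55, n=400, z_star=1.96, repetitions=2000)
print(f"{rate:.1%} of the simulated intervals contain the true p")
```

The printed rate should land near 95%, while any single interval in the loop either contains 0.55 or does not.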
Increasing Confidence
Want higher confidence (say 99% instead of 95%):
- Use larger z* (2.576 instead of 1.96)
- Interval becomes wider
- Trade-off: More confidence but less precision
Example 2: With Interpretation
Survey of 500 students: 285 have jobs
Conditions:
- Random: Assume random sample ✓
- Normal: 500(0.57) = 285 ≥ 10, 500(0.43) = 215 ≥ 10 ✓
- Independent: 500 < 10% of all students (assume) ✓
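Sample proportion: p̂ = 285/500 = 0.57
90% CI: 0.57 ± 1.645√[0.57(0.43)/500] ≈ 0.57 ± 0.036 = (0.534, 0.606)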
Interpretation: We are 90% confident that between 53.4% and 60.6% of all students have jobs.
Common Mistakes
❌ Saying "95% of data in interval"
❌ Saying "95% chance p in interval"
❌ Not checking conditions
❌ Using t* instead of z* for proportions
❌ Rounding p̂ too early
Two-Sample CI for Difference in Proportions
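Comparing two groups: (p̂₁ - p̂₂) ± z*√[p̂₁(1-p̂₁)/n₁ + p̂₂(1-p̂₂)/n₂]
Conditions: Each group meets the Random, Normal, and Independent conditions separately, and the two samples are independent of each other
Interpretation: If the interval contains 0, the data do not show a significant difference between the two proportions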
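A sketch of the two-sample interval in Python. The counts (120 of 200 versus 90 of 200) are hypothetical, invented only to show the calculation, and `two_prop_z_interval` is an illustrative name.

```python
from math import sqrt

def two_prop_z_interval(x1: int, n1: int, x2: int, n2: int,
                        z_star: float) -> tuple[float, float]:
    """(p1 - p2) ± z* * sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), using sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star * standard_error
    difference = p1 - p2
    return difference - margin, difference + margin

# Hypothetical data: 120 of 200 in group 1, 90 of 200 in group 2, 95% confidence.
low, high = two_prop_z_interval(120, 200, 90, 200, 1.96)
print(f"({low:.3f}, {high:.3f})")
print("contains 0:", low <= 0 <= high)
```

With these made-up counts the interval lies entirely above 0, which would point to a higher proportion in group 1.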
Calculator Commands (TI-83/84)
STAT → TESTS → A:1-PropZInt
Enter:
- x (count of successes)
- n (sample size)
- C-Level (confidence level as decimal)
Calculate → gives interval
Relationship to Hypothesis Testing
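If testing H₀: p = p₀ against a two-sided alternative at significance level α:
Roughly equivalent: Check if the 100(1-α)% CI contains p₀
- If p₀ in CI → fail to reject H₀
- If p₀ not in CI → reject H₀
For proportions the match is approximate: the test's standard error uses p₀ while the interval's uses p̂, so borderline cases can disagree.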
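The link between the interval and a two-sided test can be sketched with the 220-of-400 voter data from Example 1. Both function names are illustrative; the test statistic uses p₀ in its standard error, the interval uses p̂.

```python
from math import sqrt

def ci_contains(p0: float, successes: int, n: int, z_star: float) -> bool:
    """True if p0 lies inside p_hat ± z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= p0 <= p_hat + margin

def z_test_rejects(p0: float, successes: int, n: int, z_star: float) -> bool:
    """Two-sided one-proportion z-test: reject H0 when |z| exceeds z*."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return abs(z) > z_star

for p0 in (0.50, 0.58):
    print(p0,
          "| in 95% CI:", ci_contains(p0, 220, 400, 1.96),
          "| test rejects at 0.05:", z_test_rejects(p0, 220, 400, 1.96))
```

Here p₀ = 0.50 falls just outside the interval and the test rejects, while p₀ = 0.58 falls inside and the test does not reject.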
Quick Reference
Formula: p̂ ± z*√[p̂(1-p̂)/n]
Conditions: Random, np̂ ≥ 10 and n(1-p̂) ≥ 10, n ≤ 10% of N
Common z*: 1.645 (90%), 1.96 (95%), 2.576 (99%)
Sample size: n = (z*)²p̂(1-p̂)/ME², rounded up (use p̂ = 0.5 if no estimate)
Remember: Higher confidence → wider interval. Larger sample → narrower interval. Always check conditions and interpret in context!
📚 Practice Problems
Problem 1 (easy)
❓ Question:
In a random sample of 400 voters, 220 support a proposition. Construct a 95% confidence interval for the true proportion of voters who support the proposition.
Solution:
Step 1: Identify the information
- n = 400 (sample size)
- x = 220 (number of successes)
- p̂ = 220/400 = 0.55 (sample proportion)
- Confidence level: 95%
Step 2: Check conditions for proportion CI
- RANDOM: Sample is random ✓
- NORMAL: np̂ = 400(0.55) = 220 ≥ 10 ✓ and n(1-p̂) = 400(0.45) = 180 ≥ 10 ✓
- INDEPENDENT: n ≤ 0.10N; 400 ≤ 10% of all voters (assumed) ✓
All conditions met!
Step 3: Find critical value
95% confidence → α = 0.05, so z* = 1.96 (from table for 95% CI)
Step 4: Calculate standard error
SE = √[p̂(1-p̂)/n] = √[0.55(0.45)/400] = √[0.2475/400] = √0.00061875 ≈ 0.0249
Step 5: Calculate margin of error
ME = z* × SE = 1.96 × 0.0249 ≈ 0.0488
Step 6: Construct confidence interval
CI = p̂ ± ME = 0.55 ± 0.049 = (0.501, 0.599)
Or: (0.50, 0.60) rounded
Step 7: Interpret the interval
We are 95% confident that the true proportion of voters who support the proposition is between 0.50 and 0.60 (or 50% and 60%).
This means:
- If we repeated this sampling process many times
- About 95% of intervals would contain true p
- This specific interval either contains p or doesn't
- But the method captures the true p in about 95% of samples
Answer: 95% CI for p: (0.50, 0.60)
We are 95% confident that between 50% and 60% of all voters support the proposition.
Problem 2 (easy)
❓ Question:
A quality control inspector finds 8 defects in a sample of 200 items. Construct a 90% confidence interval for the defect rate.
Solution:
Step 1: Calculate sample proportion
n = 200, x = 8, p̂ = 8/200 = 0.04
Step 2: Check conditions
- RANDOM: Assume random sample ✓
- NORMAL: np̂ = 200(0.04) = 8 < 10 ✗; n(1-p̂) = 200(0.96) = 192 ≥ 10 ✓
Condition fails! But let's proceed with caution. (In practice, might use exact binomial method)
Step 3: Find z* for 90% confidence
90% confidence → z* = 1.645
Step 4: Calculate SE
SE = √[p̂(1-p̂)/n] = √[0.04(0.96)/200] = √[0.0384/200] = √0.000192 ≈ 0.0139
Step 5: Calculate ME
ME = 1.645 × 0.0139 ≈ 0.023
Step 6: Construct CI
CI = 0.04 ± 0.023 = (0.017, 0.063) = (1.7%, 6.3%)
Step 7: Interpret with caution
We are 90% confident the true defect rate is between 1.7% and 6.3%.
Note: This interval may not be as reliable since np̂ < 10.
Answer: 90% CI: (0.017, 0.063) or (1.7%, 6.3%)
Caution: The success-failure condition is marginally violated (only 8 successes), so this normal-based interval may not be fully reliable.
Problem 3 (medium)
❓ Question:
A researcher wants to estimate the proportion of defective items with a margin of error no more than 0.03 at 90% confidence. How large a sample is needed if no prior estimate exists?
Solution:
Step 1: Identify what we need
- ME = 0.03
- Confidence level = 90% → z* = 1.645
- No prior estimate → use p̂ = 0.5
Step 2: Use sample size formula
n = (z*)²p̂(1-p̂)/ME²
Step 3: Calculate
n = (1.645)²(0.5)(0.5)/(0.03)² = 2.706(0.25)/0.0009 = 0.6765/0.0009 ≈ 751.67
Step 4: Round UP
Always round UP to ensure ME is no larger than desired: n = 752
Step 5: Why use p̂ = 0.5?
- The product p̂(1-p̂) is maximized at p̂ = 0.5
- This gives the most conservative (largest) sample size
- Guarantees ME ≤ 0.03 regardless of true p
Answer: n = 752
Need a sample of at least 752 items to achieve a margin of error no more than 0.03 at 90% confidence.
Problem 4 (medium)
❓ Question:
Two polls: Poll A (n=500, p̂=0.52) and Poll B (n=1000, p̂=0.51). Both use 95% confidence. Which poll has a smaller margin of error? Calculate both.
Solution:
Step 1: Recall margin of error formula
ME = z*√[p̂(1-p̂)/n]
For 95% CI: z* = 1.96
Step 2: Calculate ME for Poll A
p̂ = 0.52, n = 500
ME_A = 1.96√[0.52(0.48)/500] = 1.96√[0.2496/500] = 1.96√0.0004992 = 1.96(0.0223) ≈ 0.044
Step 3: Calculate ME for Poll B
p̂ = 0.51, n = 1000
ME_B = 1.96√[0.51(0.49)/1000] = 1.96√[0.2499/1000] = 1.96√0.0002499 = 1.96(0.0158) ≈ 0.031
Step 4: Compare
- Poll A: ME ≈ 0.044 or 4.4%
- Poll B: ME ≈ 0.031 or 3.1%
Poll B has smaller margin of error!
Step 5: Why is Poll B better?
- Larger sample size (1000 vs 500)
- ME ∝ 1/√n
- Doubling n reduces ME by factor of √2 ≈ 1.41
Check: 500 × 2 = 1000, so ME_A/ME_B ≈ √(1000/500) = √2 ≈ 1.41 (approximate because the two p̂ values differ slightly), and 0.044/0.031 ≈ 1.42 ✓
Step 6: Effect of p̂
- Poll B also has p̂ closer to 0.5
- But this increases ME only slightly
- Effect of larger n dominates
Answer: Poll B has smaller ME (0.031 vs 0.044)
Poll B's larger sample size (1000 vs 500) gives more precision despite having p̂ closer to 0.5.
Problem 5 (hard)
❓ Question:
Explain why the normal-based (one-proportion z) confidence interval is not valid when the sample proportion is 0 or 1.
Solution:
Step 1: Recall CI formula
CI = p̂ ± z*√[p̂(1-p̂)/n]
SE = √[p̂(1-p̂)/n]
Step 2: What happens when p̂ = 0?
SE = √[0(1)/n] = 0
CI = 0 ± 0 = (0, 0)
This says we're 100% certain p = 0. Unreasonable from a sample!
Step 3: What happens when p̂ = 1?
SE = √[1(0)/n] = 0
CI = 1 ± 0 = (1, 1)
This says we're 100% certain p = 1. Also unreasonable!
Step 4: Normal approximation fails
Need: np̂ ≥ 10 AND n(1-p̂) ≥ 10
- When p̂ = 0: np̂ = 0 < 10 ✗
- When p̂ = 1: n(1-p̂) = 0 < 10 ✗
Can't use normal-based method!
Step 5: What to do instead
Use: Wilson score interval, Agresti-Coull, or exact binomial methods
These give reasonable intervals even with extreme values
Answer: When p̂ = 0 or 1, SE = 0, giving a degenerate interval. Normal approximation conditions fail. Alternative methods should be used.
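As an illustration of one alternative, here is a sketch of the Wilson score interval. The data (0 successes in 50 trials) are hypothetical, and clamping the endpoints to [0, 1] is a choice made in this sketch to absorb floating-point rounding.

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z_star: float) -> tuple[float, float]:
    """Wilson score interval for a proportion; usable even when p_hat is 0 or 1."""
    p_hat = successes / n
    z2 = z_star ** 2
    denominator = 1 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denominator
    half_width = (z_star / denominator) * sqrt(
        p_hat * (1 - p_hat) / n + z2 / (4 * n ** 2)
    )
    # Clamp to [0, 1] so rounding error cannot push an endpoint outside the range.
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Hypothetical sample: 0 successes in 50 trials, 95% confidence.
low, high = wilson_interval(0, 50, 1.96)
print(f"({low:.3f}, {high:.3f})")
```

Unlike the degenerate (0, 0) from the normal-based formula, the upper endpoint here is about 0.071, so the interval still allows for a small nonzero p.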