Charles H. Franklin
University of Wisconsin, Madison
October 27, 2002
The “margin of error” for a poll is routinely reported. But frequently we want to know about the difference between two proportions (or percentages). Often the question concerns the difference between two responses to the same question within a single poll: for example, what is the lead of one candidate over another in an election poll? The second common question is whether a proportion has changed from one poll to the next: for example, has presidential approval increased since the previous poll? The margin of error for these differences is not the same as the margin of error for the poll, which is what virtually all polls routinely report. The margin of error for the poll applies to a single proportion, not to differences. This leads to considerable confusion among reporters and interpreters of polls.
This note explains the correct way to calculate the margin of error (and hence the “significance”) for differences of proportions in polls. There is also a “quick reference” section at the end that provides the formulas in a single spot.
1. The Margin of Error for a Single Proportion
The usual “margin of error” for a poll is the 95% confidence interval for an individual proportion.
The formula for the variance of a proportion, $p$, is
$$\text{Var}(p) = \frac{pq}{n-1}$$
where $q = 1 - p$. We usually use just $n$ in the denominator since $n$ is rather large for virtually all survey samples so the difference in $n$ and $n - 1$ is trivial. The standard error for the proportion is therefore
$$\text{se}(p) = \sqrt{\frac{pq}{n}}$$
The 95% confidence interval (usually called the “margin of error” of the poll) is $\pm 1.96 \times \text{se}(p)$, using the normal distribution approximation for large samples.
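The two formulas above translate directly into code. The following is a minimal sketch, not part of the original note; the function names `standard_error` and `margin_of_error` are hypothetical, and it assumes a simple random sample large enough for the normal approximation to hold.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion, se(p) = sqrt(p*q/n) with q = 1 - p.

    Uses n rather than n - 1 in the denominator, as the text does for large samples.
    (Hypothetical helper name, chosen for illustration.)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    q = 1.0 - p
    return math.sqrt(p * q / n)


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the confidence interval; z = 1.96 gives the 95% level."""
    return z * standard_error(p, n)


if __name__ == "__main__":
    # Illustrative values only: an evenly split response in a poll of 625.
    print(f"MOE for p = .5, n = 625: {margin_of_error(0.5, 625):.4f}")
```

With $p = .5$ and $n = 625$ the standard error is $.02$, so the sketch reports a margin of $1.96 \times .02 = .0392$, or about 4 points.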
The standard error depends on the proportion, $p$, and is at a maximum for $p = .5$, so a quick approximation of the widest confidence interval for a single proportion is
$$\text{CI}(p) = \pm 1.96 \times \sqrt{\frac{.5 \times .5}{n-1}} \approx \pm 2 \times \frac{.5}{\sqrt{n}} \approx \pm \frac{1}{\sqrt{n}}$$
This is usually what is reported as the margin of error for a poll. For example, if $n = 400$, $\frac{1}{\sqrt{400}} = .050$, a MOE of $\pm 5\%$. For $n = 625$, the MOE is $\frac{1}{\sqrt{625}} = .040$ and for $n = 1111$, the MOE is $\frac{1}{\sqrt{1111}} = .030$.
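To see how little is lost by the $1/\sqrt{n}$ shortcut, the sketch below compares it with the unrounded expression $1.96 \times \sqrt{.25/(n-1)}$ for the three sample sizes used in the text. The function names `quick_moe` and `exact_max_moe` are hypothetical and introduced only for this illustration.

```python
import math


def quick_moe(n: int) -> float:
    """Rule-of-thumb maximum margin of error, 1 / sqrt(n)."""
    return 1.0 / math.sqrt(n)


def exact_max_moe(n: int, z: float = 1.96) -> float:
    """Unrounded maximum margin at p = .5, using n - 1 in the denominator."""
    return z * math.sqrt(0.5 * 0.5 / (n - 1))


if __name__ == "__main__":
    # Sample sizes taken from the examples in the text.
    for n in (400, 625, 1111):
        print(f"n = {n:4d}  quick = {quick_moe(n):.3f}  exact = {exact_max_moe(n):.3f}")
```

For each of these sample sizes the shortcut overstates the unrounded value by less than a tenth of a percentage point, which is why the rounded figure is the one normally quoted.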
For proportions different from .5, the MOE is somewhat smaller. For example, if $p = .6$, then the MOE is approximately $2 \times \sqrt{\frac{.6 \times .4}{625}} = .039$, a trivial difference. As the responses become more skewed the MOE can be noticeably smaller, as for example if $p = .2$, $2 \times \sqrt{\frac{.2 \times .8}{625}} = .032$, or about a 3 point MOE compared to a 4 point margin for the $p = .5$ case. Still, unless we are looking at a highly skewed variable, these differences in the MOE are usually small enough to be ignored. Calculations for different distributions, such as these, are almost never reported in media accounts of polls.
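The dependence of the margin on $p$ can be tabulated with a few lines of code. This is an illustrative sketch only: the name `approx_moe` is hypothetical, and it uses the rounded multiplier 2 (instead of 1.96) to match the arithmetic in the paragraph above.

```python
import math


def approx_moe(p: float, n: int, z: float = 2.0) -> float:
    """Approximate margin of error, z * sqrt(p * (1 - p) / n), with z rounded to 2."""
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 625  # sample size used in the text's examples
    for p in (0.5, 0.6, 0.2):
        print(f"p = {p:.1f}  MOE = {approx_moe(p, n):.3f}")
```

Because $p(1-p)$ is symmetric around .5, a response at $p = .8$ has the same margin as one at $p = .2$; only the distance from an even split matters.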