Confidence intervals (CIs) also provide information about the reliability of an estimate. Conceptually, if samples are drawn repeatedly from the same population and a proportion and its 95% confidence interval are estimated from each sample, about 95% of the calculated intervals are expected to contain the true value of the proportion.
The Wald confidence interval [\(\hat{p} \pm 1.96 \times \widehat{\operatorname{SE}}(\hat{p})\) for a two-sided 95% CI] is commonly produced as the default CI method by statistical software but is known to have limitations for proportions. Because proportions are bounded by [0, 1], the lower and upper bounds of a CI should also fall within that range. However, the Wald CI may produce negative lower bounds for proportions near zero and upper bounds greater than one for proportions near one. In addition, the Wald confidence interval may be too narrow; simulation studies have shown that fewer than 95% of simulated 95% Wald CIs contain the true proportion. The undercoverage is worse for proportions close to zero or one.
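Both limitations can be illustrated with a minimal Python sketch. It assumes simple random sampling, so the standard error is \(\sqrt{\hat{p}(1-\hat{p})/n}\); for a complex survey the standard error would come from a design-based variance estimator instead. The function names `srs_se`, `wald_ci`, and `wald_coverage` are hypothetical and used only for illustration.

```python
import math


def srs_se(p_hat, n):
    """Standard error of a proportion under simple random sampling (assumption)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def wald_ci(p_hat, se, z=1.96):
    """Two-sided Wald interval: p_hat +/- z * SE. No truncation to [0, 1]."""
    return p_hat - z * se, p_hat + z * se


def wald_coverage(p_true, n, z=1.96):
    """Exact coverage of the Wald interval for a binomial(n, p_true) sample.

    Sums the binomial probability of every outcome k whose Wald interval
    contains p_true.
    """
    covered = 0.0
    for k in range(n + 1):
        p_hat = k / n
        lower, upper = wald_ci(p_hat, srs_se(p_hat, n), z)
        if lower <= p_true <= upper:
            covered += math.comb(n, k) * p_true**k * (1.0 - p_true) ** (n - k)
    return covered


# A small proportion gives a negative lower bound ...
print(wald_ci(0.02, srs_se(0.02, 50)))
# ... a large proportion gives an upper bound above one ...
print(wald_ci(0.98, srs_se(0.98, 50)))
# ... and the actual coverage for a rare outcome falls short of the nominal 95%.
print(wald_coverage(0.02, 50))
```

In this setting, a sample with no positive responses yields a zero-width interval at zero, which is one reason coverage drops so sharply for rare outcomes.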
Several alternative methods have been proposed for calculating confidence intervals for estimated proportions, including those from complex surveys. Analysts are advised to consider the properties of the proportion and the analytic goals when selecting an approach. The Data Presentation Standards for Proportions include criteria based on the absolute width and the relative width of the Clopper-Pearson confidence interval, which was adapted for complex surveys by Korn and Graubard.
The calculation of the Korn and Graubard CI depends on the degrees of freedom. For proportions estimated for a subgroup, the degrees of freedom should be calculated as (the number of PSUs with sampled observations in the subgroup of interest) minus (the number of strata with sampled observations in the subgroup of interest). For subgroups that are not represented in all primary sampling units (PSUs) or strata (e.g., some racial and ethnic groups), the degrees of freedom will therefore be lower than the degrees of freedom available for the overall estimates. The default calculations from most statistical software packages do not properly account for this reduction. To account for it, analysts may need to output the number of strata and the number of PSUs available for each subgroup, either from the survey procedure or from a separate tabulation, into a data set that can be used to calculate the Korn and Graubard CIs outside the procedure. Sample code for calculating Korn and Graubard confidence intervals is provided in the code examples.
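The subgroup degrees-of-freedom rule can be sketched as a simple tabulation. The following Python example is illustrative only: the record layout (keys `stratum`, `psu`, `group`) and the function name `design_degrees_of_freedom` are assumptions, not variable names from any particular survey file. PSUs are identified by the (stratum, PSU) pair, on the assumption that PSU codes repeat across strata.

```python
def design_degrees_of_freedom(records, subgroup=None):
    """Return (number of PSUs) - (number of strata) with sampled observations.

    records  : iterable of dicts with hypothetical keys 'stratum', 'psu', 'group'
    subgroup : if given, only records whose 'group' equals this value are counted
    """
    strata = set()
    psus = set()
    for rec in records:
        if subgroup is not None and rec["group"] != subgroup:
            continue
        strata.add(rec["stratum"])
        # PSU codes are assumed to be nested within strata.
        psus.add((rec["stratum"], rec["psu"]))
    return len(psus) - len(strata)


# Toy sample: 3 strata with 2 PSUs each; group "B" appears in only 3 PSUs.
sample = [
    {"stratum": 1, "psu": 1, "group": "A"},
    {"stratum": 1, "psu": 1, "group": "B"},
    {"stratum": 1, "psu": 2, "group": "A"},
    {"stratum": 1, "psu": 2, "group": "B"},
    {"stratum": 2, "psu": 1, "group": "A"},
    {"stratum": 2, "psu": 2, "group": "B"},
    {"stratum": 3, "psu": 1, "group": "A"},
    {"stratum": 3, "psu": 2, "group": "A"},
]

print(design_degrees_of_freedom(sample))                # all records
print(design_degrees_of_freedom(sample, subgroup="B"))  # subgroup only
```

Here the full sample has 6 PSUs in 3 strata, while group "B" is observed in 3 PSUs across 2 strata, so the subgroup has fewer degrees of freedom than the overall estimate.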
Formulas for Korn and Graubard Confidence Limits
Korn and Graubard (KG) confidence limits are a modification of the Clopper-Pearson ("exact") confidence limits for a binomial proportion, adapted for use with complex survey data. Where the Clopper-Pearson calculation uses the sample size (or "number of trials"), the KG confidence limit substitutes a degrees-of-freedom adjusted effective sample size \( n_e^* \). Where the Clopper-Pearson calculation uses the number of positive responses or "successes", the KG confidence limit substitutes the adjusted effective sample size times the (weighted) estimated proportion \(n_e^*\hat{p}\).
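The Korn and Graubard confidence limits for a proportion (lower confidence limit \(P_L\) and upper confidence limit \(P_U\)) can be formulated in terms of quantiles of the \(F\) distribution. Writing \(x = n_e^*\hat{p}\) for the adjusted number of positive responses:
\[
P_L = \left[1 + \frac{n_e^* - x + 1}{x \, F\left(\alpha/2,\; 2x,\; 2(n_e^* - x + 1)\right)}\right]^{-1},
\qquad
P_U = \left[1 + \frac{n_e^* - x}{(x + 1) \, F\left(1 - \alpha/2,\; 2(x + 1),\; 2(n_e^* - x)\right)}\right]^{-1}
\]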
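where \(F(\alpha/2, b, c)\) is the \(\alpha/2\) quantile of the \(F\) distribution with \(b\) and \(c\) degrees of freedom. With \(t_k(1-\alpha/2)\) denoting the \((1-\alpha/2)\) quantile of the \(t\) distribution with \(k\) degrees of freedom, and \(df\) the degrees of freedom described above, the degrees-of-freedom adjusted effective sample size \(n_e^*\) is defined as:
\[
n_e^* = \min\left(n,\; \frac{n}{\text{DEFF}} \left[\frac{t_{n-1}(1-\alpha/2)}{t_{df}(1-\alpha/2)}\right]^2\right)
\]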
where the design effect \(\text{DEFF}\) is defined above in the section "Sample Size and Effective Sample Size." If the estimated proportion is age-adjusted, then the formula for the design effect for an age-adjusted proportion should be used.
Note that the degrees-of-freedom adjusted effective sample size \(n_e^*\) is capped at the actual sample size \(n\). This cap could be binding for subgroups where the estimated design effect is less than one.
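The calculation described in this section can be sketched in Python as follows. This is an illustrative sketch rather than official sample code: it assumes SciPy is available (`scipy.stats.f.ppf` and `scipy.stats.t.ppf` for the quantiles), it takes the design effect and the degrees of freedom as inputs computed elsewhere, and the function names `adjusted_effective_n` and `korn_graubard_ci` are hypothetical. The limits are set to 0 and 1 when the estimated proportion is 0 or 1, following the usual Clopper-Pearson convention.

```python
from scipy.stats import f, t


def adjusted_effective_n(n, deff, df, alpha=0.05):
    """Degrees-of-freedom adjusted effective sample size, capped at n."""
    q = 1.0 - alpha / 2.0
    t_ratio = t.ppf(q, n - 1) / t.ppf(q, df)
    return min(n, (n / deff) * t_ratio**2)


def korn_graubard_ci(p_hat, n, deff, df, alpha=0.05):
    """Korn and Graubard confidence limits for an estimated proportion.

    p_hat : weighted estimate of the proportion
    n     : actual (unweighted) sample size
    deff  : design effect, computed elsewhere
    df    : degrees of freedom (PSUs minus strata for the subgroup)
    """
    n_adj = adjusted_effective_n(n, deff, df, alpha)
    x = n_adj * p_hat  # adjusted number of positive responses

    if x <= 0:
        lower = 0.0
    else:
        f_lo = f.ppf(alpha / 2.0, 2.0 * x, 2.0 * (n_adj - x + 1.0))
        lower = 1.0 / (1.0 + (n_adj - x + 1.0) / (x * f_lo))

    if x >= n_adj:
        upper = 1.0
    else:
        f_hi = f.ppf(1.0 - alpha / 2.0, 2.0 * (x + 1.0), 2.0 * (n_adj - x))
        upper = 1.0 / (1.0 + (n_adj - x) / ((x + 1.0) * f_hi))

    return lower, upper


# Hypothetical inputs: 10% prevalence, n = 500, DEFF = 2, 15 degrees of freedom.
print(korn_graubard_ci(0.10, 500, 2.0, 15))
```

With all other inputs held fixed, fewer degrees of freedom give a smaller adjusted effective sample size and therefore a wider interval, which is why the subgroup-specific degrees of freedom matter.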
References
Brown LD, Cai TT, Dasgupta A. "Interval estimation for a binomial proportion." Stat Sci 16(2):101–17. 2001.
Clopper CJ, Pearson ES. "The use of confidence or fiducial limits illustrated in the case of the binomial." Biometrika 26(4):404–13. 1934.
Dean N, Pagano M. "Evaluating confidence interval methods for binomial proportions in clustered surveys." J Surv Stat Methodol 3(4):484–503. 2015.
Korn EL, Graubard BI. "Confidence intervals for proportions with small expected number of positive counts estimated from survey data." Surv Methodol 24(2):193–201. 1998.
Graubard BI, Korn EL. "Survey inference for subpopulations." Am J Epidemiol 144(1):102–6. 1996.
Newcombe RG. "Two-sided confidence intervals for the single proportion: Comparison of seven methods." Stat Med 17(8):857–72. 1998.
“The SURVEYFREQ Procedure: Confidence Limits for Proportions.” SAS Institute Inc. 2018. SAS/STAT® 15.1 User’s Guide. Cary, NC: SAS Institute Inc.