********************************************************************************************************************** ** Example Stata code to replicate NCHS Data Brief No. 303, Figure 1 ** ** Figure 1. Percentage of persons aged 20 and over with depression, by age and sex: United States, 2013–2016 ** ** ** ** Brody DJ, Pratt LA, Hughes J. Prevalence of depression among adults aged 20 and over: United States, 2013–2016. ** ** NCHS Data Brief, no 303. Hyattsville, MD: National Center for Health Statistics. 2018. ** ********************************************************************************************************************** ** Note to tutorial users: you must update some lines of code (e.g. file paths) ** to run this code yourself. Search for comments labeled "TutorialUser" ** Display Stata Version Number ** version * Change working directory to a directory where we can save temporary files * * TutorialUser: Update this path to a valid location on your computer! cd "C:\Stata_workspace\" ** Download Demographic (DEMO) Data and Keep Variables Of Interest ** import sasxport "https://wwwn.cdc.gov/Nchs/Nhanes/2013-2014/DEMO_H.XPT", clear keep seqn riagendr ridageyr sdmvstra sdmvpsu wtmec2yr save "DEMO_H.dta", replace import sasxport "https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/DEMO_I.XPT", clear keep seqn riagendr ridageyr sdmvstra sdmvpsu wtmec2yr ** Append Files ** append using "DEMO_H.dta" save "DEMO.dta", replace ** Download Mental Health - Depression Screener (DPQ) Data ** import sasxport "https://wwwn.cdc.gov/Nchs/Nhanes/2013-2014/DPQ_H.XPT", clear save "DPQ_H.dta", replace import sasxport "https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/DPQ_I.XPT", clear ** Append Files ** append using "DPQ_H.dta" ** Merge Files ** merge 1:1 seqn using "DEMO.dta" ** Set Refused/Don't Know To Missing (for all variables that start with prefix dpq) ** recode dpq* (7/9 = .) ** Create Binary Depression Indicator as 0/100 variable ** ** note that the score will be missing if any of the items are missing ** gen Depression_Score = dpq010+dpq020+dpq030+dpq040+dpq050+dpq060+dpq070+dpq080+dpq090 recode Depression_Score (0/9 = 0) (10/27 = 100), generate(Depression_Indicator) ** Create a new variable with age categories: 20-39, 40-59, 60 and over ** recode ridageyr (0/19 = .) (20/39 = 1) (40/59 = 2) (60/80 = 3), generate(Age_Group) ** Labels for categorized variables ** label define Gender_Labels 1 "Male" 2 "Female" label values riagendr Gender_Labels label define Age_Labels 1 "20-39" 2 "40-59" 3 "60+" label values Age_Group Age_Labels ** Define analysis population: adults age 20 and over with a non-missing depression score gen inAnalysis=0 replace inAnalysis=1 if ridageyr >=20 & !missing(Depression_Indicator) ** Specify survey design variables and request Taylor linearized variance estimation ** ** Note: using the MEC Exam Weights (WTMEC2YR), per the analytic notes on the ** Mental Health - Depression Screener (DPQ_H) documentation ** Divide weight by 2 because we are appending 2 survey cycles for 2013-2014 and 2015-2016 gen wtmec4yr = wtmec2yr / 2 svyset [w=wtmec4yr], psu(sdmvpsu) strata(sdmvstra) vce(linearized) ** Sample Size (unweighted) by sex and age for analysis population ** tab riagendr Age_Group if inAnalysis ** Prevalence of depression ** svy, subpop(inAnalysis): mean Depression_Indicator ** Prevalence of depression by gender ** svy, subpop(inAnalysis): mean Depression_Indicator, over(riagendr) ** Compare prevalence of depression between men and women ** lincom [Depression_Indicator]Male - [Depression_Indicator]Female ** Prevalence of depression by age group ** svy, subpop(inAnalysis): mean Depression_Indicator, over(Age_Group) ** Pairwise Comparison Of Age Groups ** lincom [Depression_Indicator]_subpop_1 - [Depression_Indicator]_subpop_2 // 20-39 vs. 40-59 lincom [Depression_Indicator]_subpop_1 - [Depression_Indicator]_subpop_3 // 20-39 vs. 60 and over lincom [Depression_Indicator]_subpop_2 - [Depression_Indicator]_subpop_3 // 40-59 vs. 60 and over ** Prevalence By Gender And Age Group ** svy, subpop(inAnalysis): mean Depression_Indicator, over(riagendr Age_Group) ** Compare Prevalence Between Men And Women By Age Group ** lincom [Depression_Indicator]_subpop_1 - [Depression_Indicator]_subpop_4 // men vs. women: aged 20-39 lincom [Depression_Indicator]_subpop_2 - [Depression_Indicator]_subpop_5 // men vs. women: aged 40-59 lincom [Depression_Indicator]_subpop_3 - [Depression_Indicator]_subpop_6 // men vs. women: aged 60 and over ** Pairwise Comparison of Age Groups By Gender ** lincom [Depression_Indicator]_subpop_1 - [Depression_Indicator]_subpop_2 // 20-39 vs. 40-59 : men lincom [Depression_Indicator]_subpop_1 - [Depression_Indicator]_subpop_3 // 20-39 vs. 60 and over : men lincom [Depression_Indicator]_subpop_2 - [Depression_Indicator]_subpop_3 // 40-59 vs. 60 and over : men lincom [Depression_Indicator]_subpop_4 - [Depression_Indicator]_subpop_5 // 20-39 vs. 40-59 : women lincom [Depression_Indicator]_subpop_4 - [Depression_Indicator]_subpop_6 // 20-39 vs. 60 and over : women lincom [Depression_Indicator]_subpop_5 - [Depression_Indicator]_subpop_6 // 40-59 vs. 60 and over : women ************************************************************ ** Alternative method of testing: pairwise comparisons on a "cell means model" from the reg command ** ** Prevalence By Gender And Age Group ** * specify ibn. for each factor variable and the noconstant option to include all levels of categorical variables in the model * svy, subpop(inAnalysis): reg Depression_Indicator ibn.Age_Group#ibn.riagendr, noconstant ** Pairwise comparison of age groups, among men (riagendr=1) and women (riagendr=2) ** pwcompare Age_Group#1.riagendr, pveffects pwcompare Age_Group#2.riagendr, pveffects ** Pairwise comparison by gender, for each age group *; pwcompare riagendr#1.Age_Group, pveffects pwcompare riagendr#2.Age_Group, pveffects pwcompare riagendr#3.Age_Group, pveffects