The objective of the dietary interview component is to obtain detailed dietary intake information from the NHANES participants. The dietary intake data are used to estimate the types and amounts of foods and beverages consumed during the 24-hour period prior to the interview (midnight to midnight), and to estimate intakes of energy, nutrients, and other food components from those foods and beverages. Following the dietary recall, respondents are asked questions on water consumption during the previous 24-hour period, salt use, and whether the person’s intake on the previous day was usual or unusual. Children 1 to 5 years old and women 16 to 49 years old are also asked about their frequency of fish and shellfish consumption during the past 30 days.
This release of the dietary intake data represents, for the first time, the integration of two nationwide dietary intake surveys - USDA’s Continuing Survey of Food Intakes by Individuals (CSFII) and DHHS’s National Health and Nutrition Examination Survey (NHANES). This new integrated dietary component is collected as part of NHANES and is called What We Eat in America. Under the integrated framework, DHHS is responsible for the sample design and data collection and USDA is responsible for the survey’s dietary data collection methodology, maintenance of the databases used to code and process the data, and data review and processing.
Survey integration of dietary data collection began in NHANES 2002. Because NHANES is on a two-year data release cycle, this first release of the integrated survey includes dietary data collected in 2001 from NHANES plus data from the integrated survey collected in 2002. Collection and processing procedures for the two years were similar. Differences between the two years, along with steps taken to reconcile these differences, are discussed throughout this document.
Data collection for the What We Eat in America 2002 also included a second day recall that was collected by telephone. Because of confidentiality issues concerning the release of single-year data from NHANES, dietary data for the 2002 Day 2 telephone interview will not be publicly released. Only Day 1 interview data are included in the present release.
Restricted data, such as the 2002 Day 2 dietary data, may be made available at the Research Data Center located at the National Center for Health Statistics (NCHS) headquarters in Hyattsville, MD. A research proposal for using the restricted data must be submitted to NCHS for review and approval. Instructions for requesting use of these data are available at: https://www.cdc.gov/rdc/.
Two data files were produced from the Day 1 dietary interview for this release:
- Total Nutrient Intakes File (DRXTOT_B) that consists of daily total nutrient intakes from foods and beverages, total amount of water consumed, and frequency of fish and shellfish consumption for survey participants.
- Individual Foods File (DRXIFF_B) that includes detailed information about the type and amount of individual foods reported by each respondent, as well as amounts of nutrients from each food.
Nutrient intakes reported in these files do not include those obtained from dietary supplements, medications or plain drinking water.
This document provides additional details important to understanding the content of the Individual Foods File (DRXIFF_B). This file includes one record per food for each survey participant. Each record contains food-specific data (food code, food amount, time, eating occasion) and amounts of nutrients from each food in units appropriate to the nutrient. Food records are sequentially numbered.
Text descriptions for each food record are provided in a separate data file called Food Code Format File (DRXFMT_B). SAS code to link the Food Code Format File with the Individual Foods File or to obtain a list of formatted text labels of the food codes is provided in the documentation for the DRXFMT_B file. Expanded food descriptions can be found in the food descriptions component of the USDA Food and Nutrient Database for Dietary Studies (FNDDS). The FNDDS is available for free download from the Food Surveys Research Group (FSRG) website (http://www.ars.usda.gov/ba/bhnrc/fsrg).
All NHANES examined survey participants are eligible for the dietary interview component. However, several questions that follow the 24-hour recall are only asked of a subset of survey participants. Frequency of fish and shellfish consumption is only reported for children 1-5 years old and women 16-49 years of age, and information on the use of table salt is only reported for participants 1 year and older.
Protocol and Procedure
The examination protocol and data collection methods are fully documented in the NHANES Dietary Interviewers Procedures Manuals (In-person interview: https://wwwn.cdc.gov/nchs/data/nhanes/2001-2002/manuals/dietary_year_3.pdf; phone-follow-up interview: https://wwwn.cdc.gov/nchs/data/nhanes/2001-2002/manuals/phone_followup_year_3.pdf).
Proxy interviews were conducted for survey participants less than six years of age. Assisted interviews were conducted with survey participants 6 to 11 years of age. Dietary interviews were conducted in English and Spanish. Translators were used to conduct interviews in other languages.
The in-person interview was conducted in a private room in the NHANES mobile examination center (MEC). A set of standard measuring guides was available in the MEC dietary interview room for the respondent to use for reporting amounts of foods.
In 2001, dietary intake data were collected using the NHANES computer-assisted dietary interview system (CADI), which was also used to collect dietary data for the 1999-2000 collection period. The CADI is a multiple-pass recall method that provides instructions to interviewers for recording information about foods. Additional information about the CADI system is provided in the NHANES 1999-2000 Dietary Interviewers Procedures Manual (https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=1999).
In 2002, What We Eat in America data were collected using USDA’s dietary data collection instrument, the Automated Multiple Pass Method (AMPM) (http://www.ars.usda.gov/ba/bhnrc/fsrg). The AMPM was designed to provide a more efficient and accurate means of collecting intakes for large-scale national surveys. The AMPM is a fully computerized recall method that includes an extensive compilation of standardized food-specific questions and possible response options. It features automated routing of questions based on previous answers. The AMPM is updated yearly to capture the changing food supply and to address research needs from the data user community. Additional information about the AMPM is provided in Raper et al. (2004).
Quality Assurance & Quality Control
All dietary interviewers had to complete an intensive one-week training course followed by supervised practice interviews before working independently in the field. Periodic and annual retraining sessions were conducted to reinforce proper protocols and techniques.
Interviewers were monitored throughout the data collection period. Monitoring consisted of the following:
- Reviews of data transmittal sheets to verify receipt of data files.
- Reviews of audiotaped interviews for approximately 5% of each interviewer’s work; in-person observations were also conducted periodically.
- Checks of interviews for completeness of the recalls, missing information, inconsistent reports, and unclear notes. Written notification and feedback were provided to the interviewers.
In 2001, interviewers reviewed each interview after completion, performing appropriate edits. Interviewers were not required to review interviews collected in 2002 using the USDA’s AMPM because quality control features are built into the software. Incorrect entries are minimized due to automated routing of questions and built-in edits.
Data Processing and Editing
Two similar systems were used to code the intake data for 2001 and 2002. The University of Texas Food Intake Analysis System (FIAS, version 3.99) was used for coding intakes for 2001. For 2002, interview files were imported into Survey Net, a computer-assisted food coding and data management system developed by USDA (Raper et al., 2004). FIAS is a general-use version of the Survey Net software that is available to researchers through the University of Texas.
The USDA FNDDS, version 1, was used for processing the intakes for 2001-2002. The FNDDS includes comprehensive information that can be used to code individual foods and portion sizes and contains nutrient values for calculating nutrient intakes. The FNDDS is available for use in research projects using the 2001-2002 food intake data and in other food intake studies, as well. Additional information (Raper et al., 2004; Bodner-Montville et al., 2006) about the FNDDS and related tools is available at the FSRG website.
Coders were monitored to ensure quality and completeness. Approximately 10 percent of each coder’s work was double-coded and adjudicated, if necessary.
After intake data were coded, various types of reviews were conducted to ensure the quality of the data, including:
- Overall acceptability of each recall. This review determined if the recall met minimum criteria.
Minimum criteria for the 2001 data collection included the following:
- Less than 25% of the foods with missing descriptive information.
- Less than 15% of the foods with missing amounts.
- Any meal reported must have at least one identified food.
Minimum criteria for 2002 data collection included the following:
- The first 4 steps of the 5-step AMPM were completed. Because the AMPM includes automated routing of questions, missing descriptive or amount information cannot occur.
- Foods consumed for each reported meal must be identified.
- Interviewers’ and coders’ questions and comments were reviewed to ensure that they had been accounted for in coding.
- Resolution of unknown foods or food quantities that were reported by respondents but could not be coded to foods in the database.
- Specific edit checks for reasonableness, consistency, and logic. Examples are meals reported at unusual times and extremely large quantities of foods.
- Intakes with extreme levels for individual nutrients.
An overview of quality assurance procedures conducted during the data processing stage is available at the FSRG website (Anand et al., 2006).
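The percentage thresholds in the minimum criteria above (which can only matter for CADI-collected recalls, since the AMPM rules out missing descriptions and amounts) can be sketched as a simple check. This is a minimal illustration, not the procedure actually used during processing; the dictionary keys, the helper make_food, and the treatment of an empty recall are all assumptions made for the example.

```python
def meets_percentage_criteria(foods):
    """Return True if a recall passes the three checks described in the text.

    `foods` is a list of dicts, one per reported food, with hypothetical keys:
      "meal":            label of the eating occasion
      "identified":      True if the food could be identified
      "has_description": False if descriptive information is missing
      "has_amount":      False if the amount is missing
    """
    if not foods:
        # Assumption: a recall with no foods is treated as failing.
        return False

    n = len(foods)
    missing_desc = sum(1 for f in foods if not f["has_description"])
    missing_amount = sum(1 for f in foods if not f["has_amount"])

    # "Less than 25%" and "less than 15%", compared as integers
    # so that boundary cases are not affected by float rounding.
    if missing_desc * 100 >= 25 * n:
        return False
    if missing_amount * 100 >= 15 * n:
        return False

    # Every reported meal must have at least one identified food.
    meal_has_identified = {}
    for f in foods:
        previous = meal_has_identified.get(f["meal"], False)
        meal_has_identified[f["meal"]] = previous or f["identified"]
    return all(meal_has_identified.values())


def make_food(meal, identified=True, has_description=True, has_amount=True):
    """Hypothetical helper that builds one food record for the example."""
    return {
        "meal": meal,
        "identified": identified,
        "has_description": has_description,
        "has_amount": has_amount,
    }


# 10 foods, 1 with a missing description (10%): passes.
passing_recall = [make_food("breakfast") for _ in range(9)]
passing_recall.append(make_food("dinner", has_description=False))

# 4 foods, 1 with a missing description (exactly 25%): fails "less than 25%".
boundary_recall = [make_food("lunch") for _ in range(3)]
boundary_recall.append(make_food("lunch", has_description=False))
```

Note that a recall sitting exactly on a threshold fails, because the criteria are stated as "less than".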
During data processing, the following edits were made to ensure the logical consistency and analytic usefulness of the data:
- Adjusted sodium values for certain foods
Sodium values for home-prepared foods are based on the sodium values of recipe ingredients in the FNDDS. The amount of salt in recipes was reduced, or eliminated, in some cases based on questions about salt use in the dietary interview.
- Derived “eaten at home” variable (DRD040Z)
The question “Was this meal/snack eaten at home?” was included in the 2002 interview, but not in the 2001 interview. The answer to this question in 2001 was derived from the answer to a question about where each food was consumed. If the answer was anything other than “home” for any food reported in an eating occasion, then the “eaten at home” variable was coded as “no”.
- Foods eaten in combination
Foods eaten in combination with other foods, such as cereal with milk, are flagged with a combination food number (DRXCCMNM). Foods flagged with the same combination food number at a given meal were eaten together. Foods are categorized by a combination type code (DRDCCMTZ).
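Two of the edits above lend themselves to a short sketch: deriving the "eaten at home" answer for a 2001 eating occasion, and grouping foods that share a combination food number. The code below is illustrative only. The place labels, the "meal" and "name" keys, and the use of None for a food that is not part of a combination are assumptions; only SEQN and DRXCCMNM are names taken from the files.

```python
from collections import defaultdict


def derive_eaten_at_home(places):
    """Derive the occasion-level answer from per-food places of consumption.

    `places` holds one label per food in a single eating occasion.
    Following the rule in the text, any label other than "home" gives "no".
    """
    if not places:
        raise ValueError("an eating occasion needs at least one food")
    return "yes" if all(place == "home" for place in places) else "no"


def group_combinations(foods):
    """Group food names by participant, meal, and combination food number."""
    groups = defaultdict(list)
    for food in foods:
        number = food["DRXCCMNM"]
        if number is None:
            # Assumption: None marks a food that was not eaten in combination.
            continue
        groups[(food["SEQN"], food["meal"], number)].append(food["name"])
    return dict(groups)


# Made-up records for one participant.
foods = [
    {"SEQN": 1, "meal": "breakfast", "name": "cereal", "DRXCCMNM": 1},
    {"SEQN": 1, "meal": "breakfast", "name": "milk", "DRXCCMNM": 1},
    {"SEQN": 1, "meal": "breakfast", "name": "orange juice", "DRXCCMNM": None},
    {"SEQN": 1, "meal": "lunch", "name": "bread", "DRXCCMNM": 1},
]

combos = group_combinations(foods)
all_home = derive_eaten_at_home(["home", "home"])
mixed = derive_eaten_at_home(["home", "work"])
```

The meal is part of the grouping key because the text states that a shared combination number indicates foods eaten together only "at a given meal"; the same number at a different meal is a different combination.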
The analytic guidelines provided with the 2001-2002 NHANES data release recommend combining 2-year cycles, such as 1999-2000 and 2001-2002, to increase sample size and analytic options. However, the guidelines also advise that the user should verify that data items were collected and reported in a comparable manner in all combined years. Thus, before combining the 1999-2000 and 2001-2002 dietary data, researchers should carefully consider the following information. Between these two time periods, nutrient values for many foods were revised, based on improvements in sampling and analyzing foods. Also, values for new nutrients and food components became available, and units of expression for some existing nutrients were changed. NHANES 2001-2002 nutrient intakes were calculated using USDA’s FNDDS version 1.0, which contains the most up-to-date food composition values available for this time frame. NHANES 1999-2000 nutrient intakes were calculated using an earlier version of the database, the USDA 1994-1998 Survey Nutrient Database. Thus, analyzing merged intake data for these two data sets should be carefully considered for each nutrient. Analyses conducted based on changes in the nutrient databases show that the impact can be significant depending on the nutrient (Anderson et al., 2001; Ahuja et al., 2006).
The Individual Foods File consists of food records. In most cases, there are multiple records in the file per survey participant. This file can be linked with other NHANES files by the respondent sequence number (SEQN).
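Because the file holds several records per participant, linking it to a person-level file usually means first summarizing the food records by SEQN. The following is a minimal sketch of that pattern using made-up records; "energy_kcal", "age_years", and "total_energy_kcal" are illustrative field names, not variables from the NHANES files.

```python
from collections import defaultdict

# Made-up food records: one dict per reported food.
food_records = [
    {"SEQN": 1, "energy_kcal": 150.0},
    {"SEQN": 1, "energy_kcal": 320.5},
    {"SEQN": 2, "energy_kcal": 95.0},
]

# Made-up person-level records from another file: one dict per participant.
person_records = [
    {"SEQN": 1, "age_years": 34},
    {"SEQN": 2, "age_years": 7},
    {"SEQN": 3, "age_years": 51},  # no food records for this participant
]


def total_by_seqn(records, field):
    """Sum one numeric field over all food records of each participant."""
    totals = defaultdict(float)
    for record in records:
        totals[record["SEQN"]] += record[field]
    return dict(totals)


def link_on_seqn(persons, totals, new_field):
    """Attach the per-participant total to each person record (left join)."""
    linked = []
    for person in persons:
        row = dict(person)
        # Participants without food records get None rather than zero.
        row[new_field] = totals.get(person["SEQN"])
        linked.append(row)
    return linked


energy_totals = total_by_seqn(food_records, "energy_kcal")
linked = link_on_seqn(person_records, energy_totals, "total_energy_kcal")
```

Keeping None for participants without food records, rather than zero, avoids confusing "no data" with "no intake".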
A status code (DRDDRSTZ) is used in the NHANES 2001-2002 dietary interview component to indicate the quality and completeness of the response to the dietary recall section. The dietary recall section status is coded as one of the following:
- Reliable and met the minimum criteria
- Not reliable or did not meet the minimum criteria
No data on total nutrient intake or individual food consumption are provided for these cases.
- Reported consuming breast-milk
Human milk was reported in some dietary recalls. Few respondents could quantify the human milk intake for their breast-fed infants/children. For those cases, no total nutrient intakes were derived. The foods consumed by nursing infants and children are reported in the Individual Foods File.
- Not Done
The dietary recall section of the interview did not take place due to various reasons (such as came late/left early, refusal, illness, emergency, or equipment failure).
In 2002, a question on the source of each food (where it was obtained, such as store, fast food restaurant, cafeteria) was asked. Because this question was not asked in 2001, the source of each food for 2002 dietary interviews will not be publicly released but may be accessible through the NCHS Research Data Center. Instructions for requesting use of these data at the NCHS Research Data Center are available at https://www.cdc.gov/rdc/.
Sample weights for dietary intake data: The NHANES participants were selected on the basis of a national probability design. In order to increase the number of participants for specific demographic groups, a multi-stage, unequal probability of selection design was implemented. The NHANES oversamples blacks, Mexican Americans, low income whites, adolescents 12-19 years, and persons 60 years and older. Sample weights are constructed that encompass the unequal probabilities of selection, as well as adjustments for non-participation by selected sample persons. To produce nationally representative estimates, the appropriate sample weights must be used.
For the 2001-2002 NHANES, there were 13,156 persons selected; of these, 10,477 were considered respondents to the MEC examination and data collection. However, only 9,883 of the MEC respondents provided complete dietary intakes.
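The counts above can be turned into simple unweighted proportions. The short calculation below is only arithmetic on the three figures given in the text; it is not an official response-rate computation, which may use different definitions.

```python
# Counts taken from the text.
selected = 13156
mec_respondents = 10477
dietary_complete = 9883


def percent(part, whole):
    """Unweighted share of `whole` represented by `part`, in percent."""
    return round(100.0 * part / whole, 1)


mec_share_of_selected = percent(mec_respondents, selected)
dietary_share_of_mec = percent(dietary_complete, mec_respondents)
dietary_share_of_selected = percent(dietary_complete, selected)
```

On these figures, about 79.6% of selected persons were MEC respondents, about 94.3% of MEC respondents provided complete dietary intakes, and about 75.1% of selected persons have complete dietary data.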
Most analyses of NHANES data use data collected in the MEC and the variable WTMEC2YR should be used for the sample weights. However, for the dietary data, different sample weights are recommended for analysis. Although attempts are made to schedule MEC exams uniformly throughout the week, proportionally more exams occur on weekend days than on weekdays. Because food intake can vary by day of week, use of the MEC weights would disproportionately represent intakes on weekends.
A set of weights WTDRD1 is provided that should be used when an analysis uses the NHANES 2001-2002 dietary recall data (either alone or when nutrient data are used in conjunction with MEC data). The set of weights WTDRD1 is applicable to the 9,883 respondents with dietary data. The weights WTDRD1 were constructed by taking the MEC sample weights (WTMEC2YR) and further adjusting for (a) the additional non-response and (b) the differential allocation by day of the week for the dietary intake data collection. These weights are more variable than the MEC weights, and the sample size is smaller, so estimated standard errors using dietary data and dietary weights are larger than standard errors for similar estimates based on MEC weights. In addition, a set of four-year dietary weights (WTDR4YR) is also provided that should be used for the combined analyses of NHANES 1999-2000 and NHANES 2001-2002 data.
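As a rough illustration of what using the dietary weights means for a point estimate, the sketch below computes a weighted mean with made-up intake values and made-up weights standing in for WTDRD1. It covers the point estimate only; standard errors for NHANES data require methods that account for the complex sample design, which this example does not attempt.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(value * weight) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up energy intakes (kcal) and made-up dietary weights for three people.
energy = [2000.0, 1500.0, 3000.0]
wtdrd1 = [1000.0, 3000.0, 1000.0]

weighted = weighted_mean(energy, wtdrd1)
unweighted = sum(energy) / len(energy)
```

Here the second person represents three times as many people as each of the others, so the weighted mean (1900 kcal) is pulled toward that person's lower intake and falls below the unweighted mean.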
Note that all sample weights are person-level weights and each set of weights should add to the same population control total as the MEC weights (WTMEC2YR). In addition, the MEC weights (WTMEC2YR) are appropriate for use in the analysis of the fish and shellfish consumption data (i.e., variables DRD340, DRD350A-K, DRD350AQ-JQ, DRD360, DRD370A-V, and DRD370AQ-UQ) and the use of table salt data (i.e., variables DBQ095 and DBD100) located in the Total Nutrient Intake File (DRXTOT_B), if no other dietary data are included in the analysis. Additional explanation of sample weights and appropriate uses are included in the NHANES Analytic Guidelines (https://wwwn.cdc.gov/nchs/nhanes/analyticguidelines.aspx). Please also refer to the on-line NHANES Tutorial (https://www.cdc.gov/nchs/tutorials/) for further details on other analytic issues.
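The statement that each set of weights adds to the same population control total can be illustrated with a single ratio adjustment for non-response. This is a deliberately simplified sketch with made-up numbers and is not the procedure used to construct WTDRD1, which also adjusts for day of the week; the function name and the use of None for non-respondents are assumptions.

```python
def ratio_adjust(weights, responded):
    """Scale respondents' weights so they sum to the full-sample total.

    Non-respondents receive None, since they carry no adjusted weight.
    """
    control_total = sum(weights)
    respondent_total = sum(w for w, r in zip(weights, responded) if r)
    if respondent_total == 0:
        raise ValueError("at least one respondent is required")
    factor = control_total / respondent_total
    return [w * factor if r else None for w, r in zip(weights, responded)]


# Made-up examination weights for four people; the last one has no dietary data.
mec_weights = [100.0, 300.0, 400.0, 200.0]
has_dietary_data = [True, True, True, False]

dietary_weights = ratio_adjust(mec_weights, has_dietary_data)
adjusted_total = sum(w for w in dietary_weights if w is not None)
```

In this example the three respondents' weights are each multiplied by 1.25, so the adjusted weights still sum to 1000, the total of the original four weights.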
Appendix 1: Dietary Interview - Food Code Format File (DRXFMT_B)
This dataset is a technical support file for the Individual Foods File (DRXIFF_B) of the dietary interview component. It provides text descriptions for the food codes used in the Individual Foods File. The source of the text descriptions used in this file is the USDA Food and Nutrient Database for Dietary Studies (FNDDS), version 1. Please refer to the documentation for the Individual Foods File for detailed information on the dietary interview component and related dietary data files.
The Food Code Format File was created for linking the text descriptions with the food codes used in the Individual Foods File. There are three variables included in the file:
1) FMTNAME: a text field encoding the name of the key variable (i.e., DRDIFDCD) used to link with the Individual Foods File;
2) START: the numeric value of the USDA food codes;
3) VALUE: the text description for the corresponding food code.
The following example SAS code associates the text descriptions with the food codes using Proc Format:
Assuming that the individual foods file (DRXIFF_B) and the Format file (DRXFMT_B) have been copied to a SAS library NHANES:
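/* Search the NHANES library for user-defined formats */
Options FmtSearch = (NHANES);
/* Build the DRDIFDCD format from the Food Code Format File */
Proc Format CntlIn=NHANES.DRXFMT_B Library=NHANES; Run;
/* Permanently attach the format to the food code variable */
Proc DataSets Lib=NHANES;
Modify DRXIFF_B;
Format DRDIFDCD DRDIFDCD.; Run; Quit;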
Alternatively, data users may use the format statement in proc or data steps to apply the format when needed. For example:
Proc Format CntlIn=NHANES.DRXFMT_B; Run;
Proc Freq Data=NHANES.DRXIFF_B; Tables DRDIFDCD; Format DRDIFDCD DRDIFDCD.; Run;
To simply obtain a listing of formatted text labels for each food code, data users can apply the following SAS code:
Options LS=240; Proc Print Data= NHANES.DRXFMT_B; Run;
Note that the text label is up to 60 characters long.
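For readers working outside SAS, the same lookup can be sketched as a dictionary built from the three variables of the format file. The rows below are made up for illustration and are not taken from DRXFMT_B; the function names are hypothetical, and reading the actual files into Python is outside the scope of this sketch.

```python
# Made-up rows with the three variables described above.
format_rows = [
    {"FMTNAME": "DRDIFDCD", "START": 10000001, "VALUE": "ILLUSTRATIVE FOOD LABEL A"},
    {"FMTNAME": "DRDIFDCD", "START": 10000002, "VALUE": "ILLUSTRATIVE FOOD LABEL B"},
]


def build_label_lookup(rows, fmtname="DRDIFDCD"):
    """Map each food code (START) to its text description (VALUE)."""
    return {row["START"]: row["VALUE"] for row in rows if row["FMTNAME"] == fmtname}


def label_for(code, lookup):
    """Return the text label for a food code, or the code itself as text
    when no label is defined for it."""
    return lookup.get(code, str(code))


lookup = build_label_lookup(format_rows)
food_codes = [10000002, 10000001, 99999999]  # the last code has no label
labels = [label_for(code, lookup) for code in food_codes]
```

Falling back to the raw code keeps unlabeled foods visible in listings instead of silently dropping them.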
Expanded food descriptions can be found in the food descriptions component (1) of the USDA Food and Nutrient Database for Dietary Studies (FNDDS). The FNDDS (2) is available from the Food Surveys Research Group (FSRG) website.
1. Food Surveys Research Group. FNDDS Format and Files. Jul 7, 2004. Available at: http://www.barc.usda.gov/bhnrc/foodsurvey/fndds_components.html#nutrients
2. Food Surveys Research Group. Introducing... USDA Food and Nutrient Database for Dietary Studies, 1.0. Jun 21, 2004. Available at: http://www.barc.usda.gov/bhnrc/foodsurvey/fndds_intro.html