The Third National Health and Nutrition Examination Survey (NHANES III),
1988-94, Series 11, No. 9A Data Release

The National Center for Health Statistics (NCHS) of the Centers for
Disease Control and Prevention (CDC) collects, analyzes, and disseminates
data on the health status of U.S. residents. The results of surveys,
analyses, and studies are made known through a number of data release
mechanisms including publications, mainframe computer data files, CD-ROMs
(Search and Retrieval Software, Statistical Export and Tabulation System
(SETS)), and the Internet.

The National Health and Nutrition Examination Survey (NHANES) is a
periodic survey conducted by NCHS. The third National Health and
Nutrition Examination Survey (NHANES III), conducted from 1988 through
1994, was the seventh in a series of these surveys based on a complex,
multi-stage sample plan. It was designed to provide national estimates of
the health and nutritional status of the United States' civilian,
noninstitutionalized population aged two months and older.

This data release, Series 11 No. 9A, contains the NHANES III raw
spirometry data file and documentation. This data release does not
replace the previous NHANES III data releases. The following table
summarizes the NHANES III data releases which are currently available.

Table 1. Available NHANES III

+----------------------+--------+---------+------------------------------------+
|Data Release Number   |Release |Size in  |Data Files / Description            |
|                      |Date    |Megabytes|                                    |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |June    |   240   |Raw spirometry data file and        |
|Series 11, No. 9A,    |2001    |         |documentation                       |
|ASCII Version (This   |        |         |                                    |
|release)              |        |         |                                    |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |TBA     |   9.7   |Natality data file and documentation|
|Series 11, No. 8A,    |        |         |are limited to births occurring     |
|ASCII Version         |        |         |within the U.S. residents and       |
|                      |        |         |nonresidents who were sample        |
|                      |        |         |persons ages 2 months and older     |
|                      |        |         |in the NH3 sample.                  |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |TBA     |  78.8   |Multiply Imputed Data Files and     |
|Multiply Imputed Data |        |         |reports                             |
|Set, Series 11, No.   |        |         |                                    |
|7A, ASCII version     |        |         |                                    |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |January |   2.7   |Healthy Eating Index (HEI) Data File|
|Series 11, No. 6A,    |2000    |         |and documentation includes number   |
|ASCII Version         |        |         |of servings by Food Guide Pyramid   |
|                      |        |         |food groups and HEI                 |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |April   |    54   |Supplemental Nutrition Survey of    |
|Series 11, No. 5A,    |2000    |         |older Americans (SNS) contains      |
|ASCII Version         |        |         |subsets of individual foods         |
|                      |        |         |and total nutrients intake data     |
|                      |        |         |files and documentation             |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |Sept.   |   0.5   |Priority toxicant reference range   |
|Series 11, No. 4A,    |2000    |         |study data file and documentation   |
|ASCII Version         |        |         |                                    |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |July    |    33   |Second exam sample files for        |
|Series 11, No. 3A,    |1999    |         |dietary recall, examination,        |
|ASCII Version         |        |         |laboratory, additional laboratory   |
|                      |        |         |analytes and documentation          |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |April   |   407   |Dietary recall (replacement),       |
|Series 11, No. 2A,    |1998    |         |electrocardiography, laboratory     |
|ASCII Version         |        |         |(additional analytes), and          |
|                      |        |         |vitamins/medicines data files and   |
|                      |        |         |documentation                       |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |October |   285   |Adult and youth household           |
|Series 11, No. 1,     |1997    |         |questionnaire, examination, and     |
|Revised SETS Version  |        |         |laboratory data files and           |
|1.22a                 |        |         |documentation, plan and operation,  |
|                      |        |         |analytic and reporting guidelines,  |
|                      |        |         |weighting and estimation            |
|                      |        |         |methodology, field operations,      |
|                      |        |         |non-response bias                   |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |July    |   454   |Adult and youth household           |
|Series 11, No. 1A,    |1997    |         |questionnaire, dietary recall,      |
|ASCII Version         |        |         |examination, and laboratory data    |
|                      |        |         |files and documentation             |
+----------------------+--------+---------+------------------------------------+
|NHANES III, 1988-94,  |July    |   285   |Adult and youth household           |
|Series 11, No. 1,     |1997    |         |questionnaire, examination, and     |
|SETS Version 1.22a *  |        |         |laboratory data files and           |
|                      |        |         |documentation                       |
+----------------------+--------+---------+------------------------------------+
|NHANES III Reference  |October |   152   |Plan and operation, analytic and    |
|Manuals and Reports   |1996    |         |reporting guidelines, weighting and |
|October 1996          |        |         |estimation methodology, field       |
|                      |        |         |operations, non-response bias       |
+----------------------+--------+---------+------------------------------------+

* Do not use this CD-ROM. It had technical problems and has been
  superseded by the revised SETS version 1.22a, Series 11, No. 1,
  released in October 1997.

The first release of NHANES III data is available on three different
CD-ROMs. The first CD-ROM, Series 11, No.
1 SETS Version 1.22a contains data accessible through the Statistical
Export and Tabulation System (SETS) retrieval software as well as
documentation. This CD-ROM had technical problems and should not be used;
it has been superseded by the Series 11, No. 1 Revised SETS Version
1.22a. This revised CD-ROM includes corrections to the SETS software and
also contains the NHANES III Reference Manuals and Reports. A third
CD-ROM, Series 11, No. 1A, contains the same data and documentation
(except the Reference Manuals and Reports) as on the Series 11, No. 1
Revised SETS Version 1.22a CD-ROM plus the expanded dietary recall data
and documentation. All data on the Series 11, No. 1A CD-ROM are in ASCII
format only.

Background information on the procedures, survey components,
questionnaires, examination and laboratory methods, and statistical
analysis guidelines is available on the NHANES III Reference Manuals and
Reports (CD-ROM). All data users are strongly encouraged to review these
reference materials and reports before analyzing NHANES III data.

Guidelines for Data Users

o  NHANES III survey design and demographic variables are found on the
   Household Adult Data File, Household Youth Data File, the Laboratory
   Data File and the Examination Data File. In preparing a data set for
   analysis, other data files should be merged with either or both of
   the Adult Household Data File or the Youth Household Data File to
   obtain many important analytic variables.

o  All of the NHANES III public use data files are linked with the
   common survey participant identification number (SEQN). Merging
   information from multiple NHANES III data files using this variable
   ensures that the appropriate information for each survey participant
   is linked correctly.

o  NHANES III public use data files do not have the same number of
   records on each file. The Household Questionnaire Files (divided into
   two files, Adult and Youth) contain more records than the Examination
   Data File because not everyone who was interviewed completed the
   examination. The Laboratory Data File contains data only for persons
   aged one year and older. The Individual Foods Data File based on the
   dietary recall, the Prescription Medication Data File, and the
   Vitamin and Minerals Data File all have multiple records for each
   person rather than the one record per sample person contained in the
   other data files.

o  For each data file, SAS program code with standard variable names and
   labels is provided as separate text files on the CD-ROM that contains
   the data files. This SAS program code can be used to create a SAS
   data set from the data file.

o  Modifications were made to items in the questionnaires, laboratory,
   and examination components over the course of the survey; as a
   result, data may not be available for certain variables for the full
   six years. In addition, variables may differ by phase since some
   changes were implemented between phases. Users are encouraged to read
   the Notes sections of the file documentation carefully for
   information about changes.

o  Extremely high and low values have been verified whenever possible,
   and numerous consistency checks have been performed. Nonetheless,
   users should examine the range and frequency of values before
   analyzing data.

o  Some data were not ready for release at the time of this publication
   due to continued processing of the data or analysis of laboratory
   specimens. A listing of those data is available in the general
   information section of each data file.

o  Confidential and administrative data are not available or released
   to the public. Additionally, some variables have been recoded to
   protect the confidentiality of the survey participants. For example,
   all age-related variables were recoded to 90+ years for persons who
   were 90 years of age or older.
o  Some variable names may differ from those used in the Phase 1 NHANES
   III Provisional Data Release and some variables included in the
   Phase 1 provisional release may not appear on these files. Do not use
   the Phase 1 provisional release; use the current (six-year) release.

o  Although the data files have been edited carefully, it is possible
   that errors may still exist. Please notify NCHS staff (301-458-4636)
   of any suspected errors in the data file or the documentation. Refer
   to the NCHS website at http://www.cdc.gov/nchs/nhanes.htm for updates
   to these data files.

Analytic Considerations

o  NHANES III (1988-94) was designed so that the survey's first three
   years, 1988-91, its last three years, 1991-94, and the entire six
   years were national probability samples. Analysts are encouraged to
   use all six years of survey results.

o  Sample weights are available for analyzing NHANES III data. One of
   the following three sample weights will be appropriate for nearly all
   analyses: interviewed sample final weight (WTPFQX6), examined sample
   final weight (WTPFEX6), and mobile examination center (MEC)- and
   home-examined sample final weight (WTPFHX6). Choosing which of these
   sample weights to use in any analysis depends on the variables being
   used. A good rule of thumb is to use "the least common denominator"
   approach. In this approach, the user checks the variables of
   interest. The variable that was collected on the smallest number of
   persons is the "least common denominator," and the sample weight that
   applies to that variable is the appropriate one to use for that
   analysis. For more detailed information, see the Analytic and
   Reporting Guidelines for NHANES III (U.S. DHHS, 1996).

Referencing or Citing NHANES III Data

o  In publications, please acknowledge NCHS as the original data source.
   For instance, the reference for the NHANES III Raw Spirometry Data
   File on this release:

      U.S. Department of Health and Human Services (DHHS). National
      Center for Health Statistics. Third National Health and Nutrition
      Examination Survey, 1988-1994, NHANES III Raw Spirometry Data File
      (Series 11, No. 9A). Hyattsville, MD: Centers for Disease Control
      and Prevention, 2001.

Problems Using the Data

NHANES III is a wonderfully rich source of data and NCHS encourages you
to use the data for research and analysis. However, the dataset is large
and complex, and familiarity with data file manipulation and analysis is
required. NCHS does not have the personnel resources to perform
analyses, check results, debug programs, or conduct literature reviews
for your work. Thorough review of the extensive documentation on the
planning of the survey, analytic guidelines, and individual datasets
should resolve most questions. If you still have questions after careful
review of the documentation, please contact the Data Dissemination
Branch at (301) 458-4636.

--------------------------------------------------------------------------------

The Raw Spirometry Data File

The NHANES III Raw Spirometry Curve Data File (NH3SPIRO.CSV) is a
single, large, comma-delimited file of about 240 megabytes. There are
multiple data lines for each applicable NHANES III respondent. The
length and number of fields in each data line vary. To read the data
using SAS you should specify the options LRECL=4200, DELIMITER=',' and
TRUNCOVER on your INFILE statement. Each record may be a different
length because each curve may contain a different number of values. The
maximum number of values in a curve is 1775.

Note that demographic data (for example, age and gender) are not found
on the raw spirometry file. If these data are needed, the user should
retrieve demographic data from the exam file on the NHANES III CD-ROM,
Series 11, No. 1A. To make estimates for the U.S. population, the user
must use the appropriate sampling weights, which can also be found in
the exam file.
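
For users who are not working in SAS, the variable-length records
described above can be read along the following lines. This is only a
minimal sketch in Python. It assumes that each line holds the 15 scalar
items in the order in which they appear in the data item table, followed
by the RAWCURVE values, that the file has no header line, and that all
fields are integers. None of these assumptions is confirmed by this
documentation, and the names SCALAR_FIELDS, parse_record, and
read_curves are hypothetical.

```python
import csv
import io

# ASSUMPTION: field order follows the data item table; not confirmed here.
SCALAR_FIELDS = [
    "SEQN", "SPPCURVN", "SPPTIMEH", "SPPTIMEM", "SPPCQF", "SPPTQF",
    "SPPEXTRA", "SPPTEMP", "SPPBARO", "SPPFVC", "SPPFEV1", "SPPFEV3",
    "SPPFEV6", "SPPPEAK", "SPPMMEF",
]


def _to_int(text):
    """Convert one field to int; a blank field becomes None."""
    text = text.strip()
    return int(text) if text else None


def parse_record(row):
    """Split one CSV row into named scalar items plus the raw curve list."""
    n = len(SCALAR_FIELDS)
    if len(row) < n:
        raise ValueError(f"expected at least {n} fields, got {len(row)}")
    # zip stops after the 15 scalar names, so curve values are not included.
    record = {name: _to_int(value) for name, value in zip(SCALAR_FIELDS, row)}
    # Everything after the scalar items is treated as the curve; blanks
    # (for example from trailing commas) are skipped.
    record["RAWCURVE"] = [int(v) for v in row[n:] if v.strip()]
    return record


def read_curves(lines):
    """Yield one parsed record per non-empty line of an iterable of lines."""
    for row in csv.reader(lines):
        if row:
            yield parse_record(row)


# Made-up line for illustration only; it is not taken from NH3SPIRO.CSV.
sample = "3,1,9,30,0,100,0,23,760,4000,3200,3900,4000,9000,3500,0,10,30,80,90,120\n"
for rec in read_curves(io.StringIO(sample)):
    print(rec["SEQN"], rec["SPPCURVN"], len(rec["RAWCURVE"]))
```

With the real file, the same generator could be fed from
open("NH3SPIRO.CSV", newline=""), which reads the file one line at a
time instead of loading all 240 megabytes into memory.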
--------------------------------------------------------------------------------

NHANES III Examination Data File
------------------------------------------------------------------------
                               SPIROMETRY
------------------------------------------------------------------------
                                  DATA
------------------------------------------------------------------------
Data item Item description                                Codes
------------------------------------------------------------------------
SEQN      Respondent identification number                00003-53623
SPPCURVN  Raw Curve Sequence Number                       1-22
SPPTIMEH  Time of day test was conducted (in hours)       0-21
SPPTIMEM  Time of day test was conducted (in minutes)     0-59
SPPCQF    Computer determined quality factor (SEE NOTES)  0-255
SPPTQF    Technician entered quality factor (SEE NOTES)   0-143
SPPEXTRA  Extra breath at end of maneuver (SEE NOTES)     0-1
SPPTEMP   Spirometer internal temperature (Celsius)       0-255
SPPBARO   Spirometer barometric pressure                  550-800
SPPFVC    FVC, largest value (ml)                         0-7845
SPPFEV1   FEV1 at 1.0 second, largest value (ml)          0-65524
SPPFEV3   FEV3 at 3.0 seconds, largest value (ml)         0-65524
SPPFEV6   FEV6 at 6.0 seconds, largest value (ml)         0-65524
SPPPEAK   Peak expiratory flow, largest value (ml)        0-17980
SPPMMEF   Max. mid-expiratory flow (ml/sec, best curve)   0-9600
RAWCURVE  Raw Curve data points (SEE NOTES)
          A comma-delimited list of integers with positive and
          negative values.
--------------------------------------------------------------------------------

Notes

There are up to 1775 Raw Curve Data points (ATPS) in each curve. Each
data point represents the change in volume (milliliters) over a 10
millisecond time period. When the number of data points exceeds 1525,
the sampling rate is halved, so the interval between data points becomes
20 milliseconds. In other words, there are 20 milliseconds between data
points when the volume array index is greater than 1525 (15.25 seconds).
Correspondingly, the data points after index 1525 represent the change
in volume over a 20 millisecond period. To obtain the volume-time curve,
simply sum the volume-change points.

It is important to note that ALL curves are provided, including those
that are unacceptable and therefore should NOT be used in an analysis.
Any curve which has a CQF code of 1 (large extrapolated volume) or 4
(cough) is unacceptable and should not be used. In addition, any curve
with a TQF >= 10 was "flagged" by a technician as needing to be deleted.

A few curves have extra breaths (extra breath = 1) at the end of the
maneuver, causing a falsely elevated FVC (and possibly FEVx). These
curves need to be either deleted or have special processing performed to
remove the extra breath. The values provided with the curves are the
result of special processing by the quality center review technicians to
remove the extra breath by truncating the maneuver, allowing an accurate
FVC and other timed parameters to be obtained.

To test reproducibility based on all of the curves performed by the
subject, the overall effort of each subject was graded to determine if
the effort was sufficient to be used in an analysis. This overall
performance grade (within the test reproducibility code) is not included
in the raw curve data set but is available in the spirometry summary
section of the exam public data set, which is available on Series 11,
No. 1A. Therefore, users of the raw curve data set may want to refer to
the spirometry summary section of the exam public data set to obtain the
test reproducibility code. Specifically, the test reproducibility code
is:

   0 = Both FVC and FEV1 are reproducible
   1 = FVC is not reproducible
   2 = FEV1 is not reproducible
   3 = Both FVC and FEV1 are not reproducible
   4 is added to the code if the technician judged the test to be
     unusable

The curve acceptability/reproducibility code in the raw curve file was
calculated by the computer after each maneuver and does not reflect
judgements by reviewing technicians.
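
The acceptability rules stated in these notes (CQF codes 1 and 4, TQF >=
10, and the extra breath flag) can be sketched as a simple screening
function. This is an illustration under stated assumptions, not the
procedure used by the quality review center. It assumes that the CQF is a
sum of the codes listed under SPPCQF, so that individual codes can be
tested with a bitwise AND, and that the "subject seated" code 100 must be
removed from the TQF before comparing it with 10 (otherwise TQF=104,
"seated, false start, do not delete", would be rejected). The function
name is_usable_curve is hypothetical.

```python
CQF_LARGE_EXTRAPOLATED_VOLUME = 1
CQF_COUGH = 4
TQF_SEATED = 100  # added to any other TQF code when the subject was seated


def is_usable_curve(cqf, tqf, extra_breath=0):
    """Return True if a curve passes the screening rules in the notes."""
    # ASSUMPTION: CQF is a sum of power-of-two codes, so test single bits.
    if cqf & (CQF_LARGE_EXTRAPOLATED_VOLUME | CQF_COUGH):
        return False
    # ASSUMPTION: strip the seated code (100) before applying "TQF >= 10".
    if tqf % TQF_SEATED >= 10:
        return False
    # The raw points of an extra-breath curve still include the second
    # exhalation, so the untruncated curve is screened out here.
    if extra_breath == 1:
        return False
    return True


# CQF 192 = non-reproducible FVC (64) + fewer than 3 acceptable curves (128);
# TQF 104 = seated (100) + false start, do not delete (4).
print(is_usable_curve(192, 104))
# CQF 36 = non-reproducible FEV1 (32) + cough (4).
print(is_usable_curve(36, 0))
```

Whether a curve with reproducibility codes only (16, 32, 64, 128) should
be kept depends on the analysis; the sketch leaves those codes to the
user.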

The American Thoracic Society has recommended that test results may be
useable even if they are not reproducible as long as there are two
acceptable maneuvers. Some subjects may have a test reproducibility code
of < 4 (useable results) in the spirometry summary data set when there
are only two acceptable or non-reproducible curves. These test results
may be useable even though the FVC and/or FEV1 are not reproducible or
there are fewer than 3 acceptable maneuvers. Accordingly, test results
from the raw curve data set where a subject's last curve CQF is >= 128
may be useable.

Below is an explanation of key items in the file:

SPPCQF - Computer Quality Factor

The spirometry software calculated the value of the CQF after each
maneuver was performed. The CQF was obtained by summing the following
codes, where applicable:

   Description                              Code
   Large extrapolated volume                +  1
   Late PEF and/or large time to PEF        +  2
   Cough                                    +  4
   No plateau or maneuver < 6 s             +  8
   Non-reproducible PEF                     + 16
   Non-reproducible FEV1                    + 32
   Non-reproducible FVC                     + 64
   Less than 3 acceptable curves            +128

SPPTQF - Technician Quality Code

Technicians had the option to choose from the following codes at the
completion of each FVC maneuver. No code was entered if the maneuver was
acceptable. The code "100" was entered after each maneuver if the
subject was seated, and may be used in combination with any other code
by adding that code to "100". For example, TQF=104 would indicate that
the subject was seated and there was a false start of test. TQF values
in the range 10-99 are included in the data set but should not be used
to determine the best FVC, FEV1, etc.

   Code  Description
     1   Did not understand test directions
     2   Early termination of expiration
     3   Sub-maximal effort on expiration
     4   False start of test (do not delete)
     5   Leak (mouthpiece, etc.)
     6   Excessive variability of effort
     7   Insufficient inhalation to Total Lung Capacity (TLC)
     8   Refused or could not perform additional tests
   100   Subject seated for maneuver

Curves with the following codes are omitted from further processing
(deleted):

   Code  Description
    10   Nonspecific error
    11   Extra breath at end of maneuver
    12   Cough
    13   Very sub-maximal effort on expiration
    14   False start of test
    20   Added to the TQF if the review center determined that the
         maneuver should be omitted

SPPEXTRA - Extra breath at end of maneuver

Some subjects did not wear noseclips, and it is possible that they were
able to take in an extra breath and exhale a second time at the end of
their first maneuver. When this was observed during the quality review,
the curve received special processing to remove the extra exhalation at
the end of the maneuver. The raw curve file has a code to indicate that
an extra breath occurred, and the revised volume values (FVC, FEV1,
FEV3, etc.) are provided in the file. However, if a curve was marked for
deletion (TQF >= 10), the extra breath code may NOT have been set by the
review technician since the curve was marked for deletion.

   Code  Description
     0   No extra breath - FVC valid
     1   Extra breath at end of maneuver causing an elevated FVC; curve
         should not be used or should be terminated earlier

RAWCURVE - Raw Curves

Each data point (when exhaled time is less than 15.25 seconds)
represents the change in volume (milliliters) over the 10 millisecond
time period (20 milliseconds between data points when exhalation time is
greater than 15.25 seconds). So, to obtain the volume-time curve, simply
add the volume-change points. For example:

   Data Point   Volume (ml)
        0            0
       10           10
       30           40
       80          120
       90          210
      120          330

The raw curves are at ATPS (ambient temperature pressure saturated).
Spirometer temperature and barometric pressure are provided to allow the
user to correct the values to BTPS (body temperature pressure
saturated).
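
The summation described under RAWCURVE, together with the change from 10
to 20 millisecond spacing after index 1525, can be sketched in Python as
follows. This is an illustration only. It assumes zero-based indexing
with the first data point at time 0, so that index 1525 corresponds to
15.25 seconds; the notes do not state the indexing convention
explicitly. The volumes remain at ATPS, and the names volume_time_curve
and SWITCH_INDEX are hypothetical.

```python
from itertools import accumulate

SWITCH_INDEX = 1525  # points beyond this index are spaced 20 ms apart


def volume_time_curve(raw_curve):
    """Return (times_ms, volumes_ml) for a list of volume-change points."""
    # Running total of the volume changes gives cumulative volume (ATPS).
    volumes_ml = list(accumulate(raw_curve))
    times_ms = []
    for index in range(len(raw_curve)):
        if index <= SWITCH_INDEX:
            # ASSUMPTION: index 0 is time 0, then 10 ms per point.
            times_ms.append(10 * index)
        else:
            # After 15.25 s each further point adds 20 ms.
            times_ms.append(10 * SWITCH_INDEX + 20 * (index - SWITCH_INDEX))
    return times_ms, volumes_ml


# The example data points from the RAWCURVE description.
times, volumes = volume_time_curve([0, 10, 30, 80, 90, 120])
print(volumes)  # [0, 10, 40, 120, 210, 330]
print(times)    # [0, 10, 20, 30, 40, 50]
```

Times are kept as integer milliseconds to avoid floating-point rounding;
divide by 1000 where seconds are needed.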