The NIOCCS Industry and Occupation Web API was updated on April 1, 2023.
Systems already using the Web API will require changes:
The NIOCCS web application programming interface (API) codes industry and occupation data in real-time. A software developer can incorporate the NIOCCS web API into a survey or data collection platform.
The web API is free and does not require an account.
The NIOCCS industry and occupation web API can be incorporated into any data collection platform. The web API uses the same coding schemes that NIOCCS uses.
The web API can code to:
The web API works with two different types of inputs:
The web API automatically detects the type of input provided based upon the format of the industry input and codes accordingly.
The Industry and Occupation coding output is returned in JavaScript Object Notation (JSON) format by default, or can be returned in eXtensible Markup Language (XML) format using the web API's type parameter.
We do not retain any information about requests sent to the web API. We do not retain IP addresses or the content of coding requests.
If you have any questions about the Industry and Occupation Coding Web API, please send an email to NIOCCS@cdc.gov.
If you want to be contacted about updates to the Industry and Occupation Coding Web API, please email NIOCCS@cdc.gov.
We will protect your email address and will use it only to contact you directly about important updates to the Web API.
The Industry and Occupation Coding Web API is located at:
https://wwwn.cdc.gov/nioccs/IOCode
The format of the Web API call is:
https://wwwn.cdc.gov/nioccs/IOCode?i=INDUSTRY_TEXT&o=OCCUPATION_TEXT&n=NUMBER_CANDIDATES&v=VERSION&c=CODE_SCHEMES&t=TYPE_OF_FORMAT
The only HTTP method supported by the Web API is GET. Parameters must be provided in the request's query string; they cannot be sent in the request body, as is standard for GET requests in Web APIs.
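Because every parameter travels in the query string, free text such as spaces or ampersands has to be URL-encoded before the request is sent. The following is a minimal sketch using only the Python standard library; the helper name `build_request_url` is hypothetical and not part of the Web API, and the sample industry and occupation text is made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://wwwn.cdc.gov/nioccs/IOCode"  # endpoint given in the text above


def build_request_url(industry, occupation, **options):
    """Return a GET URL with every parameter placed in the query string.

    `options` holds optional parameters such as n, c, v, or t.
    (Hypothetical helper, not part of the Web API itself.)
    """
    params = {"i": industry, "o": occupation}
    # Leave out options set to None so the API falls back to its defaults.
    params.update({k: v for k, v in options.items() if v is not None})
    # urlencode escapes spaces and special characters in the free text.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_request_url("general hospital", "nurse & midwife", n=3, t=None)
print(url)
```

The resulting URL can be passed to any HTTP client that issues a GET request.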
Parameter | Description | Default (if not specified) | Example |
---|---|---|---|
I | Industry text OR NAICS code for the data to be coded | N/A | "hospital" OR "622110" |
O | Occupation text for the data to be coded | N/A | "nurse" |
N | Number of candidates returned. The default is 1, which returns the top industry and occupation code based on the probabilities from the auto-coder machine learning models. | "n=1" | "n=3" returns the top 3 industry and occupation codes |
C | Flag that determines what type of codes to return. ("c=1" returns Census Industry and Occupation 2018 codes, as in the examples that follow this table.) | "c=0" NAICS 2017/SOC 2018 | "c=2" Both the NAICS 2017/SOC 2018 codes and the corresponding Census Industry and Occupation 2018 codes are returned |
V | Determines the coding scheme version returned. | "v=18" CDC Census 2018/CDC NAICS 2017/CDC SOC 2018 coding scheme | "v=18" CDC Census 2018/CDC NAICS 2017/CDC SOC 2018 coding scheme |
U | Determines whether the Web API returns an indicator of whether the returned NAICS/SOC codes are an unexpected code combination, based upon data collected by the U.S. Bureau of Labor Statistics (BLS). The 'Y?' indicator is due to limitations in the data available: BLS doesn't collect data for some industries and occupations. | "u=0" No, does not include the unexpected code combination indicator | "u=1" Yes, includes the unexpected code combination indicator |
T | Response format type returned. | "t=json" JSON formatted output | "t=xml" XML formatted output |
An example of a request to code an industry of "hospital" and an occupation of "nurse", returning the top industry and occupation candidates using the CDC NAICS and CDC SOC coding schemes is:
https://wwwn.cdc.gov/nioccs/IOCode?i=hospital&o=nurse&c=0
The result is the following JSON:
    {
      "Industry": [
        { "Code": "622110", "Title": "General Medical and Surgical Hospitals", "Probability": 0.999999 }
      ],
      "Occupation": [
        { "Code": "29-1141", "Title": "Registered Nurses", "Probability": 0.9999963 }
      ],
      "Scheme": "NAICS 2017 and SOC 2018"
    }
Because NAICS/SOC is the default, the “c=0” parameter can be dropped and the same results are returned:
https://wwwn.cdc.gov/nioccs/IOCode?i=hospital&o=nurse
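As an illustration of how a client might read this response, the sketch below parses the NAICS/SOC JSON from the example above and picks the highest-probability candidate from each list. The helper name `top_candidate` is hypothetical, and the response is embedded as a string so that the example runs without a network call.

```python
import json

# Response copied from the NAICS/SOC example in this section.
SAMPLE = """
{ "Industry": [{ "Code": "622110", "Title": "General Medical and Surgical Hospitals", "Probability": 0.999999 }],
  "Occupation": [{ "Code": "29-1141", "Title": "Registered Nurses", "Probability": 0.9999963 }],
  "Scheme": "NAICS 2017 and SOC 2018" }
"""


def top_candidate(response, kind):
    """Return the highest-probability candidate for 'Industry' or 'Occupation'.

    Returns None when the list is missing or empty. (Hypothetical helper.)
    """
    candidates = response.get(kind, [])
    return max(candidates, key=lambda c: c["Probability"], default=None)


response = json.loads(SAMPLE)
industry = top_candidate(response, "Industry")
occupation = top_candidate(response, "Occupation")
print(response["Scheme"])
print(industry["Code"], industry["Title"])
print(occupation["Code"], occupation["Title"])
```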
An example of a request to code an industry of “hospital” and an occupation of “nurse”, returning the top industry and occupation candidates using the CDC Census Industry and Occupation coding scheme is:
https://wwwn.cdc.gov/nioccs/IOCode?i=hospital&o=nurse&c=1
The result is the following JSON:
    {
      "Industry": [
        {
          "Code": "8191",
          "Title": "General Medical and Surgical Hospitals, and Specialty (except Psychiatric and Substance Abuse) Hospitals",
          "Probability": 0.999999166
        }
      ],
      "Occupation": [
        { "Code": "3255", "Title": "Registered Nurses", "Probability": 0.9999963 }
      ],
      "Scheme": "Census Industry and Occupation 2018"
    }
An example of a request to code an industry of “hospital” and an occupation of “nurse”, returning the top industry and occupation candidates with codes in the CDC NAICS and CDC SOC schemes and corresponding codes in the CDC Census Industry and Occupation coding scheme is:
https://wwwn.cdc.gov/nioccs/IOCode?i=hospital&o=nurse&c=2
The result is the following JSON:
    {
      "Industry": [
        {
          "NAICSCode": "622110",
          "NAICSTitle": "General Medical and Surgical Hospitals",
          "NAICSProbability": 0.999999,
          "CensusIndustryCode": "8191",
          "CensusIndustryTitle": "General Medical and Surgical Hospitals, and Specialty (except Psychiatric and Substance Abuse) Hospitals"
        }
      ],
      "Occupation": [
        {
          "SOCCode": "29-1141",
          "SOCTitle": "Registered Nurses",
          "SOCProbability": 0.9999963,
          "CensusOccupationCode": "3255",
          "CensusOccupationTitle": "Registered Nurses"
        }
      ],
      "Scheme": "NAICS 2017 and SOC 2018, Census Industry and Occupation 2018"
    }
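Note that the field names in the "c=2" response (NAICSCode, SOCCode, CensusIndustryCode, and so on) differ from the Code/Title/Probability names returned for "c=0" and "c=1". The sketch below flattens the top candidate of the response above into a single row, which is often convenient when appending codes to a survey record. The function name `flatten_combined` and the output column names are hypothetical.

```python
# Top-candidate response for c=2, abbreviated from the example above (titles omitted).
combined = {
    "Industry": [{
        "NAICSCode": "622110",
        "NAICSProbability": 0.999999,
        "CensusIndustryCode": "8191",
    }],
    "Occupation": [{
        "SOCCode": "29-1141",
        "SOCProbability": 0.9999963,
        "CensusOccupationCode": "3255",
    }],
}


def flatten_combined(response):
    """Flatten the first industry and occupation candidates into one dict.

    Assumes a c=2 style response with at least one candidate in each list.
    """
    industry = response["Industry"][0]
    occupation = response["Occupation"][0]
    return {
        "naics_code": industry["NAICSCode"],
        "census_industry_code": industry["CensusIndustryCode"],
        "soc_code": occupation["SOCCode"],
        "census_occupation_code": occupation["CensusOccupationCode"],
    }


row = flatten_combined(combined)
print(row)
```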
An example of a request to code a NAICS industry code of “722” and an occupation narrative of “manager”, returning the top industry and occupation candidates with codes in the CDC NAICS and CDC SOC schemes and corresponding codes in the CDC Census Industry and Occupation coding scheme is:
https://wwwn.cdc.gov/nioccs/IOCode?i=722&o=manager&c=2
The result is the following JSON:
    {
      "Industry": [
        {
          "NAICSCode": "722",
          "NAICSTitle": "Food Services and Drinking Places",
          "NAICSProbability": 1.0,
          "CensusIndustryCode": "8680",
          "CensusIndustryTitle": "Restaurants and Other Food Services"
        }
      ],
      "Occupation": [
        {
          "SOCCode": "11-9051",
          "SOCTitle": "Food Service Managers",
          "SOCProbability": 0.854493558,
          "CensusOccupationCode": "0310",
          "CensusOccupationTitle": "Food Service Managers"
        }
      ],
      "Scheme": "NAICS 2017 and SOC 2018, Census Industry and Occupation 2018"
    }
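In the response above the industry probability is 1.0 while the occupation probability is about 0.85. A data collection platform could use these probabilities to route less certain codes to manual review. The sketch below shows one way to do that; the 0.9 cut-off and the function name `needs_review` are purely illustrative assumptions, not NIOCCS recommendations.

```python
REVIEW_THRESHOLD = 0.9  # hypothetical cut-off chosen for illustration only

# Probabilities copied from the c=2 response above (other fields omitted).
result = {
    "Industry": [{"NAICSCode": "722", "NAICSProbability": 1.0}],
    "Occupation": [{"SOCCode": "11-9051", "SOCProbability": 0.854493558}],
}


def needs_review(response, threshold=REVIEW_THRESHOLD):
    """Return the parts ('Industry', 'Occupation') whose top probability is below threshold."""
    flagged = []
    for part, key in (("Industry", "NAICSProbability"), ("Occupation", "SOCProbability")):
        top = response[part][0]  # first candidate in the list
        if top[key] < threshold:
            flagged.append(part)
    return flagged


print(needs_review(result))
```

With the sample values, only the occupation code falls below the illustrative threshold.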
    import requests

    industry_text = 'hospital'
    occupation_text = 'nurse'
    use_census = False
    number_of_candidates_returned = 3
    version = '18'  # 18 for 2018 or 12 for 2012

    url = 'https://wwwn.cdc.gov/nioccs/IOCode'
    param_list = {
        'i': industry_text,
        'o': occupation_text,
        'c': '1' if use_census else '0',
        'v': version,
        'n': str(number_of_candidates_returned),
    }

    # Certificate verification stays on (the requests default). If your
    # environment raises an SSL validation error, pass verify=<path to a CA bundle>.
    response = requests.get(url, params=param_list, timeout=30)

    if response.status_code == requests.codes.ok:
        results = response.json()  # parse the response body once
        # Loop over each list separately in case the two lists differ in length.
        for kind in ('Industry', 'Occupation'):
            for counter, candidate in enumerate(results[kind], start=1):
                print('{} Candidate #{} Code is {}'.format(kind, counter, candidate['Code']))
                print('{} Candidate #{} Title is {}'.format(kind, counter, candidate['Title']))
                print('{} Candidate #{} Probability is {}'.format(kind, counter, candidate['Probability']))
                print('\n')
    # instructions to get data from Web JSON in R from this page:
    # https://datascienceplus.com/accessing-web-data-json-in-r-using-httr/

    # installed these packages
    install.packages("httr")
    install.packages("rlist")
    install.packages("jsonlite")
    install.packages("xml2")

    # this code gets coded I&O results in R calling the NIOSH IO Coding API
    library(httr)
    library(jsonlite)

    url <- "https://wwwn.cdc.gov/nioccs/IOCode"
    industry_text <- "hospital"
    occupation_text <- "nurse"
    number_results <- "3" # defaults to 1
    version <- "18" # 18 for 2018 and 12 for 2012
    return_census_codes <- "1" # "1" = return Census coding scheme, otherwise NAICS/SOC coding scheme, defaults to NAICS/SOC

    # in a single line, return top industry and top occupation candidate in NAICS/SOC coding schemes
    json_results <- fromJSON(content(GET(url, query = list(i = industry_text, o = occupation_text)), as = "text"))

    # in a single line, return top 3 industry and occupation candidates in Census coding scheme
    json_results <- fromJSON(content(GET(url, query = list(i = industry_text, o = occupation_text, n = number_results, c = return_census_codes, v = version)), as = "text"))

    # more detailed code (if required for checking for HTTP error using http_error(), etc.)
    #1) get response
    response <- GET(url, query = list(i = industry_text, o = occupation_text))
    #2) check for error
    http_error(response) # TRUE or FALSE, place for conditional exception handling logic...
    #3) convert response to text
    text_results <- content(response, as = "text")
    #4) convert text results to json
    json_results <- fromJSON(text_results)
    #5) view results
    json_results
    /************* CALL THE WEB API VIA PROC HTTP ***************/
    filename out "YOURPATH\aapl.csv"; *OUTPUT FILE FOR JSON RESULTS;
    proc http
      url="https://wwwn.cdc.gov/nioccs/IOCode?i=trucking&o=driver&c=1"
      method="get"
      out=out;
    run;
    *VIEW THE OUTPUT FILE;

    /************* REFERENCE THE SAS JSON ENGINE (REQUIRES SAS 9.4 M4 UPDATE IF NOT INSTALLED) - CONVERT THE JSON IN THE OUTPUT FILE TO A SAS DATASET ***************/
    filename resp 'YOURPATH\aapl.csv';
    libname jout JSON fileref="resp";
    proc print data=jout.alldata; run;

    /*********** SAMPLE NIOCCS DATA *************/
    libname dat "YOURPATH";
    data testm2;
      set dat.n4wsample;
      rownum=_N_; *ASSIGN A ROW COUNTER VARIABLE FOR THE MACRO LOOP;
    run;
    proc print data=testm2; run;

    /************** GET COUNT OF RECORDS IN THE INPUT DATASET FOR THE MACRO LOOP COUNTER *****************/
    filename obsmac "YOURPATH\Obsnum.sas";
    %include obsmac;
    %obsnum(testm2);
    %let cnobs=&t2nobs;

    data getall1; run; *INITIALIZE THE MASTER DATASET;

    %macro getserviceresults;
      /************* ITERATE THE INPUT DATASET VIA SAS MACRO LOOP ***************/
      %do x=1 %to &cnobs;
        data _null_;
          set testm2;
          where rownum=&x;
          indencoded = urlencode(industry); *USE THE SAS URLENCODE FUNCTION TO HANDLE SPACES AND SPECIAL CHARACTERS FOR THE WEB API;
          call symput("idtitlemac",industry);
          call symput("idmac",row_id); *STORE THE ID IN A MACRO VARIABLE FOR LATER USE;
          call symputx("encoded_ind",indencoded,'G');
          call symput("octitlemac",occupation);
          occencoded = urlencode(occupation);
          call symputx("encoded_occ",occencoded,'G');
        run;

        filename out "YOURPATH\aapl.csv";
        /********** PASS IN THE URL-ENCODED VARIABLES TO THE WEB API CALL ***********/
        proc http
          url="https://wwwn.cdc.gov/nioccs/IOCode?i=(&encoded_ind)&o=(&encoded_occ)&c=1"
          method="get"
          out=out;
        run;

        filename resp 'YOURPATH\aapl.csv';
        libname jout JSON fileref="resp";

        /*************** PUT THE ID AND INPUT TITLES BACK INTO THE RESULTS DATASET *******************/
        data getsc2_&x;
          length value $500.;
          set jout.alldata;
          id=strip(symget("idmac"));
          industry=strip(symget("idtitlemac"));
          occupation=strip(symget("octitlemac"));
        run;

        /************ CONCATENATE THE WEB API CALL RESULTS DATASET INTO THE MASTER DATASET ***************/
        data getall1; set getall1 getsc2_&x; run;
      %end;

      data getall1a; set getall1; where not missing(V); run;
    %mend;

    %getserviceresults;
    proc print data=getall1a; run;

    /*************** TRANSPOSE DATA TO ONE ROW PER ID (STANDARD NIOCCS OUTPUT) ******************/
    data getall2;
      set getall1a;
      length pram $50.;
      where v ge 1;
      pram=strip(P1)||"_"||strip(P2);
    run;
    proc sort data=getall2; by id industry occupation pram; run;
    proc transpose data=getall2 delimiter=_ out=getall3(drop=_name_);
      by ID industry occupation;
      id pram;
      var value;
    run;
    proc print data=getall3; run;
The NIOCCS industry and occupation autocoder was built using the gensim library (https://radimrehurek.com/gensim/index.html). 1
The NIOCCS industry and occupation autocoder uses FastText word vectors (https://fasttext.cc/docs/en/aligned-vectors.html).2, 3
Usage Conditions
You are accessing a U.S. Government information system, which includes (1) this computer, (2) this computer network, (3) all computers connected to this network, and (4) all devices and storage media attached to this network or to a computer on this network. This information system is provided for U.S. Government-authorized use only.
CDC.gov Policies and Regulations
CDC.gov Privacy Policy
Disclaimer
The material embodied in this software is provided to you "as-is" and without warranty of any kind, express, implied or otherwise, including without limitation, any warranty of fitness for a particular purpose. In no event shall the Centers for Disease Control and Prevention (CDC) or the United States (U.S.) government be liable to you or anyone else for any direct, special, incidental, indirect or consequential damages of any kind, or any damages whatsoever, including without limitation, loss of profit, loss of use, savings or revenue, or the claims of third parties, whether or not CDC or the U.S. government has been advised of the possibility of such loss, however caused and on any theory of liability, arising out of or in connection with the possession, use or performance of this software.
___________________________
1 [884893] R. Řehůřek, P. Sojka, Software Framework for Topic Modelling with Large Corpora (https://is.muni.cz/publication/884893/en)
2 [1804.07745] A. Joulin, P. Bojanowski, T. Mikolov, H. Jegou, E. Grave, Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion (https://arxiv.org/abs/1804.07745)
3 [1607.04606] P. Bojanowski*, E. Grave*, A. Joulin, T. Mikolov, Enriching Word Vectors with Subword Information (https://arxiv.org/abs/1607.04606)