Recoding examples in Stata

April 16, 2019

Let’s look at some examples of recoding variables in Stata. First off we’ll recode a numeric variable into categories using the auto dataset:

sysuse auto, clear
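* Start with an empty string variable, then fill in one category at a time
gen fuel_efficiency = ""
replace fuel_efficiency = "low" if mpg <= 20
replace fuel_efficiency = "medium" if mpg > 20 & mpg <= 30
* Stata treats missing as larger than any number, so exclude it explicitly
replace fuel_efficiency = "high" if mpg > 30 & !missing(mpg)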

Here we recoded the mpg variable into three fuel efficiency categories: low (20 mpg or less), medium (above 20 and up to 30) and high (above 30).
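
The same grouping can also be stored as a labeled numeric variable, which tends to be easier to use in tabulations and models than a string. Below is a minimal sketch using Stata's recode command. The variable name fuel_cat is made up for this illustration, and the integer ranges assume mpg holds whole numbers, as it does in the auto dataset:

sysuse auto, clear
* fuel_cat is a hypothetical name; the quoted text becomes the value label
* Ranges assume whole-number mpg values (true for the auto dataset)
recode mpg (min/20 = 1 "low") (21/30 = 2 "medium") (31/max = 3 "high"), generate(fuel_cat)
* Check the result against the original variable
tabulate fuel_cat
summarize mpg if fuel_cat == 3

Because min and max refer to the smallest and largest nonmissing values, any missing mpg values would stay missing in fuel_cat.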

For the second example, let’s recode a string variable into a numeric one.

sysuse auto, clear
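* Start with all values missing, then assign a code per brand
gen make_numeric = .
* strpos() returns the position of the substring, or 0 if it is not found
replace make_numeric = 1 if strpos(make, "Toyota") > 0
replace make_numeric = 2 if strpos(make, "VW") > 0
replace make_numeric = 3 if strpos(make, "Ford") > 0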

In this example we recoded car makes from the make variable into numeric codes that stand for car brands. Every make that matches none of the three brands keeps a missing value in make_numeric.
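
If every brand should get a code, listing them one by one gets tedious. Here is a rough sketch that lets Stata assign the codes with encode instead. It assumes the first word of make identifies the brand (in the auto dataset some brands are abbreviated), and the variable names brand and brand_id are made up for this illustration:

sysuse auto, clear
* brand is a hypothetical helper variable holding the first word of make
gen brand = word(make, 1)
* encode assigns 1, 2, 3, ... in alphabetical order and attaches value labels
encode brand, generate(brand_id)
* Show the labeled codes, then the underlying numbers
tabulate brand_id
tabulate brand_id, nolabel

The numbers produced by encode depend on alphabetical order, so they will not match the hand-picked codes 1 to 3 used above.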

For the third example, we’ll recode missing values. Let’s re-use our newly created make_numeric variable and recode all missing values to 0.

* Run this right after the previous example: reloading the data with sysuse would drop make_numeric
replace make_numeric = 0 if missing(make_numeric)
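
As an alternative, Stata's mvencode command does the same replacement and reports how many values it changed. The sketch below rebuilds make_numeric first so that it runs on its own. The label name brand_lbl and the "Other" category are assumptions added for this illustration:

sysuse auto, clear
gen make_numeric = .
replace make_numeric = 1 if strpos(make, "Toyota") > 0
replace make_numeric = 2 if strpos(make, "VW") > 0
replace make_numeric = 3 if strpos(make, "Ford") > 0
* Count the missing values before changing them
count if missing(make_numeric)
* Replace all missing values with 0
mvencode make_numeric, mv(0)
* brand_lbl is a hypothetical label name; 0 is documented as "Other"
label define brand_lbl 0 "Other" 1 "Toyota" 2 "VW" 3 "Ford"
label values make_numeric brand_lbl
tabulate make_numeric

Attaching a value label makes it clear that 0 means "none of the listed brands" rather than a real measurement.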


Written by Johan Osterberg, who lives and works in Gothenburg, Sweden, as a developer specialized in e-commerce. Connect with me on LinkedIn

2024 © Johan Osterberg