Let’s look at some examples of recoding variables in Stata. First, we’ll recode a numeric variable into categories using the auto dataset:
sysuse auto, clear
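* Start with an empty string variable, then fill it in by mpg range
gen fuel_efficiency = ""
replace fuel_efficiency = "low" if mpg <= 20
replace fuel_efficiency = "medium" if mpg > 20 & mpg <= 30
* Stata treats missing as larger than any number, so exclude it explicitly
replace fuel_efficiency = "high" if mpg > 30 & !missing(mpg)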
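Here we created a new string variable, fuel_efficiency, that groups mpg into three categories: low (20 or less), medium (above 20 and up to 30) and high (above 30). The original mpg variable is left unchanged.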
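The same grouping can also be stored as a labeled numeric variable, which is often easier to use in tables and models. The following is a minimal sketch using Stata's recode command; the variable name fuel_cat is made up for illustration, and the integer ranges assume mpg holds whole numbers, as it does in the auto dataset.

```stata
sysuse auto, clear
* fuel_cat is a hypothetical name; mpg itself is not modified
* min and max refer to the smallest and largest nonmissing values of mpg
recode mpg (min/20 = 1 "low") (21/30 = 2 "medium") (31/max = 3 "high"), generate(fuel_cat)
* Check the result against the original variable
tabulate fuel_cat
```

Because recode leaves missing values alone unless a rule mentions them, no extra missing-value condition is needed here.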
For the second example, let’s recode a string variable into a numeric one:
sysuse auto, clear
* Start with missing, then assign a code to each brand name found in make
gen make_numeric = .
replace make_numeric = 1 if strpos(make, "Toyota") > 0
replace make_numeric = 2 if strpos(make, "VW") > 0
replace make_numeric = 3 if strpos(make, "Ford") > 0
In this example, we created a numeric variable, make_numeric, that codes three brands found in the make variable: 1 for Toyota, 2 for VW and 3 for Ford. Cars of any other brand keep a missing value.
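If every brand should get a code, not just three of them, one possible approach is to pull the brand name out of make and let encode number it. This is a sketch under the assumption that the brand is always the first word of make; brand and brand_id are hypothetical variable names.

```stata
sysuse auto, clear
* brand is a hypothetical helper variable holding the first word of make
gen brand = word(make, 1)
* encode assigns 1, 2, 3, ... in alphabetical order and attaches value labels
encode brand, generate(brand_id)
tabulate brand_id
```

Abbreviated names in the dataset, such as "Chev." or "Merc.", are kept exactly as they appear, so they would need cleaning separately if full brand names matter.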
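For the third example, we’ll recode missing values. Let’s reuse the make_numeric variable we just created and recode all of its missing values to 0. Note that we do not reload the dataset here, because sysuse auto, clear would discard make_numeric and the replace command would fail:
replace make_numeric = 0 if missing(make_numeric)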
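Another option for this step is Stata's mvencode command, which is built for turning missing values into a numeric code. The sketch below rebuilds make_numeric from scratch so that it runs on its own; the label name brand_lbl and the label text "other" are assumptions added for illustration.

```stata
sysuse auto, clear
gen make_numeric = .
replace make_numeric = 1 if strpos(make, "Toyota") > 0
replace make_numeric = 2 if strpos(make, "VW") > 0
replace make_numeric = 3 if strpos(make, "Ford") > 0
* Replace missing with 0; mvencode stops with an error if 0 is already a value
mvencode make_numeric, mv(0)
* Hypothetical value label so the codes are readable in tables
label define brand_lbl 0 "other" 1 "Toyota" 2 "VW" 3 "Ford"
label values make_numeric brand_lbl
tabulate make_numeric
```

The safety check in mvencode is the main reason to prefer it over a plain replace: it guards against accidentally merging missing values with a code that already means something.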