Creating binary variables in Stata is useful for many purposes, for example when you need a quick overview of a variable that takes a large number of distinct values.
Using the nlsw88 training dataset we’ll generate a variable indicating high income. Let’s say the cut-off point will be $25 per hour:
sysuse nlsw88, clear
generate high_income = 1 if wage > 25
replace high_income = 0 if wage <= 25
tab high_income
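Here we generated a variable high_income with the value 1 for all observations with an hourly wage above $25. Then we set the newly created variable to 0 for all observations with an hourly wage equal to or less than $25. Keep in mind that Stata treats missing values as larger than any number, so a condition like wage > 25 is also true when wage is missing; for a variable that contains missing values you would add & !missing(wage) to the first condition.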
In this example we can see that only 56 observations, or roughly 2.5%, fall into the high-income category.
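As an aside, the two-step generate/replace can usually be collapsed into one line, because Stata evaluates a logical expression to 1 when it is true and 0 when it is false. The following is a minimal sketch of that idea; the variable name high_income2 is only an illustrative choice, and the if qualifier is there to keep missing wages missing rather than coding them as 1:

```stata
sysuse nlsw88, clear
* A logical expression evaluates to 1 (true) or 0 (false).
* The if qualifier leaves observations with a missing wage as missing.
generate byte high_income2 = wage > 25 if !missing(wage)
tab high_income2
```

Storing the result as a byte is optional; it just keeps a 0/1 variable as small as possible.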
To achieve the same results with the recode command, we’d do something like this:
drop high_income
recode wage (min/25 = 0) (else = 1), gen(high_income)
tab high_income
The tabulation should show the same counts as the previous version of the variable.
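If you would rather not compare the two tables by eye, one option is to build both versions side by side and let Stata check them. This is only a sketch: the names high_a and high_b are made up for the illustration, and the recode rule uses nonmissing instead of else so that missing wages, if there were any, would stay missing in both variables:

```stata
sysuse nlsw88, clear
* Version A: logical expression, missing wages stay missing
generate byte high_a = wage > 25 if !missing(wage)
* Version B: recode, with all remaining nonmissing values set to 1
recode wage (min/25 = 0) (nonmissing = 1), gen(high_b)
* assert stops with an error if any observation differs
assert high_a == high_b
* Cross-tabulation: all observations should sit on the diagonal
tab high_a high_b
```

If assert returns silently, the two approaches produced identical variables.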