Binary variables in Stata
April 29, 2019
Creating binary variables in Stata is useful for many purposes, for example if you quickly need to get an overview of a variable with a large set of values.
Using the nlsw88 training dataset we’ll generate a variable indicating high income. Let’s say the cut-off point will be $25 per hour:
sysuse nlsw88, clear
generate high_income = 1 if wage > 25
replace high_income = 0 if wage <= 25
tab high_income
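Here we generated a variable high_income with the value 1 for all observations with an hourly wage above $25. Then we set the newly created variable to 0 for all observations with an hourly wage of $25 or less. Keep in mind that Stata treats missing values as larger than any number, so any observation with a missing wage would also get the value 1 from the first command.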
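The two steps can also be collapsed into a single command, because a logical expression in Stata evaluates to 1 when true and 0 when false. The following is a minimal sketch of that idiom, not a replacement for the approach above; the `if !missing(wage)` qualifier is an added safeguard that leaves the indicator missing wherever wage is missing.

```stata
sysuse nlsw88, clear

* (wage > 25) evaluates to 1 when true and 0 when false.
* The if qualifier keeps high_income missing where wage is missing,
* since Stata would otherwise treat a missing wage as greater than 25.
generate high_income = (wage > 25) if !missing(wage)

tab high_income
```

The one-line form is shorter, while the generate/replace pair spells out each category explicitly, which can be easier to read when there are more than two.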
In this example we can see that only 56 observations or roughly 2.5% fall into the high income category.
To achieve the same results with the recode command, we’d do something like this:
drop high_income
recode wage (min/25 = 0) (else = 1), gen(high_income)
tab high_income
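The range min/25 includes the value 25 itself, so this matches the wage <= 25 condition used earlier, and the tabulation should show the same results as the previous version of the variable.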
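To check that the two approaches agree rather than comparing tables by eye, both versions can be built side by side and compared with assert. This is only a sketch: the variable names hi_gen and hi_rec are hypothetical, and the `(missing = .)` rule is an added assumption so that a missing wage stays missing in the recoded variable instead of falling into the else category.

```stata
sysuse nlsw88, clear

* Version 1: logical expression, missing wages left missing
generate byte hi_gen = (wage > 25) if !missing(wage)

* Version 2: recode, with an explicit rule that keeps missing as missing
recode wage (min/25 = 0) (missing = .) (else = 1), gen(hi_rec)

* assert stops with an error if any observation differs
assert hi_gen == hi_rec

* Cross-tabulation: all observations should sit on the diagonal
tab hi_gen hi_rec
```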
Written by Johan Osterberg who lives and works in Gothenburg, Sweden as a developer specialized in e-commerce.