Using the word function in Stata

May 06, 2019

Instead of using the split command for string manipulation, we can use the word function. A previous example transformed the make variable of the auto dataset into a categorical variable, brand. Let's see whether word gives the same result:

sysuse auto, clear
gen brand = word(make, 1)

tab brand
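
To check that the result really matches what split produces, the two can be compared directly. This is a minimal sketch: the earlier post's code is not reproduced here, so the split call and the stub name part are assumptions for illustration only.

```stata
sysuse auto, clear

* First word of make via the word function
gen brand = word(make, 1)

* Assumed equivalent using split: creates part1, part2, ... from make,
* splitting on spaces (the default). The stub "part" is hypothetical.
split make, generate(part)

* Count observations where the two approaches disagree
count if part1 != brand

* Stops with an error if any observation differs
assert part1 == brand
```

If the assert passes silently, both approaches give the same brand for every observation.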

In this case the word function is simpler and arguably more elegant than split: word(make, 1) returns the first word of each value of make, with no looping and no temporary variables. More generally, word(s, n) returns the nth word of the string s, and a negative n counts from the end of the string, so the same function can extract words at other positions.
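
As a rough illustration of extracting words at other positions, the sketch below pulls the second and last words from make and counts the words with wordcount. The variable names second_word, last_word, and n_words are made up for this example.

```stata
sysuse auto, clear

* word(s, n) returns the n-th space-separated word of s
gen brand       = word(make, 1)    // first word
gen second_word = word(make, 2)    // second word; empty if make has only one word
gen last_word   = word(make, -1)   // negative n counts from the end of the string
gen n_words     = wordcount(make)  // number of words in make

list make brand second_word last_word n_words in 1/5
```

Makes that consist of a single word get an empty second_word, which is worth keeping in mind before tabulating the result.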



Written by Johan Osterberg who lives and works in Gothenburg, Sweden as a developer specialized in e-commerce.

2024 © Johan Osterberg