Counting unique values in Stata | Johan Osterberg - Product Engineer

Counting unique values in Stata

May 29, 2019

To count the unique values of a variable (for example, how many distinct identifiers occur among the observations), there are a few different approaches we can try. Let's load the nlsw88 dataset and look at the variable representing total work experience (ttl_exp):

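* Load the example dataset that ships with Stata
sysuse nlsw88, clear
* Mark one observation per distinct value of ttl_exp with 1, all others with 0
egen unique_ttl = tag(ttl_exp)
* The frequency of 1s is the number of unique values
tab unique_ttl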

Here we combined the egen command with its tag() function. Essentially, we are tagging one observation per distinct value (in effect, its first appearance) with 1 and each subsequent occurrence with 0. Finally, we tabulate the new variable to inspect the result.
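If we want the number itself rather than a table, one option is to count the tagged observations. This is a minimal sketch: the variable name tag_ttl is just an illustrative choice, and it relies on count leaving its result in r(N).

```stata
* Sketch: get the number of unique values as a single number
sysuse nlsw88, clear

* tag_ttl is a hypothetical name; byte keeps the 0/1 flag small
egen byte tag_ttl = tag(ttl_exp)

* count stores the number of matching observations in r(N)
count if tag_ttl == 1
display "Unique values of ttl_exp: " r(N)
```

The displayed figure should agree with the frequency of 1s in the tabulation.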

Tagging unique values

Judging from this, we have 1546 unique values among the observations, since that is the number of observations tagged with 1. We can verify this with the codebook command by inspecting the "unique values" field of its output:

codebook ttl_exp

Codebook unique values

As we can see, the codebook output matches the number we got from tagging. Whenever you only need the unique count itself, the codebook command is the simplest way to get it.
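For completeness, here is a sketch of two more variations using only built-in commands. The first asks codebook for its compact layout, which includes a Unique column. The second does by hand roughly what tagging does, flagging the first observation within each group of identical values. The name first_obs is an assumption made for illustration, and the missing() check is there to mirror the default behaviour of tag(), which does not tag missing values.

```stata
sysuse nlsw88, clear

* Compact codebook: one row per variable, including a Unique column
codebook ttl_exp, compact

* By-group alternative: flag the first observation of each distinct value
* (first_obs is a hypothetical variable name)
bysort ttl_exp: gen byte first_obs = (_n == 1)

* Count the flagged observations, skipping missing values
count if first_obs == 1 & !missing(ttl_exp)
display "Unique values of ttl_exp: " r(N)
```

Note that bysort re-sorts the data by ttl_exp, so the tagging approach may be preferable when the original sort order matters.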



Written by Johan Osterberg who lives and works in Gothenburg, Sweden as a developer specialized in e-commerce.
