In a previous post we looked briefly at the summarize command; let's look a bit deeper. When opening a new dataset, it's a great way to get an overview of the data, or a summary if you will:
sysuse nlsw88, clear
summarize
The syntax of the summarize command is as follows:
summarize [varlist] [if] [in] [weight] [, options]
For instance, we can get summary statistics of wage for southern residents:
summarize wage if south==1
Or we can look at wage levels if total work life experience is less than five years, like so:
summarize wage if ttl_exp < 5
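The other parts of the syntax work the same way. Here is a small sketch, using the same nlsw88 data, of a variable list, a combined if condition, and an in range. The choice of variables (hours, tenure) and the range 1/100 are only for illustration:

```stata
sysuse nlsw88, clear

* varlist: summarize several variables at once
summarize wage hours tenure

* if: combine conditions with & (and) or | (or)
summarize wage if south == 1 & ttl_exp < 5

* in: restrict the summary to the first 100 observations
summarize wage in 1/100
```

Keep in mind that Stata treats missing values as larger than any number, so a condition like ttl_exp < 5 leaves missing values out, while a condition using > would include them unless you exclude them explicitly.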
Similarly, in the auto dataset, we can summarize the mpg variable for foreign cars only:
sysuse auto, clear
summarize mpg if foreign == 1
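The command above restricts the summary to foreign cars. To set foreign and domestic cars side by side, one option is the by prefix, and another is tabstat. This is just a sketch of two possible approaches, and the statistics listed in the tabstat call are an arbitrary selection:

```stata
sysuse auto, clear

* by prefix: run summarize once per value of foreign
* (bysort sorts the data by foreign first)
bysort foreign: summarize mpg

* tabstat: the same comparison in a single compact table
tabstat mpg, by(foreign) statistics(n mean sd min max)
```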
To get detailed summary statistics of a variable, such as rep78:
summarize rep78, detail
This gives a detailed summary of rep78: percentiles, the four smallest and largest values, the number of observations, the mean, standard deviation, variance, skewness and kurtosis.
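The numbers shown by summarize are also kept as stored results, which is handy if you want to reuse them in later commands. A minimal sketch, assuming you run it right after the detailed summary (the labels in the display lines are just illustrative):

```stata
sysuse auto, clear
summarize rep78, detail

* list everything summarize left behind in r()
return list

* pick out individual stored results
display "Median: " r(p50)
display "Mean: " r(mean)
display "Non-missing observations: " r(N)
```

Note that r(N) counts only the non-missing values of the variable, so it can be smaller than the number of rows in the dataset.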