A Very Short Introduction to Stata

The basic philosophy of Stata.

Author

Andrew Grogan-Kaylor

Published

October 25, 2024

The basic philosophy of Stata—“Stata in one sentence”—is:

do_something to_variable(s), options

Often it is not necessary to use any options since the authors of Stata have done such a good job of creating an intuitive language, and of thinking about the defaults. Commands that you actually type are represented in monospace font. x and y refer to variables in your data.
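As a minimal sketch of this pattern, the lines below run the same command first with its defaults and then with one option. The auto dataset that ships with Stata is used here only as an assumption for illustration; price and mpg stand in for x and y.

```stata
* Load the example dataset shipped with Stata (an illustrative assumption)
sysuse auto, clear

* do_something to_variable(s): no options, relying on the defaults
summarize price mpg

* do_something to_variable(s), options: detail adds percentiles and more
summarize price mpg, detail
```

Everything before the comma says what to do and to which variables; everything after the comma adjusts how it is done.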

Task                             Command                                  Footnote
Open data                        use mydata.dta
Look for or find                 lookfor thing                            1
Descriptive statistics           summarize x y
Frequencies                      tabulate x
Recode                           recode x (old = new)(...), generate(xR)  2
Rename                           rename x z
Keep                             keep x y z
Drop                             drop x y z
Correlation                      corr x y
Regression                       regress y x z
Logistic Regression              logit y x z, or                          3
Ordinal Logistic Regression      ologit y x z, or                         4
Multinomial Logistic Regression  mlogit y x z, rr                         5
Multilevel Model                 mixed y x z || group: x
Structural Equation Modeling     sem (y <- x m z) (m <- x z)
Histogram                        histogram x                              6
Bar Graph                        graph bar, over(x)
Bar Graph (of means)             graph bar y, over(x)
Pie Chart                        graph pie, over(x)
Scatterplot                      twoway scatter y x
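Several of the commands in the table can be tried in one short session. The sketch below again assumes the auto dataset that ships with Stata; the variables price, mpg, weight, foreign, and rep78 come from that dataset and merely stand in for x, y, and z, and the recode cut points are arbitrary.

```stata
* Load the example dataset shipped with Stata (an illustrative assumption)
sysuse auto, clear

* Descriptive statistics and frequencies
summarize price mpg
tabulate foreign

* Recode into a new variable, leaving the original rep78 untouched
recode rep78 (1 2 3 = 0) (4 5 = 1), generate(rep78R)

* Keep only the variables needed for the rest of the session
keep price mpg weight foreign rep78 rep78R

* Correlation and linear regression
corr price mpg
regress price mpg weight

* Logistic regression, reported as odds ratios
logit foreign mpg weight, or
```

Missing values of rep78 stay missing in rep78R because no recode rule mentions them.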

Footnotes

  1. lookfor thing looks for any variable with thing in the variable name or variable label. It is often useful to lookfor an abbreviation, e.g. lookfor anx instead of lookfor anxiety.

  2. It is usually best practice, but not required, to recode values of a variable into a new variable, leaving the original variable untouched.

  3. Here the or option, placed after the comma, asks for odds ratios instead of logit coefficients.

  4. Here again the or option, placed after the comma, asks for odds ratios instead of ordered logit coefficients.

  5. Here the rr option, placed after the comma, asks for relative risk ratios instead of multinomial logit coefficients.

  6. For graphing commands, you can often add options after the comma, separated by spaces, e.g. title("title of the graph") xtitle("title of the x axis") ytitle("title of the y axis").
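The points made in footnotes 1, 3, and 6 can be sketched in a few lines. As before, the auto dataset that ships with Stata is an assumption made for illustration, and the graph titles are placeholders.

```stata
* Load the example dataset shipped with Stata (an illustrative assumption)
sysuse auto, clear

* Footnote 1: a short abbreviation matches variable names and labels
lookfor rep

* Footnote 3: or changes only what is displayed; the stored
* coefficient _b[mpg] is still on the logit scale
logit foreign mpg, or
display exp(_b[mpg])

* Footnote 6: graph options follow a single comma, separated by spaces
histogram mpg, title("Fuel economy") xtitle("Miles per gallon") ytitle("Density")
```

The value shown by display should match the odds ratio for mpg in the logit output, since an odds ratio is the exponentiated logit coefficient.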