Background

Odds ratios, or the coefficients that relate the independent variables to the log odds, are the most immediate output of a logistic regression. However, because odds ratios do not directly convey how likely the outcome is, it often makes sense to report not only odds ratios but also predicted probabilities.
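
For reference, the standard relationship between the log odds, the odds ratios, and the predicted probability can be written as follows. The symbols \(x_i\), \(\beta\), and \(p_i\) are notation introduced here for illustration; they do not appear in the Stata output.

\[
\ln\!\left(\frac{p_i}{1 - p_i}\right) = x_i'\beta
\quad\Longleftrightarrow\quad
p_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)},
\qquad
\mathrm{OR}_k = \exp(\beta_k)
\]

An odds ratio describes a multiplicative change in the odds, whereas \(p_i\) depends on the values of all the independent variables, which is why predicted probabilities have to be evaluated at chosen covariate values.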

Get The Data

We start by obtaining simulated data from StataCorp.

. clear all
. graph close _all
. use http://www.stata-press.com/data/r15/margex, clear
(Artificial data for margins)

Describe The Data

The variables are as follows:

. describe

Contains data from http://www.stata-press.com/data/r15/margex.dta
 Observations:         3,000                  Artificial data for margins
    Variables:            11                  27 Nov 2016 14:27
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Variable      Storage   Display    Value
    name         type    format    label      Variable label
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
y               float   %6.1f                 
outcome         byte    %2.0f                 
sex             byte    %6.0f      sexlbl     
group           byte    %2.0f                 
age             float   %3.0f                 
distance        float   %6.2f                 
ycn             float   %6.1f                 
yc              float   %6.1f                 
treatment       byte    %2.0f                 
agegroup        byte    %8.0g      agelab     
arm             byte    %8.0g                 
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Sorted by: group

Estimate Logistic Regression (logit)

We then estimate a logistic regression model in which outcome is the dependent variable and sex, age, and group are the independent variables. The or option reports odds ratios rather than coefficients.

. logit outcome i.sex c.age i.group, or

Iteration 0:   log likelihood = -1366.0718  
Iteration 1:   log likelihood = -1111.4595  
Iteration 2:   log likelihood =  -1069.588  
Iteration 3:   log likelihood =      -1068  
Iteration 4:   log likelihood = -1067.9941  
Iteration 5:   log likelihood = -1067.9941  

Logistic regression                                     Number of obs =  3,000
                                                        LR chi2(4)    = 596.16
                                                        Prob > chi2   = 0.0000
Log likelihood = -1067.9941                             Pseudo R2     = 0.2182

─────────────┬────────────────────────────────────────────────────────────────
     outcome │ Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         sex │
     female  │    1.64734    .221973     3.70   0.000      1.26499    2.145258
         age │    1.09444   .0070921    13.93   0.000     1.080628    1.108429
             │
       group │
          2  │   .5568139   .0751806    -4.34   0.000     .4273478     .725502
          3  │   .2566074   .0747822    -4.67   0.000     .1449462    .4542885
             │
       _cons │   .0038757   .0013558   -15.87   0.000     .0019524    .0076933
─────────────┴────────────────────────────────────────────────────────────────
Note: _cons estimates baseline odds.
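
To connect the odds ratios to a probability, the following sketch replays the model on the log-odds scale and then computes a predicted probability by hand. The covariate profile (a 40-year-old in group 1, the base category of group) is chosen here purely for illustration and is not part of the original analysis.

```stata
* Replay the most recent logit results as coefficients (log odds)
* instead of odds ratios
logit

* An odds ratio is the exponentiated coefficient
display exp(_b[age])

* Predicted probability for an illustrative profile:
* male (base level of sex), group 1 (base level of group), age 40
display invlogit(_b[_cons] + _b[age]*40)

* Same profile, but female: add the coefficient for sex == 1
display invlogit(_b[_cons] + _b[1.sex] + _b[age]*40)
```

Doing this for every profile of interest quickly becomes tedious, which is one motivation for the margins command.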

Margins (margins)

We use the margins command to estimate predicted probabilities at different values of sex and age.

. margins sex, at(age = (20 30 40 50 60))

Predictive margins                                       Number of obs = 3,000
Model VCE: OIM

Expression: Pr(outcome), predict()
1._at: age = 20
2._at: age = 30
3._at: age = 40
4._at: age = 50
5._at: age = 60

─────────────┬────────────────────────────────────────────────────────────────
             │            Delta-method
             │     Margin   std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
     _at#sex │
     1#male  │   .0153934   .0031264     4.92   0.000     .0092657    .0215211
   1#female  │   .0250609   .0046143     5.43   0.000     .0160171    .0341048
     2#male  │   .0369626   .0054588     6.77   0.000     .0262635    .0476616
   2#female  │   .0592151   .0072711     8.14   0.000     .0449639    .0734663
     3#male  │   .0856677   .0088815     9.65   0.000     .0682603    .1030751
   3#female  │   .1325688   .0097333    13.62   0.000     .1134919    .1516458
     4#male  │   .1844578    .015461    11.93   0.000     .1541547    .2147608
   4#female  │   .2677423    .015609    17.15   0.000     .2371493    .2983353
     5#male  │    .349279    .029326    11.91   0.000     .2918012    .4067569
   5#female  │   .4622129   .0303129    15.25   0.000     .4028007    .5216251
─────────────┴────────────────────────────────────────────────────────────────
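
These are predictive margins: for each row, margins sets sex and age to the stated values for every observation, leaves group at its observed values, and averages the resulting predicted probabilities. A rough sketch of that calculation for males at age 20 is shown below. The variable name p_male20 is hypothetical, and the mean it reports should correspond to the 1#male row above.

```stata
* Reproduce one predictive margin by hand (males at age 20)
preserve
    replace age = 20              // set age to 20 for everyone
    replace sex = 0               // treat everyone as male
    predict double p_male20, pr   // predicted probabilities from the logit model
    summarize p_male20            // the mean is the predictive margin
restore

* The commands above overwrite r(), so rerun margins before calling marginsplot
quietly margins sex, at(age = (20 30 40 50 60))
```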

Plotting Margins (marginsplot)

margins produces many results, which can be difficult to interpret in a table. We therefore use marginsplot to plot the results of the preceding margins command. marginsplot can be used on its own; here I have simply added the Michigan graph scheme.

. marginsplot, scheme(michigan)

Variables that uniquely identify margins: age sex
. graph export mymarginsplot.png, width(500) replace
file mymarginsplot.png saved as PNG format
Graph of Predicted Margins
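
If more control over the plot's appearance is wanted, marginsplot accepts additional options. The sketch below is one possibility, not the command used to produce the figure above; the titles and the output filename mymarginsplot2.png are placeholders.

```stata
* marginsplot needs margins results in memory, so rerun margins first
quietly margins sex, at(age = (20 30 40 50 60))

marginsplot, ///
    recast(line) recastci(rarea)              /// lines with shaded confidence bands
    ciopts(color(%30))                        /// semi-transparent bands (Stata 15 or later)
    title("Predicted Probability of Outcome") ///
    ytitle("Pr(outcome)") xtitle("Age")       ///
    scheme(michigan)

graph export mymarginsplot2.png, width(500) replace
```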


Predicted Probabilities (predict)

Predicted probabilities are each participant's individual predicted probability of experiencing the outcome, based upon the independent variables included in the model. We often denote such predicted probabilities with \(\hat{y}\).

. predict yhat
(option pr assumed; Pr(outcome))

yhat is a variable in the data, just like any other variable, and we can summarize and graph it.

. twoway (lowess yhat age if sex == 0) ///
> (lowess yhat age if sex == 1), ///
> title("Predicted Probabilities of Outcome") ///
> legend(order(1 "male" 2 "female")) ///
> scheme(michigan)
. graph export mytwoway.png, width(500) replace
file mytwoway.png saved as PNG format
Graph of Predicted Probabilities
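
Because yhat is an ordinary variable, it can also be summarized numerically. As a brief sketch, the following commands summarize the individual predictions overall and by sex. Unlike the margins results, these are averages over each group's observed covariate values, so they need not match the predictive margins.

```stata
* Overall distribution of the individual predicted probabilities
summarize yhat, detail

* Mean, minimum, and maximum of the predictions for men and women
tabstat yhat, by(sex) statistics(mean min max)
```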


Rerun margins, Posting Results

We again employ the margins command, this time using the post option so that the results of the margins command are posted as an estimation result. This will allow us to employ the test command to statistically test different margins against each other.

. margins sex, at(age = (20 30 40 50 60)) post

Predictive margins                                       Number of obs = 3,000
Model VCE: OIM

Expression: Pr(outcome), predict()
1._at: age = 20
2._at: age = 30
3._at: age = 40
4._at: age = 50
5._at: age = 60

─────────────┬────────────────────────────────────────────────────────────────
             │            Delta-method
             │     Margin   std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
     _at#sex │
     1#male  │   .0153934   .0031264     4.92   0.000     .0092657    .0215211
   1#female  │   .0250609   .0046143     5.43   0.000     .0160171    .0341048
     2#male  │   .0369626   .0054588     6.77   0.000     .0262635    .0476616
   2#female  │   .0592151   .0072711     8.14   0.000     .0449639    .0734663
     3#male  │   .0856677   .0088815     9.65   0.000     .0682603    .1030751
   3#female  │   .1325688   .0097333    13.62   0.000     .1134919    .1516458
     4#male  │   .1844578    .015461    11.93   0.000     .1541547    .2147608
   4#female  │   .2677423    .015609    17.15   0.000     .2371493    .2983353
     5#male  │    .349279    .029326    11.91   0.000     .2918012    .4067569
   5#female  │   .4622129   .0303129    15.25   0.000     .4028007    .5216251
─────────────┴────────────────────────────────────────────────────────────────
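
With post, the margins replace the logit results as the active estimates, so e(b) now holds the ten margins rather than the model coefficients. A quick way to confirm this, sketched under the assumption that the command above has just been run:

```stata
* The active estimation command is now margins, not logit
display "`e(cmd)'"

* The posted margins and their variance-covariance matrix
matrix list e(b)
matrix list e(V)
```

Commands that need the original model, such as predict, will require refitting the logit model, or restoring it if it was saved with estimates store before posting.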

margins with coeflegend

We follow up by using the margins command with the coeflegend option to see the way in which Stata has labeled the different margins.

. margins, coeflegend

Predictive margins                                       Number of obs = 3,000
Model VCE: OIM

Expression: Pr(outcome), predict()
1._at: age = 20
2._at: age = 30
3._at: age = 40
4._at: age = 50
5._at: age = 60

─────────────┬────────────────────────────────────────────────────────────────
             │     Margin   Legend
─────────────┼────────────────────────────────────────────────────────────────
     _at#sex │
     1#male  │   .0153934  _b[1bn._at#0bn.sex]
   1#female  │   .0250609  _b[1bn._at#1.sex]
     2#male  │   .0369626  _b[2._at#0bn.sex]
   2#female  │   .0592151  _b[2._at#1.sex]
     3#male  │   .0856677  _b[3._at#0bn.sex]
   3#female  │   .1325688  _b[3._at#1.sex]
     4#male  │   .1844578  _b[4._at#0bn.sex]
   4#female  │   .2677423  _b[4._at#1.sex]
     5#male  │    .349279  _b[5._at#0bn.sex]
   5#female  │   .4622129  _b[5._at#1.sex]
─────────────┴────────────────────────────────────────────────────────────────
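
The legend names can be used wherever Stata accepts _b[] expressions. For example, a minimal sketch that displays the female and male margins at age 50 and their difference, using the names listed above:

```stata
* Margins at age 50 (the 4th at() value)
display _b[4._at#1.sex]       // female
display _b[4._at#0bn.sex]     // male

* Difference in predicted probability, female minus male, at age 50
display _b[4._at#1.sex] - _b[4._at#0bn.sex]
```

From the table above, this difference is .2677423 - .1844578, or about .083.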

Testing Margins Against Each Other

Lastly, we can test margins against each other, e.g., the margins for men and women at age 20, and again the margins for men and women at age 50.

. test _b[1bn._at#0bn.sex] = _b[1bn._at#1.sex] // male and female at age 20

 ( 1)  1bn._at#0bn.sex - 1bn._at#1.sex = 0

           chi2(  1) =   10.62
         Prob > chi2 =    0.0011
. test _b[4._at#0bn.sex] = _b[4._at#1.sex] // male and female at age 50

 ( 1)  4._at#0bn.sex - 4._at#1.sex = 0

           chi2(  1) =   13.85
         Prob > chi2 =    0.0002
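
The test command reports only a test statistic. To also obtain the size of each difference with a standard error and confidence interval, lincom can be applied to the same posted margins. This is a sketch of a possible follow-up, not part of the original analysis.

```stata
* Female minus male difference in predicted probability at age 20
lincom _b[1bn._at#1.sex] - _b[1bn._at#0bn.sex]

* Female minus male difference in predicted probability at age 50
lincom _b[4._at#1.sex] - _b[4._at#0bn.sex]
```

lincom reports a z statistic for each difference; its square should equal the chi2 that test reports for the same comparison.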