use simulated_multilevel_data.dta, clear
In another post, I mentioned that for a few recent projects, I have had to quickly get up to speed on new data. In that other post, I focused on the idea of using a pairs plot in R.
In this post, I want to focus on a similar idea. But instead of a pairs plot, I focus on using a regression model with a number of risk and protective factors predicting an outcome. Here I use Stata instead of R, and use Stata’s powerful margins
and marginsplot
command to get a visual idea of the way that these different risk and protective factors are associated with the outcome.
Using margins
means that I’m going to use some complicated looking syntax to get some results that are actually pretty straightforward.
For this example, I use the data from my Multilevel Workshop site.
Get The Data
Regression Model
Because these data are clustered by country, I use the mixed
command to get accurate standard errors and p values.
identity || country:
mixed outcome warmth physical_punishment i.intervention i.
Running /Users/agrogan/Desktop/GitHub/agrogan1.github.io/posts/exploring-data/profile
> .do ...
Performing EM optimization ...
Performing gradient-based optimization:
Iteration 0: Log likelihood = -9627.1393
Iteration 1: Log likelihood = -9627.1393
Computing standard errors ...
Mixed-effects ML regression Number of obs = 3,000
Group variable: country Number of groups = 30
Obs per group:
min = 100
avg = 100.0
max = 100
Wald chi2(4) = 373.21
Log likelihood = -9627.1393 Prob > chi2 = 0.0000
------------------------------------------------------------------------------------
outcome | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------------+----------------------------------------------------------------
warmth | .8342119 .0574661 14.52 0.000 .7215805 .9468434
physical_punishm~t | -.9934352 .0798217 -12.45 0.000 -1.149883 -.8369874
1.intervention | .6462874 .2175097 2.97 0.003 .2199762 1.072598
1.identity | -.3104984 .2170594 -1.43 0.153 -.7359271 .1149302
_cons | 51.79936 .4777788 108.42 0.000 50.86293 52.73579
------------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
country: Identity |
var(_cons) | 3.377381 .9627211 1.931725 5.904929
-----------------------------+------------------------------------------------
var(Residual) | 35.04285 .9093623 33.3051 36.87127
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 204.68 Prob >= chibar2 = 0.0000
Were the data not clustered, I could have just as easily used regress
.
regress outcome warmth physical_punishment i.intervention i.identity
Use margins
to Get Marginal Effects
margins
in Stata is a very versatile (and sometimes confusing) command. Here I use slightly complicated syntax, to get a very simple set of results. I use margins
, and the dydx(*)
option to get marginal effects, which here are simply the \(\beta\) coefficients reported above.
We want to “let margins
know about the regression coefficients”.
As we will see below, letting margins
calculate the regression coefficients lets us graph the results using marginsplot
.
Technically speaking, dydx(*)
is asking Stata to calculate the marginal effects, \(\frac{\partial y}{\partial x}\) for all (*
) independent variables in the model. In a linear model, with no interaction terms, \(\frac{\partial y}{\partial x} = \beta\).
Again, we are essentially using slightly complicated syntax to get simple results.
dydx(*) margins,
Running /Users/agrogan/Desktop/GitHub/agrogan1.github.io/posts/exploring-data/profile
> .do ...
Average marginal effects Number of obs = 3,000
Expression: Linear prediction, fixed portion, predict()
dy/dx wrt: warmth physical_punishment 1.intervention 1.identity
------------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
-------------------+----------------------------------------------------------------
warmth | .8342119 .0574661 14.52 0.000 .7215805 .9468434
physical_punishm~t | -.9934352 .0798217 -12.45 0.000 -1.149883 -.8369874
1.intervention | .6462874 .2175097 2.97 0.003 .2199762 1.072598
1.identity | -.3104984 .2170594 -1.43 0.153 -.7359271 .1149302
------------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
Use marginsplot
To Visualize the Marginal Effects
First I use marginsplot
with no options to visualize the marginal effects. This produces a pretty non-intuitive graph.
marginsplot
graph export "marginsplot1.png", replace
Use marginsplot
With Better Options
I add some options to clarify the graph. I recast
the graph as a scatter
plot. I flip the graph to be horizontal
. I also add a reference line at \(x = 0\) to make more clear which coefficients are significant, and which coefficients are not significant.
///
marginsplot, recast(scatter) /// recast as scatterplot
/// flip to horizontal
horizontal xline(0, lcolor(red)) // line at x = 0 in red
graph export "marginsplot2.png", replace
Conclusion
In conclusion:
- I ran a regression model using
mixed
(I could have usedregress
). - I transferred the values of the \(\beta\) regression coefficients to
margins
. - I used
marginsplot
with a few options to see the results.
These steps: mixed
or regress
; margins
; marginsplot
, could be repeated for almost any analysis.
Results suggest that identity
is not associated with the outcome
. The intervention
is associated with improvements in the outcome
. physical_punishment
is associated with a worsened outcome
. Parental warmth
is associated with an improved outcome
.