clear all
set seed 3846 // set random seed
quietly set obs 10 // 10 observations
generate id = _n // id number
quietly expand 3 // expand by 3
sort id // sort by id
bysort id: generate t = _n // time variable
generate x = rnormal(10, 3) // random normal variable
generate w = rbinomial(1, .3) // random binomial variable
generate e = rnormal(0, 1) // random error
generate y = x + w + e // regression equation
drop e // drop error
list // list out the data
save longitudinal.dta, replace
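The estimation command behind the output that follows is not shown in this section. The output header (mixed-effects ML regression, group variable `id`, a `1.w` coefficient, and a single `var(_cons)` term under `id: Identity`) is consistent with a random-intercept model along these lines. The exact command is an assumption, not the author's confirmed call.

```stata
* Assumed command; the source shows only the output
use longitudinal.dta, clear

* Random intercept for each id; i.w enters w as a 0/1 factor variable
mixed y x i.w || id:
```

`mixed` fits by maximum likelihood by default, which matches the "Mixed-effects ML regression" title in the output.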
Performing EM optimization
Performing gradient-based optimization:
Iteration 0: Log likelihood = -41.789697
Iteration 1: Log likelihood = -41.654948
Iteration 2: Log likelihood = -41.653312
Iteration 3: Log likelihood = -41.65331
Computing standard errors ...
Mixed-effects ML regression Number of obs = 30
Group variable: id Number of groups = 10
Obs per group:
min = 3
avg = 3.0
max = 3
Wald chi2(2) = 236.62
Log likelihood = -41.65331 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
y | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
x | .938261 .0626023 14.99 0.000 .8155627 1.060959
1.w | 1.682743 .3765298 4.47 0.000 .9447577 2.420728
_cons | .3540235 .6672312 0.53 0.596 -.9537257 1.661773
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
id: Identity |
var(_cons) | 1.66e-15 1.53e-11 0 .
-----------------------------+------------------------------------------------
var(Residual) | .9408329 .2429279 .5671891 1.56062
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 1.4e-14 Prob >= chibar2 = 1.0000
3 Fixed Effects
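We assume that the \(u_{0i}\) are, in fact, estimable. In practice, however, we estimate the within-transformed (demeaned) equation \(y_{it} - \bar y_i = \beta_1 (x_{it} - \bar x_i) + \beta_2 (w_{it} - \bar w_i) + (e_{it} - \bar e_i)\), where a bar denotes the mean over time for individual \(i\). Because the \(u_{0i}\) are constant within an individual, they drop out of this equation.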
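The step from the underlying model to the demeaned equation can be written out explicitly. The sketch below assumes the model has an intercept, written here as \(\beta_0\) (a symbol not used in the text above), plus the individual effect \(u_{0i}\).

```latex
\begin{align*}
y_{it} &= \beta_0 + \beta_1 x_{it} + \beta_2 w_{it} + u_{0i} + e_{it}
  && \text{(assumed model)} \\
\bar y_i &= \beta_0 + \beta_1 \bar x_i + \beta_2 \bar w_i + u_{0i} + \bar e_i
  && \text{(average over } t \text{ within } i\text{)} \\
y_{it} - \bar y_i &= \beta_1 (x_{it} - \bar x_i) + \beta_2 (w_{it} - \bar w_i) + (e_{it} - \bar e_i)
  && \text{(subtract the second line from the first)}
\end{align*}
```

Both \(\beta_0\) and \(u_{0i}\) appear unchanged in the first two lines, so they cancel in the subtraction. Any other variable that does not change over time would cancel in the same way.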
Fixed-effects (within) regression Number of obs = 30
Group variable: id Number of groups = 10
R-squared: Obs per group:
Within = 0.9142 min = 3
Between = 0.8102 avg = 3.0
Overall = 0.8673 max = 3
F(2, 18) = 95.93
corr(u_i, Xb) = -0.3779 Prob > F = 0.0000
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
x | .987199 .0714222 13.82 0.000 .8371465 1.137252
1.w | 2.757344 .5380926 5.12 0.000 1.626853 3.887834
_cons | -.4908022 .8026548 -0.61 0.549 -2.177117 1.195513
-------------+----------------------------------------------------------------
sigma_u | .87126686
sigma_e | .93451278
rho | .46501875 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 18) = 1.59 Prob > F = 0.1919
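The command that produced the fixed-effects output is not shown in this section. A minimal sketch, assuming the panel is declared with `xtset` and the model mirrors the one above, is given below, followed by a by-hand version of the within transformation. The variable names ending in `_bar` and `_dm` are hypothetical helpers introduced only for illustration.

```stata
* Assumed command; the source shows only the output
use longitudinal.dta, clear
xtset id t                      // declare panel (id) and time (t) variables
xtreg y x i.w, fe               // fixed-effects (within) estimator

* By-hand within transformation: subtract each id's mean from y, x, and w
foreach v of varlist y x w {
    bysort id: egen double `v'_bar = mean(`v')   // person-specific mean
    generate double `v'_dm = `v' - `v'_bar       // deviation from that mean
}

* OLS on the demeaned variables reproduces the xtreg, fe slope estimates
regress y_dm x_dm w_dm, noconstant
```

The slopes from the last `regress` should match those from `xtreg, fe`. The standard errors will differ, because `regress` does not subtract the estimated person means from the residual degrees of freedom.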
In cross-lagged regression, we need the data to be in wide format rather than long format.
use longitudinal.dta, clear
reshape wide y x w, i(id) j(t) // reshape data to wide
save longitudinalWIDE.dta, replace
(j = 1 2 3)
Data Long -> Wide
-----------------------------------------------------------------------------
Number of observations 30 -> 10
Number of variables 5 -> 10
j variable (3 values) t -> (dropped)
xij variables:
y -> y1 y2 y3
x -> x1 x2 x3
w -> w1 w2 w3
-----------------------------------------------------------------------------
file longitudinalWIDE.dta saved
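With the data in wide format, a cross-lagged specification can be sketched as a pair of regressions in which each later-wave variable is predicted by both earlier-wave variables. The source does not show the author's cross-lagged model, so the variables and the use of `regress` here are assumptions made for illustration.

```stata
* Illustrative sketch only; not the author's model
use longitudinalWIDE.dta, clear

* Wave-2 y on wave-1 y and wave-1 x: lagged effect of x on y,
* with earlier y entered explicitly as a control
regress y2 y1 x1

* Wave-2 x on wave-1 x and wave-1 y: lagged effect of y on x
regress x2 x1 y1
```

Reading the two regressions together is what allows reciprocal effects to be examined. In this simulated data set, `y` depends only on same-wave `x` and `w`, and there are only 10 individuals, so the lagged coefficients are not expected to show real effects and will be imprecise.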
Method <- c("Multilevel Modeling", "Fixed Effects", "Cross Lagged Regression")
`Control for Time Invariant Observed` <- c("yes", "yes", "yes")
`Control for Time Varying Observed` <- c("yes", "yes", "yes")
`Control for Time Invariant Unobserved` <- c("partially", "yes", "no")
`Control for Time Varying Unobserved` <- c("no", "no", "no")
`Estimate Reciprocal Causality` <- c("no", "no", "yes")
`Control for Earlier or Baseline y` <- c("automatic", "automatic", "must explicitly specify")
mytable <- data.frame(Method,
                      `Control for Time Invariant Observed`,
                      `Control for Time Varying Observed`,
                      `Control for Time Invariant Unobserved`,
                      `Control for Time Varying Unobserved`,
                      `Estimate Reciprocal Causality`,
                      `Control for Earlier or Baseline y`,
                      check.names = FALSE)
pander::pander(mytable)
| Method | Control for Time Invariant Observed | Control for Time Varying Observed | Control for Time Invariant Unobserved | Control for Time Varying Unobserved | Estimate Reciprocal Causality | Control for Earlier or Baseline y |
|---|---|---|---|---|---|---|
| Multilevel Modeling | yes | yes | partially | no | no | automatic |
| Fixed Effects | yes | yes | yes | no | no | automatic |
| Cross Lagged Regression | yes | yes | no | no | yes | must explicitly specify |
Footnotes
Some of the decisions in this table are arguable.