Demonstration of Clustered Data
```r
library(lme4)      # MLM
library(ggplot2)   # beautiful graphs
library(gganimate) # animated ggplots
library(plotly)    # interactive graphs
# library(broom)
library(pander)    # nice tables
library(sjPlot)    # nice tables for MLM
```
1 Grouped and Individual Data
“The data were generated from random numbers, and there is no relation between X and Y at all. Firstly, values of X and Y were generated for each ‘subject,’ then a further random number was added to make the individual observation.”
From Bland and Altman, BMJ, 1994, 308, 896.
So… we follow their procedure.
```r
set.seed(3846)               # random seed

g <- seq(1, 5)               # group identifiers
x <- rnorm(5, 100, 10)       # group x
y <- rnorm(5, 100, 10)       # group y

# 5 obs / group
id        <- seq(1, 25)
group_num <- rep(g, each = 5)
x_group   <- rep(x, each = 5)  # x for the group
y_group   <- rep(y, each = 5)  # y for the group

x_noise <- rnorm(25, 0, 1)   # random noise
y_noise <- rnorm(25, 0, 1)   # random noise

x_individual <- x_group + x_noise  # individual = group + noise
y_individual <- y_group + y_noise  # individual = group + noise

mydata <- data.frame(id,
                     group_num,
                     x_group,
                     y_group,
                     x_individual,
                     y_individual)
```
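As a quick sanity check (this snippet is my own addition, not part of the original analysis; it assumes the `mydata` frame built above), the between-group spread of y should be large (the group centers were drawn with SD = 10) while the within-group spread should be small (the added noise has SD = 1):

```r
# Hypothetical check: between-group vs. within-group spread of simulated y
group_means <- aggregate(y_individual ~ group_num, data = mydata, FUN = mean)
sd(group_means$y_individual)                             # between-group spread
mean(tapply(mydata$y_individual, mydata$group_num, sd))  # within-group noise
```

The first value should be roughly an order of magnitude larger than the second, which is exactly the clustering that the animation below visualizes.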
The animation below illustrates the process of simulating the data.
```r
ggplot(mydata,
       aes(color = factor(group_num),
           label = group_num)) +
  geom_point(aes(x = x_group,
                 y = y_group),
             size = 10,
             show.legend = FALSE) +
  geom_text(aes(x = x_group,
                y = y_group),
            color = "white",
            show.legend = FALSE) +
  geom_point(aes(x = x_individual,
                 y = y_individual),
             show.legend = FALSE) +
  transition_layers(layer_length = 1,
                    transition_length = 7) +
  enter_fade() +
  enter_grow() +
  exit_fade() +
  exit_shrink() +
  labs(title = "Illustrating the Process of Simulating the Data",
       x = "x",
       y = "y") +
  theme_minimal() +
  scale_color_viridis_d(name = "group")
```
2 Analyses
2.1 OLS
```r
myOLS <- lm(y_individual ~ x_individual, data = mydata)
# summary(myOLS)

sjPlot::tab_model(myOLS,
                  show.se = TRUE,
                  show.ci = FALSE,
                  show.stat = TRUE)
```
**y_individual**

| Predictors | Estimates | std. Error | Statistic | p |
|:---|:---:|:---:|:---:|:---:|
| (Intercept) | 4.49 | 12.44 | 0.36 | 0.722 |
| x individual | 1.05 | 0.14 | 7.75 | <0.001 |
| Observations | 25 | | | |
| R2 / R2 adjusted | 0.723 / 0.711 | | | |
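One way to see why this pooled OLS slope is misleading (a sketch of my own, not in the original, assuming the `mydata` frame built above): fit a separate OLS within each group. Because X and Y are independent within groups, the per-group slopes should hover near zero even though the pooled slope is about 1.

```r
# Per-group OLS fits (illustrative only): the pooled slope vanishes
# once the analysis is confined to a single cluster.
by(mydata, mydata$group_num,
   function(d) coef(lm(y_individual ~ x_individual, data = d)))
```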
```r
# pander(tidy(myOLS))
```
2.2 MLM
```r
myMLM <- lmer(y_individual ~ x_individual + (1 | group_num),
              data = mydata)
# summary(myMLM)

sjPlot::tab_model(myMLM,
                  show.se = TRUE,
                  show.ci = FALSE,
                  show.stat = TRUE)
```
**y_individual**

| Predictors | Estimates | std. Error | Statistic | p |
|:---|:---:|:---:|:---:|:---:|
| (Intercept) | 98.71 | 15.74 | 6.27 | <0.001 |
| x individual | 0.02 | 0.16 | 0.12 | 0.903 |
| **Random Effects** | | | | |
| σ2 | 0.62 | | | |
| τ00 group_num | 97.39 | | | |
| ICC | 0.99 | | | |
| N group_num | 5 | | | |
| Observations | 25 | | | |
| Marginal R2 / Conditional R2 | 0.000 / 0.994 | | | |
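The ICC reported above can be recomputed by hand from the variance components (my own sketch, assuming the `myMLM` fit above): τ00 / (τ00 + σ2) = 97.39 / (97.39 + 0.62) ≈ 0.99, confirming that nearly all of the variation in y sits between groups.

```r
# Recompute the ICC from the fitted variance components (illustration only)
vc  <- as.data.frame(lme4::VarCorr(myMLM))  # rows: group intercept, residual
icc <- vc$vcov[1] / sum(vc$vcov)            # tau00 / (tau00 + sigma^2)
round(icc, 3)
```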
```r
# pander(tidy(myMLM))
```
2.3 Compare OLS and MLM
```r
tab_model(myOLS, myMLM,
          dv.labels = c("OLS", "MLM"),
          show.se = TRUE,
          show.ci = FALSE,
          show.stat = TRUE)
```
| | **OLS** | | | | **MLM** | | | |
|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Predictors | Estimates | std. Error | Statistic | p | Estimates | std. Error | Statistic | p |
| (Intercept) | 4.49 | 12.44 | 0.36 | 0.722 | 98.71 | 15.74 | 6.27 | <0.001 |
| x individual | 1.05 | 0.14 | 7.75 | <0.001 | 0.02 | 0.16 | 0.12 | 0.903 |
| **Random Effects** | | | | | | | | |
| σ2 | | | | | 0.62 | | | |
| τ00 group_num | | | | | 97.39 | | | |
| ICC | | | | | 0.99 | | | |
| N group_num | | | | | 5 | | | |
| Observations | 25 | | | | 25 | | | |
| R2 / R2 adjusted | 0.723 / 0.711 | | | | 0.000 / 0.994 | | | |
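The comparison shows that the apparent X–Y relation lives entirely between groups: once the MLM absorbs the group intercepts, the slope collapses to ~0. A complementary check (my own sketch, not in the original; it assumes `mydata` from above) is a "between" regression on the five group means, which recovers essentially the relation that the pooled OLS was really picking up.

```r
# "Between" regression on the 5 group means (hypothetical addition)
btw <- aggregate(cbind(x_individual, y_individual) ~ group_num,
                 data = mydata, FUN = mean)
summary(lm(y_individual ~ x_individual, data = btw))
```

With only five clusters this fit is noisy, but its slope tracks the pooled OLS slope far more closely than the near-zero within-group slope the MLM reports.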