Multilevel and Longitudinal Modeling (SW 864)
An Interdisciplinary Doctoral Statistics Course
Multilevel models have become a standard statistical tool for quantitative research on neighborhoods, communities, and schools. The cross-sectional multilevel model is appropriate when respondents are clustered inside larger social or geographic units, e.g. people in geographic areas, residents in neighborhoods, or children in classrooms and/or schools. Cross-sectional multilevel models support thinking at multiple levels (units of analysis) and conceptualizing effects at the country, community, family, and individual levels.
Stata will be the statistical software employed in this course, and students will need access to Stata to participate.
You will need to already have Stata, to purchase Stata from https://www.stata.com/, or to use https://its.umich.edu/computing/computers-software/virtual-sites to access Stata.
Participation in this class requires that you have a personal laptop computer that is powerful enough to run standard statistical software (e.g. Stata) at a reasonable speed. The School of Social Work is not able to provide laptop computers.
This course requires a solid understanding of ordinary least squares (OLS) regression as a starting point. Generally this means that one has taken the introductory graduate-level two-semester sequence in statistics in a discipline like Psychology or Sociology.
Previously this course was numbered SW 832. The new number for the course is SW 864.
The first part of the course is all about the idea of “nesting”: e.g. students nested in classrooms, residents nested in neighborhoods, study participants nested in cities. The second part of the course extends these ideas to longitudinal data, thinking about situations where we have repeated measures on an outcome of interest, like anxiety, depression, or substance abuse. Course assignments focus on individual, data-driven student projects, so for students considering the course, it is useful to have an available data set that has both some kind of “nesting” and repeated measures of some outcome of interest.
Note that the major assignments in this class are two short empirical papers in which students apply the concepts of multilevel modeling. This means that students who are interested in this class should start thinking about appropriate data sets to use in the course. Data for the first paper should be suitable for a cross-sectional multilevel analysis: they should contain some indicator of where study participants live or study, whether that is a Census tract, neighborhood, state, country, classroom, or school. Data for the second paper should be suitable for a multilevel longitudinal analysis: they should contain repeated measures of some outcome on the same individuals over at least 2 time points.
Perhaps surprisingly, the multilevel model for cross-sectional data can easily accommodate longitudinal data where study participants are observed repeatedly over time. Longitudinal multilevel models represent a different kind of multilevel thinking where we extend the algebra of the multilevel model to think about events or occasions, nested inside individuals.
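As a rough illustration of how the algebra carries over, the two cases can be written side by side. The notation here is an assumption chosen for illustration (a single predictor, a random intercept, and, in the longitudinal case, a random slope on time); it is a sketch, not necessarily the notation used in the course.

```latex
% Cross-sectional case: individual i nested in cluster j (illustrative notation)
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + e_{ij},
\qquad u_{0j} \sim N(0, \sigma^2_{u0}), \quad e_{ij} \sim N(0, \sigma^2_{e})

% Longitudinal case: occasion t nested in individual i (illustrative notation)
y_{ti} = \beta_0 + \beta_1 \, \mathit{time}_{ti} + u_{0i} + u_{1i} \, \mathit{time}_{ti} + e_{ti},
\qquad e_{ti} \sim N(0, \sigma^2_{e})
```

In the second equation the individual takes the role that the cluster played in the first: the random terms u_{0i} and u_{1i} (assumed here to have mean zero) let each person have their own starting level and their own rate of change.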
Further, while this is sometimes not recognized, multilevel models for longitudinal data are closely related to other important longitudinal data models, such as fixed effects regression, an important technique for controlling for unobserved variables. Multilevel models are also closely related to statistical models for meta-analysis of multiple studies.
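One way to see the connection to fixed effects regression, sketched with assumed notation and a single time-varying predictor, is to start from a model with a person-specific term and compare how the two approaches handle that term.

```latex
% Shared starting point: person-specific term u_i (illustrative notation)
y_{ti} = \beta_0 + \beta_1 x_{ti} + u_{i} + e_{ti}

% Multilevel (random effects) approach: u_i is treated as a random draw,
% assumed uncorrelated with x_{ti}
u_{i} \sim N(0, \sigma^2_{u})

% Fixed effects approach: u_i is removed by subtracting person means
y_{ti} - \bar{y}_{i} = \beta_1 \left( x_{ti} - \bar{x}_{i} \right) + \left( e_{ti} - \bar{e}_{i} \right)
```

Because u_i does not vary over time, it drops out of the demeaned equation, which is why fixed effects regression controls for unobserved time-invariant characteristics of each person.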
This course focuses on the use of multilevel and longitudinal data analysis for social work and social science research. The course is conceptualized as covering the following topics:
- The multilevel model for cross-sectional data.
- The extension of multilevel modeling to longitudinal research (i.e. growth trajectory models).
- Other panel data models such as fixed effects and random effects models.
- Possible additional topics based on student interest (models for dichotomous outcomes; Bayesian approaches to multilevel modeling; power analysis for single level and multilevel models; meta-analysis)
The models discussed in this course are primarily intended, and best developed, for continuous outcomes, so the course is most appropriate for students whose research questions involve outcomes that can in some way be conceptualized as continuous.
The first half of the semester will be devoted to a discussion of cross-sectional and clustered data. The second half of the semester will focus on data with repeated measures on at least the dependent variable. Thus, it is very beneficial for students taking the course to locate suitable data for each half of the course prior to the beginning of the semester.
The course is intended for students who have some familiarity with ordinary least squares regression, since the OLS model will be used as a starting point for discussion.
The treatment of these topics involves discussion of the underlying statistical theory, but is also focused on the application of these models with real data. Stata is intended to be the primary statistical software used in this course.
Depending upon student interest, some reference may be made to the analysis of these models in other statistical packages such as R, Stan (brms), or SAS. Instruction is a mixture of in class lecture and discussion, and time spent in a lab session, working on your own laptop, working through actual data problems with statistical software. Typically, students write two data driven papers for the course, one due about the middle of the semester, and one due at the end of the semester. It is my intent that the papers should be based upon data from your own discipline(s) and interests. Frequently papers from the course go on to serve as the foundation for dissertation chapters or articles submitted for publication.