library(ggplot2) # graphing library used throughout
load("social-service-agency.RData") # simulated data
Andy Grogan-Kaylor
January 16, 2024
ggplot2 is a powerful graphing library that can make beautiful graphs. ggplot2 can also help us to understand ideas of an underlying “grammar of graphics”.
However, ggplot2 can be difficult to learn. I am thinking that one way to better understand ggplot2 might be to see how this graphing library could be applied to a concrete example of comparing program outcomes.
In this example, program is a factor and mental health at time 2 is numeric.
The mental health variables are scaled to have an average of 100. Lower numbers indicate lower mental health, while higher numbers indicate higher mental health.
There is a lot of code below. This is where we are setting up the grammatical logic of the graphing approach.
Devoting some time to setting up the initial logic of the plot will pay dividends in terms of exploring multiple geometries later on.
Note that I am adding optional scale_... and theme... calls just to make the graphs look a little nicer, but these are not an essential part of the code.
myplot1 <- ggplot(clients,                      # the data I am using
                  aes(x = program,              # x is program
                      y = mental_health_T2,     # y is mental health
                      color = program,          # color is also program
                      fill = program)) +        # fill is also program
  labs(y = "mental health at time 2") +         # labels
  scale_color_viridis_d() +                     # beautiful colors
  scale_fill_viridis_d() +                      # beautiful fills
  theme_minimal() +                             # minimal theme
  theme(axis.text.x = element_text(size = rel(.75))) # smaller labels
Now that we have devoted a lot of code to setting up the grammar of the graph, it is a relatively simple matter to try out different geometries. The geometries show the average value.
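One possible sketch of this step, assuming the averages are drawn with stat_summary() and fun = mean (the exact calls behind the original graphs may differ). The stand-in data, the program names, and the minimal base plot are hypothetical and are only used if clients and myplot1 are not already defined.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# Minimal base plot (assumption), used only if `myplot1` is not already defined
if (!exists("myplot1")) {
  myplot1 <- ggplot(clients, aes(x = program, y = mental_health_T2,
                                 color = program, fill = program))
}

# Same base plot, two different geometries for the per-program mean
p_point <- myplot1 + stat_summary(fun = mean, geom = "point", size = 5)
p_bar   <- myplot1 + stat_summary(fun = mean, geom = "bar")

p_point # print to draw the graph
```

Only the last line of each plot changes; the data, aesthetics, scales, and theme all come from the base plot.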
The segments connecting the x axis with the points require their own geometry that has its own aesthetic.
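A hedged sketch of such a "lollipop" graph, assuming the means are first computed with aggregate() and the segments are drawn with geom_segment(), which needs xend and yend in its own aesthetic. The helper name program_means, the stand-in data, and the minimal base plot are hypothetical.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# Minimal base plot (assumption), used only if `myplot1` is not already defined
if (!exists("myplot1")) {
  myplot1 <- ggplot(clients, aes(x = program, y = mental_health_T2,
                                 color = program, fill = program))
}

# One row per program, holding the mean outcome
program_means <- aggregate(mental_health_T2 ~ program, data = clients, FUN = mean)

p_lollipop <- myplot1 +
  geom_segment(data = program_means,
               aes(xend = program, yend = 0)) + # segment runs from the mean down to y = 0
  geom_point(data = program_means, size = 5)    # the point at the mean

p_lollipop # print to draw the graph
```

The x, y, and color mappings are inherited from the base plot, so the segment layer only has to say where each segment ends.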
An extra element of the aesthetic is required for lines.
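A sketch of one way to do this, assuming the extra element is group = 1, which tells ggplot2 to treat all programs as a single series so that the means can be connected. The stand-in data and the minimal base plot are hypothetical.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# Minimal base plot (assumption), used only if `myplot1` is not already defined
if (!exists("myplot1")) {
  myplot1 <- ggplot(clients, aes(x = program, y = mental_health_T2,
                                 color = program, fill = program))
}

# With a discrete x, each program is its own group by default, so no line is drawn;
# group = 1 puts all programs in one group
p_line <- myplot1 +
  stat_summary(fun = mean, geom = "line", aes(group = 1))

p_line # print to draw the graph
```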
A line chart is likely not an appropriate way to show these program outcomes, as a line chart is more appropriate when the x axis represents some kind of time trend.
Now that we have devoted a lot of code to setting up the grammar of the graph, it is a relatively simple matter to try out different geometries. The geometries show the distribution of all values.
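As an illustration, here is a sketch with three geometries that show every value instead of a summary. The choice of geom_boxplot(), geom_violin(), and geom_jitter() is an assumption, as are the stand-in data and the minimal base plot.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# Minimal base plot (assumption), used only if `myplot1` is not already defined
if (!exists("myplot1")) {
  myplot1 <- ggplot(clients, aes(x = program, y = mental_health_T2,
                                 color = program, fill = program))
}

p_box    <- myplot1 + geom_boxplot(alpha = .5)              # quartiles and outliers
p_violin <- myplot1 + geom_violin(alpha = .5)               # smoothed shape of the distribution
p_jitter <- myplot1 + geom_jitter(width = .2, alpha = .5)   # one point per client

p_violin # print to draw the graph
```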
Again, there is a lot of code below. This is where we are setting up the grammatical logic of the graphing approach.
myplot2 <- ggplot(clients,                      # the data I am using
                  aes(x = mental_health_T2,     # x is mental health
                      fill = program)) +        # fill is program
  facet_wrap(~program) +                        # facet on this variable
  labs(x = "mental health at time 2") +         # labels
  scale_color_viridis_d() +                     # beautiful colors
  scale_fill_viridis_d() +                      # beautiful fills
  theme_bw()                                    # bw theme makes facets more clear
However, now that we have devoted a lot of code to setting up the grammar of the graph, it is again a relatively simple matter to try out different geometries.
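A sketch of what this could look like for the faceted plot, assuming a histogram and a density curve as the two geometries. The stand-in data and the minimal version of myplot2 are hypothetical and only used if the real objects are not already defined.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# Minimal faceted base plot (assumption), used only if `myplot2` is not already defined
if (!exists("myplot2")) {
  myplot2 <- ggplot(clients, aes(x = mental_health_T2, fill = program)) +
    facet_wrap(~program)
}

p_hist <- myplot2 + geom_histogram(bins = 30) # counts per bin, one panel per program
p_dens <- myplot2 + geom_density()            # smoothed density, one panel per program

p_hist # print to draw the graph
```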
One last time, there is a lot of code below. This is where we are setting up the grammatical logic of the graphing approach.
And again, now that we have devoted a lot of code to setting up the grammar of the graph, it is again a relatively simple matter to try out different geometries. [1]
[1] It is important to use (alpha = ...) to create transparency with these geoms.
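To illustrate the note about transparency, here is a hedged sketch that assumes overlapping density curves drawn in a single panel. This particular plot, the name p_overlap, and the stand-in data are hypothetical.

```r
library(ggplot2)

# Stand-in data (hypothetical), used only if `clients` is not already loaded
if (!exists("clients")) {
  set.seed(1)
  clients <- data.frame(
    program = factor(rep(c("Program A", "Program B", "Program C"), each = 50)),
    mental_health_T2 = rnorm(150, mean = 100, sd = 10)
  )
}

# All programs share one panel, so the filled curves overlap;
# alpha = .5 makes each fill half transparent so none is hidden
p_overlap <- ggplot(clients, aes(x = mental_health_T2, fill = program)) +
  geom_density(alpha = .5) +
  labs(x = "mental health at time 2")

p_overlap # print to draw the graph
```

Without alpha, the curve drawn last would cover the ones beneath it.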