class: center, middle, inverse, title-slide

# Should Everything Be A Bar Graph?
## Use ⇦ and ⇨ to navigate.
### Andy Grogan-Kaylor
### 2021-10-16

---

<style type="text/css">
@import url('https://fonts.googleapis.com/css2?family=Montserrat&display=swap');

.title-slide { color: #ffcb05; background-color: #00274C; }
.title-slide h1 { color: #ffcb05; }
pre { white-space: pre-wrap; }
h1, h2, h3 { font-family: 'Montserrat', sans-serif; }
body { font-family: 'Montserrat', sans-serif; }
.author, .date { font-family: 'Montserrat', sans-serif; }
</style>

# How To Navigate This Presentation

* Use the <span style="font-size:100px">⇦</span> and <span style="font-size:100px">⇨</span> keys to move through the presentation.
* Press *o* for *panel overview*.
* This presentation plays a *tone* when each new item appears. Turn the volume down if you find these tones annoying.

---
class: animated, slideInRight

# Introduction

There are many different kinds of data visualization. In the language of the `ggplot2` package for R, there are many different kinds of `geom`etries that we can apply to data.

<div class="figure">
<img src="banner.png" alt="Multiple Geometries for Data Visualization" width="4002" />
<p class="caption">Multiple Geometries for Data Visualization</p>
</div>

However, after a number of years of working on data visualization, I have started to think about the advantages of bar graphs. While *not every visualization needs to be a bar graph*, it sometimes seems as though many data visualizations *would work well as a bar graph*.

---
class: animated, slideInRight

# These Slides

In this slide deck, I use the `ggplot2` package to develop bar graphs with `geom_bar`, one of the bar graph `geom`etries.

After building a basic bar graph, I apply many formatting ideas, most of which I have learned from the blog of Cedric Scherer ([https://www.cedricscherer.com/](https://www.cedricscherer.com/)).
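
---
class: animated, slideInRight

# A Sketch: Two Bar Geometries

As a rough illustration of what "one of the bar graph `geom`etries" can mean, the sketch below draws the same group means twice: once with `geom_bar`, which summarizes the raw data itself, and once with `geom_col`, which draws values that were calculated beforehand. The data frame `toydata`, the seed, and the object names are assumptions made up for this sketch; they are not part of the original slides.

```r
library(ggplot2) # call the library

set.seed(1234) # assumed seed, only so that the sketch is reproducible

# hypothetical toy data: two groups with different means
toydata <- data.frame(group = rep(c("A", "B"), each = 5),
                      outcome = c(rnorm(5, 10, 1), rnorm(5, 20, 1)))

# option 1: geom_bar() calculates the group means from the raw data
p_bar <- ggplot(toydata, aes(x = group, y = outcome)) +
  geom_bar(stat = "summary", fun = "mean")

# option 2: calculate the group means first, then draw them with geom_col()
toymeans <- aggregate(outcome ~ group, data = toydata, FUN = mean)

p_col <- ggplot(toymeans, aes(x = group, y = outcome)) +
  geom_col()
```

Printing `p_bar` and `p_col` should give bars of the same height. The rest of this deck follows the first option.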
---
class: animated, slideInRight

# Simulate Some Data

```r
N = 30

group <- c(rep("A", N), rep("B", N), rep("C", N)) # group variable

# mycount <- c(10, 20, 50) # count in each group

outcome <- c(rep(rnorm(N, 10, 1)),
             rep(rnorm(N, 20, 1)),
             rep(rnorm(N, 15, 1)))

mydata <- data.frame(group, outcome) # make a data frame
```

---
class: animated, slideInRight

# Replay The Data

```r
head(mydata) # show the top of the data
```

```
  group   outcome
1     A 10.326064
2     A 11.165971
3     A  9.267804
4     A 11.133150
5     A 10.812310
6     A  8.227487
```

---
class: animated, slideInRight

# Call The Library

```r
library(ggplot2) # call the library
```

---
class: animated, slideInRight

# Set Up The "Logic" Of The Plot

```r
p0 <- ggplot(mydata, # the data I am using
             aes(x = group, # x is the group
                 y = outcome)) # y is the outcome (averaged by group in the bars)
```

---
class: animated, slideInRight

# Basic Bar Graph

```r
p0 + # basic plot
  geom_bar(stat = "summary", fun = "mean") # bars with group means
```

<img src="index_files/figure-html/unnamed-chunk-7-1.png" width="432" />

---
class: animated, slideInRight

# Bar Graph With Color Fill

---
count: false

.panel1-graph1-non_seq[
```r
p0 +
  geom_bar(
    aes(

    ),
    stat = "summary",
    fun = "mean") # bars with group means
```
]

.panel2-graph1-non_seq[
<img src="index_files/figure-html/graph1_non_seq_01_output-1.png" width="432" />
]

---
count: false

.panel1-graph1-non_seq[
```r
p0 +
  geom_bar(
    aes(
*     fill = group
    ),
    stat = "summary",
    fun = "mean") # bars with group means
```
]

.panel2-graph1-non_seq[
<img src="index_files/figure-html/graph1_non_seq_02_output-1.png" width="432" />
]

<style>
.panel1-graph1-non_seq { color: black; width: 78.4%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-graph1-non_seq { color: black; width: 19.6%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-graph1-non_seq { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
class: animated, slideInRight, inverse,
center

# Some Tweaks

---
class: animated, slideInRight

# Call Library For Better Colors

```r
library(ggsci) # color palettes inspired by scientific journals
```

---
class: animated, slideInRight

# Calculate the *Mean*, *Minimum* and *Maximum* outcomes by group.

```r
library(dplyr) # data wrangling

mynewdata <- mydata %>%
  group_by(group) %>%
  mutate(meanoutcome = mean(outcome)) %>% # mean outcome by group
  mutate(minoutcome = min(outcome)) %>% # minimum outcome by group
  mutate(maxoutcome = max(outcome)) # max outcome by group
```

---
class: animated, slideInRight, inverse, center

# Bar Graph With Tweaks...

---
count: false

.panel1-graph2-auto[
```r
*ggplot(mynewdata, # the data I am using
*      aes(x = group, # x is the group
*          y = outcome)) # y is the outcome in each group
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_01_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
* geom_bar(aes(fill = group), # bars w/ color fill for group
*          stat = "summary",
*          fun = "mean") # bars with group means
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_02_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
* coord_flip() # flip the plot
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_03_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
* labs(title = "Group B Has The Highest \nMean Outcome", # informative title
*      x = "Group", # better axis labels
*      y = "Outcome")
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_04_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
* scale_fill_aaas() # better color fill scale
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_05_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
* theme_minimal() # better theme
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_06_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
  theme_minimal() + # better theme
* geom_text(aes(label = group, # label the bars
*               y = maxoutcome + 1), # position the labels
*           size = 10)
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_07_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the bars
                y = maxoutcome + 1), # position the labels
            size = 10) +
* theme(legend.position = "none") # turn off legend
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_08_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the bars
                y = maxoutcome + 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
* theme(title = element_text(size = rel(2))) # bigger title text
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_09_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the bars
                y = maxoutcome + 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
  theme(title = element_text(size = rel(2))) + # bigger title text
* theme(axis.text = element_text(size = rel(2))) # bigger axis text
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_10_output-1.png" width="432" />
]

---
count: false

.panel1-graph2-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_bar(aes(fill = group), # bars w/ color fill for group
           stat = "summary",
           fun = "mean") + # bars with group means
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_fill_aaas() + # better color fill scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the bars
                y = maxoutcome + 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
  theme(title = element_text(size = rel(2))) + # bigger title text
  theme(axis.text = element_text(size = rel(2))) # bigger axis text
```
]

.panel2-graph2-auto[
<img src="index_files/figure-html/graph2_auto_11_output-1.png" width="432" />
]

<style>
.panel1-graph2-auto { color: black; width: 78.4%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-graph2-auto { color: black; width: 19.6%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-graph2-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
class: animated,
slideInRight, inverse, center

# A Last Tweak: Show The Distribution

This example moves beyond a bar graph.

---
count: false

.panel1-graph3-auto[
```r
*ggplot(mynewdata, # the data I am using
*      aes(x = group, # x is the group
*          y = outcome)) # y is the outcome in each group
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_01_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
* geom_point(aes(color = group), # points w/ color for group
*            size = 5, # size
*            alpha = .5) # transparency
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_02_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
* geom_point(aes(color = group, # points w/ color for group
*                y = meanoutcome), # mean outcome
*            size = 20) # size
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_03_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
* coord_flip() # flip the plot
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_04_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
* labs(title = "Group B Has The Highest \nMean Outcome", # informative title
*      x = "Group", # better axis labels
*      y = "Outcome")
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_05_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
* scale_color_aaas() # better color scale
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_06_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
* theme_minimal() # better theme
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_07_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
  theme_minimal() + # better theme
* geom_text(aes(label = group, # label the groups
*               color = group, # color labels by group
*               y = minoutcome - 1), # position the labels
*           size = 10)
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_08_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the groups
                color = group, # color labels by group
                y = minoutcome - 1), # position the labels
            size = 10) +
* theme(legend.position = "none") # turn off legend
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_09_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the groups
                color = group, # color labels by group
                y = minoutcome - 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
* theme(title = element_text(size = rel(2))) # bigger title text
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_10_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the groups
                color = group, # color labels by group
                y = minoutcome - 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
  theme(title = element_text(size = rel(2))) + # bigger title text
* theme(axis.text = element_text(size = rel(2))) # bigger axis text
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_11_output-1.png" width="432" />
]

---
count: false

.panel1-graph3-auto[
```r
ggplot(mynewdata, # the data I am using
       aes(x = group, # x is the group
           y = outcome)) + # y is the outcome in each group
  geom_point(aes(color = group), # points w/ color for group
             size = 5, # size
             alpha = .5) + # transparency
  geom_point(aes(color = group, # points w/ color for group
                 y = meanoutcome), # mean outcome
             size = 20) + # size
  coord_flip() + # flip the plot
  labs(title = "Group B Has The Highest \nMean Outcome", # informative title
       x = "Group", # better axis labels
       y = "Outcome") +
  scale_color_aaas() + # better color scale
  theme_minimal() + # better theme
  geom_text(aes(label = group, # label the groups
                color = group, # color labels by group
                y = minoutcome - 1), # position the labels
            size = 10) +
  theme(legend.position = "none") + # turn off legend
  theme(title = element_text(size = rel(2))) + # bigger title text
  theme(axis.text = element_text(size = rel(2))) # bigger axis text
```
]

.panel2-graph3-auto[
<img src="index_files/figure-html/graph3_auto_12_output-1.png" width="432" />
]

<style>
.panel1-graph3-auto { color: black; width: 78.4%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-graph3-auto { color: black; width: 19.6%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-graph3-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>
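
---
class: animated, slideInRight

# A Sketch: One Label Per Group

The plots above draw their labels from `mynewdata`, which repeats each group's summary values on every row, so `geom_text` draws each label once per observation. One possible variation, sketched here under stated assumptions, collapses the data to one row per group with `summarise` and passes that smaller data frame only to the layers that show summaries. The seed, the object names `mysummary` and `p_dist`, and the omission of the `ggsci` color scale are choices made for this sketch; they are not part of the original slides.

```r
library(dplyr) # data wrangling
library(ggplot2) # call the library

set.seed(1234) # assumed seed, only so that the sketch is reproducible

# re-simulate the data so that the sketch runs on its own
N <- 30
mydata <- data.frame(group = rep(c("A", "B", "C"), each = N),
                     outcome = c(rnorm(N, 10, 1),
                                 rnorm(N, 20, 1),
                                 rnorm(N, 15, 1)))

# hypothetical summary data frame: one row per group
mysummary <- mydata %>%
  group_by(group) %>%
  summarise(meanoutcome = mean(outcome), # mean outcome by group
            minoutcome = min(outcome), # minimum outcome by group
            maxoutcome = max(outcome)) # max outcome by group

p_dist <- ggplot(mydata, # raw data for the small points
                 aes(x = group, # x is the group
                     y = outcome)) + # y is the outcome
  geom_point(aes(color = group), # one small point per observation
             size = 5,
             alpha = .5) +
  geom_point(data = mysummary, # one large point per group
             aes(color = group,
                 y = meanoutcome),
             size = 20) +
  geom_text(data = mysummary, # one label per group
            aes(label = group,
                color = group,
                y = minoutcome - 1),
            size = 10) +
  coord_flip() + # flip the plot
  theme_minimal() + # better theme
  theme(legend.position = "none") # turn off legend
```

Printing `p_dist` should give a plot similar to the last one above, with each label and each large point drawn once per group rather than once per observation.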