Variables & Visualization

What Is The Story You Are Trying To Tell?

Andy Grogan-Kaylor

2024-10-31

Possibilities

possible visualizations

Background

Data Often Come From A Survey Questionnaire.

hypothetical questionnaire

What is Data?

A data set is nothing more than a series of rows and columns that contain responses to the questions of a survey. Typically, each row is one person and each column is one question.

Hypothetical Data
person  Q1  Q2  Q3
1       1   0   100
2       2   0   200
3       1   1   -9
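
To make the idea of rows and columns concrete, here is a minimal sketch of the hypothetical data in plain Python. The variable names are illustrative only, and the list-of-lists layout is just one of several reasonable ways to hold such a table.

```python
# The hypothetical data: one row per person, one column per question.
columns = ["person", "Q1", "Q2", "Q3"]
rows = [
    [1, 1, 0, 100],
    [2, 2, 0, 200],
    [3, 1, 1, -9],
]

# A row holds one person's answers to every question.
person_2 = dict(zip(columns, rows[1]))

# A column holds every person's answer to one question.
q1_index = columns.index("Q1")
q1_answers = [row[q1_index] for row in rows]

print(person_2)
print(q1_answers)
```

Reading across a row tells us about one respondent; reading down a column tells us about one variable.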

Some Notes on Data

Missing Data
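
The value -9 for Q3 in the hypothetical data looks like a placeholder for a missing answer rather than a real response. The slides do not say so; treating -9 as a missing-data code is an assumption made here. Under that assumption, a minimal Python sketch of why such codes should be recoded before summarizing:

```python
from statistics import mean

# Assumption: -9 marks "no answer". The source does not state this.
MISSING_CODE = -9

q3_raw = [100, 200, -9]  # the Q3 column from the hypothetical data

# Recode the placeholder to None so it cannot be mistaken for a real answer.
q3 = [None if value == MISSING_CODE else value for value in q3_raw]

# Summaries should use only the observed answers.
observed = [value for value in q3 if value is not None]

naive_mean = mean(q3_raw)       # wrongly treats -9 as a real answer
observed_mean = mean(observed)  # uses the two observed answers only

print(naive_mean, observed_mean)
```

Left in place, the placeholder pulls the average down, which would distort any chart built from it.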

What are Variables?

Variable Types

A Data Visualization Strategy

Once we have discerned the type of variable that we have, there are two follow-up questions we may ask before deciding upon a chart strategy:

More On Strategy
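
Judging only from the examples later in this presentation, the choice of chart depends on how many variables are shown and on whether each is continuous or categorical. A rough sketch of that lookup in Python follows. The function name `suggest_chart` is hypothetical, and the table summarizes only the charts these slides happen to use; it is not a complete rule.

```python
def suggest_chart(*variable_types):
    """Return the chart these slides use for the given variable types.

    Each argument is "continuous" or "categorical". This lookup is a
    hypothetical summary of the slides' examples, not a general rule.
    """
    charts = {
        ("continuous",): "histogram or dotplot",
        ("categorical",): "barchart of counts",
        ("categorical", "categorical"): "barchart of counts, split by the second variable",
        ("continuous", "continuous"): "scatterplot",
        ("categorical", "continuous"): "barchart of group means",
    }
    # Sort so that the order of the arguments does not matter.
    key = tuple(sorted(variable_types))
    return charts[key]


print(suggest_chart("continuous"))
print(suggest_chart("continuous", "categorical"))
```

Any combination the slides do not cover raises a `KeyError`, which is deliberate: the sketch does not guess beyond the examples shown.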

Simulated Data

This example uses simulated data on social work clients, of the kind that a social service agency might collect.

Simulated Data
age    mental_health  group    neighborhood
25.42  105.8          Group B  Neighborhood B
25.55  93.27          Group A  Neighborhood B
23.18  131.3          Group A  Neighborhood B
25.07  112.9          Group A  Neighborhood B
51.61  110            Group A  Neighborhood B
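
As a sketch, the five rows shown above can be held in plain Python and each column labeled as continuous or categorical. The helper `variable_type` is hypothetical and uses a deliberately simple rule (numbers are continuous, text is categorical). That rule is an assumption: numeric codes such as 1 and 2 in a questionnaire can also stand for categories.

```python
# The five rows of simulated data shown above, one dictionary per client.
clients = [
    {"age": 25.42, "mental_health": 105.8, "group": "Group B", "neighborhood": "Neighborhood B"},
    {"age": 25.55, "mental_health": 93.27, "group": "Group A", "neighborhood": "Neighborhood B"},
    {"age": 23.18, "mental_health": 131.3, "group": "Group A", "neighborhood": "Neighborhood B"},
    {"age": 25.07, "mental_health": 112.9, "group": "Group A", "neighborhood": "Neighborhood B"},
    {"age": 51.61, "mental_health": 110.0, "group": "Group A", "neighborhood": "Neighborhood B"},
]


def variable_type(values):
    """Simplified rule: all-numeric columns are continuous, others categorical."""
    if all(isinstance(value, (int, float)) for value in values):
        return "continuous"
    return "categorical"


types = {
    name: variable_type([row[name] for row in clients])
    for name in clients[0]
}

for name, kind in types.items():
    print(f"{name}: {kind}")
```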

Show One Thing At A Time

We start by visualizing one indicator at a time.

Continuous Variable

Sometimes the most interesting visualizations are those that give us a sense of the maximum, minimum, and average values of a variable. For example, the histogram and dotplot display information on age.
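
The quantities a histogram conveys can be sketched in plain Python using the five ages shown in the simulated data. The 10-year bin width is an arbitrary choice made for this illustration, and the text output is only a stand-in for a real chart.

```python
from statistics import mean

# Ages from the five rows of simulated data shown earlier.
ages = [25.42, 25.55, 23.18, 25.07, 51.61]

summary = {
    "min": min(ages),
    "max": max(ages),
    "mean": round(mean(ages), 2),
}
print(summary)

# A histogram counts how many values fall into each bin.
BIN_WIDTH = 10  # arbitrary choice for this sketch
bins = {}
for age in ages:
    start = int(age // BIN_WIDTH) * BIN_WIDTH
    bins[start] = bins.get(start, 0) + 1

# Draw one row of stars per bin, lowest bin first.
for start in sorted(bins):
    print(f"{start}-{start + BIN_WIDTH}: {'*' * bins[start]}")
```

Even in this crude form, the picture shows what the summary numbers alone can hide: four clients cluster in their twenties and one is much older.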

Histogram and Dotplot

Categorical Variable

When our data are grouped into categories, we would use a slightly different visualization, such as a barchart.
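
A barchart of a categorical variable rests on a simple count per category. A minimal Python sketch, using the group column from the five rows of simulated data shown earlier (the text bars are only a stand-in for a real chart):

```python
from collections import Counter

# The group column from the five rows of simulated data shown earlier.
groups = ["Group B", "Group A", "Group A", "Group A", "Group A"]

# Count how many observations fall into each category.
counts = Counter(groups)

# One bar per category; the bar length is the count.
for category in sorted(counts):
    print(f"{category} | {'#' * counts[category]}")
```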

Barchart

Show The Relationship Of Two Things

Our task becomes somewhat more complicated when we want to understand the relationship of one thing to another thing.

Categorical by Categorical

Here, for example, we visualize two categorical variables: neighborhood by group. In this graph, the height of each bar represents the count of observations.
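
The counts behind such a chart form a cross-tabulation: one count for each combination of the two variables. A minimal sketch in Python, using only the five rows of simulated data shown earlier (all of which happen to fall in Neighborhood B, so the full data set would produce more combinations):

```python
from collections import Counter

# (neighborhood, group) pairs from the five rows shown earlier.
pairs = [
    ("Neighborhood B", "Group B"),
    ("Neighborhood B", "Group A"),
    ("Neighborhood B", "Group A"),
    ("Neighborhood B", "Group A"),
    ("Neighborhood B", "Group A"),
]

# Count the observations in each combination of categories.
crosstab = Counter(pairs)

for (neighborhood, group), count in sorted(crosstab.items()):
    print(f"{neighborhood}, {group}: {count}")
```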

Barchart

Continuous by Continuous

Here, we visualize two continuous variables: mental health by age.
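
A scatterplot draws one point per observation, with one variable on each axis. As a sketch in Python, the points for the five rows of simulated data shown earlier can be assembled like this (placing age on the horizontal axis is an assumption about how the chart is laid out):

```python
# Age and mental health score for the five rows shown earlier.
ages = [25.42, 25.55, 23.18, 25.07, 51.61]
scores = [105.8, 93.27, 131.3, 112.9, 110.0]

# Each client becomes one (x, y) point: x is age, y is mental health.
points = sorted(zip(ages, scores))

for age, score in points:
    print(f"age = {age:5.2f}   mental_health = {score:6.2f}")
```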

Scatterplot

Continuous by Categorical

Last, we visualize a continuous variable by a categorical variable: mental health by group. In this graph, the height of each bar represents the mean mental health score.
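
The bar heights in such a chart are group means. A minimal Python sketch of that calculation, using only the five rows of simulated data shown earlier (so the means here will differ from those in a chart of the full data set):

```python
from collections import defaultdict
from statistics import mean

# (group, mental health score) for the five rows shown earlier.
observations = [
    ("Group B", 105.8),
    ("Group A", 93.27),
    ("Group A", 131.3),
    ("Group A", 112.9),
    ("Group A", 110.0),
]

# Collect the scores belonging to each group.
scores_by_group = defaultdict(list)
for group, score in observations:
    scores_by_group[group].append(score)

# The height of each bar is the mean score within that group.
group_means = {group: mean(scores) for group, scores in scores_by_group.items()}

for group in sorted(group_means):
    print(f"{group}: {group_means[group]:.2f}")
```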

Barchart

Show Where Something Is

Sometimes our task is different: we want to visualize information together with its spatial location, using a map.

Map

Credits

Graphics made with the ggplot2 graphing library created by Hadley Wickham.