Notes for November 6th/8th: Geom Raster

geom_tile() and geom_raster()

First, let’s load some libraries and some data.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()  masks stats::filter()
## ✖ purrr::is_null() masks testthat::is_null()
## ✖ dplyr::lag()     masks stats::lag()
## ✖ dplyr::matches() masks tidyr::matches(), testthat::matches()
library(socviz)
## 
## Attaching package: 'socviz'
## The following object is masked from 'package:kjhutils':
## 
##     %nin%
library(demog)

Here’s the birth rate data:

okboomer
## # A tibble: 1,644 x 12
##     year month n_days births total_pop births_pct births_pct_day date      
##    <dbl> <dbl>  <dbl>  <dbl>     <dbl>      <dbl>          <dbl> <date>    
##  1  1938     1     31  51820  41215000    0.00126           40.6 1938-01-01
##  2  1938     2     28  47421  41215000    0.00115           41.1 1938-02-01
##  3  1938     3     31  54887  41215000    0.00133           43.0 1938-03-01
##  4  1938     4     30  54623  41215000    0.00133           44.2 1938-04-01
##  5  1938     5     31  56853  41215000    0.00138           44.5 1938-05-01
##  6  1938     6     30  53145  41215000    0.00129           43.0 1938-06-01
##  7  1938     7     31  53214  41215000    0.00129           41.6 1938-07-01
##  8  1938     8     31  50444  41215000    0.00122           39.5 1938-08-01
##  9  1938     9     30  50545  41215000    0.00123           40.9 1938-09-01
## 10  1938    10     31  50079  41215000    0.00122           39.2 1938-10-01
## # … with 1,634 more rows, and 4 more variables: seasonal <dbl>, trend <dbl>,
## #   remainder <dbl>, country <chr>
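The two derived columns can be checked by hand from the rows printed above. The sketch below assumes that births_pct is births divided by total_pop, and that births_pct_day is that share divided by n_days and scaled to a rate per million. This is inferred from the printed values, not taken from the package source.

```r
# Values copied from the first two rows of the printed tibble.
births    <- c(51820, 47421)
total_pop <- c(41215000, 41215000)
n_days    <- c(31, 28)

# Assumed relationship: monthly births as a share of the population.
births_pct <- births / total_pop

# Assumed relationship: share per day, scaled to births per million people.
births_pct_day <- births_pct / n_days * 1e6

signif(births_pct, 3)     # 0.00126 0.00115
round(births_pct_day, 1)  # 40.6 41.1
```

Dividing by n_days matters because months differ in length: February has fewer births in total than January here, but a higher daily rate.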

We can make a time series or trend plot of this data with geom_line():

okboomer %>%
    filter(country == "United States") %>%
    ggplot(aes(x = date, y = births_pct_day)) +
    geom_line(size = 0.9) +
    labs(x = "Year",
         y = "Average daily births per million") 

We can also use geom_tile() to draw the same series as a grid, with one tile per month and the fill color showing the birth rate. To get the axis ordering right, we first convert years and months to ordered factors (categorical variables), like this:

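okboomer <- okboomer %>%
    # Sort before reversing so the levels run from latest to earliest year,
    # whatever order the rows are in.
    mutate(year_fct = factor(year,  
                             levels = rev(sort(unique(year))), 
                             ordered = TRUE),
           # Months keep calendar order; labels replace the numbers 1 to 12.
           month_fct = factor(month,
                              levels = 1:12,
                              labels = c("Jan", "Feb", "Mar", "Apr",
                                    "May", "Jun", "Jul", "Aug",
                                    "Sep", "Oct", "Nov", "Dec"),
                              ordered = TRUE)) %>%
    # Move the key columns to the front.
    select(year, month, 
           year_fct, month_fct, everything())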
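To see what the levels argument is doing, here is a minimal base R sketch on made-up vectors (the values are hypothetical, not taken from okboomer). It shows that the level order, not the order of the data, decides which category comes first.

```r
# Hypothetical toy years, deliberately out of order.
year <- c(1940, 1938, 1939, 1938, 1940)

# Sort, then reverse, so the levels run from latest to earliest.
year_fct <- factor(year, levels = rev(sort(unique(year))), ordered = TRUE)
levels(year_fct)      # "1940" "1939" "1938"

# The integer codes follow the level order: the first level is coded 1.
as.integer(year_fct)  # 1 3 2 3 1

# Month numbers mapped to labels; month.abb is base R's built-in
# vector of English month abbreviations.
month_fct <- factor(c(3, 1, 12), levels = 1:12,
                    labels = month.abb, ordered = TRUE)
as.character(month_fct)  # "Mar" "Jan" "Dec"
```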

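We choose the levels and their order with the final plot in mind. A discrete y axis draws its first level at the bottom, so reversing the year levels puts the earliest year at the top and time reads downward. Putting year on the y axis also makes the plot tall rather than wide.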

okboomer %>%
    filter(country == "United States") %>%
    ggplot(aes(x = month_fct, y = year_fct)) +
    geom_tile(mapping = aes(fill = births_pct_day), 
              color = "white") + 
    scale_x_discrete(position = "top") +              
    scale_y_discrete(breaks = seq(1940, 2010, 5)) +    
    scale_fill_viridis_c(option = "B") + 
    labs(x = NULL, y = NULL, fill = NULL, title = "Monthly Birth Rates",
         subtitle = "Average births per million people per day.",
         caption = "Data: US Census Bureau.")
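Because every tile here is the same size, geom_raster() should work as a drop-in alternative; ggplot2 documents it as a faster special case of geom_tile() for equally sized tiles. The sketch below uses a small made-up data frame (toy and its rate column are hypothetical, filled with random numbers) so that it runs without the demog package. Note that geom_raster() draws no tile borders, so the color = "white" setting is dropped.

```r
library(ggplot2)

# Hypothetical toy data: one row per month-year cell.
toy <- expand.grid(month = 1:12, year = 1938:1942)
toy$month_fct <- factor(toy$month, levels = 1:12, labels = month.abb)
toy$year_fct  <- factor(toy$year, levels = rev(sort(unique(toy$year))))

# Random placeholder values standing in for a birth rate.
set.seed(1)
toy$rate <- runif(nrow(toy), min = 35, max = 45)

p <- ggplot(toy, aes(x = month_fct, y = year_fct, fill = rate)) +
    geom_raster() +
    scale_x_discrete(position = "top") +
    scale_fill_viridis_c(option = "B") +
    labs(x = NULL, y = NULL, fill = NULL)
p
```

With the real data, the same swap would mean replacing the geom_tile() call in the plot above with geom_raster(mapping = aes(fill = births_pct_day)).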