Problem set 4: California Vaccination Exemptions

Due by 6:00 PM on Friday, November 1, 2019

In this Problem Set, we’ll be looking at some data on California vaccination exemptions.

Until recently, in the state of California it was possible to obtain a “Personal Belief Exemption” (PBE) to avoid the requirement of vaccinating your child before they began school. The dataset you’ll examine in this problem set records exemption rates among kindergarten classes in California schools in 2015.

The data is available as an R package. To install it, do the following.

If you haven’t already, install and load drat:

  1. Install the drat package with install.packages("drat")
  2. Load it with library(drat)
  3. Add the repository where the data is: drat::addRepo("kjhealy")

You can now install cavax with

  1. install.packages("cavax")
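
Taken together, the installation steps above amount to something like the following, run once at the R console rather than inside your Rmd file. Every call here comes from the steps listed above; the only thing to watch is that package names passed to install.packages() are quoted strings.

```r
# One-time setup: run at the console, not in the Rmd file you will knit.
install.packages("drat")     # drat lets R install from extra repositories
library(drat)
drat::addRepo("kjhealy")     # add the repository that hosts the data package
install.packages("cavax")    # now install the data package itself
```

After this, library(cavax) should work in any new R session without repeating the installation.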

Create a project for the assignment, as before

Open the project in RStudio and make an Rmd file for the analysis called something like vax.Rmd

Load the required libraries

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()  masks stats::filter()
## ✖ purrr::is_null() masks testthat::is_null()
## ✖ dplyr::lag()     masks stats::lag()
## ✖ dplyr::matches() masks tidyr::matches(), testthat::matches()
library(socviz)
## 
## Attaching package: 'socviz'
## The following object is masked from 'package:kjhutils':
## 
##     %nin%
library(ggbeeswarm)
library(cavax)

Take a look at the data

cavax
## # A tibble: 7,032 x 13
##      code county name  type  district city  enrollment pbe_pct exempt med_exempt
##     <dbl> <chr>  <chr> <chr> <chr>    <chr>      <dbl>   <dbl>  <dbl>      <dbl>
##  1 1.10e5 ALAME… FAME… PUBL… ALAMEDA… NEWA…        109      13  12.8        0   
##  2 6.00e6 ALAME… COX … PUBL… ALAMEDA… OAKL…        115       1   0.87       0.87
##  3 6.00e6 ALAME… LAZE… PUBL… ALAMEDA… OAKL…         40       0   0          0   
##  4 1.24e5 ALAME… YU M… PUBL… ALAMEDA… OAKL…         52      10   9.62       0   
##  5 6.10e6 ALAME… AMEL… PUBL… ALAMEDA… ALAM…        128       2   1.56       0   
##  6 6.11e6 ALAME… BAY … PUBL… ALAMEDA… ALAM…         70       1   1.43       0   
##  7 6.09e6 ALAME… DONA… PUBL… ALAMEDA… ALAM…        100       3   3          0   
##  8 6.09e6 ALAME… EDIS… PUBL… ALAMEDA… ALAM…         70       1   1.43       0   
##  9 6.09e6 ALAME… FRAN… PUBL… ALAMEDA… ALAM…         95       1   1.05       1.05
## 10 6.09e6 ALAME… FRAN… PUBL… ALAMEDA… ALAM…         50       2   2          0   
## # … with 7,022 more rows, and 3 more variables: rel_exempt <dbl>, mwc <fct>,
## #   kind <fct>

You can get a brief summary of each variable in the dataset by looking at the Help file in RStudio for the cavax package, or by looking at the documentation on the package homepage: http://kjhealy.github.io/cavax.
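
As a sketch of how you might get at that documentation and a column-by-column overview from the console: the block below assumes the dataset's help page is filed under the topic name cavax, which is a guess based on the object name and may not match the package exactly. help(package = ...) lists whatever topics the package actually provides.

```r
library(dplyr)
library(cavax)

# Index of every help topic in the package (works whatever the topics are called).
help(package = "cavax")

# Help page for the dataset; assumes the topic is named "cavax".
?cavax

# Compact overview: one line per column, with its type and first few values.
glimpse(cavax)
```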

Questions to answer

  1. What is the unit of observation in this dataset?
  2. What is the average size of kindergarten class enrollment in the state of California? What’s the median class size? What’s the range of variability?
  3. What percentage of kids have a PBE, on average?
  4. Explore the structure of variation in PBE exemptions. How does it vary by public and private schools, for instance? Or by county? Or school type? Draw graphs to illustrate the variation you find, and write a sentence or two describing what it looks like to you. Possibly useful geoms you might experiment with include geom_point(), geom_boxplot(), geom_density(), geom_beeswarm() and geom_quasirandom(). The latter two are from the ggbeeswarm package. Read the help for these geoms to see what it is they do.
  5. Can you find any particularly unusual-looking schools, school types, or counties, either with respect to their PBE rates, their size, or both? Why do you think they might be unusual?
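
To get started on the questions above, here is a minimal sketch of the two kinds of steps involved: summarizing a numeric column and plotting a distribution by group. It uses only column names visible in the printed tibble (enrollment, pbe_pct, type). The choice of columns, the na.rm = TRUE precaution, and the plot settings are illustrative assumptions, not the expected answer; you will need to decide for yourself which variables and geoms best answer each question.

```r
library(tidyverse)
library(ggbeeswarm)
library(cavax)

# Summary statistics for one numeric column (here, enrollment).
# na.rm = TRUE is a precaution in case some schools have missing values.
enroll_summary <- cavax %>%
  summarize(
    mean_enroll   = mean(enrollment, na.rm = TRUE),
    median_enroll = median(enrollment, na.rm = TRUE),
    min_enroll    = min(enrollment, na.rm = TRUE),
    max_enroll    = max(enrollment, na.rm = TRUE)
  )
enroll_summary

# One point per school, spread out sideways within each value of `type`
# so that overlapping points stay visible.
p <- ggplot(cavax, aes(x = type, y = pbe_pct)) +
  geom_quasirandom(alpha = 0.4, size = 0.5) +
  labs(x = "School type (type)", y = "PBE (pbe_pct)")
p
```

Swapping geom_quasirandom() for geom_boxplot() or geom_beeswarm(), or replacing type with another grouping variable, is a quick way to compare how the different geoms present the same variation.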

Finish

Knit the completed R Markdown file as a Word or PDF document (use the “Knit” button at the top of the script editor window). Save it with a name of the form lastname_firstname_ps04 and upload it to the Sakai dropbox.