Visualizing Social Data

Syllabus

Instructor
Kieran Healy, 268 Reuben-Cooke. Email at kieran.healy@duke.edu.

Course details
Meets Mondays and Wednesdays from 10:15am to 11:30am in Room 127, Reuben-Cooke.

Contacting me
If you have questions or want to discuss things that we’re doing in class, it’s best to use the course Slack. I strongly encourage you not to be shy about raising queries or asking for help in the #general channel. For other matters (such as absences or setting up a meeting), email me.

About this Course

This class will do two things. First, it will teach you how to use modern, widely used tools to create insightful, beautiful, reproducible visualizations of social science data. Second, it will introduce you to the theory and practice of efforts to visualize sociological data, and society more generally. We will think about different ways of looking at social science data, about where data comes from in the first place, and about the implications of choosing to represent it in different ways.

By the end of the course you will

  • Understand the basic principles behind effective data visualization.
  • Have a practical sense of why some graphs and figures work well, while others fail to inform or actively mislead.
  • Know how to create a wide range of plots in R using ggplot2.
  • Know how to refine plots for effective presentation.
  • Have an understanding of issues surrounding the collection and representation of data in the social sciences and beyond.

Core Texts

I recommend (but do not require) you buy two books:

  • Kieran Healy, Data Visualization: A Practical Introduction (Princeton: Princeton University Press, 2019), http://socviz.co/. A draft version is freely available online. The print version can be purchased at Amazon and other bookshops.
  • Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, California: O’Reilly Media, 2017), http://r4ds.had.co.nz/. A draft version is freely available online. The print version can be purchased at Amazon and other bookshops.

Software

We will do all of our visualization work in this class using R, and we will use RStudio to manage our code and projects. R is a freely available programming language that is designed for statistical computing and widely used across the natural and social sciences, as well as in the rapidly growing world of “data science” generally. RStudio is an integrated development environment, or IDE, for R, a kind of control center from which you can manage the engine-room of R itself. It is also freely available.

If you haven’t used these tools before, don’t worry. The course does not presuppose any familiarity with them. We will get up and running with them during the first week.

Schedule

Date                   Topic
Jan. 5th               Orientation
Jan. 10th/12th         Up and running with R
Jan. 17th/19th         Ways of Seeing
Jan. 24th/26th         Vision, Data, and Design
Jan. 31st/Feb. 2nd     Showing the Right Numbers
Feb. 7th/9th           Midterm projects
Feb. 14th/16th         What is data?
Feb. 21st/23rd         Pandemic data
Feb. 28th/Mar. 2nd     Messy and Missing Data
Mar. 7th/9th           (Spring Break)
Mar. 14th/16th         Maps
Mar. 21st/23rd         Demography
Mar. 28th/30th         Networks
Apr. 4th/6th           Social Theory and Social Data
Apr. 11th/13th         Final Projects
Apr. 18th/20th         Catch-up

The schedule is likely to change as we go. Links to readings, lecture notes, assignments, and other materials from class will be posted via the Class page.

COVID-19 Considerations

At present, all classes are fully remote until January 18th. I will email you details about the Zoom meetings.

Course policies

  • Attendance is required. I am a reasonable person; if you need to be absent please let me know in advance insofar as that is possible.
  • Do the assigned readings in advance of class.
  • Submit memos, problem sets, or other assigned work when they are due.

Course Slack

During the first week you will receive an invitation to the Slack workgroup for the course. This will be the most convenient way to share information with everyone, to distribute links to readings and code, and to contact me.

Required Work and Grading

Four kinds of work are required: memos, problem sets, a midterm project, and a final project.

  • Reflection Memos are 250 to 500 words long and respond to the assigned reading. Memos are due Sundays by 6:00pm.
  • Problem Sets let you practice your visualization skills. Problem sets are due on Tuesdays by 6:00pm.
  • A Midterm Project is due Friday, February 11th.
  • A Final Project is due Monday, April 18th.

Reflection memos and problem sets may not be assigned every week.

There is no final exam for the class.

Grading: Memos 20% / Problem Sets 20% / Midterm Project 30% / Final Project 30%.

Duke Community Standard

Like all classes at the university, this course is conducted under the Duke Community Standard. Duke University is a community dedicated to scholarship, leadership, and service and to the principles of honesty, fairness, respect, and accountability. Citizens of this community commit to reflect upon and uphold these principles in all academic and nonacademic endeavors, and to protect and promote a culture of integrity. To uphold the Duke Community Standard you will not lie, cheat, or steal in academic endeavors; you will conduct yourself honorably in all your endeavors; and you will act if the Standard is compromised.