Syllabus

Instructor

Dates and Location

  •   January 11th–April 26th, 2023
  •   Wed/Fri
  •   10:15am-11:30am
  •   Perkins LINK 088 (Classroom 4)

About this course

This will teach you how to use modern, widely-used tools to create insightful, beautiful, reproducible visualizations of social science data. You will also learn about the theory and practice of efforts to visualize social-scientific data, and society more generally. We will think about different ways of looking at data, about where social science data comes from in the first place, and about the implications of choosing to represent it in different ways.

By the end of the course you will

  • Understand the basic principles behind effective data visualization.
  • Have a practical sense for why some graphs and figures work well, while others may fail to inform or actively mislead.
  • Know how to create a wide range of plots in R using ggplot2.
  • Know a fair amount about how to use R for things other than data visualization.
  • Know how to refine plots for effective presentation.
  • Have an understanding of issues surrounding the collection and representation of data in the social sciences and beyond.

Core texts

I recommend (but do not require you buy) three books. Draft versions of all of them are available for free online.

  • Kieran Healy, Data Visualization: A Practical Introduction (Princeton: Princeton University Press, 2019), http://socviz.co/. The print version can be purchased at Amazon and other bookshops.
  • Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, California: O’Reilly Media, 2017), http://r4ds.had.co.nz/. The print version can be purchased at Amazon and other bookshops.
  • Claus E. Wilke, Fundamentals of Data Visualization (Sebastopol, California: O’Reilly Media, 2019), https://serialmentor.com/dataviz/. The print version can be purchased at Amazon and other bookshops.

Software

We will do all of our visualization work in this class using R and use RStudio to manage our code and projects. R is a freely-available programming language that is designed for statistical computing and widely used across the natural and social sciences, as well as in the rapidly-growing world of “data science” generally. RStudio is an integrated development environment, or IDE, for R, a kind of control center from which you can manage the engine-room of R itself. It is also freely available.

If you haven’t used these tools before, don’t worry. The course does not presuppose any familiarity with them. We will get up and running with them during the first week.

Schedule

The weekly schedule can be viewed on its own page. It is likely to change as we go. Links to readings, lecture notes, assignments, and other materials from class will be posted via this website.

Course policies

  • Attendance is required, and important. I am a reasonable person; if you need to be absent please let me know in advance insofar as that is possible.
  • Do the assigned readings in advance of class.
  • Submit problem sets, or other assignments, on time.

Required work and grading

Three kinds of work are required: problem sets and class participation, a midterm project, and a final project.

  • Weekly Class Participation and Problem Sets will let you reflect on any reading and practice your coding and visualization skills. Problem sets are due on Tuesdays by 6:00pm.
  • A Midterm Project.
  • A Final Project.

There is no final exam.

Grade components: Problem Sets and Class Participation: 50% / Midterm Project 20% / Final Project 30%.

How you should approach this course

The material covered in the course has a lot of continuity and it is cumulative. You will be learning a set of practical skills. This means that techniques we learn early on will be necessary for understanding things that come later. It also means that regular practice will help you a lot. So, this is not a “Topic of the week” course where you can tune out for a few weeks while expecting to be able to easily drop back in later. The material we cover each week will not be overwhelming, though. If you participate during class and keep up with the weekly assignments you’ll be in a very strong position to do well in the class.

Duke community standard

Like all classes at the university, this course is conducted under the Duke Community Standard. Duke University is a community dedicated to scholarship, leadership, and service and to the principles of honesty, fairness, respect, and accountability. Citizens of this community commit to reflect upon and uphold these principles in all academic and nonacademic endeavors, and to protect and promote a culture of integrity. To uphold the Duke Community Standard you will not lie, cheat, or steal in academic endeavors; you will conduct yourself honorably in all your endeavors; and you will act if the Standard is compromised.