Syllabus

Course Overview

Questions of cause and effect are central to the study of political science and to the social sciences more broadly. But making inferences about causation from empirical data is a significant challenge. Critically, there is no simple, assumption-free process for learning about a causal relationship from the data alone. Causal inference requires researchers to make assumptions about the underlying data generating process in order to identify and estimate causal effects. The goal of this course is to provide students with a structured statistical framework for articulating the assumptions behind causal research designs and for estimating effects using quantitative data.

The course begins by introducing the counterfactual framework of causal inference as a way of defining causal quantities of interest such as the “average treatment effect.” It then proceeds to illustrate a variety of different designs for identifying and estimating these quantities. We will start with the most basic experimental designs and progress to more complex experimental and observational methods. For each approach, we will discuss the necessary assumptions that a researcher needs to make about the process that generated the data, how to assess whether these assumptions are reasonable, how to interpret the quantity being estimated and ultimately how to conduct the analysis.

This course will involve a combination of lectures and problem sets. Problem sets will contain a mixture of theoretical and applied questions and serve to reinforce key concepts and allow students to assess their progress and understanding throughout the course. Assignments will involve analysis of data using the R programming language, a free and open-source language for statistical computing that is used extensively for data analysis in many fields. Prior experience with the fundamentals of programming is required.

Prerequisites

This course is the second in the political science graduate methodology sequence. Completing the introductory course prior to this one should prepare you for the material in this class. We will rely on some background knowledge of core concepts in probability, statistics, and inference, as well as experience with statistical programming in R. However, there are no strict, specific course prerequisites, as many different disciplines and departments offer introductory statistics classes that cover the relevant material.

In general, you should have had some introduction to probability theory and should be familiar with concepts like the properties of random variables (especially expectation and variance), estimands and estimators, and statistical inference. Familiarity with linear regression is also a plus, but we will be reviewing it during the relevant week.

Please contact the instructor if you are interested in enrolling but are unsure of the requirements.

Logistics

  • Lectures: Mondays/Wednesdays from 9:30am-10:45am

You should attend lectures regularly as they comprise a significant element of the course instruction. Lecture materials will be posted on the course website.

  • Discussion Forum: We will be using Ed as our primary course discussion platform. If you are enrolled in the class on Canvas, you should already have access to the Ed board for this class. Please use this to post questions about the readings/lecture material as well as about the problem sets.

  • Course Materials: Lecture materials, problem sets and tutorial code will be posted on the course website. Problem set solutions will be posted after the due date on Canvas. Links to readings can be found on the course website organized by week.

Textbooks

The course will involve readings from a variety of textbooks and published papers. The class does not require the purchase of any single, specific text, and all excerpts from textbooks are available online (either directly or through library resources). However, you may wish to obtain some of these texts as personal references, and they may be valuable to you in the future.

Grading

Students’ final grades are based on four components:

Problem Sets (25%)

Students will complete a total of five problem sets throughout the semester. Each problem set will cover roughly a two-week period of course material. A complete schedule of the assignments can be found on the Assignments page.

The goal of the problem sets is to encourage exploration of the material and to provide you with a clear and credible means of assessing your understanding and progress through the course. As such, problem sets are designed to be challenging and we expect students to find some questions difficult.

Problem sets will be graded on a (+/✓/-) scale:

  • + : Complete and near-perfect work
  • ✓ : Generally good work with clear effort shown but with notable errors
  • - : Significantly incomplete work with major conceptual errors and little effort shown

Collaboration Policy

We strongly encourage collaboration between students on the problem sets and highly recommend that students discuss problems with each other either in person or via the discussion forum. However, each student is expected to submit their own write-up of the answers and any relevant code.

Office Hours and Online Discussion

Students should feel free to discuss any questions about the problem sets with the teaching staff during sections and office hours. We also strongly encourage students to post questions about both the problem sets and the assigned readings on the course discussion board and respond to other students’ questions. Responding to other students’ questions will contribute to your participation grade.

Submission Guidelines

Problem sets will be distributed as HTML and Quarto (.qmd) files. You should submit your answers and any relevant R code in the same format: the Quarto source file (.qmd) along with the corresponding rendered .html file. You will submit your problem sets via Gradescope.
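For reference, a minimal submission might be structured like the sketch below (the file name, question text, and chunk contents are purely illustrative). Rendering the file, either with the Render button in RStudio or with `quarto render pset1.qmd` at the command line, produces the matching .html file to upload alongside the source.

````markdown
---
title: "Problem Set 1"
author: "Your Name"
format: html
---

## Question 1

Your written answer goes here.

```{r}
# Illustrative R code for Question 1 (uses a built-in dataset)
library(tidyverse)

mtcars |>
  group_by(cyl) |>
  summarize(mean_mpg = mean(mpg))
```
````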

In-person Midterm (30%)

We will have an in-person midterm examination on Wednesday, March 4th, 2026. The exam will cover the material in the first half of the course (experiments and selection-on-observables). It will take the form of a standard pen-and-paper timed examination involving both theory and practical analysis of sample code and results.

In-person Final (35%)

The final exam will take place in person during exams week. We have been assigned an exam date of May 4, 2026 at 5:05PM - 7:05PM. Room information will be made available nearer to the end of the semester. The exam will take the same form as the midterm, but will be slightly longer given the additional time available during the exam period.

Participation (10%)

We expect students to take an active role in learning in both lecture and section. Engagement with the teaching staff by asking and answering questions will contribute to this grade as will interaction on the discussion board.

Computing

This course will use the R programming language. This is a free and open source programming language that is available for nearly all computing platforms. You should download and install it from https://www.r-project.org.

Unless you have strong preferences for a specific coding environment, we recommend that you use the free RStudio Desktop Integrated Development Environment (IDE) which you can download from https://rstudio.com/products/rstudio/download/#download.

In addition to base R, we will frequently use the data management and processing tools found in the tidyverse set of packages, along with basic graphics and visualization using ggplot2.
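As a quick sanity check after installing R and RStudio, something like the following sketch should run without errors (the installation step only needs to be done once):

```r
# Install the course packages (run once; tidyverse includes ggplot2, dplyr, readr, etc.)
install.packages("tidyverse")

# Load the packages and draw a simple scatterplot using a built-in dataset
library(tidyverse)

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```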

See the Resources page for additional information.

Policy on Generative Large Language Models

Large Language Models (LLMs) continue to have an immense impact on the educational field. Over the last several years, we have seen striking growth in the capabilities of these models and it is clear that they will remain a permanent and inextricable presence in all of our lives. This very course website has been built from my earlier .tex syllabus with considerable assistance from Claude’s Opus 4.5 model. Having taught a causal inference course every year since 2020, I have seen first-hand how these tools are rapidly reshaping how students engage with the material - for better and for worse.

The clearest setting where LLM tools have been amplifying user productivity is in programming. For me, the most positive case for LLMs is that they are a kind of “universal interface” in natural language that makes it easier for users to get a computer to do what they want simply by articulating the task in English. Cutting-edge LLMs implement some form of “reasoning” (essentially, having the model generate intermediate tokens into its context window to improve the quality of the final predicted tokens) and models are able to call various software tools as part of the generation process. Modern LLMs are increasingly “agentic” in that they replicate the sort of think-decide-act loop that we would consider to be the role of a typical knowledge worker. In essence, every graduate student can now have their own team of RAs for a comparatively low monthly payment to Anthropic.

Unfortunately, the effectiveness of LLMs remains highly variable across different task domains. My own view is that coding (especially coding for software development) is an optimal use case for LLMs because it is an engineering task. By “engineering,” I mean the iterative task of solving a problem by brainstorming potential solutions, implementing them, and then evaluating them against some clearly defined criteria; the goal of engineering is the building of systems. The key components here are the existence of a well-defined problem and the ability to assess whether the proposed solutions are effective. Indeed, modern agentic LLMs are good in large part because they incorporate a lot of this evaluation by way of the “reasoning” structure and have the ability to write unit tests to check the code. Moreover, software development is amenable to this “try-and-check” sort of reasoning because the sub-tasks typically involve some well-defined, structured output that can be evaluated against what the user knows they need or want.

Problems begin to accumulate when there is no longer an agent (human or LLM) that can diagnose errors and adjust. Unfortunately, this is the sort of thing that experienced users are better at than less experienced users. Without an understanding of what the LLM is actually doing, you are much less likely to be able to figure out when it has gone wrong. And the fail cases for LLMs are, in my experience, a lot weirder than the usual fail cases for humans. So my most negative assessment of LLMs is that they have severely adversely affected the process of education. While they may supercharge the productivity of those who already have the relevant understanding, they are much less promising in training novices and provide only the illusion of capability.

I am increasingly opposed to the use of LLMs for programming when one is learning to code. In practice, I find that students delegate far too much to the model and spend insufficient time understanding the mechanics of what the code is doing. This makes it quite difficult to meaningfully debug outputs and diagnose errors. Additionally, implementing statistical methods in code is a common way for students to understand conceptually what the methods do and how they work. As such, I would still discourage LLM assistance when completing problem sets, although I do not strictly rule it out, especially for the more tedious components where you are confident that you understand what the code is doing (e.g. data cleaning).

Additionally, the sort of coding that we do as scientists is typically not software development, so I actually don’t think these tools are (yet) good for writing research code. The task of science I think is distinct from engineering in that we do not have a “try-check-fail-repeat” evaluation loop where the “correctness” of the method and implementation can be assessed from the “correctness” of the output. In fact, we would probably characterize scientists that do this sort of thing as engaging in “questionable research practices” if “correctness” is judged by the “number of publications” objective function. We are not building, we are discovering. A core theme of this class is the importance of outside information and deep knowledge in assessing the feasibility of the assumptions behind a particular research design. The quality of research is determined not by its outputs but rather by its inputs. These are not assumptions that an LLM (for the most part) will provide to you unprompted. An LLM agent can effectively replicate an existing analysis, but it is unlikely to tell you that this was the wrong analysis to begin with. I like to think of LLM agents as amateur improvisers - they can “Yes, and…” you very well, but are much less capable of the “No, but…”

One other common use of LLMs that I have also seen - and perhaps this is more common than direct assistance on assignments - is the use of them as “personal tutors.” The chat interfaces let students ask questions, summarize complex texts, search for other relevant information, and brainstorm (see the above point re: “amateur improviser”) without the need for awkward conversations with other human beings. While I think this is fine for simple questions, especially when the LLM just replaces web search, I would not use LLMs as a complete substitute for your colleagues and for the teaching staff. One of the unfortunate consequences of LLM-proliferation is that students don’t post on discussion boards as often as they used to - even in graduate classes. I feel that this is ultimately detrimental to the sort of community-building and professionalization that this class is designed for. Additionally, one of the benefits of asking the teaching staff is that they are familiar enough with the topic and the context that they can infer a lot of what is unstated or implied by your question and better tailor the response. We will provide additional context unprompted.

With regard to web search, I would have strongly advised against this a few years ago, but the incorporation of retrieval-augmented generation in modern models has reduced the prevalence of entirely made-up sources, though it has not eliminated them. Since modern web search runs on basically the same underlying vector-search architecture, there is little difference between using an LLM and just Googling something. I would still encourage going to the retrieved sources directly, though, and avoiding the generated summaries, since those seem to use cheaper, lower-quality models. For summaries, I think tools explicitly built for interacting with a text (e.g. Google’s NotebookLM) are potentially more useful, though I have not spent much time exploring them. Even then, I would still encourage the “classical” approach of searching past (and, with Google Scholar, future) citations. Even if LLMs generated perfect summaries, I think there is an inherent benefit to the “old ways” when it comes to professionalization and developing a better understanding of the authors in a given literature. Unlike the rest of the internet, academia is still a link(citation)-based culture, and it is worth leveraging that researcher-provided context to guide your reading. Additionally, these authors are real human beings whom you will meet at conferences - it’s worth understanding who they are conversing with in order to better understand the discipline.

Lastly, any LLM policy needs to consider its feasibility. It is clear to me that any restrictions on LLM use, aside from in-class pen-and-paper examinations, are fundamentally unenforceable. Therefore, with respect to the problem sets, students are permitted to use LLMs in whatever capacity they see fit. I have attempted to design the problem sets such that they contain “out-of-distribution” challenges (e.g. a replication of an existing paper that concludes contrary to the original result) and other general “traps” that try to evaluate deep substantive knowledge of the problem. Over the last two years, I have found on unproctored assessments (e.g. take-home exams) that LLM-using students produce mediocre but not completely terrible results. Nevertheless, they do make mistakes (and often behave in ways that could be described as “not wrong, just strange”), and it is clear to me which students use them to their detriment. Perhaps this will change in the next year or two - such is the nature of this field. Indeed, my decision to move entirely to in-person assessment was driven by the observation that although take-home exams still provided some variation among students, that variation was dramatically lower than on the in-class exams. So in short: you may use LLMs on the problem sets as you see fit, but I discourage over-reliance on them, and all exams will be in-person, pen-and-paper assessments.

Accommodations and Accessibility

The University of Wisconsin–Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW–Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities are a shared faculty and student responsibility.

Students are expected to inform faculty of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized.

I will work either directly with you or in coordination with the McBurney Disability Resource Center to identify and provide reasonable instructional accommodations. Once you are approved for accommodations by the McBurney Center, please be sure to make the relevant selections in McBurney Connect. When I have received your Student Accommodation Letter, I will send you a follow-up e-mail to connect and discuss how the accommodations will be implemented for this course. Disability information, including instructional accommodations as part of a student’s educational record, is confidential and protected under FERPA.

Acknowledgments

This course is indebted to the many wonderful and generous scholars who have developed causal inference curricula in political science departments throughout the world and who have made their course materials available to the public. In particular, I thank Matthew Blackwell, Brandon Stewart, Molly Roberts, Kosuke Imai, Teppei Yamamoto, Jens Hainmueller, Adam Glynn, Gary King, Justin Grimmer, and Edward Kennedy whose lecture notes and syllabi have been immensely valuable in the creation of this course. Special thanks to Molly Offer-Westort, Andy Eggers and Bobby Gulotty who helped in the development of this course in its earlier incarnation as PLSC 30600 at the University of Chicago. I also thank the previous teaching assistants of this course: Arthur Yu, Oscar Cuadros, Zikai Li, and Cindy Wang.

Lastly, thanks to Andrew Heiss and Matt Blackwell for their Quarto website templates, which I have extensively borrowed from in designing this course site.