Week 14, Part 2: Interference and Shift-Share Designs

PS 813 - Causal Inference

Anton Strezhnev

University of Wisconsin-Madison

April 29, 2026

\[ \require{cancel} \]

Today

  • What happens when SUTVA breaks!?
    • Units’ potential outcomes depend on other units’ assigned treatments
  • When do we see this?
    • Experiments on social networks
    • Experiments involving differential exposure to common shocks
  • Key idea: Exposure mapping
    • How do different sorts of treatment assignments translate into particular exposures at the unit level?
    • Estimation/inference possible under known exposure mapping.
    • But be careful! Random experiment \(\leadsto\) non-random exposures - need to adjust!
  • What if I don’t know the exposure mapping?
    • Analyzing as a regular experiment gives an “expected” ATE (marginalizing over the spillovers) (Sävje, Aronow, Hudgens, 2017)
    • But standard errors are definitely wrong!

Setup and notation

  • \(N\) units indexed by \(i = 1, \ldots, N\). Denote the set of units as \(U\).
  • Observed outcome \(Y_i\)
  • We have a randomized experiment that assigns a vector of treatments
    • The full assignment vector \(\mathbf{z} = (z_1, \ldots, z_N)'\) is random with support \(\Omega\)
    • \(\Pr(\mathbf{z} = z) = p_z\) is known
  • Potential outcomes \(y_i(\mathbf{z})\) - what would we observe for unit \(i\) under the complete set of assignments across all units \(\mathbf{z}\)
    • We can write the potential outcomes as: \(y_i(z_i; z_{-i})\)
    • \(z_i\) is the treatment assigned to unit \(i\), \(z_{-i}\) are the treatments assigned to all other units.
  • SUTVA assumes that \(y_i(\mathbf{z}) = y_{i}(z_i)\). Only \(i\)’s status matters for \(i\)’s potential outcomes.
  • But under arbitrary interference \(y_i(z_i; z_{-i}) \neq y_i(z_i; z_{-i}')\)
    • With a binary treatment, \(2^N\) potential outcomes for each unit - without some structure we’re hopeless.

Exposure mappings

  • Aronow and Samii (2017) consider a setting where we know an exposure mapping \(f: \Omega \times \Theta \to \Delta\)
    • We collapse the vector of assignments \(\mathbf{Z}\) and unit \(i\)’s characteristics \(\theta_i\) to its exposure \(D_i\)

      \[D_i = f(\mathbf{Z}, \theta_i)\]

  • Examples:
    • We randomize a treatment on a social network. Exposure can be defined as whether a “close friend” received treatment (Jones et al., 2017)
    • We randomize a level of vaccination in a region. Exposure can be defined as the share of residents in a unit’s region who are treated (Hudgens and Halloran, 2008)
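To make the idea concrete, here is a minimal sketch of a "number of treated friends" exposure mapping on a made-up social network - the adjacency matrix, treatment probability, and unit count are all hypothetical, not from any of the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 6-unit friendship network (symmetric adjacency matrix).
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
])

# Randomized assignment vector z (Bernoulli(.5) for each unit).
z = rng.binomial(1, 0.5, size=6)

def exposure(z, A):
    """Exposure mapping D_i = f(z, theta_i): here theta_i is unit i's
    row of the adjacency matrix, and the exposure is the pair
    (own treatment, number of treated friends)."""
    n_treated_friends = A @ z
    return list(zip(z, n_treated_friends))

D = exposure(z, A)
```

Note that \(D_i\) depends on the whole assignment vector, but only through unit \(i\)'s own assignment and its friends' assignments.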

Exposure mappings

  • Our new “consistency” assumption under a known exposure mapping

    \[Y_i = \sum_{k=1}^K I(D_i = d_k) y_i(d_k)\]

  • Essentially, once we know the value of the exposure, all assignments that generate that exposure are equivalent

    • If the exposure mapping is the number of treated friends, then which particular friends are treated doesn’t matter.

Causal estimands

  • We define causal estimands as differences in average potential outcomes across exposure levels

    \[\mu(d_k) = \frac{1}{N}\sum_{i=1}^N y_i(d_k), \qquad \tau(d_k, d_l) = \mu(d_k) - \mu(d_l)\]

  • Examples

    • What is the average effect of having 5 of your friends exposed to treatment versus zero fixing your own treatment status?
    • What is the average effect of exposing you to treatment holding fixed how many of your friends are treated?
  • Problem:

    • Even if \(\mathbf{z}\) is completely randomized, exposures are non-random
    • This is because unit-level characteristics \(\theta_i\) enter into the exposure mapping.
    • People who have a lot of friends are more likely to have more friends treated

IPW estimation

  • Aronow and Samii (2017) suggest that we can adjust for the probability of observing a given exposure via IPW

  • We need to know the propensity of exposure

    \[\pi_i(d_k) = \Pr(D_i = d_k)\]

  • If the design is known, this can be calculated directly

    • e.g. given \(8\) close friends and a \(.5\) probability of treatment, the probability that at least \(4\) are treated is \(163/256 \approx .6367\)
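The friend-count example can be computed directly from the binomial distribution (here the exposure is "at least 4 of 8 independently-treated close friends are treated"):

```python
from math import comb

# Probability that at least 4 of 8 close friends are treated,
# when each friend is treated independently with probability .5.
p = 0.5
pi = sum(comb(8, k) * p**k * (1 - p) ** (8 - k) for k in range(4, 9))

print(pi)  # 163/256 = 0.63671875
```

Under more complex designs (e.g. clustered assignment), these probabilities may not have a closed form, but they can be approximated by simulating from the known assignment distribution.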

IPW estimation

  • Can apply our usual IPW estimators using these known exposure probabilities

  • Our Horvitz-Thompson estimator of the total \(N\mu(d_k)\) is just

    \[\widehat{y^T_{HT}}(d_k) = \sum_{i=1}^N I(D_i = d_k)\, \dfrac{Y_i}{\pi_i(d_k)}\]

  • Can also use the Hajek estimator (normalize by the sum of the weights)

  • Need a positivity assumption on the exposures

    • It must be possible for all units to receive exposure \(d_k\): \(\pi_i(d_k) > 0\) for all \(i\)
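A minimal sketch of both estimators, assuming the exposure probabilities \(\pi_i(d_k)\) have already been computed from the design (the HT version divides the total by \(N\) to target the mean \(\mu(d_k)\)):

```python
import numpy as np

def ht_mean(Y, D, pi_k, d_k):
    """Horvitz-Thompson estimator of mu(d_k): inverse-probability-weighted
    sum over units observed at exposure d_k, divided by N."""
    ind = (D == d_k).astype(float)
    return np.sum(ind * Y / pi_k) / len(Y)

def hajek_mean(Y, D, pi_k, d_k):
    """Hajek estimator: normalize the inverse-probability weights
    by their sum instead of by N."""
    ind = (D == d_k).astype(float)
    w = ind / pi_k
    return np.sum(w * Y) / np.sum(w)

# An estimate of tau(d_k, d_l) is then a difference of two such means, e.g.
# ht_mean(Y, D, pi_k, d_k) - ht_mean(Y, D, pi_l, d_l).
```

The Hajek version is often preferred in practice because normalizing the weights tends to reduce variance when the \(\pi_i(d_k)\) vary a lot across units.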

Variance Estimation

  • Aronow and Samii (2017) propose a conservative variance estimator (conservative for similar reasons to the Neyman estimator)

    \[\begin{align*}\widehat{\text{Var}}\big[\widehat{y^T_{HT}}(d_k)\big] =& \sum_{i \in U} I(D_i = d_k)\, \big[1 - \pi_i(d_k)\big]\, \bigg[\frac{Y_i}{\pi_i(d_k)}\bigg]^2\; \\ &+\; \sum_{i \in U}\sum_{j \in U \setminus i} I(D_i = d_k)\, I(D_j = d_k)\, \frac{\pi_{ij}(d_k) - \pi_i(d_k)\pi_j(d_k)}{\pi_{ij}(d_k)}\, \frac{Y_i\, Y_j}{\pi_i(d_k)\pi_j(d_k)}\end{align*}\]

  • Note the “off-diagonal” term - we need to account for the fact that the joint propensity of receiving exposure \(d_k\) might not be the same as the product of the marginals

    • Also, note that when \(\pi_{ij}(d_k) = 0\) for some pairs, we get bias (though Aronow and Samii show how to correct this)
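A direct transcription of the displayed variance formula, assuming the marginal probabilities \(\pi_i(d_k)\) and joint probabilities \(\pi_{ij}(d_k)\) are known from the design and all joint probabilities are positive (the bias correction for \(\pi_{ij}(d_k) = 0\) pairs is not implemented here):

```python
import numpy as np

def ht_var(Y, D, d_k, pi, pi_joint):
    """Conservative variance estimator for the HT total under exposure d_k,
    in the spirit of Aronow & Samii (2017).
    pi[i] = Pr(D_i = d_k); pi_joint[i, j] = Pr(D_i = d_k, D_j = d_k),
    assumed positive for all pairs."""
    ind = (D == d_k).astype(float)
    # "Diagonal" term: sum over units observed at exposure d_k
    var = np.sum(ind * (1 - pi) * (Y / pi) ** 2)
    # Off-diagonal term: joint exposure probability vs. product of marginals
    N = len(Y)
    for i in range(N):
        for j in range(N):
            if i != j and ind[i] and ind[j]:
                cov = pi_joint[i, j] - pi[i] * pi[j]
                var += (cov / pi_joint[i, j]) * (Y[i] / pi[i]) * (Y[j] / pi[j])
    return var
```

When exposures are independent across units, \(\pi_{ij}(d_k) = \pi_i(d_k)\pi_j(d_k)\) and the off-diagonal term vanishes.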

Shift-share designs

  • In economics, a very popular style of design is the shift-share design

    • Origins in Bartik (1991) but popularized by the “China shock” design of Autor et al. (2013)
  • Consider the linear model

    \[Y_i = \beta z_i + \epsilon_i\]

  • We’re interested in the average effect of \(z_i\) (\(\beta\))

    • \(z_i\) is typically used as an instrument but we’ll focus on the “reduced form” effect here.
  • Example (Autor et al. (2013))

    • \(Y_i\): manufacturing employment in location \(i\)
    • \(z_i\): an instrument for import growth from China post-WTO entry

Shift-share designs

  • In a shift-share design, \(z_i\) has a particular known structure

    \[z_i = \sum_{k=1}^K s_{ik} \times g_k\]

  • \(g = \{g_1, g_2, \dotsc, g_K\}\) are the shifts

  • \(s_i = \{s_{i1}, s_{i2}, \dotsc, s_{iK}\}\) are the unit-specific shares

  • Example (Autor et al. (2013))

    • \(g_k\): Industry \(k\)’s overall growth in imports from China
    • \(s_{ik}\): the share of location \(i\)’s employment in industry \(k\)
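The shift-share construction is just a matrix product of shares and shocks - a sketch with made-up numbers (3 locations, 2 industries; nothing here is from the Autor et al. data):

```python
import numpy as np

# s[i, k]: location i's employment share in industry k (rows sum to 1).
s = np.array([[0.7, 0.3],
              [0.2, 0.8],
              [0.5, 0.5]])

# g[k]: industry-level shocks (the "shifts").
g = np.array([1.5, -0.5])

# z_i = sum_k s_ik * g_k
z = s @ g
```

Each location's treatment is a share-weighted average of the common industry shocks, which is exactly why randomness in \(g\) propagates non-randomly through the \(s_{ik}\).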

Identification strategies

  • There are two approaches to identification in the shift-share setting

  • Random shares (Goldsmith-Pinkham et al., 2020)

    • In this case, the shares themselves act as many exogenous instruments.
    • Each is observed at the level of \(i\), so just do regular IV w/ the shares
  • Random shocks (Borusyak, Hull, Jaravel, 2022)

    • Analogous to an experiment with interference
    • Randomness in the shocks propagates in a non-random way via the shares
    • The shift-share structure is a known exposure mapping
    • With shares that sum to \(1\), can estimate using a shock-level regression.
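A minimal numerical check of the shock-level equivalence, under simplifying assumptions (no intercept, no controls, simulated shares and shocks): the unit-level coefficient of \(Y\) on \(z\) equals a shock-level IV coefficient using exposure-weighted averages, with \(g_k\) as the instrument and total exposure \(S_k = \sum_i s_{ik}\) as weights. This is a sketch of the algebra behind Borusyak, Hull, and Jaravel (2022), not their full estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 50, 5
s = rng.dirichlet(np.ones(K), size=N)  # shares sum to 1 within each unit
g = rng.normal(size=K)                 # shocks
z = s @ g
Y = 2.0 * z + rng.normal(size=N)       # simulated outcome

# Unit-level regression of Y on z (no intercept).
beta_unit = (z @ Y) / (z @ z)

# Shock-level equivalent: exposure-weighted averages of Y and z,
# with g_k as "instrument" and weights S_k = sum_i s_ik.
S = s.sum(axis=0)
Y_bar = (s.T @ Y) / S
z_bar = (s.T @ z) / S
beta_shock = ((S * g) @ Y_bar) / ((S * g) @ z_bar)
```

The two coefficients agree exactly, which is what lets identification and inference be carried out at the level of the \(K\) shocks rather than the \(N\) units.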

Example: Effects of the 2019 MFP

  • Gulotty and Strezhnev (2026) study the effect of the first Trump administration’s agriculture bailout on 2020 presidential vote share
    • Implement the Borusyak and Hull (2023) approach to identification with general formula instruments.
  • The paper wants to know whether money paid to farming counties increased support for Trump in 2020
    • But payments are non-random! Some places were “hurt” more by the trade war. Some places have more farmers, etc…

    • In the 2019 Market Facilitation Program (MFP), counties were assigned a single per-acre payment rate (our instrument for overall county payments)

    • The instrument had a formula structure that looked a lot like a shift-share!

      \[(\text{Rate})_i^{\text{\$/acre}} = \frac{\sum_{c=1}^C (\text{Acreage})_{i,c}^{\text{acre}} \times (\text{Yield})_{i,c}^{\text{unit/acre}} \times \text{(Rate)}_{c}^{\text{\$/unit}}}{\sum_{c=1}^C \text{(Acreage)}_{i,c}^{\text{acre}}}\]
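The county rate formula is an acreage-weighted average of commodity-level payouts - a sketch with hypothetical numbers for a single county with three commodities (none of these figures are from the actual MFP data):

```python
import numpy as np

# Hypothetical county with 3 commodities.
acreage = np.array([100.0, 50.0, 25.0])          # acres planted per commodity
yield_per_acre = np.array([40.0, 30.0, 60.0])    # units/acre
rate_per_unit = np.array([0.5, 1.0, 2.0])        # $/unit shocks set by USDA

# Per-acre county rate: acreage-weighted average of yield * rate.
county_rate = (acreage * yield_per_acre * rate_per_unit).sum() / acreage.sum()
```

Dividing through by total acreage shows the shift-share structure: the shares are (acreage \(\times\) yield) weights and the shocks are the \$/unit commodity rates.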

Example: Effects of the 2019 MFP

  • The payment rate is a function of…
    • non-random county-level shares of each commodity
    • random shocks in the compensation rate assigned to each commodity
  • We adjust for this non-randomness by computing the expected county payment rate and subtracting this from the observed county payment rate.
    • We compute the expectation by assuming shocks are exchangeable at some level.
  • Essentially
    • We assume there was an experiment run at the level of the commodities…
    • …that propagated to farmers/voters through a county-level exposure (the per-acre rate)

Example: Effects of the 2019 MFP

  • Outcome: Difference between 2020 and 2016 Trump county vote share (percentage points)

  • Treatment: 2019 MFP Payments

  • Instrument: Re-centered payment-rate instrument

  • Design-based randomization inference

    • Asymptotics unlikely to be valid with only 12 shocks!
    • Construct a randomization distribution under the null of no effect using permutations of the crop-level shocks.
  • Our test statistic: the sample covariance of \(Y\) and the recentered instrument (Borusyak and Hull, 2023)
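A toy sketch of this recentering-plus-permutation logic, with made-up dimensions (5 commodity shocks, 30 counties) and simulated data rather than the actual MFP variables; shocks are assumed fully exchangeable, so the expectation is taken over all permutations:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(2)

K, N = 5, 30
s = rng.dirichlet(np.ones(K), size=N)  # county commodity shares
g = rng.normal(size=K)                 # observed commodity shocks
Y = rng.normal(size=N)                 # outcome (the null is true here)

# Instrument under every permutation of the shocks: an N x K! matrix.
perms = np.array(list(permutations(range(K))))
Z_all = s @ g[perms].T

# Expected instrument under exchangeability; subtracting it recenters z.
mu = Z_all.mean(axis=1)

def stat(z):
    """Sample covariance of Y and the recentered instrument."""
    return np.mean((Y - Y.mean()) * (z - mu))

T_obs = stat(s @ g)
T_null = np.array([stat(Z_all[:, m]) for m in range(Z_all.shape[1])])
p_value = np.mean(np.abs(T_null) >= np.abs(T_obs))
```

With few shocks the full permutation set is enumerable (here \(5! = 120\)); with more shocks one would sample permutations instead.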

Example: Effects of the 2019 MFP

[Figure: partial regression plot]

[Figure: permutation test (full specification)]


Conclusion

  • A lot of interesting problems in causal inference deal with interference in exposures
    • Experiments conducted at one level filter down into another level of observation.
  • The trick is to figure out the exposure mapping
    • What’s the function that converts the vector of randomized treatments into the thing that matters?
    • SUTVA is a kind of super-restrictive exposure mapping
  • Exposure mappings depend on non-random unit-level characteristics
    • Adjust using standard covariate-adjustment methods!
  • A general principle for causal inference in observational designs:
    • Find the hidden experiment