Add counterfactual terminology callouts to quasi-experimental notebooks #852

@drbenvincent

Summary

Several notebooks in examples/causal_inference/ correctly use the term "counterfactual" in the Rubin/potential outcomes sense — estimating what would have happened to treated units absent treatment. However, readers familiar with Pearl's causal ladder may wonder whether the same L2/L3 confusion flagged in #849 applies here. It does not, but a brief callout in each notebook would clarify this and strengthen the pedagogy.

Proposed change

Add a short :::{admonition} callout to each of the following notebooks explaining:

  1. The notebook uses "counterfactual" in the potential outcomes (Rubin) sense — the unobserved potential outcome $Y(0)$ for treated units
  2. How this differs from Pearl's L3 unit-level counterfactuals (which require abduction of unit-specific exogenous terms)
  3. Why the usage is appropriate for the method
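To make point 2 concrete, here is a minimal sketch of the contrast. It assumes a toy structural model `Y = 2*T + U` that is invented purely for illustration; the function names are hypothetical and none of this is taken from the notebooks. The unit-level calculation needs the structural equation in order to recover `U`, whereas the group-level estimate only needs comparable control units (an identifying assumption).

```python
# Toy structural model, assumed only for this illustration: Y = 2*T + U
def outcome(t, u):
    return 2 * t + u


def unit_counterfactual(t_obs, y_obs, t_new):
    """Pearl L3 style: abduction, then action, then prediction for ONE unit."""
    u = y_obs - 2 * t_obs      # abduction: recover this unit's exogenous term
    return outcome(t_new, u)   # action + prediction under the alternative treatment


def group_counterfactual_mean(control_outcomes):
    """Potential outcomes style: estimate the mean Y(0) of the treated group
    from comparable control units, without a unit-level structural model."""
    return sum(control_outcomes) / len(control_outcomes)


# One treated unit observed with T=1, Y=5.0
y0_unit = unit_counterfactual(t_obs=1, y_obs=5.0, t_new=0)

# Hypothetical outcomes of comparable untreated units
y0_group = group_counterfactual_mean([2.5, 3.5, 3.0])
```

The quasi-experimental notebooks work in the second mode: they estimate an unobserved $Y(0)$ for the treated group under a design-specific assumption, and never abduct unit-specific exogenous terms.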

Notebooks and tailored callout content

interrupted_time_series.ipynb

  • The counterfactual here is a forecast — we extrapolate pre-intervention trends to predict what would have happened without the intervention. This is a group-level counterfactual prediction in the potential outcomes framework, not a unit-level L3 counterfactual in Pearl's sense.
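As a rough sketch of "counterfactual as forecast", the snippet below fits a straight line to made-up pre-intervention data and extrapolates it past the intervention. Ordinary least squares is used here only as a lightweight stand-in for the notebook's model, and the data, the helper `fit_line`, and the intervention time are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up series: the intervention happens at t = 5
pre_t, pre_y = [0, 1, 2, 3, 4], [10, 12, 14, 16, 18]
post_t, post_y = [5, 6, 7], [25, 27, 29]

# Fit on pre-intervention data only
intercept, slope = fit_line(pre_t, pre_y)

# Counterfactual: the pre-intervention trend extrapolated into the post period
counterfactual = [intercept + slope * t for t in post_t]

# Estimated impact: observed minus forecast counterfactual, per time point
impact = [obs - cf for obs, cf in zip(post_y, counterfactual)]
```

The estimate is only as good as the assumption that the pre-intervention trend would have continued unchanged.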

difference_in_differences.ipynb

  • The parallel trends assumption lets us use the control group's trajectory as a proxy for the treated group's counterfactual — what would have happened without treatment. This is standard counterfactual reasoning in the potential outcomes framework.
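The parallel trends logic can be sketched with the classic 2x2 table of group means. The numbers and variable names below are made up for illustration; the notebook itself fits a regression model rather than differencing raw means.

```python
# Hypothetical group means before and after treatment
control_pre, control_post = 10.0, 13.0
treated_pre, treated_post = 12.0, 19.0

# Under parallel trends, the treated group would have changed by the same
# amount as the control group had it not been treated.
control_change = control_post - control_pre

# Counterfactual: the treated group's unobserved post-period mean Y(0)
treated_counterfactual_post = treated_pre + control_change

# Difference-in-differences estimate of the effect on the treated
did_effect = treated_post - treated_counterfactual_post
```

Here the counterfactual is never observed; it is constructed entirely from the parallel trends assumption.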

regression_discontinuity.ipynb

  • Near the threshold, units are quasi-randomly assigned to treatment or control. Outcomes of control units just on the untreated side of the threshold therefore approximate the counterfactual for treated units at the boundary, enabling a local causal estimate.
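A minimal sketch of the boundary comparison: fit a line on each side of the threshold and compare the two fits at the threshold itself. The data, the threshold at 0, and the helper `fit_line` are assumptions made up for this illustration, and plain least squares stands in for the notebook's model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


threshold = 0.0

# Made-up data: units with running variable x >= threshold are treated
x_control, y_control = [-3.0, -2.0, -1.0], [2.0, 3.0, 4.0]
x_treated, y_treated = [1.0, 2.0, 3.0], [9.0, 10.0, 11.0]

c_intercept, c_slope = fit_line(x_control, y_control)
t_intercept, t_slope = fit_line(x_treated, y_treated)

# Counterfactual at the boundary: the control-side fit evaluated at the threshold
counterfactual_at_threshold = c_intercept + c_slope * threshold

# Treated-side fit evaluated at the same point
treated_at_threshold = t_intercept + t_slope * threshold

# Local effect: the size of the discontinuity at the threshold
local_effect = treated_at_threshold - counterfactual_at_threshold
```

The estimate is local: it says nothing about units far from the threshold.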

excess_deaths.ipynb

  • Same pattern as ITS: the counterfactual is a forecast from a model trained on pre-COVID data, predicting expected deaths if nothing had changed.
  • Additional fix: Line 44 references "the famous do-operator" but the notebook does not use pm.do. This should be reworded to avoid implying the do-operator is being used computationally.

Suggested references for each callout

  • Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.
  • Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press.
  • Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press. (For the L1/L2/L3 contrast.)
  • Cross-reference to interventional_what_if_do_operator.ipynb for the detailed L2/L3 distinction.

Acceptance criteria

  • Each of the four notebooks has a callout clarifying the counterfactual terminology
  • excess_deaths.ipynb line 44 do-operator reference is corrected
  • References are included in each callout
  • Existing code and narrative are otherwise preserved
  • All callouts cross-reference interventional_what_if_do_operator.ipynb for the full L2/L3 explanation

Context

This issue follows from the review and reframing done in #849 / PR #850, where counterfactuals_do_operator.ipynb was renamed and reframed because it mislabelled L2 (interventional) pm.do outputs as L3 (counterfactual). The four notebooks listed here use "counterfactual" accurately in the Rubin sense, but would benefit from an explicit note making this clear.
