In this article, I attempt to clarify the use of essential tools in the applied econometrician’s toolkit: Difference-in-Differences (DiD) and Event Study Designs. Inspired mostly by my students, this article breaks down the basic concepts and addresses common misconceptions that often confuse practitioners.
If you wonder why the title focuses on Event Studies while I am also talking about DiD, it is because, when it comes to causal inference, Event Studies are a generalization of Difference-in-Differences.
But before diving in, let me reassure you that if you are confused, there may be good reasons for it. The DiD literature has been booming with new methodologies in recent years, making it challenging to keep up. The origins of Event Study designs don’t help either…
Finance Beginnings
Event studies originated in Finance, where they were developed to assess the impact of specific events, such as earnings announcements or mergers, on stock prices. Ball and Brown (1968) pioneered the approach and laid the groundwork for the methodology.
Methodology
In Finance, the event study methodology involves identifying an event window over which to measure ‘abnormal returns’, namely the difference between actual and expected returns.
Finance Application
In the context of finance, the methodology typically involves the following steps (a short code sketch of the abnormal-return step follows the list):
- Identifying a specific event of interest, such as a company’s earnings announcement or a merger.
- Determining an “event window,” or the time period surrounding the event during which the stock price might be affected.
- Calculating the “abnormal return” of the stock by comparing its actual performance during the event window to the performance of a benchmark, such as a market index or industry average.
- Assessing the statistical significance of the abnormal return to determine whether the event had an impact on the stock price.
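To make the abnormal-return step concrete, here is a minimal Python sketch under a simple market-model benchmark. The data are simulated, and the column names, event date, and window lengths are hypothetical choices for illustration rather than a prescription.

```python
import numpy as np
import pandas as pd

# Simulated daily returns for one stock and a market index (hypothetical data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=140)
market = rng.normal(0.0005, 0.01, len(dates))
stock = 0.0002 + 1.2 * market + rng.normal(0.0, 0.01, len(dates))
returns = pd.DataFrame({"stock": stock, "market": market}, index=dates)

# Estimation window used to fit the benchmark, and an event window around a hypothetical event date.
event_date = dates[120]
estimation = returns.iloc[:100]
event_window = returns.loc[event_date - pd.Timedelta(days=5): event_date + pd.Timedelta(days=5)]

# Market model: expected return = alpha + beta * market return, fit by OLS on the estimation window.
beta, alpha = np.polyfit(estimation["market"], estimation["stock"], deg=1)

# Abnormal return = actual return minus expected return; the CAR sums them over the event window.
expected = alpha + beta * event_window["market"]
abnormal = event_window["stock"] - expected
car = abnormal.sum()
print(f"Cumulative abnormal return over the event window: {car:.4f}")
```

In practice, the last step in the list would compare this cumulative abnormal return to its standard error, typically derived from the residual variance in the estimation window, to judge statistical significance.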
This methodological approach has since evolved and expanded into other fields, most notably economics, where it has been adapted to suit a broader range of research questions and contexts.
Adaptation in Economics
Economists use Event Studies to causally evaluate the impact of economic shocks and significant policy changes.
Before explaining how Event Studies are used for causal inference, we need to touch upon Difference-in-Differences.
Difference-in-Differences (DiD) Approach
The DiD approach typically involves i) a policy adoption or an economic shock, ii) two time periods, iii) two groups, and iv) a parallel trends assumption.
Let me clarify each of these below (a simple regression formulation that ties them together follows the list):
- i) A policy adoption may be: the use of AI in the classroom in some schools; expansion of public kindergartens in some municipalities; internet availability in some areas; cash transfers to households, etc.
- ii) We refer to the period before the policy is implemented as the “pre-treatment” period (or “pre-period”) and to the period after implementation as the “post-treatment” period.
- iii) The “treatment group” consists of the units affected by the policy, and the “control group” of the units that are not. Both groups are composed of many units, such as individuals, firms, schools, or municipalities.
- iv) The parallel trends assumption is fundamental to the DiD approach: it states that, in the absence of treatment, the treatment and control groups would have followed parallel trends in the outcome over time.
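One common way to tie these four ingredients together is the textbook two-group, two-period DiD regression (the notation below is generic and mine, not taken from a particular source):

$$Y_{it} = \alpha + \beta\,\mathrm{Treat}_i + \gamma\,\mathrm{Post}_t + \delta\,(\mathrm{Treat}_i \times \mathrm{Post}_t) + \varepsilon_{it}$$

Here Treat_i equals 1 for units in the treatment group, Post_t equals 1 in the post-treatment period, and the coefficient δ on the interaction term is the DiD estimate, which has a causal interpretation only if parallel trends hold.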
A common misconception about the DiD approach is that we need random assignment.
In practice, we don’t. Although random assignment is ideal, the parallel trends assumption is sufficient to identify the causal effect of the treatment on the outcome of interest.
Randomization, however, ensures that pre-intervention differences between the groups are zero in expectation and not statistically significant (although, by chance, they may differ in a given sample).
Background
Imagine a scenario in which AI becomes available in the year 2023 and some schools immediately adopt AI as a tool in their teaching and learning processes, while other schools do not. The aim is to understand the impact of AI adoption on student emotional intelligence (EI) scores.
Data
- Treatment Group: Schools that adopted AI in 2023.
- Control Group: Schools that did not adopt AI in 2023.
- Pre-Treatment: Academic year before 2023.
- Post-Treatment: Academic year 2023–2024.
Methodology
- Pre-Treatment Comparison: Measure student EI scores for both treatment and control schools before AI adoption.
- Post-Treatment Comparison: Measure student EI scores for both treatment and control schools after AI adoption.
- Calculate Differences:
- Difference in EI scores for treatment schools between pre-treatment and post-treatment.
- Difference in EI scores for control schools between pre-treatment and post-treatment.
The DiD estimate is the difference between the two differences calculated above. Under the parallel trends assumption, it estimates the causal impact of AI adoption on EI scores.
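As a rough sketch of these steps, the snippet below computes the two differences and takes their difference from a small school-by-period panel. The DataFrame, the column names (treated, post, ei_score), and the numbers are all hypothetical; a real analysis would use your own data.

```python
import pandas as pd

# Hypothetical panel: one row per school and academic year (pre = 0, post = 1).
df = pd.DataFrame({
    "school":   ["A", "A", "B", "B", "C", "C", "D", "D"],
    "treated":  [1,   1,   1,   1,   0,   0,   0,   0],   # adopted AI in 2023?
    "post":     [0,   1,   0,   1,   0,   1,   0,   1],   # 2023-2024 academic year?
    "ei_score": [60,  72,  62,  73,  70,  76,  71,  78],
})

# Mean EI score for each group in each period.
means = df.groupby(["treated", "post"])["ei_score"].mean()

# Change over time within each group, then the difference between those changes.
diff_treated = means.loc[(1, 1)] - means.loc[(1, 0)]   # treatment schools: post minus pre
diff_control = means.loc[(0, 1)] - means.loc[(0, 0)]   # control schools: post minus pre
did = diff_treated - diff_control
print(f"DiD estimate of the effect of AI adoption on EI scores: {did:.2f}")
```

If you prefer the regression route, the same number is the coefficient on the interaction term in the specification shown earlier, for example estimated with statsmodels via the formula ei_score ~ treated * post.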
A Graphical Example
The figure below plots emotional intelligence scores on the vertical axis, while the horizontal axis measures time. Time is divided into the pre-treatment and post-treatment periods.
Counterfactual Group 2 represents what would have happened to Group 2 had it not received treatment. Ideally, we would like to measure these counterfactual scores, that is, Group 2’s scores in the absence of treatment, and compare them with the scores actually observed for Group 2 once it receives treatment. (This is the fundamental problem of causal inference: we can never observe the same group both with and without treatment.)
If we naively compared the post-treatment outcomes of Group 1 and Group 2, we would obtain a biased estimate, shown as delta OLS in the figure.
The difference-in-differences estimator instead allows us to estimate the causal effect of AI adoption on the treated schools, the average treatment effect on the treated, shown geometrically in the figure as delta ATT.
The plot indicates that the schools whose students initially had lower emotional intelligence scores were the ones that adopted AI. Post-treatment, the scores of the treatment group nearly caught up with those of the control group, whose average EI score was higher in the pre-period. The plot also suggests that, in the absence of treatment, scores would have increased for both groups along parallel trends. With treatment, however, the gap in scores between Group 2 and Group 1 narrows.
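To put rough numbers on this distinction, reuse the hypothetical figures from the code sketch above: treated schools move from an average EI score of 61 to 72.5, and control schools from 70.5 to 77. The naive post-treatment comparison gives 72.5 minus 77, roughly a negative 4.5-point “effect” (delta OLS), simply because treated schools started lower. The DiD estimate instead compares the changes over time, 11.5 minus 6.5, and attributes a positive 5-point effect to AI adoption (delta ATT).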