How to Run an Event Study in R

A step-by-step, reproducible walk-through: from raw prices to abnormal returns, CAR, CAAR, and the Patell and BMP tests, with runnable R code.

In short

To run an event study in R, estimate a normal-return model (usually the market model, lm(r_i ~ r_m)) over a pre-event estimation window, subtract the predicted return from the realised return on each event-window day to get abnormal returns, sum them into CAR per firm and average into CAAR across firms, then test significance with a cross-sectional t-test, the Patell Z, or the Boehmer-Musumeci-Poulsen (BMP) test. The free ARC calculator performs every one of these steps from a CSV upload if you would rather not code.

What an event study answers, and when to use it

An event study measures whether a specific event moved a security's price beyond what normal market movement would predict. The quantity of interest is the abnormal return: the realised return minus the return the security would have earned had the event not occurred. Typical applications are mergers and acquisitions, earnings announcements, regulatory decisions, index inclusions, and ESG or climate shocks (MacKinlay, 1997).

The mechanics are the same in every case. You estimate a model of normal returns on a quiet pre-event period, apply it to the event period, and test whether the resulting abnormal returns are jointly different from zero. This tutorial implements that pipeline in R, using dplyr for data handling and one lm() call per firm. The same logic is covered model-by-model on Expected-return models and test-by-test on Significance tests.

The data and windows you need

You need three things: event dates (one per firm or per event), a price series for each firm, and a price series for a market index used as the benchmark. From prices you compute returns. Two non-overlapping windows structure the analysis:

  • Estimation window [T0, T1]: a quiet pre-event period, typically 120 to 250 trading days, used to fit the normal-return model.
  • Event window [T1+1, T2]: the days around the event over which abnormal returns are measured, for example [-1, +1] or [-5, +5] in event time.

A gap of a few days between the windows is common so that pre-event leakage does not contaminate the estimation of normal returns; in that case the event window starts a few days after $T_1$ rather than at $T_1+1$. We use simple (discrete) returns throughout; log returns are an equally valid convention as long as you are consistent.
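As a minimal sketch, the two windows can be written in event time (day 0 is the event date) and checked for overlap before any estimation is done. The names est_win, evt_win and gap are illustrative, and the window values simply mirror the design used later in this tutorial.

```r
# Illustrative window definitions in event time (day 0 = event date).
est_win <- c(-250, -11)   # estimation window
evt_win <- c(-5, 5)       # event window

# Number of trading days in each window (both endpoints inclusive).
est_len <- est_win[2] - est_win[1] + 1
evt_len <- evt_win[2] - evt_win[1] + 1

# Gap: trading days strictly between the end of the estimation window
# and the start of the event window.
gap <- evt_win[1] - est_win[2] - 1

# The design is only valid if the estimation window ends before the
# event window starts.
stopifnot(est_win[2] < evt_win[1])

cat("estimation days:", est_len, "| event days:", evt_len, "| gap days:", gap, "\n")
```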

Key definition

The abnormal return of firm i on day t is $AR_{i,t} = R_{i,t} - E(R_{i,t})$, where $R_{i,t}$ is the realised return and $E(R_{i,t})$ is the normal return predicted by the estimation-window model.
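A small worked example may help fix the definition. It assumes the market model as the normal-return model, and every number below is made up for illustration.

```r
# Hypothetical market-model estimates and one event-day observation.
alpha_hat <- 0.0002   # assumed intercept
beta_hat  <- 1.2      # assumed market beta
r_mkt     <- 0.010    # market return on the event day (1.0%)
r_firm    <- 0.030    # realised firm return on the event day (3.0%)

# Normal (expected) return implied by the model, then the abnormal return.
expected <- alpha_hat + beta_hat * r_mkt
ar       <- r_firm - expected

cat(sprintf("expected = %.4f, abnormal return = %.4f\n", expected, ar))
```

Here the firm earned 3.0% on a day when the model predicted about 1.2%, so roughly 1.8 percentage points are attributed to the event.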

Step 1: Load and clean price data, compute returns

Start from a long-format price table with columns firm, date, and price, plus a market index series. Compute simple daily returns $R_t = P_t / P_{t-1} - 1$ within each firm.

library(dplyr)

# prices: data.frame(firm, date, price); market: data.frame(date, mkt_price)
# event_dates: data.frame(firm, event_date)

simple_returns <- function(p) p / dplyr::lag(p) - 1

stock_ret <- prices |>
  arrange(firm, date) |>
  group_by(firm) |>
  mutate(ret = simple_returns(price)) |>
  ungroup()

mkt_ret <- market |>
  arrange(date) |>
  mutate(mkt = simple_returns(mkt_price)) |>
  select(date, mkt)

# Join the market return onto every firm-day and drop the first (NA) return.
panel <- stock_ret |>
  inner_join(mkt_ret, by = "date") |>
  inner_join(event_dates, by = "firm") |>
  filter(!is.na(ret), !is.na(mkt))

Step summary. Compute simple returns per firm with $R_t = P_t/P_{t-1} - 1$, attach the matching market return to every firm-day, and drop the leading NA return.
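The Step 1 code assumes that prices, market and event_dates already exist. For a dry run without real data, one option is to simulate them in the same shape. The sketch below is only an assumption-laden stand-in: the returns are random draws, the beta of 1.1 is arbitrary, and calendar days are used in place of trading days for simplicity.

```r
set.seed(42)
n_days <- 300
firms  <- c("A", "B", "C")
dates  <- seq(as.Date("2023-01-02"), by = "day", length.out = n_days)

# Market index: random daily returns compounded into a price level.
mkt_r  <- rnorm(n_days, mean = 0.0003, sd = 0.01)
market <- data.frame(date = dates, mkt_price = 100 * cumprod(1 + mkt_r))

# Firm prices: market exposure (arbitrary beta of 1.1) plus firm-specific noise.
prices <- do.call(rbind, lapply(firms, function(f) {
  r <- 1.1 * mkt_r + rnorm(n_days, sd = 0.015)
  data.frame(firm = f, date = dates, price = 50 * cumprod(1 + r))
}))

# One event per firm, placed late enough to leave a long estimation window.
event_dates <- data.frame(firm = firms, event_date = dates[280])

str(prices)
str(event_dates)
```

Because no event effect is built into the simulated prices, running the rest of the pipeline on these inputs should, in most draws, produce a CAAR that is not significantly different from zero.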

Step 2: Estimate the normal-return (market) model

The market model regresses each firm's return on the market return over the estimation window:

$$R_{i,t} = \alpha_i + \beta_i R_{m,t} + \varepsilon_{i,t}, \qquad t \in [T_0, T_1].$$

The estimated $\hat\alpha_i$ and $\hat\beta_i$ are then held fixed and used to predict normal returns in the event window. Below we work in event time: rel_day = 0 is the event date, negative values are before, positive after. The estimation window is days [-250, -11] and the event window is [-5, +5] (adjust to your design).

# Add event-time index: trading-day offset from the event date per firm.
panel <- panel |>
  group_by(firm) |>
  arrange(date) |>
  mutate(rel_day = row_number() - match(TRUE, date >= event_date)) |>
  ungroup()

EST_WIN <- c(-250, -11)   # estimation window (event time)
EVT_WIN <- c(-5, 5)       # event window (event time)

# Fit the market model per firm on the estimation window; keep alpha, beta,
# the residual standard deviation, and the estimation-window length.
fit_one <- function(df) {
  est <- df |> filter(rel_day >= EST_WIN[1], rel_day <= EST_WIN[2])
  m   <- lm(ret ~ mkt, data = est)
  data.frame(
    alpha   = coef(m)[["(Intercept)"]],
    beta    = coef(m)[["mkt"]],
    s_ar    = summary(m)$sigma,       # residual SD = sqrt(SSR/(M-2))
    M       = nrow(est),              # non-missing estimation-window obs
    mkt_bar = mean(est$mkt),
    mkt_ss  = sum((est$mkt - mean(est$mkt))^2)
  )
}

params <- panel |>
  group_by(firm) |>
  group_modify(~ fit_one(.x)) |>
  ungroup()

Step summary. Fit lm(ret ~ mkt) per firm over the estimation window and store $\hat\alpha_i$, $\hat\beta_i$, the residual standard deviation $S_{AR_i}$, and the estimation-window length $M_i$.
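To see what the stored residual standard deviation is, the following base-R sketch fits the market model on simulated estimation-window data and checks that summary(m)$sigma equals $\sqrt{SSR/(M-2)}$. The simulated returns and the "true" alpha and beta are assumptions made up for the example.

```r
set.seed(1)
M   <- 240                                   # estimation-window length
mkt <- rnorm(M, mean = 0, sd = 0.01)         # simulated market returns
ret <- 0.0005 + 1.3 * mkt + rnorm(M, sd = 0.02)  # simulated firm returns

m <- lm(ret ~ mkt)

alpha_hat <- coef(m)[["(Intercept)"]]
beta_hat  <- coef(m)[["mkt"]]
s_ar      <- summary(m)$sigma

# Residual SD recomputed by hand: sqrt(SSR / (M - 2)).
s_manual <- sqrt(sum(resid(m)^2) / (M - 2))

cat(sprintf("alpha = %.5f, beta = %.3f, s_ar = %.5f, manual = %.5f\n",
            alpha_hat, beta_hat, s_ar, s_manual))
```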

Step 3: Compute abnormal returns, then CAR and CAAR

Abnormal returns in the event window are the realised returns minus the model prediction:

$$AR_{i,t} = R_{i,t} - \big(\hat\alpha_i + \hat\beta_i R_{m,t}\big).$$

The cumulative abnormal return for firm i over the event window is $CAR_i = \sum_{t \in [T_1+1, T_2]} AR_{i,t}$, and the cumulative average abnormal return is $CAAR = \frac{1}{N}\sum_{i=1}^{N} CAR_i$.

# Abnormal returns in the event window.
evt <- panel |>
  filter(rel_day >= EVT_WIN[1], rel_day <= EVT_WIN[2]) |>
  inner_join(params, by = "firm") |>
  mutate(ar = ret - (alpha + beta * mkt))

# Average abnormal return (AAR) per event-time day.
aar <- evt |>
  group_by(rel_day) |>
  summarise(aar = mean(ar), n = dplyr::n(), .groups = "drop")

# CAR per firm over the whole event window, then CAAR across firms.
car  <- evt |> group_by(firm) |> summarise(car = sum(ar), .groups = "drop")
caar <- mean(car$car)
N    <- nrow(car)

Step summary. $AR_{i,t}=R_{i,t}-(\hat\alpha_i+\hat\beta_i R_{m,t})$; sum over the event window for $CAR_i$, average across firms for CAAR.
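The aggregation can be checked by hand on a tiny example. The abnormal returns below are invented, three firms over a [-1, +1] window, and the sketch assumes a balanced panel (every firm has every event day), which is what makes CAAR equal to the sum of the daily AARs.

```r
# Made-up abnormal returns: rows are firms, columns are event days -1, 0, +1.
ar <- rbind(
  A = c( 0.010,  0.020, -0.005),
  B = c(-0.004,  0.015,  0.001),
  C = c( 0.002, -0.003,  0.004)
)
colnames(ar) <- c("-1", "0", "1")

car  <- rowSums(ar)    # cumulative abnormal return per firm
aar  <- colMeans(ar)   # average abnormal return per event day
caar <- mean(car)      # cumulative average abnormal return

print(car)
print(aar)
print(caar)
```

With a balanced panel, averaging CARs across firms and summing AARs across days give the same number; with missing firm-days the two can differ slightly.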

Step 4: Significance testing in R

The null is $H_0: E(CAAR) = 0$. Three tests are in common use; they differ in what they are robust to:

(a) Cross-sectional t-test. Uses the dispersion of $CAR_i$ across firms. Robust to event-induced volatility, assumes cross-sectional independence:

$$t = \sqrt{N}\,\frac{CAAR}{S_{CAAR}}, \qquad S_{CAAR}^2 = \frac{1}{N-1}\sum_{i=1}^{N}\big(CAR_i - CAAR\big)^2.$$

# (a) Cross-sectional t-test on CAR.
s_caar <- sd(car$car)                 # sample SD across firms
t_cs   <- sqrt(N) * caar / s_caar
p_cs   <- 2 * pt(-abs(t_cs), df = N - 1)
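The statistic above is an ordinary one-sample t-test on the firm-level CARs, so base R's t.test() can serve as a cross-check. The CAR values in this sketch are invented for illustration.

```r
# Made-up firm-level CARs.
car <- c(0.025, 0.012, 0.003, -0.006, 0.018)
N   <- length(car)

# Manual cross-sectional t statistic and two-sided p-value.
t_manual <- sqrt(N) * mean(car) / sd(car)
p_manual <- 2 * pt(-abs(t_manual), df = N - 1)

# The same test via stats::t.test().
tt <- t.test(car, mu = 0)

cat(sprintf("manual t = %.4f, t.test t = %.4f\n", t_manual, unname(tt$statistic)))
cat(sprintf("manual p = %.4f, t.test p = %.4f\n", p_manual, tt$p.value))
```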

(b) Patell Z. Standardises each abnormal return by its forecast-error-adjusted standard deviation before aggregating, which corrects for estimation error and unequal firm variances (Patell, 1976). For the event window, with $L_2$ event-window days:

$$z = \frac{1}{\sqrt{N}}\sum_{i=1}^{N}\frac{CSAR_i}{S_{CSAR_i}}, \quad CSAR_i = \sum_{t} SAR_{i,t}, \quad S_{CSAR_i}^2 = L_2\,\frac{M_i-2}{M_i-4},$$

where each daily standardised abnormal return is $SAR_{i,t} = AR_{i,t}/S_{AR_{i,t}}$ and the forecast-error correction inflates the variance for days whose market return is far from the estimation-window mean:

$$S_{AR_{i,t}}^2 = S_{AR_i}^2\left(1 + \frac{1}{M_i} + \frac{(R_{m,t}-\bar R_m)^2}{\sum_{s=T_0}^{T_1}(R_{m,s}-\bar R_m)^2}\right).$$

# (b) Patell Z on CAR. s_ar is the estimation-window residual SD per firm.
evt <- evt |>
  mutate(
    fe_var = s_ar^2 * (1 + 1/M + (mkt - mkt_bar)^2 / mkt_ss),  # forecast-error variance
    sar    = ar / sqrt(fe_var)                                # standardised AR
  )

patell <- evt |>
  group_by(firm) |>
  summarise(
    csar   = sum(sar),
    L2     = dplyr::n(),
    M      = first(M),
    var_cs = dplyr::n() * (first(M) - 2) / (first(M) - 4),    # L2 * (M-2)/(M-4)
    .groups = "drop"
  ) |>
  mutate(z_i = csar / sqrt(var_cs))

z_patell <- sum(patell$z_i) / sqrt(nrow(patell))
p_patell <- 2 * pnorm(-abs(z_patell))
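To get a feel for the size of the forecast-error correction, the sketch below evaluates the bracketed factor from the $S_{AR_{i,t}}^2$ formula on simulated estimation-window market returns. The helper fe_factor is a hypothetical name introduced here, and the market returns are random draws.

```r
# Hypothetical helper: the multiplier applied to the residual variance.
fe_factor <- function(mkt_t, mkt_bar, mkt_ss, M) {
  1 + 1 / M + (mkt_t - mkt_bar)^2 / mkt_ss
}

set.seed(7)
M       <- 240
mkt_est <- rnorm(M, mean = 0, sd = 0.01)      # simulated estimation-window market returns
mkt_bar <- mean(mkt_est)
mkt_ss  <- sum((mkt_est - mkt_bar)^2)

# Event day with an average market return: only the 1/M term is added.
f_typical <- fe_factor(mkt_bar, mkt_bar, mkt_ss, M)

# Event day with a market move 5 percentage points away from the mean.
f_extreme <- fe_factor(mkt_bar + 0.05, mkt_bar, mkt_ss, M)

cat(sprintf("typical day: %.4f | extreme day: %.4f\n", f_typical, f_extreme))
```

With a long estimation window the factor stays close to one on ordinary days, so the correction matters mainly for short estimation windows or unusually large market moves.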

(c) BMP (standardized cross-sectional) test. Combines Patell standardisation with a cross-sectional variance, so it is robust to event-induced volatility, the main weakness of the plain Patell Z (Boehmer, Musumeci & Poulsen, 1991):

$$t_{BMP} = \sqrt{N}\,\frac{\overline{SCAR}}{S_{\overline{SCAR}}}, \quad \overline{SCAR}=\frac{1}{N}\sum_i SCAR_i, \quad SCAR_i = \frac{CAR_i}{S_{CAR_i}}.$$

# (c) BMP test. S_CAR_i^2 scales the residual variance to the event window
# with its own forecast-error correction (see /significance-tests for the
# full S_CAR_i expression).
evt_sum <- evt |>
  group_by(firm) |>
  summarise(
    car   = sum(ar),
    L2    = dplyr::n(),
    s_ar  = first(s_ar),
    M     = first(M),
    fe_sum = sum((mkt - first(mkt_bar))^2) / first(mkt_ss),
    .groups = "drop"
  ) |>
  mutate(
    s_car = sqrt(s_ar^2 * (L2 + L2 / M + fe_sum)),  # sum of the daily forecast-error variances
    scar  = car / s_car
  )

t_bmp <- sqrt(nrow(evt_sum)) * mean(evt_sum$scar) / sd(evt_sum$scar)
p_bmp <- 2 * pt(-abs(t_bmp), df = nrow(evt_sum) - 1)

cat(sprintf("CAAR = %.4f | t(cs) = %.2f (p=%.3f) | Patell Z = %.2f (p=%.3f) | BMP t = %.2f (p=%.3f)\n",
            caar, t_cs, p_cs, z_patell, p_patell, t_bmp, p_bmp))

Step summary. Report the cross-sectional t-test for a quick read, the Patell Z to control for estimation error and unequal variances, and the BMP test when the event itself may have changed return volatility.
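The link between the Patell and BMP statistics can be made concrete. If the small $(M_i-2)/(M_i-4)$ adjustment is ignored, a Patell-style Z built from standardised CARs is $\sqrt{N}$ times their mean, and the BMP statistic is that same number divided by their cross-sectional standard deviation. The sketch below uses simulated standardised CARs whose dispersion is deliberately inflated to mimic event-induced volatility; all values are assumptions.

```r
set.seed(11)
N <- 50

# Simulated standardised CARs with SD of 2 instead of the 1 assumed by Patell.
scar <- rnorm(N, mean = 0, sd = 2)

# Patell-style Z (unit variance assumed) and BMP t (variance estimated
# from the cross-section).
z_patell_like <- sum(scar) / sqrt(N)
t_bmp         <- sqrt(N) * mean(scar) / sd(scar)

cat(sprintf("sd(scar) = %.3f | Patell-style Z = %.3f | BMP t = %.3f\n",
            sd(scar), z_patell_like, t_bmp))
```

Whenever the cross-sectional standard deviation exceeds one, the BMP statistic is smaller in absolute value than the Patell-style Z, which is how it avoids over-rejecting when the event raises return volatility.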

Run these exact statistics on your own data, free. Upload your CSV to the ARC calculator and it returns AR, CAR, CAAR, the cross-sectional t, Patell Z and BMP, with no R required.

Step 5: Interpreting output and common pitfalls

A significant positive CAAR means the sample earned returns above the model benchmark around the event; a negative CAAR means below. Read sign, magnitude (in percent), and the test statistic together. Watch for four pitfalls:

  • Event clustering. If events share calendar dates, abnormal returns are cross-sectionally correlated. The cross-sectional t-test, the Patell Z and the BMP test all assume independence across firms and tend to over-reject in that case. Use the Kolari-Pynnonen adjusted statistics (see Significance tests).
  • Thin trading. Illiquid stocks have stale prices that bias beta toward zero; consider a longer estimation window or a non-parametric rank test.
  • Confounding events. An earnings release on the same day as your event contaminates the abnormal return. Screen the sample and report robustness on a clean subsample.
  • Window length. Long windows accumulate model error and reduce power; keep the event window as short as the hypothesis allows.
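The first pitfall is easy to screen for before running any test. A minimal sketch, assuming an event_dates table in the same shape as in Step 1 and using invented firms and dates:

```r
# Made-up event table: firms A and B share an event date.
event_dates <- data.frame(
  firm       = c("A", "B", "C", "D"),
  event_date = as.Date(c("2024-03-01", "2024-03-01", "2024-05-10", "2024-07-22"))
)

# Count events per calendar date and flag dates shared by more than one firm.
per_date  <- table(event_dates$event_date)
clustered <- names(per_date)[per_date > 1]

# Share of events that fall on a shared date.
share_clustered <- mean(as.character(event_dates$event_date) %in% clustered)

print(per_date)
cat("clustered dates:", clustered, "| share of events clustered:", share_clustered, "\n")
```

This only detects exact date matches; events a day or two apart have overlapping event windows and can be correlated as well, so a stricter screen would compare windows rather than dates.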

Run it without code: the free ARC calculator

If you would rather not maintain R code, the Abnormal Return Calculator (ARC) runs this entire pipeline from a CSV upload: pick the return model on Expected-return models, choose the test battery on Significance tests, upload your request and price files, and download AR, CAR, CAAR and every test statistic. It is the same maths as the code above, validated against the published test definitions.

FAQ

How long should the estimation window be?

Most daily event studies use 120 to 250 trading days, ending a few days before the event window so pre-event leakage does not contaminate the normal-return estimate. Shorter windows raise estimation error; very long windows risk beta instability.

Single firm or multiple firms?

The pipeline is identical. With one firm you test $CAR_i$ directly with a time-series t-test; with many firms you average into CAAR and use a cross-sectional t-test, Patell Z, or BMP, which gain power from the cross-section.
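For the single-firm case, one common form of the time-series t-test scales the CAR by the estimation-window residual standard deviation times $\sqrt{L_2}$. The sketch below uses that simple form and deliberately ignores the forecast-error correction from Step 4; the residuals and event-window abnormal returns are simulated stand-ins.

```r
set.seed(3)
M  <- 240   # estimation-window length
L2 <- 11    # event-window length

# Stand-in for the market-model residuals from the estimation window.
est_resid <- rnorm(M, mean = 0, sd = 0.02)
s_ar      <- sqrt(sum(est_resid^2) / (M - 2))

# Stand-in for the firm's abnormal returns in the event window.
ar_evt <- rnorm(L2, mean = 0.004, sd = 0.02)
car    <- sum(ar_evt)

# Time-series t statistic for one firm's CAR (no forecast-error correction).
t_ts <- car / (s_ar * sqrt(L2))
p_ts <- 2 * pt(-abs(t_ts), df = M - 2)

cat(sprintf("CAR = %.4f | t = %.3f | p = %.3f\n", car, t_ts, p_ts))
```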

Which return model should I choose?

The market model is the standard default. For short daily windows it performs about as well as multi-factor models, because over a few days the expected return is small relative to the event-period return and the choice of benchmark changes the abnormal return very little. See Choosing a return model for the full decision logic.

Can I do this without writing code?

Yes. The free ARC calculator runs the same estimation, abnormal-return, and significance-testing pipeline from a CSV upload and returns AR, CAR, CAAR, the Patell Z and the BMP statistic.