Methodology

Significance tests

Parametric and non-parametric battery for distinguishing observed abnormal returns from zero. Applied researchers typically run both.

In short

Significance tests tell you whether an event study's abnormal returns are statistically different from zero rather than merely large by chance. Report the BMP (standardized cross-sectional) test as your headline statistic because it stays valid when the event itself changes return volatility; add the Patell Z and a non-parametric rank or generalized-sign test as robustness checks; treat the plain cross-sectional t-test as the simplest but least robust option. Jump to a test below, or run the full battery free on your own data in ARC.

Which test should I use?

Pick by what your sample looks like. Most researchers run one parametric and one non-parametric test side by side and report both. Every test below is computed by ARC.

Your situation | Recommended test(s) | Why
A single firm and a single event | T-test (event day or window) | The simplest case. Use the standardized versions if you have an estimation window.
Many firms, returns roughly normal, no special concerns | Patell Z, or Cross-sectional T | Standard parametric workhorses for the average effect (AAR, CAAR).
Variance rises around the event (event-induced volatility) | Standardized Cross-Sectional T (BMP), or Generalized Rank T | Both stay valid when the event itself moves the variance.
Events cluster in calendar time or industry (cross-sectional correlation) | Adjusted Patell Z, Adjusted Standardized Cross-Sectional T, or Generalized Rank T | The Kolari-Pynnönen adjustments correct for correlation across overlapping events.
Returns are skewed, have outliers, or the sample is small | Generalized Rank T or the Permutation test; Skewness-Corrected T if staying parametric | Non-parametric tests make no normality assumption and resist outliers.
A long event window | Generalized Sign Z or Generalized Rank T | Plain rank tests lose power over long windows; these hold up better.
You want one robust default for both AAR and CAAR | Generalized Rank T | It handles cross-sectional and serial correlation and event-induced volatility, and works for both.

See the full formula and null distribution for each test below, or read how to interpret a CAAR and a Patell Z.

Significance tests decide whether the abnormal returns around an event are statistically distinguishable from zero. Parametric tests (t-test, Patell Z, BMP) assume abnormal returns are approximately normal and gain reliability as the sample grows; non-parametric tests (sign, generalized sign, Corrado rank) use only the signs or ranks of abnormal returns and stay valid under non-normality, skewness, and heteroskedasticity. Because the two families fail under different conditions, applied researchers typically report at least one of each and flag any disagreement (Brown and Warner, 1985; Kothari and Warner, 2007). The spin-off study of Schipper and Smith (1983) is one applied example of this practice.

What is being tested

The null hypothesis is that the event has no effect on returns: \(H_0: E(AR) = 0\) at the firm-day level, or \(H_0: E(CAR) = 0\) and \(H_0: E(CAAR) = 0\) at the cumulative level. A test can fail in two directions. It can over-reject, declaring an effect that is not there (a specification problem, controlled by the test's size), or it can under-reject, missing a real effect (a power problem). Most of the refinements below exist to control over-rejection without sacrificing too much power.

A subtler caveat is the joint-hypothesis, or bad-model, problem: every abnormal return is measured relative to a chosen return-generating model (constant-mean, market model, or a factor model), so a "significant" result jointly tests the event and the model used to define normal returns (Kothari and Warner, 2007; Fama, 1998). The tests on this page are designed for short event windows of a few days, where this concern is mild and the statistics are well behaved. Long-horizon inference (buy-and-hold abnormal returns, calendar-time portfolios) is a separate and substantially harder problem, sketched in the related-approaches section below.

Foundations

For a single instance, the random variable is the abnormal return on the event day (\(AR\)) or the cumulative abnormal return over the event window (\(CAR\)). Abnormal returns are the residuals of a benchmark model fitted on a pre-event estimation window that is disjoint from the event window (MacKinlay, 1997). Across \(N\) firms or repeated events:

\[ AAR_{t} = \frac{1}{N} \sum_{i=1}^{N} AR_{i,t} \quad CAR_{i} = \sum_{t=T_{1}+1}^{T_{2}} AR_{i,t} \quad CAAR = \frac{1}{N} \sum_{i=1}^{N} CAR_{i} \]

The estimation window \(\{T_0, \ldots, T_1\}\) has length \(L_1 = T_1 - T_0 + 1\); the event window \(\{T_1+1, \ldots, T_2\}\) has length \(L_2 = T_2 - T_1\). The firm-level estimation-window variance:

\[ S^{2}_{AR_{i}} = \frac{1}{M_{i} - K} \sum_{t=T_{0}}^{T_{1}} AR_{i,t}^{2} \]

where \(M_i\) is the number of non-missing returns in the estimation window and \(K\) is the number of estimated model parameters, which sets the degrees-of-freedom correction (1 for constant mean, 2 for market model, 4 for FF3). There is no \(\overline{AR}\) term inside the sum because estimation-window residuals from a model with an intercept are mean-zero by construction. The cumulative variance \(S^{2}_{CAR_{i}} = L_{2}\, S^{2}_{AR_{i}}\) used below assumes the event-window abnormal returns are serially uncorrelated and homoskedastic; in practice it understates the true \(CAR\) variance, which also absorbs estimation error and serial correlation (precisely what the generalized rank test addresses). A reliable Patell or rank variance also requires \(M_i > K + 2\), which is the origin of the rule that parametric tests need a reasonably long estimation window and at least a few dozen firms.
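The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration on made-up numbers, not the ARC implementation: the function names `aggregate` and `estimation_variance` and all data values are hypothetical, and the estimation-window residuals are assumed to come from an already fitted market model (\(K = 2\)).

```python
# Toy illustration of AAR_t, CAR_i, CAAR and the estimation-window variance.
# All names and numbers are hypothetical.

def aggregate(ar_event):
    """ar_event[i][t]: abnormal return of firm i on event-window day t."""
    n = len(ar_event)
    l2 = len(ar_event[0])
    aar = [sum(firm[t] for firm in ar_event) / n for t in range(l2)]  # AAR_t
    car = [sum(firm) for firm in ar_event]                            # CAR_i
    caar = sum(car) / n                                               # CAAR
    return aar, car, caar


def estimation_variance(residuals, k):
    """S^2_AR_i: sum of squared estimation-window residuals over M_i - K."""
    m_i = len(residuals)
    return sum(r * r for r in residuals) / (m_i - k)


ar_event = [[0.01, 0.02, -0.01],   # firm 1, three event-window days
            [0.03, 0.00, 0.01]]    # firm 2
aar, car, caar = aggregate(ar_event)

residuals = [0.01, -0.02, 0.01, 0.00, -0.01, 0.01]  # one firm's estimation window
s2_ar = estimation_variance(residuals, k=2)          # market model: K = 2
s2_car = len(ar_event[0]) * s2_ar                    # L_2 * S^2_AR_i

print(aar, car, caar, s2_ar, s2_car)
```

The six residuals sum to zero, mirroring the mean-zero property that lets the variance formula omit an \(\overline{AR}\) term.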

Notation key. \(M_i\): estimation-window observations for firm \(i\). \(L_1\): estimation-window length. \(L_2\): event-window length. \(K\): model parameters (degrees of freedom). \(SAR\): standardized abnormal return (Patell). \(ASAR\): summed (aggregated) standardized abnormal return. \(SCAR\): standardized cumulative abnormal return. \(GSAR\): generalized standardized abnormal return (rank-test input). \(U_{i,t}\): demeaned scaled rank. \(\bar{r}\): average pairwise cross-correlation of estimation-window abnormal returns.

Why so many test statistics?

The tests are best understood as an evolutionary chain in which each newer test neutralizes a specific failure of the previous one. The plain t-test over-rejects when firms have heterogeneous variances, so Patell (1976) standardizes each abnormal return by its own forecast error before aggregating. Patell still assumes the event does not change return variance, but events routinely spike announcement-day volatility, which makes Patell over-reject; the BMP test of Boehmer, Musumeci, and Poulsen (1991) fixes this by re-dividing the standardized returns by their cross-sectional spread. BMP in turn assumes the abnormal returns are cross-sectionally independent, which fails when events cluster on the same calendar dates; the adjusted tests of Kolari and Pynnönen (2010) deflate the statistic for that correlation. Finally, all parametric tests can break under non-normality, motivating the rank and sign tests, and rank tests themselves lose power on multi-day cumulative returns, motivating the generalized rank test.

Clustering matters more than intuition suggests. Even a small average pairwise correlation \(\bar{r}\) among abnormal returns inflates the true standard error of the mean by a factor of approximately \(\sqrt{1 + (N-1)\bar{r}}\) (Kothari and Warner, 2007, eq. 10). With \(\bar{r} = 0.02\) and \(N = 100\) the factor is \(\sqrt{1 + 99 \times 0.02} = \sqrt{2.98} \approx 1.73\), so the true standard error is about \(1.73\) times the naive estimate and a test that ignores the correlation rejects the null far more often than its nominal size. This inflation factor is exactly what the \((1 + (N-1)\bar{r})\) denominators in the adjusted tests are correcting.
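The inflation factor is easy to evaluate directly. The sketch below uses a hypothetical helper name, `se_inflation`, and shows what happens to a naive t-statistic of 2.0 once the correlation is taken into account.

```python
from math import sqrt


def se_inflation(n, r_bar):
    """Approximate ratio of the true to the naive standard error of the mean."""
    return sqrt(1 + (n - 1) * r_bar)


factor = se_inflation(100, 0.02)   # sqrt(1 + 99 * 0.02) = sqrt(2.98)
naive_t = 2.0                      # hypothetical statistic that ignores clustering
deflated_t = naive_t / factor

print(round(factor, 3), round(deflated_t, 3))
```

A statistic that looks significant at the 5% level under independence falls well below the critical value after deflation, which is the practical reason clustering cannot be ignored.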

When to use which test

The matrix below summarizes the robustness profile of each test; "robust to event-induced variance" means the test holds its size when the event itself raises return volatility, and "robust to cross-sectional correlation" means it holds its size when events cluster in calendar time.

Test | Null distribution | Assumes normality? | Robust to event-induced variance? | Robust to cross-sectional correlation? | Handles non-normality?
Cross-Sectional T | \(t_{N-1}\) | Yes | Yes | No | No
CDA T | \(t\) | Yes | No | Partial (clustering) | No
Patell Z | \(N(0,1)\) | Yes | No | No | No
Adjusted Patell Z | \(N(0,1)\) | Yes | No | Yes | No
BMP | \(t_{N-1}\) | Yes | Yes | No | No
Adjusted BMP | \(t_{N-1}\) | Yes | Yes | Yes | No
Sign Z | \(N(0,1)\) | No | Partial | No | Yes
Generalized Sign Z | \(N(0,1)\) | No | Partial | No | Yes
Corrado Rank Z | \(N(0,1)\) | No | No | No | Yes
Generalized Rank T | \(t_{L_1 - 1}\) | No | Yes | Yes | Yes
Generalized Rank Z | \(N(0,1)\) | No | Yes | Yes | Yes

As scenario guidance: for a standard study, report the Cross-Sectional T plus the Patell Z and one non-parametric test. If event-induced variance is likely (most announcement studies), use BMP together with a rank test. If event dates cluster, use the Adjusted BMP or a calendar-time approach. Under thin trading without a variance change, the Corrado rank test is best specified and most powerful, but when variance also rises it becomes misspecified and the generalized sign test is the fallback (Cowan and Sergeant, 1996). For maximum robustness, the generalized rank test of Kolari and Pynnönen (2011) is robust to non-normality, event-induced variance, and clustering simultaneously. As a default progression, the modern literature treats the plain Patell Z and plain cross-sectional t as over-rejecting under conditions that are the norm rather than the exception, and BMP, Adjusted BMP, and the generalized rank test as the preferred defaults.

Parametric tests

T Test

Asks whether a single firm's abnormal return is large relative to its own estimation-window scatter. Tests \(H_0: E(AR_{i,0}) = 0\) (event day) or \(H_0: E(CAR_i) = 0\) (event window). Simplest test; sensitive to cross-sectional and event-induced volatility and to non-normality (Brown and Warner, 1985). Distributed Student-t.

\[ t = \frac{AR_{i,t}}{S_{AR_{i}}}, \qquad t = \frac{CAR_{i}}{S_{CAR_{i}}}, \quad S^{2}_{CAR_{i}} = L_{2}\, S^{2}_{AR_{i}} \]

Cross-Sectional T (CSect T)

Asks whether the average abnormal return is large relative to its spread across firms, which lets the data report any event-induced volatility itself. Tests \(H_0: E(AAR_0) = 0\) or \(H_0: E(CAAR) = 0\); assumes cross-sectional independence. Distributed \(t_{N-1}\).

\[ t = \sqrt{N}\, \frac{AAR_{0}}{S_{AAR,0}}, \quad S^{2}_{AAR,0} = \frac{1}{N-1} \sum_{i=1}^{N}(AR_{i,0} - AAR_{0})^{2} \] \[ t = \sqrt{N}\, \frac{CAAR}{S_{CAAR}}, \quad S^{2}_{CAAR} = \frac{1}{N-1} \sum_{i=1}^{N}(CAR_{i} - CAAR)^{2} \]
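As a minimal sketch, the cross-sectional t needs only the vector of event-day abnormal returns (or of \(CAR_i\) for the window version). The function name and the five returns below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cross_sectional_t(values):
    """sqrt(N) * mean / cross-sectional standard deviation (N - 1 divisor)."""
    n = len(values)
    return sqrt(n) * mean(values) / stdev(values)


ar0 = [0.02, -0.01, 0.03, 0.01, 0.00]   # hypothetical event-day abnormal returns
t_stat = cross_sectional_t(ar0)          # compare against t with N - 1 = 4 d.o.f.
print(round(t_stat, 4))
```

Passing the firms' \(CAR_i\) values instead of `ar0` gives the \(CAAR\) version of the same statistic.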

Crude Dependence Adjusted T (CDA T)

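Collapses cross-sectional dependence into a single number by using the time-series standard deviation of the portfolio mean over the estimation window, at the cost of ignoring firm-level variance differences. Same null. Originates with Brown and Warner (1985). Because \(S_{AAR}\) is already the standard deviation of the cross-sectional mean, the ratio is not rescaled by \(\sqrt{N}\).

\[ t = \frac{AAR_{0}}{S_{AAR}}, \quad S^{2}_{AAR} = \frac{1}{M-1} \sum_{t=T_{0}}^{T_{1}}\!\left( AAR_{t} - \overline{AAR} \right)^{2} \]

where \(M\) is the number of estimation-window days and \(\overline{AAR}\) is the estimation-window mean of \(AAR_t\).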

Patell Z (standardized residual test)

Divides each abnormal return by its own forecast-error-corrected standard deviation so that noisier firms count less, then aggregates. The correction inflates the estimation-window variance to reflect that the event-window value is an out-of-sample prediction. Controls for cross-firm differences in residual variance; sensitive to event-induced volatility and to cross-sectional correlation (Patell, 1976). Distributed \(N(0,1)\).

\[ SAR_{i,0} = \frac{AR_{i,0}}{S_{AR_{i,0}}}, \quad S^{2}_{AR_{i,0}} = S^{2}_{AR_{i}}\!\left( 1 + \frac{1}{M_{i}} + \frac{(R_{m,0} - \overline{R}_{m})^{2}}{\sum_{t=T_{0}}^{T_{1}}(R_{m,t} - \overline{R}_{m})^{2}} \right) \]

The aggregated statistic divides the sum of standardized returns by the standard deviation of that sum, so \(S_{ASAR}\) is the scale of \(ASAR_0\), not a per-firm quantity. The variance term is written generically in the model degrees of freedom \(K\) (it reduces to \((M_i - 2)/(M_i - 4)\) for the market model):

\[ z = \frac{ASAR_{0}}{S_{ASAR}}, \quad ASAR_{0} = \sum_{i=1}^{N} SAR_{i,0}, \quad S^{2}_{ASAR} = \sum_{i=1}^{N} \frac{M_{i}-K}{M_{i}-K-2} \]
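The two Patell steps, standardizing with the forecast-error correction and then aggregating, can be sketched as follows. The function names are hypothetical, the forecast-error term is the market-model form shown above, and the inputs are made-up numbers (50 firms with identical standardized returns of 0.37 and 250 estimation-window observations each).

```python
from math import sqrt


def forecast_error_sd(s_ar, rm_event, rm_estimation):
    """Market-model forecast-error-corrected standard deviation S_AR_i,0."""
    m_i = len(rm_estimation)
    rm_bar = sum(rm_estimation) / m_i
    ss_rm = sum((r - rm_bar) ** 2 for r in rm_estimation)
    return s_ar * sqrt(1 + 1 / m_i + (rm_event - rm_bar) ** 2 / ss_rm)


def patell_z(sar, m, k):
    """Sum of standardized returns over the standard deviation of that sum."""
    asar = sum(sar)
    var_asar = sum((m_i - k) / (m_i - k - 2) for m_i in m)
    return asar / sqrt(var_asar)


# One firm: estimation-window sd of 2%, event-day market return of 1%.
sd0 = forecast_error_sd(0.02, 0.01, [0.00, 0.01, -0.01, 0.02, -0.02])

# Fifty firms, market model (K = 2), 250 estimation-window days each.
z = patell_z([0.37] * 50, [250] * 50, k=2)
print(round(sd0, 5), round(z, 3))
```

With a long estimation window each variance term \((M_i - K)/(M_i - K - 2)\) is close to 1, so the denominator is close to \(\sqrt{N}\).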

Adjusted Patell Z (Kolari-Pynnönen 2010)

Deflates the Patell statistic for cross-sectional correlation among estimation-period abnormal returns, the bias that arises when events cluster on the same dates (Kolari and Pynnönen, 2010). Distributed \(N(0,1)\).

\[ z_{\text{adj}} = z \cdot \sqrt{\frac{1 - \bar{r}}{1 + (N-1)\,\bar{r}}} \]

where \(\bar{r}\) is the average pairwise sample cross-correlation of estimation-period abnormal returns. (Note the \((1 - \bar{r})\) numerator: it is part of the correct Kolari-Pynnönen form, not an optional refinement.)

Standardized Cross-Sectional T (BMP, Boehmer-Musumeci-Poulsen 1991)

Takes the Patell-standardized abnormal returns and re-divides them by their cross-sectional standard deviation in the event window, so any common volatility spike caused by the event is differenced out. This robustness to event-induced variance is the reason BMP has displaced the plain Patell test as the default short-window parametric test; it still assumes cross-sectional independence (Boehmer, Musumeci, and Poulsen, 1991). Distributed \(t_{N-1}\). In its textbook form the statistic is the cross-sectional mean of the standardized returns over their standard deviation:

\[ t = \sqrt{N}\, \frac{\overline{SAR}_{0}}{S(SAR_{0})}, \quad S^{2}(SAR_{0}) = \frac{1}{N-1} \sum_{i=1}^{N}\!\left( SAR_{i,0} - \overline{SAR}_{0} \right)^{2} \]

(This is algebraically identical to \(t = ASAR_0 / (\sqrt{N}\, S(SAR_0))\) because \(ASAR_0 = N\,\overline{SAR}_0\).) For the event window the standardized cumulative return uses the forecast-error-corrected \(CAR\) standard deviation:

\[ t = \sqrt{N}\, \frac{\overline{SCAR}}{S(SCAR)}, \quad SCAR_{i} = \frac{CAR_{i}}{S(CAR_{i})}, \quad S^{2}(CAR_{i}) = L_{2}\, S^{2}_{AR_{i,0}}, \quad S^{2}(SCAR) = \frac{1}{N-1} \sum_{i=1}^{N}\!\left( SCAR_{i} - \overline{SCAR} \right)^{2} \]
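A minimal sketch of the BMP calculation, assuming the forecast-error-corrected standard deviations are already available for each firm. The function name and the four firms' numbers are hypothetical; the second computation checks the algebraic identity with \(ASAR_0\) stated above.

```python
from math import sqrt
from statistics import mean, stdev


def bmp_t(standardized):
    """sqrt(N) * mean(SAR) / cross-sectional standard deviation of SAR."""
    n = len(standardized)
    return sqrt(n) * mean(standardized) / stdev(standardized)


ar0 = [0.02, -0.01, 0.03, 0.01]   # hypothetical event-day abnormal returns
sd0 = [0.02, 0.01, 0.03, 0.02]    # forecast-error-corrected standard deviations
sar0 = [a / s for a, s in zip(ar0, sd0)]

t_bmp = bmp_t(sar0)
t_alt = sum(sar0) / (sqrt(len(sar0)) * stdev(sar0))   # ASAR_0 / (sqrt(N) * S)
print(round(t_bmp, 4), round(t_alt, 4))
```

Feeding the same function a list of \(SCAR_i\) values gives the event-window version of the test.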

Adjusted Standardized Cross-Sectional T (Kolari-Pynnönen)

Augments the BMP test with the same cross-correlation adjustment as the Adjusted Patell, restoring correct size under event clustering (Kolari and Pynnönen, 2010). Distributed \(t_{N-1}\).

\[ t_{\text{adj}} = t \cdot \sqrt{\frac{1 - \bar{r}}{1 + (N-1)\,\bar{r}}} \]
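The adjustment has two ingredients: an estimate of \(\bar{r}\) and the scaling factor itself. The sketch below uses hypothetical function names and assumes the firms' estimation-window abnormal returns are aligned on common calendar dates, since a pairwise correlation is only meaningful for overlapping observations.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_correlation(residuals):
    """r_bar: average correlation over all distinct pairs of firms."""
    pairs = list(combinations(residuals, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def kp_adjust(stat, n, r_bar):
    """Scale a statistic by sqrt((1 - r_bar) / (1 + (N - 1) * r_bar))."""
    return stat * sqrt((1 - r_bar) / (1 + (n - 1) * r_bar))


# Three toy residual series with correlations of +1, -1 and -1.
r_toy = mean_pairwise_correlation([[1, 2, 3], [2, 4, 6], [3, 2, 1]])

# A hypothetical BMP statistic of 2.275 from 50 firms with r_bar = 0.02.
t_adj = kp_adjust(2.275, 50, 0.02)
print(round(r_toy, 4), round(t_adj, 3))
```

Even a correlation of 0.02 removes roughly 30% of the statistic in this example, enough to move a result from significant to insignificant at the 5% level.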

Skewness Corrected T (Hall 1992)

Applies a polynomial transformation in the sample skewness so the t-statistic's distribution is closer to normal, improving size when abnormal returns are skewed (relevant for longer windows and smaller samples). The same skewness-adjusted polynomial originates with Johnson (1978).

\[ t = \sqrt{N}\, \left( S + \tfrac{1}{3}\,\gamma\, S^{2} + \tfrac{1}{27}\,\gamma^{2}\, S^{3} + \tfrac{1}{6N}\,\gamma \right) \] \[ \gamma = \frac{N}{(N-2)(N-1)} \sum_{i=1}^{N} \frac{(AR_{i,0} - AAR_{0})^{3}}{S^{3}_{AAR,0}}, \quad S = \frac{AAR_{0}}{S_{AAR,0}} \]
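The polynomial correction can be sketched directly from the formula. Function names and both samples are hypothetical; the symmetric sample has zero sample skewness, so the corrected statistic collapses to the ordinary cross-sectional t, while the right-skewed sample with a positive mean is pushed upward.

```python
from math import sqrt
from statistics import mean, stdev


def plain_t(values):
    """Ordinary cross-sectional t, for comparison."""
    return sqrt(len(values)) * mean(values) / stdev(values)


def skewness_corrected_t(values):
    """sqrt(N) * (S + gamma*S^2/3 + gamma^2*S^3/27 + gamma/(6N))."""
    n = len(values)
    centre = mean(values)
    sd = stdev(values)
    s = centre / sd
    gamma = (n / ((n - 2) * (n - 1))
             * sum((v - centre) ** 3 for v in values) / sd ** 3)
    return sqrt(n) * (s + gamma * s ** 2 / 3
                      + gamma ** 2 * s ** 3 / 27 + gamma / (6 * n))


symmetric = [0.02, -0.01, 0.03, 0.01, 0.00]   # zero sample skewness
skewed = [0.00, 0.00, 0.01, 0.01, 0.08]       # one large positive outlier

print(plain_t(symmetric), skewness_corrected_t(symmetric))
print(plain_t(skewed), skewness_corrected_t(skewed))
```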

Non-parametric tests

Sign Z

Asks whether the count of positive abnormal returns exceeds what chance alone (a 50/50 split) would produce; robust to skewness, weaker for longer windows. The basic sign test is classical and in event studies traces to Brown and Warner (1985). Distributed \(N(0,1)\).

\[ z = \frac{w - N \cdot 0.5}{\sqrt{N \cdot 0.5 \cdot 0.5}} \]

where \(w\) is the number of positive \(AR_{i,0}\) (or positive \(CAR_i\)).

Generalized Sign Z (Cowan 1992)

Replaces the 0.5 benchmark with \(\widehat{p}\), the empirical positive fraction estimated from the estimation window, which corrects for any natural asymmetry in the return distribution (Cowan, 1992). Following the standard implementation, \(\widehat{p}\) is computed as the average across firms of each firm's estimation-window positive fraction. Distributed \(N(0,1)\).

\[ z = \frac{w - N \cdot \widehat{p}}{\sqrt{N \cdot \widehat{p}(1 - \widehat{p})}}, \quad \widehat{p} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{M_i}\sum_{t=T_0}^{T_1}\mathbf{1}\!\left[AR_{i,t} > 0\right] \]
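Both sign tests share one formula and differ only in the benchmark proportion. In the sketch below the function names and the toy estimation-window returns are hypothetical; `sign_z` with its default `p=0.5` is the plain sign test, and passing an estimated \(\widehat{p}\) gives the generalized version.

```python
from math import sqrt


def positive_fraction(estimation_ars):
    """p_hat: average across firms of each firm's share of positive returns."""
    shares = [sum(1 for a in firm if a > 0) / len(firm) for firm in estimation_ars]
    return sum(shares) / len(shares)


def sign_z(w, n, p=0.5):
    """(w - N*p) / sqrt(N*p*(1-p)); p = 0.5 gives the plain sign test."""
    return (w - n * p) / sqrt(n * p * (1 - p))


# Two toy firms with 2 of 4 and 3 of 4 positive estimation-window returns.
p_toy = positive_fraction([[0.01, -0.01, 0.02, -0.03],
                           [0.01, 0.02, -0.01, 0.03]])

z_plain = sign_z(34, 50)            # 34 positives out of 50 against 0.5
z_gen = sign_z(34, 50, p=0.48)      # same count against a hypothetical p_hat
print(p_toy, round(z_plain, 3), round(z_gen, 3))
```

When fewer than half of normal-period returns are positive, the generalized statistic is larger than the plain one for the same count of positives, because the benchmark it has to beat is lower.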

Corrado Rank Test (Corrado 1989)

Pools each firm's estimation and event window, converts every abnormal return to a scaled rank, and asks whether the event-day mean rank sits unusually far from its midpoint of 0.5. Its distinctive strengths are that it does not require the cross-sectional return distribution to be symmetric (an advantage over the sign test) and that it is better specified and more powerful than the t-test under non-normality (Corrado, 1989). Its limitation is that power deteriorates for multi-day cumulative windows and it can be misspecified under event-induced variance, which is what motivates the generalized rank test that follows. Distributed \(N(0,1)\).

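\[ K_{i,t} = \frac{\operatorname{rank}(AR_{i,t})}{1 + M_{i} + L_{2,i}}, \quad \bar{K}_{t} = \frac{1}{N_{t}} \sum_{i=1}^{N_{t}} K_{i,t} \] \[ z = \frac{\bar{K}_{0} - 0.5}{S_{\bar{K}}}, \quad S_{\bar{K}}^{2} = \frac{1}{L_{1}+L_{2}} \sum_{t=T_{0}}^{T_{2}}\!\left( \bar{K}_{t} - 0.5 \right)^{2} \]

where \(N_t\) is the number of firms with a non-missing return on day \(t\) and \(L_{2,i}\) is the number of non-missing event-window returns for firm \(i\).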
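A minimal sketch of the rank test for the simplest case: no missing returns, so every firm has the same pooled window and \(N_t = N\) on every day. Function names and the two toy firms (four estimation days followed by one event day) are hypothetical, and ties receive average ranks, which is an assumption of this sketch.

```python
from math import sqrt


def ranks(xs):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            out[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return out


def corrado_z(ar, event_index):
    """ar[i][t]: firm i's pooled estimation + event window, no missing values."""
    n, t_len = len(ar), len(ar[0])
    k = [[rk / (t_len + 1) for rk in ranks(firm)] for firm in ar]   # scaled ranks
    k_bar = [sum(k[i][t] for i in range(n)) / n for t in range(t_len)]
    s_k = sqrt(sum((kb - 0.5) ** 2 for kb in k_bar) / t_len)
    return (k_bar[event_index] - 0.5) / s_k


ar = [[0.01, -0.02, 0.00, 0.015, 0.03],     # last entry is the event day
      [-0.01, 0.02, 0.005, -0.005, 0.04]]
z_rank = corrado_z(ar, event_index=4)
print(round(z_rank, 4))
```

Both toy firms have their highest return on the event day, so the mean scaled rank sits at 5/6, well above the 0.5 midpoint.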

Generalized Rank T (Kolari-Pynnönen 2011)

Accounts for cross-sectional and serial correlation as well as event-induced volatility, making it the most robust of the lot (Kolari and Pynnönen, 2011). It works on generalized standardized abnormal returns (\(GSAR\)): the event window is collapsed into a single cumulative observation whose standardized value (\(SCAR_i\)) is re-standardized by its cross-sectional standard deviation, while on estimation-window days \(GSAR\) is simply the standardized abnormal return \(SAR\). That single event observation is indexed \(L_1 + 1\), one position after the \(L_1\) estimation-window observations, so each firm contributes \(T = L_1 + 1\) ranked values. Ranks of the \(GSAR\) are demeaned and scaled, then transformed; the resulting statistic is Student-t with \(L_1 - 1\) (that is, \(T - 2\)) degrees of freedom.

\[ U_{i,t} = \frac{\operatorname{rank}(GSAR_{i,t})}{L_{1}+2} - 0.5, \quad \bar{U}_{t} = \frac{1}{N} \sum_{i=1}^{N} U_{i,t} \] \[ t = Z \cdot \sqrt{\frac{L_{1} - 1}{L_{1} - Z^{2}}}, \quad Z = \frac{\bar{U}_{L_{1}+1}}{S_{\bar{U}}} \]

Generalized Rank Z

Simpler normal-approximation variant of the Generalized Rank T, using the closed-form variance of the demeaned scaled rank. Distributed \(N(0,1)\).

\[ z = \frac{\bar{U}_{L_{1}+1}}{S_{\bar{U}_{L_{1}+1}}}, \quad S_{\bar{U}_{L_{1}+1}}^{2} = \frac{L_{1}}{12\, N\,(L_{1}+2)} \]
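Both generalized rank statistics can be sketched from a matrix of \(GSAR\) values, taken here as given, with the event observation in the last column. Two points are assumptions of this sketch rather than statements from the text above: \(S_{\bar{U}}\) is computed as the root mean square of \(\bar{U}_t\) over all \(L_1 + 1\) ranked observations, and ties receive average ranks. Function names and data are hypothetical.

```python
from math import sqrt


def ranks(xs):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            out[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return out


def generalized_rank(gsar):
    """gsar[i]: L_1 estimation-window values followed by one event observation."""
    n = len(gsar)
    t_len = len(gsar[0])                      # T = L_1 + 1
    l1 = t_len - 1
    u = [[rk / (l1 + 2) - 0.5 for rk in ranks(firm)] for firm in gsar]
    u_bar = [sum(u[i][t] for i in range(n)) / n for t in range(t_len)]
    s_u = sqrt(sum(x * x for x in u_bar) / t_len)   # assumed definition of S_U
    z_stat = u_bar[-1] / s_u
    grank_t = z_stat * sqrt((l1 - 1) / (l1 - z_stat ** 2))
    grank_z = u_bar[-1] / sqrt(l1 / (12 * n * (l1 + 2)))
    return grank_t, grank_z


gsar = [[0.5, -1.0, 0.0, 0.75, 1.5],      # toy values, event observation last
        [-0.5, 1.0, 0.25, -0.25, 2.0]]
grank_t, grank_z = generalized_rank(gsar)
print(round(grank_t, 4), round(grank_z, 4))
```

With only four estimation-window observations the example is far too small for real inference; it is sized so the arithmetic can be followed by hand.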

Wilcoxon Signed-Rank (Wilcoxon 1945)

Tests whether the distribution of \(AR_{i,0}\) is symmetric about zero (its central location is zero), using signed ranks so that both the sign and the magnitude of each abnormal return enter. It is preferred over the plain sign test when magnitudes are informative and over the Corrado rank test when only the event-day cross-section is of interest. The statistic is the sum of signed ranks, with a large-sample normal approximation; it is not available for the \(CAAR\) null.

\[ W = \sum_{i=1}^{N} \operatorname{sgn}(AR_{i,0}) \cdot \operatorname{rank}(|AR_{i,0}|), \quad E[W] = 0, \quad \operatorname{Var}[W] = \frac{N(N+1)(2N+1)}{6} \]
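The signed-rank sum and its normal approximation can be sketched as below. The function name and data are hypothetical, and the sketch assumes there are no zero returns and no ties in absolute value, both of which need extra handling in real data.

```python
from math import sqrt


def wilcoxon_signed_rank(values):
    """Return (W, z) assuming no zeros and no ties among |values|."""
    n = len(values)
    order = sorted(range(n), key=lambda i: abs(values[i]))
    rank = [0] * n
    for position, i in enumerate(order):
        rank[i] = position + 1                     # rank of |AR_i,0|
    w = sum(rank[i] if values[i] > 0 else -rank[i] for i in range(n))
    variance = n * (n + 1) * (2 * n + 1) / 6
    return w, w / sqrt(variance)


ar0 = [0.02, -0.01, 0.03, 0.015, -0.005]   # hypothetical event-day abnormal returns
w, z_w = wilcoxon_signed_rank(ar0)
print(w, round(z_w, 4))
```

The two negative returns are also the two smallest in magnitude, so they subtract only ranks 1 and 2 from the total, which is how magnitude information enters the statistic.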

Permutation (randomization) Test

Resampling-based and distribution-free: the chosen statistic is recomputed many times under reshuffled event-day labels (or re-drawn pseudo-event days), and the \(p\)-value is the fraction of resampled statistics at least as extreme as the observed one. Concretely, recompute the statistic \(B\) times and set \(p = \#\{\,|\text{stat}^{*}| \ge |\text{stat}_{\text{obs}}|\,\}/B\). The approach is robust to non-normality (unlike the t-test) but computationally more expensive; it follows the randomization argument of Corrado (1989) and the bootstrap-in-event-studies literature (Kramer, 2001).
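One possible resampling scheme, re-drawing a pseudo-event day for every firm from its own pooled window and using the cross-sectional mean as the statistic, can be sketched as follows. The function name, the choice of statistic, the number of draws and the seed are all assumptions of this sketch; other schemes fit the same description.

```python
import random


def permutation_p(ar, event_index, b=2000, seed=0):
    """Two-sided p-value: share of pseudo-event means at least as extreme."""
    rng = random.Random(seed)
    n, t_len = len(ar), len(ar[0])
    observed = sum(firm[event_index] for firm in ar) / n
    hits = 0
    for _ in range(b):
        # Draw one pseudo-event day per firm and recompute the mean.
        stat = sum(firm[rng.randrange(t_len)] for firm in ar) / n
        if abs(stat) >= abs(observed):
            hits += 1
    return hits / b


# Five toy firms: 19 quiet days alternating +/-1%, then a 5% event-day return.
quiet = [0.01 if t % 2 == 0 else -0.01 for t in range(19)]
ar = [quiet + [0.05] for _ in range(5)]
p_event = permutation_p(ar, event_index=19)

# Two firms whose event-day returns cancel: the observed mean is zero.
p_null = permutation_p([[0.02, -0.01, 0.01], [0.01, 0.03, -0.01]], event_index=2)
print(p_event, p_null)
```

In the second example every resampled statistic is at least as extreme as an observed mean of zero, so the p-value is exactly 1.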

Each test above has a cumulative analog for the \(CAAR\) null: cumulative sign and generalized sign, cumulative BMP and Kolari-Pynnönen, and a cumulative rank statistic (CUMRANK-T/Z), matching the CAAR output of the analysis engine. The Wilcoxon test is the exception, as it is not defined for the CAAR null.

Several methods sit outside this short-window battery and are listed here for completeness. For long horizons, buy-and-hold abnormal returns (BHAR) with a t-test and the calendar-time portfolio (Jaffe-Mandelker) approach are the standard tools, alongside Lyon-Barber-Tsai, the RATS approach (Ibbotson, 1975), jackknife (Giaccotto-Sfiridis), and bootstrap analogs of the parametric tests. The calendar-time portfolio test is also the canonical remedy for cross-sectional dependence when events cluster heavily in calendar time. Once abnormal returns are established as significant, a second stage typically explains their size: regressing \(CAR\) on firm characteristics with heteroskedasticity-robust standard errors, or comparing \(CAR\) across groups with a Welch t-test (two groups) or Welch ANOVA (three or more). The return-model choices that produce the abnormal returns (market, Fama-French, mean-adjusted, Scholes-Williams, GARCH) are detailed on the companion abnormal-returns and models page.

Worked example

Consider a study of \(N = 50\) firms with an estimation-window length \(L_1 = 250\) and a one-day event window, where the average event-day abnormal return is \(AAR_0 = 1.2\%\). Suppose the cross-sectional standard deviation of the event-day abnormal returns is \(S_{AAR,0} = 3.3\%\), the cross-sectional standard deviation of the Patell-standardized returns is \(S(SAR_0) = 1.15\), the average standardized return is \(\overline{SAR}_0 = 0.37\), so that the summed standardized return is \(ASAR_0 = 50 \times 0.37 = 18.5\), and 34 of the 50 abnormal returns are positive.

  • Cross-Sectional T: \(t = \sqrt{50}\,(0.012/0.033) \approx 2.57\). With \(t_{49}\) critical value \(\approx 2.01\), reject \(H_0\) at 5% two-sided.
  • Patell Z: with \(ASAR_0 = 50 \times 0.37 = 18.5\) and, assuming the market model (\(K = 2\)) with \(M_i = 250\) for every firm, \(z = 18.5/\sqrt{50 \cdot 248/246} \approx 2.61\). Against \(N(0,1)\), \(|z| > 1.96\) rejects \(H_0\) at 5% two-sided.
  • BMP t: \(t = \sqrt{50}\,(0.37/1.15) \approx 2.28\). Against \(t_{49}\), this rejects \(H_0\) at 5%, but by a smaller margin, reflecting that BMP does not assume the event leaves variance unchanged.
  • Sign Z: with \(w = 34\) positives, \(z = (34 - 25)/\sqrt{50 \cdot 0.25} = 9/3.54 \approx 2.55\), which also rejects at 5%.

All four agree here, which is the reassuring case. When tests disagree, the rule of thumb is to favor the more robust statistic that matches the suspected threat (BMP or Adjusted BMP under event-induced variance or clustering, a rank test under non-normality) and to report the disagreement rather than cherry-pick.
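The arithmetic of the worked example can be reproduced in a few lines. The Patell line assumes the market model (\(K = 2\)) with 250 estimation-window observations for every firm, which the example does not state explicitly; the other three lines use only the numbers given.

```python
from math import sqrt

n = 50

cs_t = sqrt(n) * 0.012 / 0.033                            # Cross-Sectional T
patell = n * 0.37 / sqrt(n * (250 - 2) / (250 - 2 - 2))   # Patell Z, assumed K = 2
bmp = sqrt(n) * 0.37 / 1.15                               # BMP t
sign = (34 - n * 0.5) / sqrt(n * 0.5 * 0.5)               # Sign Z

for name, value in [("CSect T", cs_t), ("Patell Z", patell),
                    ("BMP t", bmp), ("Sign Z", sign)]:
    print(f"{name}: {value:.3f}")
```

All four values exceed the two-sided 5% critical values quoted in the list (about 2.01 for \(t_{49}\) and 1.96 for the standard normal).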

References

  1. Brown, S. J., and J. B. Warner. 1985. "Using Daily Stock Returns: The Case of Event Studies." Journal of Financial Economics 14 (1): 3-31. https://doi.org/10.1016/0304-405X(85)90042-X
  2. Corrado, C. J. 1989. "A Nonparametric Test for Abnormal Security-Price Performance in Event Studies." Journal of Financial Economics 23 (2): 385-395. https://doi.org/10.1016/0304-405X(89)90064-0
  3. Cowan, A. R. 1992. "Nonparametric Event Study Tests." Review of Quantitative Finance and Accounting 2 (4): 343-358. https://doi.org/10.1007/BF00939016
  4. Fama, E. F. 1998. "Market Efficiency, Long-Term Returns, and Behavioral Finance." Journal of Financial Economics 49 (3): 283-306. https://doi.org/10.1016/S0304-405X(98)00026-9
  5. Hall, P. 1992. "On the Removal of Skewness by Transformation." Journal of the Royal Statistical Society: Series B (Methodological) 54 (1): 221-228. https://doi.org/10.1111/j.2517-6161.1992.tb01875.x
  6. Johnson, N. J. 1978. "Modified t Tests and Confidence Intervals for Asymmetrical Populations." Journal of the American Statistical Association 73 (363): 536-544. https://doi.org/10.1080/01621459.1978.10480051
  7. Kothari, S. P., and J. B. Warner. 2007. "Econometrics of Event Studies." In Handbook of Empirical Corporate Finance, vol. 1, ch. 1, 3-36. Elsevier. https://doi.org/10.1016/B978-0-444-53265-7.50015-9
  8. Kramer, L. A. 2001. "Alternative Methods for Robust Analysis in Event Study Applications." In Advances in Investment Analysis and Portfolio Management, vol. 8, 109-132.
  9. Patell, J. M. 1976. "Corporate Forecasts of Earnings Per Share and Stock Price Behavior: Empirical Test." Journal of Accounting Research 14 (2): 246-276. https://doi.org/10.2307/2490543
  10. Schipper, K., and K. G. Smith. 1983. "Effects of Recontracting on Shareholder Wealth: The Case of Voluntary Spin-Offs." Journal of Financial Economics 12 (4): 437-467. https://doi.org/10.1016/0304-405X(83)90043-0
  11. Wilcoxon, F. 1945. "Individual Comparisons by Ranking Methods." Biometrics Bulletin 1 (6): 80-83. https://doi.org/10.2307/3001968

Further readings

  1. Boehmer, E., J. Musumeci, and A. B. Poulsen. 1991. "Event-Study Methodology Under Conditions of Event-Induced Variance." Journal of Financial Economics 30 (2): 253-272. https://doi.org/10.1016/0304-405X(91)90032-F
  2. Campbell, C. J., and C. E. Wasley. 1993. "Measuring Security Price Performance Using Daily NASDAQ Returns." Journal of Financial Economics 33 (1): 73-92. https://doi.org/10.1016/0304-405X(93)90025-7
  3. Corrado, C. J., and T. L. Zivney. 1992. "The Specification and Power of the Sign Test in Event Study Hypothesis Tests Using Daily Stock Returns." Journal of Financial and Quantitative Analysis 27 (3): 465-478. https://doi.org/10.2307/2331331
  4. Cowan, A. R., and A. M. A. Sergeant. 1996. "Trading Frequency and Event Study Test Specification." Journal of Banking & Finance 20 (10): 1731-1757. https://doi.org/10.1016/S0378-4266(96)00021-0
  5. Kolari, J. W., and S. Pynnönen. 2010. "Event Study Testing with Cross-sectional Correlation of Abnormal Returns." Review of Financial Studies 23 (11): 3996-4025. https://doi.org/10.1093/rfs/hhq072
  6. Kolari, J. W., and S. Pynnönen. 2011. "Nonparametric Rank Tests for Event Studies." Journal of Empirical Finance 18 (5): 953-971. https://doi.org/10.1016/j.jempfin.2011.08.003
  7. MacKinlay, A. Craig. 1997. "Event Studies in Economics and Finance." Journal of Economic Literature 35 (1): 13-39. https://www.jstor.org/stable/2729691

See the full bibliography for all sources cited across the site.

Apply the tests and go deeper

Apply this to your own data, free. The ARC calculator runs every model and test on this page from a CSV upload and returns AR, CAR, CAAR, the Patell Z and BMP.

See also Expected-return models for the benchmark that produces the abnormal returns these tests evaluate.

Last reviewed: June 26, 2026. Maintained by EventStudyTools since 2014.