ARDL Bounds Test: A Step-by-Step Guide

The Johansen and Engle-Granger procedures require every series to be I(1) — yet in practice, unit root tests rarely cooperate: some variables turn out stationary in levels, I(0), others stationary in first differences, I(1). The ARDL bounds test was designed for exactly this situation, allowing cointegration to be tested with a mixed order of integration, in a single equation, and even in small samples. This guide covers the logic of the bounds test, its decision rules, the error-correction model and the diagnostic checks reviewers expect, with thesis and journal practice in mind.

When should you choose the ARDL bounds test?

Three conditions favour ARDL: (1) the variables are a mix of I(0) and I(1); (2) the sample is small — 30 to 80 annual observations is the norm in social science theses; and (3) a single long-run relationship is being modelled. A further advantage is that the model permits different lag lengths for the dependent and independent variables, and a well-chosen lag structure mitigates endogeneity and autocorrelation problems.

One critical warning: ARDL rests on the assumption that no variable is I(2). If a series is integrated of order two, the distribution of the F-statistic breaks down and the test produces meaningless results. The claim that 'ARDL does not require unit root testing' is therefore wrong: the test does not require you to know each order of integration, but it does require you to verify that none of the series is I(2). Reporting ADF, PP and KPSS tests together is consequently a compulsory first step.
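The screening step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names and p-values are hypothetical, the helper functions are not part of any package, and the p-values are assumed to come from ADF-type tests whose null hypothesis is a unit root (for KPSS, where the null is stationarity, the comparison would be reversed).

```python
def integration_order(p_level, p_first_diff, alpha=0.05):
    """Classify a series from ADF-type p-values (null hypothesis: unit root)."""
    if p_level < alpha:
        return "I(0)"            # unit root rejected in levels
    if p_first_diff < alpha:
        return "I(1)"            # unit root rejected only after differencing once
    return "I(2) or higher"      # still non-stationary in first differences


def ardl_admissible(orders):
    """The bounds test is usable only if every series is I(0) or I(1)."""
    return all(order in ("I(0)", "I(1)") for order in orders.values())


# Hypothetical p-values: (test in levels, test in first differences)
p_values = {
    "gdp": (0.41, 0.001),
    "inflation": (0.02, 0.000),
    "debt": (0.77, 0.30),
}

orders = {name: integration_order(*p) for name, p in p_values.items()}
print(orders)
print("ARDL admissible:", ardl_admissible(orders))
```

In this made-up example the third series fails the screen, so the bounds test should not be run until that variable is transformed or replaced.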

How the bounds F-test works: lower bound, upper bound, inconclusive zone

The bounds test takes the conditional error-correction form of the ARDL model and uses an F-test for the joint null that the coefficients on the lagged level variables are zero (H₀: no long-run relationship). This F-statistic does not follow a standard distribution; instead, two sets of critical values are used: a lower bound derived under the assumption that all regressors are I(0), and an upper bound derived assuming they are all I(1). As long as the true orders of integration lie between these extremes, the decision is valid — hence the name.
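For reference, one common way to write the conditional error-correction form and its null hypothesis is sketched below for an ARDL(p, q_1, ..., q_k) model with an unrestricted intercept and no trend. The symbols are notational assumptions introduced here, not taken from a particular software output: \rho is the coefficient on the lagged level of the dependent variable, \theta_j the coefficient on the lagged level of regressor j.

```latex
\Delta y_t = \alpha_0 + \rho\, y_{t-1} + \sum_{j=1}^{k} \theta_j\, x_{j,t-1}
  + \sum_{i=1}^{p-1} \gamma_i\, \Delta y_{t-i}
  + \sum_{j=1}^{k} \sum_{i=0}^{q_j-1} \delta_{j,i}\, \Delta x_{j,t-i}
  + \varepsilon_t

H_0:\ \rho = \theta_1 = \dots = \theta_k = 0
\qquad \text{(no long-run relationship)}
```

The F-statistic tests only the level terms; the differenced terms stay in the regression to absorb the short-run dynamics.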

Bounds-test decision rules
Position of the F-statistic	Decision	Next step
F < lower bound I(0)	Fail to reject H₀: no cointegration	Estimate a short-run (differences-only) model
Lower bound ≤ F ≤ upper bound	Inconclusive zone	Cross-check with the t-bounds test on the error-correction term and alternative methods
F > upper bound I(1)	Reject H₀: cointegration exists	Estimate the long-run coefficients and the error-correction model

The critical values depend on the number of regressors (k), the deterministic specification (Case III, unrestricted intercept and no trend, being the most common) and the significance level. The tables published by Pesaran, Shin and Smith (2001) are asymptotic; for studies with fewer than about 80 observations, reviewers specifically expect small-sample critical values, such as those tabulated by Narayan (2005), to be used. An F-statistic landing in the inconclusive zone is not a failure: the t-bounds test on the sign and significance of the error-correction coefficient then becomes the tie-breaker.
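The decision rule itself is mechanical once the two bounds are known. The Python sketch below shows the three-way comparison; the function name is hypothetical, and the bounds and F-statistics passed in are placeholders assumed for illustration, to be replaced with the tabulated values that match your k, deterministic case, significance level and sample size.

```python
def bounds_decision(f_stat, lower, upper):
    """Map an F-statistic to the three bounds-test outcomes."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat < lower:
        return "no cointegration"   # fail to reject H0
    if f_stat > upper:
        return "cointegration"      # reject H0
    return "inconclusive"           # between (or exactly on) the bounds


# Placeholder bounds and illustrative F-statistics
LOWER_I0, UPPER_I1 = 3.23, 4.35
for f in (2.10, 3.90, 6.45):
    print(f"F = {f:.2f}: {bounds_decision(f, LOWER_I0, UPPER_I1)}")
```

Keeping the bounds as explicit arguments makes it easy to rerun the decision with asymptotic and small-sample critical values side by side.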

[Figure: position of the F-statistic relative to the bounds; k = 3, Case III, 5% significance level; the computed F value is illustrative.]

The error-correction model and long-run coefficients

Once cointegration is established, the model delivers two outputs. The first is the coefficient on the error-correction term (ECT): it should be negative, statistically significant and typically between −1 and 0. An estimated coefficient of −0.38 (p < 0.01), for example, says that 38% of any disequilibrium is eliminated in the following period. Values between −1 and −2 indicate oscillating yet still convergent adjustment; a positive coefficient, or one below −2, calls for a re-examination of the specification.
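The arithmetic behind this interpretation can be made explicit. The sketch below assumes the simplest reading, in which the remaining disequilibrium shrinks by the factor (1 + coefficient) each period and short-run dynamics are ignored; the function names are hypothetical.

```python
import math


def remaining_gap(ect_coef, periods):
    """Share of an initial disequilibrium still present after `periods`."""
    return (1.0 + ect_coef) ** periods


def half_life(ect_coef):
    """Periods needed to close half of the gap (monotonic case only)."""
    if not -1.0 < ect_coef < 0.0:
        raise ValueError("half-life formula assumes -1 < coefficient < 0")
    return math.log(0.5) / math.log(1.0 + ect_coef)


def adjustment_type(ect_coef):
    """Classify the adjustment path implied by the coefficient."""
    if ect_coef >= 0.0:
        return "no convergence: re-examine the specification"
    if ect_coef >= -1.0:
        return "monotonic convergence"
    if ect_coef > -2.0:
        return "oscillating convergence"
    return "no convergence: re-examine the specification"


ect = -0.38
print(adjustment_type(ect))
print(f"gap left after 1 period: {remaining_gap(ect, 1):.2f}")
print(f"half-life: {half_life(ect):.2f} periods")
```

With a coefficient of −0.38, 62% of the gap is left after one period and roughly half of it is closed after about one and a half periods.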

The second output is the set of long-run coefficients: each is derived by dividing the coefficient on a regressor's lagged level by the coefficient on the dependent variable's lagged level and reversing the sign, with standard errors obtained via the delta method. Software reports this transformation automatically, but presenting the short-run (difference) and long-run coefficients in separate tables markedly improves readability. Lag lengths are selected with AIC or SC (the Schwarz criterion, also known as BIC); a maximum of 2 lags for annual data and 4–8 lags for quarterly data is the usual starting point.
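The transformation and its delta-method standard error can be reproduced by hand as a check on the software output. In the sketch below, theta denotes the coefficient on a regressor's lagged level and rho the coefficient on the dependent variable's lagged level; these names, the function and all numbers are assumptions for illustration, and the variances and covariance would come from the estimated coefficient covariance matrix.

```python
import math


def long_run_coefficient(theta, rho, var_theta, var_rho, cov_theta_rho=0.0):
    """Return (beta, se) for beta = -theta / rho using the delta method."""
    if rho == 0.0:
        raise ValueError("rho must be non-zero")
    beta = -theta / rho
    # Partial derivatives of beta with respect to theta and rho
    d_theta = -1.0 / rho
    d_rho = theta / rho ** 2
    variance = (
        d_theta ** 2 * var_theta
        + d_rho ** 2 * var_rho
        + 2.0 * d_theta * d_rho * cov_theta_rho
    )
    return beta, math.sqrt(variance)


# Hypothetical estimates: theta = 0.19 (se 0.05), rho = -0.38 (se 0.10)
beta, se = long_run_coefficient(0.19, -0.38, 0.05 ** 2, 0.10 ** 2)
print(f"long-run coefficient: {beta:.3f} (se {se:.3f})")
```

The covariance term is set to zero here only to keep the example short; in practice it is rarely zero and should be taken from the estimation output.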

Diagnostics: the validity insurance for your results

Breusch-Godfrey LM test: the residuals must be free of autocorrelation (p > 0.05). Serial correlation invalidates the bounds F-distribution; the first remedy is to increase the lag length.
CUSUM and CUSUMSQ: these test the stability of the coefficients across the sample; the plots must stay within the 5% confidence bands. A CUSUMSQ breach points to variance instability and hence a possible structural break.
Breusch-Pagan / White: heteroscedasticity checks; report robust standard errors if a problem is found.
Ramsey RESET and Jarque-Bera: functional form and residual normality — present them together, especially in small samples.
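The p-value based checks above can be collected into a small reporting helper. The sketch below is a hypothetical convenience function, not part of any package: it assumes each test's null hypothesis is "no problem", so a p-value above 0.05 counts as a pass, and the p-values are invented. CUSUM and CUSUMSQ are graphical and are not covered by it.

```python
# Remedies mentioned in the text; other failures are simply flagged.
REMEDIES = {
    "Breusch-Godfrey LM": "increase the lag length",
    "Breusch-Pagan": "report robust standard errors",
    "White": "report robust standard errors",
}


def failed_diagnostics(p_values, alpha=0.05):
    """Return the names of tests whose null of 'no problem' is rejected."""
    return [name for name, p in p_values.items() if p <= alpha]


# Hypothetical p-values from the residual diagnostics
results = {
    "Breusch-Godfrey LM": 0.03,
    "Breusch-Pagan": 0.41,
    "Ramsey RESET": 0.22,
    "Jarque-Bera": 0.64,
}

failed = failed_diagnostics(results)
for name in failed:
    action = REMEDIES.get(name, "report and investigate the specification")
    print(f"{name} fails at the 5% level: {action}")
```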

On the software side, EViews offers ARDL through its menus, complete with automatic lag selection and bounds-test tables, and remains the most popular choice; in Stata, the user-written ardl command ships with the bounds critical values built in; the R ARDL package provides a fully transparent, reproducible workflow. Whichever tool you pick, the write-up must state the specification in full — the lag structure, the deterministic case and the source of the critical values.

ARDL does not make unit root testing unnecessary; it asks you to verify not the answer, but that the question can be asked at all.

Frequently Asked Questions

How many observations does the ARDL bounds test need?

ARDL is generally regarded as better suited to small samples than the Johansen or Engle-Granger procedures and can be applied with as few as around 30 observations. Below roughly 80 observations, however, small-sample critical values should be used in place of the asymptotic tables.

What should I do if the F-statistic falls in the inconclusive zone?

First consult the t-bounds test on the error-correction term: a negative coefficient with a t-statistic beyond the upper bound strengthens the case for cointegration. Running sensitivity checks with alternative lag lengths and reporting the result explicitly as inconclusive is the most honest approach.

Can I use ARDL if one of my series is I(2)?

No; with an I(2) variable present, the bounds critical values are invalid. Options include entering the series in first differences so that it is I(1) in the model, or redefining the variable, for instance as a ratio instead of a level.

What does Celsus offer for ARDL analysis?

We run the complete workflow in EViews, Stata or R: unit root pretesting, lag selection, the bounds test, derivation of long-run coefficients, the error-correction model and the full diagnostic battery including CUSUM. Findings are delivered in thesis- and journal-ready tables with interpretation and fully reproducible code.