The variance of a financial time series is anything but constant: calm spells give way to turbulent ones, then markets settle again. The constant-variance assumption behind ordinary least squares therefore breaks down for daily equity, exchange-rate or cryptocurrency returns. The GARCH model (generalised autoregressive conditional heteroskedasticity) tackles the problem head-on by letting today's conditional variance depend on its own past, and it has become the workhorse of the volatility chapter in finance theses. This guide walks through the full workflow, from testing for ARCH effects to asymmetric extensions and forecasting.
Stylised facts of financial returns
What makes the GARCH family worth estimating is a set of regularities observed in virtually every financial return series. Documenting them descriptively in your own data — a return plot, a histogram, the autocorrelation function of squared returns — strengthens the empirical chapter before any model is fitted:
- Volatility clustering: Large changes tend to be followed by large changes and small changes by small ones; turbulence arrives in waves rather than at random.
- Fat tails: The kurtosis of returns clearly exceeds the Gaussian value of 3; extreme moves occur far more often than the normal distribution predicts.
- Autocorrelation in squared returns: Returns themselves are nearly unpredictable, yet their squares — a proxy for volatility — show strong, persistent autocorrelation.
- The leverage effect: Negative shocks (price falls) raise volatility more than positive shocks of the same size, which motivates the asymmetric models below.
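The first three regularities can be checked with a few lines of descriptive code. The sketch below is illustrative only: it simulates a GARCH(1,1) series as a stand-in for real returns (the parameter values and the helper names `kurtosis`, `autocorr` and `simulate_garch` are assumptions, not part of any library), then computes the kurtosis and the lag-1 autocorrelation of returns and of squared returns.

```python
import math
import random

def kurtosis(x):
    """Sample kurtosis (not excess kurtosis); the Gaussian benchmark is 3."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def simulate_garch(n, omega, alpha, beta, seed=0):
    """Simulate zero-mean GARCH(1,1) returns with Gaussian innovations."""
    rng = random.Random(seed)
    var = omega / (1 - alpha - beta)      # start at the unconditional variance
    out = []
    for _ in range(n):
        eps = math.sqrt(var) * rng.gauss(0, 1)
        out.append(eps)
        var = omega + alpha * eps ** 2 + beta * var
    return out

# Synthetic stand-in for a daily return series (replace with your own data).
returns = simulate_garch(5000, omega=0.05, alpha=0.10, beta=0.85)
squared = [r ** 2 for r in returns]

print("kurtosis of returns:       ", round(kurtosis(returns), 2))
print("lag-1 ACF, returns:        ", round(autocorr(returns, 1), 3))
print("lag-1 ACF, squared returns:", round(autocorr(squared, 1), 3))
```

On real data the same three numbers, reported next to the plots, give a compact summary of fat tails and volatility clustering.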
Is there an ARCH effect? The ARCH-LM test
Before estimating anything from the GARCH family, you must show that conditional heteroskedasticity is actually present. The routine is simple: fit an appropriate mean equation first (a constant or a low-order ARMA suffices for most daily returns), then regress the squared residuals on a constant and their own q lags. Under the null hypothesis of no ARCH effects, the statistic nR² from this auxiliary regression is asymptotically χ²(q) distributed, so a p-value below 0.05 rejects the null and justifies the volatility model. A Ljung-Box test on the squared residuals should tell the same story. Fitting a GARCH model to a series with no ARCH effects is the first methodological objection a referee will raise.
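The auxiliary regression behind the test is short enough to write out by hand. The following is a minimal sketch, assuming NumPy is available and a constant-only mean equation; the function name `arch_lm_statistic` and the synthetic ARCH(1) series are illustrative assumptions. The statistic is compared with 11.07, the 5% critical value of a χ² distribution with 5 degrees of freedom.

```python
import numpy as np

def arch_lm_statistic(residuals, q):
    """Return n * R^2 from regressing squared residuals on a constant and q own lags."""
    e2 = np.asarray(residuals, dtype=float) ** 2
    y = e2[q:]                                               # e_t^2
    lags = [e2[q - k:len(e2) - k] for k in range(1, q + 1)]  # e_{t-k}^2, k = 1..q
    X = np.column_stack([np.ones(len(y))] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return len(y) * r2

# Synthetic ARCH(1) series as a stand-in for real returns (assumed parameters).
rng = np.random.default_rng(42)
n_obs = 2000
shocks = np.empty(n_obs)
variance = 1.0
for t in range(n_obs):
    shocks[t] = np.sqrt(variance) * rng.standard_normal()
    variance = 0.5 + 0.5 * shocks[t] ** 2

residuals = shocks - shocks.mean()    # residuals of a constant-only mean equation
lm_stat = arch_lm_statistic(residuals, q=5)
CHI2_5_CRIT_5PCT = 11.07              # 5% critical value, chi-square with 5 df

print("ARCH-LM statistic:", round(float(lm_stat), 1))
print("reject 'no ARCH effects' at 5%:", lm_stat > CHI2_5_CRIT_5PCT)
```

If SciPy is installed, `scipy.stats.chi2.sf(lm_stat, 5)` gives the exact p-value instead of a comparison with a tabulated critical value.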
From ARCH(q) to GARCH(1,1): building the GARCH model
The original ARCH(q) specification explains conditional variance using past squared shocks alone, but capturing the high persistence of financial volatility that way requires many lags and wastes parameters. GARCH(1,1) solves this with a single lagged variance term: σ²ₜ = ω + αε²ₜ₋₁ + βσ²ₜ₋₁, with ω > 0 and α, β ≥ 0 so that the variance stays positive. Here α measures the immediate reaction to news, while β measures the memory of volatility. In practice GARCH(1,1) is adequate for the vast majority of daily return series; higher orders rarely deliver a meaningful improvement.
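To see the recursion at work, the sketch below filters a conditional variance path from a handful of residuals with fixed parameters. It is an illustration, not an estimator: the parameter values and residuals are made up, the function name `garch11_variance` is hypothetical, and starting the recursion at the sample variance is one common initialisation choice among several.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances sigma2[t] implied by the GARCH(1,1) recursion."""
    if not (omega > 0 and alpha >= 0 and beta >= 0):
        raise ValueError("need omega > 0 and alpha, beta >= 0")
    # Assumption: initialise with the sample variance of the residuals.
    sigma2 = [sum(e ** 2 for e in residuals) / len(residuals)]
    for e in residuals[:-1]:
        # sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}
        sigma2.append(omega + alpha * e ** 2 + beta * sigma2[-1])
    return sigma2

# Made-up residuals with one large shock at index 2.
residuals = [0.5, -0.3, 2.5, -0.2, 0.1, 0.4]
path = garch11_variance(residuals, omega=0.05, alpha=0.10, beta=0.85)

for t, (e, s2) in enumerate(zip(residuals, path)):
    print(f"t={t}  residual={e:+.2f}  conditional variance={s2:.4f}")
```

The variance jumps one period after the large shock (the α term) and then decays only gradually (the β term), which is the clustering pattern the model is built to reproduce.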
Persistence is measured by the sum α + β and sits at the heart of interpretation. The closer the sum is to 1, the more slowly a shock to volatility dies out; the half-life equals ln(0.5)/ln(α+β). With α + β = 0.98, for instance, it takes roughly 34 trading days for a shock's effect to halve. If the sum reaches 1 (the IGARCH case) or exceeds it, the unconditional variance is no longer finite. This is often a symptom of structural breaks or sample problems that must be addressed before any interpretation.
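The half-life formula is easy to tabulate for a few persistence levels. This small sketch uses a hypothetical helper `half_life`; the α and β pairs are illustrative values, chosen so that the third one reproduces the α + β = 0.98 example.

```python
import math

def half_life(alpha, beta):
    """Periods needed for a volatility shock to decay to half its size."""
    persistence = alpha + beta
    if not 0 < persistence < 1:
        raise ValueError("half-life is only defined for 0 < alpha + beta < 1")
    return math.log(0.5) / math.log(persistence)

# Illustrative parameter pairs with increasing persistence.
for alpha, beta in [(0.10, 0.80), (0.10, 0.85), (0.08, 0.90), (0.05, 0.94)]:
    print(f"alpha+beta={alpha + beta:.2f}  half-life={half_life(alpha, beta):.1f} periods")

hl_098 = half_life(0.08, 0.90)   # about 34.3 trading days
```

The half-life grows very quickly as the sum approaches 1, which is why a persistence estimate of 0.99 and one of 0.95 describe quite different volatility dynamics.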
Asymmetry: the leverage effect with EGARCH and GJR-GARCH
Standard GARCH responds only to the size of a shock, never its sign. Yet in equity markets, falls raise volatility more than rises of the same magnitude. GJR-GARCH (TGARCH) captures this asymmetry by adding an indicator term to the variance equation (γε²ₜ₋₁·I[εₜ₋₁<0]): a positive, significant γ is direct evidence of a leverage effect. EGARCH instead models the logarithm of the variance, which separates sign and magnitude effects without imposing positivity constraints on the parameters. A statistically significant asymmetry term, reported in the thesis, is the main justification for moving beyond the symmetric model.
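The effect of the indicator term is easiest to see by feeding the GJR variance equation two shocks of equal size and opposite sign. The sketch below does exactly that; the function name `gjr_next_variance` and all parameter values are illustrative assumptions rather than estimates.

```python
def gjr_next_variance(shock, prev_variance, omega, alpha, gamma, beta):
    """One step of the GJR-GARCH(1,1) variance equation."""
    indicator = 1.0 if shock < 0 else 0.0    # I[eps_{t-1} < 0]
    return omega + (alpha + gamma * indicator) * shock ** 2 + beta * prev_variance

# Hypothetical parameters with a positive asymmetry coefficient gamma.
params = dict(omega=0.05, alpha=0.03, gamma=0.10, beta=0.88)

after_fall = gjr_next_variance(-2.0, prev_variance=1.0, **params)
after_rise = gjr_next_variance(+2.0, prev_variance=1.0, **params)

print("next variance after a shock of -2:", round(after_fall, 4))
print("next variance after a shock of +2:", round(after_rise, 4))
print("extra impact of the negative shock:", round(after_fall - after_rise, 4))
```

The gap between the two results equals γ times the squared shock, so with γ = 0 the model collapses back to the symmetric GARCH(1,1) response.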
| Model | What it captures | When to use it |
|---|---|---|
| ARCH(q) | Past squared shocks | A pedagogical starting point; needs many lags to capture persistent volatility, so rarely recommended as the final model |
| GARCH(1,1) | Clustering + persistence (with a single variance lag) | The default starting point; adequate for most daily return series |
| GJR-GARCH (TGARCH) | Asymmetry: the extra impact of negative shocks (γ) | When the leverage effect is to be tested in equity or index returns |
| EGARCH | Asymmetry via log variance; no positivity constraints | When asymmetry plus a flexible parameter space is wanted; usually reported alongside GJR |
| GARCH-M | A risk premium in the mean equation | For risk–return trade-off hypotheses, where returns depend on their own volatility |
| IGARCH | Unit persistence (α+β=1); with ω=0 it reduces to the EWMA recursion | For extremely persistent series; RiskMetrics-style risk measurement |
Distribution and model selection: Student-t, AIC and BIC
The choice of error distribution matters less for the point estimates than for standard errors and tail-risk measures. For daily returns the normal distribution is usually inadequate; a Student-t or skewed Student-t typically lifts the log-likelihood substantially (an estimated degrees-of-freedom parameter in the 4–8 range confirms the fat tails). Candidate models are compared using the log-likelihood, AIC and BIC, with BIC penalising complexity more harshly in favour of parsimony. Whatever wins must then pass the diagnostics: the squared standardised residuals should show no remaining ARCH effects (re-run the ARCH-LM test), and a sign bias test should confirm that the asymmetry has been fully absorbed.
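The information criteria are simple functions of the maximised log-likelihood, the number of estimated parameters k and the sample size n. The sketch below compares three candidates; the log-likelihood values, parameter counts and sample size are invented purely for illustration and do not come from any real estimation.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * loglik

n_obs = 1500
# Hypothetical results: (maximised log-likelihood, number of parameters).
candidates = {
    "GARCH(1,1)-normal": (-2105.4, 4),
    "GARCH(1,1)-t":      (-2060.2, 5),
    "GJR-GARCH-t":       (-2058.0, 6),
}

for name, (ll, k) in candidates.items():
    print(f"{name:18s} logL={ll:9.1f}  AIC={aic(ll, k):8.1f}  BIC={bic(ll, k, n_obs):8.1f}")

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print("preferred by AIC:", best_aic)
print("preferred by BIC:", best_bic)
```

In this made-up case the two criteria disagree: the small likelihood gain from the asymmetry term is enough for AIC but not for the heavier BIC penalty, which is the situation where the diagnostics and the significance of γ should decide.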
Forecasting volatility and choosing software
For a thesis, the most valuable output of a GARCH model is the forward-looking volatility forecast. The one-step-ahead forecast comes straight from the variance equation; multi-step forecasts converge towards the unconditional variance ω/(1−α−β) at a speed governed by the persistence. Forecast accuracy is evaluated against realised-volatility proxies (squared returns, or realised variance where available) using MSE and QLIKE loss functions. On the software side, three practical options stand out:
- R — rugarch: The widest range of specifications (ugarchspec/ugarchfit), rich distribution options and fully reproducible code; ideal for a thesis appendix.
- EViews: Menu-driven estimation with built-in diagnostics; the most common interface in econometrics teaching.
- Stata: The arch command family covers GARCH, GJR and EGARCH; convenient for researchers already working in Stata pipelines.
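Whichever package produces the estimates, the forecast recursion described at the start of this section can be reproduced by hand as a check. The following sketch assumes already-estimated GARCH(1,1) parameters (the values are illustrative) and uses hypothetical helper names. The QLIKE variant shown is log(f) + proxy/f, one of several equivalent forms that differ only by constants.

```python
import math

def garch11_forecast(last_shock, last_variance, omega, alpha, beta, horizon):
    """Variance forecasts for steps 1..horizon from a GARCH(1,1) model."""
    persistence = alpha + beta
    if not persistence < 1:
        raise ValueError("forecasts only converge when alpha + beta < 1")
    unconditional = omega / (1 - persistence)
    one_step = omega + alpha * last_shock ** 2 + beta * last_variance
    # The h-step forecast decays geometrically towards the unconditional variance.
    return [unconditional + persistence ** (h - 1) * (one_step - unconditional)
            for h in range(1, horizon + 1)]

def mse_loss(proxy, forecast):
    """Mean squared error between a variance proxy and the forecasts."""
    return sum((p - f) ** 2 for p, f in zip(proxy, forecast)) / len(proxy)

def qlike_loss(proxy, forecast):
    """QLIKE loss in the form log(f) + proxy / f, averaged over observations."""
    return sum(math.log(f) + p / f for p, f in zip(proxy, forecast)) / len(proxy)

# Illustrative parameters; the unconditional variance is 0.05 / 0.05 = 1.
forecasts = garch11_forecast(last_shock=3.0, last_variance=1.5,
                             omega=0.05, alpha=0.10, beta=0.85, horizon=250)

for h in (1, 5, 20, 60, 250):
    print(f"h={h:3d}  forecast variance={forecasts[h - 1]:.4f}")
```

After a large shock the forecast starts well above the long-run level and drifts back at the rate α + β. The two loss functions take a list of realised-variance proxies and a list of forecasts of the same length.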
GARCH does not promise to predict returns; it shows that risk itself has a predictable dynamic.
Frequently Asked Questions
How many observations do I need for a GARCH model?
As a rule of thumb, reliable maximum-likelihood estimation of variance dynamics calls for at least 500 daily observations, and preferably 1,000 or more. Monthly series usually provide too few observations; finding a higher-frequency series should be the first remedy.
What should I do if α+β exceeds 1?
Check for structural breaks and reconsider the sample period first; spurious persistence is common in samples spanning crisis episodes. Adding break dummies, running subsample analyses or moving to an IGARCH framework are the standard remedies.
Should I fit a higher-order model instead of GARCH(1,1)?
Rarely. The applied literature consistently finds that GARCH(1,1) matches or beats higher-order specifications for most series. Let the diagnostics decide: if no ARCH effects remain in the standardised residuals, there is no case for adding lags.
What GARCH services does Celsus provide?
We handle data preparation and return transformations, ARCH-LM pre-testing, estimation in R (rugarch), EViews or Stata, comparison of asymmetric extensions and error distributions, full diagnostics, volatility forecasting and thesis-ready reporting. Every analysis is delivered with reproducible code.