Time Series Notes (6) - Parameter estimation
Residual Analysis
Definition of residuals
Consider an $\text{AR}(1)$ model with a constant term: $Z_t=\phi Z_{t-1}+\theta_0+a_t$. Once $\phi$ and $\theta_0$ have been estimated, the residuals are defined as
\[\hat a_t=Z_t-\hat\phi Z_{t-1}-\hat\theta_0\]If the model is correctly specified and the parameter estimates are reasonably close to the true values, the residuals should behave approximately like white noise: $i.i.d.$ with zero mean and a common variance. Deviations from these properties indicate an inadequate fit, in which case we search for a more appropriate model.
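As a minimal sketch of this definition, the snippet below simulates an $\text{AR}(1)$ series, estimates $\phi$ and $\theta_0$ by regressing $Z_t$ on $Z_{t-1}$, and forms the residuals. The function names, the simulated parameters ($\phi=0.6$, $\theta_0=1$) and the choice of conditional least squares are assumptions made for illustration only; any other estimator could be plugged in.

```python
import random

def fit_ar1_least_squares(z):
    """Conditional least squares: regress Z_t on Z_{t-1} with an intercept."""
    x, y = z[:-1], z[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    phi_hat = sxy / sxx
    theta0_hat = my - phi_hat * mx
    return phi_hat, theta0_hat

def ar1_residuals(z, phi_hat, theta0_hat):
    """a_hat_t = Z_t - phi_hat * Z_{t-1} - theta0_hat, for every t with a predecessor."""
    return [z[t] - phi_hat * z[t - 1] - theta0_hat for t in range(1, len(z))]

# Simulated example (hypothetical parameter values).
rng = random.Random(0)
phi, theta0 = 0.6, 1.0
z = [theta0 / (1 - phi)]          # start at the process mean
for _ in range(499):
    z.append(phi * z[-1] + theta0 + rng.gauss(0, 1))

phi_hat, theta0_hat = fit_ar1_least_squares(z)
resid = ar1_residuals(z, phi_hat, theta0_hat)
mean_resid = sum(resid) / len(resid)
```

With an intercept in the regression, the residual mean is zero by construction, so the interesting checks are on the remaining white-noise properties (constant variance, no autocorrelation).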
Calculation of residuals
Consider an $\text{ARMA}(p,q)$ model
\[Z_t-\hat\mu=\hat\phi_1(Z_{t-1}-\hat\mu)+\cdots+\hat\phi_p(Z_{t-p}-\hat\mu)+\hat a_t-\hat\theta_1\hat a_{t-1}-\cdots-\hat\theta_q\hat a_{t-q}\]The residuals can be calculated as follows,
\[\hat a_t=\sum_{j=0}^\infty\hat\pi_j(Z_{t-j}-\hat\mu)\]where the $\hat\pi_j$ are functions of $\hat\phi_1,\dots,\hat\phi_p$ and $\hat\theta_1,\dots,\hat\theta_q$, and the initial values are set to $Z_s=\hat\mu$ for $s\le0$.
Invertibility is necessary to calculate the residuals, because the $\pi$-weight representation converges only for an invertible model.
Standardized residuals: $\hat a_t/s$, where $s^2$ is the sample variance of the residual sequence.
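In practice the $\pi$-weights need not be computed explicitly: rearranging the fitted model gives a recursion for $\hat a_t$ that is equivalent to the truncated sum once $Z_s=\hat\mu$ and $\hat a_s=0$ are imposed for $s\le0$. The sketch below assumes this recursive form; `arma_residuals` and `standardize` are hypothetical helper names, and the parameter values are made up for the demonstration.

```python
import statistics

def arma_residuals(z, mu, phi, theta):
    """Residuals of Z_t - mu = sum_i phi_i (Z_{t-i} - mu) + a_t - sum_j theta_j a_{t-j}.

    Pre-sample values are fixed at Z_s = mu and a_s = 0 for s <= 0,
    so their contributions to the sums are simply skipped.
    """
    x = [v - mu for v in z]
    a = []
    for t in range(len(x)):
        ar_part = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * a[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        a.append(x[t] - ar_part + ma_part)
    return a

def standardize(resid):
    """Divide each residual by s, the sample standard deviation of the residuals."""
    s = statistics.stdev(resid)
    return [r / s for r in resid]

# Build an ARMA(1,1) series from known shocks, then recover the shocks.
mu, phi, theta = 3.0, [0.5], [0.4]
shocks = [1.0, -0.5, 0.25, 2.0, -1.0]
z, prev_x, prev_e = [], 0.0, 0.0
for e in shocks:
    x_t = phi[0] * prev_x + e - theta[0] * prev_e
    z.append(x_t + mu)
    prev_x, prev_e = x_t, e

recovered = arma_residuals(z, mu, phi, theta)
std_resid = standardize(recovered)
```

Because each $\hat a_t$ feeds back through the $\hat\theta_j$, a non-invertible moving-average part would let early errors grow instead of dying out, which is the practical face of the invertibility requirement.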
Ways of residual analysis
Method | Target |
---|---|
(1) the time plot of the residual sequence; | to check whether or not there still exist some patterns not yet explained by the fitted model |
(2) the histogram; (3) the quantile-quantile plot; (4) some formal normality test; | to check the possible normality of the residuals |
(5) the correlogram; (6) Ljung-Box test; | to check for the autocorrelations |
- If the time plot of the residuals looks like that of white noise, no obvious pattern has been left unexplained by the fitted model, although subtle departures are hard to detect by eye.
- The histogram and the Q-Q plot give a direct visual check of normality, while a formal test such as the Shapiro-Wilk test gives a numerical result.
- The upper and lower bounds of the correlogram, i.e. the sample autocorrelation function (ACF) of the residuals, can be calculated from Bartlett's approximation.
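For the correlogram, a rough sketch of the computation is given here. It assumes the usual white-noise special case of Bartlett's approximation, in which each sample autocorrelation has variance of about $1/n$, so the approximate 95% bounds are $\pm1.96/\sqrt n$. The helper names and the simulated input are illustrative, not part of any particular library.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return [
        sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def white_noise_bounds(n, z_crit=1.96):
    """Approximate 95% bounds under white noise: +/- z_crit / sqrt(n)."""
    b = z_crit / math.sqrt(n)
    return -b, b

# Stand-in for a residual sequence (simulated white noise).
rng = random.Random(42)
resid = [rng.gauss(0, 1) for _ in range(200)]

acf = sample_acf(resid, max_lag=10)
lower, upper = white_noise_bounds(len(resid))
# Lags whose sample autocorrelation falls outside the bounds.
flagged = [k for k, r in enumerate(acf, start=1) if not lower <= r <= upper]
```

Even for genuine white noise, about one lag in twenty is expected to fall outside 95% bounds, so an isolated flagged lag is weak evidence on its own.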
Ljung-Box test
For a fitted $\text{ARMA}(p,q)$ model, the Ljung-Box test is a modification of the Box-Pierce test. The Ljung-Box test statistic is defined as
\[Q_*=n(n+2)\left(\frac{\hat\rho_1^2}{n-1}+\frac{\hat\rho_2^2}{n-2}+\cdots+\frac{\hat\rho_K^2}{n-K}\right)\sim\chi_{K-m}^2,\]where $K$ is a predetermined integer, $\hat\rho_j$ is the sample ACF of the residuals at lag $j$, and $m=p+q$. The $\chi^2$ distribution holds approximately under the null hypothesis that the model is adequate. The sample size $n$ is the number of observations to which the ARMA model is fitted.
The degrees of freedom are the same whether or not the model includes an intercept.
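The statistic is simple enough to compute by hand. The following sketch evaluates $Q_*$ and the degrees of freedom $K-m$ directly from the formula; `ljung_box` is a hypothetical helper name and the tiny input sequence is chosen only so that the arithmetic can be checked by hand.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return [
        sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def ljung_box(resid, K, p, q):
    """Return (Q, df) with Q = n(n+2) * sum_k rho_k^2 / (n-k) and df = K - (p+q)."""
    n = len(resid)
    if K <= p + q:
        raise ValueError("K must exceed p + q so that the degrees of freedom are positive")
    rho = sample_acf(resid, K)
    Q = n * (n + 2) * sum(rho[k - 1] ** 2 / (n - k) for k in range(1, K + 1))
    return Q, K - (p + q)

# Hand-checkable example: r_1 = 0.4, so Q = 5 * 7 * 0.16 / 4 = 1.4.
Q, df = ljung_box([1, 2, 3, 4, 5], K=1, p=0, q=0)
```

The model is rejected when $Q_*$ exceeds the chosen upper quantile of $\chi^2_{K-m}$; if SciPy is available, `scipy.stats.chi2.sf(Q, df)` is one way to obtain the corresponding p-value.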
Analysis of over-parameterized models
A common way to confirm a selected model is to overfit deliberately. For example, to check a candidate $\text{AR}(2)$ model, we also fit an $\text{AR}(3)$ model. The original $\text{AR}(2)$ model is confirmed if:
- the estimate of the additional parameter, $\phi_3$, is not significantly different from zero, and
- the estimates of the parameters in common, $\phi_1$ and $\phi_2$ (and $\theta_0$, if present), do not change significantly from their original estimates.
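The two conditions can be illustrated on simulated data. The sketch below is only one possible setup: it uses Yule-Walker estimates (computed with the Levinson-Durbin recursion) as a stand-in for whichever estimator is actually used, and it judges $\hat\phi_3$ against the large-sample standard error $1/\sqrt n$ that applies to the lag-3 partial autocorrelation when the true model is $\text{AR}(2)$. All names and parameter values are assumptions for the demonstration.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return [
        sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def levinson_durbin(rho, order):
    """Yule-Walker AR(order) coefficients from autocorrelations rho[0..order], rho[0] = 1."""
    phi = []
    for k in range(1, order + 1):
        num = rho[k] - sum(phi[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi[j] - phi_kk * phi[k - 2 - j] for j in range(k - 1)] + [phi_kk]
    return phi

# Simulate an AR(2) series (hypothetical coefficients), discarding a burn-in.
rng = random.Random(1)
true_phi = (0.5, 0.3)
n = 2000
z = [0.0, 0.0]
for _ in range(n + 200):
    z.append(true_phi[0] * z[-1] + true_phi[1] * z[-2] + rng.gauss(0, 1))
z = z[-n:]

rho = [1.0] + sample_acf(z, 3)
phi_ar2 = levinson_durbin(rho, 2)   # candidate model
phi_ar3 = levinson_durbin(rho, 3)   # deliberately overfitted model

se_phi3 = 1 / math.sqrt(n)          # approximate standard error of the extra coefficient
extra_term_negligible = abs(phi_ar3[2]) < 2 * se_phi3
shifts = [abs(a - b) for a, b in zip(phi_ar3[:2], phi_ar2)]
```

If `extra_term_negligible` is true and the entries of `shifts` are small relative to the standard errors of $\hat\phi_1$ and $\hat\phi_2$, the $\text{AR}(2)$ choice is supported.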
Some guidelines:
- Check simple models before trying complicated ones.
- When overfitting, do not increase the orders of the $\text{AR}$ and $\text{MA}$ parts of the model simultaneously.
- Extend the model in directions suggested by an analysis of the residuals. If after fitting an $\text{MA}(1)$ model, substantial correlation remains at lag 2 in the residuals, try an $\text{MA}(2)$, not an $\text{ARMA}(1,1)$.