 MAT 3379 INTRODUCTION TO TIME SERIES (WINTER 2021)

## R Examples Part 4 (Estimation and Prediction using MLE and Yule-Walker procedures)

• We use the dataset MyTimeSeries-AR.txt. I loaded it into R as "MyTimeSeries".
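The data file is not distributed with R. If you want to follow along without it, a comparable series can be simulated; the coefficient 0.62, the seed, and the length below are assumptions, chosen only to roughly match the fit reported later for the real data:

```r
# Hypothetical stand-in for MyTimeSeries-AR.txt: a simulated AR(1) series.
# The coefficient 0.62 is an assumption (close to the estimate reported
# below for the real data); seed and length are arbitrary.
set.seed(1)
MyTimeSeries <- arima.sim(model = list(ar = 0.62), n = 200)
length(MyTimeSeries)   # 200 observations
```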
plot.ts(MyTimeSeries);
par(mfrow=c(1,2));
acf(MyTimeSeries); pacf(MyTimeSeries);

MyTimeSeries is stationary; looks like AR(1).
par(mfrow=c(1,2))
qqnorm(MyTimeSeries); hist(MyTimeSeries);

Time series looks normal. I will perform estimation using two methods: Yule-Walker and MLE. We do not need normality for the former, but we do need normality for the latter. I will use ar command since there is no MA part.
fit.ar.yw<-ar(MyTimeSeries,method="yule-walker");
fit.ar.yw;
fit.ar.mle<-ar(MyTimeSeries,method="mle");
fit.ar.mle;

We obtain the order selected by AIC and the parameter estimates from Yule-Walker and MLE. In both cases the selected model is AR(1), but the estimated parameters are slightly different. Recall from lectures that the estimates of the autoregressive parameter phi should be the same, regardless of which method is used. The difference comes from the fact that MLE is implemented in R by a complicated optimization algorithm, and hence there are numerical errors. Recall also that the estimates of sigma_Z^2 stemming from Yule-Walker and MLE are different.
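For AR(1), the Yule-Walker equations reduce to phi-hat = rho-hat(1), the lag-1 sample autocorrelation. A quick sketch on simulated data (the coefficient 0.62 and the seed are assumptions) confirms that ar with method="yule-walker" does exactly this:

```r
# Yule-Walker for AR(1) by hand: phi-hat equals the lag-1 sample ACF.
# Simulated data; the coefficient 0.62 and the seed are assumptions.
set.seed(1)
x <- arima.sim(model = list(ar = 0.62), n = 500)
rho1 <- acf(x, plot = FALSE)$acf[2]   # lag-1 sample autocorrelation
fit <- ar(x, order.max = 1, aic = FALSE, method = "yule-walker")
c(by.hand = rho1, from.ar = fit$ar)   # the two numbers agree
```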

> fit.ar.yw$order; fit.ar.mle$order	# order selected
 1
 1

> fit.ar.yw$ar; fit.ar.mle$ar   # estimated AR coefficient
 0.6201284
 0.6196839


Do we have a good fit? Look at residuals:
par(mfrow=c(3,2));
plot.ts(fit.ar.yw$resid); plot.ts(fit.ar.mle$resid);
acf(na.omit(fit.ar.yw$resid)); acf(na.omit(fit.ar.mle$resid));
pacf(na.omit(fit.ar.yw$resid)); pacf(na.omit(fit.ar.mle$resid));

Residuals look like iid noise in both cases. You can check that residuals are calculated properly by typing:
MyTimeSeries.centered=MyTimeSeries-mean(MyTimeSeries);
phi=0.6201284;
n=length(MyTimeSeries);
MyTimeSeries.centered[2:n]-phi*MyTimeSeries.centered[1:(n-1)]

This should give you the entries of fit.ar.yw$resid from the second one on. Note that the first entry of the residuals is NA, since it corresponds to X_1 - phi X_0 and we do not have X_0. We do prediction:
predict(fit.ar.yw);
predict(fit.ar.mle)

The predicted values of the next observation are -0.4743175 and -0.4760405, respectively, for the Yule-Walker and MLE methods. The different values stem from the fact that we got slightly different estimates of phi. Note that the Yule-Walker prediction agrees with the theoretical formula:
mean=mean(MyTimeSeries);
n=length(MyTimeSeries)
phi=0.6201284;
prediction=mean*(1-phi)+phi*MyTimeSeries[n];
prediction

Note that the predict(fit.ar.yw) output agrees with what is displayed above. OK, so which model should we choose? I am choosing the one obtained via MLE. There is no particular reason for that, since both models are good.
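The agreement between predict() and the hand formula can also be checked programmatically; here is a sketch on simulated data (the coefficient 0.62 and the seed are assumptions):

```r
# One-step prediction: predict() vs the explicit AR(1) formula
# mean*(1-phi) + phi*X_n. Simulated data; phi = 0.62 is an assumption.
set.seed(1)
x <- arima.sim(model = list(ar = 0.62), n = 200)
fit <- ar(x, order.max = 1, aic = FALSE, method = "yule-walker")
p.auto <- as.numeric(predict(fit)$pred)
p.hand <- fit$x.mean * (1 - fit$ar) + fit$ar * x[length(x)]
c(p.auto, p.hand)   # the two values agree
```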
I plot the prediction for the next 30 steps, with confidence bands obtained by calculating se, the standard error of prediction. You can see that these prediction bounds are quite wide. Message: do not do long-term prediction.
predict.mle<-predict(fit.ar.mle,n.ahead=30)

par(mfrow=c(1,1))
y.max=max(predict.mle$pred+predict.mle$se);
y.min=min(predict.mle$pred-predict.mle$se);
plot.ts(predict.mle$pred,ylim=c(y.min,y.max)); lines(predict.mle$pred-predict.mle$se,col="red"); lines(predict.mle$pred+predict.mle$se,col="red");

We look closer at AIC. In both estimation methods, Yule-Walker and MLE, the order of the AR model is selected according to AIC, with the maximal order controlled by order.max. Type
fit.ar.mle$aic

to see that order 1 gives the lowest AIC value. I mentioned in class that sometimes you look at the minimal, not the maximal, AIC, depending on the software implementation. Here, AIC is shifted so that the minimal one is zero. It will be different in each software package.
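A small sketch of this shift on simulated data (coefficient and seed are assumptions): the aic component of an ar() fit always has minimum exactly zero, attained at the selected order.

```r
# The aic component of an ar() fit is shifted so its minimum is zero;
# the selected order is the one with value 0. Simulated AR(1) data
# (seed and coefficient are assumptions).
set.seed(1)
x <- arima.sim(model = list(ar = 0.62), n = 200)
fit <- ar(x, method = "mle")
fit$aic        # named vector of shifted AIC values, by order
min(fit$aic)   # 0, at the selected order
```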
You can also use
fit.arma<-arima(MyTimeSeries, order=c(1,0,0))

The method automatically uses MLE to estimate parameters, but you have to select the model first!!! You should get very similar answers using this command and ar(MyTimeSeries, method="mle"):
fit.ar.mle;
fit.arma;
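On simulated data (the coefficient 0.62 and the seed are assumptions), the two routes give nearly identical estimates of phi:

```r
# ar(..., method="mle") and arima(..., order=c(1,0,0)) both fit AR(1) by
# maximum likelihood; small numerical differences are expected.
set.seed(1)
x <- arima.sim(model = list(ar = 0.62), n = 200)
phi.ar    <- ar(x, order.max = 1, aic = FALSE, method = "mle")$ar
phi.arima <- arima(x, order = c(1, 0, 0))$coef["ar1"]
c(phi.ar, phi.arima)   # very close, not exactly equal
```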

• Data Example: We use the dataset LakeHuron (included in R).
 MyTimeSeries=LakeHuron

plot.ts(MyTimeSeries);
par(mfrow=c(1,2));
acf(MyTimeSeries); pacf(MyTimeSeries);

MyTimeSeries is stationary; looks like AR(2) or ARMA(1,1)
par(mfrow=c(1,2))
qqnorm(MyTimeSeries); hist(MyTimeSeries);

fit.ar.yw<-ar(MyTimeSeries,method="yule-walker");
fit.ar.yw;
fit.ar.mle<-ar(MyTimeSeries,method="mle");
fit.ar.mle;

In both cases the selected order is 2; the estimated parameters are slightly different (again, theoretically they should be the same). Do we have a good fit? Look at residuals:
par(mfrow=c(3,2));
plot.ts(fit.ar.yw$resid); plot.ts(fit.ar.mle$resid);
n=length(fit.ar.yw$resid); m=length(fit.ar.mle$resid);
acf(fit.ar.yw$resid[3:n]); acf(fit.ar.mle$resid[3:m]);
pacf(fit.ar.yw$resid[3:n]); pacf(fit.ar.mle$resid[3:m]);

Residuals look good; there is no dependence left. Both AR(2) fits are acceptable. Which model to choose? Do "Quality of prediction" as in R-3.html.
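A minimal sketch of such a prediction-quality comparison, assuming we hold out the last 10 years of LakeHuron (the split point is an arbitrary choice, not from the notes):

```r
# Out-of-sample check: fit on 1875-1962, predict 1963-1972, compare the
# mean squared prediction errors of the candidate models.
train <- window(LakeHuron, end = 1962)
test  <- window(LakeHuron, start = 1963)
mse <- function(order) {
  fit  <- arima(train, order = order)
  pred <- predict(fit, n.ahead = length(test))$pred
  mean((test - pred)^2)
}
c(AR2 = mse(c(2, 0, 0)), ARMA11 = mse(c(1, 0, 1)))
```

The model with the smaller out-of-sample MSE would be preferred by this criterion.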
We also want to investigate the possibility of an ARMA(1,1) model.
fit.arma.1<-arima(MyTimeSeries, order=c(2,0,0));
fit.arma.2<-arima(MyTimeSeries, order=c(1,0,1));

Investigate residuals (note: fit.arma.1 should produce the same output as fit.ar.mle above).
par(mfrow=c(3,2));
plot.ts(fit.arma.1$residuals); plot.ts(fit.arma.2$residuals);
n=length(fit.arma.1$residuals); m=length(fit.arma.2$residuals);
acf(fit.arma.1$residuals[3:n]); acf(fit.arma.2$residuals[3:m]);
pacf(fit.arma.1$residuals[3:n]); pacf(fit.arma.2$residuals[3:m]);

Both models, AR(2) and ARMA(1,1), are acceptable !!! Choose the one with a smaller AIC - type fit.arma.1$aic and fit.arma.2$aic. Prediction (first recompute predict.mle for the model you chose; I take fit.arma.2 here):
predict.mle<-predict(fit.arma.2,n.ahead=30)

par(mfrow=c(1,1))
y.max=max(predict.mle$pred+predict.mle$se);
y.min=min(predict.mle$pred-predict.mle$se);
plot.ts(predict.mle$pred,ylim=c(y.min,y.max)); lines(predict.mle$pred-predict.mle$se,col="red"); lines(predict.mle$pred+predict.mle$se,col="red");
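The AIC comparison can be run end-to-end on the built-in data; the model with the smaller value of $aic is preferred:

```r
# Compare AIC of the AR(2) and ARMA(1,1) fits to LakeHuron; smaller is better.
fit.arma.1 <- arima(LakeHuron, order = c(2, 0, 0))
fit.arma.2 <- arima(LakeHuron, order = c(1, 0, 1))
c(AR2 = fit.arma.1$aic, ARMA11 = fit.arma.2$aic)
```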