14 Time series

14.1 Main R packages to work with time series

  • astsa
  • forecast
  • stat
  • TSA
  • FitAR
  • FitARMA
  • MTS
  • zoo
  • its
  • tseries
  • fBasics

14.2 Introduction

A time series is a collection of random variables \((X)_{t \in T}\) indexed by time i.e., \[X_1, X_2, \ldots, X_{t-1}, X_t,\] where \(T\) is a set.

14.3 Examples of time series

> x <- round(rnorm(12), 2)
> 
> ts1 <- ts(x, start = 2000, frequency = 1)
> ts1
## Time Series:
## Start = 2000 
## End = 2011 
## Frequency = 1 
##  [1]  0.40 -0.61  0.34 -1.13  1.43  1.98 -0.37 -1.04  0.57 -0.14  2.40 -0.04
> ts2 <- ts(x, start = c(2000, 1), frequency = 12)
> ts2
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 2000  0.40 -0.61  0.34 -1.13  1.43  1.98 -0.37 -1.04  0.57 -0.14  2.40 -0.04
> ts3 <- ts(x, start = c(2000, 6), frequency = 12)
> ts3
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 2000                                0.40 -0.61  0.34 -1.13  1.43  1.98 -0.37
## 2001 -1.04  0.57 -0.14  2.40 -0.04

14.4 Plotting time series

> # univariate time series
> ts4 <- ts(rnorm(100, 1, 0.25), start = 1900, frequency = 1)
> plot(ts4)

> # multivariate time series
> Mat.ts <- matrix(rnorm(200), 100, 2)
> colnames(Mat.ts) <- c("GDP", "Inflation")
> ts5 <- ts(Mat.ts, start = 1900, frequency = 1)
> plot(ts5, main = "Gross domestic product and Inflation")

> plot(ts5, plot.type = "single", col = c("orange", "blue"), lty = c(1, 2))
> legend("bottomright", legend = colnames(Mat.ts), col = c("orange", "blue"), lty = c(1,
+     2), cex = 0.5)

14.5 Useful functions

14.5.1 start and end

> ts6 <- ts(round(rnorm(12), 2), frequency = 12, start = 2000)
> ts6
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 2000 -0.64 -0.43 -0.17  0.61  0.68  0.57 -0.57 -1.36 -0.39  0.28 -0.82 -0.07
> start(ts6)
## [1] 2000    1
> end(ts6)
## [1] 2000   12

14.5.2 window

> ts6
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 2000 -0.64 -0.43 -0.17  0.61  0.68  0.57 -0.57 -1.36 -0.39  0.28 -0.82 -0.07
> window(ts6, start = c(2000, 6), end = c(2000, 9))
##        Jun   Jul   Aug   Sep
## 2000  0.57 -0.57 -1.36 -0.39

14.5.3 cbind: binding time series

> x1 <- round(rnorm(6), 1)
> ts7 <- ts(x1, freq = 12, start = 1959)
> ts7
##       Jan  Feb  Mar  Apr  May  Jun
## 1959 -1.2  0.0  0.1 -0.1 -0.2  1.8
> 
> x2 <- round(rnorm(5), 1)
> ts8 <- ts(x2, freq = 12, start = c(1959, 3))
> ts8
##       Mar  Apr  May  Jun  Jul
## 1959  0.8  1.1 -0.9  0.2  1.2
> 
> cbind(ts7, ts8)
##           ts7  ts8
## Jan 1959 -1.2   NA
## Feb 1959  0.0   NA
## Mar 1959  0.1  0.8
## Apr 1959 -0.1  1.1
## May 1959 -0.2 -0.9
## Jun 1959  1.8  0.2
## Jul 1959   NA  1.2

14.5.4 aggregate: sums by periods

> library(tseries)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
> help(USAccDeaths)
> head(USAccDeaths)
## [1]  9007  8106  8928  9137 10017 10826
> length(USAccDeaths)
## [1] 72
> 
> # sums by quarter
> aggregate(USAccDeaths, nfrequency = 4, FUN = sum)
##       Qtr1  Qtr2  Qtr3  Qtr4
## 1973 26041 29980 31774 28026
## 1974 22769 26648 28686 26519
## 1975 23592 26813 27998 24660
## 1976 22945 25493 27294 25009
## 1977 22475 26295 28241 25911
## 1978 22519 26741 29421 26943
> 
> # sums by semester
> aggregate(USAccDeaths, nfrequency = 2, FUN = sum)
## Time Series:
## Start = c(1973, 1) 
## End = c(1978, 2) 
## Frequency = 2 
##  [1] 56021 59800 49417 55205 50405 52658 48438 52303 48770 54152 49260 56364
> 
> # sum by year
> aggregate(USAccDeaths, nfrequency = 1, FUN = sum)
## Time Series:
## Start = 1973 
## End = 1978 
## Frequency = 1 
## [1] 115821 104622 103063 100741 102922 105624
> 
> # mean by quarter
> aggregate(USAccDeaths, nfrequency = 4, FUN = mean)
##       Qtr1  Qtr2  Qtr3  Qtr4
## 1973  8680  9993 10591  9342
## 1974  7590  8883  9562  8840
## 1975  7864  8938  9333  8220
## 1976  7648  8498  9098  8336
## 1977  7492  8765  9414  8637
## 1978  7506  8914  9807  8981
> 
> # mean by semester
> aggregate(USAccDeaths, nfrequency = 2, FUN = mean)
## Time Series:
## Start = c(1973, 1) 
## End = c(1978, 2) 
## Frequency = 2 
##  [1] 9337 9967 8236 9201 8401 8776 8073 8717 8128 9025 8210 9394
> 
> # mean by year
> aggregate(USAccDeaths, nfrequency = 1, FUN = mean)
## Time Series:
## Start = 1973 
## End = 1978 
## Frequency = 1 
## [1] 9652 8718 8589 8395 8577 8802

14.6 Fundamental concepts

Let \(X_t, ~ t \in T\) be a time series.

14.6.1 Expected value

\[\mathbb{E}[X_t]=\mu_{X_t}.\]

14.6.2 Autocorrelation function (ACF)

\[\rho_X(t_1,t_2)=\frac{\gamma_X(t_1,t_2)}{\sqrt{\gamma_X(t_1,t_1)\gamma_X(t_2,t_2)}}.\]

Properties:

  • \(\rho_X(t_1,t_2)=\rho_X(t_2,t_1)\)
  • \(-1 \leq \rho_X(t_1,t_2) \leq 1\)

Interpretation: ACF is an autocorrelation function which gives us values of autocorrelation of any series with its lagged (delayed) values. The ACF plot shows these values along with the confidence band. In simple terms, it describes how well the present value of the series is related with its past values. A time series can have components like trend, seasonality, cyclic and residual. ACF considers all these components while finding correlations.

Examples:

> acf(lh, main = "ACF of lh")

> acf(ldeaths, main = "ACF of ldeaths")

> acf(presidents, na.action = na.pass, main = "ACF of presidents")

14.6.3 Partial autocorrelation function (PACF)

\[\phi_{kk}=Corr(X_t,X_{t-k} | X_{t-1},\ldots,X_{t-k+1})\] Interpretation: PACF is a partial autocorrelation function. PACF finds correlation conditional to past values.

Examples:

> pacf(lh, main = "PACF of lh")

> pacf(ldeaths, main = "PACF of ldeaths")

> pacf(presidents, na.action = na.pass, main = "PACF of presidents")

14.6.4 White noise

A stochastic process is said to be a white noise \((\varepsilon_t)_{t \in \mathbb{N}_0}\) if

  • \(\mathbb{E}[\varepsilon_t]=0\)
  • \(Var[\varepsilon_t]=\sigma^2\)
  • \(Cov(\varepsilon_t,\varepsilon_s)=0, \quad t \neq s\)
  • \(\varepsilon_t \sim \mathcal{N}(0,\sigma^2)\)
> wn1 <- rnorm(1000, mean = 0, sd = 1)
> plot(wn1, type = "l", xlab = "", ylim = c(-3, 3))

> acf(wn1)

> pacf(wn1)

> wn2 <- rnorm(1000, mean = 0, sd = 0.2)
> plot(wn2, type = "l", xlab = "", ylim = c(-3, 3))

> acf(wn2)

> pacf(wn2)

14.6.5 Random walk

A stochastic process is a random walk if \(X_t=X_{t-1}+\varepsilon_t\).

> rw <- cumsum(sample(c(-1, 1), 1000, TRUE))
> plot(rw, type = "l", xlab = "")

> acf(rw)

> pacf(rw)

14.7 Classical decomposition models

Traditional methods consider the observed TS as a realization of the process \[X_t=f(T_t,S_t,Y_t),\] where

  • \(T_t\) is the trend component: a trend exists when there is a long-term increase or decrease in the data.
  • \(S_t\) is the seasonal component: a seasonal pattern occurs when a time series is affected by seasonal factors such as the time of the year or the day of the week.
  • \(Y_t\) is the error component (usualy a standard white noise)

Models can be

  • additive: \(X_t=T_t+S_t+Y_t\)
  • multiplicative: \(X_t=T_t*S_t*Y_t\)
  • mixtures: \(X_t=(T_t+S_t+Y_t)+(T_t*S_t*Y_t)\)

The estimated TS is \[\hat{Y_t}=X_t-T_t-S_t.\]

14.7.1 Classical decomposition models (trend component)

\[X_t=T_t+Y_t=(2+t/2)+Y_t, \quad Y_t \sim \mathcal{N}(0,10), \quad t=1,\ldots,250\]

> t <- 1:250
> X <- (2 + t/2) + rnorm(250, 0, 10)
> plot.ts(X)

14.7.2 Classical decomposition models (seasonal component)

\[X_t=S_t+Y_t=3*\cos\left(\frac{2 \pi t}{25}\right)+Y_t, \quad Y_t \sim \mathcal{N}(0,4), \quad t=1,\ldots,250\]

> t <- 1:250
> X <- 3 * cos(2 * pi * t/25) + rnorm(250, 0, 4)
> plot.ts(X)

14.7.3 Classical decomposition models (additive model)

\[X_t=T_t+S_t+Y_t=(2+t/10)+3*\cos\left(\frac{2 \pi t}{25}\right)+Y_t, \quad Y_t \sim \mathcal{N}(0,1), \quad t=1,\ldots,250\]

> t <- 1:250
> X <- 2 + t/10 + 3 * cos(2 * pi * t/25) + rnorm(250)
> plot.ts(X)

14.7.4 Classical decomposition models (multiplicative model)

\[X_t=T_t*S_t+Y_t=(2+t/10)*3*\cos\left(\frac{2 \pi t}{25}\right)+Y_t, \quad Y_t \sim \mathcal{N}(0,10), \quad t=1,\ldots,250\]

> t <- 1:250
> X <- (2 + t/10) * 3 * cos(2 * pi * t/25) + rnorm(250, 0, 10)
> plot.ts(X)

14.7.5 Classical decomposition models (mixture model)

\[X_t=T_t+T_t*S_t+Y_t=(2+t/2)+(2+t/10)*3*\cos\left(\frac{2 \pi t}{25}\right)+Y_t, \quad Y_t \sim \mathcal{N}(0,10), t=1,\ldots,250\]

> t <- 1:250
> X <- (2 + t/2) + (2 + t/10) * 3 * cos(2 * pi * t/25) + rnorm(250, 0, 10)
> plot.ts(X)