4.1 Seasonal ARIMA models


Seasonality in a time series is a regular pattern of changes that repeats over S time periods, where S defines the number of time periods until the pattern repeats again.

For example, there is seasonality in monthly data for which high values tend always to occur in some particular months and low values tend always to occur in other particular months. In this case, S = 12 (months per year) is the span of the periodic seasonal behavior. For quarterly data, S = 4 time periods per year.

In a seasonal ARIMA model, seasonal AR and MA terms predict x_t using data values and errors at times with lags that are multiples of S (the span of the seasonality).

  • With monthly data (and S = 12), a seasonal first-order autoregressive model would use x_{t-12} to predict x_t.  For instance, if we were selling cooling fans, we might predict this August’s sales using last August’s sales.  (This relationship of predicting using last year’s data would hold for any month of the year.)
  • A seasonal second-order autoregressive model would use x_{t-12} and x_{t-24} to predict x_t.  Here we would predict this August’s values from the past two Augusts.
  • A seasonal MA(1) model (with S = 12) would use w_{t-12} as a predictor.  A seasonal MA(2) model would use w_{t-12} and w_{t-24}.
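As an illustration, the seasonal AR(1) forecast x̂_t = μ + Φ_1(x_{t-12} - μ) can be sketched in Python.  The function name, series values, and coefficient below are hypothetical, chosen only to show the mechanics:

```python
# Seasonal AR(1) one-step forecast: predict this month's value from the same
# month last year.  mu and phi_s (the seasonal AR coefficient) are hypothetical
# illustrative values, not estimates from real data.
def seasonal_ar1_forecast(series, mu, phi_s, s=12):
    """Forecast the next value of `series` using the observation s steps back."""
    return mu + phi_s * (series[-s] - mu)

# Two years of hypothetical monthly fan sales (January through December).
sales = [20, 22, 25, 30, 40, 55, 70, 80, 60, 40, 25, 21,
         21, 23, 26, 31, 42, 57, 72, 83, 62, 41, 26, 22]

# Forecast next January from the most recent January value, sales[-12].
print(seasonal_ar1_forecast(sales, mu=40, phi_s=0.8, s=12))
```

The same call works for any month: the forecast always reaches back exactly S observations.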


When we have seasonality, it will often be necessary to examine differenced data.  Seasonality usually makes the series nonstationary because the average values at some particular times within the seasonal span (months, for example) may differ from the average values at other times.  For instance, our sales of cooling fans will always be higher in the summer months.

Seasonal differencing is defined as a difference between a value and a value with lag that is a multiple of S.

  • With S = 12, which may occur with monthly data, a seasonal difference is (1 - B^12)x_t = x_t - x_{t-12}.

The differences (from the previous year) may be about the same for each month of the year giving us a stationary series.

  • With S = 4, which may occur with quarterly data, a seasonal difference is (1 - B^4)x_t = x_t - x_{t-4}.

Seasonal differencing removes seasonal trend and can also get rid of a seasonal random walk type of nonstationarity.

Non-seasonal differencing:  If trend is present in the data, we may also need non-seasonal differencing.  Often (but not always) a first difference (non-seasonal) will “detrend” the data.  That is, we use (1 - B)x_t = x_t - x_{t-1} in the presence of trend.

Differencing for Trend and Seasonality: When both trend and seasonality are present, we may need to apply both a non-seasonal first difference and a seasonal difference.

That is, we may need to examine the ACF and PACF of (1 - B^12)(1 - B)x_t = (x_t - x_{t-1}) - (x_{t-12} - x_{t-13}).
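The expansion above can be checked numerically.  A minimal Python sketch (the series values are arbitrary, chosen only so the identity can be verified):

```python
# Verify that applying a first difference and then a seasonal difference of
# lag 12 matches the expanded form (x_t - x_{t-1}) - (x_{t-12} - x_{t-13}).
x = [(t % 12) + 0.5 * t for t in range(30)]  # arbitrary series with trend and seasonality

def diff(series, lag):
    """Difference a series at the given lag: y_t = x_t - x_{t-lag}."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

combined = diff(diff(x, 1), 12)  # (1 - B^12)(1 - B)x_t
expanded = [(x[t] - x[t - 1]) - (x[t - 12] - x[t - 13]) for t in range(13, len(x))]
print(combined == expanded)
```

Note that 13 observations are lost: one to the first difference and 12 to the seasonal difference.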

Removing trend doesn't mean that we have removed the dependency.  We may have removed the mean, μ_t, part of which may include a periodic component.  In some ways we are breaking the dependency down into recent things that have happened and long-range things that have happened.

Non-seasonal Behavior Will Still Matter

With seasonal data, it is likely that short run non-seasonal components will still contribute to the model.  In the monthly sales of cooling fans mentioned above, for instance, sales in the previous month or two, along with the sales from the same month a year ago, may help predict this month’s sales.

We’ll have to look at the ACF and PACF behavior over the first few lags (less than S) to assess what non-seasonal terms might work in the model.

Seasonal ARIMA Model

The seasonal ARIMA model incorporates both non-seasonal and seasonal factors in a multiplicative model.  One shorthand notation for the model is

ARIMA(p, d, q) × (P, D, Q)_S,

with p = non-seasonal AR order, d = non-seasonal differencing, q = non-seasonal MA order, P = seasonal AR order, D = seasonal differencing, Q = seasonal MA order, and S = time span of the repeating seasonal pattern.

Without differencing operations, the model could be written more formally as

(1)      Φ(B^S)φ(B)(x_t - μ) = Θ(B^S)θ(B)w_t

The non-seasonal components are:

AR:  φ(B) = 1 - φ_1B - ... - φ_pB^p

MA:  θ(B) = 1 + θ_1B + ... + θ_qB^q

The seasonal components are:

Seasonal AR:  Φ(B^S) = 1 - Φ_1B^S - ... - Φ_PB^{PS}

Seasonal MA:  Θ(B^S) = 1 + Θ_1B^S + ... + Θ_QB^{QS}

Note that on the left side of equation (1) the seasonal and non-seasonal AR components multiply each other, and on the right side of equation (1) the seasonal and non-seasonal MA components multiply each other.

Example 1: ARIMA(0, 0, 1) × (0, 0, 1)_12

The model includes a non-seasonal MA(1) term, a seasonal MA(1) term, no differencing, no AR terms, and the seasonal period is S = 12.

The non-seasonal MA(1) polynomial is θ(B) = 1 + θ_1B.

The seasonal MA(1) polynomial is Θ(B^12) = 1 + Θ_1B^12.

The model is (x_t - μ) = Θ(B^12)θ(B)w_t = (1 + Θ_1B^12)(1 + θ_1B)w_t.

When we multiply the two polynomials on the right side, we get

(x_t - μ) = (1 + θ_1B + Θ_1B^12 + θ_1Θ_1B^13)w_t

            = w_t + θ_1w_{t-1} + Θ_1w_{t-12} + θ_1Θ_1w_{t-13}.

Thus the model has MA terms at lags 1, 12, and 13.  This leads many to think that the identifying ACF for the model will have non-zero autocorrelations only at lags 1, 12, and 13.  There’s a slight surprise here: there will also be a non-zero autocorrelation at lag 11.  We supply a proof in the Appendix at the end of this document.

Example 1 Continued:

We simulated n = 1000 values from an ARIMA(0, 0, 1) × (0, 0, 1)_12 model.  The non-seasonal MA(1) coefficient was θ_1 = 0.7.  The seasonal MA(1) coefficient was Θ_1 = 0.6.  The sample ACF for the simulated series was as follows:

[Sample ACF of the simulated ARIMA(0, 0, 1) × (0, 0, 1)_12 series]

Note the spikes at lags 1, 11, and 12 in the ACF.  This is characteristic of the ACF for the ARIMA(0, 0, 1) × (0, 0, 1)_12 model.  Because this model has non-seasonal and seasonal MA terms, the PACF tapers non-seasonally following lag 1, and also tapers seasonally, that is, near lag S = 12 and again near lag 2S = 24.
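A comparable simulation can be run in Python (the original figure was produced with other software).  A minimal stdlib-only sketch, using a larger n so the sample ACF sits close to its theoretical values:

```python
import random

# Simulate from ARIMA(0,0,1)x(0,0,1)_12 with theta_1 = 0.7 and Theta_1 = 0.6,
# i.e. x_t = w_t + 0.7 w_{t-1} + 0.6 w_{t-12} + 0.42 w_{t-13}, then compute
# the sample ACF at a few lags.
random.seed(1)
n, theta, Theta = 20000, 0.7, 0.6
w = [random.gauss(0, 1) for _ in range(n + 13)]
x = [w[t] + theta * w[t - 1] + Theta * w[t - 12] + theta * Theta * w[t - 13]
     for t in range(13, n + 13)]

mean = sum(x) / n
var = sum((v - mean) ** 2 for v in x)

def acf(lag):
    """Sample autocorrelation of x at the given lag."""
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / var

for lag in (1, 5, 11, 12, 13):
    print(lag, round(acf(lag), 3))
```

With n this large, the sample ACF should be close to the theoretical values for these coefficients (ρ_1 ≈ 0.47, ρ_11 = ρ_13 ≈ 0.21, ρ_12 ≈ 0.44) and near zero at the other lags.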

Example 2: ARIMA(1, 0, 0) × (1, 0, 0)_12

The model includes a non-seasonal AR(1) term, a seasonal AR(1) term, no differencing, no MA terms, and the seasonal period is S = 12.

The non-seasonal AR(1) polynomial is φ(B) = 1 - φ_1B.

The seasonal AR(1) polynomial is Φ(B^12) = 1 - Φ_1B^12.

The model is (1 - Φ_1B^12)(1 - φ_1B)(x_t - μ) = w_t.

If we let z_t = x_t - μ (for simplicity), multiply the two AR components, and move everything but z_t to the right side, we get z_t = φ_1z_{t-1} + Φ_1z_{t-12} - φ_1Φ_1z_{t-13} + w_t.

This is an AR model with predictors at lags 1, 12, and 13.
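The expansion of (1 - Φ_1B^12)(1 - φ_1B) can be checked with a small polynomial multiplication; a Python sketch, using the coefficients φ_1 = 0.6 and Φ_1 = 0.5 from the example below:

```python
# Multiply the AR polynomials (1 - phi*B)(1 - Phi*B^12) and read off the
# coefficients on B^1, B^12, and B^13.
phi, Phi = 0.6, 0.5

def poly_mult(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of B)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

nonseasonal = [1.0, -phi]               # 1 - phi*B
seasonal = [1.0] + [0.0] * 11 + [-Phi]  # 1 - Phi*B^12
product = poly_mult(nonseasonal, seasonal)

# In z_t = phi*z_{t-1} + Phi*z_{t-12} - phi*Phi*z_{t-13} + w_t, the AR
# coefficients are the negated polynomial coefficients at powers 1 through 13.
ar_coeffs = [-c for c in product[1:]]
print(ar_coeffs)
```

The resulting coefficient vector (0.6 at lag 1, 0.5 at lag 12, -0.3 at lag 13, zeros elsewhere) is exactly what is passed to ARMAacf in the R commands noted below.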

R can be used to determine and plot the PACF for this model, with φ_1 = 0.6 and Φ_1 = 0.5.  That PACF (partial autocorrelation function) is:

[PACF plot for the ARIMA(1, 0, 0) × (1, 0, 0)_12 model]

It’s not quite what you might expect for an AR, but it almost is.  There are distinct spikes at lags 1, 12, and 13, with a bit of action coming before lag 12.  Then it cuts off after lag 13.

Note: the R commands were

thepacf = ARMAacf(ar = c(.6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, .5, -.3), lag.max = 30, pacf = TRUE)
plot(thepacf, type = "h")

Identifying a Seasonal Model

Step 1:  Do a time series plot of the data.  Examine it for features such as trend and seasonality.  You’ll know that you’ve gathered seasonal data (months, quarters, etc.), so look at the pattern across those time units to see if there is indeed a seasonal pattern.

Step 2:  Do any necessary differencing. The general guidelines are:

  • If there is seasonality and no trend, then take a difference of lag S.  For instance, take a 12th difference for monthly data with seasonality.  Seasonality will appear in the ACF as a pattern that tapers slowly at multiples of S.
  • If there is linear trend and no obvious seasonality, then take a first difference.  If there is a curved trend, consider a transformation of the data before differencing.
  • If there is both trend and seasonality, apply a seasonal difference to the data and then re-evaluate the trend.  If a trend remains, then take first differences.  For instance, if the series is called x, the commands in R would be:

diff12 = diff(x, 12)
diff1and12 = diff(diff12, 1)

  • If there is neither obvious trend nor seasonality, don’t take any differences.
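The seasonal-then-first differencing in the third guideline can be sketched in Python (the series here is a hypothetical example with both a linear trend and a seasonal pattern; in practice you would use your data):

```python
# Apply a lag-12 seasonal difference, then a first difference, mirroring the
# R commands diff(x, 12) and diff(diff12, 1).
def diff(series, lag=1):
    """Difference a series at the given lag: y_t = x_t - x_{t-lag}."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

seasonal_pattern = [0, 1, 3, 6, 9, 12, 14, 15, 13, 9, 5, 2]
x = [0.5 * t + seasonal_pattern[t % 12] for t in range(48)]  # trend + seasonality

diff12 = diff(x, 12)          # removes the seasonal pattern; the trend remains
diff1and12 = diff(diff12, 1)  # removes what is left of the trend
print(diff12[:3], diff1and12[:3])
```

For this constructed series, the seasonal difference leaves a constant (the trend shows up as 6 = 12 × 0.5 at every point), and the additional first difference reduces the series to zeros.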

Step 3:  Examine the ACF and PACF of the differenced data (if differencing is necessary).

We’re using this information to determine possible models.  This can be tricky, involving some (educated) guessing.  Some basic guidance:

Non-seasonal terms:  Examine the early lags (1, 2, 3, …) to judge non-seasonal terms.  Spikes in the ACF (at low lags) indicate possible non-seasonal MA terms.  Spikes in the PACF (at low lags) indicate possible non-seasonal AR terms.

Seasonal terms:  Examine the patterns across lags that are multiples of S.  For example, for monthly data, look at lags 12, 24, 36, and so on (you probably won’t need to look at more than the first two or three seasonal multiples).  Judge the ACF and PACF at the seasonal lags in the same way you do for the earlier lags.

Step 4:  Estimate the model(s) that might be reasonable on the basis of Step 3.  Don’t forget to include any differencing that you did before looking at the ACF and PACF.  In the software, specify the original series as the data and then indicate the desired differencing when specifying parameters in the arima command that you’re using.

Step 5:  Examine the residuals (with ACF, Box-Pierce, and any other means) to see if the model seems good.  Compare AIC or BIC values if you tried several models.

If things don’t look good here, it’s back to Step 3 (or maybe even Step 2).

Example 3

The data are a monthly series of a measure of the flow of the Colorado River, at a particular site, for n = 600 consecutive months.

Step 1: A time series plot is shown below.

[Time series plot of the monthly Colorado River flow data]

With so many data points, it’s difficult to judge whether there is seasonality.  If it were your job to work on data like this, you probably would know that river flow is seasonal, perhaps likely to be higher in the late spring and early summer due to snow runoff.

Without this knowledge, we might determine means by month of the year.  Below is a plot of means for the 12 months of the year.  It’s clear that there are monthly differences (seasonality).

[Plot of mean flow for each of the 12 months of the year]

Looking back at the time series plot, it’s hard to judge whether there’s any long run trend.  If there is, it’s slight.

Steps 2 and 3:  We might try the idea that there is seasonality, but no trend.  To do this, we can create a variable that gives the 12th differences (seasonal differences), calculated as x_t - x_{t-12}.  Then we look at the ACF and the PACF for the 12th-difference series (not the original data).  Here they are:

[ACF and PACF of the 12th-difference series]

Non-seasonal behavior:  The PACF shows a clear spike at lag 1 and not much else until about lag 11.  This is accompanied by a tapering pattern in the early lags of the ACF.  A non-seasonal AR(1) may be a useful part of the model.

Seasonal behavior:  We look at what’s going on around lags 12, 24, and so on.  In the ACF, there’s a cluster of (negative) spikes around lag 12 and then not much else.  The PACF tapers at multiples of S; that is, the PACF has significant lags at 12, 24, 36, and so on.  This is similar to what we saw for a seasonal MA(1) component in Example 1 of this lesson.

Remembering that we’re looking at 12th differences, the model we might try for the original series is ARIMA(1, 0, 0) × (0, 1, 1)_12.

Step 4:  Minitab results for the ARIMA(1, 0, 0) × (0, 1, 1)_12 model:

Final Estimates of Parameters

Type        Coef       SE Coef    T       P
AR   1      0.5162     0.0354     14.58   0.000
SMA  12     0.9140     0.0169     53.95   0.000
Constant   -0.006502   0.002884   -2.25   0.025

Differencing: 0 regular, 1 seasonal of order 12
Number of observations: Original series 600, after differencing 588
Residuals: SS = 272.886 (backforecasts excluded)
           MS = 0.466  DF = 585

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12     24     36     48
Chi-Square   5.6    10.0   16.3   24.0
DF           9      21     33     45
P-Value      0.781  0.979  0.994  0.996

Things look good.  The Box-Pierce statistics are all non-significant and the estimated coefficients (above) are statistically significant.

Step 5 (diagnostics):  We’ve already looked at the Box-Pierce tests.  The ACF of the residuals looks good too:

[ACF of the residuals]

What doesn’t look perfect is a plot of residuals versus fits.  There’s non-constant variance.

[Plot of residuals versus fitted values]

We’ve got three choices for what to do about the non-constant variance: (1) ignore it, (2) go back to step 1 and try a variance stabilizing transformation like log or square root, or (3) use an ARCH model that includes a component for changing variances.  We’ll get to ARCH models later in the course.

Lesson 4.2 for this week will give R guidance and an additional example or two.


Appendix (Optional reading):

Only those interested in the theory need to read the following.

In Example 1, we promised a proof that ρ_11 ≠ 0 for the ARIMA(0, 0, 1) × (0, 0, 1)_12 model.

A correlation is defined as a covariance divided by the product of standard deviations.

The covariance between x_t and x_{t-11} is E[(x_t - μ)(x_{t-11} - μ)].

For the model in Example 1,

x_t - μ = w_t + θ_1w_{t-1} + Θ_1w_{t-12} + θ_1Θ_1w_{t-13}

x_{t-11} - μ = w_{t-11} + θ_1w_{t-12} + Θ_1w_{t-23} + θ_1Θ_1w_{t-24}

The covariance between x_t and x_{t-11} is therefore

(2)      E[(w_t + θ_1w_{t-1} + Θ_1w_{t-12} + θ_1Θ_1w_{t-13})(w_{t-11} + θ_1w_{t-12} + Θ_1w_{t-23} + θ_1Θ_1w_{t-24})]

The w’s are independent errors.  The expected value of any product involving w’s with different subscripts will be 0.  The expected value of a product of w’s with the same subscript will be the variance of w.

If you inspect all possible products in expression (2), there is exactly one product with matching subscripts: (Θ_1w_{t-12})(θ_1w_{t-12}).  Its expected value is θ_1Θ_1σ_w², which is different from 0.

This shows that the lag 11 autocorrelation is different from 0.  If you work through the more general problem, you can show that only lags 1, 11, 12, and 13 have non-zero autocorrelations for the ARIMA(0, 0, 1) × (0, 0, 1)_12 model.
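This calculation can be carried out exhaustively in code.  The sketch below represents x_t - μ by its MA coefficients (one weight for each lag of w, stored as a dictionary) and computes every autocovariance directly, confirming that only lags 1, 11, 12, and 13 are non-zero:

```python
# Theoretical autocorrelations of ARIMA(0,0,1)x(0,0,1)_12 with theta_1 = 0.7
# and Theta_1 = 0.6.  Since x_t - mu = sum_j psi[j] * w_{t-j} and the w's are
# independent, the autocovariance at lag h (with sigma_w^2 = 1) is the sum of
# psi[j] * psi[j + h] over matching subscripts of w.
theta, Theta = 0.7, 0.6
psi = {0: 1.0, 1: theta, 12: Theta, 13: theta * Theta}

def autocov(h):
    """Autocovariance at lag h: sum over products with matching w subscripts."""
    return sum(c * psi.get(j + h, 0.0) for j, c in psi.items())

gamma0 = autocov(0)
rho = {h: autocov(h) / gamma0 for h in range(1, 16)}
nonzero = [h for h, r in rho.items() if abs(r) > 1e-12]
print(nonzero)  # lags with non-zero autocorrelation
```

With these coefficients, ρ_11 = θ_1Θ_1 / ((1 + θ_1²)(1 + Θ_1²)) ≈ 0.207, which matches the spike near lag 11 in the simulated ACF of Example 1.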
