Time Series Seasonality and Trend

Last edited: 2024-10-29 12:30:03


To apply models like the ARMA model to time series sample data, the seasonality and trend have to be removed. Let's look at how to do that.

Suppose a time series $X_t$ has both a trend and a seasonal component. We can then decompose it as

$$X_t = m_t + s_t + Y_t,$$

where $m_t$ is the trend component, $s_t$ is the seasonal component with period $d$, and $Y_t$ is the remainder, which is weakly stationary with mean 0. Because $d$ is the period of the seasonal component, $s_t = s_{t+d}$ for all $t$.

Before we discuss how to eliminate the trend and seasonal component, we have to examine a couple of time series operators.

The Lag Operator

The lag operator $B$, also called the backshift operator, is defined as:

$$B X_t = X_{t-1}.$$

Raising the lag operator to the power $k$ therefore gives:

$$B^k X_t = X_{t-k}.$$

The Difference Operator

The difference operator $\nabla$ is defined as:

$$\nabla X_t = (1-B) X_t = X_t - X_{t-1}.$$

Raising the difference operator to the power $k$ and using the binomial theorem gives:

$$\nabla^k X_t = (1-B)^k X_t = \sum_{i=0}^k {k \choose i} (-1)^i B^i X_t = \sum_{i=0}^k {k \choose i} (-1)^i X_{t-i}.$$
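As a quick sanity check, the binomial formula can be compared with simply applying $\nabla$ repeatedly. The following is a minimal sketch in plain Python; the function names `diff_k_repeated` and `diff_k_binomial` and the sample values are made up for illustration.

```python
from math import comb


def diff_k_repeated(x, k):
    """Apply the difference operator k times; each pass shortens the list by one."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def diff_k_binomial(x, k):
    """k-th difference via sum_{i=0}^{k} C(k, i) (-1)^i x[t - i], defined for t >= k."""
    return [
        sum(comb(k, i) * (-1) ** i * x[t - i] for i in range(k + 1))
        for t in range(k, len(x))
    ]


# Illustrative data: the squares 1, 4, 9, ... (a quadratic in t).
x = [1, 4, 9, 16, 25, 36]
repeated = diff_k_repeated(x, 2)
binomial = diff_k_binomial(x, 2)
print(repeated)  # [2, 2, 2, 2]
print(binomial)  # [2, 2, 2, 2]
```

Both routes agree, and the second difference of a quadratic is constant.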

The Lag-d Differencing Operator

The lag-d differencing operator $\nabla_d$ is defined as:

$$\nabla_d X_t = (1-B^d) X_t = X_t - X_{t-d}.$$

Removing Seasonality in a Time Series

To eliminate seasonality in a time series we use the lag-d differencing operator. Since the seasonal component has period $d$, we get:

$$\nabla_d X_t = m_t - m_{t-d} + s_t - s_{t-d} + Y_t - Y_{t-d} = \nabla_d m_t + \nabla_d Y_t,$$

because $s_t = s_{t-d}$. The seasonal component is gone, and what remains is the differenced trend plus the differenced remainder.
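The cancellation of the seasonal component can be seen on a small synthetic series. This is only a sketch: the period `d = 4`, the seasonal pattern, and the linear trend are assumed values chosen for illustration, and the noise term $Y_t$ is left out so the result is exact.

```python
def lag_diff(x, d):
    """Lag-d difference: x[t] - x[t - d], defined for t >= d."""
    return [x[t] - x[t - d] for t in range(d, len(x))]


d = 4                            # assumed seasonal period
season = [3.0, -1.0, -2.0, 0.0]  # hypothetical seasonal pattern, repeats every d steps
n = 20

# Linear trend m_t = 0.5 * t plus the seasonal component s_t (no noise).
x = [0.5 * t + season[t % d] for t in range(n)]

dx = lag_diff(x, d)
# s_t - s_{t-d} = 0, so only m_t - m_{t-d} = 0.5 * d = 2.0 is left.
print(dx)
```

Every entry of `dx` equals 2.0: the seasonal pattern has been removed and the linear trend has been reduced to a constant.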

Removing Trend in a Time Series

Eliminating trend in a time series is a bit more tricky than removing seasonality. We have to assume that

$$m_t = \sum_{i=0}^q a_i t^i,$$

where the $a_i$ are real constants and $q < n$, with $n$ the sample size of the data. Let's apply the difference operator to the power $q$ to $m_t$:

$$\nabla^q m_t = \sum_{i=0}^q {q \choose i} (-1)^i \sum_{j=0}^q a_j (t-i)^j.$$

Since the sums are finite we can change their order, which gives:

$$\sum_{j=0}^q a_j \sum_{i=0}^q {q \choose i} (-1)^i (t-i)^j.$$

Now, we focus on the inner sum:

$$\sum_{i=0}^q {q \choose i} (-1)^i (t-i)^j.$$

Expanding $(t-i)^j$ with the binomial theorem gives:

$$(t-i)^j = \sum_{k=0}^j {j \choose k} t^{j-k} (-i)^k.$$

Substituting this into the inner sum and changing the order of summation, we get:

$$\sum_{i=0}^q {q \choose i} (-1)^i (t-i)^j = \sum_{k=0}^j {j \choose k} t^{j-k} (-1)^k \sum_{i=0}^q {q \choose i} (-1)^i i^k.$$

Using the fact that

$$\sum_{i=0}^q {q \choose i} (-1)^i i^k = 0 \text{ for } k < q$$

and that $k \le j$, we find that the entire expression evaluates to 0 when $j < q$. When $j = q$, only the $k = q$ term survives. There the sum over $i$ equals $(-1)^q q!$, and multiplying by the factor $(-1)^k = (-1)^q$ gives $q!$. Therefore only the $j = q$ term of the outer sum remains, and it simplifies to:

$$a_q q!$$

and this means that

$$\nabla^q m_t = a_q q!.$$
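Both the alternating-sum identity and the final result can be checked numerically. The sketch below uses exact integer arithmetic; the degree `q = 3` and the coefficients in `coeffs` are arbitrary values picked for illustration.

```python
from math import comb, factorial


def alternating_power_sum(q, k):
    """sum_{i=0}^{q} C(q, i) (-1)^i i^k (Python evaluates 0**0 as 1)."""
    return sum(comb(q, i) * (-1) ** i * i ** k for i in range(q + 1))


def diff_k(x, k):
    """Apply the difference operator k times."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


q = 3
vanishing = [alternating_power_sum(q, k) for k in range(q)]  # the cases k < q
top = alternating_power_sum(q, q)                            # the case k = q

# Hypothetical polynomial trend m_t = 5 - 2t + 7t^2 + 4t^3, so a_q = 4.
coeffs = [5, -2, 7, 4]
trend = [sum(a * t ** i for i, a in enumerate(coeffs)) for t in range(10)]
diffed = diff_k(trend, q)

print(vanishing)                   # all zeros
print(top, (-1) ** q * factorial(q))
print(diffed)                      # every entry is a_q * q! = 4 * 6 = 24
```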

So when we apply $\nabla^q$ to $X_t$ in the absence of a seasonal component, we get

$$\nabla^q X_t = a_q q! + \nabla^q Y_t.$$

Since $Y_t$ is weakly stationary with mean zero, so is $\nabla^q Y_t$, because it is a finite linear combination of lagged values of $Y_t$. Consequently, if $s_t = 0$, $\nabla^q X_t$ is a weakly stationary process with mean $a_q q!$.
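To illustrate the result with noise included, here is a rough simulation sketch. It assumes a quadratic trend with made-up coefficients and uses independent standard normal draws as a stand-in for the stationary remainder $Y_t$; none of these choices come from the derivation itself.

```python
import random
from math import factorial


def diff_k(x, k):
    """Apply the difference operator k times."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


random.seed(0)                 # fixed seed so the run is repeatable
q = 2
a = [1.0, 0.5, 0.25]           # hypothetical trend coefficients a_0, a_1, a_2
n = 2000

# X_t = m_t + Y_t with no seasonal component; Y_t is white noise here.
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [sum(c * t ** i for i, c in enumerate(a)) + noise[t] for t in range(n)]

dx = diff_k(x, q)
sample_mean = sum(dx) / len(dx)
expected_mean = a[q] * factorial(q)   # a_q * q! = 0.25 * 2 = 0.5

print(sample_mean, expected_mean)
```

The sample mean of the differenced series should land close to $a_q q!$, while the trend itself is no longer visible in `dx`.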
