On Bayesian Identification of Autoregressive Processes

The main objective of the current study is to handle the identification problem of autoregressive processes from the Bayesian point of view. Two Bayesian identification approaches are considered. They are referred to as the direct and the indirect approaches. The two approaches are employed to solve the Bayesian identification problem of autoregressive processes using three well known priors. These priors are the G prior, the Natural-Conjugate prior and Jeffrey's prior. The theoretical derivations related to the two Bayesian identification approaches are conducted using the above mentioned priors. Moreover, the performance of the two techniques, using each of the three priors, is investigated via comprehensive simulation studies. Simulation results show that the two techniques are adequate to solve the identification problem of autoregressive processes. The increase in the time series length leads to better performance for each technique. The use of different priors doesn't affect the numerical results.


Introduction
The autoregressive processes, denoted by AR(p) for short, are very useful in modeling time series data that arise in many areas of scientific endeavor such as engineering, physics, business, marketing and economics.In practice, the model order p is usually unknown and should be identified or estimated.Identification is the first and one of the most important phases in time series analysis.Several Bayesian and non-Bayesian techniques can be traced in the literature to identify the order p.
Two well known approaches are found in the Bayesian literature to identify models for time series.Diaz and Farah (1981) have developed a direct Bayesian method to identify the order of AR(p) models.Their technique assumes that the order p is a random variable with a known maximum.The technique derives the posterior mass function of the order and selects the order with maximum posterior probability to solve the identification problem.Diaz and Farah have used a Natural-Conjugate prior in their theoretical derivations but they haven't studied its accuracy.Broemeling and Shaarawy (1987) have developed an indirect Bayesian approximate procedure to identify the orders of autoregressive moving average processes, denoted by ARMA(p,q).The technique assumes that the orders are unknown constants with known maximums.The technique derives the joint posterior density of the of the model coefficients.To overcome the problems in the exact analysis of the ARMA models, they approximated the posterior distribution of the maximum number of coefficients by a multivariate-t distribution.Then the significance of coefficients is tested by a series of univariate-t tests in a similar fashion to the backward elimination procedure used in linear regression analysis.After eliminating the insignificant coefficients, the order of the model is determined.Broemeling and Shaarawy have used Jeffreys' prior in their derivations.Nevertheless, they haven't studied the numerical effectiveness of the technique.
The current study restricts attention to the use of both direct and indirect Bayesian identification techniques for autoregressive models.It is well known that, the derivation of both the posterior probability mass function of the order and the posterior density of the coefficients of the AR models, according to Bayes theorem, depends on the likelihood function of the time series and the selected prior distribution.Three well-know priors are considered.They are g prior, asserted by Zellner (1983 and1986), natural-conjugate prior, asserted by Raiffa and Schlaifer (1961), and Jeffreys' prior, asserted by Jeffreys '(1961).The derivation of both the posterior probability mass function of the order and the posterior density of the coefficients of the AR models will be developed using each prior.The numerical effectiveness of each posterior distribution in performing the direct and the indirect Bayesian identification techniques will be investigated via simulation studies.
It is worth noting that, the current article can be considered as the first one to use the g prior in solving the problem of direct and indirect Bayesian identification of autoregressive processes.Moreover, the it inspects the numerical effectiveness of using g prior in solving the problem of Bayesian identification of autoregressive processes via comprehensive simulation studies.In addition, the article inspects the numerical effectiveness of using the natural-conjugate prior in solving the Bayesian identification problem of autoregressive processes via simulation.Furthermore, the article compares the performance of the above mentioned two priors with the performance of using Jeffreys' prior in solving the problem.The main difficulty in employing both the natural-conjugate and the g priors is the existence of some hyper-parameters that need to be estimated.
The rest of the article is structured as follows: In section 2, a review of the literature is given.Section 3 is devoted to discussing the autoregressive processes.In section 4, basic concepts of the Bayesian identification techniques are explained.In section 5, the g prior is used to conduct the direct and the indirect Bayesian identification techniques for autoregressive processes.The use of the natural-conjugate prior and Jeffreys' prior to employ the direct and indirect Bayesian identification techniques for autoregressive processes is shown in section 5. Finally, section 6 is devoted to conduct comprehensive simulation studies to investigate and compare the effectiveness of the considered three priors in solving the problem of direct and indirect Bayesian identification of autoregressive processes.

Review of the Literature
The literature of time series identification is vast.It is divided into two parts: Bayesian and non-Bayesian.The most popular non Bayesian approach to identify the orders of ARMA(p,q) models is developed by Box and Jenkins (1970).Their methodology is based on matching the sample autocorrelation and partial autocorrelation functions with their theoretical counterparts.Their technique is explained in many references such as Chatfield (1980), Priestley (1981), Tong (1990), Harvey (1993), Wei (2005), Box et al. (2008) and Liu (2009).Another non Bayesian approach, known as the automatic approach, is based on fitting all possible models and computing a certain criterion for each model and choosing the model which minimizes the proposed criterion.For more details about the automatic approach, the reader is referred to Akaike (1973Akaike ( , 1974)), Hannan and Quinn (1979), Mills and Prasad (1992) and Beveridge and Oickle (1994).
Regarding the literature of Bayesian identification techniques, we can divide them into two types, namely numerical and analytical ones.Numerical techniques include the techniques that depend on numerical integrations and the sampling based methods such as Markov Chain Monte Carlo (MCMC) methods.Monahan (1983) introduced a numerical integration algorithm in order to calculate the posterior probabilities for the orders of ARMA models.Moreover, the MCMC methods were used by Barnett et al. (1996) to estimate the order of AR processes and by Philippe (2006) to estimate the orders of ARMA models.
On the other hand, Diaz and Farah (1981) have proposed a direct analytical Bayesian technique for the identification of autoregressive models, see Broemeling (1985).Corrections were asserted to the derivation of Diaz and Farah by Daif et al. (2003).These corrections enabled them to study the numerical effectiveness of the technique and to compare its numerical effectiveness with the indirect technique developed by Broemeling and Shaarawy (1988).
It is worth mentioning that, the direct analytical technique was extended to seasonal autoregressive models by Shaarawy and Ali (2003).Moreover, it has been extended to moving average models by Shaarawy et al. (2007), to mixed ARMA models by Ali (2009) and to seasonal moving average models by Shaarawy et al. (2011).In addition, the problem of identifying a bivariate autoregressive process using the direct Bayesian technique has been developed by Shaarawy et al. (2006).Furthermore, the direct technique was used to identify multivariate autoregressive and moving average models by Shaarawy and Ali (2008) and (2012) respectively.
On the other hand, an indirect Bayesian analytical identification technique, asserted by Broemeling and Sharaawy (1987), has been used to conduct a complete Bayesian analysis for ARMA models by Broemeling and Sharaawy (1988).Then, Shaarawy (1993) has used the technique to develope a complete Bayesian analysis for multivariate ARMA models.Alshawadfi (2004) has applied the technique to the transfer function models.Moreover, Alshawadfi (2008) has used the technique to identify ARMAX models.
It is worth noting that Shaarawy et al. (2007) and Shaarawy and Ali (2012) have used the indirect technique as an intermediate step in conducting the direct technique to identify both univariate and multivariate moving average models.They have compared its performance with the direct technique in both cases.In all the above mentioned studies, either natural-conjugate prior or Jeffreys' prior or both were used in theoretical derivations, whereas, Jeffreys' prior was used only in the simulation studies.None of the above mentioned studies has used the g prior in the theoretical derivations or in the simulation studies.Moreover, none of these studies checked the numerical effectiveness of the performance of the natural-conjugate prior in solving the identification problem.
Regarding the prior selection process, El Zayat (2007) conducted a comprehensive survey for known informative and non informative priors.She employed a comprehensive simulation study to check the numerical effectiveness of the used prior in the estimation problem of autoregressive models of order one.She has used the above mentioned three priors in the simulation.Furthermore, Shaarawy et al. ( 2010) have employed different priors to solve the Bayesian prediction problem of autoregressive processes of order p.They have checked the numerical effectiveness of each prior via simulation.

Autoregressive Models
The autoregressive model of order p, denoted by AR(p) model can be written as (see Box and Jenkins (1970)), Where, B is the backshift operator defined as B r y t = y t-r and The AR(p) model can also be written as follows (see Shaarawy et al. ( 2010)), Where yt is the t th time series observation, t= 1,2,…,n.i  's are the coefficients, i=1,2,…,p.εt is the t th unobserved random error, t= 1,2,…,n.It is assumed to be independent identically normally distributed N(0,τ -1 ), where τ is the precision parameter.
The stationarity conditions of the model (3.1) are such that the roots of     lie outside the unit circle (Box and Jenkins (1970)).

Special cases of AR(p) model are AR(1) which has the form
The AR(1) model is stationary if , t= 1,2,…,n. (3.4) The AR(2) model is stationary if , and

Basic Concepts
Derivations of posterior densities depend on the likelihood function and the form of the prior distribution which are both defined hereafter.The likelihood function for the AR(p) model is conditioned on the first p observations in the time series and can be written in the form (see Broemeling (1985)) Where, τ is the precision parameter, Y = [yp+1 yp+2 … yn ]' is the vector of observations such that, n > p, γ is the coefficients' vector defined as, ] ...
X is an (n-p)×p matrix of regressors such that the t th row vector Xt is represented by, Regarding g prior, it is important for analytical purposes to consider the form introduced by Zellner (1986).Consider the following GLM: Where, Y is a vector of n observations, X is an n×k non-stochastic design matrix of rank k, β is the k×1 coefficients' vector, and U is the errors' vector following the N(0, σ 2 I n ) distribution, where σ 2 is unknown.The g prior for β and σ has the form: Where Such that,  is an anticipated value of β and g is an initially given value.Fernàndez et al (2001) concluded, based on simulation studies, that the most reasonable choices of the value of g are It should be noted that when the sample under consideration follows a normal distribution, the normal-gamma prior is the natural-conjugate one and is written as follows (See Raiffa and Schlaifer (1961)): Where α, β, µ and V are the hyper-parameters of the prior distribution.
In addition to the above two priors, Jeffreys' prior introduced by Jeffreys (1961) is given by: Furthermore, in the direct Bayesian identification, the marginal prior mass function of the order p is assumed to be uniform such that, (See Daif et al ( 2003)) Where, m is the assumed maximum order.

Bayesian Identification of AR Models
Two main approaches are introduced in the literature to develop the Bayesian identification.The two approaches are called the direct and the indirect approaches.In this section, the Bayesian identification techniques are presented for pure AR models.The basic theoretical contribution of this study is the development of the above mentioned Bayesian identification techniques using the g prior which haven't been used before in employing the Bayesian identification of autoregressive processes.The section is divided into two subsections.In subsection (5.1), the direct Bayesian identification technique is developed for AR processes, whereas, subsection (5.2) is devoted to the indirect Bayesian technique.

Direct Bayesian Identification of AR Models
The direct identification technique, introduced by Diaz Where, Y = [ yp+1 , ..., yn]', γ is the coefficients vector defined in (4.2), τ is the precision parameter and Where, X is the matrix of regressors with t th row Xt defined in (4.3).

Direct Bayesian Identification using G Prior
Let Sn = [y1, y2 ,… ,yn] be as defined above and γ be the coefficients' vector defined in (4.2).Furthermore, assume the prior of the parameters γ and σ is a g prior in the form (4.7) and the prior of the order p is uniform in the form (4.10).Thus the joint g prior of the AR model parameters has the form Where, σ > 0, p =1, …,m and A1 is the matrix defined in (5.2).
The conditional likelihood function (5.1) can be written in terms of σ instead of τ as follows Where, A1 ,B1 and C are defined in (5.2).The following theorem asserts the posterior probability mass function of the order p of autoregressive processes using g prior.

Theorem 5.1
Using the joint g prior (5.3) and the conditional likelihood function (5.4), the marginal posterior probability mass function of the order p of AR model has the form Where, Such that, g is defined in (4.6),  is an anticipated value of γ and A1, B1 and C are defined in (5.2).

Proof
Combining the g prior (5.3) with the conditional likelihood function in (5.4), the joint posterior distribution of the model parameters has the form Expanding the quadratic term in the exponent and reorganizing the terms, the exponent will be in the form and * C be as defined in (5.6).Then, Completing the squares in the exponent of (5.8), it will be in the form, Integrating (5.7) with respect to γ, one gets the joint posterior distribution of p and σ in the form Note that (5.9) as a function of σ is an inverted gamma density of second type, when integrated with respect to σ, one gets the marginal posterior mass function of p as given in (5.5).

Direct Identification using Natural-Conjugate and Jeffreys' Priors
Using the natural-conjugate prior (4. Where, Where, A1, B1 and C are defined in (5.2).
On the other hand, using Jeffreys' prior defined in (4.9), Daif et al. (2003) have also proved that the marginal posterior probability mass function of the order p of AR processes has the form , p=l,2,...,m (5.12) Where, A1, B1 and C are defined in (5.2).

Indirect Bayesian Identification of AR Models
The indirect technique proposed by Broemeling and Shaarawy (1987) focuses on the marginal posterior distribution of the coefficients of the ARMA model.The technique assumes that the orders p and q are unknown constants with known maximums.Using some approximation to overcome the difficulty in developing the exact Bayesian analysis of ARMA models, the technique derives the approximate marginal posterior density of the maximum number of coefficients which is the multivariate t distribution.After that, a sequence of tests of significance is followed to eliminate the insignificant terms.By this way the model orders p and q are determined.This section employs the indirect technique for pure AR models, which is a special case of the work of Broemeling and Shaarawy (1987).The work of Broemeling and Shaarawy (1987) used both the natural-conjugate and Jeffreys' prior.In the current work we conduct the derivations using g prior in addition.
The technique is employed to AR models in two steps.First, given the highest model order, the marginal posterior distribution of the model's coefficients is derived.Then, a sequence of tests of significance for the coefficients is followed to determine the order p.

Indirect Bayesian Identification using G Prior
Let m (m < n) be the known maximum of p. Suppose that there is a time series with n observations Sn=[y1 y2 … yn] with unknown initial values y0, y-1 ,… ,y1-m be the vector of coefficients.Moreover, assume that the prior of the model parameters γ and σ is a g prior defined in (4.6), that can be written for AR(m) models in the form Where, X is the matrix of regressors of the model with t th row Xt in the form The conditional likelihood function, conditioned on the first m observations, can be written in matrix notation as, (5.15) Where, A1, B1 and C are defined in (5.2) and Y = [ym+1 ym+2 … yn ]' is the vector of observations and X is the matrix of regressors with t th row Xt defined in (5.14).
The following theorem asserts the form of the marginal posterior density of the coefficients of AR(m) model using g prior.

Theorem 5.2
Using the g prior (5.13) and the conditional likelihood function (5.15), the marginal posterior density of the maximum number of coefficients of AR model is a multivariate-t distribution with (n-m) degrees of freedom, location vector 16) and precision matrix Where, A1, B1 and C were defined in (5.2).

Proof
Multiplying the g prior (5.13) by the conditional likelihood function (5.15) one gets the posterior distribution of the model parameters in the form Expanding the quadratic term in the exponent of (5.19) and reorganizing the terms, the exponent will be in the form and 

3
C be as defined in (5.18).Then, the joint posterior distribution in (5.19) will be in the form (5.20) Completing the squares in the exponent of (5.20), the exponent becomes in the form Note that (5.20) as a function of σ is an inverted gamma density of second type, when integrated with respect to σ, one gets the marginal posterior distribution of the coefficients' vector γ (m) which is an m-dimensional multivariate-t distribution with n-m degrees of freedom, location vector and precision matrix as given in (5.16) and (5.17) respectively.

Indirect Identification using Natural-Conjugate and Jeffreys' Priors
Under the same assumptions of section (5.2.1) and using the natural-conjugate prior (4.7),Daif et al. (2003) have proved that the marginal posterior distribution of the maximum number of coefficients of AR model is an m-dimensional multivariate-t distribution with (n+2α-m) degrees of freedom, location vector C are defined in (5.11).
On the other hand, using Jeffreys' prior defined in (4.8), Daif et al. (2003) have also proved that the marginal posterior distribution of the maximum number of coefficients of AR model is an m-dimensional multivariate-t distribution with (n-2m) degrees of freedom, location vector Where, A1, B1 and C are defined in (5.2).

The Indirect Bayesian Approach
Since γ (m) has a multivariate-t distribution, any single component of this vector has a univariate-t distribution and the conditional distribution of any component given any other component is a univariate-t distribution.Then one can do a backward elimination procedure to identify an initial value for the order p as follows: 1.

4.
The procedure is continued in this fashion until the hypothesis φp = 0 is rejected for some p where 0 < p ≤ m.The value p is the indirect Bayesian solution to the identification problem.

Effectiveness Studies
The main objective of the current simulation studies is to investigate the effect of the selected prior on the used Bayesian identification technique for AR(p) models.The studies compare the performance of the three priors defined in previous chapters using four different AR models and different time series lengths.This section presents the main stages of the proposed simulation studies.The effectiveness criterion used in the study to evaluate and compare the performance of the above mentioned two Bayesian identification techniques, using each of the above mentioned three priors, is the percentage of correct identification.Both the setup and steps of the simulation studies will be explained in details.Finally the results will be displayed and discussed.

Simulation design
This study aims at assessing the effect of the selected prior on the performance and numerical effectiveness of the proposed techniques in identifying the order of AR processes.In order to achieve this goal, four simulation studies were conducted.The indirect and direct techniques were employed, using the three priors of interest, to identify the orders of AR(1) and AR(2) sources with different parameter values.The parameters were chosen in some cases inside the stationarity domain while in other cases near the boundaries.All computations were performed using Matlab 7.1.
For estimating the hyperparameters of the informative priors, there are many methods available in the literature.The current study uses the training sample method to estimate the hyperparameters (See Shaarawy et al. (2010)).For each time series, a training sample of size either 10 (for short time series) or 10% of the considered time series observations is used in this consequence.For instance, if the time series length is smaller than or equal to 100 the training sample starts from y1 to y10, while if the length is 200 the training sample starts from y1 to y20 and so on.Then, the estimated parameters are used to conduct the simulation studies.
The effectiveness criterion, mentioned above, is observed with respect to the time series length as well as the coefficients values.In all simulation studies the precision parameter τ was set to be 2.
These random variates are used to generate 500 observations from AR(1) model with coefficient . The first 200 observations are ignored to overcome initialization effect.Then the next 300 observations are used in the simulation study.The second step is done for the first 30 observations of the realization.A certain maximum order m is assumed to be 4.The direct technique is used to identify the time series.It is applied three times using the three mentioned priors.Moreover, the indirect technique is also applied three times.The third step is to repeat the second step for the first 50 observations including the first 30 observations.The same process is repeated for the first 100 observations of the realization including the first 50 observations.Then, the process is repeated for the first 200 observations, and finally for the 300 observations of the realization.The generation and identification processes mentioned above are repeated 500 times and the percentage of correct identification is computed for each technique in each case.Simulation II is done in a similar manner using 8 .0 1   , while simulations III and IV are done similarly using AR(2) models with two different parameter sets (0.9,-0.3) and (0.1,0.8) respectively.The results of the four simulation studies are presented and discussed in the following section.

Simulation Results
This section presents and explains the results of the four simulation studies.After the generation of 500 series from a specific AR(1) or AR(2) source, both the direct and indirect Bayesian identification techniques are performed.
With respect to the indirect technique, the parameters of the posterior density of the model's coefficients with a maximum order m = 4 are calculated.Then, by constructing a series of HPD regions for the coefficients with 5% level of significance, a sequence of back elimination significance tests is performed to select the order of the model.While for the direct technique, the marginal posterior mass function of p =1,2,... ,m is calculated for each series assuming m = 4 with different prior functions for the model parameters.The AR model with the greatest probability is selected as the identified model.For both the indirect and direct techniques, the percentage of correct identification is computed.
The results of each simulation study are presented in one of the following four tables.The rows of each table present the time series lengths and the used identification technique.While the columns present the three proposed priors.Source: simulated data Table (6.3)shows the results of simulation III.From the table, one finds that the percentages of the correct identification are low for short time series (n = 30) in both the direct and indirect techniques whatever be the used prior.However, they increase as the time series length increases.The direct technique gives slightly better results than the indirect one for short time series, but the results become equivalent for longer time series.The differences in the performance of the three priors vanish as the time series gets longer.Source: simulated data Table (6.4)shows the results of simulation IV.From the table, the results are high and stable for all cases.Moreover, the performances of both the two techniques and the three priors are equivalent in all cases.The results are similar to those of simulation I and II.

Conclusion
The general conclusions achieved from the above mentioned four simulation studies are as follows: First: the two Bayesian identification techniques succeed in identifying the right model for AR sources.
Second: there is no remarkable difference in the goodness of the identification technique, when using different priors in its derivation.
Finally: we can recommend the use of the non-informative Jeffreys' prior to solve the Bayesian identification problem since it overcomes the problem of estimating the hyperparameters of the informative priors and its goodness is equivalent to informative priors.

6 )
Using the forms (4.4), (4.5) and (4.6),Shaarawy et al. (2010) have derived the form of the g prior for AR(p) models as 8),Daif et al. (2003) have proved that the marginal posterior probability mass function of the order p of autoregressive processes has the form Conditioning on the first p observations, the conditional likelihood function has the form and Farah(1981)for AR(p) models, assumes that the order p of an autoregressive process is an unknown random variable with known maximum, namely m.The problem is to find the posterior probability mass function of p. Assume that one has a time series of n observations Sn= [ y1 , y2,..., yn].

Table ( 6.1): The percentages of correct identification for the direct and indirect techniques using AR(1) model with
Table(6.1)shows the results of simulation I. From the table, one observes that the percentages of the correct identification are high for both identification techniques for all time series lengths and priors.This means that the two techniques succeed in identifying an appropriate model for the generated time series no matter what are the time series length and the chosen prior.Moreover, there is no remarkable difference in the effect of the three considered priors on the performance of the two identification techniques at each time series length.

Table ( 6.2): The percentages of correct identification for the direct and indirect techniques using AR(1) model with
Table(6.2)shows the results of simulation II.From the table, one observes that the results are similar to the results of simulation I.