A New Heteroskedastic Consistent Covariance Matrix Estimator using Deviance Measure

In this article we propose a new heteroskedasticity-consistent covariance matrix estimator, HC6, based on a deviance measure. We study the finite-sample behavior of the associated test and compare it with that of other estimators of this kind, HC1, HC3 and HC4m, which are used when leverage observations are present. A simulation study is conducted to examine the effect of various levels of heteroskedasticity on the size and power of the quasi-t test with HC estimators. The results show that the test statistic based on the proposed estimator has a better asymptotic approximation and less size distortion than the other estimators for small sample sizes when a high level of heteroskedasticity is present in the data.


Introduction
In regression analysis, heteroskedasticity in the data makes the ordinary least squares (OLS) estimator inefficient. In this situation the usual estimate of the covariance matrix of the OLS estimates becomes biased and is no longer consistent, so inference based on it is not asymptotically valid. The problem becomes more severe as the level of heteroskedasticity increases. It is nevertheless very common among practitioners to use the point estimates computed by OLS even when they suspect heteroskedasticity in the data. To perform inference about the parameters of the regression model, however, it is important to use a heteroskedasticity-consistent estimate of the covariance matrix.
Several authors have suggested covariance matrix estimators that are consistent under both homoskedastic and heteroskedastic error variances. The most commonly used heteroskedasticity-consistent covariance matrix estimator (HCCME) was suggested by White (1980) and is known as HC0. This estimator is widely used in the literature, but various studies have shown that HC0 can be severely biased in small samples. It tends to underestimate the true variance, which in turn results in poor performance of the associated quasi-t statistic; see, e.g., MacKinnon and White (1985), Cribari-Neto and Zarkos (1999) and Cribari-Neto and Zarkos (2001). MacKinnon and White (1985) suggested alternative HCCMEs called HC1 and HC2. Later, Davidson and MacKinnon (1993) suggested another alternative, HC3, which is an approximation of the jackknife estimator. The simulation results in Long and Ervin (2000) showed that HC3 performed best among the estimators then available. Cribari-Neto and Zarkos (2001) showed that the presence of high leverage observations is particularly critical for HCCMEs. So Cribari-Neto (2007) proposed a new version of the HCCME called HC4. Their numerical results showed that inference about the regression parameters using HC4 is more reliable, but the estimator showed a large amount of bias. Later, Cribari-Neto et al. (2007) and Cribari-Neto and da Silva (2011) suggested two new versions of the HCCME, denoted HC5 and HC4m, which have smaller bias than HC4. In this article we propose a new estimator, called HC6. It performs well in small samples, especially when the level of heteroskedasticity is high. The simulation results show that the quasi-t test for the regression parameters based on the new estimator is better approximated by its asymptotic distribution when heteroskedasticity is high and the sample size is small.
The rest of the paper is organized as follows. We introduce the model and the covariance matrix estimators in Section 2. In Section 3, we propose a new HCCME, HC6, based on a measure of deviance. The results and discussion are reported in Section 4. In Section 5, we study an application to real data. Concluding remarks are given in Section 6.

The Model and Estimators
The regression model considered is

$$Y = X\beta + \varepsilon,$$

where $X$ is the $n \times k$ matrix of independent variables, $Y$ is the $n \times 1$ vector of the dependent variable, $\varepsilon$ is the $n \times 1$ column vector of error terms and $\beta = (\beta_1, \ldots, \beta_k)'$ is the vector of parameters to be estimated. We assume that $\varepsilon_i \sim (0, \sigma_i^2)$, that is, each error has mean zero and its own variance $\sigma_i^2$. The OLS estimator is $\hat\beta = (X'X)^{-1}X'Y$, and $e = Y - X\hat\beta$ is the vector of OLS residuals. The commonly used HCCME called HC0, due to Eicker (1963) and White (1980), is given as

$$\mathrm{HC0} = (X'X)^{-1} X' \hat\Omega X (X'X)^{-1}, \qquad \hat\Omega = \mathrm{diag}(e_1^2, \ldots, e_n^2).$$

White (1980) proposed this estimator to resolve the problem of estimation and inference in the presence of heteroskedasticity. Various studies have shown it to be consistent when nothing is known about the form of heteroskedasticity; see, e.g., Arce and Mora (2002). As discussed in Section 1, HC0 can be seriously biased in small samples. Some alternatives to the estimator of White (1980) are available in the literature. These estimators were proposed to control the tendency to underestimate the variance of the OLS estimates. They are consistent under heteroskedasticity and incorporate small-sample adjustment factors; see, e.g., Cribari-Neto and Zarkos (1999), Cribari-Neto and Zarkos (2004) and Davidson and MacKinnon (1993). None of them, however, works well in the scenario discussed in this paper, i.e. a small sample size with a high level of heteroskedasticity.
According to MacKinnon and White (1985), HC0 does not take into account the well-known fact that the OLS residuals tend to be too small. They used a modified version of HC0 obtained through a degrees-of-freedom correction similar to the one conventionally used to obtain an unbiased estimate of the variance $\sigma^2$. This yields the modified estimator HC1, suggested by Hinkley (1977), defined as

$$\mathrm{HC1} = (X'X)^{-1} X' \hat\Omega E_1 X (X'X)^{-1}, \qquad E_1 = \frac{n}{n-k}\, I_n,$$

where $\hat\Omega = \mathrm{diag}(e_1^2, \ldots, e_n^2)$ holds the squared OLS residuals, $n/(n-k)$ is called the finite-sample correction factor, $k$ denotes the number of parameters and $I_n$ is the $n \times n$ identity matrix. According to them, however, the degrees-of-freedom adjustment in HC1 is not the only way to compensate for the fact that the OLS residuals tend to underestimate the true errors. Following Horn et al. (1975), they proposed another estimator, called HC2, defined as

$$\mathrm{HC2} = (X'X)^{-1} X' \hat\Omega E_2 X (X'X)^{-1}, \qquad E_2 = \mathrm{diag}\left\{ \frac{1}{1-h_{ii}} \right\},$$

where $h_{ii}$ is the $i$th diagonal element of the hat matrix $H = X(X'X)^{-1}X'$, i.e. the leverage of the $i$th observation. The HC3 estimator given by Davidson and MacKinnon (1993) can be written as

$$\mathrm{HC3} = (X'X)^{-1} X' \hat\Omega E_3 X (X'X)^{-1}, \qquad E_3 = \mathrm{diag}\left\{ \frac{1}{(1-h_{ii})^2} \right\}.$$

The estimators HC2 and HC3 include finite-sample correction factors based on the leverages of the individual observations: the greater the leverage, the more the corresponding squared residual is inflated; see, e.g., Cribari-Neto and da Silva (2011). The resulting quasi-t tests tend to be quite liberal when the design matrix includes high leverage observations, leading to imprecise inference. Cribari-Neto and Zarkos (2004) therefore proposed a new estimator, denoted HC4, that takes into account the impact of high leverage points on the finite-sample behavior of the covariance matrix estimator. The HC4 estimator is given as

$$\mathrm{HC4} = (X'X)^{-1} X' \hat\Omega E_4 X (X'X)^{-1}, \qquad E_4 = \mathrm{diag}\left\{ \frac{1}{(1-h_{ii})^{\delta_i}} \right\}, \qquad \delta_i = \min\left\{ 4, \frac{h_{ii}}{\bar h} \right\}.$$

The exponent $\delta_i$ controls the level of discounting for the $i$th observation and is given by the ratio between $h_{ii}$ and the average leverage $\bar h = n^{-1}\sum_{i=1}^{n} h_{ii} = k/n$, truncated at 4. Hence the $i$th squared residual is more strongly inflated when $h_{ii}$ is large relative to $\bar h$. HC4 aims at discounting for leverage points more heavily than HC2 and HC3.
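Cribari-Neto and da Silva (2011) showed that the asymptotic approximation for HC4 is very poor, and they suggested a modified version of HC4, denoted HC4m, given as

$$\mathrm{HC4m} = (X'X)^{-1} X' \hat\Omega E_{4m} X (X'X)^{-1}, \qquad E_{4m} = \mathrm{diag}\left\{ \frac{1}{(1-h_{ii})^{\delta_i}} \right\}, \qquad \delta_i = \min\left\{ \gamma_1, \frac{n h_{ii}}{k} \right\} + \min\left\{ \gamma_2, \frac{n h_{ii}}{k} \right\},$$

where $\hat\Omega = \mathrm{diag}(e_1^2, \ldots, e_n^2)$ contains the squared OLS residuals, $h_{ii}$ is the leverage of the $i$th observation, $n$ is the sample size and $k$ is the number of parameters. The values of $\gamma_1$ and $\gamma_2$ are selected so as to reduce the effect of leverage observations. The values suggested by Cribari-Neto and da Silva (2011) are $\gamma_1 = 1.0$ and $\gamma_2 = 1.5$, and the same values are used in our simulations.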
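The estimators reviewed in this section differ only in how each squared OLS residual is weighted, so they can be computed by one routine. The following is a minimal NumPy sketch, not the code used in the paper: it assumes the weights commonly given in the literature for HC0 to HC4m, and the function name `hc_covariances` and its arguments are illustrative.

```python
import numpy as np


def hc_covariances(X, y, gamma1=1.0, gamma2=1.5):
    """Return the OLS coefficients and a dict of HC covariance estimates.

    Illustrative sketch: each estimator is the sandwich
    (X'X)^{-1} X' diag(w_i * e_i^2) X (X'X)^{-1} with its own weights w_i.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                                  # OLS residuals
    # Leverages h_ii: diagonal of the hat matrix X (X'X)^{-1} X'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    ratio = n * h / k                                 # h_ii / h_bar, h_bar = k / n

    weights = {
        "HC0": np.ones(n),
        "HC1": np.full(n, n / (n - k)),
        "HC2": 1.0 / (1.0 - h),
        "HC3": 1.0 / (1.0 - h) ** 2,
        "HC4": 1.0 / (1.0 - h) ** np.minimum(4.0, ratio),
        "HC4m": 1.0 / (1.0 - h) ** (np.minimum(gamma1, ratio)
                                    + np.minimum(gamma2, ratio)),
    }
    cov = {}
    for name, w in weights.items():
        meat = (X * (w * e**2)[:, None]).T @ X        # X' diag(w_i e_i^2) X
        cov[name] = XtX_inv @ meat @ XtX_inv
    return beta, cov


if __name__ == "__main__":
    # Hypothetical data, used only to show the call.
    rng = np.random.default_rng(42)
    x = rng.standard_normal(30)
    X = np.column_stack([np.ones(30), x])
    y = 1.0 + 0.5 * x + np.exp(0.5 * x) * rng.standard_normal(30)
    beta, cov = hc_covariances(X, y)
    for name, V in cov.items():
        print(name, np.sqrt(np.diag(V)))              # robust standard errors
```

Because all weights are at least one for HC2 to HC4m, their standard errors are never smaller than those of HC0 on the same data, which is the intended correction for the underestimation discussed above.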

New Estimator
As discussed in Sections 1 and 2, all the alternatives to HC0 take into account only the effect of leverage observations, that is, extreme values in the design matrix $X$. In practice, whenever there are leverage points there may also be influential observations in the $Y$ variable, which affect the estimated variance-covariance matrix of the OLS estimates. All the modified versions of HC0 use the leverage measure to rescale the OLS residuals involved in the sandwich estimator, in order to control the underestimation of the covariance matrix. But the leverage measure does not consider the effect of influential observations. To account for both leverage and influential observations in the estimation of the variance-covariance matrix, we propose a new estimator, denoted HC6, which rescales the squared OLS residuals through a matrix $E_6$ whose diagonal elements involve the deviance measure

$$d_{ii} = \frac{r_i^2}{k} \cdot \frac{h_{ii}}{1-h_{ii}},$$

where $k$ is the number of parameters, $r_i$ is the studentized residual, $h_{ii}$ is the leverage of the $i$th observation and $e_i$ are the OLS residuals. Apart from the factor $k$, $d_{ii}$ is the product of the $i$th squared studentized residual and the factor $h_{ii}/(1-h_{ii})$. Thus $d_{ii}$ is made up of a component that reflects how well the model fits the $i$th observation $y_i$ and a component that measures the distance of the $i$th observation from the rest of the data; see, e.g., Montgomery et al. (2001). The diagonal elements of $E_6$, $d_{ii}$, behave like $h_{ii}$, but unlike $h_{ii}$ they can take values greater than one when leverage and influence occur at the same point. That is why $d_{ii}$ is used to detect influential observations in the data. The most interesting feature of $E_6$ is that it takes into account the effect of outliers or extreme observations in both the x-space and the y-space, so when it is used to rescale the residuals it may improve the covariance matrix estimator even when the x-space has no extreme values and only the y-space has one or more extreme values.
The value of $d_{ii}$ will be larger if there are large values of $X$ or $Y$, or both, in the data, so the $i$th squared residual will be more appropriately weighted when there are outliers of any type in the data.
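The measure $d_{ii}$ described above can be computed in a few lines. The sketch below is illustrative only: it assumes internally studentized residuals, $r_i = e_i / \sqrt{s^2 (1-h_{ii})}$ with $s^2 = e'e/(n-k)$, a convention the text does not spell out. With that choice $d_{ii}$ coincides with the textbook Cook's distance. The function name `deviance_measure` is hypothetical, and the code stops at $d_{ii}$; it does not attempt the HC6 covariance matrix itself.

```python
import numpy as np


def deviance_measure(X, y):
    """Return (d, r, h): the measure d_ii, studentized residuals, leverages.

    Assumption: r_i is the internally studentized residual.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    e = y - X @ (XtX_inv @ X.T @ y)                   # OLS residuals
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)       # leverages h_ii
    s2 = e @ e / (n - k)                              # residual mean square
    r = e / np.sqrt(s2 * (1.0 - h))                   # studentized residuals
    d = (r**2 / k) * h / (1.0 - h)                    # d_ii as described in the text
    return d, r, h


if __name__ == "__main__":
    # Hypothetical data: ten regular points plus one point that is extreme
    # in both the x-space and the y-space.
    x = np.append(np.arange(10.0), 30.0)
    X = np.column_stack([np.ones(x.size), x])
    y = 1.0 + 2.0 * x + 0.1 * (-1.0) ** np.arange(x.size)
    y[-1] += 40.0
    d, r, h = deviance_measure(X, y)
    print("leverages:", np.round(h, 3))
    print("d_ii:     ", np.round(d, 3))
```

In this toy data set the last observation has both the largest leverage and the largest $d_{ii}$, and its $d_{ii}$ exceeds one while every $h_{ii}$ stays below one, which illustrates the contrast between the two measures drawn above.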

Results
To evaluate the performance of the proposed estimator, we compute its relative probability discrepancy (RPD); see, e.g., Chand and Aftab (2012) and Davidson and MacKinnon (1998). The simulation study follows the design given in Cribari-Neto and da Silva (2011). The numerical results stated in this section are obtained using a heteroskedastic regression model whose errors have mean zero and variances that differ across observations. The level of heteroskedasticity is measured by λ and is governed by the parameter α. When λ = 1 there is no heteroskedasticity, and λ > 1 implies that heteroskedasticity is present. Larger values of λ indicate a higher level of heteroskedasticity. When α = 0.26, λ is approximately equal to 60; for α = 1.5, λ is approximately equal to 90; when α = 2, λ is approximately equal to 150; and finally, when α = 2.5, the value of λ is greater than or equal to 190.
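To make the roles of α and λ concrete, here is a small data-generating sketch. It rests on assumptions that the surviving text does not confirm: a simple regression $y_i = \beta_1 + \beta_2 x_i + \sigma_i \varepsilon_i$ with standard normal $\varepsilon_i$, the skedastic function $\sigma_i^2 = \exp(\alpha x_i)$, and λ taken as the ratio of the largest to the smallest error variance. The function name and the covariate values are hypothetical, so the λ values it produces are not expected to match those quoted above.

```python
import numpy as np


def simulate_heteroskedastic(x, alpha, beta=(1.0, 1.0), rng=None):
    """Draw y from an assumed heteroskedastic model and report lambda.

    Assumptions (not taken from the paper's text):
      sigma_i^2 = exp(alpha * x_i)
      lambda    = max(sigma_i^2) / min(sigma_i^2)
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    sigma2 = np.exp(alpha * x)                        # error variances
    eps = rng.standard_normal(x.size)
    y = beta[0] + beta[1] * x + np.sqrt(sigma2) * eps
    lam = sigma2.max() / sigma2.min()                 # degree of heteroskedasticity
    return y, sigma2, lam


if __name__ == "__main__":
    x = np.linspace(0.0, 2.0, 20)                     # hypothetical covariate values
    for alpha in (0.0, 0.5, 1.0, 2.0):
        _, _, lam = simulate_heteroskedastic(x, alpha, rng=np.random.default_rng(0))
        print(f"alpha = {alpha:.1f}  lambda = {lam:.2f}")
```

Under these assumptions α = 0 gives λ = 1 (homoskedasticity), and λ grows with α for a fixed set of covariate values, which matches the qualitative description in the text.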
In this study, we test the hypothesis $H_0: \beta_2 = 0$ against the two-sided alternative $H_1: \beta_2 \neq 0$. The test statistic used is

$$t = \frac{\hat\beta_2}{\sqrt{\widehat{\mathrm{var}}(\hat\beta_2)}},$$

where $\hat\beta_2$ denotes the OLS estimate of $\beta_2$ and $\widehat{\mathrm{var}}(\hat\beta_2)$ is the variance estimate of $\hat\beta_2$ based on the HC1, HC3, HC4m and HC6 estimators. The number of Monte Carlo replications is set to 10,000. All simulations are performed using the R programming language; see R Development Core Team (2011). In this study, we consider only heteroskedastic errors with high leverage and influential observations. Table 1 presents the empirical probabilities of the quasi-t test based on the HCCMEs considered. We study the effect of the level of heteroskedasticity on the approximation of the asymptotic distribution of the quasi-t test for different sample sizes, ranging from small to large. We consider the asymptotic probabilities γ = 0.90, 0.95 and 0.99, corresponding to the 10%, 5% and 1% levels of significance, which are common choices in statistical inference. It can be observed from Table 1 that when there is no heteroskedasticity the approximation for HC1, HC3 and HC4m is better, while HC6 has a relatively poor approximation. The poor approximation of the test using HC6 is due to the use of the deviance measure as the correction matrix, which is specifically intended to deal with influential observations. It is therefore not recommended in the case of homoskedasticity. Next, when heteroskedasticity is present and mild, α = 0.26, the quasi-t test based on HC4m has a good approximation of the asymptotic distribution. In this scenario, the size of the quasi-t test based on the proposed estimator is larger than that implied by the asymptotic distribution. This size distortion is larger when the sample size is small. The test based on HC1 shows the largest discrepancy in size.
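The Monte Carlo procedure for the quasi-t test can be sketched as follows. This is an illustration in Python rather than the R code used in the study, and it makes several assumptions that are not taken from the paper: a simple regression with an intercept, error variances $\exp(\alpha x_i)$, the HC3 estimator as the only variance estimate, and standard normal critical values. All function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def quasi_t_hc3(X, y, j):
    """Quasi-t statistic for H0: beta_j = 0 using the HC3 variance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                                  # OLS residuals
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)       # leverages h_ii
    meat = (X * (e**2 / (1.0 - h) ** 2)[:, None]).T @ X
    cov = XtX_inv @ meat @ XtX_inv                    # HC3 sandwich
    return beta[j] / np.sqrt(cov[j, j])


def rejection_rate(x, alpha, level=0.05, reps=2000, seed=0):
    """Empirical size of the two-sided quasi-t test of H0: beta_2 = 0."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    sd = np.sqrt(np.exp(alpha * x))                   # assumed skedastic function
    crit = NormalDist().inv_cdf(1.0 - level / 2.0)    # asymptotic critical value
    rejections = 0
    for _ in range(reps):
        y = 1.0 + sd * rng.standard_normal(x.size)    # data generated under H0
        if abs(quasi_t_hc3(X, y, 1)) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    x = np.random.default_rng(123).lognormal(size=20)  # hypothetical covariate
    for alpha in (0.0, 1.0, 2.0):
        print(f"alpha = {alpha:.1f}  empirical size = {rejection_rate(x, alpha, reps=500):.3f}")
```

Comparing the empirical size with the nominal level for each α and sample size gives the kind of size comparison reported in Table 1; swapping in another set of residual weights would give the corresponding results for a different HC estimator.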
Moreover, when the level of heteroskedasticity increases and the sample size is small, the test based on HC6 has a better approximation of the asymptotic distribution than the tests based on the other estimators. Figure 1 shows plots of the relative probability discrepancy (RPD) against the asymptotic probability for the quasi-t test based on HC1, HC3, HC4m and HC6. Usually, we are interested in the relative probability discrepancy when γ > 0.8, so we discuss the results specifically for this situation. It can be observed that, for small sample sizes, when the level of heteroskedasticity is low, HC3 and HC4m have the smaller relative probability discrepancy. The situation changes completely when the level of heteroskedasticity is high. The relative probability discrepancy of the test based on HC6 decreases in the presence of a high level of heteroskedasticity, whereas this is precisely the case that adversely affects the asymptotic approximation for HC3 and HC4m. For large sample sizes, the three tests based on HC3, HC4m and HC6 show the same amount of relative probability discrepancy, especially when γ > 0.7. The behavior of HC1 is generally poor in all the scenarios considered.

Application to Real Data
In this section, we apply the quasi-t test to test the significance of a regression coefficient. The test is applied with the proposed estimator HC6, and a comparison is made with HC1, HC3 and HC4m. The data are taken from Greene (1997, p. 541). The regression model fitted to these data is the same model studied by Cribari-Neto et al. (2007). From the results given in Table 2, it can be noticed that the test based on the OLS standard error rejects the null hypothesis even at the 1% level of significance. The test based on HC1 also rejects the null hypothesis, at the 10% level of significance. With the tests based on the other HCCMEs, i.e. HC3, HC4m and HC6, we are unable to reject the null hypothesis even at the 10% level of significance. From these results it can be concluded that the test statistic based on the HC6 estimator gives reliable inference with real data.

Conclusion
In this article we propose a new HC estimator, called HC6, which uses the deviance measure to rescale the residuals. The numerical results suggest that, when the level of heteroskedasticity is very high and the sample size is small, the test based on the HC6 estimator has a better approximation of the asymptotic distribution. We recommend using the proposed estimator instead of other HCCMEs, especially when the sample size is small. We study the estimator only under normally distributed heteroskedastic disturbances. It can also be studied under heteroskedasticity with some non-normal distribution of the error term.