Diagnostics with Spatial Regression

When you run a spatial regression analysis in SpaceStat, a variety of diagnostic tests will be calculated along with your model results (Anselin 1988a,b). For ordinary least squares regression these deal with non-normality, heteroskedasticity, and spatial dependence. For spatial lag and spatial error regression these deal with heteroskedasticity and spatial dependence. For a spatial lag model the spatial dependence tests look for remaining spatial error tendencies, and for a spatial error model they look for remaining spatial lag tendencies.

Most of these tests are formulated as Lagrange multiplier tests (Aitchison and Silvey 1960): the evidence for heteroskedasticity or spatial dependence in a "full" model is evaluated using the slope and curvature of the log-likelihood at the optimum of the current, restricted ("non-full") model. In the case of spatial regression, the slope and curvature refer to the score and the Fisher information (via the Hessian) of the appropriate log-likelihood function. Lagrange multiplier tests are known to be asymptotically equivalent to likelihood ratio tests (Silvey 1959).
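As a compact reminder of what "slope and curvature" mean here, the generic forms of the two statistics can be written as below. The notation is an assumption introduced for illustration and does not come from SpaceStat: \ell is the log-likelihood of the full model, \tilde\theta the estimate under the restricted model, \hat\theta the unrestricted estimate, s the score and I the Fisher information.

```latex
\[
LM = s(\tilde\theta)^{\top}\, I(\tilde\theta)^{-1}\, s(\tilde\theta),
\qquad
s(\theta) = \frac{\partial \ell(\theta)}{\partial \theta},
\qquad
I(\theta) = -\,\mathrm{E}\!\left[
  \frac{\partial^{2} \ell(\theta)}{\partial \theta\, \partial \theta^{\top}}
\right]
\]
\[
LR = 2\left[\ell(\hat\theta) - \ell(\tilde\theta)\right]
\]
```

Under the null hypothesis both statistics are asymptotically chi-squared, with degrees of freedom equal to the number of restrictions being tested. The practical appeal of the Lagrange multiplier form is that only the restricted model has to be estimated.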

Ordinary least squares regression

The non-normality test carried out is the Jarque-Bera test (Kiefer and Salmon 1983), which examines the OLS residuals for skewness and kurtosis.
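The following is a minimal sketch of the Jarque-Bera statistic in its textbook form, assuming the residuals are available as a plain sequence of numbers. The function name `jarque_bera` is hypothetical and this is not SpaceStat's own code.

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, p-value) for a sequence of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of order 2, 3 and 4 (dividing by n, not n - 1).
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    # Skewness should be near 0 and kurtosis near 3 for normal residuals.
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-squared variable with 2 degrees of freedom,
    # P(X > x) = exp(-x / 2), so no statistics library is needed.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value
```

A small p-value suggests that the residuals depart from normality through asymmetry, heavy or light tails, or both.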

Heteroskedasticity is tested using the Breusch-Pagan test (Breusch and Pagan 1979), which is derived from Lagrange multiplier arguments and can be expressed as a chi-squared test whose statistic is half the model (explained) sum of squares of an auxiliary regression. The dependent variable of this auxiliary regression is the standardized square of the OLS residuals: each squared residual is divided by the OLS error variance and 1.0 is then subtracted. The independent variables are the squares of the original OLS independent variables, which form the auxiliary regression matrix. This specification is also referred to as the random coefficients model (Hildreth and Houck 1968). The degrees of freedom in the chi-squared test are equal to those of the regression model minus one.
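The auxiliary regression described above can be sketched as follows. This is an illustration under stated assumptions, not SpaceStat's implementation: `breusch_pagan` is a hypothetical name, `X` is assumed to carry a leading column of ones, and the error variance is taken as the residual sum of squares divided by the number of observations.

```python
import numpy as np

def breusch_pagan(y, X):
    """Breusch-Pagan statistic with squared regressors as heteroskedastic variables.

    y : (n,) dependent variable
    X : (n, k) regressors, first column assumed to be the constant
    Returns (statistic, degrees of freedom).
    """
    n = len(y)
    # Original OLS fit and residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    sigma2 = e @ e / n
    # Auxiliary dependent variable: standardized squared residual minus 1.
    f = e ** 2 / sigma2 - 1.0
    # Auxiliary regressors: a constant plus the squares of the original variables.
    Z = np.column_stack([np.ones(n), X[:, 1:] ** 2])
    gamma, *_ = np.linalg.lstsq(Z, f, rcond=None)
    fitted = Z @ gamma
    # f has mean zero and Z contains a constant, so the explained
    # sum of squares is simply the sum of squared fitted values.
    explained_ss = fitted @ fitted
    return 0.5 * explained_ss, Z.shape[1] - 1
```

The statistic is compared with a chi-squared distribution with the returned degrees of freedom. Large values indicate that the error variance changes with the regressors.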

In the present version of SpaceStat, the user does not have the option of specifying the heteroskedastic variables.

For situations where there is considerable non-normality, the Breusch-Pagan test is less reliable than the Koenker-Bassett test (Koenker and Bassett 1982), which normalizes the Breusch-Pagan statistic by half the mean of the squares of the Breusch-Pagan dependent variable. This is reported in SpaceStat along with the Breusch-Pagan statistic. A further test of heteroskedasticity is the White test (White 1980), in which the square of the OLS residuals is regressed against all cross-products (including squares) of the original regression variables. The White statistic is the R squared from this regression multiplied by the number of observations, and the number of degrees of freedom for the chi-squared test is the number of new regression variables in the White test.
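Both statistics can be sketched in a few lines, again as an illustration rather than SpaceStat's code. The names `koenker_bassett` and `white_test` are hypothetical, `X` is assumed to include a leading column of ones, and the sketch assumes that no two cross-products coincide (which can happen with dummy variables and would require dropping the duplicates).

```python
import numpy as np
from itertools import combinations_with_replacement

def _ols_residuals(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def _fitted(Z, v):
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return Z @ coef

def koenker_bassett(y, X):
    """Breusch-Pagan statistic divided by half the mean of f**2."""
    n = len(y)
    e = _ols_residuals(y, X)
    f = e ** 2 / (e @ e / n) - 1.0          # Breusch-Pagan dependent variable
    Z = np.column_stack([np.ones(n), X[:, 1:] ** 2])
    fit = _fitted(Z, f)
    bp = 0.5 * (fit @ fit)
    return bp / (0.5 * np.mean(f ** 2)), Z.shape[1] - 1

def white_test(y, X):
    """n * R^2 from regressing e**2 on all cross-products of the columns of X."""
    n, k = X.shape
    e = _ols_residuals(y, X)
    u = e ** 2
    # Products of every pair of columns (i <= j). Because column 0 is the
    # constant, this yields the constant, the levels, squares and cross terms.
    pairs = combinations_with_replacement(range(k), 2)
    Z = np.column_stack([X[:, i] * X[:, j] for i, j in pairs])
    fit = _fitted(Z, u)
    r2 = 1.0 - np.sum((u - fit) ** 2) / np.sum((u - u.mean()) ** 2)
    return n * r2, Z.shape[1] - 1
```

Since the Breusch-Pagan dependent variable has mean zero, the Koenker-Bassett statistic computed this way equals the number of observations times the R squared of the auxiliary regression, so both statistics are bounded above by the number of observations.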

The tests against spatial dependence in OLS are the following. First, a Moran’s I test is carried out on the residuals using the correct asymptotic expression for the variance (Cliff and Ord 1973). Next, Lagrange multiplier tests for spatial dependence (Anselin et al. 1996) are reported against (a) spatial error alone (Burridge 1980), (b) spatial error with spatial lag present (Robust LM error), (c) spatial lag alone, and (d) spatial lag in the presence of spatial error (Robust LM lag). Also given is a test for both spatial lag and spatial error (LM SARMA).
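For orientation, the statistics listed above can be sketched from the OLS residuals and a spatial weights matrix, using the formulas as they are commonly stated following Anselin et al. (1996). This is a sketch under assumptions, not SpaceStat's implementation: the function name and the dense matrix `W` are hypothetical, the error variance is the residual sum of squares over n, and the variance of Moran's I (needed for its significance test) is not computed.

```python
import numpy as np

def spatial_lm_diagnostics(y, X, W):
    """Moran's I and LM statistics for OLS residuals.

    y : (n,) dependent variable
    X : (n, k) regressors including a constant column
    W : (n, n) dense spatial weights matrix
    """
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    sigma2 = e @ e / n

    # Moran's I of the residuals (statistic only).
    moran_i = (n / W.sum()) * (e @ W @ e) / (e @ e)

    # Ingredients shared by the LM tests.
    T = np.trace(W.T @ W + W @ W)
    d_err = (e @ W @ e) / sigma2      # score term for spatial error
    d_lag = (e @ W @ y) / sigma2      # score term for spatial lag
    wxb = W @ X @ beta
    # Project WXb off the column space of X, i.e. apply M = I - X(X'X)^-1 X'.
    g, *_ = np.linalg.lstsq(X, wxb, rcond=None)
    m_wxb = wxb - X @ g
    nJ = T + (m_wxb @ m_wxb) / sigma2

    lm_error = d_err ** 2 / T
    lm_lag = d_lag ** 2 / nJ
    robust_lm_error = (d_err - T * d_lag / nJ) ** 2 / (T * (1.0 - T / nJ))
    robust_lm_lag = (d_lag - d_err) ** 2 / (nJ - T)
    lm_sarma = robust_lm_lag + lm_error
    return {
        "moran_i": moran_i,
        "lm_error": lm_error,
        "lm_lag": lm_lag,
        "robust_lm_error": robust_lm_error,
        "robust_lm_lag": robust_lm_lag,
        "lm_sarma": lm_sarma,
    }
```

In this formulation the single-parameter tests are each compared with a chi-squared distribution with one degree of freedom and LM SARMA with two. A useful consistency check is that LM SARMA equals LM error plus Robust LM lag, and also LM lag plus Robust LM error.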

Spatial lag regression and spatial error regression

Again, the Breusch-Pagan test is used to examine the standardized square of the OLS residuals regressed against the squares of the original independent variables. Also given are the spatial versions of this test, in which the Hessian entering the Lagrange multiplier analysis is adjusted for the residual couplings of the auxiliary regression variable matrix (from the random coefficients model) to deviations in the spatial lag (error) parameter terms (Anselin 1988a,b).

Next, a likelihood ratio test is given for the spatial lag or spatial error model in question. Twice the difference in the calculated log-likelihood between the model in question and the OLS version of the model is used as the statistic for a chi-squared test with the number of degrees of freedom of the regression model minus one.
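A minimal sketch of the likelihood ratio statistic is given below, using the usual convention of twice the log-likelihood difference and the Gaussian log-likelihood of OLS at its maximum. The function names are hypothetical, and the log-likelihood of the spatial model is assumed to be supplied by whatever routine estimated it.

```python
import math

def ols_log_likelihood(residuals):
    """Gaussian log-likelihood of an OLS fit, evaluated at its maximum.

    Uses the maximum likelihood variance estimate (sum of squares / n).
    """
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)

def likelihood_ratio(loglik_spatial, loglik_ols):
    """Twice the gain in log-likelihood of the spatial model over OLS."""
    return 2.0 * (loglik_spatial - loglik_ols)
```

For example, `likelihood_ratio(ll_spatial, ols_log_likelihood(e))` gives the statistic that is then compared with the chi-squared distribution; a large value favours the spatial model over its OLS counterpart.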

Finally, the Lagrange multiplier tests for spatial dependence are given. These tests take the optimized spatial lag (error) model and examine remaining tendencies towards spatial error (lag) dependence. In each case the Hessians have to be corrected for residual coupling of the error (lag) terms to the parameter variances in the spatial lag (error) model (Anselin 1988a,b).