About Geographically Weighted Regression

As described here, aspatial regression methods are a set of tools for assessing variation in one variable (the dependent variable, y) at set levels of another variable or variables (the independent, or x, variables). In contrast, Geographically Weighted Regression (GWR) techniques are forms of spatial data analysis that allow you to evaluate how the relationship between a dependent variable and one or more explanatory variables changes as a function of the location u in space. More specifically, these tools allow you to explore how observations at locations neighboring a focal location influence the values estimated for the coefficients of the independent variables at that focal location.

So, instead of the conventional "stationary" model, for example y = b0 + b1x in the case of linear regression with a single explanatory variable, the geographically weighted regression model is built within local overlapping windows centered on the location u: y(u) = b0(u) + b1(u)x. To attenuate the impact of distant observations in the computation of the local regression, each observation used in the regression receives a weight that is a function of its proximity to the center of the window. Thus, while aspatial regression methods are applied to all of the data in your dataset at once and give you a global measure of the relationship among variables, GWR produces local measures of the importance of various predictors that can be mapped and compared. If there is no variation across space in the best-fit local model, parameter estimates produced by GWR and aspatial regression will be the same. Because of the focus on local relationships, and because GWR results depend strongly on how "local" is defined (i.e., through the choice of neighbor relationships and bandwidths), GWR is often thought of as a tool for exploring patterns in your data and generating hypotheses for further testing, rather than as a tool for testing a priori hypotheses (Fotheringham et al. 2002).
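The local model y(u) = b0(u) + b1(u)x can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SpaceStat's implementation: the Gaussian distance-decay kernel, the fixed bandwidth, and the names gaussian_weight, local_fit and gwr are all hypothetical choices made for the example. For a single explanatory variable, the weighted least-squares solution at each focal location has a simple closed form.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Assumed kernel: weight decays smoothly with distance from the focal point."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(focal, coords, x, y, bandwidth):
    """Weighted least-squares fit of y = b0 + b1*x centered on one location."""
    w = [gaussian_weight(math.dist(focal, c), bandwidth) for c in coords]
    total = sum(w)
    # Weighted means of x and y around the focal location.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted cross-product and sum of squares give the local slope.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def gwr(coords, x, y, bandwidth):
    """One (b0, b1) pair per observation location."""
    return [local_fit(c, coords, x, y, bandwidth) for c in coords]

# Illustrative data on a 6 x 6 grid of points (made up for this sketch).
coords = [(float(i % 6), float(i // 6)) for i in range(36)]
x = [(i * 7 % 11) / 10 for i in range(36)]

# Stationary case: the same relationship holds everywhere.
y_stationary = [2.0 + 3.0 * xi for xi in x]
stationary_fits = gwr(coords, x, y_stationary, bandwidth=1.5)

# Non-stationary case: the slope increases from west to east.
y_varying = [1.0 + (0.5 + 0.2 * c[0]) * xi for c, xi in zip(coords, x)]
varying_fits = gwr(coords, x, y_varying, bandwidth=1.5)
```

In the stationary case every local fit recovers the same intercept and slope (up to floating-point error), which is the situation in which GWR and aspatial regression agree. In the non-stationary case the local slopes differ from one location to the next and could be mapped. Changing the bandwidth changes how quickly distant observations are discounted, and therefore how smooth the resulting coefficient surface is.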

GWR in SpaceStat can be used to predict a dependent variable in terms of continuous and/or categorical independent variables, and to determine the relative importance of various independent variables in predicting y, including the importance of squared and interaction terms. Note that for SpaceStat to recognize a dataset as categorical, it must be a string (alphanumeric) dataset type. Currently this tool is only implemented for point datasets; to use polygon data you will first need to create a centroid geography. Similarly, if you want to be able to use models created in aspatial regression for GWR (or vice-versa) when working with data for a polygon geography, perform your aspatial analyses on the polygon centroids.

The SpaceStat approach to GWR

GWR was pioneered by A. Stewart Fotheringham and Martin Charlton (National Centre for Geocomputation, National University of Ireland), and Chris Brunsdon (University of Glamorgan, UK). Our implementation of this tool is based primarily upon their book on the topic, but we have made some changes (a few are detailed here) that follow from the way in which many in the public health and environmental science fields are likely to use these tools. Our approach to GWR uses a unified framework that includes both geographical weighting and an extra non-geographical weight dataset, which allows for user-supplied knowledge of the relative variances at each source point. One example of this type of weight set is the use of population data as a weight set for mortality rates, which has the effect of assigning higher "confidence" to mortality rates derived from areas with higher populations. To treat both kinds of weighting together, SpaceStat uses a maximum weighted likelihood approach to calculate the regression parameters, parameter variances, parameter R-square, expected y-values, residuals and y-standard errors, as well as the "local model" R-square. This approach boils down to treating geographically weighted regression as a local extension of weighted aspatial regression. As a consequence, GWR can be straightforwardly extended to non-linear regression procedures such as logistic and Poisson regression, with parameter values and parameter variances calculated from a weighted log-likelihood formulation.
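To make the weighted log-likelihood idea concrete, the sketch below fits a local Poisson model at one focal location by maximizing the sum of w_i * [y_i * eta_i - exp(eta_i)], where eta_i = b0 + b1 * x_i. It is a minimal illustration, not SpaceStat's code: the Gaussian kernel, the choice to combine the two weight types by multiplying them, the Newton-Raphson solver, and the name local_poisson_fit and its parameters are all assumptions made for the example.

```python
import math

def local_poisson_fit(focal, coords, x, y, bandwidth, extra_w=None,
                      n_iter=50, tol=1e-10):
    """Maximize a weighted Poisson log-likelihood for log(mu) = b0 + b1*x."""
    n = len(y)
    if extra_w is None:
        extra_w = [1.0] * n
    # Assumed combination rule: geographic kernel weight times the
    # non-geographical (reliability) weight for each observation.
    w = [math.exp(-0.5 * (math.dist(focal, c) / bandwidth) ** 2) * e
         for c, e in zip(coords, extra_w)]
    # Start from an intercept-only model at the weighted mean of y.
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    b0, b1 = math.log(max(y_bar, 1e-8)), 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, xi, yi in zip(w, x, y):
            mu = math.exp(b0 + b1 * xi)
            r = wi * (yi - mu)
            # Gradient of the weighted log-likelihood.
            g0 += r
            g1 += r * xi
            # Negative Hessian (weighted information matrix).
            h00 += wi * mu
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton-Raphson step: solve the 2 x 2 system directly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Illustrative data only: 20 points on a grid, made-up counts and populations.
coords = [(float(i % 5), float(i // 5)) for i in range(20)]
x = [-1.0 + 0.1 * i for i in range(20)]
counts = [round(10 * math.exp(0.5 + 0.3 * xi)) for xi in x]
population = [100.0 + 10 * i for i in range(20)]

b0, b1 = local_poisson_fit((2.0, 2.0), coords, x, counts,
                           bandwidth=2.0, extra_w=population)
```

Because only the relative sizes of the weights matter in this formulation, multiplying every population weight by the same constant leaves the estimates unchanged. Setting extra_w to None reduces the sketch to purely geographic weighting.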

Choosing the form of the regression model

Three items control the form and output of a geographically weighted regression model:

1. The nature of the dependent variable y:

Continuous (linear or Gaussian model)
Non-negative integer counts (Poisson model)
Proportions or rates (logistic model)

2. The nature of the explanatory, or "x", variable or variables:

Continuous
Categorical

3. The weight function(s):

Geographical weights that control how neighboring locations influence values at specific locations
Non-spatial weights to account for the reliability of data (e.g., population size for disease rate data)