Location uncertainty occurs whenever the exact spatial coordinates of the data are not known. This lack of information is common: it arises when locations are censored for confidentiality reasons, when data are aggregated, and in exposure assessment.
In aggregate data, rates or summary values are calculated from individual events, and the individual records are abstracted from their original spatial locations. Examples of aggregate data include census data, where summary information is recorded at the level of individual political units; species abundance calculated for forest plots; rates of disease calculated for counties or townships; and the incidence of certain events recorded by a central location, such as a hospital or police station.
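As a minimal sketch of how aggregation discards location, the following Python example turns individual case records into county-level rates. The records, county labels, and population counts are made-up illustrative values, and the function name county_rates is hypothetical.

```python
from collections import Counter

# Hypothetical individual case records: (x, y, county). All values are made up.
cases = [
    (1.0, 2.0, "A"),
    (1.5, 2.5, "A"),
    (4.0, 1.0, "B"),
]

# Hypothetical population at risk in each county.
population = {"A": 1000, "B": 2000}


def county_rates(cases, population, per=1000):
    """Return cases per `per` people for each county.

    Only the county label of each record is used; the x and y
    coordinates play no part in the result.
    """
    counts = Counter(county for _, _, county in cases)
    return {c: per * counts.get(c, 0) / pop for c, pop in population.items()}


rates = county_rates(cases, population)
print(rates)
```

Once only the rates are retained, the individual coordinates cannot be recovered from them.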
In addition, people move, so a person's spatial location is not a fixed point but an activity space. Thus, for exposure analysis in particular, and for other types of analysis as well, spatial coordinates such as a person's address may be overly precise.
A common, although inappropriate, approach to dealing with location uncertainty is to assign the data to the centroid of a polygon. The polygon may represent the census tract, the zip code, or the area sampled. In this method, the polygon's centroid, or geographic center, becomes the data's spatial coordinates. Yet, as Jacquez and Waller (1998) found, the results of spatial statistical tests differ between raw data and aggregate data represented by centroids. In short, the p-values of cluster statistics were very different for raw data and for centroids, with analyses using centroid data having decreased statistical power and increased type II error (the likelihood of false negatives). Thus, location uncertainty arising from the use of centroid locations can distort the detection and interpretation of true spatial pattern.
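To make the centroid assignment concrete, here is a minimal Python sketch. It assumes the polygon is simple (not self-intersecting) and computes the area-weighted centroid with the standard shoelace formula, which is one common definition of a polygon's geographic center; the source does not say which definition a given study used. The tract outline, the case coordinates, and the function name polygon_centroid are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula).

    vertices: list of (x, y) tuples in order around the boundary,
    without repeating the first vertex at the end.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area), and 6 * area = 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical square census tract and three made-up case locations inside it.
tract = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
true_locations = [(0.5, 0.5), (3.5, 3.0), (1.0, 3.8)]

# Centroid assignment: every case in the tract receives the same coordinates.
centroid = polygon_centroid(tract)
assigned = [centroid for _ in true_locations]
print(centroid, len(set(true_locations)), len(set(assigned)))
```

Three distinct locations collapse to a single point, so all spatial variation within the tract is lost before any statistic is computed.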