About statistical methods
The methods in ClusterSeer evaluate spatial, temporal, and spatio-temporal
disease clusters. The fundamental question behind all these methods is
whether clustering exists in the data. All the methods evaluate hypotheses,
although these hypotheses are better considered exploratory (see Limits of
cluster detection).
The hypotheses differ between methods, but all the methods can be characterized
using the following structure (from Waller
and Jacquez 1995):
The null spatial model defines the distribution of cases of the disease expected without clustering.
This distribution
may be spatial, temporal, or spatio-temporal depending on the data, question,
and method.
The null hypothesis is a
prediction about spatial pattern based on the null spatial model.
The test statistic summarizes an aspect of the data of biological or epidemiological interest.
The null distribution of the test statistic can be derived theoretically or empirically through
Monte Carlo randomization. The Poisson distribution is one example of a
theoretical null distribution. Whether it is derived theoretically or
empirically, the null distribution reflects the null spatial model.
The alternative hypothesis is a counter to the null hypothesis: a different prediction defined
either in terms of the null spatial model or in terms of additional
parameters that define "clustering."
The alternative spatial model can be very basic and all-inclusive ("not the null spatial model"),
or it can be a more specific model describing a particular pattern of disease
distribution.
Probability values (P-values) for the observed test statistics can be
obtained by comparing them to the null distribution. This comparison
estimates the probability of obtaining a value at least as extreme as the
observed one if the null hypothesis is true.
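
The structure above can be illustrated with a minimal Python sketch. It is not ClusterSeer's implementation, and every specific in it is an assumption made for illustration: the null model (each case falls in a region with probability proportional to that region's population), the test statistic (the largest observed-to-expected ratio across regions), the function names, and the example data. The null distribution is built empirically by Monte Carlo randomization, and the P-value is the proportion of simulated statistics at least as large as the observed one, counting the observed data set as one of the realizations.

```python
import random


def max_ratio(counts, expected):
    """Hypothetical test statistic: largest observed/expected ratio over regions."""
    return max(c / e for c, e in zip(counts, expected))


def simulate_counts(populations, n_cases, rng):
    """Draw one data set under the assumed null model: each case is assigned
    to a region with probability proportional to that region's population."""
    counts = [0] * len(populations)
    regions = range(len(populations))
    for region in rng.choices(regions, weights=populations, k=n_cases):
        counts[region] += 1
    return counts


def monte_carlo_p_value(observed, populations, n_sims=999, seed=0):
    """Return the observed statistic and its Monte Carlo P-value."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n_cases = sum(observed)
    total_pop = sum(populations)
    # Expected case counts without clustering (the null model).
    expected = [n_cases * p / total_pop for p in populations]

    obs_stat = max_ratio(observed, expected)

    # Empirical null distribution of the test statistic.
    null_stats = [
        max_ratio(simulate_counts(populations, n_cases, rng), expected)
        for _ in range(n_sims)
    ]

    # Upper-tail P-value; the +1 terms count the observed data set itself.
    n_extreme = sum(s >= obs_stat for s in null_stats)
    p_value = (n_extreme + 1) / (n_sims + 1)
    return obs_stat, p_value


# Made-up example data: five regions, with an excess of cases in the last one.
populations = [1000, 1500, 800, 1200, 500]
observed = [4, 6, 3, 5, 12]

stat, p = monte_carlo_p_value(observed, populations)
print(f"test statistic = {stat:.2f}, Monte Carlo P-value = {p:.3f}")
```

With 999 randomizations the smallest P-value this sketch can report is 0.001, so the number of simulations sets the resolution of the result. A theoretical null distribution, where one is available, would replace the simulation loop with a direct tail-probability calculation.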