Limits of cluster detection

ClusterSeer provides statistical methods for evaluating disease clusters quantitatively. Most statisticians and researchers consider cluster detection methods more suitable for exploratory data analysis than for rigorous hypothesis testing.

As the CDC guidelines for cluster investigations make clear, the study of disease clusters often proceeds with incomplete knowledge. The spatial locations of cases often serve only as a proxy, or indirect estimate, for exposure to a risk factor. The causes of a disease cluster may not yet be understood or even identified. Additionally, the precise date of disease onset is often unavailable and may have to be estimated from the date of diagnosis or the onset of symptoms. Because of this incomplete knowledge, cluster detection methods are better suited to identifying patterns and generating hypotheses than to formally testing pre-existing hypotheses.

Once hypotheses have been generated, they need to be tested with additional, independent data. Otherwise, the procedure is somewhat circular: it tests for patterns that have already been identified in the same data. Thus, cluster detection/assessment is a step towards understanding spatial and temporal patterns in health data rather than an endpoint in the process. Its results can be used in planning subsequent studies, such as case-control studies and environmental monitoring schemes.
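
The circularity can be illustrated with a small simulation. The sketch below is not part of ClusterSeer and does not use any of its methods; all function names, parameter values, and the simple binomial test are assumptions chosen for illustration. Cases are scattered uniformly at random over a set of regions, so no true cluster exists. The region with the most cases is then picked as a candidate cluster and tested twice: once on the same data that suggested it, and once on a fresh, independent data set.

```python
import random
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def simulate_counts(rng, n_cases, n_regions):
    """Assign each case to a region uniformly at random (no true cluster)."""
    counts = [0] * n_regions
    for _ in range(n_cases):
        counts[rng.randrange(n_regions)] += 1
    return counts


def false_positive_rates(n_sims=2000, n_cases=100, n_regions=20,
                         alpha=0.05, seed=0):
    """Compare false positive rates for same-data and independent-data tests.

    Hypothetical setup: every region is equally likely to receive a case,
    so any "significant" result is a false positive.
    """
    rng = random.Random(seed)
    p = 1.0 / n_regions
    same_hits = 0
    indep_hits = 0
    for _ in range(n_sims):
        # Exploratory step: pick the region with the highest case count.
        explore = simulate_counts(rng, n_cases, n_regions)
        candidate = max(range(n_regions), key=lambda r: explore[r])

        # Circular test: reuse the data that suggested the candidate.
        if binomial_upper_tail(explore[candidate], n_cases, p) < alpha:
            same_hits += 1

        # Confirmatory test: check the same region in independent data.
        confirm = simulate_counts(rng, n_cases, n_regions)
        if binomial_upper_tail(confirm[candidate], n_cases, p) < alpha:
            indep_hits += 1
    return same_hits / n_sims, indep_hits / n_sims


if __name__ == "__main__":
    same_rate, indep_rate = false_positive_rates()
    print(f"false positive rate, same data:        {same_rate:.3f}")
    print(f"false positive rate, independent data: {indep_rate:.3f}")
```

Under these assumptions, the test that reuses the exploratory data should report a "significant" cluster far more often than the nominal alpha, while the test on independent data should stay near or below alpha. Real cluster statistics are more sophisticated than this binomial test, but the same reasoning applies to them.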