Knox: Critical Values

Knox's method detects clustering using two threshold values: a critical space distance and a critical time distance. Pairs of cases separated by less than the critical space distance are considered near in space. Pairs of cases separated by less than the critical time distance are considered near in time. To run Knox's method, you need to supply both critical values.

The plot above (not from ClusterSeer) shows the time critical value as an orange line and the spatial distance critical value as a blue line. Each pair of events is categorized as near or far in space and as near or far in time, which gives the 4 categories shown on the plot: near in both, near in space only, near in time only, and far in both.
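The four-way classification can be sketched in a few lines of Python. This is an illustrative sketch only, not ClusterSeer's implementation: the function name `classify_pairs`, the `(x, y, t)` case layout, and the use of planar Euclidean distance are all assumptions made for the example. The count in the near/near category is the quantity usually reported as Knox's X.

```python
from itertools import combinations
from math import hypot


def classify_pairs(cases, d_crit, t_crit):
    """Count case pairs in each near/far category.

    cases  : list of (x, y, t) tuples (hypothetical layout: planar
             coordinates plus a time, all in consistent units)
    d_crit : critical space distance
    t_crit : critical time distance
    Returns a dict keyed by (space_label, time_label).
    """
    counts = {
        ("near", "near"): 0,
        ("near", "far"): 0,
        ("far", "near"): 0,
        ("far", "far"): 0,
    }
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        # "Near" means strictly less than the critical value.
        space = "near" if hypot(x1 - x2, y1 - y2) < d_crit else "far"
        time = "near" if abs(t1 - t2) < t_crit else "far"
        counts[(space, time)] += 1
    return counts


# Made-up data: four cases, distances in km, times in days.
example_cases = [(0, 0, 0), (1, 0, 1), (10, 0, 2), (10, 1, 30)]
example_counts = classify_pairs(example_cases, d_crit=2, t_crit=5)
print(example_counts)
```

With N cases there are N(N-1)/2 pairs, so the four counts always sum to that total.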

How do you determine when cases of a disease are "close enough" in time or space?

Knox designed this method to account for latency periods. A latency period is the time between exposure and the manifestation of symptoms. If you suspect a disease with a latency period of 3 days, set the time critical distance long enough to allow symptoms to appear, say 4 or 5 days. For infectious diseases, the geographic critical distance reflects the average distance between 2 individuals, one of whom infected the other. In general, select critical distances consistent with the disease hypothesis under investigation. This hypothesis-based approach avoids the subjectivity that arises when critical values are determined from the data.

However, when knowledge of the underlying disease process is absent, critical values can be derived from the distributions of the space and time distances between pairs of cases. This approach is crude and should only be used when an epidemiologic hypothesis is lacking. In these instances, use the mean geographic distance for the critical space distance (Dcrit) and the mean time distance for the critical time distance (Tcrit). You can also systematically vary the critical distances to identify those values that maximize Knox's X, the number of pairs near in both space and time. This can provide insight into the spatial and temporal scale of the disease process, but it precludes any formal evaluation of statistical significance because it involves multiple tests. Do not choose critical distances larger than the maximum distance in the data, since every pair of cases will then be classified as near and the test can no longer distinguish near pairs from far pairs.
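The data-driven choice of critical values can be illustrated with a short Python sketch. It is not ClusterSeer's code: the helper names (`pair_distances`, `mean_critical_values`, `knox_x`, `scan_critical_values`), the `(x, y, t)` case layout, and planar Euclidean distance are assumptions made for the example. The scan only tabulates X for each candidate pair of critical values; it does not assess significance.

```python
from itertools import combinations
from math import hypot
from statistics import mean


def pair_distances(cases):
    """Return parallel lists of space and time distances for all pairs."""
    space, time = [], []
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        space.append(hypot(x1 - x2, y1 - y2))
        time.append(abs(t1 - t2))
    return space, time


def mean_critical_values(cases):
    """Mean pairwise space and time distances, used as (Dcrit, Tcrit)."""
    space, time = pair_distances(cases)
    return mean(space), mean(time)


def knox_x(cases, d_crit, t_crit):
    """Number of pairs near in both space and time."""
    space, time = pair_distances(cases)
    return sum(1 for d, t in zip(space, time) if d < d_crit and t < t_crit)


def scan_critical_values(cases, d_candidates, t_candidates):
    """Tabulate X for every combination of candidate critical values."""
    return {
        (d, t): knox_x(cases, d, t)
        for d in d_candidates
        for t in t_candidates
    }


# Made-up data: four cases, distances in km, times in days.
example_cases = [(0, 0, 0), (1, 0, 1), (10, 0, 2), (10, 1, 30)]
d_mean, t_mean = mean_critical_values(example_cases)
x_at_means = knox_x(example_cases, d_mean, t_mean)
x_table = scan_critical_values(example_cases, [2, 10], [5, 40])
print(d_mean, t_mean, x_at_means)
print(x_table)
```

Because each entry in the table is a separate look at the same data, the values are best read as exploratory summaries of scale rather than as test results.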
