Choosing cluster number

In spatially constrained clustering, BoundarySeer agglomerates clusters until it reaches the target cluster number set by the user. It proceeds to this target without evaluating whether fewer or more clusters would improve the model. To assess the effect of the cluster number on the model, use the goodness-of-fit option on the constrained clustering dialog.

BoundarySeer evaluates goodness of fit for clustering through an index that contrasts the variability between clusters with the variability within clusters, using Sum of Squares Error (SSE) terms.

Goodness of fit index = [B/(k-1)] / [W/(n-k)] (Gordon 1999)

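Where B is the between-cluster SSE, W is the within-cluster SSE, k is the number of clusters, and n is the number of objects (e.g. points) in the model. The index is defined only when k is at least 2 and less than n, since otherwise one of the denominators is zero. To maximize the goodness of fit, choose the cluster number that yields the highest value of the index. A high value indicates that the differences between clusters are large relative to the differences within clusters.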
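
The calculation can be illustrated outside BoundarySeer with a short Python sketch. This is not BoundarySeer's own code: the function name goodness_of_fit and the sample data are hypothetical, and the sketch assumes that B is the sum, over clusters, of cluster size times the squared distance from the cluster centroid to the overall centroid, and that W is the sum of squared distances from each object to its own cluster centroid.

```python
import numpy as np


def goodness_of_fit(data, labels):
    """Return [B/(k-1)] / [W/(n-k)] for one cluster assignment.

    Hypothetical helper for illustration only; B and W follow the
    assumed definitions given in the text above.
    """
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    n = data.shape[0]                    # number of objects
    cluster_ids = np.unique(labels)
    k = cluster_ids.size                 # number of clusters
    if k < 2 or k >= n:
        raise ValueError("index is defined only for 2 <= k < n")

    grand_mean = data.mean(axis=0)
    B = 0.0  # between-cluster sum of squares
    W = 0.0  # within-cluster sum of squares
    for c in cluster_ids:
        members = data[labels == c]
        centroid = members.mean(axis=0)
        B += members.shape[0] * np.sum((centroid - grand_mean) ** 2)
        W += np.sum((members - centroid) ** 2)
    return (B / (k - 1)) / (W / (n - k))


# Four made-up points forming two tight groups along the x axis.
points = [[0, 0], [1, 0], [10, 0], [11, 0]]
good = goodness_of_fit(points, [0, 0, 1, 1])  # groups match the data
poor = goodness_of_fit(points, [0, 1, 0, 1])  # groups cut across the data
print(float(good), float(poor))
```

In this made-up example the assignment that matches the two groups gives a much larger index than the assignment that cuts across them. Comparing the index across several candidate cluster numbers for the same data works the same way.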
