Boundary analysis is appropriate in both the exploratory and the hypothesis-testing stages of research. During initial data exploration, boundary analysis can identify spatial patterns and generate testable hypotheses. Designing experiments for hypothesis testing requires more careful planning and a more thorough understanding of the analytical techniques to be used. With that in mind, we offer the following guidelines for hypothesis testing using BoundarySeer.
An important consideration in any spatial investigation is the scale of the sampling framework. By scale we mean both the size of the geographic area under study and the spatial intervals at which observations are made. Ideally, the scale of the sampling regime reflects the scale of the processes under investigation. Determining the appropriate scale may require a pilot study or other preliminary work. A sampling regime that is too broad or too narrow for the relationships under study will likely fail to detect boundaries or associations that actually exist. In the event of non-significant findings, a logical first question is, 'Was the scale appropriate for this study?'
Within BoundarySeer, boundaries may be delineated based on one or many variables measured at a set of study locations. For example, in ecology, ecotones (boundaries between adjacent ecosystems) may be delineated based on changes across space in the abundance of one dominant plant species, or based on changes in many plant species. The corresponding data sets would consist of data representing the abundance of plants measured within some unit of area at each spatial location. The first example would have only one variable for the focal species, while the second would have a column for each species sampled.
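As a rough illustration of the two data layouts described above, the following Python sketch builds a univariate and a multivariate data set. The coordinates, species names, and abundance values are invented for illustration, and the layout is an assumption rather than BoundarySeer's import format.

```python
# Hypothetical layout: one record per sampling location, holding the
# coordinates followed by one abundance value per variable.
# All names and numbers are made up for illustration.

# One dominant species: a single variable per location.
univariate_columns = ["x", "y", "focal_species"]
univariate = [
    (0.0, 0.0, 12),
    (1.0, 0.0, 9),
    (2.0, 0.0, 1),
]

# Many species: one column per species sampled.
multivariate_columns = ["x", "y", "species_a", "species_b", "species_c"]
multivariate = [
    (0.0, 0.0, 12, 0, 3),
    (1.0, 0.0, 9, 2, 4),
    (2.0, 0.0, 1, 15, 4),
]


def variable_count(rows, n_coord_columns=2):
    """Number of measured variables per location (columns after the coordinates)."""
    return len(rows[0]) - n_coord_columns
```

Here `variable_count(univariate)` is 1 and `variable_count(multivariate)` is 3, matching the one-variable and one-column-per-species cases.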
Selection of variables to include in a data set should start with existing knowledge of the system. Once a set of candidate variables has been constructed, a combination of techniques may be used to decide which variables to include in the boundary analysis. One method is to look for boundaries in single variables, evaluating each variable independently, and then select variables for a multivariate boundary delineation based on predetermined criteria. For example, you may include only those variables that have significant boundaries themselves (determined using subboundary analysis), or you may include those variables that have high rates of change in the same vicinity. An alternative method is to use multivariate techniques such as principal components analysis (PCA) to determine which of several candidate variables contribute significantly to the overall variation in the system. You might then decide to include the variables that load most heavily on the components accounting for a certain proportion (e.g. 90%) of this variation. In any case, let the research question or process model, rather than models of data alone, guide selection of variables.
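The "high rates of change in the same vicinity" criterion can be sketched in Python for the simple case of a one-dimensional transect. This is a minimal sketch under stated assumptions, not BoundarySeer's procedure: the function names, the `tolerance` parameter, and the sample data are all hypothetical, and a real analysis would work with two-dimensional locations and a significance test.

```python
def rate_of_change(values, positions):
    """Absolute change per unit distance between consecutive transect samples."""
    return [
        abs(values[i + 1] - values[i]) / (positions[i + 1] - positions[i])
        for i in range(len(values) - 1)
    ]


def steepest_interval(values, positions):
    """Index of the sampling interval with the highest rate of change."""
    rates = rate_of_change(values, positions)
    return max(range(len(rates)), key=rates.__getitem__)


def select_variables(data, positions, tolerance=1):
    """Keep variables whose steepest interval lies within `tolerance`
    intervals of the median steepest interval across all variables
    (the upper median when the number of variables is even)."""
    peaks = {name: steepest_interval(vals, positions) for name, vals in data.items()}
    ordered = sorted(peaks.values())
    median_peak = ordered[len(ordered) // 2]
    return [name for name, peak in peaks.items() if abs(peak - median_peak) <= tolerance]


# Invented abundances at six locations along a transect (positions in metres).
positions = [0, 10, 20, 30, 40, 50]
data = {
    "species_a": [20, 19, 18, 4, 3, 3],  # steepest change between 20 and 30
    "species_b": [1, 2, 2, 3, 14, 15],   # steepest change between 30 and 40
    "species_c": [9, 2, 2, 3, 2, 3],     # steepest change between 0 and 10
}
selected = select_variables(data, positions)
```

With these data, `species_a` and `species_b` change most sharply in adjacent intervals and are retained, while `species_c` changes elsewhere and is dropped.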
Boundary overlap statistics address the question, 'Are boundaries for two data sets significantly close to each other?' Implicit in this question is the assumption that boundaries exist for the two suites of variables. Thus, the boundaries in each data set must be evaluated before overlap is assessed.
For difference boundaries, we suggest you evaluate this assumption by first calculating subboundary statistics for each data set. Subboundary statistics will assess boundary contiguity. If contiguous boundaries exist, then the interpretation of boundary overlap is clear: discrete boundaries overlap. If clear boundaries do not exist within each data set, yet overlap is significant, then the two suites of variables have a more complex relationship. In this case, areas of high rate of change for each data set coincide. Further investigation may be needed to uncover the nature of the relationship.
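To make the overlap question concrete, the following Python sketch computes one possible closeness measure, the mean distance from each boundary element in one data set to the nearest boundary element in the other, and compares it with random placements. It is an illustration only: the function names, the candidate grid, and the null model (boundary elements placed at random among candidate locations, ignoring spatial autocorrelation) are assumptions, not BoundarySeer's statistics or randomization procedures.

```python
import math
import random


def mean_nearest_distance(boundary_g, boundary_h):
    """Mean distance from each boundary element in G to the nearest element in H."""
    return sum(
        min(math.dist(g, h) for h in boundary_h) for g in boundary_g
    ) / len(boundary_g)


def overlap_p_value(boundary_g, boundary_h, candidates, n_perm=999, seed=0):
    """One-sided Monte Carlo p-value: the share of arrangements (random ones
    plus the observed one) in which H's boundary elements lie at least as
    close to G as they do in the observed data."""
    rng = random.Random(seed)
    observed = mean_nearest_distance(boundary_g, boundary_h)
    as_close = 1  # the observed arrangement counts as one realization
    for _ in range(n_perm):
        shuffled = rng.sample(candidates, len(boundary_h))
        if mean_nearest_distance(boundary_g, shuffled) <= observed:
            as_close += 1
    return observed, as_close / (n_perm + 1)


# Invented example: a 10 x 10 grid of candidate locations, with the two
# boundaries running along neighbouring columns.
candidates = [(x, y) for x in range(10) for y in range(10)]
boundary_g = [(4, y) for y in range(10)]
boundary_h = [(5, y) for y in range(10)]
observed, p_value = overlap_p_value(boundary_g, boundary_h, candidates)
```

A small `p_value` suggests that the two boundaries are closer than expected under this simplified null model. As the text notes, that result is only straightforward to interpret once contiguous boundaries have been established in each data set.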