Principal Component Analysis Tutorial

Principal component analysis (PCA) is a mathematical procedure that transforms a set of possibly correlated variables into a set of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible while staying uncorrelated with the components before it. Simply put, PCA reveals the internal structure of the data in the way that best explains its variance.
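To make the definition concrete, here is a minimal sketch of PCA in Python with NumPy. It is not how SpaceStat computes its components. It is one common textbook approach (eigendecomposition of the covariance matrix), and the function name pca and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def pca(data):
    """Hypothetical helper: PCA of a 2-D array (rows = observations)."""
    # Center each variable so the covariance is taken about the mean.
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs              # data expressed in component space
    ratio = eigvals / eigvals.sum()          # share of variance per component
    return eigvecs, ratio, scores

# Synthetic data (not the tutorial data): y is strongly correlated with x.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + 0.2 * rng.normal(size=500)
z = rng.normal(size=500)
data = np.column_stack([x, y, z])

components, ratio, scores = pca(data)
print("share of variance per component:", ratio)
```

The shares of variance come out in decreasing order and sum to one, and the columns of scores are uncorrelated with one another, which is the property described above.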

You can use PCA at the beginning of an analysis to narrow down your regression parameters by revealing and reducing collinearity in the data.
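As a rough illustration of what "revealing collinearity" can mean in practice, the sketch below lists pairs of variables whose correlation is high. This is a simple screening step, not part of SpaceStat. The function name correlated_pairs, the 0.8 threshold, and the synthetic variables a, b, and c are all assumptions chosen for the example.

```python
import numpy as np

def correlated_pairs(data, names, threshold=0.8):
    """Hypothetical helper: list variable pairs with |correlation| >= threshold."""
    corr = np.corrcoef(data, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic data (not the tutorial data): b is nearly a copy of a.
rng = np.random.default_rng(1)
a = rng.normal(size=300)
b = a + 0.1 * rng.normal(size=300)
c = rng.normal(size=300)

pairs = correlated_pairs(np.column_stack([a, b, c]), ["a", "b", "c"])
for first, second, r in pairs:
    print(f"{first} and {second} are highly correlated (r = {r:.3f})")
```

Variables flagged this way carry largely the same information, so they are natural candidates to be combined into principal components before running a regression.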

First, find the tutorial files that were included when you installed SpaceStat so that you can import the PCA_Tutorial project. To open this project, go to File -> Open, and then browse to C:/Program Files/Biomedware/SpaceStat/Tutorials/. This is the default location for the tutorial files. If you installed SpaceStat elsewhere, the tutorial files will be in the location where you installed it.

Once the project has been opened, go to Methods -> Principal Components. We will be using homeownership and socioeconomic data for the state of Michigan at the Census Tract level. Select "CT2000wFIPS" as your geography, and then select "Create." Variables for PCA need to be similar kinds of items measured on the same scale; the variables used here are all percentages. In this case, select PCT_Asn, PCT_Black, PCT_Hisp, and PCT_Wht, and use the arrow to move each variable into the "PCA variables" box.

Select OK to finalize your model. On the next tab, "PCA settings," you can name the output folder where the principal components will appear.

Then select Run from the Run method tab.