Data Description
The data in this tutorial come from the National Cancer Institute. The datasets included in the project cover colon cancer deaths in 5-year intervals from 1970 to 1994. Among cancers, the highest mortality for men is from lung, prostate, and colon cancers, in that order; for women it is from breast, lung, and colon cancers. All of these show spatial pattern (Devesa et al. 1999). Colon cancer mortality is therefore an important subject for study, especially with tools that let you evaluate patterns in both space and time.
The data in this tutorial are associated with state economic areas (SEAs), which are groups of counties with similar socioeconomic characteristics. The data are grouped by race and gender for African-American and white populations.
Variable | Definition |
R | Mortality rate per 100,000 person-years, age-adjusted to the 1970 U.S. population |
C | Number of deaths |
P | Population that we back-calculate |
LBR | Lower bound of the 95% confidence interval on the mortality rate |
UBR | Upper bound of the 95% confidence interval on the mortality rate |
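The tutorial does not spell out how P is back-calculated. One plausible reading, offered here only as an assumption, is that P is the number of person-years implied by the death count C and the rate R, that is, P ≈ C / R × 100,000. Because R is age-adjusted, this would give an approximation rather than an exact population. The function name and the example values below are hypothetical.

```python
def back_calculate_population(deaths, rate_per_100k):
    """Approximate person-years implied by a death count and a rate.

    Assumption: the rate is treated as deaths per 100,000 person-years,
    so person-years ~= deaths / rate * 100,000. This is an illustrative
    guess at the back-calculation, not the tutorial's documented method.
    """
    if rate_per_100k <= 0:
        # A zero (or invalid) rate carries no information about population size.
        return None
    return deaths / rate_per_100k * 100_000


# Hypothetical values for illustration only, not taken from the NCI data.
example_population = back_calculate_population(deaths=30, rate_per_100k=20.0)
print(example_population)
```

Returning `None` for a zero rate keeps an undefined division from silently producing a misleading population value.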
This tutorial focuses on the southeastern part of the continental US, in part because data for African-Americans are sparse in rural areas in other parts of the country. One problem with exploring patterns for multiple ethnic groups is that population sizes may differ greatly across groups in some parts of your focal geography. For example, there are low populations of African-Americans in rural areas of the midwestern and western United States. Where populations are low, the counts used to create the mortality rates are small, so the rates are unstable (subject to fluctuations that may be due to chance). The NCI print and online atlas masks data based on few counts (< 6 deaths in the 5-year time period). We focus on the southeastern United States and Gulf Coast, including part of eastern Texas, Mississippi, Louisiana, Alabama, Georgia, Florida, South Carolina, and North Carolina. This region has high enough populations of African-Americans to keep the rural areas from being masked out, which matters because masked data are unsuitable for spatial analysis.
The southeastern US has been identified as a region of persistently high mortality (Cossman et al. 2003), though it is not the highest mortality region for colon cancer in the U.S. For colon cancer mortality rates, the southeast is exceeded by the northeastern states (Devesa et al. 1999).
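The masking rule described above (rates based on fewer than 6 deaths in a 5-year period are hidden) can be sketched as follows. This is a minimal illustration, not the NCI's actual procedure: the record layout, the `sea` key, and the sample values are hypothetical, while `C` and `R` follow the variable table in this section.

```python
MASK_THRESHOLD = 6  # rates based on fewer than 6 deaths are masked


def mask_sparse_rates(records, threshold=MASK_THRESHOLD):
    """Return copies of the records with R set to None where C < threshold."""
    masked = []
    for record in records:
        row = dict(record)  # copy so the input records are left unchanged
        if row["C"] < threshold:
            row["R"] = None  # too few deaths for a stable rate
        masked.append(row)
    return masked


# Hypothetical records for illustration only, not taken from the NCI data.
records = [
    {"sea": "A", "C": 42, "R": 18.5},
    {"sea": "B", "C": 3, "R": 55.0},
]
masked = mask_sparse_rates(records)
for row in masked:
    print(row["sea"], row["C"], row["R"])
```

In a region with many masked areas, most rows would end up with `R` set to `None`, which is why the tutorial restricts the study area to places where counts are large enough to survive this filter.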
The National Atlas has compiled metadata for part of this dataset. View their metadata at the National Atlas website.