Ceramic Petrography Laboratory
Research Results
The San Pedro Basin: Statistical Analyses and Petrofacies Model
Elizabeth J. Miksa, Sergio F. Castro-Reino, and Carlos Lavayen
The San Pedro Basin is a large and complex basin with a diverse set of underlying geologic units. It spans approximately 230 linear kilometers and occupies several environmental zones. In addition, late prehistoric to historic cultural occupation in the northern valley was different from that in the southern valley. Early occupation of the valley began in Clovis times, approximately 11,200 years ago, and continued through Archaic times (Huckell 2003).
The northern San Pedro valley, from the Gila River confluence to approximately Tres Alamos wash, is relatively narrow. It is bounded by high mountains on the east and west, and is more or less straight as a geographic feature. This part of the valley was occupied in later prehistory (the Classic Period) by peoples related to the Hohokam and Mogollon.
The southern San Pedro valley, from Tres Alamos wash to its headwaters in Mexico, is broader. It is bounded by several smaller, discontinuous mountain ranges, and the basin margins are a little more irregular. This part of the valley was occupied by the Sobaipuri and Apache in protohistoric to historic times, but there is less evidence of occupation in middle to late prehistoric times.
The Aravaipa Valley, a major tributary to the northern San Pedro, is a large sub-basin bounded by steep mountains. Its remote terrain is difficult to access, even today. The basin is wide and grassy at its southeastern headwaters zone, but cuts a steep, narrow canyon in its descent to the San Pedro. Occupation of the Aravaipa valley is poorly documented.
A total of 265 sand samples were examined and point-counted as part of the study.
Sample Inventory, Microsoft Excel 2000 (zipped)
Point Count results, Sands, Microsoft Excel 2000 (zipped)
To facilitate the statistical modeling of the San Pedro river, the petrofacies were divided among the three geographic subdivisions of the valley: Northern, Southern, and Aravaipa. Three separate statistical analyses were conducted, one for each sub-basin. The sub-basins are not, however, mutually exclusive. In the modeling process, we chose to overlap the petrofacies bounding these sub-basin models. That is, the southern extent of the northern model overlaps with the northern extent of the southern model, and so forth (view map). The same data can be viewed as a table.
Our reasoning is that we do not want to make the error of assuming that a petrofacies has been statistically defined in one model, only to find it overlaps compositionally with an adjacent petrofacies in the next model. In addition, although there are geographic, physiographic, and cultural differences between the sub-sections of the basin, these differences are gradational. To portray them as abrupt would be misleading.
Correspondence Analysis
The correspondence analysis of San Pedro sands was done within the sub-basin groups as defined above. Several correspondence analyses were run for each sub-basin to examine compositional relationships at various scales. In this case, the correspondence analysis variables are the point count parameters.
A selection of correspondence analysis plots showing the loadings of the first two axes for each sub-basin can be viewed here or downloaded as a Microsoft Excel 2000 file. This is not the complete set of every correspondence analysis run, but it gives a good idea of the process we follow in choosing to do another analysis or in stopping the analyses at a given point. (We ran approximately 50 analyses, and showing each one is not possible.) Note that we start each analysis with the maximum number of parameters possible, as long as they reach one percent in the data set under study. Thus, we recalculate the one percent limit for each data set. For instance, in the Northern San Pedro, LVFB can be a parameter in the lithic model, but not in the mineralic or generic models. Also note that only the first plot for each analysis is shown (Axis 1 by Axis 2). We examine all axes that explain 10 percent or more of the variation in a data set in making decisions about which petrofacies should be assigned to which models, and in choosing samples that may need petrofacies reassignment.
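The two screening rules described above (a one percent floor for parameters, recalculated for each data set, and a 10 percent floor for axes) can be sketched as follows. This is a minimal illustration and not the laboratory's actual procedure. It assumes that "one percent" means a parameter's share of all grains counted in the data set under study. The function names, the point counts, the axis values, and the parameter codes QTZ and KSPAR are hypothetical; only LVFB comes from the text.

```python
def parameters_reaching_threshold(samples, threshold=0.01):
    """Return parameters whose share of all counted grains is >= threshold.

    `samples` is a list of dicts mapping parameter code -> point count.
    Assumption: the limit is applied to the pooled counts of the data set.
    """
    totals = {}
    for sample in samples:
        for parameter, count in sample.items():
            totals[parameter] = totals.get(parameter, 0) + count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return sorted(p for p, n in totals.items() if n / grand_total >= threshold)


def axes_to_examine(explained, threshold=0.10):
    """Return 1-based axis numbers that explain >= threshold of the variation."""
    total = sum(explained)
    return [i + 1 for i, value in enumerate(explained) if value / total >= threshold]


# Hypothetical point counts for two different data sets.
generic_data_set = [
    {"QTZ": 210, "KSPAR": 120, "LVFB": 2},
    {"QTZ": 190, "KSPAR": 140, "LVFB": 1},
]
lithic_data_set = [
    {"QTZ": 90, "KSPAR": 40, "LVFB": 20},
]

# The limit is recalculated for each data set, so the lists can differ.
generic_parameters = parameters_reaching_threshold(generic_data_set)
lithic_parameters = parameters_reaching_threshold(lithic_data_set)

# Hypothetical proportions of variation explained by five axes.
axes = axes_to_examine([0.46, 0.27, 0.12, 0.08, 0.07])
```

With these made-up counts, LVFB falls below the limit in the first data set but passes in the second, which mirrors the Northern San Pedro example in the text.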
Discriminant Analysis: Sand Petrofacies Modeling
Discriminant analyses were conducted within each sub-basin, with overlapping boundaries between the sub-basins as indicated above. Once we had established petrofacies boundaries, we ran discriminant analyses for each sub-basin without the overlapping petrofacies. This second set of models is used for the discriminant analysis of the sherds. The premise underlying this choice is that the model-building effort has been satisfied by overlapping the models and checking petrofacies boundaries. Once the model has been developed, there is no compelling need to continue comparing sherds to petrofacies that are compositionally and geographically well outside of their potential production zone.
A table showing the parameters chosen for each discriminant analysis can be seen here. Point count parameters and calculated parameters for each discriminant analysis model were chosen on the basis of correspondence analysis results and with reference to the distribution of counts within the data set. Parameters with the highest positive and negative rankings on axes that explain 10 percent or more of the variation in correspondence analysis were considered excellent candidates for inclusion in the discriminant analysis.
Klecka (1980:9-10) presents several limits on the statistical properties of discriminating variables:
- No variable may be a linear combination of other discriminating variables.
- Two variables which are perfectly correlated may not be used at the same time.
- Population covariance matrices must be equal for each group.
- Each group is drawn from a population that has a multivariate normal distribution.
There is no limit on the number of variables, except that the number of cases must exceed the number of variables by at least two.
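Two of these limits are easy to screen for mechanically: the case count relative to the number of variables, and perfectly correlated pairs. The sketch below is an illustration only, with hypothetical function names and made-up values. It does not test for general linear combinations of three or more variables, equal covariance matrices, or multivariate normality.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def screen_variables(data, tolerance=1e-9):
    """Report violations of the case-count and perfect-correlation limits.

    `data` maps variable name -> list of values, one value per case.
    """
    problems = []
    names = sorted(data)
    n_cases = len(data[names[0]])
    # The number of cases must exceed the number of variables by at least two.
    if n_cases < len(names) + 2:
        problems.append("too few cases for the number of variables")
    for a, b in combinations(names, 2):
        r = pearson(data[a], data[b])
        if r is not None and abs(abs(r) - 1.0) <= tolerance:
            problems.append(f"{a} and {b} are perfectly correlated")
    return problems


# Hypothetical values: B is exactly twice A, so the pair is flagged.
measurements = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [5, 1, 4, 2, 3],
}
problems = screen_variables(measurements)
```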
As a general rule, we have not been checking the covariance matrices for each group. However, we do submit the proposed parameters to basic statistical analyses to check for some approximation of a normal distribution in the data set. Parameters that are present in one group but absent in the other may not be used, because they violate the normality assumptions of the data set. We use parameters that are present (and not severely skewed) in at least 50 percent of the samples in each group.
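The 50 percent presence rule can be sketched as a simple filter. This is a hedged illustration, assuming that "present" means a point count greater than zero. The function name, group contents, and parameter codes P1 and P2 are hypothetical, and the skewness screening mentioned above is not included.

```python
def usable_parameters(groups, min_presence=0.5):
    """Return parameters present in at least `min_presence` of the samples
    in every group.

    `groups` maps group name -> non-empty list of samples; each sample maps
    parameter code -> point count. Assumption: present means count > 0.
    """
    parameters = set()
    for samples in groups.values():
        for sample in samples:
            parameters.update(sample)

    usable = []
    for parameter in sorted(parameters):
        present_everywhere = all(
            sum(1 for s in samples if s.get(parameter, 0) > 0) / len(samples)
            >= min_presence
            for samples in groups.values()
        )
        if present_everywhere:
            usable.append(parameter)
    return usable


# Hypothetical groups: P2 is absent from every sample in the second group.
example_groups = {
    "lithic": [{"P1": 5, "P2": 0}, {"P1": 3, "P2": 2}],
    "mineralic": [{"P1": 1, "P2": 0}, {"P1": 4, "P2": 0}],
}
kept = usable_parameters(example_groups)
```

Here P2 meets the rule in the first group (one of two samples) but is absent from the second, so only P1 is kept.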
Each group in each petrofacies model is examined prior to discriminant analysis. In all cases, we use a set of nested discriminant functions to get to the petrofacies level of discrimination (see discussion). The first level of analysis is a “generic” level, where we are discriminating, in most cases, mineral-rich samples from lithic-rich sands. For the San Pedro valley, this is the first discriminant level in each of the sub-models. At this level, the presence and normality assumptions are checked for the generic groups, but not for petrofacies. Once the petrofacies level is reached in the discriminant analysis, the presence/absence of parameters is checked at the petrofacies level.
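The nested structure described above (a generic level first, then a petrofacies within the chosen generic group) can be sketched as a two-stage classifier. To keep the example self-contained, a nearest-centroid rule stands in for the discriminant functions; it is not the method used in this study. The sketch also uses the same two parameters at both levels, whereas the actual models select parameters separately for each level. All names and values are hypothetical.

```python
def centroid(rows):
    """Mean vector of a list of equal-length numeric rows."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def nearest(centroids, row):
    """Label of the centroid closest to `row` (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], row)),
    )


def fit_nested(training):
    """Build both levels; `training` maps generic group -> petrofacies -> rows."""
    generic = {
        group: centroid([row for rows in facies.values() for row in rows])
        for group, facies in training.items()
    }
    within = {
        group: {name: centroid(rows) for name, rows in facies.items()}
        for group, facies in training.items()
    }
    return generic, within


def classify(model, row):
    """Pick the generic group first, then a petrofacies inside that group."""
    generic, within = model
    group = nearest(generic, row)
    return group, nearest(within[group], row)


# Hypothetical training rows: [mineral grain share, lithic grain share].
training = {
    "mineralic": {"M1": [[0.8, 0.1], [0.9, 0.1]], "M2": [[0.7, 0.3], [0.6, 0.3]]},
    "lithic": {"L1": [[0.2, 0.8], [0.1, 0.9]], "L2": [[0.3, 0.6], [0.4, 0.6]]},
}
model = fit_nested(training)
first = classify(model, [0.85, 0.10])
second = classify(model, [0.15, 0.85])
```

Because the second stage only compares a sample against petrofacies inside its generic group, an error at the generic level cannot be corrected later, which is one reason the generic-level parameters need careful checking.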
As an example, these graphs show the presence or absence of parameters in the Aravaipa generic, lithic, and mineralic models, respectively.
Generic Model
Lithic Model
Mineralic Model
The discriminant analysis models for the San Pedro Valley have been highly successful so far. Correct identification rates for sands are 88 percent for the Aravaipa Valley, 91 percent for the Northern San Pedro, and 78 percent for the Southern San Pedro. These rates are comparable to those for other basins in Arizona.
Sherd Provenance Modeling
As of this writing, only sherds from sites in the northern San Pedro have been submitted for provenance analysis. The temper in the sherds was characterized using the sand descriptions and flow chart for the northern San Pedro. A total of 2,194 sherds were characterized using the San Pedro petrofacies model.
Of these, 70 were selected for thin section analysis to verify the characterizations done by ceramicists under the binocular microscope.
Point Count results, Sherds, Microsoft Excel 2000 (zipped)
Sherds were originally submitted to the northern San Pedro sand model as it was developed for the petrofacies model. This sand model has an accuracy of 91 percent using the discriminant analysis variables.
However, the results of the initial discriminant analysis were perplexing and inconsistent, with many samples with high proportions of volcanic rocks being incorrectly classified as “mineralic” because of their high mica content. The high mica content is likely associated with the clay and not the added temper. In order to analyze the sherds appropriately, a modified petrofacies model was created to avoid problematic parameters that seemed to be affected by technological aspects of ceramic manufacture. The revised petrofacies model is slightly less accurate for the sands, at 88 percent, but is more robust for temper analysis in that compositional variation due to ceramic production technology affects it less.
The primary difference between this model and the prior model is at the generic level, where fewer parameters are used. At the petrofacies level, the models are the same. Specific point count parameters used in the discriminant analysis of northern sherds are listed here.
Results of the discriminant analysis showing the binocular temper characterization, discriminant analysis results, and final characterization for each sherd can be browsed online or downloaded as a Microsoft Excel 2000 (zipped) file.
The relationship of the point-counted sherds to the overall data set is presented below with a variety of cross-tabulations of the 2,194 analyzed sherds by ware, temper type, geographic location in the basin, and time.
Source by Ware
Source by Ware by District
Source by Ware by Time
Dudleyville District: Source by Ware by Time
Aravaipa District: Source by Ware by Time
San Manuel District: Source by Ware by Time
Cascabel District: Source by Ware by Time
Tres Alamos District: Source by Ware by Time
Download complete set of cross-tabulations as Microsoft Excel 2000 (zipped) file.
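For readers working from the downloaded data, a cross-tabulation such as Source by Ware can be rebuilt with a few lines of code. The sketch below is an illustration under assumptions: the field names `source` and `ware`, the function names, and the records are hypothetical and do not reflect the layout or contents of the actual files.

```python
from collections import Counter


def cross_tabulate(records, row_key, column_key):
    """Count records for each (row value, column value) pair."""
    return Counter((r[row_key], r[column_key]) for r in records)


def format_table(table):
    """Render the counts as tab-separated text with sorted rows and columns."""
    rows = sorted({r for r, _ in table})
    columns = sorted({c for _, c in table})
    lines = ["\t".join([""] + columns)]
    for r in rows:
        lines.append("\t".join([r] + [str(table.get((r, c), 0)) for c in columns]))
    return "\n".join(lines)


# Hypothetical sherd records, one dict per sherd.
sherds = [
    {"source": "Petrofacies X", "ware": "Ware 1"},
    {"source": "Petrofacies X", "ware": "Ware 1"},
    {"source": "Petrofacies X", "ware": "Ware 2"},
    {"source": "Petrofacies Y", "ware": "Ware 2"},
]
source_by_ware = cross_tabulate(sherds, "source", "ware")
text = format_table(source_by_ware)
```

A three-way table such as Source by Ware by District can be produced the same way by filtering the records to one district before tabulating.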
The final data on ceramic provenance contribute to our understanding of the population dynamics during the Classic Period in the San Pedro valley.