Classifying Rocks of Known Tectonic Affinity

Both trees were tested on three suites of samples that had not been used in the tree construction. The test data (electronic annex EA-1) include:

First, these geochemical analyses were classified using the classic Ti-Zr-Y diagram of Pearce and Cann (1973). The results are shown in Figure 6 and Table 3. A large subset of the data could not be classified, either because (1) Ti, Zr or Y had not been analysed, or because (2) the data do not plot inside any of the labeled areas of the ternary diagram. A substantial portion of the remaining data plots in field "B" of the discrimination diagram, which is of mixed tectonic affinity, although further classification can be done using the Ti-Zr diagram (Pearce and Cann, 1973). For the samples that plot in fields "A", "C" and "D", the classification seems to be quite successful, although it is hard to assess the misclassification risk because the number of "classifiable" points is so small.

**Figure 6:** The test data plotted on the Ti-Zr-Y discrimination diagram of Pearce and Cann (1973). More than half of the test data could not be plotted on this diagram because at least one of the three elements was missing. Fields: A, island-arc tholeiites; B, MORB, island-arc tholeiites and calc-alkali basalts; C, calc-alkali basalts; D, within-plate basalts. A and C are the IAB fields, D is the OIB field and B is a mixed field of MORBs and IABs.

Table 3: Summary table of the discriminant analysis of Figure 6. The Ti-Zr-Y plot does not discriminate MORBs from IABs (both plot in field "B").

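| True affinity | Missing data | Out of bounds | Predicted: MORB or IAB | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|---|---|---|
| IAB | 22 | 6 | 31 | 5 | - | 3 |
| MORB | 44 | 3 | 8 | 0 | - | 0 |
| OIB | 32 | 2 | 2 | 1 | - | 23 |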
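The breakdown in Table 3 can be tallied with a short script. This is a minimal sketch: the counts are transcribed from Table 3, while the dictionary layout and the names `table3` and `summarize` are illustrative assumptions rather than part of the original analysis.

```python
# Counts transcribed from Table 3 (one entry per true affinity).
# "mixed" is field B (MORB or IAB); "IAB" is fields A and C; "OIB" is field D.
table3 = {
    "IAB":  {"missing": 22, "out_of_bounds": 6, "mixed": 31, "IAB": 5, "OIB": 3},
    "MORB": {"missing": 44, "out_of_bounds": 3, "mixed": 8,  "IAB": 0, "OIB": 0},
    "OIB":  {"missing": 32, "out_of_bounds": 2, "mixed": 2,  "IAB": 1, "OIB": 23},
}

def summarize(table):
    """Sum the counts of each outcome over all true affinities."""
    totals = {}
    for row in table.values():
        for outcome, n in row.items():
            totals[outcome] = totals.get(outcome, 0) + n
    return totals

totals = summarize(table3)
n_samples = sum(totals.values())

# Samples that plot in an unambiguous field (A, C or D).
n_unambiguous = totals["IAB"] + totals["OIB"]
# Of those, the samples whose field matches the true affinity.
n_correct = table3["IAB"]["IAB"] + table3["OIB"]["OIB"]

print(f"test samples:        {n_samples}")
print(f"missing Ti, Zr or Y: {totals['missing']}")
print(f"out of bounds:       {totals['out_of_bounds']}")
print(f"mixed field B:       {totals['mixed']}")
print(f"unambiguous fields:  {n_unambiguous} ({n_correct} correct)")
```

With these counts, only 32 of the 182 test samples plot in an unambiguous field, which is why the misclassification risk of the diagram is hard to assess.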

It might not seem fair to discrimination diagrams in general to compare our classification trees only with the method of Pearce and Cann (1973). Although this diagram has great historical significance and is still widely used, it suffers from many of the flawed statistical assumptions that have plagued the analysis of compositional data and that have been discussed elsewhere (e.g., Aitchison, 1986). The Ti-V diagram of Shervais (1982) largely avoids these problems, because it uses only two variables and does not rescale them to a constant sum, as is the case for the ternary Ti-Zr-Y diagram of Pearce and Cann (1973). Furthermore, the training data of Shervais (1982) do not consist of averages of multiple samples, but of individual geochemical analyses. The Ti-V diagram can distinguish between all three tectonic affinities, so there is no field of "mixed affinity" like field B of Figure 6. Figure 7 shows the test data plotted on the Ti-V diagram. Table 4 summarizes the performance of this classification.
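The constant-sum rescaling of the ternary diagram can be illustrated with a minimal sketch. It assumes the conventional axis scaling of the Ti-Zr-Y diagram (Ti/100, Zr and 3Y, all in ppm); the function name `ternary_coords` and the sample concentrations are hypothetical and serve only as an illustration.

```python
def ternary_coords(ti_ppm, zr_ppm, y_ppm):
    """Rescale (Ti/100, Zr, 3*Y) to proportions that sum to one."""
    parts = (ti_ppm / 100.0, zr_ppm, 3.0 * y_ppm)
    total = sum(parts)
    return tuple(p / total for p in parts)

# Hypothetical concentrations in ppm, for illustration only.
sample = (9000.0, 100.0, 30.0)
# Same element ratios at half the absolute abundance.
diluted = tuple(0.5 * x for x in sample)

a = ternary_coords(*sample)
b = ternary_coords(*diluted)

print(a)  # three proportions that sum to one
print(b)  # plots at the same point as `a`
```

Both samples plot at the same point of the ternary diagram, because the rescaling keeps only the ratios between the three elements. The Ti-V diagram, by contrast, plots the two concentrations directly.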

**Figure 7:** The test data plotted on the Ti-V discrimination diagram of Shervais (1982).

Table 4: Summary table of the Ti-V discrimination diagram of Figure 7.

| True affinity | Missing data | Out of bounds | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|---|---|
| IAB | 40 | 0 | 17 | 10 | 0 |
| MORB | 2 | 3 | 0 | 48 | 2 |
| OIB | 24 | 7 | 0 | 1 | 28 |

The decision boundaries of all the tectonic discrimination diagrams discussed so far were drawn by eye. Vermeesch (2006) revisited these and other diagrams and recalculated the decision boundaries using the statistically more rigorous technique of discriminant analysis. Besides revisiting the diagrams of Pearce and Cann (1973), Shervais (1982) and others, Vermeesch (2006) also performed an exhaustive exploration of all possible binary and ternary combinations of 45 elements, based on the same training data used in the present paper. Here, only two of these diagrams will be discussed. The best overall linear discrimination diagram uses the combination of Si, Ti and Sr (Figure 8, Table 5). The best quadratic discrimination diagram of only relatively immobile elements uses Ti, V and Sm (Figure 9, Table 6).

**Figure 8:** The test data (164/182 used) plotted on the Si-Ti-Sr linear discrimination diagram (redrawn from Vermeesch, 2006).

**Figure 9:** The test data (85/182 used) plotted on the Ti-V-Sm quadratic discrimination diagram (redrawn from Vermeesch, 2006).

Table 5: Test of the best linear discrimination diagram of Vermeesch (2006) (Figure 8), using Si, Ti and Sr.

| True affinity | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|
| IAB | 45 | 9 | 7 |
| MORB | 0 | 45 | 1 |
| OIB | 0 | 0 | 57 |

Table 6: Test of the best quadratic discriminant analysis of Vermeesch (2006) (Figure 9), using Ti, V and Sm.

| True affinity | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|
| IAB | 24 | 2 | 0 |
| MORB | 1 | 44 | 5 |
| OIB | 0 | 0 | 9 |

The same data were also classified with the trees of Figures 4 and 5. The results of this experiment are shown in Tables 7 and 8. In contrast to the discrimination diagrams, the classification trees managed to assign a tectonic affinity to all 182 test samples. The HFS tree misclassifies considerably more IABs and MORBs than the full tree. For the MORBs, it is probably not surprising that 14 out of 54 Galapagos ridge samples were misclassified as OIBs, considering the possible presence of plume-ridge interactions near the Galapagos hot spot. The higher misclassification risk of the HFS tree is a reminder that, unless rocks are obviously altered, it is better to use the full tree, which includes Sr.

Table 7: Test of the full tree (Figure 4) on a suite of rocks that were not used in its construction.

| True affinity | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|
| IAB | 54 | 8 | 5 |
| MORB | 4 | 49 | 2 |
| OIB | 2 | 2 | 56 |

Table 8: Test of the tree that uses only HFS elements (Figure 5) on the same suite of rocks as Table 7.

| True affinity | Predicted: IAB | Predicted: MORB | Predicted: OIB |
|---|---|---|---|
| IAB | 44 | 17 | 6 |
| MORB | 3 | 38 | 14 |
| OIB | 0 | 1 | 59 |

The performance of the Ti-V diagram is remarkably good and comparable to that of the full tree, with a misclassification rate of 13/106 for the former and 23/182 for the latter (Tables 4 and 7). The Si-Ti-Sr (17/164 misclassified, Table 5) and Ti-V-Sm (8/85 misclassified, Table 6) diagrams even seem to perform better than the classification trees. However, this is likely to change for new trees created from larger sets of training data. Discriminant analysis does not gain much from very large databases, whereas classification trees keep improving. Moreover, neither the Si-Ti-Sr nor the Ti-V-Sm diagram succeeded in classifying all the test data, in contrast with the classification trees.
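The misclassification rates of Tables 4 to 8 can be recomputed directly from the confusion matrices. The following is a minimal sketch: the counts are transcribed from the tables, each rate is taken relative to the number of samples that actually received a classification, and the names `confusion` and `misclassification` are illustrative assumptions.

```python
# Confusion matrices transcribed from Tables 4-8.
# Rows: true affinity (IAB, MORB, OIB); columns: predicted (IAB, MORB, OIB).
# Samples with missing data or out of bounds (Table 4) are left out.
confusion = {
    "Ti-V diagram (Table 4)":     [[17, 10, 0], [0, 48, 2], [0, 1, 28]],
    "Si-Ti-Sr diagram (Table 5)": [[45, 9, 7], [0, 45, 1], [0, 0, 57]],
    "Ti-V-Sm diagram (Table 6)":  [[24, 2, 0], [1, 44, 5], [0, 0, 9]],
    "full tree (Table 7)":        [[54, 8, 5], [4, 49, 2], [2, 2, 56]],
    "HFS tree (Table 8)":         [[44, 17, 6], [3, 38, 14], [0, 1, 59]],
}

def misclassification(matrix):
    """Return (misclassified, classified) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return total - correct, total

N_TEST = 182  # size of the test set

for name, matrix in confusion.items():
    wrong, total = misclassification(matrix)
    print(f"{name}: {wrong}/{total} misclassified ({wrong / total:.1%}), "
          f"{total}/{N_TEST} samples classified")
```

Reporting the number of classified samples next to each rate keeps the comparison honest: a diagram that classifies only part of the test set is scored on fewer samples than the trees, which classify all 182.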