Next: Predictions for Rocks of Up: TESTING THE TREES Previous: TESTING THE TREES


Classifying Rocks of Known Tectonic Affinity

Both trees were tested on three suites of samples that had not been used in the tree construction. The test data (electronic annex EA-1) include:

First, these geochemical analyses were classified using the classic Ti-Zr-Y diagram of Pearce and Cann (1973). The results are shown in Figure 6 and Table 3. A large subset of the data (109 of 182 analyses) could not be classified, either because (1) Ti, Zr or Y had not been analysed, or because (2) the data do not plot inside any of the labeled fields of the ternary diagram. A substantial portion of the remaining data (41 of 73 analyses) plots in field ``B'' of the discrimination diagram, which is of mixed tectonic affinity, although further classification can be done using the Ti-Zr diagram (Pearce and Cann, 1973). For the samples that plot in fields ``A'', ``C'' and ``D'', the classification seems quite successful, although the misclassification risk is hard to assess because the number of ``classifiable'' points is so small.

Figure 6: The test data plotted on the Ti-Zr-Y discrimination diagram of Pearce and Cann (1973). More than half of the test data could not be plotted on this diagram because at least one of the three elements was missing. A - island arc tholeiites, C - calc-alkali basalts, D - within plate basalts, B - MORB, island-arc tholeiites and calc-alkali basalts. A and C are the IAB fields, D the OIB field and B a mixed field of MORBs and IABs.
Image W3441-fig6


Table 3: Summary table of the discriminant analysis of Figure 6. The Ti-Zr-Y plot does not discriminate MORBs from IABs (both plot in field ``B'').
true      missing  out of   predicted tectonic affinity
affinity  data     bounds   MORB or IAB  IAB  MORB  OIB
IAB       22       6        31           5    -     3
MORB      44       3        8            0    -     0
OIB       32       2        2            1    -     23


It might not seem fair to discrimination diagrams in general to compare our classification trees only with the method of Pearce and Cann (1973). Although this diagram has great historical significance and is still widely used, it rests on several of the flawed statistical assumptions that have plagued the analysis of compositional data and that are discussed elsewhere (e.g., Aitchison, 1986). The Ti-V diagram of Shervais (1982) largely avoids these problems because it uses only two variables and does not rescale them to a constant sum, as the ternary Ti-Zr-Y diagram of Pearce and Cann (1973) does. Furthermore, the training data of Shervais (1982) consist of individual geochemical analyses rather than averages of multiple samples. The Ti-V diagram can distinguish between all three tectonic affinities, so there is no field of ``mixed affinity'' like field B of Figure 6. Figure 7 shows the test data plotted on the Ti-V diagram, and Table 4 summarizes the performance of this classification.
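The effect of rescaling to a constant sum can be illustrated with a minimal sketch. It assumes the usual plotting convention for the Pearce and Cann (1973) diagram (Ti/100, Zr and 3Y, all in ppm); the function name and the two sample compositions are hypothetical and serve only as an illustration.

```python
def ternary_coordinates(ti_ppm, zr_ppm, y_ppm):
    """Rescale Ti/100, Zr and 3*Y so that the three parts sum to 1.

    Assumption: this is the conventional scaling of the Ti-Zr-Y diagram;
    the source text only states that the variables are rescaled to a
    constant sum.
    """
    parts = (ti_ppm / 100.0, zr_ppm, 3.0 * y_ppm)
    total = sum(parts)
    if total <= 0:
        raise ValueError("at least one concentration must be positive")
    return tuple(p / total for p in parts)


# Two hypothetical samples (ppm) whose concentrations differ by a factor of two.
sample_a = ternary_coordinates(6000.0, 90.0, 25.0)
sample_b = ternary_coordinates(12000.0, 180.0, 50.0)

# After closure the two samples plot on exactly the same point, so the
# information carried by the absolute abundances is lost.
print(sample_a)
print(sample_b)
```

A two-variable diagram such as Ti-V plots the raw concentrations, so these two samples would remain distinguishable there.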

Figure 7: The test data plotted on the Ti-V discrimination diagram of Shervais (1982).
Image W3441-fig7


Table 4: Summary table of the Ti-V discrimination diagram of Figure 7.
true      missing  out of   predicted affinity
affinity  data     bounds   IAB  MORB  OIB
IAB       40       0        17   10    0
MORB      2        3        0    48    2
OIB       24       7        0    1     28


The decision boundaries of all the tectonic discrimination diagrams discussed so far were drawn by eye. Vermeesch (2006) revisited these and other diagrams and recalculated the decision boundaries using the statistically more rigorous technique of discriminant analysis. Besides revisiting the diagrams of Pearce and Cann (1973), Shervais (1982) and others, Vermeesch (2006) also performed an exhaustive exploration of all possible binary and ternary combinations of 45 elements, based on the same training data used in the present paper. Here, only two of these diagrams will be discussed. The best overall linear discrimination diagram uses the combination of Si, Ti and Sr (Figure 8, Table 5). The best quadratic discrimination diagram of only relatively immobile elements uses Ti, V and Sm (Figure 9, Table 6).
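To give a sense of the size of such an exhaustive search, the following sketch counts the candidate diagrams. It assumes that every diagram corresponds to an unordered set of two or three distinct elements, which is an interpretation of the text rather than a figure reported in it.

```python
from math import comb  # available in Python 3.8 and later

n_elements = 45

# Assumption: one candidate diagram per unordered pair or triplet of elements.
n_binary = comb(n_elements, 2)
n_ternary = comb(n_elements, 3)

print(f"binary combinations:  {n_binary}")
print(f"ternary combinations: {n_ternary}")
```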

Figure 8: The test data (164/182 used) plotted on the Si-Ti-Sr linear discrimination diagram (redrawn from Vermeesch, 2006).
Image W3441-fig8

Figure 9: The test data (85/182 used) plotted on the Ti-V-Sm quadratic discrimination diagram (redrawn from Vermeesch, 2006).
Image W3441-fig9


Table 5: Test of the best linear discrimination diagram of Vermeesch (2006) (Figure 8), using Si, Ti and Sr.
true      predicted affinity
affinity  IAB  MORB  OIB
IAB       45   9     7
MORB      0    45    1
OIB       0    0     57



Table 6: Test of the best quadratic discriminant analysis of Vermeesch (2006) (Figure 9), using Ti, V and Sm.
true      predicted affinity
affinity  IAB  MORB  OIB
IAB       24   2     0
MORB      1    44    5
OIB       0    0     9


The same data were also classified with the trees of Figures 4 and 5. The results are shown in Tables 7 and 8. In contrast to the discrimination diagrams, the classification trees assigned a tectonic affinity to all 182 test samples. The HFS tree misclassifies considerably more IABs and MORBs than the full tree (23 versus 13 IABs and 17 versus 6 MORBs). For the MORBs, it is probably not surprising that 14 out of 54 Galapagos ridge samples were misclassified as OIBs, considering the possible plume-ridge interactions near the Galapagos hot spot. The higher misclassification risk of the HFS tree is a reminder that, unless rocks are obviously altered, it is better to use the full tree, which includes Sr.


Table 7: Test of the full tree (Figure 4) on a suite of rocks that were not used in its construction.
true      predicted affinity
affinity  IAB  MORB  OIB
IAB       54   8     5
MORB      4    49    2
OIB       2    2     56



Table 8: Test of the tree that uses only HFS elements (Figure 5) on the same suite of rocks as Table 7.
true      predicted affinity
affinity  IAB  MORB  OIB
IAB       44   17    6
MORB      3    38    14
OIB       0    1     59
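The per-class comparison between the two trees can be reproduced from the counts of Tables 7 and 8. The sketch below is only a bookkeeping aid; the variable and function names are illustrative and not part of the original analysis.

```python
CLASSES = ("IAB", "MORB", "OIB")

# Rows: true affinity; columns: predicted IAB, MORB, OIB (Tables 7 and 8).
full_tree = {"IAB": (54, 8, 5), "MORB": (4, 49, 2), "OIB": (2, 2, 56)}
hfs_tree = {"IAB": (44, 17, 6), "MORB": (3, 38, 14), "OIB": (0, 1, 59)}


def per_class_errors(table):
    """Return {true class: (misclassified, total)} for a confusion table."""
    errors = {}
    for i, cls in enumerate(CLASSES):
        row = table[cls]
        errors[cls] = (sum(row) - row[i], sum(row))
    return errors


full_errors = per_class_errors(full_tree)
hfs_errors = per_class_errors(hfs_tree)

for cls in CLASSES:
    f_bad, f_n = full_errors[cls]
    h_bad, h_n = hfs_errors[cls]
    print(f"{cls}: full tree {f_bad}/{f_n}, HFS tree {h_bad}/{h_n}")
```

On these counts the HFS tree does worse for IABs and MORBs but slightly better for OIBs.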


The performance of the Ti-V diagram is remarkably good and comparable to that of the full tree, with a misclassification rate of 13/106 for the former and 23/182 for the latter (Tables 4 and 7). The Si-Ti-Sr (17/164 misclassified, Table 5) and Ti-V-Sm (8/85 misclassified, Table 6) diagrams even seem to perform better than the classification trees. However, this is likely to change for new trees created from larger sets of training data. Discriminant analysis does not gain much from excessively large databases, whereas classification trees keep improving. Moreover, neither the Si-Ti-Sr nor the Ti-V-Sm diagram succeeded in classifying all the test data, in contrast with the classification trees.
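The overall rates can be recomputed from the confusion tables, counting only the samples that actually received a label. The following sketch does this for Tables 4 to 8 and also reports the fraction of the 182 test samples each method could classify; the labels and helper name are illustrative.

```python
N_TEST = 182  # total number of test samples

# Rows: true IAB, MORB, OIB; columns: predicted IAB, MORB, OIB.
# Samples with missing data or outside all fields are excluded.
confusion = {
    "Ti-V diagram (Table 4)": [[17, 10, 0], [0, 48, 2], [0, 1, 28]],
    "Si-Ti-Sr diagram (Table 5)": [[45, 9, 7], [0, 45, 1], [0, 0, 57]],
    "Ti-V-Sm diagram (Table 6)": [[24, 2, 0], [1, 44, 5], [0, 0, 9]],
    "full tree (Table 7)": [[54, 8, 5], [4, 49, 2], [2, 2, 56]],
    "HFS tree (Table 8)": [[44, 17, 6], [3, 38, 14], [0, 1, 59]],
}


def summarize(matrix):
    """Return (misclassified, classified) for a square confusion matrix."""
    classified = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return classified - correct, classified


summary = {name: summarize(m) for name, m in confusion.items()}

for name, (wrong, classified) in summary.items():
    rate = 100.0 * wrong / classified
    coverage = 100.0 * classified / N_TEST
    print(f"{name}: {wrong}/{classified} misclassified "
          f"({rate:.1f}%), {coverage:.0f}% of test data classified")
```

Reading the error rate together with the coverage makes the trade-off explicit: the diagrams with the lowest error rates are also those that leave part of the test data unclassified.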


Pieter Vermeesch 2005-12-14