For the Ti-V system, the data were mapped from the simplex to
unconstrained real space by the log-ratio transformation. Thus, two new
variables were created:
log(Ti/(10^6-Ti-V)) and log(V/(10^6-Ti-V)), where 10^6 is the
constant sum of 1 million ppm. The discriminant analysis then proceeds
as described in Section 2. The results are mapped
back to bivariate Ti-V space using the inverse log-ratio
transformation (Equation 6). Figure
11 shows the results of the LDA of the Ti-V system,
whereas Figure 12 shows the QDA results. The
decision boundaries look almost identical for both cases. Besides the
decision boundaries, Figures 11, 12
and subsequent figures also show the training data as well as the
posterior probabilities. One of the properties of many data
mining algorithms, including discriminant analysis, is the ``garbage
in, garbage out'' principle: any rock that was analysed for the
required elements will be classified as either IAB, MORB or OIB, even
continental basalts, granites or sandstones! Therefore, it is
recommended to treat the classification of samples plotting far
outside the range of the training data with caution.
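
The log-ratio transformation of the Ti-V system and its inverse can be sketched as follows. This is a minimal illustration rather than the code used for the figures: the names `TOTAL_PPM`, `log_ratio` and `inverse_log_ratio` are hypothetical, the sample concentrations are made up, and the inverse is assumed to be the standard inverse of the additive log-ratio transformation, which may differ in notation from Equation 6.

```python
import math

TOTAL_PPM = 1.0e6  # constant sum: 1 million ppm


def log_ratio(ti, v, total=TOTAL_PPM):
    """Map Ti and V (in ppm) to the two log-ratio variables."""
    rest = total - ti - v  # everything in the rock that is not Ti or V
    if ti <= 0 or v <= 0 or rest <= 0:
        raise ValueError("concentrations must be positive and sum to less than the total")
    return math.log(ti / rest), math.log(v / rest)


def inverse_log_ratio(x, y, total=TOTAL_PPM):
    """Map log-ratio coordinates back to Ti and V (in ppm)."""
    denom = 1.0 + math.exp(x) + math.exp(y)
    return total * math.exp(x) / denom, total * math.exp(y) / denom


# Hypothetical sample: 9000 ppm Ti and 300 ppm V (illustrative values only).
x, y = log_ratio(9000.0, 300.0)
ti_back, v_back = inverse_log_ratio(x, y)
print(x, y, ti_back, v_back)
```

Any quantity computed in log-ratio space, such as a point on a decision boundary, can be passed through `inverse_log_ratio` to plot it in bivariate Ti-V space.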
In contrast with the Ti-V diagram, the decision boundaries of the Ti-Zr system look quite different between LDA (Figure 13) and QDA (Figure 14). The misclassification risk of the training data (i.e., the resubstitution error) of QDA is typically lower than that of LDA, because the former uses more parameters than the latter. However, this does not necessarily mean that QDA will perform better on future datasets. This problem will be discussed in Section 7. For now, suffice it to say that the resubstitution error can be used to compare two binary or two ternary diagrams with each other, but not to compare the performance of QDA with LDA or of a binary with a ternary diagram.
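
To make the notion of resubstitution error concrete, the following sketch fits Gaussian LDA (one pooled covariance matrix) and QDA (one covariance matrix per class) to two-dimensional data and reports the fraction of training points that each rule misclassifies. It is an illustration under stated assumptions, not the implementation behind Figures 13 and 14: all function names are hypothetical, the priors are assumed to equal the class proportions, and the three point clouds are synthetic random numbers rather than geochemical data.

```python
import math
import random


def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, mu):
    """Sums of squares and cross-products about the mean: (sxx, sxy, syy)."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return (sxx, sxy, syy)


def score(p, mu, cov, prior):
    """Gaussian log-discriminant score of point p for one class."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * maha + math.log(prior)


def resubstitution_errors(classes):
    """classes maps a label to a list of (x, y). Returns (lda_err, qda_err)."""
    n_total = sum(len(pts) for pts in classes.values())
    means = {k: mean2(pts) for k, pts in classes.items()}
    scatters = {k: scatter2(pts, means[k]) for k, pts in classes.items()}
    priors = {k: len(pts) / n_total for k, pts in classes.items()}

    # QDA: a separate covariance matrix for every class.
    qda_cov = {k: tuple(s / (len(classes[k]) - 1) for s in scatters[k])
               for k in classes}
    # LDA: a single pooled covariance matrix shared by all classes.
    dof = n_total - len(classes)
    pooled = tuple(sum(scatters[k][i] for k in classes) / dof for i in range(3))
    lda_cov = {k: pooled for k in classes}

    def error(covs):
        wrong = 0
        for label, pts in classes.items():
            for p in pts:
                best = max(classes,
                           key=lambda k: score(p, means[k], covs[k], priors[k]))
                wrong += best != label
        return wrong / n_total  # fraction of training points misclassified

    return error(lda_cov), error(qda_cov)


# Synthetic demonstration data (not from any geochemical database).
rng = random.Random(0)


def cloud(cx, cy, sx, sy, n):
    return [(rng.gauss(cx, sx), rng.gauss(cy, sy)) for _ in range(n)]


classes = {"A": cloud(0, 0, 1, 1, 50),
           "B": cloud(2, 1, 0.5, 2, 50),
           "C": cloud(-1, 3, 2, 0.5, 50)}
lda_err, qda_err = resubstitution_errors(classes)
print("LDA resubstitution error:", lda_err)
print("QDA resubstitution error:", qda_err)
```

Because both errors are evaluated on the same points that were used to estimate the means and covariances, they say nothing about how either rule would perform on new samples.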