Igneous rocks form in a wide variety of tectonic settings, including mid-ocean ridges, ocean islands, and volcanic arcs. Recovering the original tectonic setting of ancient mafic rocks is a problem of great interest to igneous petrologists. When the geological setting alone cannot unambiguously resolve this question, the chemical composition of these rocks might contain the answer. The major, minor and trace element composition of basalts shows large variations, for example as a function of formation depth (e.g., Kushiro and Kuno, 1963). Traditionally, statistical classification of geochemical data has been done with discrimination diagrams (e.g., Chayes and Velde, 1965; Pearce and Cann, 1971, 1973; Pearce, 1976; Wood, 1980; Shervais, 1982). The decision boundaries of most tectonic discrimination diagrams are drawn by eye (e.g., Pearce and Cann, 1973; Wood, 1980; Shervais, 1982). Although still widely used, these diagrams have some serious problems, including:
As an alternative to discriminant analysis, one that resolves all these
issues, this paper proposes classification trees, which are among the
most powerful and popular ``data mining'' techniques (Hastie et
al., 2001). One application in which classification trees have been
quite successful is email spam filtering (e.g., Hastie et al.,
2001; Carreras and Màrquez, 2001). Based on a large training
database of predetermined genuine and spam messages, spam filters
automatically generate a series of nested yes/no questions that decide
which of the two categories (genuine or spam) a new message belongs
to. The attributes used in a tree-based spam filter can be the
frequencies of certain words or characters as a percentage of the
total length of the message, the average length of uninterrupted
sequences of capital letters, the total number of capital letters,
etc. Spam filtering has many similarities with the problem of
tectonic discrimination. In the latter case, the training data will
contain not two but three classes (mid-ocean ridge, ocean island and
island arc). The attributes used for the classification will be
chemical concentrations and isotopic ratios.
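To make the idea of nested yes/no questions concrete, the following is a minimal sketch of how such a tree could be grown and applied. It assumes a greedy search that minimises the Gini impurity, which is one common splitting criterion and not necessarily the one used later in this paper. The attribute names (attr_a, attr_b), the function names, and all numerical values are hypothetical and invented purely for illustration; they are not real geochemical measurements.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (attribute, threshold) pair that most reduces the weighted
    Gini impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for attr in rows[0]:
        values = sorted({r[attr] for r in rows})
        # Candidate thresholds lie halfway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[attr] <= t]
            right = [l for r, l in zip(rows, labels) if r[attr] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (attr, t), score
    return best


def grow(rows, labels):
    """Grow a tree recursively. A leaf is a class label; an internal node is
    a tuple (attribute, threshold, yes_branch, no_branch)."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    attr, t = split
    yes = [(r, l) for r, l in zip(rows, labels) if r[attr] <= t]
    no = [(r, l) for r, l in zip(rows, labels) if r[attr] > t]
    return (attr, t,
            grow([r for r, _ in yes], [l for _, l in yes]),
            grow([r for r, _ in no], [l for _, l in no]))


def classify(tree, sample):
    """Answer the nested yes/no questions until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        attr, t, yes, no = tree
        tree = yes if sample[attr] <= t else no
    return tree


# Invented toy training set with three classes (NOT real data).
rows = [
    {"attr_a": 1.0, "attr_b": 5.0}, {"attr_a": 1.2, "attr_b": 6.0},
    {"attr_a": 3.0, "attr_b": 5.5}, {"attr_a": 3.4, "attr_b": 6.5},
    {"attr_a": 0.8, "attr_b": 1.0}, {"attr_a": 1.1, "attr_b": 1.5},
]
labels = ["MORB", "MORB", "OIB", "OIB", "IAB", "IAB"]

tree = grow(rows, labels)
prediction = classify(tree, {"attr_a": 3.1, "attr_b": 6.0})
```

Each internal node of the resulting tree asks a single question of the form "is attribute X at most t?", so classifying a new sample only requires following one path from the root to a leaf.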
Section 2 will give an introduction to the construction
of classification trees. As with discrimination diagrams, the end-user
does not need to know all the details of the building process, because
the trees have to be built only once (hence this paper), after which
they are very easy to use; trees are in fact easier to use than
discrimination diagrams. Therefore, only a brief
introduction to the technique will be given, along with the necessary
references for the interested reader.
In Section 3, two classification trees will be
presented for the discrimination between basalts from mid-ocean ridge
(MORB), ocean island (OIB) and island arc (IAB) settings, based on 756
major and trace element measurements and isotopic ratio analyses,
compiled from two publicly available petrologic databases. The first
tree uses all major, minor and trace elements, and should be used for
the classification of unaltered samples of basalt. The second tree
only uses immobile elements and can also be used for samples that
underwent some degree of weathering and/or metamorphism. Beyond this
initial selection of suitable features, the construction of the trees
is entirely statistical, and involves no further petrological
considerations or arbitrary decision boundaries.
In Section 4, both classification trees will be tested. First, a suite of modern basalts of known tectonic affinity will be classified using both the trees and discrimination diagrams. Then, a published dataset of twenty basalts from the Pindos Basin (Greece) will be classified. This will illustrate the limitations of the tree method and serve as a cautionary note that applies to all statistical classification methods.