
Appendix A: derivation of Equations 2 and 3

Of all possible populations, those with a perfectly uniform distribution require the collection of the largest sample in order to be certain that no significant fractions have been missed. We will consider the case where there are M=20 such fractions. This case can easily be generalized to any M. For a perfectly uniform distribution, each of the 20 fractions equals exactly f=0.05.

Image A1

If we are interested in only one of these fractions, e.g. #1 (in subsequent figures, the shaded boxes indicate the fractions of interest), then the probability of missing this fraction in a single experiment is 1-f. The probability that this occurs in each one of k independent experiments is $p = (1-f)^k$. This is the probability calculated by Dodson et al. [1], and given by Equation 1.
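The single-fraction probability can be evaluated directly. The following is a minimal sketch: the function name `p_miss_one` and the choice k = 60 are illustrative assumptions, while f = 0.05 is the value used above.

```python
def p_miss_one(f: float, k: int) -> float:
    """Probability that k independent draws all miss one fraction of relative size f."""
    return (1.0 - f) ** k


# f = 0.05 follows the text; k = 60 is an arbitrary illustrative choice.
print(p_miss_one(0.05, 1))   # a single draw misses the fraction with probability 1 - f
print(p_miss_one(0.05, 60))  # roughly 0.046
```

The probability decays geometrically with k, but it only describes one pre-selected fraction.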

Image A2

However, if we are not just interested in one particular fraction, but in all 20 fractions, the probability of missing at least one of them is much larger. It is the probability of missing fraction #1, or fraction #2, ..., or fraction #20.

In combinatoric terms:

$\displaystyle p = \binom{20}{1}(1-f)^k$ (5)

While better than Equation 1, this is still not the equation that we want, because the probability that any two fractions are simultaneously missed is counted twice, which makes the estimate of p too high. Therefore, the probabilities of simultaneously missing fractions #1 and #2, or #1 and #3, ..., or #19 and #20 have to be subtracted from Equation 5. This gives rise to the following expression:

$\displaystyle p = \binom{20}{1}(1-f)^k - \binom{20}{2}(1-2f)^k$ (6)

Equation 6 is a better approximation than Equation 5, but it overcorrects: the probability that three fractions are missed at the same time is added three times by the first term and subtracted three times by the second, so it is no longer counted at all, resulting in too low an estimate for p.

Therefore, a correction is added to Equation 6, which yields a third-order approximation:

$\displaystyle p = \binom{20}{1}(1-f)^k - \binom{20}{2}(1-2f)^k + \binom{20}{3}(1-3f)^k$ (7)
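The alternation between overestimates and underestimates in Equations 5, 6 and 7 can be checked numerically. The sketch below is illustrative only: the function name `truncated_p` and the sample size k = 100 are assumptions, while M = 20 and f = 0.05 follow the text.

```python
from math import comb


def truncated_p(M: int, f: float, k: int, order: int) -> float:
    """Sum the first `order` terms of the alternating series (Equations 5, 6, 7, ...)."""
    total = 0.0
    for n in range(1, order + 1):
        base = max(0.0, 1.0 - n * f)  # guard against tiny negative round-off
        total += (-1) ** (n - 1) * comb(M, n) * base ** k
    return total


M, f, k = 20, 0.05, 100  # M and f follow the text; k = 100 is an illustrative assumption
for order in (1, 2, 3):
    print(order, truncated_p(M, f, k, order))
```

The first-order value is the largest, the second-order value the smallest, and the third-order value lies between them, as expected for successive corrections of alternating sign.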

Equation 7 again overestimates p, because the probability of simultaneously missing four fractions is counted twice. It is clear by now that this process of iterative corrections to Equation 5 can be repeated until we have corrected for the probability that all twenty fractions are missed simultaneously.

This probability equals $\binom{20}{20}(1-20f)^k = 0$, a trivial result because 20f = 1. Adding it to the 19 previous terms yields:

$\displaystyle p = \binom{20}{1}(1-f)^k - \binom{20}{2}(1-2f)^k + \ldots - \binom{20}{20}(1-20f)^k$ (8)

$\displaystyle p = \sum_{n=1}^{20}(-1)^{n-1} \binom{20}{n} (1-nf)^k$ (9)

or, generalizing by replacing 20 with M:

$\displaystyle p = \sum_{n=1}^{M}(-1)^{n-1} \binom{M}{n} (1-nf)^k$ (10)
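Equation 10 can be evaluated directly and cross-checked against an independent calculation. The sketch below assumes f = 1/M (the fractions add up to 100%) and compares the alternating sum with a step-by-step occupancy calculation that tracks how many distinct fractions have been sampled after each draw. Both function names are hypothetical, and the occupancy check is an added verification device, not part of the derivation above.

```python
from math import comb


def p_miss_any(M: int, k: int) -> float:
    """Equation 10 with f = 1/M: probability that k draws miss at least one of M fractions."""
    return sum(
        (-1) ** (n - 1) * comb(M, n) * (1.0 - n / M) ** k
        for n in range(1, M + 1)
    )


def p_miss_any_occupancy(M: int, k: int) -> float:
    """Same probability, obtained by tracking the number of distinct fractions seen."""
    # dist[j] = probability that exactly j distinct fractions have been sampled so far
    dist = [0.0] * (M + 1)
    dist[0] = 1.0
    for _ in range(k):
        new = [0.0] * (M + 1)
        for j, pj in enumerate(dist):
            if pj == 0.0:
                continue
            new[j] += pj * j / M                # draw hits a fraction already seen
            if j < M:
                new[j + 1] += pj * (M - j) / M  # draw hits a new fraction
        dist = new
    return 1.0 - dist[M]


# M = 20 follows the text; the values of k are illustrative assumptions.
for k in (10, 60, 200):
    print(k, p_miss_any(20, k), p_miss_any_occupancy(20, k))
```

For k smaller than M, both functions return 1 (up to round-off), since fewer than M draws cannot cover all M fractions.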

Equation 10 is a special instance of Equation 2 for A = 0 and B = 0. This form gives the correct value for p when the relevant fractions exactly add up to 100% of the population (i.e. M = 1/f). There are two situations where the relevant fractions do not exactly add up to one:

A = 1, B = 0:
Image A10
or A = 1, B = 1:
Image A11

The derivation of p for these cases is completely analogous to the derivation of Equation 10. Equation 2 is a generalization that takes care of all possibilities.

In addition to the worst-case scenario, a best-case scenario can also be considered given a certain number of relevant fractions (m). If the number of relevant fractions is not known, the lowest possible p is always associated with a delta function (one single age component). For the latter population p equals zero, which is an information-free trivial result. For example, if m = 3, the best-case scenario is given by:

Image A12

The derivation of p for this case is completely analogous to the derivation of Equation 10 with M = m = 3 and f = 1/3:

$\displaystyle p = \sum_{n=1}^{3}(-1)^{n-1} \binom{3}{n}\left(1-\frac{n}{3}\right)^k$ (11)
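
Expanding Equation 11 term by term (a worked sketch, assuming k ≥ 1 so that the n = 3 term, $\binom{3}{3}(1-3/3)^k$, vanishes) gives:

$\displaystyle p = 3\left(\frac{2}{3}\right)^k - 3\left(\frac{1}{3}\right)^k = \frac{2^k - 1}{3^{k-1}}$

For k = 3, this gives p = 7/9, which agrees with a direct count: of the 27 equally likely outcomes of three draws, only 3! = 6 contain all three fractions, so p = 1 - 6/27 = 7/9.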


Pieter Vermeesch 2004-05-19