Stanford Encyclopedia of Philosophy

Supplement to Chance versus Randomness

B. Further Details Concerning Algorithmic Randomness


B.1 Further Details Concerning Martin-Löf Randomness

B.1.1 Statistical Tests in Martin-Löf's Framework

Martin-Löf makes use of the notion of a statistical test. In statistical hypothesis testing, we will reject a hypothesis if an outcome is observed that is improbable (supposing the hypothesis were true) at some set significance level; in that case, the outcome is significant. Correspondingly, we may define a sequence to be significant if it is a member of an effective measure zero set. A random sequence, then, is not a significant sequence.

How are these recursive significance tests defined? We illustrate the idea using Martin-Löf's (1966: 604) example of the property of large numbers (having a digit frequency equal to ½). He restricts attention to significance levels of the form 2m, and suggests that we would reject a finite initial substring x1xn at the level 2m if the difference between the observed relative frequency in that string differed ‘too much’ from ½. By ‘too much’ Martin-Löf means that fewer than 2nm strings which are in fact initial subsequences of a random sequence, would have a relative frequency differing by that much from ½. (By the definitions in supplement B.2, this means that the string is rejected if it is a member of a set which has measure less than 2nm/2n = 2m.)

An infinite sequence x is rejected at level 2m iff we would reject some initial string at that level. Finally, an infinite string is rejected tout court if, for every level of significance, there is an initial string of that sequence that we would reject at that level (Martin-Löf 1969: 609). A Martin-Löf random sequence is one we would not reject.

B.1.2 Relations between Martin-Löf Randomness, von Mises Randomness, and other Measure Theoretic Accounts

If φ is an effectively computable (recursive) place selection, the compound test of first applying φ and then some recursive significance test is also a recursive significance test (since the class of recursive functions is closed under functional composition). Therefore the class of ML-random sequences is closed under effective place selections, and contains only sequences with the right limit relative frequencies. Given the characterisation of vM-random sequences in section §2.1.1, it is clear that the ML-random sequences are a subset of the vM-random sequences. Ville's result shows that this inclusion is strict. But a result of van Lambalgen (1987b: §4) shows that despite this strict inclusion, the overlap is great, as the intersection of the ML- and vM-random sequences can be shown to have effective measure one. It also follows immediately from the above claims that the class of ML-random sequences is such that every admissible subsequence of a random sequence is itself Borel normal. (It is not known whether the requirement that all admissible subsequences be themselves Borel normal is sufficient for ML-randomness.)

Various refinements of ML-randomness have been offered, usually by varying the conditions on what should count as an effective statistical test. Schnorr (1971), for example, restricts the effective tests to a narrower class than Martin-Löf permits, and thus has a correspondingly broader set of random sequences. For Schnorr, the effective uniform measure zero sets should be defined as the intersection of a sequence of sets whose measures are not merely bounded by computable reals (as in section B.2), but are themselves computable reals. Schnorr offers an argument that this requirement—that the significance tests should be total recursive, not just recursive as Martin-Löf requires—is needed for genuinely effective significance tests. Schnorr contends that only when the test is total recursive can we ‘visualize’ the property of stochasticity that is being tested for, and this is the only constraint on whether we can genuinely grasp the property (van Lambalgen 1987a: 70).

Schnorr also provides an interesting connection with the gambling motivation behind von Mises' view, as Schnorr is able to characterise both his and Martin-Löf's conception of randomness in terms of the failure of a generalised kind of betting strategy called a Martingale, which generalises the place selection characterisation of betting strategies (van Lambalgen, 1987a: §3.4; Martin-Löf, 1969b: §9). Armed with the notion of a Martingale and various different constraints on the kind of functions which may be used to determine the betting strategy, a number of different non-equivalent notions of randomness intermediate between ML-randomness and vM-randomness/Schnorr-randomness can be discerned (Downey and Hirschfeldt, 2010: ch. 6; Dasgupta, forthcoming: §11).

However, the philosophical (as opposed to mathematical) rationale behind these notions is not yet clear. Schnorr's notion of visualisability, whatever its merits, is not clear enough to be decisive, especially because even arbitrary total recursive tests are still far from being graspable in any ordinary everyday sense. And Schnorr's restriction has other drawbacks; for one, there is no universal test of Schnorr randomness (Downey and Hirschfeldt 2010: §§6.1.1–6.1.2). This is because, if the property is satisfied by a computable example (as Schnorr requires for visualisability), there is a computable sequence which is not rejected by the test. If there were a universal test for Schnorr randomness, there would be a computable sequence that is not rejected by it—but that would be a computable random sequence, which cannot be. So of course it is an intended feature of Schnorr's account that there be no universal test; but that does make the space of randomness properties a little more unmanageable on Schnorr's view.

Another potential difficulty with Schnorr randomness is that it is not quite as robust as ML-randomness. The class of ML-random sequences has many apparently very different mathematical definitions (in terms of effective tests, in terms of Martingales, and—as we will see in §2.3—in terms of complexity) and is thus apparently a mathematically robust notion; it is not clear that Schnorr randomness has the same robustness. There is a complexity based account of Schnorr randomness in terms of computable prefix free machines (Downey and Griffiths 2004), as well as one in terms of computable martingales (Downey and Hirshfeldt 2010: §6.1.4). But ML-randomness satisfies van Lambalgen's theorem on relative randomness (van Lambalgen 1987a, Dasgupta forthcoming: §10), for if xy is a random sequence which results from interleaving two other sequences x and y, then those two sequences will be random relative to each other (that is, even complete knowledge of the outcomes in x won't improve predictability of the outcomes in y). But Schnorr random sequences do not satisfy van Lambalgen's theorem (Downey and Hirschfeldt 2010: §5.9)— in fact, there is a Schnorr random sequence such that the even outcomes compute the odd outcomes and vice versa. However, the strongest reason for suspecting that Schnorr randomness is not the correct account of random sequences is that there is a Schnorr random sequence that is not von Mises-random (Wang 1996: Theorem 3.3.5). So there is a Schnorr random sequence that does not have invariant subsequence frequencies under admissible place selections, and is therefore susceptible to exploitation by a gambling system. This seems compelling reason to think that Schnorr randomness is not conceptually adequate.

B.2 Lebesgue Measure and Effective Measure

Let X be the set of all sequences of outcomes of a binary process. Let X be the set of finite sequences of outcomes of a binary process, or strings. If xX, let sx(i) be the i-th member of sequence x. If σ is a finite sequence of outcomes, let N (σ) be the set of all (finite or infinite) xX which begin with σ.

We now define a uniform measure μ on subsets of the Cantor space (Dasgupta forthcoming: 7–9; Earman 1986: 138–9; Gowers 2008: III.55). We begin by assigning measures to the sets N(σ), called the basic sets. Note that half of the possible outcome sequences begin with 1, and half with 0; and of those that begin with 1, half begin 11, the other half 10; and so on. In general, where |σ| denotes the length of the string σ, the proportion of all possible outcome sequences that begin with σ—the measure of N(σ), denoted μ(N(σ))—is 1/2|σ|. We now extend this measure to the full space. An open set is defined to be an arbitrary union of basic sets. A basic set is maximally contained in A if N(σ) ⊆ A and for every proper initial segment τ of σ, it is not the case that N(τ) ⊆ A . It turns out that every open set G is a union of non-overlapping (‘disjoint’) basic sets that are maximally contained in G. The measure of a union of disjoint sets is the sum of the measures of the sets, so if the open set G decomposes into disjoint basic sets N1), N2), …, then μ(G) = ∑i μ(Ni)).

A set E is said to have measure zero if there is an infinite sequence of open sets G1, G2, … such that (i) E is contained in each Gi (equivalently, E ⊆ ∩iGi); and (ii) μ(Gi) ≤ 2i. (The idea is that a small set should be coverable by an open set which is also small. Conditions which do the same job as (ii) can be formulated in various ways, with various exponents; it is convenient to have the measures of the open sets diminish quickly.) A set E is measurable iff it can be approximated arbitrarily closely by the union of finitely many basic sets. (More precisely, if for any ε > 0, there is a finite union of basic sets F such that the symmetric difference between E and F can be covered by a set of measure less than ε. The symmetric difference is the set of elements which are in exactly one of E and F; it is the ‘mismatch’ between the two sets.)

The measurable sets form a σ-algebra, which is a set with the nice properties of being closed under complementation, union, and intersection. Moreover, it can be shown that the measure μ can be naturally and uniquely extended to the set of all measurable sets, where the measure defined has corresponding nice properties (so that, e.g., the measure of the complement of E is 1 − μ(E), if EF then μ(E) ≤ μ(F), etc.). The measure so-defined is the Lebesgue (uniform) probability measure on the Cantor space.

Each individual xX forms a singleton of measure zero, since if σn is the initial n elements of x, then xNn) for all n. Each Nn) is an open set, because a basic set. Moreover, there are infinitely many such sets, and for each μ(Nn)) = 2n. So this singleton set satisfies the conditions for a measure zero set. (A countable union of measure zero sets also has measure zero, so all countable sets of sequences are also of measure zero.)

We now apply to measure theory some ideas from the theory of computability. A set is recursively enumerable iff it is the range of a (total or partial) recursive function on the natural numbers (Boolos et al. 2003: §8.3). Since the recursive functions are just those which are effectively computable, by the Church-Turing thesis, we may also call these sets computably enumerable. We defined an open set as an arbitrary union of basic sets; we now define an effective open set as a union of basic sets which are members of a recursively enumerable set. There are only countably many recursive functions, so there are only countably many recursively enumerable sets of basic sets. Since there are countably many basic sets, there are uncountably many sets of basic sets. So while there are uncountably many open sets, there are only countably many effective open sets; the later notion is much more restrictive. A sequence of sets G1, G2, … is uniformly effective open if there is a recursive function f from pairs of natural numbers (m, n) to basic sets such that each Gm = ∪n=1 N(f(m, n)). That is, the sequence is uniformly effective open if not only is each member effective open, but there is a single recursive function that enumerates, for each member of the sequence, the set of basic sets of which it is the union.

Adapting the earlier definition in the obvious way, a set is said to have effective measure zero if there is an infinite sequence of uniformly effective open sets G1, G2, … such that (i) E is covered by each Gi; and (ii) μ(Gi) ≤ 2i. (The idea is that an effectively small set should be coverable by an effective open set which is also small.) A set is effective measure one iff its complement is effective measure zero.