The Probabilistic Method and Philosophy of Mathematics

Preface

Inspired from the courses Math 159, Phil 162, and Phil 184P.

A Hard Problem

Let’s introduce some definitions so understand what problem we’re trying to answer.

A complete graph $K_n$ on $n$ vertices is a graph where every pair of distinct vertices is connected by an edge. Here is an example of a $K_5$ graph.

A tournament $T_n$ is a complete graph $K_n$ together with an orientation of each edge. Here is a $T_5$ tournament.

A Hamilton Path on an oriented graph is an oriented path which uses every vertex of the graph. Here is one example for the $T_5$ above.

Now, here’s the problem: What is the largest number of Hamilton paths in a tournament?

If you think about it, this gets very messy very quickly. There are ${n\choose 2}$ total edges in a $K_n$ graph, and since each edge can be oriented in two ways, there are $2^{n\choose 2}$ total possible tournaments $T_n$ on a $K_n$. Even for a $T_5$, that’s $2^{10} =1024$ tournaments you would have to check, and then count the number of Hamilton Paths you find in each one. Aside from rejecting all isomorphic tournaments, there doesn’t seem to be any nice trick we can use to make our lives any easier. The number of Hamilton Paths we find is extremely sensitive to the orientation of even a single edge, because whether the Hamilton path exists or not is contingent on a long chain of edges being oriented correctly.

A lower bound solution to this problem comes from Szele, who used the probabilistic method to show there is a tournament $T_n$ with $\frac{n!}{2^{n-1}}$ Hamilton Paths. The proof is really quite nice.

Proof: Take a $K_n$ and orient each edge independently and uniformly at random; call the resulting tournament $T_n$. Let $S_n$ denote the set of all permutations of the $n$ vertices. $|S(n)|=n!$ since there are at first $n$ vertices to choose from, then $n-1$ vertices, and so on. Notice that each permutation of vertices is either a Hamilton path or it isn’t. For each permutation $\pi=(v_1,\dots,v_n)\in S_n$, define an indicator variable $X_\pi$ by $$ X_\pi = \begin{cases} 1 & \text{if } v_1 \to v_2 \to \cdots \to v_n \text{ is a Hamilton path}, \\ 0 & \text{otherwise.} \end{cases} $$ Let $$ X = \sum_{\pi \in S_n} X_\pi, $$ so $X$ is the number of Hamilton paths in $T_n$. For a fixed permutation $\pi=(v_1,\dots,v_n)$, the event $X_\pi=1$ occurs when the $n-1$ edges are all oriented correctly, i.e. we have $$v_1 \to v_2 \to … \to v_n$$ Since the edges are oriented independently and each direction occurs with probability $1/2$, $$ \mathbb E[X_\pi] = \Pr(X_\pi=1) = \left(\frac12\right)^{n-1} $$ By the linearity of expectation, $$ \mathbb E[X] = \sum_{\pi\in S_n} \mathbb E[X_\pi] = n!\left(\frac12\right)^{n-1} = \frac{n!}{2^{n-1}}. $$ Since $\mathbb E[X]$ is the average value of $X$ over all tournaments, it cannot be that every tournament has fewer than $\frac{n!}{2^{n-1}}$ Hamilton Paths. Therefore, there exists a tournament with at least $\frac{n!}{2^{n-1}}$ Hamilton paths. $\square$

If you’re like me, you might be suspicious about this proof. The probabilistic method is compressing two different claims into a single move. First, there is a probabilistic claim about what happens under a random procedure. Second, there is an existential claim that some mathematical object must exist if the probability is strictly greater than zero or through an argument about the expected value.

Consider the $T_5$ case, which tells us there are at least $\frac{5!}{2^{4}} = \lceil 7.5 \rceil=8$ Hamilton paths. Great—but what are they? Show me them! At least for a $T_5$ we can find them by brute force—in fact, there are a maximum of 15 Hamilton Paths in a $T_5$.

from itertools import permutations, product

n = 5
pairs = [(i, j) for i in range(n) for j in range(i+1, n)]  # 10 pairs
m = len(pairs)

# All permutations of 5 vertices (candidate Hamilton paths)
all_perms = list(permutations(range(n)))

best = 0
best_tournaments = 0
count_by_value = {}

for bits in range(1 << m):
    # Build directed edge set: for pair (i,j), bit=0 -> i->j, bit=1 -> j->i
    edges = set()
    for idx, (i, j) in enumerate(pairs):
        if (bits >> idx) & 1:
            edges.add((j, i))
        else:
            edges.add((i, j))

    # Count Hamilton paths
    count = 0
    for perm in all_perms:
        ok = True
        for a, b in zip(perm[:-1], perm[1:]):
            if (a, b) not in edges:
                ok = False
                break
        if ok:
            count += 1

    count_by_value[count] = count_by_value.get(count, 0) + 1
    if count > best:
        best = count
        best_tournaments = 1
    elif count == best:
        best_tournaments += 1

print(f"Maximum Hamilton paths over all 1024 tournaments on 5 vertices: {best}")
print(f"Number of tournaments achieving the maximum: {best_tournaments}")
print()
print("Distribution of Hamilton path counts:")
for val in sorted(count_by_value):
    print(f"  {val:3d} paths: {count_by_value[val]:4d} tournaments")

Which looks like this:

But what if I wanted to see the Hamilton Paths for $T_{100}$? This is computationally infeasible (naively checking every case and no rejection of isomorphic tournaments). The runtime of the (not optimised) python code is a horrendous $O\left(2^{n\choose 2} \cdot n \cdot n!\right)$.

$n$	$\binom{n}{2}$	$2^{\binom{n}{2}}$	$ 2^{\binom{n}{2}} \cdot n \cdot n!$
5	10	1,024	$\sim 6 \times 10^5$
6	15	32,768	$\sim 1.4 \times 10^8$
7	21	2,097,152	$\sim 7.4 \times 10^{10}$
8	28	268,435,456	$\sim 8.6 \times 10^{13}$
9	36	68,719,476,736	$\sim 2.2 \times 10^{17}$
10	45	35,184,372,088,832	$\sim 1.3 \times 10^{21}$
11	55	36,028,797,018,963,968	$\sim 1.4 \times 10^{25}$
12	66	$\approx 7.38 \times 10^{19}$	$\sim 4.2 \times 10^{28}$
$\vdots$	$\vdots$	$\vdots$	$\vdots$
100	4950	$\approx 1.25 \times 10^{1490}$	$\sim 1.17 \times 10^{1650}$

There are $10^{80}$ atoms in the observable universe, for comparison. And yet, Szele’s Theorem tells us there is a $T_{100}$ with least $\frac{100!}{2^{99}}$ Hamilton Paths. Which tournament is it? What paths are they? Despite the existence claim, the probabilistic method cannot answer these questions.

Questions

The logical argument establishing why we can use the probabilistic method works like this:

If every object in a collection of objects fails to have a certain property, then the probability that a random object chosen from the collection has that property is zero. Thus, by contraposition, if the probability that a random object chosen from the collection has that property is nonzero, then some object in the collection must possess the property.¹

Consider three wrinkles.

First, probability isn’t actually doing much work. There is nothing contingent about the “random” tournament on $n$ vertices; the space is fully specified, and the probabilities are analytic consequences of how it is constructed. What Frank Ramsey called the logic of inconclusive argument is being used here to produce conclusive proofs, which is a strange role reversal worth thinking about. The probabilities aren’t really representing uncertainty about anything.

Second, to say “let $T_n$ be a tournament with edges oriented uniformly at random” is to invoke the Principle of Indifference, which Alan Hájek has called the lynchpin of logical probability and which is famously fragile. Bertrand’s paradox shows that on continuous spaces the principle gives different answers depending on how you parametrize, with no principled way to choose between them. The probabilistic method sidesteps the difficulty by living on finite sets, where each object gets weight $1/|S|$ and there is nothing to argue about.

Finally, the proof tells us the object exists, but offers no access to it. As Paul Gordan putatively said: “This is not mathematics; this is theology.” The constructivist/intuitionist tradition descending from Brouwer has been objecting to proofs like this for a century. The contrapositive at the heart of the method relies on the law of excluded middle applied to a quantified statement which is a move that intuitionists do not accept.

To see this, let $C$ be a collection of objects and $P(x)$ the property in question. The probabilistic method’s crucial step is $$\lnot(\forall x \in C\ \lnot P(x)) \to (\exists x \in C\ P(x)),$$ which roughly says: if it is not the case that every object lacks property $P$, then some object has property $P$. In classical logic, the two sides are interchangeable. Intuitionistically, they are not. Moving between the two statements requires the law of excluded middle applied to $\exists x \in C\ P(x)$ itself: either such an $x$ exists, or every $x$ fails to satisfy $P$; the latter is ruled out by assumption, so the former must hold. Critics point to the Liar’s Paradox for why the law of the excluded middle leads to self-contradiction.

Wikipedia, Probabilistic Method ↩︎

Leon Reilly | 郭亮

Discomfort over how the probabilistic method works and a bit about constructivism

The Probabilistic Method and Philosophy of Mathematics

Preface

A Hard Problem

Questions