On Cantor’s Diagonal (again)

Just like so many mathematical cranks, the diagonal argument has always fascinated me. But unlike the cranks, it’s not the argument itself that fascinates me, but the sociological phenomenon of it: Cantor’s theorem is without a doubt the single result in mathematics which generates the most crankery.

What has fascinated me is the mystery of why. Why is this result so hard to stomach for so many? Some have suggested that the answer has to do with intuitionism, which I have argued is not the case. The issue seems to be with the notions of “infinity” and “size”: the usual presentation of the diagonal argument seems to make use of infinitely many choices, and the result (as usually presented) says that the set of reals is “bigger” than the set of naturals, which is repugnant to people who don’t accept the notion of “completed infinity”.

My goal here is not to argue against finitists (I haven’t the patience), or to try to justify the idea that there are more reals than naturals (I have no interest in flogging a horse which died over a century ago). Rather, I want to argue that Cantor’s theorem (rather, the spacial case dealing with the naturals) makes sense without talking about infinite sets, and without talking about size.

To do this, we need to talk about sets without them being extensional (that is, without them being determined by their elements) and to talk about functions in a slightly different way than we normally do when presenting the diagonal argument. We do this by dealing with types.

Types are determined not by their members, but by introduction and elimination rules. Most types of interest are “inductive types”, where the introduction rules give us “canonical elements”, and the elimination rule says that to define a function out of the type, it suffices to say what happens on the canonical elements. Often, the canonical elements are generated according to some recursive rule: If T is the type we are defining, we may define new members of T by reference to old members of T (and possibly by reference to members of other types).

The classical example is the type of natural numbers; the type \mathbb{N} of natural numbers is determined by the following rules:

  • Introduction rule: There is a member 0:\mathbb{N}.
  • Introduction rule: For any n:\mathbb{N}, there is a member \mathsf{succ}(n) :\mathbb{N}. That is, we can always take the “successor” of a natural number.
  • Elimination rule: If we have a type C, a member c_0:C and a function c_s:\mathbb{N} \rightarrow (C\rightarrow C), there is a function f:\mathbb{N}\rightarrow C such that f(0)=c_0 and for an arbitrary n:\mathbb{N} we have f(\mathsf{succ}(n)) = c_s(n,f(n)). Here, c_s(n,f(n)) is shorthand for (c_s(n))(f(n))—since c_s(n) is a function from C to C, and f(n) has type C (by the “inductive hypothesis”), we may apply c_s(n) to f(n).

We say that \mathbb{N} is inductively generated by 0 and \mathsf{succ}.

The point is that we don’t need \mathbb{N} to be some “completed collection”—we only need to be able to recognize when something is a natural number, and once we’ve recognized something to be a natural number, to be able to perform the sorts of manipulations we expect to be able to perform on natural numbers.

Another classical example is the type of Booleans, which we call \mathbf{2}. It is “inductively” generated by members we call \top:\mathbf{2} and \bot:\mathbf{2}. That is,

  • \top:\mathbf{2};
  • \bot:\mathbf{2};
  • Given a type C and members c_\top:C and c_\bot:C, there is a function f:\mathbf{2}\rightarrow C such that f(\top)=c_\top and f(\bot)=c_\bot.

An example of a function constructed using the elimination rule for \mathbf{2} is the negation, \neg:\mathbf{2}\rightarrow \mathbf{2}, which we define as follows: We define c_\top:=\bot and c_\bot:=\top. Then the elimination rule says that we have a function (which we denote \neg) from \mathbf{2} to \mathbf{2} such that \neg(\top)=\bot and \neg(\bot)=\top.

An important fact about inductive types is that canonical elements which take different forms are never equal. For example, n:\mathbb{N} is not equal to \mathsf{succ}(n), and \top\neq\bot.

The elimination rules actually generalize from functions to a type, to dependent functions on a type family. For \mathbf{2}, this becomes:

Given a type family F indexed by \mathbf{2}, and members f_\top : F(\top) and f_\bot : F(\bot), there is a dependent function f:\prod_{x:\mathbf{2}}F(x) such that f(\top)=f_\top and f(\bot)=f_\bot.

Recall that in type theory, equality is a type family, double-indexed by members of a fixed type. In particular, we have a type family D indexed by \mathbf{2} defined as D(x):= \neg(x) = x. Going a step farther: the claim x\neq y is to be regarded as a type (for x and y fixed), so we may define the type family F indexed by \mathbf{2} as F(x):= \neg(x)\neq x. Combined with the elimination rule for \mathbf{2}, the fact above that \top\neq \bot gives us

Lemma. For an arbitrary x:\mathbf{2}, we have \neg(x)\neq x. That is, there is a dependent function f:\prod_{x:\mathbf{2}}(\neg(x)\neq x).

Proof. We remarked above that we have \top \neq\bot. That is, we have some p:\top\neq\bot. By symmetry, we also have some p' :\bot\neq\top. Since \neg(\top) is \bot and \neg(\bot) is \top, we have that p:\neg(\bot)\neq\bot and p':\neg(\top)\neq\top. Setting f_\top:=p' and f_\bot:=p, we may apply the elimination rule for \mathbf{2} to get the desired f. \square.

We cannot interact with members of these types any way we want—we can only interact with them using the rules of type theory, or by choosing an arbitrary member (which we have no information about, other than its type). In particular, to define a function from \mathbb{N} to another type C we may either:

  • Use the elimination rule, provided we have the necessary information (the type C, the member c_0 and the function c_s.)
  • Use the introduction rule for function types: If, given an arbitrary member of \mathbb{N} we may construct a member of C, then there is a function f:\mathbb{N}\rightarrow C.

The take away is that we cannot define a function out of \mathbb{N} by saying “Do this to 0, this to 1, this to 2, and so on.” We must give an explicit rule which does not directly refer to the members, except in an arbitrary way, or according the “inductive structure” of \mathbb{N}.

Now, we need to interpret Cantor’s diagonal in this context. Rather than using real numbers, or defining a powerset, we will simply use function \mathbb{N}\rightarrow\mathbf{2}. We also must forego any mention of “size” or “cardinality” So our theorem becomes

Theorem. There is no surjective function from \mathbb{N} to \mathbb{N}\rightarrow\mathbf{2}.

Proof. Let f be an arbitrary function f:\mathbb{N}\rightarrow (\mathbb{N}\rightarrow\mathbf{2}). Note that we know nothing about f other than its type. That is, we may only interact with it using the introduction and elimination rules for function types.

We define a function \delta_f:\mathbb{N}\rightarrow\mathbf{2} by function introduction (“lambda abstraction”): \delta_f:=\lambda (n:\mathbb{N}). \neg f(n,n). That is, \delta_f takes a natural number n, a negates the value of applying f(n) to n.

Claim. \delta_f is not in the range of f. That is, there is no n:\mathbb{N} such that \delta_f=f(n).

Proof (of claim). Let n:\mathbb{N} be arbitrary. By the lemma, we know that \neg(f(n,n))\neq f(n,n). Since \delta_f(n) is \neg(f(n,n)), we have that \delta_f(n) \neq f(n,n). But then \delta_f and f(n) differ in some argument, namely n, so \delta_f \neq f(n). Since n was arbitrary, there is no n such that \delta_f = f(n). This proves the claim.

In other words, f is not surjective. Since f was arbitrary, we must have that no function \mathbb{N}\rightarrow(\mathbb{N}\rightarrow\mathbf{2}) is surjective. \square

And there we have it. The point is that we needn’t think of sets as “completed collection”, but merely “things generated by rules”, and we needn’t introduce the notion of “size” into the discussion at all: I don’t know what “the set of natural numbers” is (I just know I have a way of generating some natural numbers, and a way to construct a new object when I am given an arbitrary natural number) and I don’t know what “size” is (I just know what a surjective function is), and I still have a way to make sense of Cantor’s theorem.

Someone intent on arguing with my may suggest that I’ve moved the goalposts—after all, the real numbers aren’t really functions from \mathbb{N}\rightarrow\mathbf{2}. However, this misses the point: the argument can be reformulated only slightly to work for the Dedekind or Cauchy construction of the reals; both constructions are given in a style like this one—without appeal to infinity or choice or any other controversial infinitary principle—in Chapter 11 of the HoTT book.

The next post will be on a topic less mundane and tired.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s