15 More Minutes of Type Theory

In my last post, I gave a quick run-down of simple type theory with functions, products and sums; and briefly sketched the correspondence with propositional logic. We can extend type theory to accommodate predicate logic, which is today’s task. When we’re done, we’ll be one step away from Martin-Löf type theory: identity types.

Putting our intuitionist glasses back on, we want to give an account of “for all” and “exists” where proofs of “for all x, P(x)” and “exists x, P(x)” correspond to constructions. But first we need to understand predicates.

Classically, it’s common to interpret a predicate on A as a subset of A, or of some Cartesian power (A^n). A well-known fact is that for any predicate (subset), P\subseteq A, there is a characteristic function \chi_{P}:A \rightarrow 2 to the two-element set \{0,1\} with \chi_P(x) = 1 precisely if x\in P. (Or in predicate notation, if P(x).) Since classically, a proposition is either true or false, we can think of a predicate on A as a function from A to the set of “propositions”, where 1=true and 0=false.

As we saw last time, in type theory, we interpret propositions not as truth values (true or false), but as types, so it makes sense to interpret a predicate P on a type A as a function P:A→Type. There’s a certain sleight of hand here: while we interpret propositions to be types, we may not wish to interpret all types as propositions. For example, it seems a bit unnatural to think of ℕ as a proposition. Because of this, we will call a function P:A→Type a type family.

Type families

Type families do more than simply allow us to interpret predicates: If we have some type family P:A→Type, we have, for each a:A, a type P(a). That is, this really is an indexed family of types, the same way that \{X_i\mid i\in I\} is an indexed family of sets. If we squint right, this lays down a framework from which we might be able to come up with a type theoretic analog of the set-theoretic axioms of choice, comprehension and replacement (more on those some other time).

From a programming perspective, type families give us a more fine-grained analysis at the type level. For example, we can create a type of vectors of length n (for fixed n) which depends on n. This may seem abstract and rather silly, but n-branching trees (trees whose nodes have at most n children) are common bases for data structures (e.g., B-trees), and these are usually implemented using an array of pointers which needs to remain the same length, or by hand-coding the requisite number of pointers. In either case, the programmer has to do extra work: either copy code for each case, or ensure (in the real world) that the array is never made larger or smaller. Using a type of length-n vectors, we put this check at the type level: an incorrect number of children simply isn’t possible.
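As a taste of what this looks like in code, here is a minimal Agda sketch of length-indexed vectors (the notation is introduced properly in the next section, and it assumes the natural number type ℕ defined there; the names Vec, [] and _∷_ are my own choices):

data Vec (A : Set) : ℕ → Set where
  []  : Vec A zero                               -- the empty vector has length zero
  _∷_ : {n : ℕ} → A → Vec A n → Vec A (succ n)   -- consing onto a length-n vector gives length n+1

For each n : ℕ, Vec A n is a type, and only vectors of exactly the declared length can ever be constructed.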

Type Families and Predicates

Since a proposition is true if it’s inhabited, a term p:P(a) is a “proof” that P(a) holds. For example, we could define the type family isEven:ℕ→Type by induction. In fact, let’s do it from first principles in Agda:

data ⊥ : Set where

data ⊤ : Set where
  * : ⊤

data ℕ : Set where
  zero : ℕ
  succ : ℕ → ℕ

{-# BUILTIN NATURAL ℕ #-}

¬ : Set → Set
¬ a = a → ⊥

isEven : ℕ → Set
isEven zero = ⊤
isEven (succ x) = ¬ (isEven x)


p : isEven 8
p = λ z → z (λ z₁ → z₁ (λ z₂ → z₂ (λ z₃ → z₃ *)))

The three blocks starting with data declare types (Agda uses Set instead of Type); I’m using ⊥ for the empty type (no constructors, hence the empty “where”), ⊤ for the unit type (with a single constructor), and ℕ for the natural numbers (with two constructors, zero and the “successor” constructor succ). Agda handles the elimination rule for us automatically, and gives us some nice syntax for elimination (the pattern matching seen in isEven). The next line tells Agda that I’ve re-implemented the naturals; this allows me to use numerals (like 8 below). After that is the definition of negation from before, and then our type family: zero is even, so we define isEven zero to be a type we know to be inhabited; if n is even, then n+1 isn’t even, so we just use negation.

Finally, there’s an example proof that 8 is even. It looks awful. Two things:

  • Agda found that proof automatically in less time than it took me to blink.
  • A fully formal proof that 8 is even is going to be scary in any formal system. I’d be impressed to see such a readable proof in (e.g.) ZF.

Universal Quantifier

We have predicates, so we need to figure out how to quantify over them, in a “constructive” way. The universal quantifier says that given any x:A, there is a proof b:B(x). This looks a lot like a function from the previous post, and in fact we’ll treat it almost the same: given a type family B:A→Type, we can form the type of dependent functions (also called the dependent product), written \prod_{\mathtt{x:A}}\mathtt{B(x)}. A dependent function is introduced the same way as a normal (non-dependent) function: if, given an arbitrary x:A, we can form a term b:B(x), then we can introduce the dependent function λx.b (notice, though, that the “output type” depends on the input value). As with non-dependent functions, we eliminate this by application, and (λx.b)(a) computes to b[x/a] (that is, b with a substituted for x).
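For a concrete example in the Agda notation above, here is a sketch of a dependent function whose output type mentions its input value, using the Vec family from earlier (the name replicate is my own):

replicate : {A : Set} (n : ℕ) → A → Vec A n
replicate zero     a = []                    -- a vector of length 0
replicate (succ n) a = a ∷ replicate n a     -- a vector of length n+1

The result type Vec A n depends on the argument n, so replicate inhabits the dependent product \prod_{n:\mathbb{N}}(\mathtt{A \rightarrow Vec\ A\ n}).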

Dependent functions generalize functions, since a “non-dependent” function A→B can be thought of as \prod_{\mathtt{x:A}}\mathtt{B(x)} where B(x) is constant. But now that we’ve generalized our functions, our elimination rules aren’t good enough: it’s not enough to be able to define normal functions out of types, we also need to be able to define dependent functions out of types. Intuitively, all we do is replace ‘type’ by ‘type family’ and ‘function’ by ‘dependent function’, and everything works out; the computation rules are even written the same. In practice, the dependent eliminators look much more terrifying than their non-dependent counterparts. Rather than going through it in general, I’ll just give an example: the natural numbers.

The usual way to write the proposition encoding induction (over the natural numbers) for a formula \varphi is something like

\varphi(0) \wedge (\forall n\in\mathbb{N} . \varphi(n) \rightarrow \varphi(n+1)) \rightarrow \forall n\in\mathbb{N} . \varphi(n)

which reads “If \varphi(0) holds, and for all n\in\mathbb{N}, if \varphi(n) holds then so does \varphi(n+1), then \varphi(n) holds for all n\in\mathbb{N}.” Translating this into type theory using dependent products as “for all” and C instead of \varphi gives us

\mathtt{C}(0) \rightarrow (\prod_{n:\mathbb{N}}\mathtt{C}(n) \rightarrow\mathtt{C}(n+1)) \rightarrow \prod_{n:\mathbb{N}}\mathtt{C}(n).

Besides changing some symbols and formatting, the only difference is that we’ve curried our function. In fact, the elimination rule for the natural numbers is exactly induction.
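In the Agda development above, this eliminator can be sketched directly as a (dependent!) recursive function; the name ℕ-ind is my own choice:

ℕ-ind : (C : ℕ → Set)
      → C zero                          -- base case
      → ((n : ℕ) → C n → C (succ n))    -- inductive step
      → (n : ℕ) → C n
ℕ-ind C c₀ cₛ zero     = c₀
ℕ-ind C c₀ cₛ (succ n) = cₛ n (ℕ-ind C c₀ cₛ n)

When C is a constant family, this is just the usual recursor for defining (non-dependent) functions out of ℕ.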

Existential quantifier

If we insist on having constructive content, then we had better be able to back a claim that “there exists an x such that P(x)” by exhibiting a witness that P(x) holds. That is, the first thing an “existential construction” should do is exhibit a thing; but it’s not enough to just give a thing and insist it satisfies our property, we must also demonstrate that this thing does, in fact, satisfy P. That is, to prove \exists x. P(x), we need to give

  • A witness x
  • A proof that P(x) holds.

As usual, our existential quantifier is limited to some domain; in our case, this will be a type, A. We can package our “existential construction” of \exists x\in\mathtt{A}.P(x) as a pair (x,p) where

  • x : A,
  • p : P(x)

This motivates one more type constructor in our system: the dependent sum type \sum_{\mathtt{x:A}}\mathtt{P}(\mathtt{x}). We introduce elements of this type as pairs (x,p) with x:A and p:P(x), and we eliminate them just as we would for the product type A×B, except we have to deal with the extra dependency. In short, though, to define a (dependent!) function f:\prod_{\mathtt{x}:\sum_{\mathtt{a:A}}\mathtt{P}(\mathtt{a})} \mathtt{C}(\mathtt{x}) it suffices to define f(\mathtt{(a,p)}) for each a:A and p:P(a).
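Continuing the Agda sketches above, the dependent sum and its eliminator might look like this (the names Σ, _,_ and Σ-elim are my own):

data Σ (A : Set) (B : A → Set) : Set where
  _,_ : (a : A) → B a → Σ A B   -- a pair of a witness and a proof about that witness

-- to define a dependent function out of Σ A B, it suffices to handle the pairs:
Σ-elim : {A : Set} {B : A → Set} {C : Σ A B → Set}
       → ((a : A) (b : B a) → C (a , b))
       → (p : Σ A B) → C p
Σ-elim f (a , b) = f a b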

The most obvious functions to define from any pairs are the projections; in the case of Cartesian products A×B, we have functions

  • pr_1 : \mathtt{A\times B \rightarrow A}
    pr_1(\mathtt{(a,b)}) :\equiv \mathtt{a}

  • pr_2 : \mathtt{A\times B \rightarrow B}
    pr_2(\mathtt{(a,b)}) :\equiv \mathtt{b}

The definitions are the same for dependent pairs, but there’s a catch: for the second projection of \sum_{\mathtt{x:A}}\mathtt{B(x)}, what is the type of the second coordinate of (a,b)? It’s not B, since B is a type family, not a type. Rather, it’s B(a). That is, the output type of the second projection depends on the input value, so the second projection is itself a dependent function. Thus, we have the dependent projections:

  • pr_1 : (\sum_{\mathtt{x:A}}\mathtt{B(x)}) \rightarrow \mathtt{A}
    pr_1(\mathtt{(a,b)}) :\equiv \mathtt{a}

  • pr_2 : \prod_{p:\sum_{\mathtt{x:A}}\mathtt{B(x)}} \mathtt{B}(pr_1(p))
    pr_2(\mathtt{(a,b)}) :\equiv \mathtt{b}

It’s always good to ensure things are well-typed. In this case, we need to see that pr_1(\mathtt{(a,b)})\equiv \mathtt{a} (definitionally), to ensure that \mathtt{B(a)}\equiv\mathtt{B}(pr_1(\mathtt{(a,b)})), but of course, this is true by the definition of pr_1.
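In the Agda sketch, the dependent projections can be written as follows (again, the names are my own); note that the result type B (pr₁ p) of pr₂ is exactly the dependency discussed above:

pr₁ : {A : Set} {B : A → Set} → Σ A B → A
pr₁ (a , b) = a

pr₂ : {A : Set} {B : A → Set} (p : Σ A B) → B (pr₁ p)
pr₂ (a , b) = b

The second clause type-checks because pr₁ (a , b) computes to a, so the expected type B (pr₁ (a , b)) is definitionally B a.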

Sums and products and exponents

Observe that just as the function type A → B is a special case of a dependent product type, the (binary) product type is a special case of the dependent sum type: If our type family B : A → Type is constant, then we recover exactly the product A×B, where we’re abusing notation and using B both for the constant type family and for its (unique) output value.
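A minimal Agda rendering of this observation, using the Σ sketched above (the name _×_ is my own):

_×_ : Set → Set → Set
A × B = Σ A (λ _ → B)   -- a Σ over a constant family is just a pair type

With this definition, pr₁ and pr₂ specialize to the ordinary product projections.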

This struck me as deeply counter-intuitive when I first saw it; surely the dependent sum should generalize sums and the dependent product should generalize products. Of course, this is true; we can recover binary sums and products from dependent ones (hint: they’re binary, so index over a two-element type; see the sketch below), but what’s more illuminating is to look at “type arithmetic”, or the more familiar set arithmetic.
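Here is one way to carry out that recovery in Agda, as a sketch assuming the Σ from above (the names Bool, choose, _⊎_ and _×'_ are my own):

data Bool : Set where
  true  : Bool
  false : Bool

choose : Set → Set → Bool → Set
choose A B true  = A
choose A B false = B

-- binary sum: a dependent sum over the two-element index type
_⊎_ : Set → Set → Set
A ⊎ B = Σ Bool (choose A B)

-- binary product: a dependent product over the same index type
_×'_ : Set → Set → Set
A ×' B = (b : Bool) → choose A B b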

If A and B are finite sets, there are a few neat observations you (hopefully) learn in your first set theory course:

  • \mid A + B\mid = \mid A\mid + \mid B\mid (where + is the disjoint union),
  • \mid A \times B\mid = \mid A\mid\cdot\mid B \mid,
  • \mid A^B \mid = \mid A\mid ^{\mid B\mid} (where A^B is the set of functions from B to A, which we’ve called B\rightarrow A).
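For instance, if \mid A\mid = 2 and \mid B\mid = 3, then \mid A + B\mid = 2 + 3 = 5, \mid A\times B\mid = 2\cdot 3 = 6, and \mid A^B\mid = 2^3 = 8: a function from B to A chooses one of 2 outputs for each of the 3 inputs.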

These equations actually motivate the definitions of these operations for cardinal arithmetic. Now, (dependent) sums and products represent operations of arbitrary arity, so the fact that functions are special cases of (dependent) products and (binary) products are special cases of (dependent) sums is really saying:

  • Exponentiation is just multiplying something by itself lots of times.
  • Multiplication is just adding something to itself lots of times.

And of course, every 4th grader knows both of these facts.
