Type theory in 15 minutes

I’ve posted a couple of things on (homotopy) type theory so far, and both times hinted at giving an actual introduction. Since I can finally feel justified in not doing work for a little bit, I’ll go ahead and do that now. I wanted to cover dependent type theory, but that turned out to take longer than I had thought, so we’ll start with a less ambitious type theory: the simply typed lambda calculus.

Mathematics as construction

Type theory arises from an attempt to formalize L.E.J. Brouwer’s notion that mathematics is “an activity of mental construction.” The traditional way to formalize a mathematical discipline is to lay out a logical language intended to delimit the statements (true or false) that can be meaningfully expressed, and then to posit some axioms intended to capture the basic truths of the system. From this, we “do” mathematics by proving certain statements to be true, and others to be false. That is, the foundation is logic, and mathematics is built on top of this foundation.

Brouwer viewed this as an inversion of the actual state of affairs: the notion of “mathematical construction” is basic, while logic is a sort of mathematical activity, and so must be built on top of it. To formalize this intuition, instead of presenting the valid statements and basic truths, we must posit the basic objects and basic constructions. Only then, if we choose, may we interpret certain objects as being “logical statements”, and certain objects as being “logical proofs”. To present type theory, then, we must determine

• What are the basic objects?
• What are the basic constructions?
• How do we interpret logic?

But before we even begin to answer those questions, we need to think a little more about what “construction” means.

Mathematical constructions: types and judgment

Intuitively, a construction should say something like “given an object x, we may form an object c(x).” For example: given natural numbers x and y, we may form their sum, x+y. Observe that this statement requires that x and y are natural numbers; it makes no sense to form the sum of “green” and “whiskey”, or of a group and a natural number. That is, constructions only make sense on objects of a given type. An object, by virtue of existing, has a type attached. We call our objects “terms”, and we write “the term x has type A” as x:A (we will also express this in English as “x is an inhabitant of A”). I said “by virtue of existing”, meaning that it doesn’t make sense to present an object without it having a type–as opposed to (ZF) set theory, where all objects are taken from “the universe”, in type theory we must say what an object is when we introduce it.

This expression, x:A, is a typing judgment. It is not a proposition: after all, propositions are part of logic, and we haven’t gotten that far yet. Similarly, we do not “prove” a judgment, we infer it. The valid inferences are given by inference rules, which always take the form

Given judgments $\mathcal{J}_1, \mathcal{J}_2, \ldots, \mathcal{J}_n$, we may infer the judgment $\mathcal{I}$.

So far, we have seen one sort of judgment: x:A. However, before we can even make such a judgment, we must know that A is a type. We express this judgment A:Type. By analogy with typing judgments, this seems to be saying that Type is a type. For now, this analogy is harmless.

There is another form of judgment which is somewhat more subtle. Introducing the sum x+y does us no good unless we can compute with it. That is, unless we can say what the term x+y means. For this, we introduce equality judgments, which give “judgmental” or “definitional” equality. We write x≡y to mean “x is judgmentally equal to y.” Judgmental equality is supposed to capture the idea that two terms are different “words” with the same meaning: one term may be substituted for another without change of meaning. For example “President of the United States” and “Head of State for the US” mean the same thing. However, even though Barack Obama is the President of the United States, we cannot freely substitute “Barack Obama” for “the President of the United States”, as demonstrated by the nonsensical “George Bush was Barack Obama before the President of the United States.”

Since x and y can only have the same meaning when they have the same type, x≡y only makes sense when x:A and y:A (for some type A). When we wish to be clear about the type in question, we will write x≡y:A. By the same analogy as above, we also have A≡B:Type.

That is, we have

• Typing judgments x:A and A:Type
• Equality judgments x≡y:A and A≡B:Type.

Presenting types

Intuitively, a type is the collection of all objects for which certain constructions make sense. Since we need to introduce the “basic” objects and “basic” constructions, this means for any type we need to say

• what “basic” (or canonical) terms inhabit that type (how to introduce inhabitants of that type);
• what “basic” constructions we may perform on terms of that type (how to eliminate inhabitants of that type);
• how the basic constructions “behave” on the basic terms (how to compute with the basic constructions).

These are given by 3 sorts of rules, called introduction, elimination and computation rules. The elimination and computation rules will capture the idea that a type is “completely determined by its canonical inhabitants”. That is, it suffices to define constructions by defining them on canonical terms. For reasons that will become clearer when I discuss dependent types, we will need a fourth sort of rule called a formation rule, which says under what conditions we can “form” a given type. Usually this is “the most obvious condition possible”.

A type is defined by these 4 sorts of rules; unlike sets, which are defined as the totality of their members, a type is defined by when it can be formed, what its canonical elements are, what basic constructions it admits and how these basic constructions behave.

The first example: functions

Since we can present the rest of our types by reference to function types, we start with these. Intuitively, a function type represents the “valid constructions” which take terms of a given type and return terms of another. We have

Function formation: Given types A and B (A:Type and B:Type), we may form the type of functions from A to B, written A→B.

Function introduction: What does it mean to have a “basic” function from A to B? Since functions represent constructions, we should have that whenever we have a way to construct a B from an arbitrary A, we may construct a function. Somewhat more rigorously: whenever it is the case that given x:A, we may construct b:B, then there is a function λx.b:A→B. This is called a lambda abstraction and is usually read out loud as “lambda x dot b”. Intuitively, this is supposed to be the function which returns b when applied to input x. There is a subtlety here: b is a term, but it may contain x as a “free” variable. For example, x+1 is a term of type N (the type of natural numbers), but in order to actually determine what this term means, we must first know what x means. So, λx.x+1 is a function which takes some x:N and returns x+1 (which also has type N).

Function elimination: Given a function f:A→B and a term x:A, we may apply f to x, to get f(x):B. For example, our function λx.x+1 may be applied to the number 0, which should return 1. This fact is captured by

Function computation: If, given an arbitrary x:A, we may construct b:B, and we have a term a:A, then (λx.b)(a) means “b, with x replaced with a.” We write this (λx.b)(a)≡b[x/a]. Using our example function, we have (λx.x+1)(0)≡0+1. When we actually have 0, + and 1 defined, we will see that this is in fact judgmentally equal to 1.
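These three rules describe exactly how functions behave in a functional programming language. As a minimal sketch in Haskell (using Haskell’s `Int` as a stand-in for the natural-number type N; the names `plusOne` and `applied` are just illustrative):

```haskell
-- Function introduction: lambda abstraction. The term λx.x+1 becomes:
plusOne :: Int -> Int
plusOne = \x -> x + 1

-- Function elimination: application. Given f : A → B and a : A, form f(a) : B.
-- Function computation: (λx.x+1)(0) ≡ 0+1, which evaluates to 1.
applied :: Int
applied = plusOne 0
```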

A brief note: This might all look technical and scary, but I’ve really said nothing interesting so far: I’ve only said that if we have a rule for “constructing” an object (of a certain type) whenever we have an object (of some other type), then we have a mathematical object which “captures” this construction. Moreover, this object behaves exactly as we should expect it to.

Functions of multiple arguments

Crucially, a function type accepts only a single input. To define a function such as +, which takes multiple inputs, we can abstract over multiple arguments. E.g., we have the (formally unsatisfying, but intuitively clear) function λx.λy.x+y. What is the type of this function? Well, it takes x:N, and returns λy.x+y. This in turn takes y:N and returns x+y. Since x+y:N, we have λy.x+y:N→N, and so finally λx.λy.x+y:N→(N→N). That is, we have a function which takes an N and returns a function N→N. This is often called a curried function (after Haskell Curry). Since we write functions of multiple arguments a lot, we will omit brackets, and assume → associates “to the right”. So N→N→N means a function which takes an N and returns a function N→N, while (N→N)→N is a function which takes a function N→N and returns an N.
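Currying is native in Haskell, so we can sketch λx.λy.x+y directly (again with `Int` standing in for N, and `add`/`addThree` as illustrative names):

```haskell
-- λx.λy.x+y as a curried function of type N → N → N:
add :: Int -> Int -> Int
add = \x -> \y -> x + y

-- Applying add to a single argument yields a function N → N ...
addThree :: Int -> Int
addThree = add 3
-- ... and applying that function yields an N: addThree 4 computes to 7.
```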

The rest of the simple types

Which types are included in “the” simply typed lambda calculus depends on the purpose–some authors just have function types (and some base type), others add sums, or products, or N. We’ll stick with what we need to encode propositional logic: functions, sums, products, unit and empty.

The unit type is supposed to be a “one element type”, which we may as well call 1. We may always form 1.

We introduce a unique canonical term, let’s say *:1.

What does it mean to eliminate a term of type 1? The elimination rule is supposed to say how to define constructions “out of” 1. Since a construction “out of” 1 is the same thing as a function out of 1, we need to know how to define such a function. Intuitively, this should just select a term of some type A. That is,

Unit elimination: Given a type A:Type and a term a:A, we have a function a*:1→A such that
Unit computation: a*(*) ≡ a.
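In Haskell the unit type already exists as `()`, whose unique canonical term is also written `()`. A sketch of the elimination rule (the name `fromUnit` is just illustrative):

```haskell
-- Unit elimination: given a : A, the function a* : 1 → A which selects a.
fromUnit :: a -> (() -> a)
fromUnit a = \() -> a
-- Unit computation: a*(*) ≡ a, i.e. fromUnit a () evaluates to a.
```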

The empty type is supposed to be a type which we can never construct an inhabitant of. We will call this type 0 (and as with 1, we can always form it). Since we do not want to have any inhabitants, there is no introduction rule.

What about the elimination rule? Just as in set theory, we want to have a unique “empty” function from 0 to any other type. That is: given a type A, there is a function !:0→A, and since there is no introduction rule for 0, we have no computation rule: there is nothing to compute.
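Haskell provides the empty type as `Data.Void.Void`, with the empty function available as `absurd`; a sketch (the name `emptyFn` is illustrative):

```haskell
import Data.Void (Void, absurd)

-- The empty type 0 is Void, a type with no constructors, so there is
-- no introduction rule: no term of Void can ever be built.
-- Empty elimination: the unique function ! : 0 → A, here instantiated at Int.
emptyFn :: Void -> Int
emptyFn = absurd
-- There is no computation rule: emptyFn can never actually be applied.
```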

Product types are supposed to capture the Cartesian product of two types. So, the product A×B should have terms (a,b) with a:A and b:B. This gives us the formation and introduction rules as:

Given A:Type and B:Type we may form the product A×B:Type; given a:A and b:B, we introduce (a,b):A×B.

What about elimination? The obvious choice is the projections, but we can do a little better: a construction “out of” A×B should correspond to a function A×B→C for some type C; that is, to a function of two arguments. Recalling curried functions, we should then have a function A→B→C; more explicitly:

Product elimination: Given a type C:Type and a function f:A→B→C, there is a function f':A×B→C which behaves like f “on pairs”. That is, we have the computation rule f'((a,b)) ≡ f(a)(b). Then we can define the projections as p1 ≡ (λx.λy.x)' (that is, the function defined by elimination on λx.λy.x) and p2 ≡ (λx.λy.y)'. Checking that this works:

p1((a,b)) ≡ (λx.λy.x)'((a,b)) ≡ (λx.λy.x)(a)(b) ≡ (λy.a)(b) ≡ a

The first equality is the definition of p1, the second by the computation rule for products, and the third and fourth by the computation rule for functions. (Notice that there’s no replacement to do in the last one, since y doesn’t occur in the expression a).
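The product elimination rule is Haskell’s `uncurry`; a direct sketch, with the projections defined exactly as above (the names `elimPair`, `p1`, `p2` are illustrative):

```haskell
-- Product elimination: from f : A → B → C we get f' : A×B → C.
elimPair :: (a -> b -> c) -> ((a, b) -> c)
elimPair f = \(a, b) -> f a b

-- The projections, defined by elimination on λx.λy.x and λx.λy.y:
p1 :: (a, b) -> a
p1 = elimPair (\x _y -> x)

p2 :: (a, b) -> b
p2 = elimPair (\_x y -> y)
```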

Finally, we want sum types, which we write A+B (and we can form this type under the same conditions as product and function types). These correspond to disjoint unions in set theory. Why not plain unions? Because we cannot arbitrarily throw objects into new types (as we can in set theory), so we need to “tag” our terms with their “real” home. That is, given types A:Type and B:Type, and a term a:A, we have a term inl a:A+B, and likewise, given b:B, we have inr b:A+B (inl is read “in left” and inr is read “in right”). Notice that we have two introduction rules.

To define a function A+B→C, we need to know what to do with the “left” terms and what to do with the “right” terms. In other words, we need two constructions f:A→C and g:B→C. This allows us to construct a function f+g:A+B→C, with the computation rules (one for each introduction rule!), (f+g)(inl a) ≡ f(a) and (f+g)(inr b) ≡ g(b).
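In Haskell, sum types appear as `Either`, with inl = `Left` and inr = `Right`, and the elimination rule is the standard function `either`. Writing it out directly (the name `elimSum` is illustrative):

```haskell
-- Sum elimination: from f : A → C and g : B → C, build f+g : A+B → C,
-- with one computation rule per introduction rule:
elimSum :: (a -> c) -> (b -> c) -> (Either a b -> c)
elimSum f _ (Left a)  = f a   -- (f+g)(inl a) ≡ f(a)
elimSum _ g (Right b) = g b   -- (f+g)(inr b) ≡ g(b)
```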

Encoding logic

To wrap up this (already long) post, let’s look at how to encode logic. According to our Brouwerian intuition, “proving a proposition” should correspond to “constructing an object”–logical proof is a part of mathematical activity, and mathematical activity is all about constructing things. It seems natural to think that all proofs of a given proposition have the same “type”–the type of proofs of that proposition. So why not encode propositions as types, and say that a “proof” of that proposition is a term of the corresponding type?

We start with the two obvious propositions: “the” true proposition, and “the” false proposition. Since truth corresponds to being inhabited, we have an obvious candidate for true: 1, and since false things should never be provable, the false proposition should have no inhabitants: let’s take 0.

Implication: “A implies B” means if A is true, then B is true. That is, from a proof of A, we can construct a proof of B. This is exactly what A→B describes.

Conjunction: to show that “A and B are true” we need to know that A is true and that B is true. In other words, to construct a proof of “A and B”, we need a proof of A and a proof of B: this is exactly A×B.

Disjunction: To prove “A or B” we need either a proof of A or a proof of B. As you can hopefully guess by now, this is exactly A+B.

Finally, what about negation? The usual way to show something is false is to show that it leads to absurd consequences. That is, ¬A is true when from a proof of A, we can conclude something false. I.e., ¬A should correspond to A→0.

So, summing it up:

 True:  1
False:  0
A→B:  A→B
A∧B:  A×B
A∨B:  A+B
¬A:  A→0
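Under this encoding, proving a proposition just means writing down a term of the corresponding type. A Haskell sketch, with `Void` playing the role of 0 (the names `Not`, `andComm`, and `dni` are illustrative):

```haskell
import Data.Void (Void)

-- ¬A is encoded as A → 0:
type Not a = a -> Void

-- A proof of (A ∧ B) → (B ∧ A) is a term of type (a, b) -> (b, a):
andComm :: (a, b) -> (b, a)
andComm (a, b) = (b, a)

-- Double negation *introduction*, A → ¬¬A, is constructively provable:
dni :: a -> Not (Not a)
dni a = \notA -> notA a

-- The converse, ¬¬A → A, has no closed term here: double negation
-- elimination is exactly what this constructive system lacks.
```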


If you compare the introduction and elimination rules for our types with the corresponding introduction and elimination rules in natural deduction, you see that (except for product elimination) they correspond exactly, with 0 elimination being the principle of explosion. Moreover, the usual elimination rules for conjunction are exactly the projections, and we can “prove” our elimination rule for products in natural deduction (that is, we have a derivation in natural deduction of A∧B→C with open assumption A→B→C).

One thing we’re missing is double negation elimination: that is, we have intuitionistic or constructive propositional logic. For Brouwer, this was a Good Thing: proving that “A is true” by showing that A is not false doesn’t give us an actual witness that A is true. For example, the usual presentation of Euclid’s proof that there are infinitely many primes doesn’t (necessarily) give us an actual prime beyond a given bound; it just tells us that there must be one somewhere. (For what it’s worth, the theorem is constructively valid; you just need to be more careful to present the proof constructively.)

There are other reasons that this is a good thing (and of course, reasons it’s a bad thing), but we’ll leave those for another day.