Foundations of Functional Programming/The λ-cube

Figure: The λ-cube. Each arrow points in the direction of inclusion.

The λ-cube is a set of eight type theories which combine three different type system features in all possible ways. These three features are:

Values depending on types, also known as polymorphic types.
Types depending on types, also known as type constructors.
Types depending on values, also known as dependent types.

The fourth natural item in this list, namely values depending on values, is a feature of all λ-calculi; any language with functions has this feature. Indeed, we can see each of these three type system features as a way of extending the notion of a function:

Polymorphically typed expressions are functions from types to values.
Type constructors are functions from types to types.
Dependent types are functions from values to types.
And, of course, functions from values to values are a feature of all λ-calculi and have no special name.

The description of polymorphically typed expressions as functions from types to values may sound unfamiliar. However, it is exactly how they are described in the second-order λ-calculus. An expression of type $\forall \alpha .\ \mathbf {T}$ has the form $\Lambda \alpha .\ A$ , and to instantiate it to an expression of type $\mathbf {T} [\alpha /\mathbf {U} ]$ you write $(\Lambda \alpha .\ A)\mathbf {U}$ .

This syntax makes it clear how polymorphically typed values can be understood as functions from types to values. A polymorphically typed value takes a type as an argument, and returns an instance of itself instantiated to that type.
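
As a worked illustration (the identity function and the type ${\text{Int}}$ are our choice of example), consider the polymorphic identity function:

$\Lambda \alpha .\ \lambda x:\alpha .\ x\ :\ \forall \alpha .\ \alpha \to \alpha$

Here $\mathbf {T}$ is $\alpha \to \alpha$ . Instantiating at $\mathbf {U} ={\text{Int}}$ gives an expression of type $\mathbf {T} [\alpha /{\text{Int}}]$ :

$(\Lambda \alpha .\ \lambda x:\alpha .\ x)\ {\text{Int}}\ :\ {\text{Int}}\to {\text{Int}}$

Assuming the usual reduction rule for type application, this instance reduces in one step to $\lambda x:{\text{Int}}.\ x$ , the identity function on integers.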

Basic structure of the λ-cube

The basic structure of the λ-cube is explained by the following table:

	Polymorphic types?	Type constructors?	Dependent types?
λ→ (explicitly simply typed λ-calculus)	No	No	No
λ2 (explicitly typed second-order λ-calculus)	Yes	No	No
λ$\underline{\omega}$	No	Yes	No
λP	No	No	Yes
λω	Yes	Yes	No
λP2	Yes	No	Yes
λP$\underline{\omega}$	No	Yes	Yes
λPω (calculus of constructions)	Yes	Yes	Yes

Two of these systems are familiar. λ→ is (modulo minor syntactic variations) the explicitly simply typed λ-calculus. λ2 is (modulo minor syntactic variations) the explicitly typed second-order λ-calculus. λPω, the fullest system in the λ-cube, is also called the "calculus of constructions."

Syntax

Unlike the systems we have considered so far, the systems of the λ-cube make no syntactic distinction between types and values. There is one kind of expression, encompassing both type and value expressions. This corresponds to the fact that in the systems of the λ-cube, types are in essence a special kind of value. Henceforth we will use the term "value" in a way which is inclusive of types.

The systems of the λ-cube also introduce a distinction between variables and constants which was not present in the systems we have considered so far. This distinction corresponds roughly to the same distinction in programming languages. We will denote variables by $a,b,c,...$ as before, and we will denote constants by ad hoc labels; but when we need to refer to an arbitrary constant, we will use bold lower case letters $\mathbf {a} ,\mathbf {b} ,...$ . We continue to use $A,B,C,...$ to denote arbitrary expressions.

The syntax is as follows:

${\begin{array}{rcl}{\text{Expr}}&::=&{\text{Var}}\\&|&{\text{Const}}\\&|&{\text{Expr}}\ {\text{Expr}}\\&|&\lambda \ {\text{Var}}\ :\ {\text{Expr}}.\ {\text{Expr}}\\&|&({\text{Var}}\ :\ {\text{Expr}})\to {\text{Expr}}\end{array}}$

This syntax should be somewhat familiar. In relation to the explicitly typed second-order λ-calculus, the λ-abstractions in this syntax serve both the role of the λ-abstractions and the Λ-abstractions of the explicitly typed second-order λ-calculus. That is, a λ-abstraction can denote a function which takes either a non-type value, or a type, as an argument.

Dependent function type syntax

The syntax $(x:A)\to B$ is a generalization of the syntax $A\to B$ for a function type. This generalized syntax is needed to express dependent types. The syntax's special meaning arises when $x$ occurs free in $B$ . In this case, $(x:A)\to B$ expresses the type of a function which takes a value $x$ of type $A$ and produces a value of type $B$ , where $B$ is an expression denoting a type which depends on the value of $x$ .

Let us give a simple example of this syntax being used. Suppose ${\text{Vect}}$ is a type constructor for fixed length vectors: so ${\text{Vect}}\ a\ n$ , where $a$ is a type and $n$ is a natural number, denotes the type of vectors of length $n$ with elements of type $a$ .

Now consider a function ${\text{rep}}$ which takes an integer $x$ and a natural number $n$ , and produces a vector consisting of $n$ copies of $x$ . Such a function has type

${\text{rep}}:{\text{Int}}\to (n:{\text{Nat}})\to {\text{Vect}}\ {\text{Int}}\ n$

In this example we have used the syntax ${\text{Int}}\to B$ as a shorthand for $(x:{\text{Int}})\to B$ , where $x$ does not occur free in $B$ . We will continue to do this. That is, in general,

$A\to B$

is shorthand for

$(x:A)\to B$ ,

where $x$ is a variable which does not occur free in $B$ .
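
The grammar and the arrow shorthand can be sketched in code. The following is a minimal illustration, not part of the calculus itself: expressions are encoded as nested Python tuples (an encoding chosen here for brevity), and the hypothetical helper `show` prints $(x:A)\to B$ as $A\to B$ exactly when $x$ does not occur free in $B$ .

```python
# Expressions as tuples (an ad hoc encoding for this sketch):
#   ('var', x)  ('const', c)  ('app', F, C)
#   ('lam', x, A, B)  for  λx:A. B
#   ('pi', x, A, B)   for  (x:A) → B

def free_vars(e):
    """Return the set of variable names occurring free in e."""
    tag = e[0]
    if tag == 'var':
        return {e[1]}
    if tag == 'const':
        return set()
    if tag == 'app':
        return free_vars(e[1]) | free_vars(e[2])
    # 'lam' and 'pi' bind x in B, but not in A
    _, x, a, b = e
    return free_vars(a) | (free_vars(b) - {x})

def show(e):
    """Print e, using the shorthand A → B when the bound variable is unused."""
    tag = e[0]
    if tag in ('var', 'const'):
        return e[1]
    if tag == 'app':
        fun, arg = show(e[1]), show(e[2])
        if e[1][0] in ('lam', 'pi'):
            fun = f'({fun})'
        if e[2][0] not in ('var', 'const'):
            arg = f'({arg})'
        return f'{fun} {arg}'
    _, x, a, b = e
    if tag == 'lam':
        return f'λ{x} : {show(a)}. {show(b)}'
    if x in free_vars(b):
        return f'({x} : {show(a)}) → {show(b)}'
    dom = show(a)
    if a[0] in ('lam', 'pi'):
        dom = f'({dom})'
    return f'{dom} → {show(b)}'

INT, NAT, VECT = ('const', 'Int'), ('const', 'Nat'), ('const', 'Vect')
vect_int_n = ('app', ('app', VECT, INT), ('var', 'n'))
# The type of rep, written out in full as (x:Int) → (n:Nat) → Vect Int n
rep_type = ('pi', 'x', INT, ('pi', 'n', NAT, vect_int_n))
print(show(rep_type))  # Int → (n : Nat) → Vect Int n
```

The outer binder $x$ is dropped because it is not free in the result type, while $n$ is kept because the result type mentions it.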

We will close this topic by noting that in most presentations of the λ-cube, the syntax $\Pi x:A.\ B$ is used instead of $(x:A)\to B$ . The former syntax is motivated by the analogy between dependent function types and dependent Cartesian products in higher mathematics. We prefer our syntax for its greater familiarity from a programming perspective.

Sorts

The λ-cube is the first place we encounter the notion of "sorts." A sort is best understood, in simple terms, as a "type of types." In the λ-cube, where types are a kind of value, types generally have types, in the sense that they can appear on the left hand side of typing judgments.

We shall denote the "type of types" by ${\text{Type}}$ . ${\text{Type}}$ is the simplest example of a sort. (A more common notation for this sort is $\ast$ , but we choose our notation for greater familiarity, and for its use in programming languages such as Idris.)

So far, we have given no reason to think that there are any sorts other than ${\text{Type}}$ . If sorts are types of types, and ${\text{Type}}$ is the type of types, why would there be any sorts other than ${\text{Type}}$ ?

The reason that the systems of the λ-cube (and many other λ-calculi) have more than one sort is to resolve the following problem. What is the type of ${\text{Type}}$ ? It is natural to say that ${\text{Type}}$ , the type of types, is a type, and that therefore ${\text{Type}}:{\text{Type}}$ . The problem is that the assumption that ${\text{Type}}:{\text{Type}}$ gives rise to paradoxes such as Girard's paradox which render λ-calculi logically inconsistent.

For this reason, many λ-calculi, including the systems of the λ-cube, introduce an additional sort, which we will call ${\text{Type}}_{1}$ , and stipulate that ${\text{Type}}:{\text{Type}}_{1}$ , rather than ${\text{Type}}:{\text{Type}}$ . This choice preserves logical consistency. (A more common notation for ${\text{Type}}_{1}$ is $\Box$ .)

Another word on notation. We will also denote ${\text{Type}}$ by ${\text{Type}}_{0}$ . Systems such as Martin-Löf type theory have an infinite hierarchy of sorts ${\text{Type}}_{0},{\text{Type}}_{1},{\text{Type}}_{2},...$ , which is the reason for writing the sort of ${\text{Type}}$ as ${\text{Type}}_{1}$ .

In the systems of the λ-cube, there are only two sorts: ${\text{Type}}$ (a.k.a. ${\text{Type}}_{0}$ ) and ${\text{Type}}_{1}$ . Moreover, ${\text{Type}}_{1}$ does not belong to any sort; it does not appear on the left hand side of any typing judgments.

Syntactically speaking, sorts are constants. The two sorts are the only constants we will specifically need to deal with; if we didn't need sorts, we could formulate the systems of the λ-cube without constants, as we did for previous calculi.

Declarations and contexts

Declarations have the same syntax as before. However, contexts are defined differently in the systems of the λ-cube than in the systems we have seen so far.

In the systems we have seen so far, contexts are sets of declarations. Sets are unordered collections. In the systems of the λ-cube, contexts are sequences -- ordered finite collections -- of declarations. We denote a context by a comma-separated list of its constituent declarations. An empty context is denoted by whitespace; so for example, $\vdash x:A$ means that the empty context entails that $x:A$ .

The reason for this is that in the systems of the λ-cube, one declaration is often required for another declaration to make sense. The simplest example is that for the declaration $x:a$ to make sense, where $a$ is a variable standing for a type and $x$ a variable standing for an ordinary value, one must have $a:{\text{Type}}$ . A context consisting of the single declaration $x:a$ is not valid; however, the context $a:{\text{Type}},\ x:a$ is valid. On the other hand, if we swap the order of the declarations, the resulting context $x:a,\ a:{\text{Type}}$ is invalid, because at the occurrence of the declaration $x:a$ we don't know that $a$ is a type.

In the systems we considered before, this particular necessity did not exist because there was a syntactic distinction between types and values, and so you could tell just by looking at it that a variable represented a type.

In greater generality, for a variable to occur on the right hand side of a statement, it must have a type declared in the context governing the statement. For a declaration $x:a$ which occurs within a context $\Gamma$ , the context governing it is the part of $\Gamma$ which comes before $x:a$ .

Type theory

The rules of β-reduction in all systems of the λ-cube are as they have been before.

The different systems of the λ-cube differ in their rules defining valid type entailments. We will begin by giving the rules which are common to all of the systems. In the following rules, $s$ denotes any sort (one of ${\text{Type}}$ or ${\text{Type}}_{1}$ ).

(axiom)	$\vdash {\text{Type}}:{\text{Type}}_{1}$
(start)	${\frac {\Gamma \vdash A:s}{\Gamma ,\ x:A\vdash x:A}},\ {\text{if}}\ x\notin \Gamma$
(weakening)	${\frac {\Gamma \vdash A:B\quad \Gamma \vdash C:s}{\Gamma ,\ x:C\vdash A:B}},\ {\text{if}}\ x\notin \Gamma$
(application)	${\frac {\Gamma \vdash F:(x:A)\to B\quad \Gamma \vdash C:A}{\Gamma \vdash FC:B[x/C]}}$
(abstraction)	${\frac {\Gamma ,x:A\vdash B:C\quad \Gamma \vdash (x:A)\to C:s}{\Gamma \vdash (\lambda x:A.B):(x:A)\to C}}$
(conversion)	${\frac {\Gamma \vdash A:B\quad \Gamma \vdash B':s\quad B\equiv _{\beta }B'}{\Gamma \vdash A:B'}}$
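
As a worked example of how these rules combine, here is a sketch of a derivation showing that the context $a:{\text{Type}},\ x:a$ from the section on contexts entails $x:a$ . Each line follows from earlier lines by the rule named on the right.

$\vdash {\text{Type}}:{\text{Type}}_{1}$ (axiom)

$a:{\text{Type}}\vdash a:{\text{Type}}$ (start, with $\Gamma$ empty, $A={\text{Type}}$ and $s={\text{Type}}_{1}$ )

$a:{\text{Type}},\ x:a\vdash x:a$ (start, with $\Gamma =a:{\text{Type}}$ , $A=a$ and $s={\text{Type}}$ )

$a:{\text{Type}},\ x:a\vdash a:{\text{Type}}$ (weakening, applied to the second line with $C=a$ and $s={\text{Type}}$ )

By contrast, no derivation can introduce the declaration $x:a$ into the empty context: the start rule would require the premise $\vdash a:s$ , and $a$ has not been declared at that point.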

β-conversion

The final rule in the above table, the conversion rule, uses a symbol we have not seen before: $\equiv _{\beta }$ , which stands for β-convertibility. We say that two expressions are β-convertible when they can be connected by a series of (forwards or backwards) β-reduction steps. More precisely, $\equiv _{\beta }$ is the smallest equivalence relation which relates two expressions whenever one is β-reducible to the other. This mathematical definition needs some unpacking.

In mathematics, a "binary relation" (or just a "relation") on a set $S$ is a set $R$ of ordered pairs $(a,b)$ of elements of $S$ . Given elements $a,b\in S$ , we typically write $aRb$ to mean that $(a,b)\in R$ . $aRb$ means that $a$ is related to $b$ by the relation $R$ .

A simple example of a relation is the relation $\leq$ on the set of natural numbers, $\mathbb {N}$ . $\leq$ , as a relation in the mathematical sense, is the set of pairs $(a,b)$ of natural numbers such that $a$ is less than or equal to $b$ .

An "equivalence relation" on a set $S$ is a relation $R$ on $S$ satisfying the following axioms:

Reflexivity: For all $a\in S$ , $aRa$ .
Symmetry: For all $a,b\in S$ , if $aRb$ then $bRa$ .
Transitivity: For all $a,b,c\in S$ , if $aRb$ and $bRc$ , then $aRc$ .

A simple example of an equivalence relation is the identity relation on any set $S$ : that is, the relation which an object bears only to itself. A more complex example of an equivalence relation is the relation on natural numbers of being congruent modulo seven: i.e., of having the same remainder when divided by seven.

The relation $\equiv _{\beta }$ , called β-convertibility, is the least equivalence relation on the set of expressions of the λ-cube such that if an expression $A$ is β-reducible to an expression $B$ , then $A\equiv _{\beta }B$ . More precisely, $\equiv _{\beta }$ is the intersection (in the set-theoretic sense) of all equivalence relations $R$ on λ-cube expressions such that if $A$ β-reduces to $B$ then $ARB$ . It can be proven that this intersection is itself an equivalence relation.
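
In practice, β-convertibility of two expressions can be tested by reducing both to normal form and comparing the results. The following is a minimal sketch of that idea, under two assumptions that are ours rather than the text's: bound variables are represented by de Bruijn indices (so that renaming of bound variables never matters), and both inputs have normal forms. The function may fail to terminate on expressions that have none.

```python
# Terms with de Bruijn indices (a representation chosen for this sketch):
#   ('var', i)     variable bound by the i-th enclosing binder, 0 = innermost
#   ('const', c)   constant
#   ('app', F, C)  application F C
#   ('lam', A, B)  λx:A. B
#   ('pi', A, B)   (x:A) → B

def shift(t, d, cutoff=0):
    """Add d to every variable index that is at least `cutoff`."""
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if tag == 'const':
        return t
    if tag == 'app':
        return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))
    # 'lam' and 'pi': the body sits under one more binder than the domain
    return (tag, shift(t[1], d, cutoff), shift(t[2], d, cutoff + 1))

def subst(t, j, s):
    """Replace variable j in t by s."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == j else t
    if tag == 'const':
        return t
    if tag == 'app':
        return ('app', subst(t[1], j, s), subst(t[2], j, s))
    return (tag, subst(t[1], j, s), subst(t[2], j + 1, shift(s, 1)))

def beta(body, arg):
    """Contract the redex (λx:A. body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)

def normalize(t):
    """Reduce t to β-normal form (assumes one exists)."""
    tag = t[0]
    if tag in ('var', 'const'):
        return t
    if tag == 'app':
        f, a = normalize(t[1]), normalize(t[2])
        if f[0] == 'lam':
            return normalize(beta(f[2], a))
        return ('app', f, a)
    return (tag, normalize(t[1]), normalize(t[2]))

def convertible(a, b):
    return normalize(a) == normalize(b)

INT, NAT, VECT = ('const', 'Int'), ('const', 'Nat'), ('const', 'Vect')
three = ('const', '3')
ident_nat = ('lam', NAT, ('var', 0))                          # λn:Nat. n
lhs = ('app', ('app', VECT, INT), ('app', ident_nat, three))  # Vect Int ((λn:Nat. n) 3)
rhs = ('app', ('app', VECT, INT), three)                      # Vect Int 3
print(convertible(lhs, rhs))  # True
print(convertible(INT, NAT))  # False
```

The first pair shows why the conversion rule is needed: the two vector types differ syntactically but denote the same type once the argument is reduced.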

System-specific rules

The systems of the λ-cube differ in what kinds of functions they allow. The kinds of functions under consideration are: functions from values to values; functions from types to values; functions from types to types; and functions from values to types. For each of these four kinds of functions, there is a rule to the effect that functions of that kind exist.

For example, the rule stating that functions from values to values exist reads as follows:

${\frac {\Gamma \vdash A:{\text{Type}}\quad \Gamma ,\ x:A\vdash B:{\text{Type}}}{\Gamma \vdash (x:A)\to B:{\text{Type}}}}$

The rule stating that functions from types to values exist reads as follows:

${\frac {\Gamma \vdash A:{\text{Type}}_{1}\quad \Gamma ,\ x:A\vdash B:{\text{Type}}}{\Gamma \vdash (x:A)\to B:{\text{Type}}}}$

The difference between these rules is the sort of $A$ ; in the former rule, $A$ is a ${\text{Type}}$ (a type of regular values), and in the latter rule, $A$ is a ${\text{Type}}_{1}$ (a type of types).

The general pattern is that $A$ is a ${\text{Type}}$ for functions taking regular values, and a ${\text{Type}}_{1}$ for functions taking types. Similarly, $B$ is a ${\text{Type}}$ for functions producing regular values, and a ${\text{Type}}_{1}$ for functions producing types. The sort of the resulting function type is the same as the sort of $B$ .

Now we describe the general pattern formally. Let $s_{1}$ and $s_{2}$ be sorts. The " $(s_{1},s_{2})$ rule" is the rule:

${\frac {\Gamma \vdash A:s_{1}\quad \Gamma ,\ x:A\vdash B:s_{2}}{\Gamma \vdash (x:A)\to B:s_{2}}}$
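
To illustrate the two instances not written out above, here are sketches of the $({\text{Type}}_{1},{\text{Type}}_{1})$ and $({\text{Type}},{\text{Type}}_{1})$ rules applied to simple cases, using the shorthand $A\to B$ since $x$ does not occur in $B$ . The first forms the type of a type constructor taking one type argument:

${\frac {\Gamma \vdash {\text{Type}}:{\text{Type}}_{1}\quad \Gamma ,\ x:{\text{Type}}\vdash {\text{Type}}:{\text{Type}}_{1}}{\Gamma \vdash {\text{Type}}\to {\text{Type}}:{\text{Type}}_{1}}}$

The second forms the type of a family of types indexed by values of a type $A$ :

${\frac {\Gamma \vdash A:{\text{Type}}\quad \Gamma ,\ x:A\vdash {\text{Type}}:{\text{Type}}_{1}}{\Gamma \vdash A\to {\text{Type}}:{\text{Type}}_{1}}}$

For instance, assuming ${\text{Nat}}:{\text{Type}}$ is declared in the context, the constructor ${\text{Vect}}$ from the earlier example would have type ${\text{Type}}\to {\text{Nat}}\to {\text{Type}}$ . Forming the inner arrow ${\text{Nat}}\to {\text{Type}}$ uses the $({\text{Type}},{\text{Type}}_{1})$ rule, and forming the outer arrow uses the $({\text{Type}}_{1},{\text{Type}}_{1})$ rule.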

The systems of the λ-cube are differentiated by which of the $(s_{1},s_{2})$ rules they have. The following table describes the $(s_{1},s_{2})$ rules of each system, and the feature each rule provides:

Feature:	Ordinary functions	Polymorphic types	Type constructors	Dependent types
Rule:	$({\text{Type}},{\text{Type}})$	$({\text{Type}}_{1},{\text{Type}})$	$({\text{Type}}_{1},{\text{Type}}_{1})$	$({\text{Type}},{\text{Type}}_{1})$
λ→ (explicitly simply typed λ-calculus)	Yes	No	No	No
λ2 (explicitly typed second-order λ-calculus)	Yes	Yes	No	No
λ$\underline{\omega}$	Yes	No	Yes	No
λP	Yes	No	No	Yes
λω	Yes	Yes	Yes	No
λP2	Yes	Yes	No	Yes
λP$\underline{\omega}$	Yes	No	Yes	Yes
λPω (calculus of constructions)	Yes	Yes	Yes	Yes
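
The table and the typing rules can be combined into a small type checker. The following is a rough sketch under stated assumptions, not a reference implementation: terms use de Bruijn indices, a context is a list of types, the system names are ad hoc dictionary keys (the `-weak` suffix labels the two systems that have type constructors but not polymorphic types), and the conversion rule is applied by normalizing types before comparing them.

```python
# Terms: ('var', i), ('const', c), ('app', F, C),
# ('lam', A, B) for λx:A. B and ('pi', A, B) for (x:A) → B.
TYPE, TYPE1 = ('const', 'Type'), ('const', 'Type1')
VV, TV = ('Type', 'Type'), ('Type1', 'Type')      # ordinary functions, polymorphic types
TT, VT = ('Type1', 'Type1'), ('Type', 'Type1')    # type constructors, dependent types
SYSTEMS = {
    'λ→': {VV}, 'λ2': {VV, TV}, 'λω-weak': {VV, TT}, 'λP': {VV, VT},
    'λω': {VV, TV, TT}, 'λP2': {VV, TV, VT}, 'λPω-weak': {VV, TT, VT},
    'λPω': {VV, TV, TT, VT},
}

class IllTyped(Exception):
    pass

def shift(t, d, cutoff=0):
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if tag == 'const':
        return t
    if tag == 'app':
        return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))
    return (tag, shift(t[1], d, cutoff), shift(t[2], d, cutoff + 1))

def subst(t, j, s):
    tag = t[0]
    if tag == 'var':
        return s if t[1] == j else t
    if tag == 'const':
        return t
    if tag == 'app':
        return ('app', subst(t[1], j, s), subst(t[2], j, s))
    return (tag, subst(t[1], j, s), subst(t[2], j + 1, shift(s, 1)))

def beta(body, arg):
    return shift(subst(body, 0, shift(arg, 1)), -1)

def normalize(t):
    tag = t[0]
    if tag in ('var', 'const'):
        return t
    if tag == 'app':
        f, a = normalize(t[1]), normalize(t[2])
        return normalize(beta(f[2], a)) if f[0] == 'lam' else ('app', f, a)
    return (tag, normalize(t[1]), normalize(t[2]))

def sort_of(ctx, t, rules):
    """Infer the type of t and require it to be a sort; return the sort's name."""
    s = normalize(infer(ctx, t, rules))
    if s not in (TYPE, TYPE1):
        raise IllTyped('expected a type or a kind')
    return s[1]

def infer(ctx, t, rules):
    """Return a type of t in ctx (list of types, oldest declaration first)."""
    tag = t[0]
    if tag == 'const':
        if t == TYPE:
            return TYPE1                                  # (axiom)
        raise IllTyped('constant has no type')
    if tag == 'var':                                      # (start), (weakening)
        if t[1] >= len(ctx):
            raise IllTyped('undeclared variable')
        return shift(ctx[len(ctx) - 1 - t[1]], t[1] + 1)
    if tag == 'pi':                                       # the (s1, s2) rule
        s1 = sort_of(ctx, t[1], rules)
        s2 = sort_of(ctx + [t[1]], t[2], rules)
        if (s1, s2) not in rules:
            raise IllTyped('rule not available in this system')
        return ('const', s2)
    if tag == 'lam':                                      # (abstraction)
        sort_of(ctx, t[1], rules)
        fun_type = ('pi', t[1], infer(ctx + [t[1]], t[2], rules))
        sort_of(ctx, fun_type, rules)
        return fun_type
    fun_type = normalize(infer(ctx, t[1], rules))         # (application)
    if fun_type[0] != 'pi':
        raise IllTyped('not a function')
    if normalize(infer(ctx, t[2], rules)) != fun_type[1]:
        raise IllTyped('argument type mismatch')
    return beta(fun_type[2], t[2])

def typeable(t, system, ctx=()):
    try:
        infer(list(ctx), t, SYSTEMS[system])
        return True
    except IllTyped:
        return False

poly_id = ('lam', TYPE, ('lam', ('var', 0), ('var', 0)))  # λa:Type. λx:a. x
endo = ('lam', TYPE, ('pi', ('var', 0), ('var', 1)))      # λa:Type. a → a
print(typeable(poly_id, 'λ2'), typeable(poly_id, 'λ→'))   # True False
print(typeable(endo, 'λω-weak'), typeable(endo, 'λ2'))    # True False
```

The polymorphic identity needs the $({\text{Type}}_{1},{\text{Type}})$ rule, so it is rejected in λ→; the type-level function `endo` needs the $({\text{Type}}_{1},{\text{Type}}_{1})$ rule, so it is rejected in λ2. The sketch does not check that the initial context passed to `typeable` is itself valid.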

Properties of the calculi

All systems of the λ-cube have the following properties:

β-reduction preserves type: if $\Gamma \vdash A:T$ and $A$ β-reduces to $A'$ , then $\Gamma \vdash A':T$ .
The strong normalization property holds: every reduction sequence starting from a typeable expression terminates. Consequently all functions typeable in the calculus halt, and the calculus is not Turing complete.
The problems of type checking and typability are decidable.
Every value expression has at most one type, up to β-conversion. More precisely, for any expression $A$ and any context $\Gamma$ , if two expressions $B$ and $B'$ are such that $\Gamma \vdash A:B$ and $\Gamma \vdash A:B'$ , then $B\equiv _{\beta }B'$ .