THESE NOTES HAVE NOT BEEN VERIFIED BY THE INSTRUCTOR. USE AT YOUR OWN RISK.
The 1930's was an active time for computing machines – we saw the invention of the Turing machine, Post's machine, random access machines, 2-stack machines, 2-counter machines ... all of which are, in fact, equivalent. Another equivalent system is that of lambda calculus (or $\lambda$-calculus, or the lambda calculus), invented by Alonzo Church; this is what we'll talk about today.
You can find detailed notes on lambda calculus on the course website.
Note that this topic will not be on the final.
1Definitions¶
We first define a countable family of variables: $x, y, z, \ldots$ (we'll usually use letters from the end of the alphabet to denote variables, and letters from the beginning of the alphabet to denote constants, although in the purest form of $\lambda$ calculus there are no constants). That's all we need. The inductive definition of $\lambda$ terms is as follows:
- A variable is a $\lambda$ term.
- If $M$ and $N$ are $\lambda$ terms, so is $MN$.
- If $M$ is a $\lambda$ term and $x$ is a variable, then $\lambda x \cdot M$ is too (think of this as $x \mapsto M$, as in, $x$ maps to $M$, or
lambda x: M
in Python).
This encodes all computable functions!
For example, $(\lambda x \cdot M)N$ can be thought of as calling the $\lambda$-function with $N$ as the parameter.
1.1Free and bound variables¶
Let $FV$ be the function that takes a lambda-term as input and outputs the free variables in lambda. So, given the lambda-term $M$, $FV(M)$ is a finite subset of the set of variables.
- $FV(x) = \{x\}$
- $FV(MN) = FV(M) \cup FV(N)$
- $FV(\lambda x\cdot M) = FV(M) \setminus \{x\}$ (since $x$ is bound)
For example, consider the following expression (where parentheses are used to limit scope):
$$\underbrace{\lambda x \cdot \underbrace{\lambda y \cdot \underbrace{(\lambda z \cdot \underbrace{xyz}_{\text{all free}})}_{x, \,y \text{ free; } z \text{ bound}}y}_{x \text{ free; } y,\, z \text{ bound}}}_{\text{all bound}}$$
1.2Substitution rules¶
Note that Prakash uses his own notation for substitution, which may differ from the standard notation. Also note that the notation used below is not actually part of lambda calculus - it's just metanotation, to explain what the rules mean.
- $x[x \mapsto N] = N$
- $y[x \mapsto N] = y$ when $x \neq y$ (basically, nothing happens)
- $(MM')[x \mapsto N] = (M[x \mapsto N])(M'[x \mapsto N])$
- $(\lambda x \cdot M)[x \mapsto N] = (\lambda x \cdot M)$ (it's a no-op, because $x$ is shadowed by the $x$ in the lambda function)
- $(\lambda y \cdot M)[x \mapsto N] = \lambda y \cdot (M[x \mapsto N])$ where $x \neq y$, and either $y$ does not appear free in $N$ or $x \notin FV(M)$ (in which case this is a no-op)
- $(\lambda y \cdot M)[x \mapsto N] = (\lambda z \cdot M[y \mapsto z])[x \mapsto N]$ where $x \neq y$, $x$ does occur free in $M$ and $y$ does occur free in $N$, where $z$ is a fresh variable (i.e., one that does not appear in $M$).
1.3Reduction rules¶
Note that $Var(M)$ gives the bound variables of the lambda term $M$.
- $(\alpha)$: $\lambda y \cdot M \to \lambda v \cdot M[y \mapsto v]$ where $v \notin Var(M)$ (called alpha reduction; also known as "renaming bound variables")
- $(\beta)$: $(\lambda y\cdot M) N \to M[y\mapsto N]$
- $(\rho)$: $M \to M$
- $(\mu)$: if $M \to N$ then $PM \to PN$ (we can also write this as $\displaystyle \frac{M \to N}{PM \to PM}$
- $(\nu)$: if $M \to N$ then $MP \to NP$
- $(\xi)$ if $M \to N$ then $\lambda x\cdot M \to \lambda x \cdot N$
- $(\tau)$: $\displaystyle \frac{M\to N, \, N \to P}{M \to P}$ (transitivity)
Note: this is the untyped, pure version of lambda-calculus.
1.3.1Examples¶
Note: $\lambda xy\cdot M$ is shorthand for $\lambda x\cdot\lambda y\cdot M$.
$$\underbrace{(\lambda x\cdot x)}_{\text{identity function}} M \to M \tag{called $\beta$-redex}$$
$$\underbrace{(\lambda x\cdot y)}_{\text{constant function}}M \to y$$
$$\underbrace{(\lambda x \cdot \lambda y \cdot x)}_{\text{constant function factory, or K combinator}} M \to \lambda y \cdot M\tag{assuming no free $y$'s in $M$}$$
1.4Simulating selection (if/else)¶
If we want to be able to produce all computable functions using lambda calulus, we'll need a way to simulate selection (the if-then-else concept). To do that, we'll first need truth/false values. Let's use $\lambda xy\cdot x$ as true (we'll call it $T$), and $\lambda xy\cdot y$ as false (we'll call it $F$). We can see that $T$ is a function that ignores its second argument, and $F$ is a function that ignores its first argument.
Consider $TMN$ and $FMN$. They reduce as follows:
$$\begin{align} TMN & = (\lambda x\cdot(\lambda y\cdot x))MN \to (\lambda y\cdot M)N \\ & \to M \\ FMN & = (\lambda x \cdot (\lambda y \cdot y)) MN \to (\lambda y \cdot y) N \\ & \to N \end{align}$$
This allows us to build expressions of the form if x then M else N
where x
is either T or F.
1.5Simulating data structures¶
The salient property of data structures is the ability to put disparate things together and then take them apart, unscathed. We will now define some terms that will allow us to combine expressions then, later, retrieve the individual expressions, in a manner similar to that of very basic lists. We define $D = \lambda yzx \cdot xyz$. Then, $DMN = (\lambda yzx \cdot xyz)MN = \lambda x \cdot (xM)N$. Next, we define $\underline{\text{fst}} = \lambda w \cdot wT$ (equivalent to $\lambda w \cdot (wT)$ by precedence rules) and $\underline{\text{snd}} = \lambda w \cdot wF$.
Then, $\underline{\text{snd}}(DMN) = (\lambda w\cdot wF)(\lambda x\cdot xMN) \to (\lambda x\cdot xMN)F \to FMN \to N$, which is exactly what we wanted.
Similarly, $\underline{\text{fst}}(DMN) = (\lambda w \cdot wT)(\lambda x \cdot xMN) \to (\lambda x\cdot xMN)T \to TMN \to M$. Pretty awesome.
1.6Simulating numbers¶
Next, we can build the lambda-calculus version of numbers, using Church numerals. Let $\underline{n}$ be the Church numeral for the number $n$ (starting from $n=0$). Now, the Church numeral for 0, $\underline{0}$, is given by $\lambda f \cdot \lambda x \cdot x$ (this happens to be equivalent to F, our expression for false). $\underline{1}$ is given by $\lambda f \cdot \lambda x \cdot fx$, and $\underline{2}$ by $\lambda f \cdot \lambda x \cdot f(fx)$, etc. In shorthand, we have that $\underline{n} = \lambda fx \cdot f^n(x)$.
We define the successor function: $\underline{\text{succ}} = \lambda y \cdot \lambda f \cdot \lambda x \cdot yf(fx)$. Then:
$$\underline{\text{succ}}\underline{n} \to \lambda fx \cdot (\underline{n}f) (fx) \to \lambda fx \cdot f^n(fx) \to \lambda fx\cdot f^{(n+1)}x \to \underline{n+1}$$
Now we define addition: $\underline{\text{plus}} = \lambda uvfx\cdot uf(fvfx)$ since $\underline{n}fx$ means iterate $f$ $n$ times on $x$. So $\underline{\text{plus}} \underline{n} \underline{m} \to \lambda fx \cdot \underline{n} f(\underline{m} fx) \to \lambda fx \cdot \underline{n} f(f^mx) \to \lambda fx \cdot f^{n+m}x$, as desired.
Multiplication ($\underline{\text{times}}$) and exponentiation ($\underline{\text{exp}}$) are defined in the notes.
If we need to encode negative integers, we can do that using a tuple: $+\underline{n}$ can be represented by $(\underline n, T)$ and $-\underline{n}$ by $(\underline n, F)$.
1.7Simulating recursion¶
We define a function $\Delta = \lambda x\cdot xx$. Then $\delta \delta$: $(\lambda x \cdot xx)(\lambda x\cdot xx) \to \Delta\Delta$ which is just a copy of itself.
Next, we define the Y combinator: $Y = \lambda f\cdot (\lambda x \cdot f(xx))(\lambda x\cdot f(xx))$. Then $YM \to M(YM)$. If $Mx = x$ for some $x$, this $x$ is called a fixed point of $M$. $Y$ finds fixed points for any function. You can find more information on the Y combinator on wikipedia.