The undecidability of validity in first-order logic. This lecture's notes are brought to you by @wetmore. Due to time constraints this only introduces first-order logic, and attempts to make it familiar and understandable.
Prakash's notes are very complete and detail why validity of first-order logic is undecidable.
Detailed notes on the instructor's website
1First-order logic¶
First we'll begin with a overview of first-order logic. You may recall the basics of propositional logic, which informally allows us to build up propositional statements and ask whether they have the value of true, or false. That is, if I were to tell you "$P$ is true", you'd be able to infer that "$P$ or $Q$" is true. First-order logic is a "stronger" system that lets us quantify over variables and introduce predicates.
The ingredients of first-order logic are the following:
- A countable set of symbols called variables, $Var$: we often use the traditional $x,y,z, \ldots$
- A set of symbols called constants. Variables can range over some set, but constants are fixed. We'll use arbitrary symbols $a,b,c,\ldots$.
- A finite set of function symbols: $f,g,h, \ldots$. A function has arity, a positive integer representing the number of arguments it takes.
- A finite set of predicate symbols: $P,Q,R,\ldots$. Predicate symbols also have arity. Intuitively one might want to think of a predicate as a function returning "true" or "false", but the concept of truth is currently ill-defined. We'll make this clearer shortly.
1.1Terms and formulas¶
With the first three of these building blocks we can assemble terms: constants are terms, variables are terms, and if $f$ is a function symbol of arity $n$, and $t_1, \ldots, t_n$ are terms then $f(t_1, \ldots, t_n)$ is a term. We call terms with no variables ground terms.
Finally, we may build formulas from these terms. We'll use the logical symbols you may be used to ($\lnot, \land, \lor, =, \Rightarrow$) and the quantifiers $\forall$ and $\exists$, but for the moment you do not need to attach meaning to them, just recognize that we can create well-formed formulas with them inductively in the following ways:
- If $t_1, t_2$ are terms then $t_1 = t_2$ is a formula
- If $P$ is a predicate symbol of arity $n$ and $t_1, \ldots, t_n$ are terms then $P(t_1, \ldots, t_n)$ is a formula
- If $\varphi$ is a formula then $\lnot\varphi$ is a formula
- If $\varphi$ and $\psi$ are formulas then so are $\varphi \land \psi$, $\varphi \lor \psi$ and $\varphi \Rightarrow \psi$.
- If $\varphi$ is a formula and $x$ is a variable then $\forall x.\varphi$ and $\exists x.\varphi$ are formulas.
1.2Interpretation¶
So far we've merely specified what formulas are legal, but they don't mean anything yet - they're just a string of symbols. Meaning comes from interpretation. We set a structure (a "domain of discourse") $D$, which is some set. An interpretation $\mathcal I$ is a structure $D$, plus:
- For each constant $a$ there is an element $d_a \in D$
- For each function symbol $f$ of arity $n$ there is a function $f : D^n \to D$
- For each predicate symbol $P$ of arity $n$ there is a set $P_D \subseteq D^n$
Note $D^n$ denotes the $n$-fold cartesian product $D \times D \times \ldots \times D$, $n$ times. Also, note the formal structure of a predicate is similar to that of a relation.
A valuation (or environment) is a map $\rho : Var \to D$, where we give each variable a concrete interpretation as an element of our structure/domain $D$. Let the notation $\rho[x \mapsto d]$ denote a valuation that is the same as $\rho$, except $x$ maps to $d$ instead of $\rho(x)$. We often fix $\mathcal I$, and if we fix some valuation $\rho$ we can write ⟦$t$⟧ to represent the interpretation of term $t$ given $\mathcal I$ and $\rho$. This is defined as follows:
- ⟦$a$⟧ = $d_a$
- ⟦$x$⟧ = $\rho(x)$
- ⟦$f(t_1, \ldots, t_n)$⟧ = ⟦$f($⟦$t_1$⟧$, \ldots, $⟦$t_n$⟧$)$⟧
1.3What is truth, anyway?¶
With $\mathcal I$ and $\rho$ we can finally define truth. We write $\mathcal I, \rho \models \varphi$ to mean that the formula $\varphi$ is true given the interpretation $\mathcal I$ and valuation $\rho$. What this means exactly is defined inductively:
- $\mathcal I, \rho \models t_1 = t_2$ if ⟦$t_1$⟧ = ⟦$t_2$⟧
- $\mathcal I, \rho \models P(t_1, \ldots, t_n)$ if $($⟦$t_1$⟧$, \ldots, $⟦$t_n$⟧$) \in P_D$
- $\mathcal I, \rho \models \varphi \land \psi$ if $\mathcal I, \rho \models \varphi$ and $\mathcal I, \rho \models \psi$ (we can define $\mathcal I, \rho \models \varphi \lor \psi$ similarly)
- $\mathcal I, \rho \models \lnot\varphi$ if $\mathcal I, \rho \not\models \varphi$
- $\mathcal I, \rho \models \exists x.\varphi$ if there is some element $d \in D$ such that $\mathcal I, \rho[x \mapsto d] \models \varphi$
- $\mathcal I, \rho \models \forall x.\varphi$ if for every $d \in D$, $\mathcal I, \rho[x \mapsto d] \models \varphi$
This illustrates the difference between terms and formulas - we cannot ask whether or not a term is true. It is merely a "thing" that we can use in actual questions: does $P$ imply $Q$? Is $x$ equal to $y$? These are questions we can (hopefully) answer as true or false. But it makes no sense to ask if, say, "2" is true or false.
So the truth of a formula depends on what we let the symbols in the formula mean. For example, consider the predicate $\text{sqrt2}(x)$, which we'll define for two choices of $D$: $\mathbb N$,the natural numbers, and $\mathbb R$, the real numbers. Informally, we want the predicate to return true if $x^2 = 2$. So then $\text{sqrt2}_{\mathbb N} = \varnothing$ and $\text{sqrt2}_{\mathbb R} = \{ \sqrt{2} \}$. Then formula $\exists x.\text{sqrt2}(x)$ is true if we let $D = \mathbb R$ but not if $D = \mathbb N$.
2Validity¶
As we've seen, a formula is detached from its interpretation - we may have a single formula but whether or not it is true depends on the interpretation. In fact, for a given formula there are three possibilities:
- The formula is true in any interpretation
- The formula is true in some interpretations and false in others
- The formula is false in any interpretation
If the first case holds, i.e. the formula is true in every interpretation, then the formula is valid. Valid formulas are true based purely on the formula and its symbols, irrespective of choice of interpretation. For example, the formula $x = x$ is valid - no matter what we substitute in for $x$, it will be true.
If, however, a formula is only true in some interpretations, we say the formula is satisfiable - there is some choice of interpretation and valuation in which this formula is true. We call this choice of interpretation and valuation a model, and this model satisfies the formula.
In the last case, the formula cannot be satisfied by any model, and it is invalid. These are all semantic notions about the formula itself.