Thursday, November 7, 2013 CC-BY-NC
Reductions and PCP

Maintainer: admin

Reductions. The Post correspondence problem.

1Reductions

We say $P \leq Q$ if a solution to $Q$ produces a solution to $P$.

1.1Reduction for $E_{TM}$

Suppose that $L(M) = \{w \in \Sigma^* \mid M \text{ halts and accepts } w\}$ (in other words, the language recognised by machine $M$). We define the following set:

$$E_{TM} = \{\langle M \rangle \mid L(M) = \varnothing\} \tag{where $\langle M \rangle$ is an encoding of $M$}$$

Also, recall (from last lecture):

$$A_{TM} = \{\langle M, w\rangle \mid w \in L(M)\}$$

This set is RE but not decidable.

Now let's perform the following reduction: $A_{TM} \leq E_{TM}$. Assuming we have a recogniser (which we can treat as a black box) for $E_{TM}$, we need to effectively transform it into a solution for $A_{TM}$.

Given $(M, w)$ we define a new TM $M'$, which ignores its input and runs $M$ on $w$. (In Python, $M'$ would be defined as lambda x: M(w) where M and w are pre-defined.) If $M$ accepts $w$, then $M'$ accepts its input. Then we feed $\langle M' \rangle$1 to the supposed recogniser for $E_{TM}$. If $M$ accepts $w$, then $L(M') = \Sigma^*$; otherwise, it is just $\varnothing$. Thus if (and only if) our recogniser tells us that $L(M') = \Sigma^*$, then we would know that $M$ accepts $w$, which in turns gives us a decision procedure for $A_{TM}$.

We saw, in the previous lecture, a proof that $A_{TM}$ is not decidable. What we showed above is that if a recogniser for $E_{TM}$ exists, we would be able to create a valid recogniser for $A_{TM}$, which we know to be impossible. Thus $E_{TM}$ is also undecidable.

Incidentally, note that while $\overline{E_{TM}}$ is an RE set, $E_{TM}$ is not even RE.

1.2Reduction for $REG_{TM}$

Consider the following set:

$$REG_{TM} = \{ \langle M \rangle \mid L(M) \text{ is regular} \}$$

This is undecidable, which we will prove by reducing $A_{TM}$ to $REG_{TM}$ ($A_{TM} \leq REG_{TM}$).

Suppose we have a recogniser for $REG_{TM}$ and we are given $\langle M \rangle$ and $w$. Define a new Turing machine $M'$ which, when given some input $x \in \Sigma^*$, first checks if $x$ is of the form $a^nb^n$. If so, we accept, otherwise, simulate $M(w)$ and accept if and only if $M$ accepts $w$.

Next, we feed $\langle M' \rangle$ as input to our recogniser for $REG_{TM}$. If $M$ accepts $w$, then $M'$ will accept anything (i.e., $L(M') = \Sigma^*$, which is regular); otherwise, $M'$ will only words in the form $a^nb^n$ for some $n$ (i.e., $L(M') = \{a^nb^n\}$, which is not regular). Thus if (and only if) our recogniser tells us that $L(M')$ is regular, then we would know that $M$ accepts $w$, which in turns gives us a decision procedure for $A_{TM}$. This is a contradiction as we know there does not exist a decision procedure for $A_{TM}$, so $REG_{TM}$ is also undecidable.

1.3Reduction for $EQ_{TM}$

Consider the following set:

$$EQ_{TM} = \{\langle M_1, M_2 \rangle \mid L(M_1) = L(M_2)\}$$

In other words, given the description of two Turing machines, we check if they recognise the same language.

This is of course undecidable. We can prove it by reducing $A_{TM}$ to this ($A_{TM} \leq EQ_{TM}$). Given $M, w$, construct two new Turing machines: one, $M_1$, which rejects everything; and one, $M_2$, which rejects everything if and only if $M$ accepts $w$ and accepts everything otherwise. Then, $L(M_1) = L(M_2)$ if and only if $M$ accepts $w$. $\blacksquare$

1.4The Post correspondence problem

Now we'll introduce the Post correspondence problem, which will be a very powerful tool in our reduction arsenal.

The statement of the problem is as follows: we are given a finite set of pairs of strings, $S = \{\alpha_1, \beta_1), (\alpha_2, \beta_2), \ldots\}$. We can think of them as dominoes, with the $\alpha$'s on top and the $\beta$'s on the bottom. Our goal is to find a sequence of dominoes (with repeats allowed) for which the string formed by the strings on the top and that formed by the strings on the bottom are equal. Formally, a solution for an instance of PCP is a sequence of integers $i_1, \ldots, i_m \in \{1, n\}$ such that

$$\alpha_{i_1}\alpha_{i_2} \alpha_{i_3} \cdots \alpha_{i_m} = \beta_{i_1} \beta_{i_2} \beta_{i_3} \cdots \beta_{i_m}$$

1.4.1Examples

Here's an example:

$$S = \{(b, bbb), (babbb, ba), (ba, a)\}$$

One valid solution is $2, 1, 1, 3$, for then we would get $babbbbbba$ on both the top and the bottom.

Another example:

$$S = \{(a, bb), (b, aaa), (bab, aaab)\}$$

This clearly has no solutions, since the strings on the bottom are all longer than the strings on the top. In general, however, these problems are not decidable.

1.4.2Proof of undecidability

Next, we'll construct a proof that PCP is undecidable.

Consider a Turing machine which encodes sequences of configurations separated by a $\#$ such that successive configurations represent transitions of the Turing machine. A valid computation of a Turing machine is a finite sequence of configurations $\alpha_0\#\alpha_1\#\alpha_2\#\cdots\#\alpha_k$ such that

  1. $\alpha_0$ is a start configuration (i.e., the start state, $q_0$, must be the leftmost symbol);
  2. $\alpha_k$ is an end configuration (where the Turing machine has reached an accept/reject state and has stopped);
  3. $\forall i < k$, $\alpha_i \to \alpha_{i+1}$ reflects a possible transition for $\delta$. (For example, if we had $abaqabb$, and $\delta(q, a) = (q', b, L)$, then the only possible next configuration would be $abaq'abbb$.)

The crucial thing to note here is that changes between configurations (if any) are localised to a small region, of 3 spaces, because the machine can only move left or right one space at a time.

1.4.2.1Simulating computations

Now we can perform the reduction. We will show that if we solve PCP, we would be able to construct valid computations, and thus simulate the computation of a Turing machine on arbitrary input, which is not possible (see: the halting problem). In order to perform the reduction, we need to carefully design our dominoes so that when we go from one to the next, we only get possible transitions. We'll use a slightly modified version of PCP (called MPCP), adding the requirement that the first tile2 must be a particular tile: $(\#, \#q_0w_0,\cdots,w_n\#)$ (where the bottom is some start configuration).

We also enforce the following condition: if $\delta(q, a) = (q', b, R)$ and $q'$ is not a reject state, the next domino must be $(qa, bq')$. Similarly, if $\delta(q, a) = (r, b, L)$ then the next domino must be $(cqa, rcb)$ for any $c \in \Sigma$.

Then, for every $a \in \Gamma$, we create a tile of the form $(a, a)$, as well as tiles of the form $(aq_{\text{accept}}, q_{\text{accept}})$, $(q_{\text{accept}}a, q_{\text{accept}})$ and $(q_{\text{accept}}\#\#, \#)$. We also need the special tiles $(\#, \#)$ and $(\#, \sqcup\#)$.

Here's an example. Suppose that $\Gamma = \{0, 1, 2\}$ and that $w = 0100$. We begin at the start state, and read a 0. Suppose $\delta(q_0, 0) = (q_7, 2, R)$. Then our first domino would look like $(\#, \#q_00100\#)$. We would also have a domino that looks like $(q_0, 2q_7)$, and we'd have to put this next to it (since there wouldn't be any other tile that begins with $q_0$). Then we'd match on all the remaining symbols at the bottom of the first domino, using the tiles $(1, 1)$ and $(0, 0)$, resulting in $\#q_00100\#$ at the top and $\#q_001000\#2q_7100\#$ at the bottom.

1.4.2.2Forcing the start state

Now we will show that we can force the start state to be the desired one rather than making that one of the requirements.

We create the following new symbols (where $*$ is also a new symbol):

$$\begin{align} \overrightarrow{u} & = u_1u_2\ldots u_n \\ * \overrightarrow{u} & = * u_1* u_2* \ldots* u_n \\ \overrightarrow{u}* & = u_1*u_2*\ldots *u_n* \\ *\overrightarrow{u}* & = *u_1*u_2*\ldots *u_n* \end{align}$$

Given tiles $(t_1, b_1)$, $(t_2, b_2)$, etc, we want to force starting with $(t_1, b_1)$. To do so, we can transform it into $(*t_1, *b_1*)$. Then we transform the other dominoes so that they look like $(*t_2, b_2*)$, $(*t_3, b_3*)$, etc.

We also add a new tile $(*\diamondsuit, \diamondsuit)$ where $\diamondsuit$ is a new symbol. This will be used as the ending tile.

Thus, we're forced to start with $(*t_1, *b_1*)$ since nothing else starts with a $*$ on the bottom. So PCP is unsolvable, since it can encode generic Turing machine computations.

  1. Note that since we're just feeding the source code for $M'$, and not $M'$ itself, as input to $E_{TM}$, we don't need to worry about the fact that $M'$ might go into an infinite loop. 

  2. Tile and domino are used interchangeably in these notes.