Thursday, November 14, 2013 CC-BY-NC
Limitations of CFLs

Maintainer: admin

THESE NOTES HAVE NOT BEEN VERIFIED BY THE INSTRUCTOR. USE AT YOUR OWN RISK.

Taught by Artem. Limitations of CFLs. This class ended early, which is why these notes are not very long.

1The limitations of CFLs

Unlike with regular languages, there are many questions that can't be answered at all for CFLs. Although certain questions for regular languages cannot be answered efficiently, all of them can at least be answered. With CFLs, however, some questions are simply undecidable.

1.1Undecidability of the ambiguity question

We'll use the Post correspondence problem to show that the question of whether or not a context-free grammar is ambiguous is undecidable.

Suppose we have the following blocks:

$$\left \{ \left [ \frac{t_1}{b_1} \right ], \ldots, \left [ \frac{t_n}{b_n} \right ] \right \}\tag{where $t_i, b_i \in \Sigma^*$}$$

Also, we add the following to the set of symbols, which will be used to tag which type of tile was used: $\Delta = \{d_1, \ldots, d_n\}$ (so the set of symbols that we can use is $\Sigma \cup \Delta$).

Create a CFL with the following rules:

  • $S \to T \mid B$
  • $T_i \to t_iTd_i$ for all $i \in \{1, \ldots, n\}$
  • $B_i \to b_iBd_i$ for all $i \in \{1, \ldots, n\}$
  • $T \to T_1 \mid \ldots \mid T_n \mid \epsilon$
  • $B \to B_1 \mid \ldots \mid B_n \mid \epsilon$

If the grammar is ambiguous, then there are two different ways to get the same string, which happens if and only if the instance of PCP has a solution. Thus we've reduced PCP to ambiguity. $\blacksquare$

TODO: Come back to this and make this explanation more detailed.

1.2VALCOMP

We'll define a language VALCOMP which takes as input a deterministic Turing machine, $M$, and a letter set $z$, and consists of the set of all valid computations of $M$ on $z$ (i.e., computations that halt). So if your machine never halts on $z$ for any input, then VALCOMP is $\varnothing$.

Now, we create a machine to recognise the complement of VALCOMP, which happens to be a context-free language. The algorithm works as follows: count the number of hashes (separators), and do something for every other separator. This machine never halts! It's undecidable. This shows how even innocent-sounding questions about CFLs can be as difficult as solving the halting problem.

TODO: This section is incomplete. Check back later, or feel free to edit it yourself to fill in the missing notes.

1.3Other undecidable problems

  • Given $G$, is $L(G)$ context-free?1
  • Given $G_1$ and $G_2$, is $L(G_1) \cap L(G_2)$ context-free?
  • Given $G$, is $L(G)$ regular?
  • Given $G$, is $L(G)$ ambiguous?
  1. I think this includes context-sensitive grammars (which we haven't studied) but I'm sure. It seems that the question wouldn't make any sense otherwise, though, since any context-free grammar should of course generate a context-free language?