Counting using bijections continued: labelled spanning trees in a graph, and Catalan numbers.
1Counting using bijections¶
1.1Labelled spanning trees¶
Let $G = (V, E)$ be an undirected graph where all $n$ vertices are labelled $1, \ldots, n$. Consider the number of distinct spanning trees (taking rotation/reflection into account) present in such a graph for different values of $n$:
- $n=1$: there's only one vertex, so there is clearly only one spanning tree (just the vertex itself). So: 1
- $n=2$: also only one spanning tree, as there's only one way to connect the two vertices. So: 1
- $n=3$: the vertices are 1, 2, and 3; we could create a spanning tree out of $1-2-3$, or $1-3-2$, or $3-1-2$. So: 3
- $n=4$: there are two types of configurations:
- if we put them in a straight line and connect them that way, there are 12 options:
- $1-2-3-4$ and $1-3-2-4$;
- $1-2-4-3$ and $1-4-2-3$;
- $1-3-4-2$ and $1-4-3-2$;
- $2-1-4-3$ and $2-4-1-3$;
- $2-3-1-4$ and $2-1-3-4$;
- $3-1-2-4$ and $3-2-1-4$.
- if we put one in the middle and put the other three around it, and connect them all to the middle, we have 4 options (one for each possibility in the middle)
- Total: 12 + 4. So: 16
- if we put them in a straight line and connect them that way, there are 12 options:
- $n=5$: 125 (given by prof)
1.1.1The formula¶
By quick inspection we can guess that the general formula for the number of labelled spanning trees is $t(n) = n^{n-2}$. Let's now try to prove this.
1.1.2The proof¶
We can prove this using a bijection. Let $g(n)$ be the number of labelled spanning trees with a particular left vertex and a particular right vertex (they can be the same vertex). So $g(n) = t(n) \cdot n^2$, since there are $n$ choices for the left vertex and $n$ choices for the right vertex. Now, if $g(n) = n^n$, then we are done. Can we construct a bijection between $g(n)$ and something we know to be of size $n^n$?
Consider the set of strings of length $n$, composed using an alphabet of $\{1, 2, \ldots, n\}$. We know this to have size $n^n$. This is equivalent to the set of functions from $[n]$ to $[n]$, and we'll think of it using this formulation. Take any function $f: [n] \to [n]$. We can view $f$ as a directed graph - for each $x \in [n]$, we draw a node, and for all $x \in [n]$, we draw a directed edge from $x$ to $f(x)$. Thus each vertex will have an out-degree of 1, and an in-degree of at most $n$.
Thus can we split the graph into connected components, each of which is called an arborescence. So the resulting graph has one cycle for each arborescence, potentially with some trees hanging off of it.
Let $S \subseteq [n]$ be the set of nodes that are directly in a cycle (i.e., not merely hanging off of a cycle). For example, if $f(1) = 3$, $f(2) = 2$, $f(3)=2$, and $f(4)=1$, then there is only one cycle (when 2 loops back to itself), with the rest of the graph hanging off of it. So $S = \{2\}$.
Consider what happens when we restrict $f$ to $S$. That is, only consider $f$ on the domain $S$. Let $S_0$ be $\min(S)$ (i.e., the least element in $S$), and let $S_1 = \max(S)$. To find a labelled spanning tree from this, place the vertices in $S$ in $f$-order. So if $f(x)$ (on domain $S$) is given by
$x$ | 1 | 4 | 5 | 7 | 8 | 9 |
---|---|---|---|---|---|---|
$f(x) | 7 | 9 | 1 | 5 | 8 | 4 |
Then 7 is the left vertex, 4 is the right vertex, and we have 9-1-5-8 in the middle.
For the other vertices, connect them the way they are originally connected in the graph, since they're already trees.
Thus we have all we need to create our bijection. To reverse the process - i.e., go from a labelled spanning tree to the original function - we do the following: Let $P$ be the path from the left to the right vertex. In our example above, we would have 7-9-1-5-8-4. This is your $f$-order. To get the original order, just write it in ascending order: 1-4-5-7-8-9. Now draw this as a graph, using the initial $f$-order: 1 goes to 7, 4 goes to 9, 5 goes to 1, etc.
Then we just add back the rest of the trees. From there, we can easily recover the original function based on the arcs in the forest. $\blacksquare$
Note: we're not expected to be able to come up with a bijection like this for the final. However, we might get something similar but simpler, so it's a good idea to verify this bijection and try to understand it.
1.2Catalan numbers¶
How many sequences of $n$ +'s and $n$ -'s are there such that each partial sum (starting from the left) is non-negative?
For example, when $n=3$, we have $+-+-+-$, $+-++--$, $++-+--$, $++--+-$, and $+++---$, so 5 total.
What's the pattern? What's the number of Catalan numbers for general $n$? Well, whatever it is, it also happens to be equal to the number of paths from $(0, 0)$ to $(2n, 0)$ on the Cartesian plane using $n$ $(1, 1)$ steps (aka $\nearrow$) and $n$ $(1, -1)$ steps (aka $\searrow$), where we are never allowed to go below the $x$-axis.
These are called Dyck walls (where Dyck is pronounced like "dike"). Clearly, the number of Dyck walks for a given $n$ is equivalent to the number of Catalan numbers for the same $n$. We call this number $C(n)$.
Theorem: $\displaystyle C(n) = \frac{1}{n+1} \binom{2n}{n}$
Proof: Observe that there are $\binom{2n}{n}$ paths from $(0, 0)$ to $(2n, 0)$ using $n$ $\nearrow$'s and $n$ $\searrow$'s, because you specify the $n$ places (with no replacement) for the $\nearrow$s and the $\searrow$ spots are determined by that (or vice versa).
Now consider what fraction of these paths go below the x-axis. We'll call this number $D(n)$. Well, any path that goes below the x-axis hits the line $y=-1$ at some point. Let $(q, -1)$ be the first such point for a path $P$. Now split $P$ into two sub-paths at $(q, -1)$, such that everything from $(0, 0)$ to $(q, -1)$ is part of path $P_1$, and everything from $(q, -1)$ to $(2n, 0)$ is part of $P_2$.
By design, everything in $P_1$ is $\geq$ the line $y=-1$. What if we were to reflect $P_1$ across this line, and call the result $P_1'$? Well, everything in $P_1'$ would be $<$ the line $y =-1$, except for the final endpoint which is at $(q, -1)$. Now create a new path $P' = \{P_1', P_2\}$. This is a path from $(0, -2)$ to $(2n, 0)$, using $(n+1)$ $\nearrow$s and $(n-1)$ $\searrow$s, because we have one more $\searrow$ than $\nearrow$ in $P_1$ and so $P_1'$ would have one more $\nearrow$ than $\searrow$s, and $P_2$ must have one more $\nearrow$ than $\searrow$ as a result as well.
So, because there are $n+1$ $\nearrow$s, there are $\binom{2n}{n+1}$ paths from $(0, -2)$ to $(2n, 0)$. But this number is just $D(n)$! So we have
$$D(n) = \binom{2n}{n+1}$$
Using this, we derive a formula for $C(n)$:
$$\begin{align} C(n) & = \binom{2n}{n} - \binom{2n}{n+1} = \\ & = \frac{(2n)!}{n!n!} - \frac{(2n)!}{(n+1)!(n-1)!} \\ & = \frac{(2n)!}{n!n!} - \frac{(2n)!}{n!\cdot (n+1) \cdot (n-1)!} \cdot \frac{n}{n} \\ & = \frac{(2n)!}{(n!)^2} \cdot \frac{(n+1)}{(n+1)} - \frac{(2n)!\cdot n}{(n!)^2\cdot(n+1)} \tag{as $(n+1)!\cdot n = n!$} \\ & = \frac{(2n)!\cdot(n+1) - (2n)!\cdot n}{(n!)^2\cdot (n+1)} \\ & = \frac{(2n)!}{(n!)^2\cdot(n+1)} \tag{by cancelling} \\ & = \frac{(2n)!}{n!\cdot n!} \cdot \frac{1}{n+1} \\ & = \frac{1}{n+1} \binom{2n}{n} \end{align}$$