HTSEFP: How to solve every problem CC-BY-NC

Maintainer: admin

The exam will take place on Friday, April 25, in Studio 2 of the gym. Bring a faculty standard calculator.

There will be 6 questions, 3 on theory, 3 on applications.

1Integer programming

1.1General solution

1.2Examples

Winter 2010 final, question 2

From the lecture on April 8:

Maximise

$$W = 0.5x^2 + 3y + 2z$$

such that

$$3x + 4y+3z \leq 10 \quad \text{and} \quad 2x+3y+3z \leq 10$$

where $x, y, z$ are all integers $\geq 0$.

Use D.P. formulation, subject to the first constraint.

Let's look at our options. $x$ can be 0, 1, 2, or 3 (any more and you'd have $3x > 10$, which wouldn't satisfy the first constraint even if the others were 0). $y$ can be 0, 1 or 2. $z$ can be 0, 1, 2, or 3.

First, we calculate $f_1(x') = 0.5x^2$ for each value of $x' \leq 10$ such that $3x \leq x'$. We have:

  • $f_1(0) = 0$ ($x = 0$)
  • $f_1(1) = 0$ ($x = 0$)
  • $f_1(2) = 0$ ($x = 0$)
  • $f_1(3) = 0.5$ ($x = 1$)
  • $f_1(4) = 0.5$ ($x = 1$)
  • $f_1(5) = 0.5$ ($x = 1$)
  • $f_1(6) = 2$ ($x = 2$)
  • $f_1(7) = 2$ ($x = 2$)
  • $f_1(8) = 2$ ($x = 2$)
  • $f_1(9) = 4.5$ ($x = 3$)
  • $f_1(10) = 4.5$ ($x = 3$)

Next, we calculate values of $\displaystyle f_2(x) = \max_{y=0,1,2} \bigg [ 3y + f_1(x-4y) \bigg ]$, marking with a $\bigstar$ the maximum for each $x$-value1:

  • $f_2(0) = 0$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(0) = 0$ $\bigstar$
    • when $y=1$, $f_1(x-4y) = f_1(-4)$ which isn't defined
    • when $y=2$, also undefined
  • $f_2(1) = 0$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(1) = 0$ $\bigstar$
    • when $y=1$, $f_1(x-4y) = f_1(-3)$ which isn't defined
    • when $y=2$, also undefined
  • $f_2(2) = 0$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(2) = 0$ $\bigstar$
    • when $y=1$, $f_1(x-4y) = f_1(-2)$ which isn't defined
    • when $y=2$, also undefined
  • $f_2(3) = 0.5$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(3) = 0.5$ thus $3y + f_1(x-4y) = 0.5$ $\bigstar$
    • when $y=1$, $f_1(x-4y) = f_1(-1)$ which isn't defined
    • when $y=2$, also undefined
  • $f_2(4) = 3$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(4) = 0.5$, thus $3y + f_1(x-4y) = 0.5$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(0) = 0$, thus $3y + f_1(x-4y) = 3$ $\bigstar$
    • when $y=2$, $f_1(x-4y) = f_1(-4)$ which isn't defined
  • $f_2(5) = 3$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(5) = 0.5$, thus $3y + f_1(x-4y) = 0.5$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(1) = 0$, thus $3y + f_1(x-4y) = 3$ $\bigstar$
    • when $y=2$, $f_1(x-4y) = f_1(-3)$ which isn't defined
  • $f_2(6) = 3$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(6) = 2$, thus $3y + f_1(x-4y) = 2$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(2) = 0$, thus $3y + f_1(x-4y) = 3$ $\bigstar$
    • when $y=2$, $f_1(x-4y) = f_1(-23)$ which isn't defined
  • $f_2(7) = 3.5$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(7) = 2$, thus $3y + f_1(x-4y) = 2$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(3) = 0.5$, thus $3y + f_1(x-4y) = 3.5$ $\bigstar$
    • when $y=2$, $f_1(x-4y) = f_1(-1)$ which isn't defined
  • $f_2(8) = 6$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(8) = 2$, thus $3y + f_1(x-4y) = 2$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(4) = 0.5$, thus $3y + f_1(x-4y) = 3.5$
    • when $y=2$, $3y = 6$ and $f_1(x-4y) = f_1(0) = 0$ thus $3y + f_1(x-4y) = 6$ $\bigstar$
  • $f_2(9) = 6$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(9) = 4.5$, thus $3y + f_1(x-4y) = 4.5$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(5) = 0.5$, thus $3y + f_1(x-4y) = 3.5$
    • when $y=2$, $3y = 6$ and $f_1(x-4y) = f_1(1) = 0$ thus $3y + f_1(x-4y) = 6$ $\bigstar$
  • $f_2(10) = 6$
    • when $y=0$, $3y=0$ and $f_1(x-4y) = f_1(10) = 4.5$, thus $3y + f_1(x-4y) = 4.5$
    • when $y=1$, $3y=3$ and $f_1(x-4y) = f_1(6) = 2$, thus 3y + f_1(x-4y) = 5$
    • when $y=2$, $3y = 6$ and $f_1(x-4y) = f_1(2) = 0$ thus $3y + f_1(x-4y) = 6$ $\bigstar$

Finally, we calculate $f_3(10)$. Note that we don't need to calculate $f_3(x)$ for values of $x$ other than 10 - this is the last one, because there are only 3 variables.

$$f_3(10) = \max_{z \in 0, 1, 2, 3} [ 2z + f_2(10 - 3z)] = \begin{cases} z=0: & 0 + f_2(10) = 0+6 = 6 \\ z=1: & 2 + f_2(7) = 2 + 3.5 = 5.5 \\ z=2: & 4 + f_2(4) = 4 + 3 = 7 \, \bigstar \\ z=3: & 6 + f_2(1) = 6 + 0 = 6 \end{cases} $$

Thus the maximum value for $W$ is attained when $z=2$. To find what the values of $x$ and $y$ should be, look at how $f_2(4)$ was obtained. We see that $y=1$ and that $f_1(0)$ was evaluated, which tells us that $x=0$. Thus $W = 0.5x^2 + 3y + 2z = 0 + 3 + 4 = 7$, as we calculated.

So we have $x=0$, $y=1$, $z=2$.

For a sanity check, let's make sure that this satisfies both constraints. We have $3x+4y+3z = 0 + 4 + 6 = 10 \leq 10$ $\checkmark$. Also, we have $2x + 3y + 3z = 0 + 3 + 6 = 9 \leq 10$ $\checkmark$. So both constraints are satisfied. $\blacksquare$

2The traveling salesman problem

2.1General solution

The formula should be given. Just do sets of size $k$ for $k=0, 1, 2, 3, \ldots$ until you've covered all the possibilities. Once you get to the end, backtrack - the first arg of the last $f$-value is the last node, the first arg of the second to last $f$-value is the second to last node, etc.

2.2Examples

Winter 2010 final, question 3

3The transportation problem

3.1General solution

You're given a table. Hopefully the formula to use is given, but it should be

$$f_N(x_i) = \min_{0 \leq x_{1N} \leq x_i} \bigg [ g_{1N}(x_{1N}) + g_{2N} (r_N- x_{1N}) + f_{N-1}(x_i-x_{1N} ) \bigg ]$$

where $g_{ij}(x_{ij})$ is the total cost of sending $x_{ij}$ units from depot $i$ to demand point $j$ (so the setup cost $c_{ij}$ if necessary, plus the coefficient $a_{ij}$ times $x_{ij}$), and $r_N$ is the demand needed at demand point $N$.

You iterate through the demand points. For each demand point, iterate through all the possible ways of allocating the $r_N$ resources provided by the depots.

3.2Example

Assignment 5, question 1.

4Markov decision processes

4.1Examples

4.1.1With discounting

Given in class (Thursday, February 13):

State $i$ Policy $k$ $p_{i1}^k$ $p_{i2}^k$ $q_i^k$ (expected reward)
1 1 0.6 0.4 4.6
1 2 0.7 0.3 3.2
2 1 0.4 0.6 2.4
2 2 0.5 0.5 1.5

$\alpha = 0.9$

At time step 1, we see that policy 1 is best for both states.

Fuck it, I'll do it later if there's time

4.1.2Without discounting

Given in class (Tuesday April 8):

State $i$ Policy $k$ $p_{i1}^k$ $p_{i2}^k$ $q_i^k$ (expected reward)
1 1 0.7 0.3 6
1 2 0.2 0.8 4
2 1 0.5 0.5 3
2 2 0.6 0.4 2

Evaluate $f_n(i)$ and the optimal policy $n=1,2,3,\ldots$, where

$$f_n(i) = \max_k \left [ q_i^k + \sum_{j=1}^N p_{ij}^k f_{n-1}(j) \right ] \tag{with $f_0=0$}$$

At time step 1:

  • If we're at state 1:
    • policy 1 reward: 6 $\bigstar$
    • policy 2 reward: 4
  • If we're at state 2:
    • policy 1 reward: 3 $\bigstar$
    • policy 2 reward: 2

Thus $f_1(1) = 6$ (policy 1) and $f_1(2) = 3$ (policy 1).

Now, time step 2:

  • If we're at state 1:
    • Policy 1: $0.7 \cdot f_1(1) + 0.3 \cdot f_1(2) + 6 = 0.7\cdot 6 + 0.3 \cdot 3 + 6 = 11.1$ $\bigstar$
    • Policy 2: $0.2 \cdot f_1(1) + 0.8 \cdot f_1(2) + 4 = 0.2 \cdot 6 + 0.8 \cdot 3 + 4 = 7.6$
  • If we're at state 2:
    • Policy 1: $0.5 \cdot f_1(1) + 0.5 \cdot f_1(2) + 3 = 0.5 \cdot 6 + 0.5 \cdot 3 + 3 = 7.5$ $\bigstar$
    • Policy 2: $0.6 \cdot f_1(1) + 0.4 \cdot f_1(2) + 2 = 0.6\cdot 6 + 0.4 \cdot 3 + 2= 6.8$

So $f_2(1) = 11.1$ (policy 1) and $f_2(2) = 7.5$ (policy 1).

Next, time step 3:

  • State 1:
    • Policy 1: $0.7 \cdot f_2(1) + 0.3 \cdot f_2(2) + 6 = 0.7\cdot 11.1 + 0.3 \cdot 7.5 + 6 = 16.02$ $\bigstar$
    • Policy 2: $0.2 \cdot f_2(1) + 0.8 \cdot f_2(2) + 4 = 0.2 \cdot 11.1 + 0.8 \cdot 7.5 + 4 = 12.22$
  • If we're at state 2:
    • Policy 1: $0.5 \cdot f_2(1) + 0.5 \cdot f_2(2) + 3 = 0.5 \cdot 11.1 + 0.5 \cdot 7.5 + 3 = 12.3$ $\bigstar$
    • Policy 2: $0.6 \cdot f_2(1) + 0.4 \cdot f_2(2) + 2 = 0.6\cdot 11.1 + 0.4 \cdot 7.5 + 2= 11.66$

So $f_3(1) = 16.02$ (policy 1) and $f_3(2) = 12.3$ (policy 1).

Now, let's verify that the optimal policy for $n=3$ is the optimal policy as $n \to \infty$. You're basically solving some recurrence relations. Here's are the actual recurrence relations (derivation omitted):

$$\underbrace{g + c_1 = 6 + 0.7c_1 + 0.3c_2}_{\text{for state 1}} \quad \text{and} \quad \underbrace{g + c_2 = 3 + 0.5c_1 + 0.5c_2}_{\text{for state 2}} \tag{equations (a) and (b)}$$

where $g$ is the gain derived from choosing the optimal policy at $n=3$ (which happens to be policy 1 for both states) at each state as $n \to \infty$, and $c_i$ is $\displaystyle \lim_{n\to\infty} f_n(i)$ (so the expected reward, over an infinite horizon).

Then, to solve for $c_1$ and $c_2$, we subtract the two equations:

$$c_1 - c_2 = 6 + 0.7c_1 + 0.3c_2 - 3 - 0.5c_1 - 0.5c_2 = 3 + 0.2c_1 -0.2c_2$$

Thus we have $0.8c_1 -0.8c_2 = 3$ and so $c_1-c_2 = 30/8 = 3.75$. So the expected gain in choosing policy 1 (over policy 2) for each time step at each state is 3.75. Next, we add (a) and (b):

$$2g + c_1 + c_2 = 9 + 1.2c_1 + 0.8c_2$$

Rearranging, we get $2g = 9 + 0.2c_1-0.2c_2 = 9+0.2(c_1-c_2) = 9+0.2(3.75) = 9.75$. So $g=9.75/2 = 4.875$. I don't know what we do with this though.

Then, we use the PIR to find the optimal policy ???

$$PIR = \max \left [ q_i^k + \sum_{j=1}^n P_{ij}^k c_j \right ]$$

At state 1, you have $6 + 0.7(3.75) = 8.625$ (so sub in $c_1-c_2$ for $c_1$, maybe $c_2=0$?) for policy 1, and $3+0.2(3.75) = 3.75$ for policy 2. So obviously policy 1 is optimal here.

At state 2, you have $3 + 0.5(3.75) = 4.875$ for policy 1, and $2 + 0.6(3.75)= 4.25$ for policy 2. Again, policy 1 is the optimal policy.

This confirms that policy 1 at both states is optimal over an infinite horizon.

5Semi-Markov decision processes

6Queueing theory

See the notes.

7Calculus of variations

See the notes.

7.1Example

7.1.1Unconstrained

Maximise:

$$\int_0^a x^3(y')^2 \, dx \tag{$y(0) = 0$, $y(a) = b$}$$

Recall Euler's equation:

$$F_y = -\frac{d}{dx} F_{y'}$$

$F(x, y, y') = x^3(y')^2$. $F_y = 0$, $F_{y'}$ = $2x^3y'$. Then

$$\frac{d}{dx} F_{y'} = \frac{d}{dx} 2x^3y' = 6x^2y' + 2x^3y'' \tag{product rule}$$

which is equal to 0 because $F_y$ is. So we have the ODE

$$6x^2y' = -2x^3y''$$

We can rearrange this to get

$$xy'' + 3y' = 0$$

This is a linear ODE! Apparently the solution is $y = c_1 + c_2x^{-2}$ but I'm not sure how to derive that. How do I ODE?

  1. Note that the $x$-values here don't actually correspond to the values of $x$ the variable that we're trying to find a value for (in $W$). It should really be called something other than $x$, to avoid confusion, but Sancho uses $x$ and it's probably even more confusing to change it now.