Loading [MathJax]/jax/element/mml/optable/BasicLatin.js

HTSEFP: How to solve every problem CC-BY-NC

Maintainer: admin

The exam will take place on Friday, April 25, in Studio 2 of the gym. Bring a faculty standard calculator.

There will be 6 questions, 3 on theory, 3 on applications.

1Integer programming

1.1General solution

1.2Examples

Winter 2010 final, question 2

From the lecture on April 8:

Maximise

W=0.5x2+3y+2z

such that

3x+4y+3z10and2x+3y+3z10

where x,y,z are all integers 0.

Use D.P. formulation, subject to the first constraint.

Let's look at our options. x can be 0, 1, 2, or 3 (any more and you'd have 3x>10, which wouldn't satisfy the first constraint even if the others were 0). y can be 0, 1 or 2. z can be 0, 1, 2, or 3.

First, we calculate f1(x)=0.5x2 for each value of x10 such that 3xx. We have:

  • f1(0)=0 (x=0)
  • f1(1)=0 (x=0)
  • f1(2)=0 (x=0)
  • f1(3)=0.5 (x=1)
  • f1(4)=0.5 (x=1)
  • f1(5)=0.5 (x=1)
  • f1(6)=2 (x=2)
  • f1(7)=2 (x=2)
  • f1(8)=2 (x=2)
  • f1(9)=4.5 (x=3)
  • f1(10)=4.5 (x=3)

Next, we calculate values of f2(x)=max, marking with a \bigstar the maximum for each x-value1:

  • f_2(0) = 0
    • when y=0, 3y=0 and f_1(x-4y) = f_1(0) = 0 \bigstar
    • when y=1, f_1(x-4y) = f_1(-4) which isn't defined
    • when y=2, also undefined
  • f_2(1) = 0
    • when y=0, 3y=0 and f_1(x-4y) = f_1(1) = 0 \bigstar
    • when y=1, f_1(x-4y) = f_1(-3) which isn't defined
    • when y=2, also undefined
  • f_2(2) = 0
    • when y=0, 3y=0 and f_1(x-4y) = f_1(2) = 0 \bigstar
    • when y=1, f_1(x-4y) = f_1(-2) which isn't defined
    • when y=2, also undefined
  • f_2(3) = 0.5
    • when y=0, 3y=0 and f_1(x-4y) = f_1(3) = 0.5 thus 3y + f_1(x-4y) = 0.5 \bigstar
    • when y=1, f_1(x-4y) = f_1(-1) which isn't defined
    • when y=2, also undefined
  • f_2(4) = 3
    • when y=0, 3y=0 and f_1(x-4y) = f_1(4) = 0.5, thus 3y + f_1(x-4y) = 0.5
    • when y=1, 3y=3 and f_1(x-4y) = f_1(0) = 0, thus 3y + f_1(x-4y) = 3 \bigstar
    • when y=2, f_1(x-4y) = f_1(-4) which isn't defined
  • f_2(5) = 3
    • when y=0, 3y=0 and f_1(x-4y) = f_1(5) = 0.5, thus 3y + f_1(x-4y) = 0.5
    • when y=1, 3y=3 and f_1(x-4y) = f_1(1) = 0, thus 3y + f_1(x-4y) = 3 \bigstar
    • when y=2, f_1(x-4y) = f_1(-3) which isn't defined
  • f_2(6) = 3
    • when y=0, 3y=0 and f_1(x-4y) = f_1(6) = 2, thus 3y + f_1(x-4y) = 2
    • when y=1, 3y=3 and f_1(x-4y) = f_1(2) = 0, thus 3y + f_1(x-4y) = 3 \bigstar
    • when y=2, f_1(x-4y) = f_1(-23) which isn't defined
  • f_2(7) = 3.5
    • when y=0, 3y=0 and f_1(x-4y) = f_1(7) = 2, thus 3y + f_1(x-4y) = 2
    • when y=1, 3y=3 and f_1(x-4y) = f_1(3) = 0.5, thus 3y + f_1(x-4y) = 3.5 \bigstar
    • when y=2, f_1(x-4y) = f_1(-1) which isn't defined
  • f_2(8) = 6
    • when y=0, 3y=0 and f_1(x-4y) = f_1(8) = 2, thus 3y + f_1(x-4y) = 2
    • when y=1, 3y=3 and f_1(x-4y) = f_1(4) = 0.5, thus 3y + f_1(x-4y) = 3.5
    • when y=2, 3y = 6 and f_1(x-4y) = f_1(0) = 0 thus 3y + f_1(x-4y) = 6 \bigstar
  • f_2(9) = 6
    • when y=0, 3y=0 and f_1(x-4y) = f_1(9) = 4.5, thus 3y + f_1(x-4y) = 4.5
    • when y=1, 3y=3 and f_1(x-4y) = f_1(5) = 0.5, thus 3y + f_1(x-4y) = 3.5
    • when y=2, 3y = 6 and f_1(x-4y) = f_1(1) = 0 thus 3y + f_1(x-4y) = 6 \bigstar
  • f_2(10) = 6
    • when y=0, 3y=0 and f_1(x-4y) = f_1(10) = 4.5, thus 3y + f_1(x-4y) = 4.5
    • when y=1, 3y=3 and f_1(x-4y) = f_1(6) = 2, thus 3y + f_1(x-4y) = 5$
    • when y=2, 3y = 6 and f_1(x-4y) = f_1(2) = 0 thus 3y + f_1(x-4y) = 6 \bigstar

Finally, we calculate f_3(10). Note that we don't need to calculate f_3(x) for values of x other than 10 - this is the last one, because there are only 3 variables.

f_3(10) = \max_{z \in 0, 1, 2, 3} [ 2z + f_2(10 - 3z)] = \begin{cases} z=0: & 0 + f_2(10) = 0+6 = 6 \\ z=1: & 2 + f_2(7) = 2 + 3.5 = 5.5 \\ z=2: & 4 + f_2(4) = 4 + 3 = 7 \, \bigstar \\ z=3: & 6 + f_2(1) = 6 + 0 = 6 \end{cases}

Thus the maximum value for W is attained when z=2. To find what the values of x and y should be, look at how f_2(4) was obtained. We see that y=1 and that f_1(0) was evaluated, which tells us that x=0. Thus W = 0.5x^2 + 3y + 2z = 0 + 3 + 4 = 7, as we calculated.

So we have x=0, y=1, z=2.

For a sanity check, let's make sure that this satisfies both constraints. We have 3x+4y+3z = 0 + 4 + 6 = 10 \leq 10 \checkmark. Also, we have 2x + 3y + 3z = 0 + 3 + 6 = 9 \leq 10 \checkmark. So both constraints are satisfied. \blacksquare

2The traveling salesman problem

2.1General solution

The formula should be given. Just do sets of size k for k=0, 1, 2, 3, \ldots until you've covered all the possibilities. Once you get to the end, backtrack - the first arg of the last f-value is the last node, the first arg of the second to last f-value is the second to last node, etc.

2.2Examples

Winter 2010 final, question 3

3The transportation problem

3.1General solution

You're given a table. Hopefully the formula to use is given, but it should be

f_N(x_i) = \min_{0 \leq x_{1N} \leq x_i} \bigg [ g_{1N}(x_{1N}) + g_{2N} (r_N- x_{1N}) + f_{N-1}(x_i-x_{1N} ) \bigg ]

where g_{ij}(x_{ij}) is the total cost of sending x_{ij} units from depot i to demand point j (so the setup cost c_{ij} if necessary, plus the coefficient a_{ij} times x_{ij}), and r_N is the demand needed at demand point N.

You iterate through the demand points. For each demand point, iterate through all the possible ways of allocating the r_N resources provided by the depots.

3.2Example

Assignment 5, question 1.

4Markov decision processes

4.1Examples

4.1.1With discounting

Given in class (Thursday, February 13):

State i Policy k p_{i1}^k p_{i2}^k q_i^k (expected reward)
1 1 0.6 0.4 4.6
1 2 0.7 0.3 3.2
2 1 0.4 0.6 2.4
2 2 0.5 0.5 1.5

\alpha = 0.9

At time step 1, we see that policy 1 is best for both states.

Fuck it, I'll do it later if there's time

4.1.2Without discounting

Given in class (Tuesday April 8):

State i Policy k p_{i1}^k p_{i2}^k q_i^k (expected reward)
1 1 0.7 0.3 6
1 2 0.2 0.8 4
2 1 0.5 0.5 3
2 2 0.6 0.4 2

Evaluate f_n(i) and the optimal policy n=1,2,3,\ldots, where

f_n(i) = \max_k \left [ q_i^k + \sum_{j=1}^N p_{ij}^k f_{n-1}(j) \right ] \tag{with $f_0=0$}

At time step 1:

  • If we're at state 1:
    • policy 1 reward: 6 \bigstar
    • policy 2 reward: 4
  • If we're at state 2:
    • policy 1 reward: 3 \bigstar
    • policy 2 reward: 2

Thus f_1(1) = 6 (policy 1) and f_1(2) = 3 (policy 1).

Now, time step 2:

  • If we're at state 1:
    • Policy 1: 0.7 \cdot f_1(1) + 0.3 \cdot f_1(2) + 6 = 0.7\cdot 6 + 0.3 \cdot 3 + 6 = 11.1 \bigstar
    • Policy 2: 0.2 \cdot f_1(1) + 0.8 \cdot f_1(2) + 4 = 0.2 \cdot 6 + 0.8 \cdot 3 + 4 = 7.6
  • If we're at state 2:
    • Policy 1: 0.5 \cdot f_1(1) + 0.5 \cdot f_1(2) + 3 = 0.5 \cdot 6 + 0.5 \cdot 3 + 3 = 7.5 \bigstar
    • Policy 2: 0.6 \cdot f_1(1) + 0.4 \cdot f_1(2) + 2 = 0.6\cdot 6 + 0.4 \cdot 3 + 2= 6.8

So f_2(1) = 11.1 (policy 1) and f_2(2) = 7.5 (policy 1).

Next, time step 3:

  • State 1:
    • Policy 1: 0.7 \cdot f_2(1) + 0.3 \cdot f_2(2) + 6 = 0.7\cdot 11.1 + 0.3 \cdot 7.5 + 6 = 16.02 \bigstar
    • Policy 2: 0.2 \cdot f_2(1) + 0.8 \cdot f_2(2) + 4 = 0.2 \cdot 11.1 + 0.8 \cdot 7.5 + 4 = 12.22
  • If we're at state 2:
    • Policy 1: 0.5 \cdot f_2(1) + 0.5 \cdot f_2(2) + 3 = 0.5 \cdot 11.1 + 0.5 \cdot 7.5 + 3 = 12.3 \bigstar
    • Policy 2: 0.6 \cdot f_2(1) + 0.4 \cdot f_2(2) + 2 = 0.6\cdot 11.1 + 0.4 \cdot 7.5 + 2= 11.66

So f_3(1) = 16.02 (policy 1) and f_3(2) = 12.3 (policy 1).

Now, let's verify that the optimal policy for n=3 is the optimal policy as n \to \infty. You're basically solving some recurrence relations. Here's are the actual recurrence relations (derivation omitted):

\underbrace{g + c_1 = 6 + 0.7c_1 + 0.3c_2}_{\text{for state 1}} \quad \text{and} \quad \underbrace{g + c_2 = 3 + 0.5c_1 + 0.5c_2}_{\text{for state 2}} \tag{equations (a) and (b)}

where g is the gain derived from choosing the optimal policy at n=3 (which happens to be policy 1 for both states) at each state as n \to \infty, and c_i is \displaystyle \lim_{n\to\infty} f_n(i) (so the expected reward, over an infinite horizon).

Then, to solve for c_1 and c_2, we subtract the two equations:

c_1 - c_2 = 6 + 0.7c_1 + 0.3c_2 - 3 - 0.5c_1 - 0.5c_2 = 3 + 0.2c_1 -0.2c_2

Thus we have 0.8c_1 -0.8c_2 = 3 and so c_1-c_2 = 30/8 = 3.75. So the expected gain in choosing policy 1 (over policy 2) for each time step at each state is 3.75. Next, we add (a) and (b):

2g + c_1 + c_2 = 9 + 1.2c_1 + 0.8c_2

Rearranging, we get 2g = 9 + 0.2c_1-0.2c_2 = 9+0.2(c_1-c_2) = 9+0.2(3.75) = 9.75. So g=9.75/2 = 4.875. I don't know what we do with this though.

Then, we use the PIR to find the optimal policy ???

PIR = \max \left [ q_i^k + \sum_{j=1}^n P_{ij}^k c_j \right ]

At state 1, you have 6 + 0.7(3.75) = 8.625 (so sub in c_1-c_2 for c_1, maybe c_2=0?) for policy 1, and 3+0.2(3.75) = 3.75 for policy 2. So obviously policy 1 is optimal here.

At state 2, you have 3 + 0.5(3.75) = 4.875 for policy 1, and 2 + 0.6(3.75)= 4.25 for policy 2. Again, policy 1 is the optimal policy.

This confirms that policy 1 at both states is optimal over an infinite horizon.

5Semi-Markov decision processes

6Queueing theory

See the notes.

7Calculus of variations

See the notes.

7.1Example

7.1.1Unconstrained

Maximise:

\int_0^a x^3(y')^2 \, dx \tag{$y(0) = 0$, $y(a) = b$}

Recall Euler's equation:

F_y = -\frac{d}{dx} F_{y'}

F(x, y, y') = x^3(y')^2. F_y = 0, F_{y'} = 2x^3y'. Then

\frac{d}{dx} F_{y'} = \frac{d}{dx} 2x^3y' = 6x^2y' + 2x^3y'' \tag{product rule}

which is equal to 0 because F_y is. So we have the ODE

6x^2y' = -2x^3y''

We can rearrange this to get

xy'' + 3y' = 0

This is a linear ODE! Apparently the solution is y = c_1 + c_2x^{-2} but I'm not sure how to derive that. How do I ODE?

  1. Note that the x-values here don't actually correspond to the values of x the variable that we're trying to find a value for (in W). It should really be called something other than x, to avoid confusion, but Sancho uses x and it's probably even more confusing to change it now.