HTSEFP CC-BY-NC

Maintainer: admin

Tips for problem-solving and material we've encountered in problems. Sources: midterms, past finals, assignments, sample questions.

Under construction

1 Search and game-playing

1.1 Heuristics

Admissible = optimistic about the distance left (so $h \leq e$ where $e$ is the actual distance). The closer it is to the actual distance, the more useful the heuristic is, for all kinds of search algorithms.

1.2 Genetic algorithms

For optimisation. Can maintain multiple candidate solutions; not guaranteed to converge to the correct answer. Describe the individuals, fitness function, operations, population, etc.

Crossover: easy, just cross at the specified point
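A minimal sketch of single-point crossover (the function name and list-based genomes are my own, not from the course):

```python
import random

def single_point_crossover(parent_a, parent_b, point=None):
    """Cut both equal-length genomes at one point and swap the tails."""
    assert len(parent_a) == len(parent_b)
    if point is None:
        point = random.randint(1, len(parent_a) - 1)  # never cut at the ends
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```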

1.3 Alpha-beta

Searching to depth $k+1$ instead of $k$ might give better play, but it depends on the heuristic function.

Needs a good move ordering for efficient pruning.

Only leads to optimal play against an opponent using the same heuristic function, and only if that function is good enough (i.e., not trivial). If the heuristic is bad, all bets are off.
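A sketch of the pruning itself, over an abstract game tree (the `children`/`evaluate` callback interface is my own framing):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning.

    `children(node)` lists successor states; `evaluate(node)` gives the
    heuristic value at the depth cutoff or at a terminal node.
    """
    succ = children(node)
    if depth == 0 or not succ:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: min player will never allow this branch
        return value
    else:
        value = float("inf")
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff
        return value
```

Good move ordering matters because cutoffs only fire when the best moves are examined first.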

1.4 Monte Carlo tree search

Works well with a high branching factor. Not necessarily optimal (depends on the policy used to generate trajectories, and on how many are generated).

1.5 Uniform cost search

Adding a positive constant to every edge can change which path is cheapest, because paths with more edges absorb more of the added cost.
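A toy demonstration (the UCS helper and the graphs are my own): the cheapest A-to-C route uses two edges, until +2 is added to every edge and the direct edge wins.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Cheapest path by cost-ordered expansion; graph[u] = [(v, cost), ...]."""
    frontier = [(0, start, [start])]
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for nxt, step in graph.get(node, []):
            if nxt not in explored:
                heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return None

g = {'A': [('B', 1), ('C', 3)], 'B': [('C', 1)]}         # best: A-B-C, cost 2
g_plus_2 = {'A': [('B', 3), ('C', 5)], 'B': [('C', 3)]}  # best: A-C, cost 5
```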

1.6 Beam search

Not guaranteed to converge on an optimal solution even for an admissible heuristic, unless the beam is wide enough to include all successor nodes every time. Space and time are $O(k^d)$. Lower memory requirement than A*.

1.7 Simulated annealing

For optimisation. Only keeps one current solution; guaranteed to converge with the right choice of parameters. If you lower the temperature too quickly, you might get stuck in a local optimum; if you lower it too slowly, it could take longer to reach the optimum than it will take to get our midterms back.
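A sketch of the loop (parameter values and the geometric cooling schedule are one arbitrary choice; the cooling rate is exactly where "too quickly" vs "too slowly" shows up):

```python
import math
import random

def simulated_annealing(cost, neighbour, start, temp=10.0, cooling=0.95, steps=500):
    """Minimise `cost` by randomly perturbing one current solution.

    Worse candidates are accepted with probability exp(-delta / temp);
    the temperature decays geometrically, so the search gradually turns
    into pure hill-climbing.
    """
    current = start
    for _ in range(steps):
        candidate = neighbour(current)
        delta = cost(candidate) - cost(current)
        if delta <= 0 or (temp > 0 and random.random() < math.exp(-delta / temp)):
            current = candidate  # accept: better, or worse-but-lucky
        temp *= cooling
    return current
```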

1.8 Minimax trees

Self-explanatory. To ensure the maximum amount of pruning, sort children by value in increasing order at min nodes and decreasing order at max nodes.

1.9 Constraint satisfaction

Useful when you just want to find a valid solution, not necessarily the optimal solution. Describe with variables, domains, and constraints. Should indicate how the problem can be solved, too (e.g., depth-first search with backtracking).
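A sketch of depth-first search with backtracking (the `consistent` callback interface and map-colouring framing are my own):

```python
def backtrack(assignment, variables, domains, consistent):
    """Assign variables depth-first, undoing any value that breaks a constraint.

    `consistent(assignment)` checks all constraints among the
    currently-assigned variables.
    """
    if len(assignment) == len(variables):
        return dict(assignment)  # every variable has a valid value
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):
            result = backtrack(assignment, variables, domains, consistent)
            if result is not None:
                return result
        del assignment[var]  # backtrack
    return None
```

Usage would look like colouring a 3-node path graph so that neighbours differ.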

1.10 Iterative deepening

Depth-limited search, but you increase the depth limit and start again if there's still time. Leads to duplicated work but much better memory requirements (linear in the depth). Good for large state spaces.

Can also be done for game-tree search. The con is that a lot of work gets duplicated.
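A sketch of the plain-search version (the dict-based tree and helper names are hypothetical):

```python
def iterative_deepening(start, goal, children, max_depth=50):
    """Repeated depth-limited DFS with an increasing depth bound.

    Memory stays linear in the depth; the shallow levels get re-expanded
    at every new limit, which is the duplicated work.
    """
    def dls(node, depth, path):
        if node == goal:
            return path
        if depth == 0:
            return None
        for child in children(node):
            found = dls(child, depth - 1, path + [child])
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):
        result = dls(start, limit, [start])
        if result is not None:
            return result
    return None
```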

2 Machine learning

2.1 Linear perceptron

Can learn AND and OR, but not XOR (XOR is not linearly separable).

Take a vector of weights, find its dot product with some input vector, return yes if the output is above a certain threshold and no otherwise. Can be used to take a majority vote.

To represent XOR, you need multiple perceptrons:

[Figure: a network of perceptrons computing XOR, from Wikimedia Commons]

To represent some boolean formula, use weight +1 for positive literals, -1 for negative literals, and an appropriate threshold. For the threshold, just try all values.
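A quick sketch of the threshold unit described above, plus one valid two-layer wiring for XOR (the specific weights and thresholds are my own choice, not the only one):

```python
def perceptron(weights, threshold, inputs):
    """Fire iff the weighted sum of the inputs reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# AND: both inputs are needed to reach the threshold.
AND = lambda a, b: perceptron([1, 1], 2, [a, b])
# OR: either input alone suffices.
OR = lambda a, b: perceptron([1, 1], 1, [a, b])
# Majority vote over three inputs.
MAJ = lambda a, b, c: perceptron([1, 1, 1], 2, [a, b, c])
# XOR needs two layers: (a OR b) AND NOT (a AND b),
# using weight -1 on the negated literal.
XOR = lambda a, b: perceptron([1, -1], 1, [OR(a, b), AND(a, b)])
```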

2.2 Sigmoid neuron

Also a binary classifier, and on its own still limited to linearly separable functions, but a network with a single hidden layer can represent any boolean function; the number of hidden units required can be exponential in the number of inputs.

2.3 Decision trees

Not good for taking a majority vote (representing the majority function requires a very large tree).

2.4 Gradient descent for training neural networks

The weight update rule: $w \leftarrow w - \alpha \nabla J$, where $J$ is a least-squares error function and $\nabla J$ is its gradient with respect to the weights $w$.

Can we replace this with simulated annealing? Yes, if we let the algorithm move uphill with some small probability, but it would be slower, less optimal, etc.
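As a concrete (made-up) instance of $w \leftarrow w - \alpha \nabla J$: fitting a line to four points by gradient descent on a least-squares $J$. The function name and data are my own:

```python
def gradient_descent(xs, ys, alpha=0.01, steps=2000):
    """Fit y = w*x + b by gradient descent on J = sum((w*x + b - y)^2) / 2."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of J with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= alpha * grad_w
        b -= alpha * grad_b
    return w, b
```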

2.5 Function approximation

Training error
Error rate on the training set.
Testing error
Error rate on the test set.
Overfitting
When the error rate on the training set is low but the testing error is high.
k-fold cross-validation
Train on $(k-1)/k$ of the data, test on the remaining $1/k$, repeat $k$ times (each fold serving as the test set once).
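A sketch of building the folds (helper name is mine; in practice a library routine would do this):

```python
def k_fold_splits(data, k):
    """Yield k (train, test) pairs; each fold is the test set exactly once."""
    folds = [data[i::k] for i in range(k)]  # round-robin split into k folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```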

3 Probability

3.1 Basic probability calculations

If you don't know this already then you'll probably fail the exam

3.2 Maximum likelihood estimation

Find the parameter value that makes the observed data most likely. Take the log of the likelihood function (the product of the density function over the $n$ observations), find its derivative, set it to 0 (to find an optimum). This can also be done numerically with gradient descent.
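Worked example with a biased coin (my own numbers): for $h$ heads and $t$ tails, differentiating $h \log p + t \log(1-p)$ and setting it to zero gives $p = h/(h+t)$, i.e., the sample mean.

```python
import math

def log_likelihood(p, flips):
    """Log of the likelihood: sum of log-densities over the observations."""
    return sum(math.log(p if x == 1 else 1 - p) for x in flips)

def bernoulli_mle(flips):
    """Closed form from d/dp [h log p + t log(1-p)] = 0: p = h / (h + t)."""
    return sum(flips) / len(flips)
```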

3.3 Drawing Bayes nets

Draw a circle for each random variable, arrows to indicate direct influence, then write the probabilities (conditioned on any influencing variables) next to each node.

3.4 Bayes' theorem

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
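A quick numeric instance (the scenario and numbers are invented): a diagnostic test with 99% sensitivity, a 5% false-positive rate, and a 1% base rate, where $P(B)$ comes from the law of total probability.

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# P(positive) = P(pos | sick) P(sick) + P(pos | healthy) P(healthy)
p_positive = 0.99 * 0.01 + 0.05 * 0.99
p_sick_given_positive = bayes(0.99, 0.01, p_positive)  # about 1/6
```

Despite the accurate test, a positive result means only about a 1-in-6 chance of being sick, because the base rate is so low.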

3.5 Utility theory

Value of information
Expected value of the best action with the information minus the expected value of the best action without the information.

3.6 Bandit problems

Difference between UCB and epsilon-greedy: the former explores to reduce uncertainty (its exploration bonus shrinks as an arm gets sampled), while the latter is just greedy with some small probability of picking a random arm; the differences even out over time.
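The two selection rules, sketched (function names are mine; `ucb1` uses the standard $\sqrt{2 \ln t / n_i}$ bonus):

```python
import math
import random

def epsilon_greedy(estimates, counts, epsilon=0.1):
    """Pick the best-looking arm, except explore at random with prob. epsilon."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda i: estimates[i])

def ucb1(estimates, counts):
    """Pick the arm with the best optimistic bound.

    The exploration bonus shrinks as an arm gets pulled more, so
    uncertainty itself drives exploration.
    """
    total = sum(counts)
    def bound(i):
        if counts[i] == 0:
            return float("inf")  # untried arms go first
        return estimates[i] + math.sqrt(2 * math.log(total) / counts[i])
    return max(range(len(estimates)), key=bound)
```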

4 Logic

4.1 Converting to CNF

Ands of ors. The harder one. It looks like $\to$ can be left as-is, though, judging from the solutions to 4b for the sample questions.

Skolemisation lets you replace an existentially quantified variable with a fresh constant (or a function of the enclosing universally quantified variables), say $X_0$.

4.2 Translating into first-order logic

Use constants for named things (like "Bob") and variables for quantified ones.

Note that you can't express "most" in FOL. You can represent a finite, constant number, though, like "there are two different boys in the world" (just state that they're not equal).

4.3 Proving statements from first-order logic

Use skolemisation. Resolution: from $a \to b$ and $b \to c$, conclude $a \to c$ (transitivity); in clause form, resolving $\lnot a \lor b$ with $\lnot b \lor c$ gives $\lnot a \lor c$. Unification: substitute terms for variables to make two literals match (e.g., replacing a variable by a Skolem constant).

5 MDPs

5.1 Describing an MDP

States, actions, transition probabilities, rewards

5.2 Optimal policies and values

Policy
A mapping from states to actions; the optimal policy is the one with the highest expected reward.
Value at a state
The expected reward accumulated starting from that state if we behave according to the optimal policy.
Optimal value function
$V^*(n)$, a function indicating the optimal value if we start at state $n$ (and continue until the end of time). It looks like we can leave it in recurrence relation form, judging from the solutions to 6c for the sample questions.
Value iteration, $V_k(n)$
Start from $V_0 = 0$ and repeatedly apply the Bellman backup to every state; $V_k$ converges to $V^*$.
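Value iteration can be sketched as follows (the tabular MDP interface, with `transition[s][a]` as (probability, next state) pairs, is my own framing):

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, iters=200):
    """Repeated Bellman backups from V_0 = 0 until (approximate) convergence.

    `transition[s][a]` is a list of (probability, next_state) pairs and
    `reward[s][a]` is the immediate reward for taking a in s.
    """
    values = {s: 0.0 for s in states}
    for _ in range(iters):
        values = {
            s: max(reward[s][a]
                   + gamma * sum(p * values[s2] for p, s2 in transition[s][a])
                   for a in actions)
            for s in states
        }
    return values
```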

5.3 Bellman equations

The optimal value of a state is the best action's immediate reward plus the discounted expected value of the state it leads to:

$$V^*(s) = \max_a \left[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \right]$$

5.4 Temporal difference learning

After each observed transition $(s, r, s')$, nudge the value estimate toward the bootstrapped target: $V(s) \leftarrow V(s) + \alpha\,[r + \gamma V(s') - V(s)]$.
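One TD(0) backup, sketched (the dict-of-values representation and function name are my own):

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move V(state) a fraction alpha toward the target r + gamma * V(s')."""
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values
```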

6 Problem formulation

Basically an essay question. Given a problem, how would you solve it using AI techniques, etc. MDPs are a good bet. Describe rewards, transition models, states, etc. For optimisation problems,