HTSEFP: Final CC-BY-NC

Maintainer: admin

1 Sample variance for two independent samples

We have two random samples. The first consists of $n_1$ values, taken from a population with a variance of $\sigma_1^2$; the second consists of $n_2$ values, taken from a population with a variance of $\sigma_2^2 = \alpha\sigma_1^2$. Let $S_1^2$ be the sample variance for the first sample, and let $S_2^2$ be the sample variance for the second sample.

(a) Find a number $b$ such that

$$P\left (\frac{S_1^2}{S_2^2} \leq b \right ) = 0.95$$

(b) Find a number $a$ such that

$$P\left ( a \leq \frac{S_1^2}{S_2^2} \right ) = 0.95$$

(c) If $a$ and $b$ are as in parts (a) and (b), find

$$P\left ( a \leq \frac{S_1^2}{S_2^2} \leq b \right )$$

1.1 General solution

(a) $$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{S_1^2/\sigma_1^2}{S_2^2/(\alpha\sigma_1^2)} = \alpha\frac{S_1^2}{S_2^2}$$

This has an $F$ distribution with $n_1 - 1$ numerator degrees of freedom and $n_2-1$ denominator degrees of freedom. To find a value $b$ such that $\displaystyle P \left ( \frac{S_1^2}{S_2^2} \leq b \right ) = 0.95$, we multiply both sides of the inequality by $\alpha$, resulting in

$$P\left ( \alpha\frac{S_1^2}{S_2^2} \leq \alpha b \right ) = 0.95$$

We would then look at the table Critical Values of the F distribution, $\alpha = 0.05$ near the end of the exam booklet (note that the table's "$\alpha = 0.05$" is the upper-tail area, not the $\alpha$ in $\sigma_2^2 = \alpha\sigma_1^2$). Let the entry in column $n_1-1$, row $n_2-1$ be $b_0$ (meaning that the value of $\alpha b$ cutting off an upper-tail area of 0.05 is approximately $b_0$, so $\alpha b = b_0$). Consequently, $b = b_0 / \alpha$ and that is the answer.

(b) This is similar, except we have to first use this relationship:

$$P\left ( \frac{U_1}{U_2} \leq k \right ) = P \left ( \frac{U_2}{U_1} \geq \frac{1}{k} \right ) \text{ or, equivalently, } P \left ( k \leq \frac{U_1}{U_2} \right ) = P \left ( \frac{1}{k} \geq \frac{U_2}{U_1} \right )$$

Using this, we can rewrite the equation we're given as

$$P\left ( a \leq \frac{S_1^2}{S_2^2} \right ) = P \left ( \frac{1}{a} \geq \frac{S_2^2}{S_1^2} \right )$$

Dividing both sides of the inequality by $\alpha$, we get

$$P \left ( \frac{1}{\alpha a} \geq \frac{S_2^2}{\alpha S_1^2} \right )$$

Now, $\displaystyle \frac{S_2^2}{\alpha S_1^2}$ has an $F$ distribution with $n_2-1$ numerator degrees of freedom and $n_1-1$ denominator degrees of freedom. Using the same table that we used above, only switching the row and the column, we find that the value of $\frac{1}{\alpha a}$ cutting off an upper-tail area of 0.05 is approximately $a_0$. Consequently, $a = 1 / (\alpha a_0)$ and that is the answer.

(c) $1 - 0.05 - 0.05 = 0.90$
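The whole derivation can be sanity-checked numerically. A minimal stdlib-Python sketch, using illustrative values $n_1 = 5$, $n_2 = 7$, $\alpha = 2$ and the tabulated critical value $F_{0.05}(4, 6) \approx 4.53$ (all of these numbers are assumptions made up for the example, not part of any particular question):

```python
import random
import statistics

random.seed(1)

n1, n2 = 5, 7          # sample sizes (illustrative)
sigma1_sq = 1.0        # population variance of the first population
alpha = 2.0            # sigma2^2 = alpha * sigma1^2
sigma2_sq = alpha * sigma1_sq

# Tabulated upper-5% critical value of F with (n1-1, n2-1) = (4, 6) df.
b0 = 4.53
b = b0 / alpha         # part (a): P(S1^2/S2^2 <= b) should be ~0.95

trials = 20000
hits = 0
for _ in range(trials):
    s1 = [random.gauss(0.0, sigma1_sq ** 0.5) for _ in range(n1)]
    s2 = [random.gauss(0.0, sigma2_sq ** 0.5) for _ in range(n2)]
    ratio = statistics.variance(s1) / statistics.variance(s2)
    if ratio <= b:
        hits += 1

p_hat = hits / trials  # empirical estimate of P(S1^2/S2^2 <= b)
```

The empirical probability lands near 0.95, confirming that dividing the table value by $\alpha$ is the right move.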

1.2 Examples

  • Assignment 1, question 1

2 MLEs and continuous functions

Given a random sample from a distribution $Y$ with probability density function $f(y|\theta)$ and parameter $\theta$:

(a) Find the MLE of $\theta$.
(b) Find the expected value and variance of the MLE.
(c) Supposing that $\theta$ is actually $\theta_0$, give an approximate bound for the error of estimation.
(d) Find the MLE for the variance of the distribution.

2.1 General solution

(a) First, find the likelihood function, given by the joint density of the random sample:

$$L = \prod_{i=1}^n f(y_i|\theta)$$

Then, take the logarithm of that, $\log L$, and differentiate it with respect to $\theta$. Set that to 0, and solve the equation for $\theta$. This gives you the MLE for $\theta$. If you're also given values, you can plug those into the equation for the MLE to get a numerical MLE.
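As a concrete (hypothetical) instance of the recipe above: for an exponential density $f(y|\theta) = \theta^{-1}e^{-y/\theta}$, setting $\frac{d}{d\theta}\log L = 0$ gives $\hat\theta = \overline y$. A quick numerical check that the sample mean really does maximise the log-likelihood (the data are made up):

```python
import math

# Hypothetical observed sample, for illustration only.
ys = [1.2, 0.7, 3.1, 0.4, 1.9]
n = len(ys)

def log_likelihood(theta):
    # log L = -n*log(theta) - sum(y_i)/theta for the exponential density
    return -n * math.log(theta) - sum(ys) / theta

theta_mle = sum(ys) / n  # analytic MLE: the sample mean

# A coarse grid search should land on (essentially) the same value.
grid = [0.1 + 0.01 * k for k in range(500)]
theta_grid = max(grid, key=log_likelihood)
```

The grid maximiser agrees with the analytic answer to within the grid spacing, which is all a sanity check needs.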

(b) Expected value: $E(\hat \theta)$; variance: $Var(\hat \theta)$. Use the relevant algebraic manipulation rules where possible. With any luck, the distribution given will be one whose mean and variance are easy to figure out, the equations will involve $E(Y_i)$ or $Var(Y_i)$, and everything will work out and flowers pick themselves.

(c) Use the formula $2\sqrt{Var(\hat \theta)}$, with the variance of the MLE obtained in part (b) and $\theta_0$ for the value of $\theta$.

(d) Take the value for the MLE obtained in part (a) and substitute it for $\theta$ in the formula for the variance of $Y$ (possibly obtained during part (b)).

2.2 Examples

3 Method-of-moments estimators

Let $Y_1,\ldots, Y_n$ be a random sample (independent, identically distributed random variables) from some distribution. Find the method-of-moments estimator(s) for one or more parameters.

3.1 General solution

The first two moments are as follows:

$$\mu_1' = E(Y) \quad \mu_2' = E(Y^2) = Var(Y) + (E(Y))^2$$

The first two sample moments are as follows:

$$m_1 = \frac{1}{n} \sum_{i=1}^n Y_i = \overline Y \quad m_2 = \frac{1}{n} \sum_{i=1}^n Y_i^2$$

Then, we just equate them and solve for the relevant parameters. The hard part for this type of question is likely finding the moments themselves, but this shouldn't be hard if you memorise all the moments for all the distributions[1] or know when it's necessary to integrate.

As an example, for the normal distribution, the first two moments are

$$\mu_1' = \mu \quad \mu_2' = \sigma^2 + \mu^2$$

and the first two sample moments are what's given above (can't simplify them). Then, we just set the appropriate moments equal to each other, resulting in

$$\hat \mu = \overline Y$$

as an estimator for $\mu$ and

$$\hat \sigma^2 = \frac{1}{n} \sum_{i=1}^n Y_i^2 - \hat \mu^2 = \frac{1}{n} \sum_{i=1}^n Y_i^2 - \overline Y^2$$

as an estimator for $\sigma^2$.
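The normal-distribution example above, done numerically (stdlib only; the data are made up for illustration):

```python
import statistics

ys = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(ys)

m1 = sum(ys) / n                  # first sample moment
m2 = sum(y * y for y in ys) / n   # second sample moment

mu_hat = m1                       # method-of-moments estimate of mu
sigma_sq_hat = m2 - m1 ** 2       # method-of-moments estimate of sigma^2

# Note: this equals the divide-by-n ("population") variance,
# i.e. statistics.pvariance, not the unbiased divide-by-(n-1) version.
check = statistics.pvariance(ys)
```

The note in the comment is worth remembering: the method-of-moments variance estimator is the biased, divide-by-$n$ one.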

3.2 Examples

  • Assignment 1, questions 4 and 5

4 Bayesian estimation

Oh god

4.1 General solution

Not gonna
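In lieu of a general solution, one conjugate example covers a lot of ground: with a Beta(a, b) prior on a success probability and $x$ successes in $n$ trials, the posterior is Beta(a + x, b + n − x), and the Bayes estimator (the posterior mean) is $(a + x)/(a + b + n)$. A sketch with made-up numbers:

```python
# Conjugate beta-binomial update (standard result; numbers are illustrative).
a, b = 2.0, 2.0      # Beta(2, 2) prior on the success probability p
n, x = 10, 7         # observed: 7 successes in 10 trials

post_a = a + x       # posterior is Beta(a + x, b + n - x) = Beta(9, 5)
post_b = b + n - x
post_mean = post_a / (post_a + post_b)   # Bayes estimator of p
```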

4.2 Examples

  • Assignment 1, question 6

5 Unbiasedness in estimators

Given an estimator defined in terms of other estimators whose expected values and variances are known, determine whether or not it is biased. Justify your result.

Alternatively, given an estimator defined in terms of $Y_i$ (random variable from a given distribution), same thing.

5.1 General solution

An estimator $\hat \theta$ is unbiased iff $E(\hat \theta) = \theta$. For the case where the estimator is defined in terms of other estimators, this shouldn't be too difficult, as you'll be given the expected values of the other estimators. It shouldn't be too difficult in the other case either, come to think of it, although this might require finding the expected value of $Y_i$ by integrating $yf(y)$ with respect to $y$ over the support of $Y$ first.

If any of the estimators involves taking the minimum or maximum (or $k$th order statistic) of several random variables, then it is likely biased; for instance, the sample maximum systematically underestimates the upper endpoint of a uniform distribution. (Even the sample median can be a biased estimator of the mean for skewed distributions.)
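One standard fact worth having at your fingertips here: dividing by $n-1$ makes the sample variance unbiased, while dividing by $n$ gives a biased (too-small) estimator, with $E = \sigma^2(n-1)/n$. A quick simulation sketch (stdlib only; the parameters are made up):

```python
import random

random.seed(7)

sigma_sq = 4.0   # true population variance (illustrative)
n = 5            # small sample size so the bias is visible
reps = 40000

biased_total = 0.0
unbiased_total = 0.0
for _ in range(reps):
    ys = [random.gauss(0.0, sigma_sq ** 0.5) for _ in range(n)]
    ybar = sum(ys) / n
    ss = sum((y - ybar) ** 2 for y in ys)
    biased_total += ss / n          # E = sigma^2 * (n-1)/n = 3.2 here
    unbiased_total += ss / (n - 1)  # E = sigma^2 = 4.0

biased_mean = biased_total / reps
unbiased_mean = unbiased_total / reps
```

Averaged over many samples, the divide-by-$(n-1)$ version centres on 4.0 and the divide-by-$n$ version on 3.2.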

5.2 Examples

6 Minimising variance in estimators

Either:

  • Choose a constant $\alpha$ in the formula for an estimator such that the variance of the estimator is minimised;
  • or, decide which estimator among several has the lowest variance.

6.1 General solution

In the first case, find a function involving $\alpha$ for the variance of the estimator, using the algebraic manipulation rules for variance etc. Then, find the minimum of this function by taking its derivative with respect to $\alpha$ and setting it equal to $0$. Solve for $\alpha$. To confirm that this point is a minimum, check that the second derivative is positive.
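A typical first-case setup (hypothetical numbers): $\hat\theta = \alpha \overline Y_1 + (1-\alpha)\overline Y_2$ with independent samples, so $Var(\hat\theta) = \alpha^2 \sigma_1^2/n_1 + (1-\alpha)^2 \sigma_2^2/n_2$. Differentiating and solving gives $\alpha^* = \dfrac{\sigma_2^2/n_2}{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. A brute-force check that this really is the minimum:

```python
# Variance of alpha*Ybar1 + (1-alpha)*Ybar2 for independent samples.
s1_over_n1 = 2.0 / 10    # sigma1^2 / n1 (illustrative values)
s2_over_n2 = 4.5 / 15    # sigma2^2 / n2 (illustrative values)

def var_est(alpha):
    return alpha ** 2 * s1_over_n1 + (1 - alpha) ** 2 * s2_over_n2

# Closed-form answer from setting the derivative to zero.
alpha_star = s2_over_n2 / (s1_over_n1 + s2_over_n2)

# Brute-force check on a fine grid over [0, 1].
grid = [k / 1000 for k in range(1001)]
alpha_grid = min(grid, key=var_est)
```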

In the second case, you just need to find the variance of each estimator. Recall that $Var(Y_i) = E(Y_i^2) - (E(Y_i))^2$ and that $\displaystyle E(Y_i) = \int_{-\infty}^{\infty} y f(y) \,dy$ where $f(y)$ is the probability density function.

6.2 Examples

  • Assignment 1, questions 7 (b) (first type) and 8 (b) (second type)

7 Efficiency of two estimators

Find the efficiency of $\hat \theta_1$ to $\hat \theta_2$.

7.1 General solution

$$\text{efficiency}(\hat \theta_1, \hat \theta_2) = \frac{Var(\hat \theta_2)}{Var(\hat \theta_1)}$$

(This indicates that if $\hat \theta_1$ is a better estimator - that is, one with a lower variance - than $\hat \theta_2$, then its relative efficiency will be greater than 1.)

Anyway, now we just have to find the variance of both estimators. This is probably the hard part. If order statistics are involved somehow, then you'll probably need the density of the relevant order statistic and possibly a change of variables. I should explain this in the section on order statistics.
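A classic illustrative example (hypothetical, not from an assignment): for normal data the sample median has variance roughly $\pi\sigma^2/(2n)$, so the efficiency of the median relative to the mean is about $2/\pi \approx 0.64$ in large samples. A simulation sketch:

```python
import random
import statistics

random.seed(3)

n = 25
reps = 20000
means = []
medians = []
for _ in range(reps):
    ys = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(sum(ys) / n)
    medians.append(statistics.median(ys))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# efficiency(median, mean) = Var(mean) / Var(median),
# roughly 2/pi ~ 0.64 for normal data in large samples
eff = var_mean / var_median
```

Because the ratio comes out below 1, the mean is the more efficient estimator here, matching the convention in the formula above.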

7.2 Examples

  • Assignment 2, question 1

8 Consistent estimators

Show that $\hat \theta$ is a consistent estimator for $\theta$.

8.1 General solution

Show both that

$$\lim_{n \to \infty} E(\hat \theta) = \theta$$

and that

$$\lim_{n \to \infty} Var(\hat \theta) = 0$$

(An asymptotically unbiased estimator whose variance goes to 0 converges in probability to $\theta$, which is what consistency means. Variance going to 0 alone is not enough: the estimator could be converging to the wrong value.)

8.2 Examples

9 Sufficient statistics

Show that some statistic $\hat \theta$ is sufficient for $\theta$. Or, find a sufficient statistic for $\theta$.

9.1 General solution

Involves likelihood functions and theorem 9.4, the factorisation criterion: a statistic $U$ is sufficient for $\theta$ iff the likelihood factors as $L(y_1, \ldots, y_n) = g(u, \theta) \cdot h(y_1, \ldots, y_n)$, where $h$ does not depend on $\theta$. To find a sufficient statistic, write out the likelihood and look for a factorisation of that form.

9.2 Examples

10 MVUEs

Find the MVUE of some parameter $\theta$, using some statistic.

OR: show that the MVUE of some parameter is something.

OR: determine whether or not something is an MVUE.

10.1 General solution

Not too hard, it seems: take a statistic that is sufficient for $\theta$, then find a function of it whose expected value is exactly $\theta$. By the Rao-Blackwell approach, an unbiased estimator that is a function of a sufficient statistic is the MVUE. To determine whether or not something is an MVUE, check those two properties.

10.2 Examples

11 Estimating with a bound

Given the sample mean and standard deviation or variance (random sampling of size $n$), estimate the real mean or whatever and place a bound on the error of estimation.

11.1 General solution

Use the sample mean (or whichever statistic is appropriate) for the estimate. The bound is given by the formula

$$b = 2\sigma_{\hat \theta}$$

where $\sigma_{\hat \theta}$ is the standard error of the estimator; for the sample mean, that's $\sigma/\sqrt{n}$. The factor of 2 comes from the empirical rule: an approximately normal estimator falls within two standard errors of the parameter roughly 95% of the time.

11.2 Examples

  • Assignment 2, question 7

12 Finding distribution functions

Given the probability density function, find the distribution function.

12.1 General solution

Integrate it: $\displaystyle F(y) = \int_{-\infty}^{y} f(t)\,dt$. Should this be moved to the top? It's pretty basic.
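A numerical sanity check of the "integrate it" step, using the exponential(1) density as a made-up example (its closed-form CDF is $F(y) = 1 - e^{-y}$):

```python
import math

def pdf(t):
    # exponential(1) density on [0, infinity)
    return math.exp(-t)

def cdf_numeric(y, steps=10000):
    # trapezoidal rule on [0, y]
    h = y / steps
    total = 0.5 * (pdf(0.0) + pdf(y))
    for k in range(1, steps):
        total += pdf(k * h)
    return total * h

approx = cdf_numeric(2.0)
exact = 1.0 - math.exp(-2.0)
```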

12.2 Examples

  • Assignment 2, question 8 (a)

13 Pivotal quantities

Given the probability density function, show that something is a pivotal quantity and use that pivotal quantity to find an $\alpha$% lower confidence limit for $\theta$.

13.1 General solution

A pivotal quantity is a function of the sample and $\theta$ whose distribution does not depend on $\theta$ (or on any other unknown parameter). To get a confidence limit from one: bracket the pivotal quantity with quantiles of its known distribution, then rearrange the inequality to isolate $\theta$.

13.2 Examples

  • Assignment 2, question 8 (b) and (c)

14 Confidence intervals

Find a something% confidence interval for some statistic. Use that interval to make a judgement about possible values for the statistic or interpret the results. Optionally, place a bound on the error of estimation, or just set up an upper bound or something. What assumptions are necessary for the methods used to be valid?

14.1 General solution

For large samples, the standard two-sided interval is $\hat \theta \pm z_{\alpha/2}\,\sigma_{\hat \theta}$; for small samples from normal populations, swap in the appropriate $t$, $\chi^2$, or $F$ quantiles. The usual assumptions: random sampling, and either a large enough $n$ for the CLT or normality of the underlying population. Making a judgement about the statistic is trivial: just see whether the value in question falls within the interval or not.
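The large-sample recipe in code form (the summary statistics are hypothetical; $z_{0.025} \approx 1.96$):

```python
import math

# Hypothetical summary statistics, for illustration only.
n = 64
ybar = 50.0     # sample mean
s = 8.0         # sample standard deviation

z = 1.96                  # z_{alpha/2} for a 95% interval
se = s / math.sqrt(n)     # estimated standard error of the mean
lower = ybar - z * se
upper = ybar + z * se     # 95% CI for the population mean
```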

14.2 Examples

15 Sample sizes

Find the sample size necessary to estimate some parameter $\theta$ within some range with some probability, when either the parameter is thought to be around $\theta_0$ or when we know nothing about $\theta$ (but we do know the variance).

Alternatively, if there are two populations and we want to estimate the distance between them or something, find the necessary sample size etc

15.1 General solution

Converse of confidence intervals, more or less: set the desired bound equal to $B = z_{\alpha/2}\sigma_{\hat \theta}$, substitute the formula for $\sigma_{\hat \theta}$ (e.g. $\sigma/\sqrt{n}$ for a mean, or $\sqrt{p(1-p)/n}$ for a proportion, with $p = 0.5$ when nothing is known), and solve for $n$, rounding up. The two-population version works the same way with $\sigma_{\hat \theta} = \sqrt{\sigma_1^2/n + \sigma_2^2/n}$ for equal sample sizes.
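A sketch of the single-mean case with hypothetical numbers (95% confidence, bound $B = 2$, and $\sigma$ assumed to be around 10):

```python
import math

z = 1.96        # z_{alpha/2} for 95% confidence
sigma = 10.0    # assumed population standard deviation (illustrative)
B = 2.0         # desired bound on the error of estimation

# Solve B = z * sigma / sqrt(n) for n, then round up.
n_exact = (z * sigma / B) ** 2
n = math.ceil(n_exact)
```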

15.2 Examples

  • Assignment 3, questions 2, 3 (b), and 4

16 Hypothesis testing

State the null and alternative hypotheses. Test one hypothesis against another. Make conclusions? Profit[3]

Also optionally find the $p$-value for the test. Why does it feel like this class is just about memorising? I dropped bio for a reason ;_;

16.1 General solution

The large-sample recipe: compute the test statistic $Z = (\hat \theta - \theta_0)/\sigma_{\hat \theta}$, compare it to the critical value ($z_\alpha$ for a one-tailed test, $z_{\alpha/2}$ for two-tailed), and reject $H_0$ if it lands in the rejection region. The $p$-value is the probability, assuming $H_0$ is true, of a test statistic at least as extreme as the one observed; reject when $p < \alpha$.
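A large-sample $z$-test sketched with stdlib Python (the data summary is made up; the normal CDF comes from `math.erf`):

```python
import math

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical setup: H0: mu = 100 vs Ha: mu > 100.
n = 36
ybar = 103.0
s = 9.0
mu0 = 100.0

z_stat = (ybar - mu0) / (s / math.sqrt(n))   # standardised test statistic
p_value = 1.0 - norm_cdf(z_stat)             # upper-tail p-value
reject = p_value < 0.05                      # decision at alpha = 0.05
```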

16.2 Examples

17 Uniformly most powerful tests

Find the uniformly most powerful test for testing some hypothesis against some other hypothesis.

Or, find a most powerful critical region.

What conclusion would you make given some observations?

17.1 General solution

Neyman-Pearson. The lemma says that for testing a simple $H_0: \theta = \theta_0$ against a simple $H_a: \theta = \theta_a$, the most powerful test at a given significance level rejects when the likelihood ratio $\frac{L(\theta_0)}{L(\theta_a)} < k$, with $k$ chosen to give the desired level. If the resulting rejection region turns out not to depend on the particular value of $\theta_a$, the same test is uniformly most powerful against the composite alternative.

17.2 Examples

18 Using the factorisation criterion

Wasn't really sure how else to classify it

18.1 General solution

Use theorem 9.4, the factorisation criterion: $U$ is sufficient for $\theta$ iff the likelihood factors as $L(y_1, \ldots, y_n) = g(u, \theta) \cdot h(y_1, \ldots, y_n)$, where $h$ does not depend on $\theta$.
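A minimal worked instance (a hypothetical exponential example with density $f(y|\theta) = \theta^{-1}e^{-y/\theta}$, chosen only for illustration):

```latex
L(y_1, \ldots, y_n \mid \theta)
  = \prod_{i=1}^n \frac{1}{\theta} e^{-y_i/\theta}
  = \underbrace{\theta^{-n} e^{-\sum_i y_i/\theta}}_{g\left(\sum_i y_i,\, \theta\right)}
    \cdot \underbrace{1}_{h(y_1, \ldots, y_n)}
```

so $\sum_i Y_i$ (and hence $\overline Y$) is sufficient for $\theta$.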

18.2 Examples

  • Assignment 5, question 1

19 Linear regression

Method of least squares, finding the least squares line (or LOBF), etc. Might even include confidence intervals or testing hypotheses, lucky you. Use the LOBF to predict things?

19.1 General solution

For simple linear regression $y = \beta_0 + \beta_1 x + \varepsilon$, the least-squares estimates are

$$\hat \beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_i (x_i - \overline x)(y_i - \overline y)}{\sum_i (x_i - \overline x)^2} \quad \hat \beta_0 = \overline y - \hat \beta_1 \overline x$$

To predict, plug the new $x$ into $\hat y = \hat \beta_0 + \hat \beta_1 x$. Confidence intervals and hypothesis tests for the coefficients follow the usual patterns (see those sections), using the standard errors of $\hat \beta_0$ and $\hat \beta_1$.
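A code sketch of the least-squares computation, $\hat\beta_1 = S_{xy}/S_{xx}$ and $\hat\beta_0 = \overline y - \hat\beta_1 \overline x$, on a tiny made-up dataset lying exactly on $y = 1 + 2x$ (so the fit should recover intercept 1 and slope 2):

```python
# Least-squares fit for simple linear regression (illustrative data).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
sxx = sum((x - xbar) ** 2 for x in xs)

b1 = sxy / sxx          # slope estimate
b0 = ybar - b1 * xbar   # intercept estimate

def predict(x):
    # point prediction from the fitted line
    return b0 + b1 * x
```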

19.2 Examples

20 ANOVA tables

Given an ANOVA (analysis of variance) table, FILL IT OUT

Then, do the data support some hypothesis, etc (see the hypothesis testing section)

20.1 General solution

The relationships that let you fill in the blanks: each source of variation has a sum of squares (SS) and degrees of freedom (df); the rows add up to the total (Total SS = SST + SSE, and likewise for df); each mean square is MS = SS/df; and the $F$ statistic is MST/MSE. Work backwards from whichever entries are given.
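A code sketch of one-way ANOVA bookkeeping on made-up groups: the sums of squares split as Total SS = SST + SSE, each MS is SS/df, and $F$ = MST/MSE:

```python
# One-way ANOVA table computed from raw data (groups are illustrative).
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [5.0, 6.0, 7.0],
]
k = len(groups)
all_ys = [y for g in groups for y in g]
n = len(all_ys)
grand_mean = sum(all_ys) / n

# Sum of squares for treatments (between groups).
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Sum of squares for error (within groups).
sse = sum(sum((y - sum(g) / len(g)) ** 2 for y in g) for g in groups)
total_ss = sum((y - grand_mean) ** 2 for y in all_ys)

df_t, df_e = k - 1, n - k   # treatment and error degrees of freedom
mst = sst / df_t            # mean square for treatments
mse = sse / df_e            # mean square for error
f_stat = mst / mse          # F statistic for H0: all group means equal
```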

20.2 Examples

  1. lol 

  2. 6 (b) isn't mentioned as an example in any of the problems because it's just deriving variance, which is a component of other problem types. Actually maybe I'll make this into a problem type eventually. 

  3. Not guaranteed