Example (Uniform distribution – some unusual features!). Suppose $X_{1}, \ldots, X_{n} \stackrel{\text { iid }}{\sim}$ Uniform $[0, \theta]$, where $\theta>0$, i.e.
$$
f(x ; \theta)= \begin{cases}\frac{1}{\theta} & \text { if } 0 \leqslant x \leqslant \theta \\ 0 & \text { otherwise. }\end{cases}
$$
What is the MLE for $\theta$ ? Is the MLE unbiased?
Calculate the likelihood:
$$
\begin{aligned}
L(\theta) &=\prod_{i=1}^{n} f\left(x_{i} ; \theta\right) \\
&= \begin{cases}\frac{1}{\theta^{n}} & \text { if } 0 \leqslant x_{i} \leqslant \theta \text { for all } i \\
0 & \text { otherwise }\end{cases} \\
&= \begin{cases}0 & \text { if } 0<\theta<\max x_{i} \\
\frac{1}{\theta^{n}} & \text { if } \theta \geqslant \max x_{i}\end{cases}
\end{aligned}
$$
Note: $\theta \geqslant x_{i}$ for all $i \Longleftrightarrow \theta \geqslant \max x_{i}$. (Here $\max x_{i}$ means $\max _{1 \leqslant i \leqslant n} x_{i}$.)

[Figure 4.1: Likelihood $L(\theta)$ for the $\mathrm{Uniform}[0, \theta]$ example.]
From the diagram:
• the max occurs at $\widehat{\theta}=\max x_{i}$
• this is not a point where $\ell^{\prime}(\theta)=0$
• taking logs doesn't help.
Consider the range of values of $x$ for which $f(x ; \theta)>0$, i.e. $0 \leqslant x \leqslant \theta$. The thing that makes this example different to our previous ones is that this range depends on $\theta$ (and we must take this into account because the likelihood is a function of $\theta$ ).
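To see this shape concretely, here is a minimal numerical sketch (assuming Python with NumPy; the seed, sample size and true $\theta$ are arbitrary illustrative choices). It evaluates $L(\theta)$ on a grid and confirms the maximiser sits at $\max x_{i}$ rather than at a stationary point:
```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary seed, for reproducibility
theta_true, n = 2.0, 10           # hypothetical true parameter and sample size
x = rng.uniform(0, theta_true, size=n)

def likelihood(theta, x):
    """L(theta) = theta^(-n) if theta >= max x_i, else 0."""
    return np.where(theta >= x.max(), theta ** (-len(x)), 0.0)

# L is 0 to the left of max x_i, jumps to its maximum at theta = max x_i,
# then decreases like theta^(-n).
grid = np.linspace(0.01, 2 * theta_true, 2000)
L = likelihood(grid, x)
print("max x_i     :", x.max())
print("argmax of L :", grid[np.argmax(L)])   # agrees up to grid resolution
```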
The MLE of $\theta$ is $\widehat{\theta}=\max X_{i}$. What is $E(\widehat{\theta})$ ?
Find the c.d.f. of $\widehat{\theta}$ :
$$
\begin{aligned}
F(y) &=P(\widehat{\theta} \leqslant y) \\
&=P\left(\max X_{i} \leqslant y\right) \\
&=P\left(X_{1} \leqslant y, X_{2} \leqslant y, \ldots, X_{n} \leqslant y\right) \\
&=P\left(X_{1} \leqslant y\right) P\left(X_{2} \leqslant y\right) \ldots P\left(X_{n} \leqslant y\right) \quad \text { since } X_{i} \text { independent } \\
&= \begin{cases}0 & \text { if } y<0 \\
(y / \theta)^{n} & \text { if } 0 \leqslant y \leqslant \theta \\
1 & \text { if } y>\theta .\end{cases}
\end{aligned}
$$
So, differentiating the c.d.f., the p.d.f. is
$$
f(y)=\frac{n y^{n-1}}{\theta^{n}}, \quad 0 \leqslant y \leqslant \theta .
$$
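As a sanity check on this c.d.f., a quick Monte Carlo comparison (a sketch assuming Python with NumPy; $\theta$, $n$, the evaluation points and the replication count are all arbitrary):
```python
import numpy as np

rng = np.random.default_rng(1)    # arbitrary seed
theta, n, reps = 2.0, 5, 100_000  # hypothetical values
# Each row is one sample of size n; its max is one draw of theta-hat.
maxima = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

for y in [0.5, 1.0, 1.5, 1.9]:
    empirical = (maxima <= y).mean()
    exact = (y / theta) ** n
    print(f"y={y}: empirical {empirical:.4f} vs (y/theta)^n {exact:.4f}")
```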
So
$$
\begin{aligned}
E(\widehat{\theta}) &=\int_{0}^{\theta} y \cdot \frac{n y^{n-1}}{\theta^{n}} d y \\
&=\frac{n}{\theta^{n}} \int_{0}^{\theta} y^{n} d y \\
&=\frac{n \theta}{n+1} .
\end{aligned}
$$
So $\widehat{\theta}$ is not unbiased. But note that it is asymptotically unbiased: $E(\widehat{\theta}) \rightarrow \theta$ as $n \rightarrow \infty$. In fact under mild assumptions MLEs are always asymptotically unbiased.
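A simulation makes both the bias and its decay visible (again a NumPy sketch with arbitrary illustrative values):
```python
import numpy as np

rng = np.random.default_rng(2)    # arbitrary seed
theta, reps = 2.0, 100_000        # hypothetical values

for n in [5, 20, 100]:
    maxima = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
    print(f"n={n:3d}: mean of theta-hat {maxima.mean():.4f}, "
          f"theory n*theta/(n+1) = {n * theta / (n + 1):.4f}")
# The bias -theta/(n+1) shrinks as n grows: asymptotic unbiasedness.
```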
Example (Uniform distribution). Suppose $X_{1}, \ldots, X_{n} \stackrel{\text { iid }}{\sim}$ Uniform $[0, \theta]$, i.e.
$$
f(x ; \theta)= \begin{cases}\frac{1}{\theta} & \text { if } 0 \leqslant x \leqslant \theta \\ 0 & \text { otherwise. }\end{cases}
$$
We will consider two estimators of $\theta$ :
• $T=2 \bar{X}$, the natural estimator based on the sample mean (because the mean of the distribution is $\theta / 2$ )
• $\widehat{\theta}=\max X_{i}$, the MLE.
Now $E(T)=2 E(\bar{X})=\theta$, so $T$ is unbiased. Hence
$$
\begin{aligned}
\operatorname{MSE}(T) &=\operatorname{var}(T) \\
&=4 \operatorname{var}(\bar{X}) \\
&=\frac{4 \operatorname{var}\left(X_{1}\right)}{n} .
\end{aligned}
$$
We have $E\left(X_{1}\right)=\theta / 2$ and
$$
E\left(X_{1}^{2}\right)=\int_{0}^{\theta} x^{2} \cdot \frac{1}{\theta} d x=\frac{\theta^{2}}{3}
$$
So
$$
\operatorname{var}\left(X_{1}\right)=\frac{\theta^{2}}{3}-\left(\frac{\theta}{2}\right)^{2}=\frac{\theta^{2}}{12}
$$
hence
$$
\operatorname{MSE}(T)=\frac{4 \operatorname{var}\left(X_{1}\right)}{n}=\frac{\theta^{2}}{3 n}
$$
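This is easy to confirm by simulation (a NumPy sketch; the values of $\theta$, $n$ and the replication count are arbitrary):
```python
import numpy as np

rng = np.random.default_rng(3)      # arbitrary seed
theta, n, reps = 2.0, 10, 200_000   # hypothetical values

samples = rng.uniform(0, theta, size=(reps, n))
T = 2 * samples.mean(axis=1)        # T = 2 * X-bar
mse_T = ((T - theta) ** 2).mean()
print("simulated MSE(T)   :", round(mse_T, 5))
print("theory theta^2/(3n):", round(theta**2 / (3 * n), 5))
```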
Previously we showed that $\widehat{\theta}$ has p.d.f.
$$
f(y)=\frac{n y^{n-1}}{\theta^{n}}, \quad 0 \leqslant y \leqslant \theta
$$
and $E(\widehat{\theta})=n \theta /(n+1)$. So $b(\widehat{\theta})=n \theta /(n+1)-\theta=-\theta /(n+1)$. Also,
$$
E\left(\widehat{\theta}^{2}\right)=\int_{0}^{\theta} y^{2} \cdot \frac{n y^{n-1}}{\theta^{n}} d y=\frac{n \theta^{2}}{n+2}
$$
So
$$
\operatorname{var}(\widehat{\theta})=\theta^{2}\left(\frac{n}{n+2}-\frac{n^{2}}{(n+1)^{2}}\right)=\frac{n \theta^{2}}{(n+1)^{2}(n+2)}
$$
hence
$$
\begin{aligned}
\operatorname{MSE}(\widehat{\theta}) &=\operatorname{var}(\widehat{\theta})+[b(\widehat{\theta})]^{2} \\
&=\frac{2 \theta^{2}}{(n+1)(n+2)} \\
&<\frac{\theta^{2}}{3 n} \quad \text { for } n \geqslant 3 \\
&=\operatorname{MSE}(T)
\end{aligned}
$$
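Before the remarks below, a Monte Carlo check of this comparison (same NumPy setup, arbitrary values); it also illustrates the $1 / n^{2}$ versus $1 / n$ behaviour noted next:
```python
import numpy as np

rng = np.random.default_rng(4)    # arbitrary seed
theta, reps = 2.0, 200_000        # hypothetical values

for n in [3, 10, 30]:
    samples = rng.uniform(0, theta, size=(reps, n))
    mse_hat = ((samples.max(axis=1) - theta) ** 2).mean()
    mse_T = ((2 * samples.mean(axis=1) - theta) ** 2).mean()
    theory = 2 * theta**2 / ((n + 1) * (n + 2))
    print(f"n={n:2d}: MSE(theta-hat) {mse_hat:.5f} "
          f"(theory {theory:.5f}), MSE(T) {mse_T:.5f}")
# MSE(theta-hat) falls like 1/n^2 while MSE(T) falls like 1/n.
```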
• $\operatorname{MSE}(\widehat{\theta}) \ll \operatorname{MSE}(T)$ for large $n$, so $\widehat{\theta}$ is much better: its MSE decreases like $1 / n^{2}$ rather than $1 / n$.
• Note that $\left(\frac{n+1}{n}\right) \widehat{\theta}$ is unbiased and
$$
\begin{aligned}
\operatorname{MSE}\left(\frac{n+1}{n} \widehat{\theta}\right) &=\operatorname{var}\left(\frac{n+1}{n} \widehat{\theta}\right) \\
&=\frac{(n+1)^{2}}{n^{2}} \operatorname{var}(\widehat{\theta}) \\
&=\frac{\theta^{2}}{n(n+2)} \\
&<\operatorname{MSE}(\widehat{\theta}) \quad \text { for } n \geqslant 2
\end{aligned}
$$
However, among all estimators of the form $\lambda \widehat{\theta}$, the MSE is minimised by $\left(\frac{n+2}{n+1}\right) \widehat{\theta}$. [To show this: note $\operatorname{var}(\lambda \widehat{\theta})=\lambda^{2} \operatorname{var}(\widehat{\theta})$ and $b(\lambda \widehat{\theta})=\frac{\lambda n \theta}{n+1}-\theta$. Now plug in formulae and minimise over $\lambda$, as carried out below.]
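Carrying out the bracketed calculation, using $\operatorname{var}(\widehat{\theta})=\frac{n \theta^{2}}{(n+1)^{2}(n+2)}$ from above:
$$
\operatorname{MSE}(\lambda \widehat{\theta})=\lambda^{2} \operatorname{var}(\widehat{\theta})+[b(\lambda \widehat{\theta})]^{2}=\theta^{2}\left[\frac{\lambda^{2} n}{(n+1)^{2}(n+2)}+\left(\frac{\lambda n}{n+1}-1\right)^{2}\right] .
$$
Setting the derivative with respect to $\lambda$ to zero:
$$
\frac{2 \lambda n}{(n+1)^{2}(n+2)}+\frac{2 n}{n+1}\left(\frac{\lambda n}{n+1}-1\right)=0 \Longrightarrow \lambda\left(\frac{1}{n+2}+n\right)=n+1 \Longrightarrow \lambda=\frac{n+2}{n+1},
$$
and substituting back gives $\operatorname{MSE}\left(\frac{n+2}{n+1} \widehat{\theta}\right)=\frac{\theta^{2}}{(n+1)^{2}}$.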