
[Inequalities] On the mode of the binomial distribution


hbghlyj Posted 2019-9-25 00:04
Last edited by hbghlyj 2023-3-6 20:02
Binomial distribution
PMF: $\displaystyle \binom {n}{k}p^{k}q^{n-k}$
Mode: $\lfloor (n+1)p \rfloor$ or $\displaystyle \lceil (n+1)p\rceil -1$


Finding mode in Binomial distribution
Let $a_k=P(X=k)$, we have
$$a_k=\binom{n}{k}p^kq^{n-k}\qquad\text{and}\qquad a_{k+1}=\binom{n}{k+1}p^{k+1}q^{n-k-1},$$
where as usual $q=1-p$ in binomial distribution.

We calculate the ratio $\dfrac{a_{k+1}}{a_k}$. Note that $\frac{\binom{n}{k+1}}{\binom{n}{k}}$ simplifies to $\frac{n-k}{k+1},$
and therefore
$$\frac{a_{k+1}}{a_k}=\frac{n-k}{k+1}\cdot\frac{p}{q}=\frac{n-k}{k+1}\cdot\frac{p}{1-p}.$$

From this ratio we can conclude:

\begin{align*}
k > (n+1)p-1 \implies a_{k+1} < a_k \\  
k = (n+1)p-1 \implies a_{k+1} = a_k \\
k < (n+1)p-1 \implies a_{k+1} > a_k
\end{align*}
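
For readers who want the intermediate algebra, here is one way to obtain the threshold from the ratio. It assumes $0<p<1$, so that $(k+1)(1-p)$ is positive and can be multiplied through; the cases $p=0$ and $p=1$ need separate care.

\begin{align*}
\frac{a_{k+1}}{a_k}<1 &\iff (n-k)p<(k+1)(1-p)\\
&\iff np-kp<k+1-p-kp\\
&\iff k>(n+1)p-1.
\end{align*}

The same steps with $=$ or $>$ in place of $<$ give the other two cases, with the final line becoming $k=(n+1)p-1$ or $k<(n+1)p-1$ respectively.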
  
The calculation (almost) says that two consecutive probabilities are equal, $a_{k+1}=a_k$, precisely if $k=np+p-1$. Note that $k=np+p-1$ can hold only if $np+p-1$ is an integer.


So if $np+p-1$ is not an integer, there is a single mode; and if $np+p-1$ is an integer, there are two modes, at $np+p-1$ and at $np+p$.

Not quite! We have been a little casual in our algebra. We have not paid attention to whether we might be multiplying or dividing by $0$.  We also have casually accepted what the algebra seems to say, without doing a reality check.  

Suppose that $p=0$. Then $np+p-1$ is an integer, namely $-1$. But whatever $n$ is, there is a single mode, namely $k=0$. In all other situations where $np+p-1$ is an integer, the $k$ we have identified is non-negative.

However, suppose that $p=1$. Again, $np+p-1$ is an integer, and again there is no double mode. The largest $a_k$ occurs at one place only, namely $k=n$, since $np+p$ is in this case beyond our range.

That completes the analysis when $np+p-1$ is an integer. When it is not, the analysis is simple. There is a single mode, at $\lfloor np+p\rfloor$.
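
The case analysis above can be sketched in code. This is a minimal illustration, not part of the original post: the function names `binomial_modes` and `brute_force_modes` are made up here, and exact rational arithmetic (`fractions.Fraction`) is assumed so that the test "is $(n+1)p$ an integer" is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from math import comb, floor


def binomial_modes(n, p):
    """Modes of Binomial(n, p), following the case analysis in the post."""
    p = Fraction(p)
    if p == 0:                 # all mass at k = 0
        return [0]
    if p == 1:                 # all mass at k = n
        return [n]
    m = (n + 1) * p            # same as np + p
    if m.denominator == 1:     # np + p - 1 is an integer: double mode
        return [int(m) - 1, int(m)]
    return [floor(m)]          # otherwise a single mode


def brute_force_modes(n, p):
    """Reference answer: compute every a_k exactly and pick the largest."""
    p = Fraction(p)
    q = 1 - p
    pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    top = max(pmf)
    return [k for k, a in enumerate(pmf) if a == top]


print(binomial_modes(4, Fraction(2, 3)))  # [3]
print(binomial_modes(5, Fraction(2, 3)))  # [3, 4]
```

Comparing the two functions over a grid of small $n$ and rational $p$ is a quick sanity check on the special cases $p=0$ and $p=1$.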


 Author| hbghlyj Posted 2019-9-25 06:46
Last edited by hbghlyj 2019-9-25 06:52
Reply to 1# hbghlyj
a=5
(* for each n from 1 to 100: the index i of the largest coefficient of x^i in (5 x + 1)^n; on a tie the larger i is kept *)
Last /@ Last /@
  Table[MaximalBy[
    Table[{Coefficient[(5 x + 1)^n, x, i], i}, {i, 0, n}], First], {n, 1, 100}]
Output:
{1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12, 13, 14, 15, 15, 16,
17, 18, 19, 20, 20, 21, 22, 23, 24, 25, 25, 26, 27, 28, 29, 30, 30,
31, 32, 33, 34, 35, 35, 36, 37, 38, 39, 40, 40, 41, 42, 43, 44, 45,
45, 46, 47, 48, 49, 50, 50, 51, 52, 53, 54, 55, 55, 56, 57, 58, 59,
60, 60, 61, 62, 63, 64, 65, 65, 66, 67, 68, 69, 70, 70, 71, 72, 73,
74, 75, 75, 76, 77, 78, 79, 80, 80, 81, 82, 83, 84}
The pattern: every multiple of 5 is repeated once.
a=6
(* same computation for (6 x + 1)^n *)
Last /@ Last /@
  Table[MaximalBy[
    Table[{Coefficient[(6 x + 1)^n, x, i], i}, {i, 0, n}], First], {n, 1, 100}]
Output:
{1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 12, 12, 13, 14, 15, 16, 17,
18, 18, 19, 20, 21, 22, 23, 24, 24, 25, 26, 27, 28, 29, 30, 30, 31,
32, 33, 34, 35, 36, 36, 37, 38, 39, 40, 41, 42, 42, 43, 44, 45, 46,
47, 48, 48, 49, 50, 51, 52, 53, 54, 54, 55, 56, 57, 58, 59, 60, 60,
61, 62, 63, 64, 65, 66, 66, 67, 68, 69, 70, 71, 72, 72, 73, 74, 75,
76, 77, 78, 78, 79, 80, 81, 82, 83, 84, 84, 85, 86}
The pattern: every multiple of 6 is repeated once.
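
The two experiments above can be tied back to the mode formula from post 1#. The coefficients of $(ax+1)^n$ are $\binom{n}{i}a^i$, which are proportional to the binomial PMF with $p=\frac{a}{a+1}$, so the largest index should be $\lfloor (n+1)\frac{a}{a+1}\rfloor$. The Python sketch below (function names are illustrative, not from the thread) checks this for $a=5$ and $a=6$ using integer arithmetic only.

```python
from math import comb


def largest_coefficient_index(a, n):
    """Largest i that maximizes the coefficient of x^i in (a*x + 1)^n."""
    coeffs = [comb(n, i) * a**i for i in range(n + 1)]
    top = max(coeffs)
    return max(i for i, c in enumerate(coeffs) if c == top)


def predicted_index(a, n):
    """floor((n + 1) * p) with p = a / (a + 1), in integer arithmetic."""
    return (n + 1) * a // (a + 1)


for a in (5, 6):
    agree = all(
        largest_coefficient_index(a, n) == predicted_index(a, n)
        for n in range(1, 101)
    )
    print(a, agree)  # prints "5 True" and "6 True"
```

In this form the repetition is visible directly: the floor stays at a multiple of $a$ for one extra step each time $n+1$ passes a multiple of $a+1$.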


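mowxqq Posted 2022-3-16 17:52
Quote: What is obtained this way is only a necessary condition. It only shows that $a_k \geqslant a_{k-1}$ and $a_{\ldots}$ ...
hbghlyj posted on 2019-9-25 00:04
[Image: 19092500018e42bc817400d948.png]
Solve these two inequalities for the range of $k$. The coefficients are monotonic on both sides, so isn't there just a single extremum?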


力工 Posted 2022-3-16 18:24
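These two inequalities say that the sequence is non-decreasing to the left of $k$ and non-increasing to the right of it, so $a_k$ must be the maximum. Still, this feels risky: if $k$ happens to be $1$ or $n$, one of the two inequalities has no solution.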


 Author| hbghlyj Posted 2022-3-18 07:02
$$\binom{n}{\lfloor n/2\rfloor}=\max_k\binom nk$$
I just ran into this again in the thread on the Erdős-Ko-Rado theorem.


 Author| hbghlyj Posted 2022-10-14 17:42
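Hypergeometric distribution
PMF (probability mass function): $$p_{X}(k)=\Pr(X=k)={\frac {{\binom {K}{k}}{\binom {N-K}{n-k}}}{\binom {N}{n}}}$$
The pmf is positive when $\displaystyle \max(0,n+K-N)\leq k\leq \min(K,n)$.

Mode: $\displaystyle \left\lceil {\frac {(n+1)(K+1)}{N+2}}\right\rceil -1,\left\lfloor {\frac {(n+1)(K+1)}{N+2}}\right\rfloor $.
A similar computation:
$$\frac{p_X(k)}{p_X(k-1)}=\frac{K-k+1}k\cdot\frac{n-k+1}{N-K-n+k}$$
When $p_X(k)>0$ and $p_X(k-1)>0$,
\begin{align*}
\frac{p_X(k)}{p_X(k-1)}>1&\Leftrightarrow\frac{K-k+1}k>\frac{N-K-n+k}{n-k+1}\\
&\Leftrightarrow\frac{K+1}k>\frac{N-K+1}{n-k+1}\\
&\Leftrightarrow\frac{K+1}k>\frac{N+2}{n+1}\\
&\Leftrightarrow k<\frac{(n+1)(K+1)}{N+2}
\end{align*}
The second line adds $1$ to both sides; the third replaces the right-hand side by the mediant of the two fractions, which lies between them. The same chain with $=$ in place of $>$ shows that $p_X(k)=p_X(k-1)$ exactly when $k=\frac{(n+1)(K+1)}{N+2}$.
Hence $p_X(k-1)\le p_X(k)$ for $\max(0,n+K-N)+1\leq k\leq\left\lfloor {\frac {(n+1)(K+1)}{N+2}}\right\rfloor $.
Similarly, $p_X(k)\ge p_X(k+1)$ for $\left\lceil {\frac {(n+1)(K+1)}{N+2}}\right\rceil -1\leq k\leq\min(K,n)-1$. So the maximum is attained at the two values listed above, which coincide unless $\frac{(n+1)(K+1)}{N+2}$ is an integer.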
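
As a sanity check on the mode formula, the following Python sketch compares it with a brute-force maximum over the support for small parameters. The helper names are hypothetical, and the common denominator $\binom{N}{n}$ is dropped because it does not affect which $k$ is largest.

```python
from fractions import Fraction
from math import ceil, comb, floor


def brute_force_modes(N, K, n):
    """All k in the support that maximize C(K, k) * C(N - K, n - k)."""
    lo, hi = max(0, n + K - N), min(K, n)
    weights = {k: comb(K, k) * comb(N - K, n - k) for k in range(lo, hi + 1)}
    top = max(weights.values())
    return sorted(k for k, w in weights.items() if w == top)


def formula_modes(N, K, n):
    """ceil(m) - 1 and floor(m), where m = (n + 1)(K + 1) / (N + 2)."""
    m = Fraction((n + 1) * (K + 1), N + 2)
    return sorted({ceil(m) - 1, floor(m)})


mismatches = [
    (N, K, n)
    for N in range(1, 13)
    for K in range(N + 1)
    for n in range(N + 1)
    if brute_force_modes(N, K, n) != formula_modes(N, K, n)
]
print(mismatches)  # []
```

Using `Fraction` for $m$ keeps the floor and ceiling exact, which matters in the tie case where $m$ is an integer.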


 Author| hbghlyj Posted 2022-12-12 18:27

Mode of the hypergeometric distribution

hyperGeom.pdf
Examples of the Hypergeometric Distribution
The hypergeometric distribution, $h(N; k; n; x)$, arises in the following way:
Suppose we have $N$ balls $k$ of which are red and $N-k$ are blue. We draw a sample, without replacement, of $n$ balls. Let $X$ equal the number of red balls in our sample of size $n$.
Then $$\Pr(X=x)=h(N, k, n, x)=\frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad 0 \leq x \leq n \tag{1}$$
Example 1. (Exercise 5.1.36) A bin of 1000 turnbuckles has an unknown number $D$ of defectives. A sample of 100 turnbuckles has 2 defectives. The maximum likelihood estimate for $D$ is the number which gives the highest probability for obtaining the number of defectives observed in a sample. Find that value of $D$.
Solution: In this example $N = 1000, k = D$ (the defective turnbuckles are called red balls), and the sample size is $n=100$. What value of $D$ makes this the most probable event? We make a table of $h(1000; D; 100; 2)$ for varying values of $D$. Here's the result from Mathematica:
In[15]:= n1=Binomial[1000,100];
In[16]:= pr[d_]:=Binomial[d,2]*Binomial[1000-d,98]/n1;
In[17]:= Table[{d,N[pr[d]]},{d,2,40}]
Out[17]= {{2, 0.00990991}, {3, 0.0268104}, {4, 0.0483501}, {5, 0.0726546},
> {6, 0.098248}, {7, 0.123986}, {8, 0.149}, {9, 0.172646}, {10, 0.194466},
> {11, 0.214153}, {12, 0.231519}, {13, 0.246474}, {14, 0.259001},
> {15, 0.269145}, {16, 0.276991}, {17, 0.282658}, {18, 0.286288},
> {19, 0.288038}, {20, 0.28807}, {21, 0.286554}, {22, 0.283656},
> {23, 0.27954}, {24, 0.274364}, {25, 0.268278}, {26, 0.261422},
> {27, 0.253928}, {28, 0.245918}, {29, 0.237503}, {30, 0.228785},
> {31, 0.219855}, {32, 0.210795}, {33, 0.201677}, {34, 0.192565},
> {35, 0.183516}, {36, 0.174578}, {37, 0.165792}, {38, 0.157194},
> {39, 0.148812}, {40, 0.14067}}
So we see that the most likely value of $D$ is $20$.
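
The table above can also be reproduced without Mathematica. A minimal Python sketch, assuming only the setup of Example 1 (the variable names are chosen here for illustration), searches all feasible $D$ with exact integer arithmetic:

```python
from math import comb

POPULATION, SAMPLE, FOUND = 1000, 100, 2


def weight(d):
    """Numerator of h(1000; d; 100; 2); C(1000, 100) is constant in d."""
    return comb(d, FOUND) * comb(POPULATION - d, SAMPLE - FOUND)


# d must leave at least SAMPLE - FOUND non-defective items in the bin
feasible = range(FOUND, POPULATION - (SAMPLE - FOUND) + 1)
best = max(feasible, key=weight)
print(best)  # 20
```

Because only the argmax is needed, the normalizing constant is skipped and no floating point is involved.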
Example 2. (Exercise 5.1.37) There are an unknown number of moose on Isle Royale. To estimate the number of moose, 50 moose are captured and tagged. Six months later 200 moose are captured and it is found that 8 of these are tagged. Estimate the number of moose on Isle Royale from these data.
Solution: In the hypergeometric distribution let $N$ equal the total number of moose. We are told that $k = 50$ of them are tagged. (The tagged moose are the red balls.) A sample of size $n = 200$ is drawn and we are told that 8 are tagged. We ask for what choice of $N$ maximizes the probability $h(N; 50; 200; 8)$, where $h(N; k; n; x)$ is given above in (1). If one makes a table as in the above example, one finds that $N = 1250$ maximizes the hypergeometric distribution for the above values of $n, k$ and $x$. Figure 1 shows a plot of $h(N; 50; 200; 8)$ as a function of $N$.
[Figure 1: plot of $h(N; 50; 200; 8)$ as a function of $N$ (screenshot from hyperGeom.pdf)]


 Author| hbghlyj Posted 2022-12-12 18:46
So we see that the most likely value of $D$ is $20$.
$$\frac{h(N, k\phantom{-,1}, n, x)}{h(N, k-1, n, x)}=\frac k{k-x}\cdot\frac{N-k-n+x+1}{N-k+1}\ge1\Leftrightarrow k\le\frac{(N + 1) x}n$$
Substituting $N=1000,x=2,n=100$ gives $k\le20.02$, so the maximum is attained at $k=20$.
one finds that $N = 1250$ maximizes the hypergeometric distribution for the above values of $n, k$ and $x$.
$$\frac{h(N\phantom{-,1}, k, n, x)}{h(N-1, k, n, x)}=\frac{\frac{N-k}{N-k-n+x}}{\frac{N}{N-n}}\ge1\Leftrightarrow N\le\frac{kn}x$$
Substituting $k=50,n=200,x=8$ gives $N\le1250$, with equality at $N=1250$, so the maximum is attained at both $N=1250$ and $N=1249$.
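
The tie between $N=1249$ and $N=1250$ is easy to miss in a floating-point table, so here is a small exact check in Python. It is only a sketch: the function `h` mirrors the formula for $h(N,k,n,x)$, and the search window (up to 3000) is an arbitrary assumption that is wide enough to contain the maximum.

```python
from fractions import Fraction
from math import comb


def h(N, k, n, x):
    """Exact hypergeometric probability h(N, k, n, x)."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))


k, n, x = 50, 200, 8
# N must be at least k + (n - x) = 242 for the observed sample to be possible
values = {N: h(N, k, n, x) for N in range(k + n - x, 3001)}
top = max(values.values())
modes = sorted(N for N, v in values.items() if v == top)
print(modes)  # [1249, 1250]
```

The exact comparison confirms that the ratio test $N\le\frac{kn}{x}$ holds with equality at $N=1250$, which is why two values of $N$ share the maximum.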


 Author| hbghlyj Posted 2022-12-12 19:06
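One can also treat $k$ as a continuous variable and differentiate, which involves the derivative of the Gamma function.
For $n=4,p=\frac23,q=\frac13,$
FindRoot[D[2^k/(k!(4-k)!),k]==0,{k,2}]
gives k≈2.84522.
The formula from post 1# gives $\lfloor5\cdot\frac23\rfloor=\lceil5\cdot\frac23\rceil-1=3$.

For $n=5,p=\frac23,q=\frac13,$
FindRoot[D[2^k/(k!(5-k)!),k]==0,{k,4}]
Differentiating the Gamma function gives k≈3.51004.
The formula from post 1# gives $\lfloor6\cdot\frac23\rfloor=4$ and $\lceil6\cdot\frac23\rceil-1=3$.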
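
For comparison, the continuous maximizer can be approximated without symbolic differentiation. The Python sketch below is a swapped-in technique, not the FindRoot call above: it runs a ternary search on the logarithm of $\frac{p^kq^{n-k}}{\Gamma(k+1)\Gamma(n-k+1)}$, which is concave in $k$, and it assumes the maximizer lies in $[0,n]$ (true for the two cases here). The function names are illustrative.

```python
from math import lgamma, log


def log_weight(k, n, p):
    """log of p^k q^(n-k) / (Gamma(k+1) Gamma(n-k+1)); log(n!) is dropped."""
    return k * log(p) + (n - k) * log(1 - p) - lgamma(k + 1) - lgamma(n - k + 1)


def continuous_maximizer(n, p, iterations=200):
    """Ternary search for the maximum of the concave log_weight on [0, n]."""
    lo, hi = 0.0, float(n)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_weight(m1, n, p) < log_weight(m2, n, p):
            lo = m1   # maximum lies to the right of m1
        else:
            hi = m2   # maximum lies to the left of m2
    return (lo + hi) / 2


print(continuous_maximizer(4, 2 / 3))
print(continuous_maximizer(5, 2 / 3))
```

Under these assumptions the two printed values should land close to the roots reported by FindRoot, and in both cases the integer modes from post 1# are the integers adjacent to the continuous maximizer.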
