|
Do Symmetric Problems Have Symmetric Solutions?
W. C. Waterhouse, The American Mathematical Monthly (MAA), pp. 378–387
Let $f\left(x_{1}, \ldots, x_{n}\right)$ and $g\left(x_{1}, \ldots, x_{n}\right)$ be two symmetric functions. On the set where $g$ stays equal to $g(r, \ldots, r)$, the function $f$ should have a local maximum or minimum at $(r, \ldots, r)$. Of course we do not yet have any idea why, or indeed whether, the principle should be true. All we know (from Bouniakovsky) is that no rudimentary symmetry argument will prove it. But at least we can test it by the general method for detecting a constrained extremum at a point $P$.

The classical way to remember this "Lagrange multiplier" method is to think of curves $c(t)$ that have $c(0)=P$ and lie in the set where $g$ is constant. Differentiating the condition $g(c(t))=$ constant, we find that
$$
0=\sum\left(D_{i} g\right)(P) c_{i}^{\prime}(0)
$$
which says that the curve's tangent vector $c^{\prime}(0)$ is perpendicular to the gradient vector $(\nabla g)(P)$ formed by the $D_{i} g=\partial g / \partial x_{i}$. Conversely, so long as $(\nabla g)(P)$ is nonzero, the implicit function theorem shows that every vector perpendicular to $(\nabla g)(P)$ occurs as the tangent to some such curve $c(t)$. Now if $f$ has a local (constrained) extremum at $P$, each $f(c(t))$ has an extremum at $0$, and its derivative vanishes there. Hence we have
$$
0=\sum\left(D_{i} f\right)(P) c_{i}^{\prime}(0)
$$
This says that $(\nabla f)(P)$ is also perpendicular to all the $c^{\prime}(0)$; since those tangent vectors fill out the whole hyperplane perpendicular to $(\nabla g)(P)$, it follows that $(\nabla f)(P)$ is a scalar multiple of $(\nabla g)(P)$.
Points where this condition holds (together with points where $\nabla g=0$ ) are the critical points for the constrained extremum problem.
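To summarize, the candidates are the simultaneous solutions $P$ of
$$
\nabla f(P)=\lambda\,(\nabla g)(P) \quad \text{and} \quad g(P)=g(r, \ldots, r)
$$
for some scalar $\lambda$, together with any points of the constraint set where $\nabla g$ vanishes.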
All this is familiar in general. Does something special happen when $f$ and $g$ are symmetric? Let us work out an example, say
$$
\begin{aligned}
&f\left(x_{1}, x_{2}, x_{3}\right)=x_{1} x_{2}+x_{2} x_{3}+x_{3} x_{1} \\
&g\left(x_{1}, x_{2}, x_{3}\right)=\left(x_{1} x_{2} x_{3}\right)^{2}
\end{aligned}
$$
The gradients of these are
$$
\begin{aligned}
&\nabla f=\left(x_{2}+x_{3}, x_{1}+x_{3}, x_{1}+x_{2}\right), \\
&\nabla g=2 x_{1} x_{2} x_{3}\left(x_{2} x_{3}, x_{1} x_{3}, x_{1} x_{2}\right) .
\end{aligned}
$$
We are interested in points $P$ of the form $(r, r, r)$, and there we have
$$
\begin{aligned}
&\nabla f(r, r, r)=(2 r, 2 r, 2 r)=2 r(1,1,1) \\
&\nabla g(r, r, r)=\left(2 r^{5}, 2 r^{5}, 2 r^{5}\right)=2 r^{5}(1,1,1)
\end{aligned}
$$
Thus $\nabla f$ is a multiple of $\nabla g$ simply because in each of them all the entries are equal. Once we see that this is what we want to prove, we have no difficulty proving it.
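Explicitly, when $r \neq 0$ we can read off the multiplier here:
$$
\nabla f(r, r, r)=r^{-4}\, \nabla g(r, r, r), \quad \text{that is,} \quad \lambda=r^{-4};
$$
when $r=0$, the gradient $\nabla g$ vanishes, a degenerate case we shall have to exclude later.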
LEMMA 1. Suppose $f\left(x_{1}, \ldots, x_{n}\right)$ is a symmetric differentiable function. Then at a point $x_{1}=\cdots=x_{n}=r$, all the $D_{i} f$ are equal.
Proof. Let $\pi$ be a permutation of $1,2, \ldots, n$. For any function $h$ we can define
$$
h_{\pi}\left(x_{1}, \ldots, x_{n}\right)=h\left(x_{\pi(1)}, \ldots, x_{\pi(n)}\right),
$$
and trivially then
$$
\left(D_{\pi(i)} h_{\pi}\right)\left(x_{1}, \ldots, x_{n}\right)=\left(D_{i} h\right)\left(x_{\pi(1)}, \ldots, x_{\pi(n)}\right) \text {. }
$$
In the present case all $f_{\pi}$ equal $f$ and all the $x_{i}$ are equal, so taking $\pi$ with $\pi(i)=1$ gives $\left(D_{1} f\right)(r, \ldots, r)=\left(D_{i} f\right)(r, \ldots, r)$ for every $i$. $\blacksquare$
With hindsight (and a little charity) we might say that Purkiss also managed to show that the points $(r, \ldots, r)$ are critical points. But then he undeniably fell into an error. When there is only one free variable, a critical point is "in general" a local maximum or minimum; that is, it will be one or the other except for degenerate cases where the second derivative vanishes. But in several variables this is not true, and nondegenerate critical points can equally well be saddle points.
Purkiss simply ignored this possibility, and thus he left a major gap in his argument. Yet it must be admitted that in our original examples we did not in fact encounter any saddle points.

Support for the principle can also be drawn from George Chrystal, who in 1889 included it in Part 2 of his famous textbook on algebra [4, II. 61-63]. Recognizing the inadequacy of Purkiss' proof, Chrystal treated only the case where $f$ and $g$ are symmetric polynomials; these he rewrote in terms of the elementary symmetric polynomials, and after some computation he was able to establish the principle (apart from some degenerate situations). With this encouragement, then, let us take up where Purkiss left off and try to show that for some reason we never get saddle points.
For this we need the second-order terms in the Lagrange multiplier method. These ought to be familiar, but in fact most advanced calculus books seem to skip them. Briefly, then, let us again consider one of our curves $c(t)$ lying in the set where $g=g(P)$. We have
$$
\begin{aligned}
0 &=\left.\frac{d^{2}}{d t^{2}} g(c(t))\right|_{t=0} \\
&=\sum_{i}\left(D_{i} g\right)(P) c_{i}^{\prime \prime}(0)+\sum_{i, j}\left(D_{i} D_{j} g\right)(P) c_{i}^{\prime}(0) c_{j}^{\prime}(0)
\end{aligned}
$$
and
$$
\left.\frac{d^{2}}{d t^{2}} f(c(t))\right|_{t=0}=\sum_{i}\left(D_{i} f\right)(P) c_{i}^{\prime \prime}(0)+\sum_{i, j}\left(D_{i} D_{j} f\right)(P) c_{i}^{\prime}(0) c_{j}^{\prime}(0) .
$$
We know that at our critical point we have $\nabla f(P)=\lambda(\nabla g)(P)$ for some scalar $\lambda$. Multiplying the first equation above by $\lambda$ and subtracting, we get
$$
\left.\frac{d^{2}}{d t^{2}} f(c(t))\right|_{t=0}=\sum_{i, j}\left[\left(D_{i} D_{j} f\right)(P)-\lambda\left(D_{i} D_{j} g\right)(P)\right] c_{i}^{\prime}(0) c_{j}^{\prime}(0)
$$
For a local maximum or minimum, we want these second derivatives to have the same sign for all $c(t)$. Thus the extra condition we need is that the quadratic form
$$
Q(v)=\sum\left[\left(D_{i} D_{j} f\right)(P)-\lambda\left(D_{i} D_{j} g\right)(P)\right] v_{i} v_{j}
$$
should be positive definite or negative definite on the space of all $v$ perpendicular to $\nabla g(P)$. Using the implicit function theorem, one can show that this condition is indeed sufficient $[5,154]$.
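In matrix language: writing $H_{h}=\left(D_{i} D_{j} h\right)_{i, j}$ for the Hessian matrix of a function $h$, the condition asks that
$$
Q(v)=v^{\top}\left(H_{f}(P)-\lambda H_{g}(P)\right) v
$$
be positive definite or negative definite on the hyperplane of vectors $v$ perpendicular to $\nabla g(P)$.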
This tells us what we need, but why should it automatically be true for symmetric $f$ and $g$ ? Let us look again at our example,
$$
f=x_{1} x_{2}+x_{2} x_{3}+x_{3} x_{1}, \quad g=\left(x_{1} x_{2} x_{3}\right)^{2} .
$$
At a point $(r, r, r)$ we find that the matrices of second partials are
$$
\begin{aligned}
& \left(D_i D_j f(P)\right)=\left(\begin{array}{lll}
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0
\end{array}\right) \\
& \left(D_i D_j g(P)\right)=\left(\begin{array}{lll}
2 r^4 & 4 r^4 & 4 r^4 \\
4 r^4 & 2 r^4 & 4 r^4 \\
4 r^4 & 4 r^4 & 2 r^4
\end{array}\right) .
\end{aligned}
$$
What strikes the eye here is that all the diagonal entries are equal and all the off-diagonal entries are equal. Again, once we notice this we can easily check it in general.
LEMMA 2. Suppose $f\left(x_1, \ldots, x_n\right)$ is a symmetric twice-differentiable function. Then at a point $(r, \ldots, r)$ all $D_i D_i f$ are equal and all $D_i D_j f$ for $i \neq j$ are equal.
Proof. In the proof of Lemma 1 we saw that $D_{\pi(i)} h_\pi=\left(D_i h\right)_\pi$ for any $h$. Applying this rule twice, we get
$$
D_{\pi(i)} D_{\pi(j)} f=D_{\pi(i)} D_{\pi(j)} f_\pi=D_{\pi(i)}\left[\left(D_j f\right)_\pi\right]=\left[D_i D_j f\right]_\pi
$$
Take first $i=j$, and choose $\pi$ so $\pi(i)=1$; this gives us $\left(D_1 D_1 f\right)(r, \ldots, r)=\left(D_i D_i f\right)(r, \ldots, r)$. Take next any $i \neq j$, and choose $\pi$ so $\pi(i)=1$ and $\pi(j)=2$; then we get $\left(D_1 D_2 f\right)(r, \ldots, r)=\left(D_i D_j f\right)(r, \ldots, r)$. $\blacksquare$
A straightforward computation now gives us the final lemma we need:
LEMMA 3. Suppose a quadratic form $Q$ is given by
$$
Q\left(v_1, \ldots, v_n\right)=\sum_i b v_i^2+\sum_{i \neq j} c v_i v_j .
$$
Then for all $v$ satisfying $\sum v_i=0$, we have $Q(v)=(b-c) \sum v_i^2$.
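The computation behind this is one line: since $\sum v_i=0$, squaring gives
$$
0=\Bigl(\sum_i v_i\Bigr)^2=\sum_i v_i^2+\sum_{i \neq j} v_i v_j, \quad \text{so} \quad \sum_{i \neq j} v_i v_j=-\sum_i v_i^2,
$$
and substituting into $Q$ yields $Q(v)=b \sum v_i^2-c \sum v_i^2=(b-c) \sum v_i^2$.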
THEOREM (The Purkiss Principle). Let $f$ and $g$ be symmetric functions with continuous second derivatives in the neighborhood of a point $P=(r, \ldots, r)$. On the set where $g$ equals $g(P)$, the function $f$ will have a local maximum or minimum at $P$ except in degenerate cases.
Proof. Assuming $\nabla g(P) \neq 0$, we know from Lemma 1 that $\nabla f(P)$ has the form $\lambda \nabla g(P)$: both gradients have all their entries equal, and the common entry of $\nabla g(P)$ is nonzero. The second partial derivatives of $f$ and $g$ satisfy the equalities in Lemma 2, and hence the terms $\left(D_i D_j f\right)(P)-\lambda\left(D_i D_j g\right)(P)$ also satisfy those equalities. Lemma 1 shows that the vectors $v$ perpendicular to $\nabla g(P)$ are those with $\sum v_i=0$. Lemma 3 then shows that on those $v$ our quadratic form (if not identically zero) is positive or negative definite. The result then follows from the Lagrange multiplier criterion. $\blacksquare$
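To see the theorem in action, return to our example with $r \neq 0$, where $\lambda=r^{-4}$. From the two matrices of second partials computed above, the quadratic form has $b=0-r^{-4} \cdot 2 r^4=-2$ and $c=1-r^{-4} \cdot 4 r^4=-3$, so on the hyperplane $\sum v_i=0$ Lemma 3 gives
$$
Q(v)=(b-c) \sum v_i^2=\sum v_i^2>0 \quad \text{for } v \neq 0,
$$
and $f$ has a local constrained minimum at $(r, r, r)$.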
Two types of degeneracy were excluded from the proof: we assumed that $\nabla g(P)$ was nonzero, and then we assumed that the second-order terms in $f-\lambda g$ did not all vanish in the directions perpendicular to $\nabla g(P)$. Such exclusions are indeed necessary, and the Purkiss principle is not universally true. To see why a condition on $\nabla g$ is needed, consider the extreme case where $g$ is constant everywhere; there is then no constraint, and hardly any symmetric $f$ will have an unconstrained local maximum or minimum at $(r, \ldots, r)$. It is a little harder to find an example that fails with the second-order degeneracy, but here is one such case. Let the functions be
$$
\begin{aligned}
& f=x_1^4 x_2 x_3+x_2^4 x_3 x_1+x_3^4 x_1 x_2, \\
& g=x_1^3 x_2^3+x_2^3 x_3^3+x_3^3 x_1^3
\end{aligned}
$$
around the point $P=(1,1,1)$. Consider the curve consisting of points $(s t, s, s)$, where $t$ is close to 1 and $s$ is chosen to keep $g=3$ (that is, $s^6\left(2 t^3+1\right)=3$ ). For such points
$$
f(s t, s, s)=s^6\left(t^4+2 t\right)=3\left(t^4+2 t\right) /\left(2 t^3+1\right) .
$$
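Differentiating this quotient (nothing beyond the quotient rule is needed; the numerator collapses to a perfect square):
$$
\frac{d}{d t}\, \frac{3\left(t^4+2 t\right)}{2 t^3+1}=\frac{3\left[\left(4 t^3+2\right)\left(2 t^3+1\right)-6 t^2\left(t^4+2 t\right)\right]}{\left(2 t^3+1\right)^2}=\frac{6\left(t^3-1\right)^2}{\left(2 t^3+1\right)^2} .
$$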
This derivative is zero at $t=1$ but positive on both sides nearby. Thus $f$ has values less than 3 for $t=1-\varepsilon$ and bigger than 3 for $t=1+\varepsilon$, and the Purkiss principle fails. It is not at all surprising that such exceptions should exist. What is surprising is how widely the principle turns out to be correct.
|