A Brief Introduction to the Calculus of Variations

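This article only assumes a background in undergraduate calculus. The derivations are written out in detail so that they can serve as a reference later; a passing familiarity with the calculus of variations is still recommended before reading.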

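While studying Kass's Complete Snake Model, I ran into a derivation that relies on the calculus of variations. I spent some time going through material on functionals and learned the basic computations, and this post records what I found.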

What is a functional

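Loosely speaking, a functional is a mapping of the following kind:

$$\text{a set of functions} \stackrel{\text{functional}}{\longrightarrow} \text{the complex numbers}$$

For the extremum problems in this article the values are real, so that they can be compared.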

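In fact, functionals already appear in undergraduate calculus, for example when a line integral is used to compute the length of a curve:

$$J[y] = \int_0^1 \sqrt{1+y'^2}\,dx$$

This functional maps the function $y(x)$ to a real number, namely the length of the curve $y(x)$ over the interval $[0,1]$.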

In general, in the one-dimensional case a functional can be defined as:

$$J[y]=\int_{x_0}^{x_1}F(x,y,y')\,dx$$
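To make the definition concrete, the following is a minimal Python sketch that approximates $J[y]$ numerically. The helper names `functional` and `arc_length_integrand`, the midpoint rule, and the two test curves are assumptions made for illustration; they are not part of the original derivation.

```python
import math

def functional(F, y, dy, x0, x1, n=20000):
    """Approximate J[y] = integral of F(x, y, y') over [x0, x1] (midpoint rule)."""
    h = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += F(x, y(x), dy(x))
    return total * h

def arc_length_integrand(x, y, yp):
    """F(x, y, y') = sqrt(1 + y'^2), the integrand of the arc-length functional."""
    return math.sqrt(1.0 + yp * yp)

# Two curves from (0, 0) to (1, 1): a straight line and a parabola.
line_length = functional(arc_length_integrand, lambda x: x, lambda x: 1.0, 0.0, 1.0)
parabola_length = functional(arc_length_integrand, lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)

print(f"J[x]   = {line_length:.6f}")
print(f"J[x^2] = {parabola_length:.6f}")
```

Each function fed into `functional` comes back as a single number, which is exactly the "function in, number out" behaviour described above. The derivative `dy` is passed in explicitly to keep the sketch free of numerical differentiation.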

Definition of an extremum of a functional

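Extrema of functionals are found with the calculus of variations, but the idea closely parallels the way extrema of ordinary functions are found, so we start with ordinary functions.

An extremum of an ordinary function is defined as follows:

$f(x)$ has a local minimum at $x_0$ if, for every $x$ in a neighborhood $|x-x_0|<\epsilon$ of $x_0$,

$$f(x) \ge f(x_0)$$

and a local maximum if instead

$$f(x) \le f(x_0)$$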

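An extremum of a functional can be defined in the same way. A minimum of a functional is defined as follows:

$J[y]$ has a minimum at $y_0(x)$ if, for every $y$ in a neighborhood $|y-y_0|<\epsilon$ of $y_0$,

$$J[y] \ge J[y_0]$$

A maximum is defined analogously.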

The formal definition given in the textbook reads as follows (I have slightly changed some of the notation):

For the extremal function $y_0(x)$ and every varied function $y_0(x)+\delta y(x)$ "near" it,

$$J[y_0+\delta y] \ge J[y_0]$$

The function $y_0(x)+\delta y(x)$ is said to be "near" another function $y_0(x)$ when:

  1. $|\delta y(x)| < \epsilon$
  2. sometimes it is additionally required that $|(\delta y)'(x)|< \epsilon$

Here $\delta y(x)$ is called the variation of the function $y(x)$ [1].

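In short, extrema of functionals and of ordinary functions alike are defined by comparing the value at the extremal point with the values in its neighborhood.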

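Solving for the extrema of a functional

The definitions above give an intuitive picture of what a variation is. Next we use variations to find the extrema of a functional.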

Boundary conditions

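The calculus of variations originated with the brachistochrone problem [2]: find the curve $y(x)$ along which a ball travels fastest from the start point $(x_0,a)$ to the end point $(x_1,b)$. Every curve in the solution space must therefore pass through both points:

$$y(x_0) = a,\quad y(x_1) = b$$

The extremal function $y_0(x)$ is no exception:

$$y_0(x_0)=a,\quad y_0(x_1)=b$$

Since $\delta y = y - y_0$, it follows that:

$$\delta y(x_0) = 0,\quad \delta y(x_1) = 0$$

These fixed-endpoint conditions are the boundary conditions needed for the variational derivation.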
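The definition of a minimum and the endpoint conditions can be tried out numerically. The sketch below is an illustration under stated assumptions: it uses the arc-length functional on $[0,1]$, takes $y_0(x)=x$ as the candidate, and perturbs it with the hypothetical choice $\delta y(x)=\varepsilon\sin(\pi x)$, which vanishes at both endpoints.

```python
import math

def curve_length(y_prime, n=20000):
    """Midpoint-rule approximation of the arc length over [0, 1]."""
    h = 1.0 / n
    return h * sum(math.sqrt(1.0 + y_prime((i + 0.5) * h) ** 2) for i in range(n))

def perturbed_slope(eps):
    """Slope of y0(x) + eps * sin(pi x), where y0(x) = x joins (0, 0) and (1, 1)."""
    return lambda x: 1.0 + eps * math.pi * math.cos(math.pi * x)

base_length = curve_length(perturbed_slope(0.0))

# J[y0 + delta y] - J[y0] for a few perturbation sizes of either sign.
excess = {
    eps: curve_length(perturbed_slope(eps)) - base_length
    for eps in (-0.2, -0.05, 0.05, 0.2)
}

for eps, gap in excess.items():
    print(f"eps = {eps:+.2f}   J[y0 + dy] - J[y0] = {gap:.6f}")
```

Every perturbed curve still passes through the same two endpoints, and every one of them comes out longer than the straight line, which is what $J[y_0+\delta y] \ge J[y_0]$ asks for. This checks one family of perturbations only and is not a proof.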

Finding the extremum of a functional with the calculus of variations [3]

First consider the difference of the functional values, obtained by substituting the definition of the functional given earlier:

$$J[y_0+\delta y]-J[y_0]=\int_{x_0}^{x_1}\left[F(x,y_0+\delta y,y_0'+(\delta y)')-F(x,y_0,y_0')\right]dx$$

If the variation $\delta y(x)$ is small enough, the integrand can be Taylor-expanded about the extremal function $y_0(x)$.

Note: $x$ is held fixed here, so it can be treated as a parameter, and $F$ is expanded in the two variables $y$ and $y'$. The two-variable Taylor formula, in which the generic variables $x, y$ play the roles of $y, y'$ and $o(3)$ stands for terms of third and higher order in the increments, is:

$$\begin{aligned} f(x,y) &= f(x_0,y_0) \\[2ex] &+f'_x(x_0,y_0)\Delta x + f'_y(x_0,y_0)\Delta y \\[2ex] &+ \frac{1}{2!}f''_{xx}(x_0,y_0)\Delta x^2 + \frac{1}{2!}f''_{xy}(x_0,y_0) \Delta x \Delta y \\[2ex] &+\frac{1}{2!}f''_{yx}(x_0,y_0)\Delta x \Delta y + \frac{1}{2!}f''_{yy}(x_0,y_0)\Delta y^2 + o(3) \\[2ex] &\xlongequal{\text{if the mixed partials are continuous}} f(x_0,y_0) + \left(\Delta x\frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right)f(x_0,y_0) +\frac{1}{2!} \left(\Delta x\frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right)^2 f(x_0,y_0) +o(3) \end{aligned}$$

Applying this expansion to the integrand gives:

$$\begin{aligned} J[y_0+\delta y]-J[y_0] & =\int_{x_0}^{x_1}\left\{\left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]F\Big|_{y=y_0} +\frac{1}{2!}\left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]^2F\Big|_{y=y_0}+o(3)\right\}dx \\[2ex] & =\delta J[y] + \frac{1}{2!}\delta^2J[y] + \int_{x_0}^{x_1}o(3)\,dx \end{aligned}$$

where:

$$\begin{aligned} \delta J[y]&=\int_{x_0}^{x_1}\left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]F\Big|_{y=y_0}dx \\[2ex] \delta^2 J[y]&=\int_{x_0}^{x_1} \left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]^2F\Big|_{y=y_0}dx \end{aligned}$$

These two expressions are called the first variation and the second variation of the functional $J[y]$. A necessary condition for $J[y]$ to attain a minimum is that its first variation vanishes:

$$\delta J[y]=\int_{x_0}^{x_1}\left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]F\Big|_{y=y_0}dx=0$$
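The expansion into first and second variations can be checked numerically. The sketch below is an illustration under assumptions that are not in the original: it uses the arc-length integrand $F=\sqrt{1+y'^2}$ on $[0,1]$, expands about the non-extremal curve $y_0(x)=x^2$ so that the first variation is visibly nonzero, and takes $\delta y=\varepsilon\sin(\pi x)$. For this $F$, $\partial F/\partial y = 0$, $\partial F/\partial y' = y'/\sqrt{1+y'^2}$ and $\partial^2 F/\partial y'^2 = (1+y'^2)^{-3/2}$.

```python
import math

N = 20000
H = 1.0 / N
XS = [(i + 0.5) * H for i in range(N)]   # midpoint nodes on [0, 1]

def integrate(g):
    """Midpoint-rule integral of g over [0, 1]."""
    return H * sum(g(x) for x in XS)

def y0_prime(x):
    return 2.0 * x                            # slope of y0(x) = x^2

def eta_prime(x):
    return math.pi * math.cos(math.pi * x)    # slope of eta(x) = sin(pi x)

def J(eps):
    """Arc length of y0 + eps * eta."""
    return integrate(lambda x: math.sqrt(1.0 + (y0_prime(x) + eps * eta_prime(x)) ** 2))

def first_variation(eps):
    """delta J: integral of (delta y)' * dF/dy' evaluated at y0."""
    return integrate(lambda x: eps * eta_prime(x) * y0_prime(x) / math.sqrt(1.0 + y0_prime(x) ** 2))

def second_variation(eps):
    """delta^2 J: integral of ((delta y)')^2 * d2F/dy'^2 evaluated at y0."""
    return integrate(lambda x: (eps * eta_prime(x)) ** 2 / (1.0 + y0_prime(x) ** 2) ** 1.5)

def remainder(eps):
    """What is left of J[y0 + dy] - J[y0] after removing both variations."""
    return J(eps) - J(0.0) - first_variation(eps) - 0.5 * second_variation(eps)

for eps in (0.1, 0.05, 0.025):
    print(f"eps={eps:<6} difference={J(eps) - J(0.0):+.6e} "
          f"dJ={first_variation(eps):+.6e} remainder={remainder(eps):+.3e}")
```

Halving `eps` should shrink the remainder by roughly a factor of eight, consistent with the neglected terms being of third order. Because $y_0=x^2$ is not an extremal here, `first_variation` stays nonzero and dominates the difference.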

Integrating the $(\delta y)'$ term of $\delta J[y]$ by parts:

$$\begin{aligned} \delta J[y] &= \int_{x_0}^{x_1}\left[\delta y\frac{\partial}{\partial y}+(\delta y)'\frac{\partial}{\partial y'}\right]F\Big|_{y=y_0}dx \\[2ex] &=\int_{x_0}^{x_1}\delta y\frac{\partial F}{\partial y}\Big|_{y=y_0}dx + \int_{x_0}^{x_1}(\delta y)'\frac{\partial F}{\partial y'}\Big|_{y=y_0}dx \\[2ex] &=\int_{x_0}^{x_1}\delta y\frac{\partial F}{\partial y}\Big|_{y=y_0}dx + \int_{x_0}^{x_1}\frac{\partial F}{\partial y'}\Big|_{y=y_0}d(\delta y) \\[2ex] &= \int_{x_0}^{x_1}\delta y\frac{\partial F}{\partial y}\Big|_{y=y_0}dx + \left[\delta y\,\frac{\partial F}{\partial y'}\Big|_{y=y_0}\right]_{x_0}^{x_1} - \int_{x_0}^{x_1}\delta y\frac{d}{dx}\frac{\partial F}{\partial y'}\Big|_{y=y_0}dx \\[2ex] &\xlongequal{\delta y(x_0) = 0,\ \delta y(x_1) = 0}\int_{x_0}^{x_1}\delta y\left(\frac{\partial F}{\partial y}-\frac{d}{dx}\frac{\partial F}{\partial y'}\right)\Big|_{y=y_0}dx \end{aligned}$$
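The integration-by-parts step, including the vanishing boundary term, can be sanity-checked numerically. This sketch is an illustration under assumed inputs: $G(x)$ stands for $\partial F/\partial y'$ of the arc-length integrand evaluated along the hypothetical curve $y_0(x)=x^2$ on $[0,1]$, and $\delta y$ is replaced by $\eta(x)=\sin(\pi x)$, which is zero at both endpoints.

```python
import math

N = 20000
H = 1.0 / N

def integrate(g):
    """Midpoint-rule integral of g over [0, 1]."""
    return H * sum(g((i + 0.5) * H) for i in range(N))

def eta(x):
    return math.sin(math.pi * x)             # vanishes at x = 0 and x = 1

def eta_prime(x):
    return math.pi * math.cos(math.pi * x)

def G(x):
    """dF/dy' for F = sqrt(1 + y'^2) along y0(x) = x^2, i.e. y' = 2x."""
    return 2.0 * x / math.sqrt(1.0 + 4.0 * x * x)

def G_prime(x):
    """d/dx of G, computed by hand."""
    return 2.0 / (1.0 + 4.0 * x * x) ** 1.5

lhs = integrate(lambda x: eta_prime(x) * G(x))   # integral of (delta y)' * dF/dy'
rhs = -integrate(lambda x: eta(x) * G_prime(x))  # minus integral of delta y * d/dx(dF/dy')

print(f"lhs = {lhs:.8f}")
print(f"rhs = {rhs:.8f}")
```

The two numbers agree up to quadrature error only because `eta` vanishes at the endpoints; with a perturbation that does not, the boundary term $[\delta y\,\partial F/\partial y']_{x_0}^{x_1}$ would have to be added back.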

Therefore:

$$\int_{x_0}^{x_1}\delta y\left(\frac{\partial F}{\partial y}-\frac{d}{dx}\frac{\partial F}{\partial y'}\right)\Big|_{y=y_0}dx =0$$

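Since this must hold for every admissible $\delta y$, the factor multiplying $\delta y$ must vanish identically (this is the fundamental lemma of the calculus of variations):

$$\left(\frac{\partial F}{\partial y}-\frac{d}{dx}\frac{\partial F}{\partial y'}\right)\Big|_{y=y_0}=0$$

This equation is called the Euler-Lagrange equation; it is the differential form of the necessary condition for $J[y]$ to attain an extremum. A function that solves it is a candidate extremal: it makes the functional stationary, and whether it is actually a minimum or a maximum has to be checked separately, for example with the second variation.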
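As a worked example, the Euler-Lagrange equation can be solved by hand for the arc-length functional from the beginning of the article. The endpoint values $y(0)=a$ and $y(1)=b$ are assumed here for illustration:

$$\begin{aligned} F &= \sqrt{1+y'^2}, \qquad \frac{\partial F}{\partial y}=0, \qquad \frac{\partial F}{\partial y'}=\frac{y'}{\sqrt{1+y'^2}} \\[2ex] 0 &= \frac{\partial F}{\partial y}-\frac{d}{dx}\frac{\partial F}{\partial y'} = -\frac{d}{dx}\frac{y'}{\sqrt{1+y'^2}} = -\frac{y''}{(1+y'^2)^{3/2}} \\[2ex] &\Rightarrow\ y''=0 \ \Rightarrow\ y_0(x)=a+(b-a)x \end{aligned}$$

The stationary curve is the straight line through the two endpoints, in line with the familiar fact that a straight segment is the shortest path between two points.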


  1. 吴崇试 (Wu Chongshi). 数学物理方法 (Methods of Mathematical Physics). 北京大学出版社 (Peking University Press), 1999.

  2. https://zh.wikipedia.org/wiki/最速降線問題

  3. (for the underlying idea) https://zhuanlan.zhihu.com/p/20718489
