Advanced Optimization Theory and Methods (14)
- Non-linear Constrained Optimization
- KKT-Theorem(FONC)
- SONC
- Definition
- SOSC
- Example 1
- Example 2
- Convex Optimization Problems
- Definition
- Lemma
- Theorem
- Lemma
- Example
- Theorem
- Theorem
- Example
- Definition
- Theorem
- Lemma
- Corollary
- Lemma
- Theorem
- Corollary
- Theorem
- Theorem
- Example
- Summary
Non-linear Constrained Optimization
KKT-Theorem(FONC)
Consider $\min f(x)$ s.t. $h(x)=0$, $g(x)\leq 0$. Let $f,h,g\in C^1$ and let $x^*$ be a regular point and a local minimizer. Then there exist $\lambda^*\in\mathbb{R}^m$ and $\mu^*\in\mathbb{R}^p$ such that:
① $\mu^*\geq 0$
② $Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0$
③ ${\mu^*}^Tg(x^*)=0$
SONC
Thm: Let $f,h,g\in C^2$ and let $x^*$ be a regular point and a local minimizer. Then there exist $\lambda^*\in\mathbb{R}^m$ and $\mu^*\in\mathbb{R}^p$ such that:
① $\mu^*\geq 0$, $Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0$, ${\mu^*}^Tg(x^*)=0$
② $\forall y\in T(x^*)$: $y^TL(x^*,\lambda^*,\mu^*)y\geq 0$
Note: $T(x^*)=\{y:Dh(x^*)y=0,\ Dg_j(x^*)y=0,\ \forall j\in J(x^*)\}$, where $J(x^*)=\{j:g_j(x^*)=0\}$ is the index set of the active inequality constraints, and $L(x^*,\lambda^*,\mu^*)=F(x^*)+\sum_i\lambda_i^*H_i(x^*)+\sum_j\mu_j^*G_j(x^*)$ is the Hessian of the Lagrangian ($F$, $H_i$, $G_j$ are the Hessians of $f$, $h_i$, $g_j$).
Definition
Def: $\tilde{T}(x^*,\mu^*)=\{y:Dh(x^*)y=0,\ Dg_j(x^*)y=0\ \text{for } j\in\tilde{J}(x^*,\mu^*)\}$, where $\tilde{J}(x^*,\mu^*)=\{j:g_j(x^*)=0,\ \mu_j^*>0\}$.
Remark: Since $\tilde{J}(x^*,\mu^*)\subseteq J(x^*)$, we have $T(x^*)\subseteq\tilde{T}(x^*,\mu^*)$.
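The two index sets can be computed mechanically from the constraint values and the multipliers at a candidate point. The following is a minimal Python sketch, assuming 0-based indices; the function names `active_set` and `strongly_active_set` and the numeric values are made up for illustration and do not come from the lecture.

```python
def active_set(g_values, tol=1e-12):
    """J(x*): indices j with g_j(x*) = 0 (up to a small tolerance)."""
    return {j for j, gj in enumerate(g_values) if abs(gj) <= tol}


def strongly_active_set(g_values, mu, tol=1e-12):
    """J~(x*, mu*): active indices whose multiplier is strictly positive."""
    return {j for j in active_set(g_values, tol) if mu[j] > tol}


# Hypothetical data: three inequality constraints evaluated at some x*.
g_at_x = [0.0, 0.0, -1.5]   # the first two constraints are active
mu_star = [2.0, 0.0, 0.0]   # only the first one has a positive multiplier

J = active_set(g_at_x)
J_tilde = strongly_active_set(g_at_x, mu_star)
print(J, J_tilde)
```

Because $\tilde{J}$ drops active constraints with a zero multiplier, fewer equations define $\tilde{T}$, which is why $\tilde{T}$ can be larger than $T$.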
SOSC
Thm: Let $f,h,g\in C^2$. If there exist a feasible point $x^*\in\mathbb{R}^n$ and $\lambda^*\in\mathbb{R}^m$, $\mu^*\in\mathbb{R}^p$ s.t.
① $\mu^*\geq 0$, $Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0$, ${\mu^*}^Tg(x^*)=0$
② $\forall y\in\tilde{T}(x^*,\mu^*)$ with $y\neq 0$: $y^TL(x^*,\lambda^*,\mu^*)y>0$
then $x^*$ is a strict local minimizer.
Example 1
min $x_1x_2$
s.t. $x_1+x_2\geq 2$
$x_2\geq x_1$
$f(x)=x_1x_2$
$g_1(x)=2-x_1-x_2$
$g_2(x)=x_1-x_2$
$\nabla f(x)=\begin{bmatrix} x_2\\ x_1 \end{bmatrix}$, $\nabla g_1(x)=\begin{bmatrix} -1\\ -1 \end{bmatrix}$, $\nabla g_2(x)=\begin{bmatrix} 1\\ -1 \end{bmatrix}$
KKT-conditions: $\begin{cases} \mu_1,\mu_2\geq 0\\ x_2-\mu_1+\mu_2=0\\ x_1-\mu_1-\mu_2=0\\ \mu_1(2-x_1-x_2)+\mu_2(x_1-x_2)=0\\ 2-x_1-x_2\leq 0\\ x_1-x_2\leq 0 \end{cases}$
A solution of the KKT conditions: $x^*=\begin{bmatrix} 1\\ 1 \end{bmatrix}$, $\mu^*=\begin{bmatrix} 1\\ 0 \end{bmatrix}$
$Dg_1(x^*)=[-1,-1]$, $Dg_2(x^*)=[1,-1]$, $Df(x^*)=[1,1]$
Both constraints are active at $x^*$, so $J(x^*)=\{1,2\}$.
$\Rightarrow Dg_j(x^*)$, $j\in J(x^*)$, are linearly independent
$\Rightarrow x^*$ is a regular point
$T(x^*)=\{y:[-1,-1]y=0,\ [1,-1]y=0\}=\{0\}$
SONC is satisfied by $x^*,\mu^*$ (trivially, since $T(x^*)=\{0\}$).
$L(x^*,\mu^*)=F(x^*)+\mu_1^*G_1(x^*)+\mu_2^*G_2(x^*)=\begin{bmatrix} 0&1\\ 1&0 \end{bmatrix}+1\cdot\begin{bmatrix} 0&0\\ 0&0 \end{bmatrix}+0\cdot\begin{bmatrix} 0&0\\ 0&0 \end{bmatrix}=\begin{bmatrix} 0&1\\ 1&0 \end{bmatrix}$
$\tilde{T}(x^*,\mu^*)=\{y:[-1,-1]y=0\}=\{y:y_2=-y_1\}$ (only $g_1$ has $\mu_1^*>0$)
$[1,-1]^T\in\tilde{T}(x^*,\mu^*)$
$[1,-1]\begin{bmatrix} 0&1\\ 1&0 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix}=[-1,1]\begin{bmatrix} 1\\ -1 \end{bmatrix}=-2<0$
SOSC fails at $x^*$. This alone is inconclusive, since SOSC is only a sufficient condition.
In fact $x^*$ is not a local minimizer: the points $x^*+t[-1,1]^T$, $t>0$, are feasible and give $f=(1-t)(1+t)=1-t^2<1=f(x^*)$.
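The computations of Example 1 can be replayed in exact arithmetic. The sketch below uses Python's `fractions` module; the helper `quad_form` and the step sizes `1/10, 1/100, 1/1000` are choices made here for illustration. It checks stationarity and complementary slackness at $x^*=[1,1]^T$, $\mu^*=[1,0]^T$, evaluates $y^TLy$ for $y=[1,-1]^T$, and probes the feasible direction $[-1,1]^T$.

```python
from fractions import Fraction as Fr


def f(x):
    return x[0] * x[1]


def g(x):
    # Inequality constraints written as g(x) <= 0.
    return [2 - x[0] - x[1], x[0] - x[1]]


def quad_form(M, y):
    # y^T M y for a small dense matrix M.
    n = len(y)
    return sum(y[i] * M[i][j] * y[j] for i in range(n) for j in range(n))


x_star = [Fr(1), Fr(1)]
mu_star = [Fr(1), Fr(0)]
grad_f = [x_star[1], x_star[0]]
grad_g = [[-1, -1], [1, -1]]

# Stationarity: Df + mu_1 Dg_1 + mu_2 Dg_2 should be the zero vector.
stationarity = [
    grad_f[i] + sum(mu_star[j] * grad_g[j][i] for j in range(2))
    for i in range(2)
]
# Complementary slackness: mu^T g(x*) should be zero.
comp_slack = sum(m * gj for m, gj in zip(mu_star, g(x_star)))

# Hessian of the Lagrangian (g_1, g_2 are affine, so only f contributes).
L_hess = [[0, 1], [1, 0]]
q = quad_form(L_hess, [1, -1])

# Move from x* along [-1, 1]: record (still feasible?, objective decreased?).
trial = []
for t in (Fr(1, 10), Fr(1, 100), Fr(1, 1000)):
    x = [1 - t, 1 + t]
    feasible = all(gj <= 0 for gj in g(x))
    trial.append((feasible, f(x) < f(x_star)))

print(stationarity, comp_slack, q, trial)
```

Every trial point is feasible and has a smaller objective value than $x^*$, which is consistent with $x^*$ not being a local minimizer.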
Example 2
min $f(x)=(x_1-1)^2+x_2-2$
s.t. $h(x)=x_2-x_1-1=0$
$g(x)=x_1+x_2-2\leq 0$
$Df(x)=[2x_1-2,1]$, $Dh(x)=[-1,1]$, $Dg(x)=[1,1]$
KKT-conditions: $\begin{cases} \mu\geq 0\\ 2x_1-2-\lambda+\mu=0\\ 1+\lambda+\mu=0\\ \mu(x_1+x_2-2)=0\\ x_2-x_1-1=0\\ x_1+x_2-2\leq 0 \end{cases}$
$\Rightarrow \mu^*=0$, $x_1^*=\frac{1}{2}$, $x_2^*=\frac{3}{2}$, $\lambda^*=-1$
$x^*$ is regular: $g(x^*)=0$, so $g$ is active, and $Dh(x^*)=[-1,1]$, $Dg(x^*)=[1,1]$ are linearly independent.
$L(x^*,\lambda^*,\mu^*)=F(x^*)+{\lambda^*}^TH(x^*)+{\mu^*}^TG(x^*)=\begin{bmatrix} 2&0\\ 0&0 \end{bmatrix}$
$T(x^*)=\{y:[-1,1]y=0,\ [1,1]y=0\}=\{0\}$, so SONC is satisfied.
$\tilde{T}(x^*,\mu^*)=\{y:[-1,1]y=0\}=\{y:y_1=y_2\}$ (since $\mu^*=0$, the constraint $g$ drops out of $\tilde{J}$)
For $y=[a,a]^T\in\tilde{T}(x^*,\mu^*)$: $y^T\begin{bmatrix} 2&0\\ 0&0 \end{bmatrix}y=[a,a]\begin{bmatrix} 2&0\\ 0&0 \end{bmatrix}\begin{bmatrix} a\\ a \end{bmatrix}=2a^2$
$\forall y\neq 0$: $2a^2>0\Rightarrow x^*=\begin{bmatrix} \frac{1}{2}\\ \frac{3}{2} \end{bmatrix}$ is a strict local minimizer.
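As a sanity check of Example 2, the KKT residuals and the quadratic form on $\tilde{T}(x^*,\mu^*)$ can be evaluated exactly. This is a small Python sketch; the sample values of $a$ and the helper name `quad_form` are arbitrary choices made here.

```python
from fractions import Fraction as Fr


def quad_form(M, y):
    # y^T M y for a small dense matrix M.
    n = len(y)
    return sum(y[i] * M[i][j] * y[j] for i in range(n) for j in range(n))


x = [Fr(1, 2), Fr(3, 2)]
lam, mu = Fr(-1), Fr(0)

Df = [2 * x[0] - 2, Fr(1)]
Dh = [Fr(-1), Fr(1)]
Dg = [Fr(1), Fr(1)]

# Stationarity: Df + lambda * Dh + mu * Dg should be the zero vector.
stationarity = [Df[i] + lam * Dh[i] + mu * Dg[i] for i in range(2)]
h_val = x[1] - x[0] - 1      # equality constraint, should be 0
g_val = x[0] + x[1] - 2      # inequality constraint, should be <= 0

# Quadratic form of the Lagrangian Hessian on vectors y = [a, a].
L_hess = [[2, 0], [0, 0]]
values = [quad_form(L_hess, [a, a]) for a in (Fr(-2), Fr(-1, 3), Fr(1), Fr(5))]

print(stationarity, h_val, g_val, values)
```

Each entry of `values` equals $2a^2$ for the corresponding $a$, so it is positive whenever $a\neq 0$.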
Convex Optimization Problems
min $f(x)$
s.t. $x\in\Omega$
$\Omega$: a convex set
$f$: a convex function
Definition
Def: $\Omega$ is a convex set if $\forall x,y\in\Omega$, $\forall\alpha\in(0,1)$: $\alpha x+(1-\alpha)y\in\Omega$.
Def: The graph of $f:\Omega\rightarrow\mathbb{R}$ is the set of points in $\Omega\times\mathbb{R}\subseteq\mathbb{R}^{n+1}$ given by $\Bigg\{\begin{bmatrix} x\\ f(x) \end{bmatrix}:x\in\Omega\Bigg\}$
Def: The epigraph of $f$, denoted by $epi(f)$, is the set of points $epi(f)=\Bigg\{\begin{bmatrix} x\\ \beta \end{bmatrix}:x\in\Omega,\ \beta\in\mathbb{R},\ f(x)\leq\beta\Bigg\}$
Def: A function $f:\Omega\rightarrow\mathbb{R}$, $\Omega\subseteq\mathbb{R}^n$, is convex on $\Omega$ if its epigraph is a convex set.
Lemma
Lem: If a function $f:\Omega\rightarrow\mathbb{R}$ is convex on $\Omega$, then $\Omega$ is a convex set.
Theorem
Thm: A function $f:\Omega\rightarrow\mathbb{R}$ defined on a convex set $\Omega$ is convex if and only if $\forall x,y\in\Omega$, $\alpha\in(0,1)$: $f(\alpha x+(1-\alpha)y)\leq\alpha f(x)+(1-\alpha)f(y)$.
Note: If the $\leq$ in the inequality above is replaced by $\geq$, then $f$ is a concave function.
Lemma
Lem: Suppose $f,f_1,f_2$ are convex. Then $\beta f$ is convex for $\beta\geq 0$, and so is $f_1+f_2$.
Example
$f(x)=x_1x_2$, $\Omega=\{x:x_1\geq 0,x_2\geq 0\}$
$x=\begin{bmatrix} 1\\ 2 \end{bmatrix}$, $y=\begin{bmatrix} 2\\ 1 \end{bmatrix}$
$\alpha x+(1-\alpha)y=\begin{bmatrix} \alpha+2(1-\alpha)\\ 2\alpha+(1-\alpha) \end{bmatrix}=\begin{bmatrix} 2-\alpha\\ 1+\alpha \end{bmatrix}$
$f(\alpha x+(1-\alpha)y)=(2-\alpha)(1+\alpha)=2+\alpha-\alpha^2$
$\alpha f(x)+(1-\alpha)f(y)=2\alpha+2(1-\alpha)=2$
$\because\forall\alpha\in(0,1)$: $2+\alpha-\alpha^2>2$
$\therefore f$ is not a convex function on $\Omega$
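The violation of the convexity inequality in this example is easy to tabulate. The sketch below evaluates both sides for $\alpha=0.1,0.2,\dots,0.9$ in exact arithmetic; the choice of sample values for $\alpha$ is an assumption made here.

```python
from fractions import Fraction as Fr


def f(p):
    return p[0] * p[1]


x, y = [1, 2], [2, 1]

gaps = []
for k in range(1, 10):
    a = Fr(k, 10)
    z = [a * x[i] + (1 - a) * y[i] for i in range(2)]   # alpha*x + (1-alpha)*y
    lhs = f(z)                                          # f at the convex combination
    rhs = a * f(x) + (1 - a) * f(y)                     # combination of the values
    gaps.append(lhs - rhs)                              # equals alpha - alpha^2

print(gaps)
```

A positive gap means $f(\alpha x+(1-\alpha)y)>\alpha f(x)+(1-\alpha)f(y)$, i.e. the defining inequality of convexity is violated for that $\alpha$.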
Theorem
Thm: Let $f:\Omega\rightarrow\mathbb{R}$, $f\in C^1$, where $\Omega$ is an open convex set. Then $f$ is convex $\Leftrightarrow\forall x,y\in\Omega$: $f(y)\geq f(x)+Df(x)(y-x)$.
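The first-order condition says that a convex function lies above each of its tangent planes. A hedged Python sketch of this test follows. It uses $f(x)=(x_1-1)^2+x_2-2$ (the objective of Example 2, which is convex) and $f(x)=x_1x_2$ (shown above not to be convex); the helper `first_order_gap` and the integer test grid are illustrative choices made here.

```python
from itertools import product


def first_order_gap(f, grad, x, y):
    """Return f(y) - [f(x) + Df(x)(y - x)]; convexity requires this to be >= 0."""
    linear = f(x) + sum(gi * (yi - xi) for gi, xi, yi in zip(grad(x), x, y))
    return f(y) - linear


def f_convex(p):
    return (p[0] - 1) ** 2 + p[1] - 2


def grad_convex(p):
    return [2 * p[0] - 2, 1]


def f_bilinear(p):
    return p[0] * p[1]


def grad_bilinear(p):
    return [p[1], p[0]]


# Convex case: the gap is never negative on a small integer grid.
grid = list(product(range(-3, 4), repeat=2))
min_gap_convex = min(
    first_order_gap(f_convex, grad_convex, x, y) for x in grid for y in grid
)

# Non-convex case: one pair of points is enough to violate the inequality.
gap_bilinear = first_order_gap(f_bilinear, grad_bilinear, [1, 2], [2, 1])

print(min_gap_convex, gap_bilinear)
```

For the convex function the gap works out to $(y_1-x_1)^2$, while for $x_1x_2$ the pair $x=[1,2]^T$, $y=[2,1]^T$ gives $2-3=-1<0$. A finite grid can only refute convexity, never prove it.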
Theorem
Let $f\in C^2$ and let $\Omega$ be an open convex set.
$f$ is convex on $\Omega$ $\Leftrightarrow\forall x\in\Omega$: the Hessian $F(x)$ of $f$ at $x$ is positive semidefinite.
Example
- $f(x)=-8x^2\Rightarrow F(x)=-16<0$, so $f$ is not convex (✕)
- $f(x)=4x_1^2+3x_2^2+5x_3^2+6x_1x_2+x_1x_3-3x_1-3x_2+15$
$F(x)=\begin{bmatrix} 8&6&1\\ 6&6&0\\ 1&0&10 \end{bmatrix}$
$\Delta_1=|8|=8>0$
$\Delta_2=\begin{vmatrix} 8&6\\ 6&6 \end{vmatrix}=12>0$
$\Delta_3=\begin{vmatrix} 8&6&1\\ 6&6&0\\ 1&0&10 \end{vmatrix}=114>0$
$\Rightarrow F(x)$ is positive definite, so $f$ is convex
- $f(x)=2x_1x_2-x_1^2-x_2^2$
$F(x)=\begin{bmatrix} -2&2\\ 2&-2 \end{bmatrix}$ has negative diagonal entries, so it is not positive semidefinite and $f$ is not convex (✕)
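The leading principal minors used in this example can be recomputed with a few lines of Python. This is only a sketch: `det` and `leading_principal_minors` are helper names introduced here, and cofactor expansion is used because the matrices are tiny.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def leading_principal_minors(M):
    """Determinants of the top-left k-by-k blocks, k = 1..n."""
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]


F_second = [[8, 6, 1], [6, 6, 0], [1, 0, 10]]   # Hessian of the second function
F_third = [[-2, 2], [2, -2]]                    # Hessian of the third function

print(leading_principal_minors(F_second))   # [8, 12, 114]
print(leading_principal_minors(F_third))    # [-2, 0]
```

All leading principal minors being positive certifies positive definiteness (Sylvester's criterion). For semidefiniteness, leading minors alone are not sufficient in general, but a negative diagonal entry such as $-2$ already rules it out.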
Definition
Def: $f$ is strictly convex if $\forall x,y\in\Omega$ with $x\neq y$, $\alpha\in(0,1)$: $f(\alpha x+(1-\alpha)y)<\alpha f(x)+(1-\alpha)f(y)$
Def: $f$ is (strictly) concave $\Leftrightarrow -f$ is (strictly) convex
Theorem
Thm: In a convex optimization problem,
$x^*$ is a global minimizer $\Leftrightarrow x^*$ is a local minimizer.
Lemma
Lem: Let $f$ be a convex function on $\Omega$. Then for all $c\in\mathbb{R}$, the set $\Gamma_c=\{x\in\Omega:f(x)\leq c\}$ is convex.
Corollary
Let $f$ be a convex function on $\Omega$. Then the set of all global minimizers of $f$ over $\Omega$ is convex.
Lemma
Lem: Let $f$ be a convex function on $\Omega$ with $f\in C^1$. If $x^*\in\Omega$ satisfies $\forall x\in\Omega,x\neq x^*$: $Df(x^*)(x-x^*)\geq 0$, then $x^*$ is a global minimizer.
Theorem
Thm: Let $f$ be convex on $\Omega$ with $f\in C^1$.
If $x^*\in\Omega$ satisfies $d^T\nabla f(x^*)\geq 0$ for every feasible direction $d\in\mathbb{R}^n$ at $x^*$, then $x^*$ is a global minimizer.
Corollary
If $x^*\in\Omega$ satisfies $\nabla f(x^*)=0$, then $x^*$ is a global minimizer.
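To illustrate the corollary, take the convex quadratic $f(x)=4x_1^2+3x_2^2+5x_3^2+6x_1x_2+x_1x_3-3x_1-3x_2+15$ from the Hessian example. Solving $\nabla f(x)=0$ by hand gives $x^*=[0,\frac{1}{2},0]^T$; this point is worked out here and is not stated in the lecture. The Python sketch below confirms that the gradient vanishes at $x^*$ and that no point of a small integer grid (an arbitrary choice) has a smaller objective value.

```python
from fractions import Fraction as Fr
from itertools import product


def f(x):
    x1, x2, x3 = x
    return (4 * x1**2 + 3 * x2**2 + 5 * x3**2
            + 6 * x1 * x2 + x1 * x3 - 3 * x1 - 3 * x2 + 15)


def grad_f(x):
    x1, x2, x3 = x
    return [8 * x1 + 6 * x2 + x3 - 3, 6 * x1 + 6 * x2 - 3, x1 + 10 * x3]


x_star = [Fr(0), Fr(1, 2), Fr(0)]   # candidate obtained from grad f = 0
gradient_at_star = grad_f(x_star)
f_star = f(x_star)

# Smallest value of f(p) - f(x*) over integer points p in [-3, 3]^3.
worst_gap = min(f(p) - f_star for p in product(range(-3, 4), repeat=3))

print(gradient_at_star, f_star, worst_gap)
```

A grid check is only evidence, not a proof; the guarantee that $x^*$ is a global minimizer comes from convexity of $f$ together with $\nabla f(x^*)=0$.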
Theorem
Consider
min $f(x)$
s.t. $h(x)=0$
$h\in C^1$
Assume $\Omega=\{x:h(x)=0\}$ is convex, for example when the constraint is $Ax=b$.
Let $f\in C^1$ be a convex function on $\Omega=\{x:h(x)=0\}$. If $x^*\in\Omega$ and $\lambda^*\in\mathbb{R}^m$ satisfy $Df(x^*)+{\lambda^*}^TDh(x^*)=0$, then $x^*$ is a global minimizer.
Theorem
Consider
min $f(x)$
s.t. $h(x)=0$
$g(x)\leq 0$
Assume: $\Omega=\{x:h(x)=0,g(x)\leq 0\}$ is convex and $f$ is convex on $\Omega$, with $f,h,g\in C^1$.
Thm: If $x^*\in\Omega$, $\lambda^*\in\mathbb{R}^m$ and $\mu^*\in\mathbb{R}^p$ satisfy
KKT $\begin{cases} \mu^*\geq 0\\ Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0\\ {\mu^*}^Tg(x^*)=0 \end{cases}$
then $x^*$ is a global minimizer.
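This theorem applies to Example 2: its objective $(x_1-1)^2+x_2-2$ has Hessian $\begin{bmatrix} 2&0\\ 0&0 \end{bmatrix}\succeq 0$ and both constraints are affine, so the feasible set is convex and the KKT point $x^*=[\frac{1}{2},\frac{3}{2}]^T$ is a global minimizer. The sketch below checks this numerically. The parametrization $x=(t,t+1)$ with $t\leq\frac{1}{2}$ of the feasible set and the sample spacing are choices made here, not part of the lecture.

```python
from fractions import Fraction as Fr


def f(x):
    return (x[0] - 1) ** 2 + x[1] - 2


def feasible_point(t):
    # h(x) = x2 - x1 - 1 = 0 holds by construction;
    # g(x) = x1 + x2 - 2 = 2t - 1 <= 0 holds exactly when t <= 1/2.
    return [t, t + 1]


x_star = feasible_point(Fr(1, 2))
f_star = f(x_star)

# Sample feasible points t = 1/2, 1/2 - 1/8, ..., 1/2 - 10.
ts = [Fr(1, 2) - Fr(k, 8) for k in range(0, 81)]
gaps = [f(feasible_point(t)) - f_star for t in ts]

print(f_star, min(gaps))
```

Along the feasible set the gap simplifies to $(t-\frac{1}{2})^2$, so it is zero only at $x^*$ itself.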
Example
Savings problem: let $x_k$ be the amount deposited into the bank in month $k$, with monthly interest rate $r$ and an initial balance of 0. The total amount deposited must not exceed $D$. Find the deposit schedule that maximizes the account balance after $n$ months.
max $y_n=(1+r)^nx_1+(1+r)^{n-1}x_2+\cdots+(1+r)x_n$
s.t. $\sum_{i=1}^nx_i\leq D$
$x\geq 0$
Let $c=[(1+r)^n,(1+r)^{n-1},\cdots,(1+r)]^T$ be the coefficient vector of the objective and $e=[1,\cdots,1]^T$. The problem is equivalent to $\min -c^Tx$ s.t. $e^Tx-D\leq 0$, $-x\leq 0$, with KKT conditions:
$\begin{cases} \mu_1(e^Tx-D)=0,\ \mu_1\geq 0\\ \mu_2^Tx=0,\ \mu_2\geq 0\\ e^Tx\leq D\\ x\geq 0\\ -c+\mu_1e-\mu_2=0 \end{cases}$
$\mu_1=(1+r)^n$, $\mu_2=(1+r)^ne-c$
$x_1=D,x_2=\cdots=x_n=0\Rightarrow$ global opt.
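Note: In this problem the objective and all constraints are linear, and linear functions are clearly convex as well. So although this is a linear programming problem, it can also be solved with the methods of convex optimization.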
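The KKT solution of the savings problem can be verified for concrete numbers. The sketch below assumes $n=4$, $r=1\%$ and $D=1000$ (hypothetical values), takes the coefficient vector $c$ from the objective as written, with entries $(1+r)^n,\dots,(1+r)$, and writes stationarity for the equivalent minimization of $-c^Tx$. It then compares the balance obtained by depositing everything in month 1 with a few alternative schedules.

```python
from fractions import Fraction as Fr

# Hypothetical problem data.
n, r, D = 4, Fr(1, 100), Fr(1000)

# Objective coefficients: (1+r)^n, (1+r)^(n-1), ..., (1+r).
c = [(1 + r) ** (n - k) for k in range(n)]


def balance(x):
    return sum(ck * xk for ck, xk in zip(c, x))


# Candidate solution and multipliers from the KKT analysis.
x = [D] + [Fr(0)] * (n - 1)
mu1 = (1 + r) ** n
mu2 = [mu1 - ck for ck in c]

# Stationarity for min -c^T x: -c + mu1 * e - mu2 = 0.
stationarity = [-ck + mu1 - m for ck, m in zip(c, mu2)]
comp1 = mu1 * (sum(x) - D)                      # mu1 * (e^T x - D)
comp2 = sum(m * xk for m, xk in zip(mu2, x))    # mu2^T x

# Alternative feasible schedules: even split, or everything in a later month.
alternatives = [[D / n] * n]
for j in range(1, n):
    schedule = [Fr(0)] * n
    schedule[j] = D
    alternatives.append(schedule)

best = balance(x)
print(float(best), [float(balance(a)) for a in alternatives])
```

Each alternative yields a smaller final balance, matching the intuition that money deposited earlier earns interest for longer.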
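Summary
This post first discussed nonlinear constrained optimization and then convex optimization. For nonlinear problems it introduced FONC, SONC and SOSC, and gave two examples that illustrate the solution strategy: first use the KKT conditions (FONC) to find candidate extreme points, then use SONC and SOSC to check them, which makes it possible to argue rigorously whether a candidate is or is not an extreme point. The convex optimization part first introduced the notion of convexity. Convex functions were defined through the graph (epigraph) of a function, and an equivalent definition was given. Finally, a series of theorems showed that for convex optimization problems the KKT conditions are not only necessary but also sufficient. Hence, to solve a convex optimization problem it suffices to solve the KKT conditions.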