Duality
The Lagrange dual function
Consider the optimization problem
$$
\begin{array}{ll} \min & f_0\left(\mathbf{x}\right) \\ \text{s.t.} & f_i\left(\mathbf{x}\right) \leq 0, \quad i=1, \ldots, m \\ & h_i\left(\mathbf{x}\right)=0, \quad i=1, \ldots, p \end{array}
$$
where $\mathbf{x} \in \mathbb{R}^n$. Assume the domain $\mathcal{D}=\bigcap_{i=0}^m \operatorname{dom} f_i \cap \bigcap_{i=1}^p \operatorname{dom} h_i$ is nonempty, and denote the optimal value by $p^{*}$.
Define the Lagrangian $L: \mathbb{R}^{n} \times \mathbb{R}^{m}\times \mathbb{R}^{p}\to \mathbb{R}$ as
$$
L\left( \mathbf{x}, \mathbf{\lambda}, \mathbf{\nu} \right) =f_{0} \left( \mathbf{x} \right) +\sum_{i=1}^{m} \lambda_{i} f_{i}\left( \mathbf{x} \right) + \sum_{i=1}^{p} \nu_{i} h_{i}\left( \mathbf{x} \right)
$$
with $\operatorname{dom} L = \mathcal{D} \times \mathbb{R}^{m} \times \mathbb{R}^{p}$.
The vectors $\mathbf{\lambda}, \mathbf{\nu}$ are called the dual variables, or the Lagrange multiplier vectors.
The Lagrange dual function
Define the Lagrange dual function $g: \mathbb{R}^m \times \mathbb{R}^{p} \to \mathbb{R}$ as the infimum of the Lagrangian over $\mathbf{x}$, i.e.
$$
g \left( \mathbf{\lambda}, \mathbf{\nu} \right) = \inf_{\mathbf{x} \in \mathcal{D}} L \left( \mathbf{x}, \mathbf{\lambda}, \mathbf{\nu} \right) = \inf_{\mathbf{x} \in \mathcal{D}} \left( f_{0} \left( \mathbf{x} \right) +\sum_{i=1}^{m} \lambda_{i} f_{i}\left( \mathbf{x} \right) + \sum_{i=1}^{p} \nu_{i} h_{i}\left( \mathbf{x} \right) \right)
$$
For each fixed $\mathbf{x}$, $L$ is affine in $\left(\mathbf{\lambda}, \mathbf{\nu}\right)$, so $g$ is the pointwise infimum of a family of affine functions and is therefore concave, even when the original problem is not convex.
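As a small worked illustration of the definition (the least-norm problem below is an assumed example, not part of the notes above), take $\min\ \mathbf{x}^T\mathbf{x}$ subject to $\mathbf{A}\mathbf{x} = \mathbf{b}$. The Lagrangian is a convex quadratic in $\mathbf{x}$, so the infimum can be found by setting the gradient to zero:

```latex
\begin{aligned}
L(\mathbf{x}, \mathbf{\nu}) &= \mathbf{x}^T \mathbf{x} + \mathbf{\nu}^T (\mathbf{A}\mathbf{x} - \mathbf{b}), \\
\nabla_{\mathbf{x}} L &= 2\mathbf{x} + \mathbf{A}^T \mathbf{\nu} = \mathbf{0}
  \;\Longrightarrow\; \mathbf{x} = -\tfrac{1}{2} \mathbf{A}^T \mathbf{\nu}, \\
g(\mathbf{\nu}) &= \tfrac{1}{4} \mathbf{\nu}^T \mathbf{A}\mathbf{A}^T \mathbf{\nu}
  - \tfrac{1}{2} \mathbf{\nu}^T \mathbf{A}\mathbf{A}^T \mathbf{\nu} - \mathbf{b}^T \mathbf{\nu}
  = -\tfrac{1}{4} \mathbf{\nu}^T \mathbf{A}\mathbf{A}^T \mathbf{\nu} - \mathbf{b}^T \mathbf{\nu}.
\end{aligned}
```

The result is a concave quadratic in $\mathbf{\nu}$, in line with the concavity argument.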
A lower bound on the optimal value
For any $\mathbf{\lambda}\succeq \mathbf{0}$ and any $\mathbf{\nu}$,
$$
g \left( \mathbf{\lambda}, \mathbf{\nu} \right) \le p^{*}
$$
Proof:
Let $\tilde{\mathbf{x}}$ be a feasible point, i.e. $f_{i} \left( \tilde{\mathbf{x}} \right) \le 0$ and $h_{i} \left( \tilde{\mathbf{x}} \right) = 0$. Since $\mathbf{\lambda}\succeq \mathbf{0}$, every term $\lambda_i f_i\left( \tilde{\mathbf{x}} \right)$ is nonpositive and every term $\nu_i h_i\left( \tilde{\mathbf{x}} \right)$ is zero, so
$$
\sum_{i=1}^{m} \lambda_{i} f_{i}\left( \tilde{\mathbf{x}} \right) + \sum_{i=1}^{p} \nu_{i} h_{i}\left( \tilde{\mathbf{x}} \right) \le 0
$$
Therefore
$$
L \left( \tilde{\mathbf{x}}, \mathbf{\lambda}, \mathbf{\nu} \right) = f_{0} \left( \tilde{\mathbf{x}} \right)+ \sum_{i=1}^{m} \lambda_{i} f_{i}\left( \tilde{\mathbf{x}} \right) + \sum_{i=1}^{p} \nu_{i} h_{i}\left( \tilde{\mathbf{x}} \right) \le f_{0} \left( \tilde{\mathbf{x}} \right)
$$
and hence $g \left( \mathbf{\lambda}, \mathbf{\nu} \right) = \inf_{\mathbf{x} \in \mathcal{D}} L \left( \mathbf{x}, \mathbf{\lambda}, \mathbf{\nu} \right) \le L \left( \tilde{\mathbf{x}}, \mathbf{\lambda}, \mathbf{\nu} \right) \le f_{0} \left( \tilde{\mathbf{x}} \right)$. This holds for every feasible $\tilde{\mathbf{x}}$, so taking the infimum over all feasible points gives
$$
g \left( \mathbf{\lambda}, \mathbf{\nu} \right) \le p^{*}
$$
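The bound can be checked numerically on a toy problem. The sketch below assumes the illustrative problem $\min x^2$ subject to $1 - x \le 0$ (not from the notes), whose optimal value is $p^{*} = 1$ at $x = 1$; the function names are hypothetical.

```python
def lagrangian(x, lam):
    """L(x, lam) = x^2 + lam * (1 - x) for the toy problem min x^2 s.t. 1 - x <= 0."""
    return x * x + lam * (1.0 - x)


def dual_function(lam):
    """g(lam) = inf_x L(x, lam); the convex quadratic in x is minimized at x = lam / 2."""
    return lagrangian(lam / 2.0, lam)


P_STAR = 1.0  # optimal value of the toy problem, attained at x = 1

# Evaluate the dual function on a grid of multipliers lam in [0, 6].
lams = [k / 10.0 for k in range(0, 61)]
bounds = [dual_function(lam) for lam in lams]

# Every g(lam) with lam >= 0 should be a lower bound on p*.
all_lower_bounds = all(g <= P_STAR + 1e-12 for g in bounds)

# The multiplier on the grid that gives the tightest bound.
best_lam = max(lams, key=dual_function)

print(all_lower_bounds, best_lam, dual_function(best_lam))
```

On this grid the tightest bound is reached at `best_lam`, where it coincides with `P_STAR`, so for this particular problem the lower bound is sharp.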
The Lagrange dual function and conjugate functions
Recall that the conjugate of $f_0$ is $f_{0}^{*} \left( \mathbf{y} \right) = \sup_{\mathbf{x} \in \operatorname{dom} f_0} \left( \mathbf{y}^T \mathbf{x} - f_{0} \left( \mathbf{x} \right) \right)$. Consider
$$
\begin{array}{ll} \min & f_0 \left( \mathbf{x} \right) \\ \text{s.t.} & \mathbf{A} \mathbf{x} \preceq \mathbf{b} \\ & \mathbf{C} \mathbf{x} = \mathbf{d} \end{array}
$$
The dual function is
$$
\begin{aligned} g \left( \mathbf{\lambda}, \mathbf{\nu} \right) &= \inf_{\mathbf{x}} \left( f_{0} \left( \mathbf{x} \right) + \mathbf{\lambda}^T \left( \mathbf{A} \mathbf{x} - \mathbf{b} \right) + \mathbf{\nu}^T \left( \mathbf{C}\mathbf{x} - \mathbf{d} \right) \right) \\ &= -\mathbf{b}^T\mathbf{\lambda} - \mathbf{d}^T\mathbf{\nu} + \inf_{\mathbf{x}} \left( f_{0} \left( \mathbf{x} \right) + \left( \mathbf{A}^T\mathbf{\lambda} + \mathbf{C}^T\mathbf{\nu} \right)^T \mathbf{x}\right) \\ &= -\mathbf{b}^T\mathbf{\lambda} - \mathbf{d}^T\mathbf{\nu} - \sup_{\mathbf{x}} \left( \left( -\mathbf{A}^T\mathbf{\lambda} - \mathbf{C}^T\mathbf{\nu} \right)^T \mathbf{x}-f_{0} \left( \mathbf{x} \right)\right) \\ &= -\mathbf{b}^T\mathbf{\lambda} - \mathbf{d}^T\mathbf{\nu} - f_{0}^{*} \left( -\mathbf{A}^T\mathbf{\lambda} - \mathbf{C}^T\mathbf{\nu} \right) \end{aligned}
$$
with $\operatorname{dom} g = \left\{ \left( \mathbf{\lambda}, \mathbf{\nu} \right) \mid -\mathbf{A}^T\mathbf{\lambda} - \mathbf{C}^T\mathbf{\nu} \in \operatorname{dom} f_{0}^{*} \right\}$.
Examples
Norm minimization
$$
\begin{array}{ll} \min & \|\mathbf{x}\| \\ \text{s.t.} & \mathbf{A} \mathbf{x} = \mathbf{b} \end{array}
$$
The conjugate of $f_0 = \|\cdot\|$ is the indicator function of the dual-norm unit ball,
$$
f_{0}^{*} \left( \mathbf{y} \right) =\begin{cases} 0, & \|\mathbf{y}\|_{*} \le 1\\ \infty, & \text{otherwise} \end{cases}
$$
Since $\|-\mathbf{A}^T\mathbf{\nu}\|_{*} = \|\mathbf{A}^T\mathbf{\nu}\|_{*}$, this gives
$$
g \left( \mathbf{\nu} \right) =-\mathbf{b}^T\mathbf{\nu} - f_{0}^{*} \left( -\mathbf{A}^T \mathbf{\nu} \right) =\begin{cases} -\mathbf{b}^T\mathbf{\nu}, & \|\mathbf{A}^T\mathbf{\nu}\|_{*}\le 1\\ -\infty, & \text{otherwise} \end{cases}
$$
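To make the formula concrete, here is a minimal sketch for the Euclidean norm, which is its own dual norm. The data (`A_ROW`, `B`) and the function name are assumptions chosen for illustration: the constraint is $x_1 + x_2 = 2$, whose minimum-norm solution is $(1, 1)$.

```python
import math

# Hypothetical data: A is the 1x2 matrix [1, 1] and b = 2, i.e. the constraint x1 + x2 = 2.
A_ROW = [1.0, 1.0]
B = 2.0


def dual_function(nu):
    """g(nu) = -b * nu if ||A^T nu||_2 <= 1, else -inf."""
    dual_norm = math.hypot(A_ROW[0] * nu, A_ROW[1] * nu)
    if dual_norm <= 1.0 + 1e-12:  # small tolerance for rounding on the boundary
        return -B * nu
    return -math.inf


# Primal optimum: the minimum-norm point on the line x1 + x2 = 2 is (1, 1).
p_star = math.hypot(1.0, 1.0)

# nu = -1/sqrt(2) lies on the boundary of the dual feasible set {nu : sqrt(2) * |nu| <= 1}.
nu_star = -1.0 / math.sqrt(2.0)
d_star = dual_function(nu_star)

print(p_star, d_star)
```

In this instance the best dual bound matches the primal optimal value up to rounding, and any `nu` outside the dual-norm ball yields the trivial bound `-inf`.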
Minimum volume covering ellipsoid
$$
\begin{array}{ll} \min & f_0 \left( \mathbf{X} \right)=\log \det \mathbf{X}^{-1} \\ \text{s.t.} & \mathbf{a}_{i}^T \mathbf{X}\mathbf{a}_{i} \le 1, \quad i=1,2,\cdots,m \end{array}
$$
where the variable is $\mathbf{X}$ and $\operatorname{dom} f_{0} = S_{++}^{n}$. The conjugate of $f_0$ is
$$
f_{0}^{*} \left( \mathbf{Y} \right) = \log \det \left( -\mathbf{Y} \right)^{-1} -n
$$
with $\operatorname{dom} f_{0}^{*} = -S_{++}^{n}$.
The Lagrangian is
$$
\begin{aligned} L\left( \mathbf{X}, \mathbf{\lambda} \right) &= \log \det \mathbf{X}^{-1} + \sum_{i=1}^{m}\lambda_{i}\left( \mathbf{a}_{i}^{T} \mathbf{X} \mathbf{a}_{i}-1 \right) \\ &= -\mathbf{\lambda}^T\mathbf{1} + \operatorname{tr}\left( \left( \sum_{i=1}^{m}\lambda_{i} \mathbf{a}_{i} \mathbf{a}_{i}^{T} \right) \mathbf{X} \right) +\log \det \mathbf{X}^{-1} \end{aligned}
$$
Therefore, whenever $\sum_{i=1}^{m}\lambda_{i} \mathbf{a}_{i} \mathbf{a}_{i}^{T} \succ 0$,
$$
g \left( \mathbf{\lambda} \right) = -\mathbf{\lambda}^T \mathbf{1} - f_{0}^{*} \left( -\sum_{i=1}^{m}\lambda_{i} \mathbf{a}_{i} \mathbf{a}_{i}^{T} \right) = -\mathbf{\lambda}^T \mathbf{1} + \log \det \left( \sum_{i=1}^{m}\lambda_{i} \mathbf{a}_{i} \mathbf{a}_{i}^{T} \right) +n
$$
and $g \left( \mathbf{\lambda} \right) = -\infty$ otherwise.
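A rough sanity check of this dual function is possible in the scalar case $n = 1$, where $X > 0$ is a number and the constraints read $a_i^2 X \le 1$. The values in `A_VALUES` and the multiplier grid are assumptions made for the sketch; the primal optimum is then $X = 1/\max_i a_i^2$.

```python
import itertools
import math

A_VALUES = [0.5, 2.0, 1.0]  # hypothetical scalars a_i (the case n = 1)


def dual_function(lams):
    """g(lam) = -sum(lam) + log(sum(lam_i * a_i^2)) + n with n = 1; -inf if the sum is not positive."""
    weighted = sum(lam * a * a for lam, a in zip(lams, A_VALUES))
    if weighted <= 0.0:
        return -math.inf
    return -sum(lams) + math.log(weighted) + 1.0


# Primal: minimize log(1 / X) subject to a_i^2 * X <= 1, so X = 1 / max(a_i^2).
p_star = math.log(max(a * a for a in A_VALUES))

# Evaluate g on a coarse grid of nonnegative multipliers.
grid = [0.0, 0.25, 0.5, 1.0, 2.0]
values = [dual_function(lams) for lams in itertools.product(grid, repeat=len(A_VALUES))]
d_best = max(values)

print(p_star, d_best)
```

On this grid no value of `g` exceeds `p_star`, and the best value is obtained by putting unit weight on the constraint with the largest $a_i^2$.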
The Lagrange dual problem
The best lower bound on $p^{*}$ obtainable from the dual function is given by the Lagrange dual problem
$$
\begin{array}{ll} \max & g \left(\mathbf{\lambda}, \mathbf{\nu}\right) \\ \text{s.t.} & \mathbf{\lambda} \succeq \mathbf{0} \end{array}
$$
Examples
Standard form linear program
$$
\begin{array}{ll} \min & \mathbf{c}^{T} \mathbf{x} \\ \text{s.t.} & \mathbf{A} \mathbf{x} = \mathbf{b} \\ & \mathbf{x} \succeq \mathbf{0} \end{array}
$$
The Lagrangian is
$$
L \left( \mathbf{x}, \mathbf{\lambda}, \mathbf{\nu} \right) = \mathbf{c}^T \mathbf{x} -\mathbf{\lambda}^T\mathbf{x} + \mathbf{\nu}^T \left( \mathbf{A}\mathbf{x} - \mathbf{b} \right) = - \mathbf{\nu}^T\mathbf{b} + \left( \mathbf{c} - \mathbf{\lambda}+ \mathbf{A}^T \mathbf{\nu} \right)^{T} \mathbf{x}
$$
A linear function of $\mathbf{x}$ is bounded below only when its coefficient vector vanishes, so
$$
g \left( \mathbf{\lambda}, \mathbf{\nu} \right) = \begin{cases} -\mathbf{b}^T\mathbf{\nu}, & \mathbf{c} - \mathbf{\lambda}+ \mathbf{A}^T \mathbf{\nu}=\mathbf{0}\\ -\infty, & \text{otherwise} \end{cases}
$$
The lower bound property requires $\mathbf{\lambda} \succeq \mathbf{0}$.
The dual problem is
$$
\begin{array}{ll} \max & -\mathbf{b}^T\mathbf{\nu} \\ \text{s.t.} & \mathbf{c} - \mathbf{\lambda}+ \mathbf{A}^T \mathbf{\nu}=\mathbf{0} \\ & \mathbf{\lambda} \succeq \mathbf{0} \end{array}
$$
Eliminating $\mathbf{\lambda} = \mathbf{c} + \mathbf{A}^T \mathbf{\nu}$ gives the equivalent form
$$
\begin{array}{ll} \max & -\mathbf{b}^T\mathbf{\nu} \\ \text{s.t.} & \mathbf{c} + \mathbf{A}^T \mathbf{\nu} \succeq \mathbf{0} \end{array}
$$
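The primal and dual LP can be compared on a tiny instance. The sketch below uses assumed data (`C`, `A_ROW`, `B`) and brute-force grids instead of an LP solver, so it only illustrates the inequality structure.

```python
C = [1.0, 2.0]       # hypothetical cost vector c
A_ROW = [1.0, 1.0]   # hypothetical 1x2 matrix A
B = 1.0              # hypothetical right-hand side b


def dual_objective(nu):
    """Return -b * nu if c + A^T nu >= 0 componentwise, else -inf (dual infeasible)."""
    feasible = all(c_j + a_j * nu >= 0.0 for c_j, a_j in zip(C, A_ROW))
    return -B * nu if feasible else float("-inf")


# Primal feasible points are x = (t, 1 - t) with 0 <= t <= 1; sample t on a grid.
primal_values = [C[0] * (k / 100.0) + C[1] * (1.0 - k / 100.0) for k in range(101)]
p_star = min(primal_values)

# Sample the dual variable nu on a grid in [-3, 3].
dual_values = [dual_objective(k / 100.0) for k in range(-300, 301)]
d_star = max(dual_values)

print(p_star, d_star)
```

Every sampled dual value stays below the primal optimum, and for this instance the best dual value agrees with it.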
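Weak duality
Let $d^{*}$ denote the optimal value of the Lagrange dual problem. Then
$$
d^{*} \le p^{*}
$$
The difference $p^{*} - d^{*}$ is called the optimal duality gap.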
Strong duality and Slater's condition
If $d^{*} = p^{*}$, we say that strong duality holds.
Consider the problem
$$
\begin{array}{ll} \min & f_0\left(\mathbf{x}\right) \\ \text{s.t.} & f_i\left(\mathbf{x}\right) \leq 0, \quad i=1, \ldots, m \\ & \mathbf{A}\mathbf{x} = \mathbf{b} \end{array}
$$
where $f_0, f_1,\cdots, f_m$ are convex functions and $\mathcal{D}$ is the domain of the problem.
Slater's condition: if there exists $\mathbf{x} \in \operatorname{relint} \mathcal{D}$ such that
$$
f_{i} \left( \mathbf{x} \right) < 0, \quad i= 1,\cdots,m,\quad \mathbf{A}\mathbf{x} = \mathbf{b}
$$
then strong duality holds.
If $f_1, f_2,\cdots, f_k$ are affine, the strict inequalities are only needed for the remaining constraints.
Refined Slater condition: if there exists $\mathbf{x} \in \operatorname{relint} \mathcal{D}$ such that
$$
f_{i} \left( \mathbf{x} \right) \le 0, \quad i= 1,\cdots,k, \qquad f_{i} \left( \mathbf{x} \right) < 0, \quad i= k+1,\cdots,m, \qquad \mathbf{A}\mathbf{x} = \mathbf{b}
$$
then strong duality holds.
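As a small illustration (an assumed toy problem, not taken from the notes), consider minimizing $x^2$ subject to $1 - x \le 0$. The problem is convex and the point $x = 2$ is strictly feasible, so Slater's condition applies; a direct computation is consistent with a zero duality gap:

```latex
\begin{aligned}
p^{*} &= \min_{x \ge 1} x^2 = 1, \\
g(\lambda) &= \inf_{x} \left( x^2 + \lambda (1 - x) \right) = \lambda - \frac{\lambda^2}{4}
  \quad \text{(infimum at } x = \lambda / 2\text{)}, \\
d^{*} &= \max_{\lambda \ge 0} \left( \lambda - \frac{\lambda^2}{4} \right) = 1
  \quad \text{(maximum at } \lambda = 2\text{)}.
\end{aligned}
```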
Saddle points
Min-max characterization
First consider the case without equality constraints.
Note that
$$
\begin{aligned} \sup_{\mathbf{\lambda} \succeq \mathbf{0}} L \left( \mathbf{x}, \mathbf{\lambda} \right) &= \sup_{\mathbf{\lambda} \succeq \mathbf{0}} \left( f_{0} \left( \mathbf{x} \right) + \sum_{i=1}^{m} \lambda_{i} f_{i} \left( \mathbf{x} \right) \right) \\ &= \begin{cases} f_{0} \left( \mathbf{x} \right), & f_{i} \left( \mathbf{x} \right) \le 0,\ i=1,\cdots, m \\ \infty, & \text{otherwise} \end{cases} \end{aligned}
$$
Indeed, if $\mathbf{x}$ is infeasible, there is an index $i$ with $f_i \left( \mathbf{x} \right) > 0$; setting $\lambda_j =0$ for $j\neq i$ and letting $\lambda_{i} \to \infty$ drives $L\to \infty$.
If $f_i \left( \mathbf{x} \right) \le 0$ for $i=1,\cdots, m$, every term $\lambda_i f_i \left( \mathbf{x} \right)$ is nonpositive, so the supremum is attained at $\mathbf{\lambda} = \mathbf{0}$ and equals $f_0 \left( \mathbf{x} \right)$.
Therefore
$$
p^{*} = \inf_{\mathbf{x}} \sup_{\mathbf{\lambda}\succeq \mathbf{0}} L \left( \mathbf{x}, \mathbf{\lambda} \right)
$$
By the definition of the dual function,
$$
d^{*} = \sup_{\mathbf{\lambda} \succeq \mathbf{0}} \inf_{\mathbf{x}} L \left( \mathbf{x}, \mathbf{\lambda} \right)
$$
Weak duality therefore reads
$$
\sup_{\mathbf{\lambda} \succeq \mathbf{0}} \inf_{\mathbf{x}} L \left( \mathbf{x}, \mathbf{\lambda} \right) \le \inf_{\mathbf{x}} \sup_{\mathbf{\lambda}\succeq \mathbf{0}} L \left( \mathbf{x}, \mathbf{\lambda} \right)
$$
and strong duality reads
$$
\sup_{\mathbf{\lambda} \succeq \mathbf{0}} \inf_{\mathbf{x}} L \left( \mathbf{x}, \mathbf{\lambda} \right) = \inf_{\mathbf{x}} \sup_{\mathbf{\lambda}\succeq \mathbf{0}} L \left( \mathbf{x}, \mathbf{\lambda} \right)
$$
The weak duality inequality is a special case of a more general fact.
[!note] max-min inequality
For any $f: Z \times W \to \mathbb{R}$,
$$
\sup_{z \in Z} \inf_{w \in W} f \left( z, w \right) \le \inf_{w \in W} \sup_{z \in Z} f \left( z, w \right)
$$
Proof:
Let $g \left( z \right)= \inf_{w \in W} f \left( z,w \right)$ and $h \left( w \right)= \sup_{z \in Z} f \left( z, w \right)$.
By the definition of the infimum, $g\left(z \right) \le f \left(z, w\right)$ for all $z \in Z, w \in W$.
By the definition of the supremum, $f \left(z, w\right) \le h\left( w \right)$ for all $z \in Z, w \in W$.
Hence, for all $z \in Z, w \in W$, we have $g\left(z \right) \le f \left(z, w\right) \le h\left( w \right)$.
Taking the supremum over $z$ on the left gives $\sup_{z\in Z} g \left( z \right) \le h \left( w \right)$ for all $w \in W$.
Taking the infimum over $w$ on the right gives $\sup_{z\in Z} g \left( z \right) \le \inf_{w \in W}h \left( w \right)$.
This is exactly
$$
\sup_{z \in Z} \inf_{w \in W} f \left( z, w \right) \le \inf_{w \in W} \sup_{z \in Z} f \left( z, w \right)
$$
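For finite sets $Z$ and $W$ the inequality can be checked directly on a table of values. The table `F` below is an arbitrary assumed example, with rows indexed by $z$ and columns by $w$.

```python
# Hypothetical table of values: rows are indexed by z, columns by w, F[z][w] = f(z, w).
F = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5],
]


def sup_inf(table):
    """max over z of (min over w of f(z, w))."""
    return max(min(row) for row in table)


def inf_sup(table):
    """min over w of (max over z of f(z, w))."""
    columns = zip(*table)  # transpose so that each item is one column (fixed w)
    return min(max(col) for col in columns)


lhs = sup_inf(F)
rhs = inf_sup(F)

print(lhs, rhs, lhs <= rhs)
```

For this table the inequality is strict, which shows that equality (the analogue of strong duality) is not automatic.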
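Saddle-point interpretation
For $f: Z \times W \to \mathbb{R}$, a pair $\tilde{z} \in Z, \tilde{w} \in W$ satisfying
$$
f \left( z, \tilde{w} \right) \le f \left( \tilde{z}, \tilde{w} \right) \le f \left( \tilde{z}, w \right) \quad \text{for all } z \in Z,\ w \in W
$$
is called a saddle point of $f$: $\tilde{z}$ maximizes $f \left( z, \tilde{w} \right)$ over $z$, and $\tilde{w}$ minimizes $f \left( \tilde{z}, w \right)$ over $w$.
In other words, if $\left( \mathbf{x}^{*}, \mathbf{\lambda}^{*} \right)$ with $\mathbf{\lambda}^{*} \succeq \mathbf{0}$ is a saddle point of $L$, then $\mathbf{x}^{*}$ is primal optimal, $\mathbf{\lambda}^{*}$ is dual optimal, and strong duality holds. Conversely, if strong duality holds, any pair of primal and dual optimal points is a saddle point of $L$.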
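The saddle-point inequalities can be verified numerically on a toy problem. The sketch assumes the illustrative problem $\min x^2$ subject to $1 - x \le 0$, with Lagrangian $L(x, \lambda) = x^2 + \lambda(1 - x)$ and candidate saddle point $(x^{*}, \lambda^{*}) = (1, 2)$; the grids are arbitrary sample points, so this is a spot check rather than a proof.

```python
def lagrangian(x, lam):
    """L(x, lam) = x^2 + lam * (1 - x) for the toy problem min x^2 s.t. 1 - x <= 0."""
    return x * x + lam * (1.0 - x)


X_STAR, LAM_STAR = 1.0, 2.0  # candidate saddle point (assumed primal and dual optimum)
center = lagrangian(X_STAR, LAM_STAR)

xs = [k / 10.0 for k in range(-30, 31)]   # sample points x in [-3, 3]
lams = [k / 10.0 for k in range(0, 51)]   # sample multipliers lam in [0, 5]

# L(x*, lam) <= L(x*, lam*): lam* maximizes L(x*, .) over lam >= 0.
left_ok = all(lagrangian(X_STAR, lam) <= center for lam in lams)

# L(x*, lam*) <= L(x, lam*): x* minimizes L(., lam*) over x.
right_ok = all(center <= lagrangian(x, LAM_STAR) for x in xs)

print(center, left_ok, right_ok)
```

Both checks pass on these grids, and the saddle value `center` equals the common optimal value of the primal and dual toy problems.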
Reference:
Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.