用原问题(Primal Problem)中的P表示原问题,具体如下: ( P ) min f ( x ) s t . g i ( x ) ≤ 0 , i = 1 , ⋯ , m , h i ( x ) = 0 , i = 1 , ⋯ , l , x ∈ X \begin{equation}\begin{aligned} &(P)\; \;\min\; f(x)\\ &st.\;\;g_i(x)\le0,i=1,\cdots,m,\\ &\;\;\;\;\;\;h_i(x)=0,i=1,\cdots,l,\\ &\;\;\;\;\;\;x\in X\\ \end{aligned}\end{equation} (P)minf(x)st.gi(x)≤0,i=1,⋯,m,hi(x)=0,i=1,⋯,l,x∈X
用对偶问题(duality Problem)中的D表示对偶问题,具体如下: ( D ) max λ ≥ 0 , μ min x ∈ X L ( x , λ , μ ) L ( x , λ , μ ) = f ( x ) + ∑ i = 1 m λ i g i ( x ) + ∑ i = 1 l μ i h i ( x ) , x ∈ X \begin{equation}\begin{aligned} &(D)\; \;\max\limits_{\lambda\ge0,\mu}\min\limits_{x\in X}\mathrm{L}(x,\lambda,\mu)\\ &\mathrm{L}(x,\lambda,\mu)=f(x)+\sum_{i=1}^m\lambda_ig_i(x)+\sum_{i=1}^l\mu_ih_i(x),x\in X\\ \end{aligned}\end{equation} (D)λ≥0,μmaxx∈XminL(x,λ,μ)L(x,λ,μ)=f(x)+i=1∑mλigi(x)+i=1∑lμihi(x),x∈X
原问题的拉格朗日函数形式: ( P ) min x ∈ X max λ ≥ 0 , μ L ( x , λ , μ ) L ( x , λ , μ ) = f ( x ) + ∑ i = 1 m λ i g i ( x ) + ∑ i = 1 l μ i h i ( x ) , x ∈ X \begin{equation}\begin{aligned} &(P)\; \;\min\limits_{x\in X}\max\limits_{\lambda\ge0,\mu}\mathrm{L}(x,\lambda,\mu)\\ &\mathrm{L}(x,\lambda,\mu)=f(x)+\sum_{i=1}^m\lambda_ig_i(x)+\sum_{i=1}^l\mu_ih_i(x),x\in X\\ \end{aligned}\end{equation} (P)x∈Xminλ≥0,μmaxL(x,λ,μ)L(x,λ,μ)=f(x)+i=1∑mλigi(x)+i=1∑lμihi(x),x∈X
2. Slater 条件
假设: 1) 集合X为非空凸集, f ( x ) f(x) f(x)及 g i ( x ) , i = 1 , 2 , ⋯ , m g_i(x),i=1,2,\cdots,m gi(x),i=1,2,⋯,m是凸函数, h i ( x ) , i = 1 , 2 , ⋯ , l h_i(x),i=1,2,\cdots,l hi(x),i=1,2,⋯,l均为线性函数。 2) 假设存在 x ^ ∈ X \hat{x}\in X x^∈X使得 g i ( x ^ ) < 0 , i = 1 , ⋯ , m , h i ( x ^ ) = 0 , i = 1 , ⋯ , l g_i(\hat{x})<0,i=1,\cdots,m,h_i(\hat{x})=0,i=1,\cdots,l gi(x^)<0,i=1,⋯,m,hi(x^)=0,i=1,⋯,l,且 0 ∈ i n t h ( X ) 0\in \mathrm{int}\; h(X) 0∈inth(X),其中 h ( X ) = { [ h 1 ( x ) , h 2 ( x ) , ⋯ , h l ( x ) ] T ∣ x ∈ X } h(X)=\{[h_1(x),h_2(x),\cdots,h_l(x)]^T\big|x\in X\} h(X)={[h1(x),h2(x),⋯,hl(x)]Tx∈X},则强对偶成立,即: min { f ( x ) ∣ x ∈ S } = max { d ( λ , μ ) ∣ λ ≥ 0 , μ } \begin{equation} \min \{f(x)|x\in S\}=\max \{d(\lambda,\mu)\big|\lambda \ge 0,\mu\} \end{equation} min{f(x)∣x∈S}=max{d(λ,μ)λ≥0,μ}
3. 鞍点
假设原问题P有最优解 x ˉ \bar{x} xˉ,其对偶问题D有最优解 ( λ ˉ , μ ˉ ) (\bar{\lambda},\bar{\mu}) (λˉ,μˉ),满足强对偶关系, v ( P ) = v ( D ) v(P)=v(D) v(P)=v(D),且可得: g i ( x ˉ ) ≤ 0 , h i ( x ˉ ) = 0 , x ˉ ≤ X , λ ˉ ≥ 0 , f ( x ˉ ) = d ( λ ˉ , μ ˉ ) \begin{equation} g_i(\bar{x})\le 0,h_i(\bar{x})=0,\bar{x}\le X,\bar{\lambda}\ge 0,f(\bar{x})=d(\bar{\lambda},\bar{\mu}) \end{equation} gi(xˉ)≤0,hi(xˉ)=0,xˉ≤X,λˉ≥0,f(xˉ)=d(λˉ,μˉ)
因为 f ( x ˉ ) f(\bar{x}) f(xˉ)为原问题最小值,所以其他值必然大于此值: v ( P ) = f ( x ˉ ) ≤ f ( x ) \begin{equation} v(P)=f(\bar{x})\le f(x) \end{equation} v(P)=f(xˉ)≤f(x)
因为 g i ( x ) ≤ 0 , λ i > 0 , h i ( x ) = 0 g_i(x)\le 0,\lambda_i>0,h_i(x)=0 gi(x)≤0,λi>0,hi(x)=0可得: d ( λ ˉ , μ ˉ ) ≤ f ( x ˉ ) \begin{equation} d(\bar{\lambda},\bar{\mu})\le f(\bar{x}) \end{equation} d(λˉ,μˉ)≤f(xˉ)
综上所述,弱对偶定理可得: d ( λ , μ ) ≤ d ( λ ˉ , μ ˉ ) = v ( D ) ≤ v ( P ) = f ( x ˉ ) ≤ f ( x ) \begin{equation} d(\lambda,\mu)\le d(\bar{\lambda},\bar{\mu})=v(D)\le v(P)=f(\bar{x})\le f(x) \end{equation} d(λ,μ)≤d(λˉ,μˉ)=v(D)≤v(P)=f(xˉ)≤f(x)
转换为拉格朗日函数可得: L ( x ˉ , λ , μ ) ≤ L ( x ˉ , λ ˉ , μ ˉ ) ≤ L ( x , λ ˉ , μ ˉ ) \begin{equation} L(\bar{x},\lambda,\mu)\le L(\bar{x},\bar{\lambda},\bar{\mu})\le L(x,\bar{\lambda},\bar{\mu}) \end{equation} L(xˉ,λ,μ)≤L(xˉ,λˉ,μˉ)≤L(x,λˉ,μˉ)
OpenAI美东时间周四推出“GPT-4o mini”,入局“小而精”AI模型竞争,称这款新模型是“功能最强、成本偏低的模型”,计划今后整合图像、视频、音频到这个模型中。
OpenAI表示,GPT-4o mini 相较于 OpenAI 目前最先进的 AI 模型更加便…