INTRODUCTION TO NONLINEAR OPTIMIZATION, Exercise 5.2: The Freudenstein and Roth Test Function

Amir Beck, Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB, SIAM, 2014. Exercise 5.2.

This post covers the MATLAB part of item (ii) of the exercise.
Item (i) is covered at: https://blog.csdn.net/IYXUAN/article/details/121454946

Objectives

  • Understand the gradient method with backtracking, the damped Newton's method, and the hybrid Newton's method;
  • Use MATLAB to solve the optimization problem with each of the three methods;
  • Compare the strengths and weaknesses of the three methods and explain the observed results.

Environment

MATLAB R2021a

Problem Statement

Consider the Freudenstein and Roth test function

$f(x) = f_1(x)^2 + f_2(x)^2, \qquad x \in \mathbb{R}^2,$

where

$f_1(x) = -13 + x_1 + ((5 - x_2)x_2 - 2)x_2,$

$f_2(x) = -29 + x_1 + ((x_2 + 1)x_2 - 14)x_2.$

Use MATLAB to employ the following three methods on the problem of minimizing $f$:

  1. the gradient method with backtracking and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
  2. the hybrid Newton's method with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
  3. the damped Gauss–Newton method with a backtracking line search strategy and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.

All the algorithms should use the stopping criterion $\|\nabla f(x)\| \le \epsilon$ with $\epsilon = 10^{-5}$. Each algorithm should be employed four times, once from each of the four starting points $(-50, 7)^T$, $(20, 7)^T$, $(20, -18)^T$, $(5, -10)^T$. For each of the four starting points, compare the number of iterations and the point to which each method converged. If a method did not converge, explain why.

Algorithm Descriptions

gradient_method_backtracking

Backtracking gradient descent:

Set the tolerance $\epsilon > 0$.

Initialize $x_0 \in \mathbb{R}^n$.

FOR k = 0, 1, 2, …

  1. Choose the descent direction $d_k = -g(x_k)$, the negative gradient.
  2. Choose a step size $t_k$ such that $f(x_k + t_k d_k) < f(x_k)$.
  3. Set $x_{k+1} = x_k + t_k d_k$.
  4. IF $\|g(x_{k+1})\| \le \epsilon$ THEN STOP and OUTPUT $x_{k+1}$.

In step 2 (step-size selection), a constant-step-size gradient method simply takes $t_k$ to be a constant, $t_k = \bar{t}$. With backtracking, an inexact line search, the step size starts at $t_k = s$, $s > 0$, and is repeatedly shrunk by $t_k \leftarrow \beta t_k$, $\beta \in (0,1)$, until the sufficient-decrease condition $f(x_k) - f(x_k + t_k d_k) \ge -\alpha t_k \nabla f(x_k)^T d_k$, $\alpha \in (0,1)$, holds. One can show that this inequality holds for all $t \in [0, \varepsilon]$ for some $\varepsilon > 0$, so the backtracking loop always terminates after finitely many reductions.
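A minimal MATLAB sketch of this backtracking rule (the names f, g, x, s, alpha, beta match the listings later in this post):

gx = g(x);                                   % current gradient
d  = -gx;                                    % descent direction
t  = s;                                      % start from the initial step size s
while f(x) - f(x + t*d) < -alpha*t*(gx'*d)   % sufficient-decrease test still fails
    t = beta*t;                              % shrink the step
end
x = x + t*d;                                 % accept the step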

newton_backtracking

Backtracking Newton's method:

Backtracking Newton's method is the same as the gradient descent above, except that the descent direction chosen in step 1 is replaced by the Newton direction:

$d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k).$
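In MATLAB the Newton direction should be obtained by solving the linear system rather than forming the inverse explicitly; the listings below store d = h(x)\g(x) and then step x = x - t*d, which gives the same direction up to sign:

% Newton direction: solve the system h(x)*d = g(x) instead of inverting the Hessian
d = h(x) \ g(x);   % the update then takes x = x - t*d, as in the listings below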

newton_hybrid

Hybrid Newton's method:

When choosing the descent direction $d_k$: if the Hessian $\nabla^2 f(x_k)$ is not positive definite, fall back to a backtracking gradient step, i.e. $d_k = -\nabla f(x_k)$; if $\nabla^2 f(x_k)$ is positive definite, take the Newton direction. Positive definiteness is tested with the Cholesky factorization, as sketched below.
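A minimal sketch of the Cholesky-based test, using the second output of chol (p == 0 exactly when the matrix is positive definite); this is the same construction used in newton_hybrid below:

[L, p] = chol(h(x), 'lower');   % attempt a Cholesky factorization h(x) = L*L'
if p == 0                       % success: Hessian is positive definite
    d = L' \ (L \ g(x));        % Newton direction (up to sign), reusing the factor
else                            % failure: fall back to the gradient direction
    d = g(x);
end
% the update x = x - t*d then makes d act as a descent direction in both cases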

Procedure

Computing $f(x)$, the gradient $g(x)$, and the Hessian $h(x)$:

$f(x) = 2x_1^2 + 12x_1x_2^2 - 32x_1x_2 - 84x_1 + 2x_2^6 - 8x_2^5 + 2x_2^4 - 80x_2^3 + 12x_2^2 + 864x_2 + 1010$

$g(x) = \begin{pmatrix} 12x_2^2 - 32x_2 + 4x_1 - 84 \\ 24x_2 - 32x_1 + 24x_1x_2 - 240x_2^2 + 8x_2^3 - 40x_2^4 + 12x_2^5 + 864 \end{pmatrix}$

$h(x) = \begin{pmatrix} 4 & 24x_2 - 32 \\ 24x_2 - 32 & 60x_2^4 - 160x_2^3 + 24x_2^2 - 480x_2 + 24x_1 + 24 \end{pmatrix}$

clc;clear
syms f1 f2 x1 x2 f g h
f1=-13+x1+((5-x2)*x2-2)*x2;
f2=-29+x1+((x2+1)*x2-14)*x2;
f=expand(f1^2+f2^2)
% f = 2*x1^2 + 12*x1*x2^2 - 32*x1*x2 - 84*x1 + 2*x2^6 - 8*x2^5 + 2*x2^4 - 80*x2^3 + 12*x2^2 + 864*x2 + 1010
g=gradient(f)
% g = [12*x2^2 - 32*x2 + 4*x1 - 84; 24*x2 - 32*x1 + 24*x1*x2 - 240*x2^2 + 8*x2^3 - 40*x2^4 + 12*x2^5 + 864]
h=hessian(f)
% h = [4, 24*x2 - 32; 24*x2 - 32, 60*x2^4 - 160*x2^3 + 24*x2^2 - 480*x2 + 24*x1 + 24]
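The function handles used in the driver script below were copied from these symbolic results by hand. Alternatively (a sketch, assuming the symbolic objects f, g, h from the code above), matlabFunction can generate the handles automatically:

% optionally generate the numeric handles straight from the symbolic results
fh = matlabFunction(f, 'Vars', {[x1; x2]});   % f as a handle of one 2-vector argument
gh = matlabFunction(g, 'Vars', {[x1; x2]});   % gradient handle, returns a 2x1 vector
hh = matlabFunction(h, 'Vars', {[x1; x2]});   % Hessian handle, returns a 2x2 matrix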

Contour and surface plots of the function


Figure 1. Contour and surface plots of $f(x)$ around the global optimal solution $x^* = (5, 4)$.


Figure 2. Contour and surface plots of $f(x)$ around the local optimal solution $x^* = (11.4128, -0.8968)$.

clc;clear
% local optimality
x1 = linspace(11.40,11.42,50);
x2 = linspace(-0.897,-0.8965,50);
% global optimality
% x1 = linspace(4.9,5.1,50);
% x2 = linspace(3.9,4.1,50);
[X1,X2] = meshgrid(x1,x2);
f = 2*X1.^2 + 12*X1.*X2.^2 - 32*X1.*X2 - 84*X1 + 2*X2.^6 - 8*X2.^5 ...
    + 2*X2.^4 - 80*X2.^3 + 12*X2.^2 + 864*X2 + 1010;
createfigure(X1,X2,f)

%% figure code generated by MATLAB's surfc export:
function createfigure(xdata1, ydata1, zdata1)
%CREATEFIGURE(xdata1, ydata1, zdata1)
%  XDATA1:  surface xdata
%  YDATA1:  surface ydata
%  ZDATA1:  surface zdata
%  Auto-generated by MATLAB on 08-Dec-2021 15:44:51

% Create figure
figure1 = figure;
% Create axes
axes1 = axes('Parent',figure1);
hold(axes1,'on');
% Create surf
surf(xdata1,ydata1,zdata1,'Parent',axes1);
% Create contour
contour(xdata1,ydata1,zdata1,'ZLocation','zmin');
view(axes1,[-113 13]);
grid(axes1,'on');
axis(axes1,'tight');
hold(axes1,'off');
% Create colorbar
colorbar(axes1);
end

Driver script

clc;clear
%% init
s = 1;
alpha = .5;
beta = .5;
epsilon = 1e-5;
p1 = [-50;7];
p2 = [20;7];
p3 = [20;-18];
p4 = [5;-10];
f = @(x) 2*x(1)^2 + 12*x(1)*x(2)^2 - 32*x(1)*x(2) - 84*x(1) ...
    + 2*x(2)^6 - 8*x(2)^5 + 2*x(2)^4 - 80*x(2)^3 + 12*x(2)^2 ...
    + 864*x(2) + 1010;
g = @(x) [12*x(2)^2 - 32*x(2) + 4*x(1) - 84; 24*x(2) - 32*x(1) ...
    + 24*x(1)*x(2) - 240*x(2)^2 + 8*x(2)^3 - 40*x(2)^4 + 12*x(2)^5 + 864];
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];
%% call func
% gradient_method_backtracking
[x_gb1,fun_val_gb1] = gradient_method_backtracking(f,g,p1,s,alpha,beta,epsilon);
[x_gb2,fun_val_gb2] = gradient_method_backtracking(f,g,p2,s,alpha,beta,epsilon);
[x_gb3,fun_val_gb3] = gradient_method_backtracking(f,g,p3,s,alpha,beta,epsilon);
[x_gb4,fun_val_gb4] = gradient_method_backtracking(f,g,p4,s,alpha,beta,epsilon);
% newton_backtracking
x_nb1 = newton_backtracking(f,g,h,p1,alpha,beta,epsilon);
x_nb2 = newton_backtracking(f,g,h,p2,alpha,beta,epsilon);
x_nb3 = newton_backtracking(f,g,h,p3,alpha,beta,epsilon);
x_nb4 = newton_backtracking(f,g,h,p4,alpha,beta,epsilon);
% newton_hybrid
x_nh1 = newton_hybrid(f,g,h,p1,alpha,beta,epsilon);
x_nh2 = newton_hybrid(f,g,h,p2,alpha,beta,epsilon);
x_nh3 = newton_hybrid(f,g,h,p3,alpha,beta,epsilon);
x_nh4 = newton_hybrid(f,g,h,p4,alpha,beta,epsilon);
function [x,fun_val]=gradient_method_backtracking(f,g,x0,s,alpha,beta,epsilon)
% Gradient method with backtracking stepsize rule
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% x0......... initial point
% s ......... initial choice of stepsize
% alpha ..... tolerance parameter for the stepsize selection
% beta ...... the constant in which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
% fun_val ... optimal function value
x=x0;
grad=g(x);
fun_val=f(x);
iter=0;
while (norm(grad)>epsilon)
    iter=iter+1;
    t=s;
    while (fun_val-f(x-t*grad)<alpha*t*norm(grad)^2)
        t=beta*t;   % backtrack until sufficient decrease holds
    end
    x=x-t*grad;
    fun_val=f(x);
    grad=g(x);
    fprintf('iter_number = %3d norm_grad = %2.6f fun_val = %2.6f \n',...
        iter,norm(grad),fun_val);
end
end
function x=newton_backtracking(f,g,h,x0,alpha,beta,epsilon)
% Newton's method with backtracking
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
x=x0;
gval=g(x);
hval=h(x);
d=hval\gval;   % Newton direction (the update below subtracts t*d)
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    d=hval\gval;
end
if (iter==10000)
    fprintf('did not converge\n')
end
end
function x=newton_hybrid(f,g,h,x0,alpha,beta,epsilon)
% Hybrid Newton's method
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
x=x0;
gval=g(x);
hval=h(x);
[L,p]=chol(hval,'lower');
if (p==0)
    d=L'\(L\gval);   % Hessian positive definite: Newton direction
else
    d=gval;          % otherwise: gradient direction
end
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    [L,p]=chol(hval,'lower');
    if (p==0)
        d=L'\(L\gval);
    else
        d=gval;
    end
end
if (iter==10000)
    fprintf('did not converge\n')
end
end

Results and Analysis

$p_1 = (-50, 7)^T$, $p_2 = (20, 7)^T$, $p_3 = (20, -18)^T$, $p_4 = (5, -10)^T$

gradient_method_backtracking

point   x*                   fun_val     iter
p1      (5.0, 4.0)           0.000000    -
p2      (5.0, 4.0)           0.000000    -
p3      (11.4128, -0.8968)   48.984254   -
p4      (5.0, 4.0)           0.000000    -

("-": the stopping criterion was never met, so the method did not terminate; from p4 the gradient norm stalls at about 0.000107. See the analysis below.)
# p1
iter_number =   1 norm_grad = 16886.999242 fun_val = 11336.705024 
iter_number =   2 norm_grad = 5600.459983 fun_val = 5803.984203 
iter_number =   3 norm_grad = 1056.831987 fun_val = 4723.717473 
iter_number =   4 norm_grad = 434.668354 fun_val = 4676.341028 
iter_number =   5 norm_grad = 181.792854 fun_val = 4664.456052 
iter_number =   6 norm_grad = 481.845230 fun_val = 4645.697772 
iter_number =   7 norm_grad = 1450.431648 fun_val = 2784.316562 
iter_number =   8 norm_grad = 626.401853 fun_val = 2597.516663 
iter_number =   9 norm_grad = 220.402369 fun_val = 2565.305713 
iter_number =  10 norm_grad = 116.512243 fun_val = 2561.054079 
......
iter_number = 20849 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20850 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20851 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20852 norm_grad = 0.000092 fun_val = 0.000000 
......
# p2
iter_number =   1 norm_grad = 20020.154130 fun_val = 11955.275024 
iter_number =   2 norm_grad = 8107.335439 fun_val = 3744.085975 
iter_number =   3 norm_grad = 2993.085982 fun_val = 1129.842099 
iter_number =   4 norm_grad = 949.370487 fun_val = 445.778197 
iter_number =   5 norm_grad = 192.946640 fun_val = 320.464548 
iter_number =   6 norm_grad = 86.882053 fun_val = 313.996463 
iter_number =   7 norm_grad = 41.236184 fun_val = 311.880377 
iter_number =   8 norm_grad = 223.087207 fun_val = 295.899225 
iter_number =   9 norm_grad = 91.972368 fun_val = 287.479235 
iter_number =  10 norm_grad = 40.293830 fun_val = 285.264643 
......
iter_number = 13961 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13962 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13963 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13964 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13965 norm_grad = 0.000151 fun_val = 0.000000 
......
# p3
iter_number =   1 norm_grad = 10459534.883496 fun_val = 26901796.557585 
iter_number =   2 norm_grad = 4328861.498929 fun_val = 9350549.111728 
iter_number =   3 norm_grad = 1815010.486957 fun_val = 3306985.630845 
iter_number =   4 norm_grad = 764085.159510 fun_val = 1178877.486706 
iter_number =   5 norm_grad = 322588.854922 fun_val = 423840.683140 
iter_number =   6 norm_grad = 136745.858345 fun_val = 154327.967236 
iter_number =   7 norm_grad = 58355.714510 fun_val = 57260.204151 
iter_number =   8 norm_grad = 25173.561268 fun_val = 21780.708212 
iter_number =   9 norm_grad = 11034.676282 fun_val = 8504.776429 
iter_number =  10 norm_grad = 4932.916085 fun_val = 3367.298679 
......
iter_number = 11759 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11760 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11761 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11762 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11763 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11764 norm_grad = 0.000034 fun_val = 48.984254 
......
# p4
iter_number =   1 norm_grad = 740484.580250 fun_val = 1125740.309089 
iter_number =   2 norm_grad = 318800.867908 fun_val = 410873.876977 
iter_number =   3 norm_grad = 135283.737154 fun_val = 147433.167712 
iter_number =   4 norm_grad = 57286.724183 fun_val = 52647.763355 
iter_number =   5 norm_grad = 24273.902315 fun_val = 18647.021114 
iter_number =   6 norm_grad = 10264.843110 fun_val = 6442.533586 
iter_number =   7 norm_grad = 4279.439838 fun_val = 2094.562330 
iter_number =   8 norm_grad = 1694.354839 fun_val = 604.979471 
iter_number =   9 norm_grad = 568.486420 fun_val = 158.093768 
iter_number =  10 norm_grad = 98.759692 fun_val = 69.674565 
......
iter_number = 6072 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6073 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6074 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6075 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6076 norm_grad = 0.000107 fun_val = 0.000000 
......

newton_backtracking

point   x*                   fun_val     iter
p1      (5.0, 4.0)           -0.000000   8
p2      (5.0, 4.0)           0.000000    11
p3      (11.4128, -0.8968)   48.984254   16
p4      (11.4128, -0.8968)   48.984254   13
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000
# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

newton_hybrid

point   x*                   fun_val     iter
p1      (5.0, 4.0)           -0.000000   8
p2      (5.0, 4.0)           0.000000    11
p3      (11.4128, -0.8968)   48.984254   16
p4      (11.4128, -0.8968)   48.984254   13
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000
# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

Analysis

The backtracking gradient method (gradient_method_backtracking) performs poorly on this $f(x)$: from every starting point (p1–p4) it fails to satisfy the stopping criterion $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, so the iteration count grows without bound. Still, as the iterations accumulate, the iterate $x^*$ and the function value fun_val do converge. From p1, p2, and p4 the method approaches $\bar{x} \to (5.0, 4.0)$ with fun_val → 0.0, which is the global minimum of $f(x)$; from p3 it reaches only the local minimum. From p4, moreover, the gradient norm stalls at about 0.000107, worse than from p1 and p2. In short, although gradient descent drives the iterates toward a minimizer (global or local), the gradient norm $\|\nabla f(x^*)\|$ cannot be pushed below the tolerance in floating-point arithmetic: once fun_val reaches the level of rounding error, the backtracking test can no longer certify a decrease and the steps stall, so the stopping rule is never triggered and the method does not terminate.

The backtracking Newton's method (newton_backtracking) and the hybrid Newton's method (newton_hybrid) produce nearly identical results and converge well from all four starting points. From p1 and p2, both methods reach the global minimum $(5.0, 4.0)$ in roughly 10 iterations; from p3 and p4, both converge within 16 iterations to the local minimum $(11.4128, -0.8968)$. Comparing the printed output of the two methods at all four starting points, the iterates agree exactly at every step, which shows that $\nabla^2 f(x_k)$, $k = 1, 2, \dots$, was positive definite at every iterate: the Cholesky test in the hybrid method always succeeded, so the hybrid method always took the Newton step and was therefore identical to backtracking Newton on these runs.
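As a quick check of this claim, one can test the Hessian for positive definiteness at the two limit points; a minimal sketch, reusing the handle h defined in the driver script above:

% sanity check: is the Hessian positive definite at both minimizers?
for xs = {[5; 4], [11.4128; -0.8968]}
    [~, p] = chol(h(xs{1}));   % p == 0 iff h(xs{1}) is positive definite
    fprintf('x = (%g, %g): chol flag p = %d\n', xs{1}(1), xs{1}(2), p);
end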

In summary, on $f(x)$ with the stopping criterion $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, the Newton-type methods require far fewer iterations than gradient descent, and for both families of methods the choice of starting point determines whether the iterates converge to the global minimum or only to a local one.

©️ Sylvan Ding ❤️

