Interpreting Super-Resolution Networks with Local Attribution Maps

CVPR2021
 
Problem Introduction
 
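The paper targets the lack of interpretability of deep models. It uses attribution analysis to find which input pixels contribute most to the SR result.
It proposes Local Attribution Maps (LAM), an attribution method that extends the path integrated gradient method with two improvements. The results suggest that the more input pixels an SR network involves, the better its SR result.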
 
Method
 
Attribution methods for classification, in brief: given an input image $I\in\mathbb{R}^d$ and a classification network $S:\mathbb{R}^d\rightarrow\mathbb{R}$, an attribution method produces an attribution map $Attr_S:\mathbb{R}^d\rightarrow\mathbb{R}^d$ with the same size as the input. The most direct approach takes the derivative of the output class score with respect to the input as the attribution map, $Grad_S(I)=\frac{\partial S(I)}{\partial I}$, but it suffers from saturation: the gradients become very small and no longer indicate which features matter. To address saturation, the element-wise product of input and gradient, $I\odot \frac{\partial S(I)}{\partial I}$, was proposed. Another method is Integrated Gradients (IG), $(I-I')\cdot \int_0^1 \frac{\partial S(I' + \alpha(I-I'))}{\partial I}d\alpha$, where $I'$ is the baseline input. The baseline acts like a control: to measure the influence of some factor, the baseline input is chosen to lack that factor and serves as the point of comparison.
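The saturation problem and the IG remedy can be illustrated with a minimal NumPy sketch. The toy model `S(x) = sigmoid(w . x)`, its weights, the all-zero baseline, and the midpoint quadrature are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w):
    """Toy scalar 'classifier' S(x) = sigmoid(w . x); stands in for a real network."""
    return sigmoid(np.dot(w, x))

def model_grad(x, w):
    """Analytic gradient dS/dx of the toy model."""
    s = model(x, w)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight line baseline -> x."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline), w) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input, chosen so that the sigmoid is saturated at x.
w = np.array([2.0, -1.0, 0.5, 3.0])
x = np.array([3.0, 1.0, 2.0, 2.0])
baseline = np.zeros_like(x)  # assumed baseline: an all-zero input

grad_map = model_grad(x, w)                     # plain gradient, nearly zero at x
ig_map = integrated_gradients(x, baseline, w)   # IG attribution per input element

print("gradient:", grad_map)
print("IG      :", ig_map)
print("sum(IG) vs S(x) - S(baseline):", ig_map.sum(), model(x, w) - model(baseline, w))
```

At this input the plain gradient is close to zero for every element, while the IG attributions sum to approximately `S(x) - S(baseline)`, which is the property that makes IG robust to saturation.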
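Points to consider when designing an attribution method for SR models: first, attribution for classification networks explains how the whole image influences the output class, whereas the output of an SR network is local; second, the analysis should focus on regions that are hard to super-resolve; finally, classification networks compute gradients from the output class probability to obtain the attribution map, but the output of an SR network is pixel values, which are tied to image intensity. An attribution map computed directly from them would also be tied to pixel intensity, so the method should explain features rather than pixel values.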
The proposed Local Attribution Maps are an upgraded version of IG. Consider an SR network $F:\mathbb{R}^{h\times w}\rightarrow\mathbb{R}^{sh\times sw}$ with scaling factor $s$. Following the requirements above, the analysis is carried out by attributing the existence of certain features of local patches in the output image. Whether a feature exists in the $l\times l$ patch at position $(x,y)$ is quantified by a detector $D_{xy}:\mathbb{R}^{l\times l}\rightarrow \mathbb{R}$; the paper uses the gradient detector $D_{xy}(I)=\sum_{i\in[x,x+l],j\in[y,y+l]}\nabla_{ij}I$ as its example. The baseline image $I'$ is a blurred image, that is, the LR input with its high-frequency components removed by a blur convolution: $I'=w(\sigma)\otimes I$. The goal is then the attribution map of $D(F(I))$. Take a path $\gamma(\alpha):[0,1]\rightarrow\mathbb{R}^{h\times w}$ with $\gamma(0)=I'$ and $\gamma(1)=I$. The $i$-th element of the attribution map is $LAM_{F,D}(\gamma)_i:=\int_0^1\frac{\partial D(F(\gamma(\alpha)))}{\partial \gamma(\alpha)_i}\times \frac{\partial \gamma(\alpha)_i}{\partial \alpha}d\alpha$. The paper uses a specially designed progressive blurring path function $\gamma_{pb}(\alpha) = w(\sigma-\alpha\sigma)\otimes I$. The attribution map is finally computed with an approximation of this integral:
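One discretization consistent with the integral definition above is a right-endpoint Riemann sum over $m$ equal steps along $\gamma_{pb}$. The indexing below is an assumption derived from that integral and may differ from the exact convention used in the paper:

```latex
LAM_{F,D}(\gamma_{pb})_i \approx \sum_{k=1}^{m}
  \frac{\partial D\big(F(\gamma_{pb}(\tfrac{k}{m}))\big)}{\partial \gamma_{pb}(\tfrac{k}{m})_i}
  \cdot \Big(\gamma_{pb}(\tfrac{k}{m}) - \gamma_{pb}(\tfrac{k-1}{m})\Big)_i
```

Here the difference $\gamma_{pb}(\tfrac{k}{m}) - \gamma_{pb}(\tfrac{k-1}{m})$ replaces $\frac{\partial \gamma(\alpha)_i}{\partial \alpha}d\alpha$ on each step.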
 
$m$ is the number of steps used to approximate the integral with a sum.
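The pieces described above (gradient detector, progressive blurring path, and an $m$-step sum) can be put together in a small NumPy sketch. Everything concrete here is an assumption for illustration: the "SR network" is a plain nearest-neighbor upsampler, the detector sums squared finite differences inside the patch (a smooth stand-in for the gradient detector), gradients are taken by central finite differences instead of backpropagation, and each step averages the gradients at its two endpoints (trapezoidal rule) rather than using a single endpoint.

```python
import numpy as np

def gaussian_kernel(sigma, radius=4):
    """1-D Gaussian kernel; collapses to a unit impulse as sigma -> 0."""
    t = np.arange(-radius, radius + 1, dtype=float)
    k = (t == 0).astype(float) if sigma < 1e-6 else np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur w(sigma) (x) I with edge padding."""
    k = gaussian_kernel(sigma)
    padded = np.pad(img, len(k) // 2, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gradient_detector(hr, row, col, l):
    """Assumed detector: sum of squared finite differences inside an l x l patch."""
    patch = hr[row:row + l, col:col + l]
    return (np.diff(patch, axis=0) ** 2).sum() + (np.diff(patch, axis=1) ** 2).sum()

def numerical_grad(f, x, eps=1e-3):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        g[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return g

def local_attribution_map(lr, sr_model, detector, sigma=2.0, m=20):
    """m-step approximation of the path integral along gamma_pb (trapezoidal rule)."""
    f = lambda img: detector(sr_model(img))
    alphas = np.linspace(0.0, 1.0, m + 1)
    path = [blur(lr, sigma * (1.0 - a)) for a in alphas]  # gamma_pb(alpha)
    grads = [numerical_grad(f, p) for p in path]
    lam = np.zeros_like(lr)
    for k in range(1, m + 1):
        lam += 0.5 * (grads[k] + grads[k - 1]) * (path[k] - path[k - 1])
    return lam, path

rng = np.random.default_rng(0)
lr = rng.random((8, 8))                              # hypothetical LR input
s, l = 2, 6                                          # scale factor and patch size
toy_sr = lambda img: np.kron(img, np.ones((s, s)))   # toy stand-in for F
detector = lambda hr: gradient_detector(hr, row=4, col=4, l=l)

lam, path = local_attribution_map(lr, toy_sr, detector)
print(np.round(lam, 3))
```

With this toy upsampler, only the LR pixels lying directly under the chosen HR patch can receive non-zero attribution. A real SR network with a larger receptive field would spread attribution over more input pixels, which is the quantity LAM is meant to expose.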