All of the functions in the torch.nn.init module are intended for initializing neural network parameters, so they all run in torch.no_grad() mode and are not taken into account by autograd.
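A minimal sketch of this behavior (assuming only that PyTorch is installed): an in-place fill on a leaf tensor that requires grad would normally raise an error, but the init functions wrap the fill in torch.no_grad(), so it succeeds and leaves no autograd history.

import torch
from torch import nn

p = nn.Parameter(torch.empty(3, 5))  # leaf tensor, requires_grad=True by default

# A plain in-place fill would fail here:
#   p.uniform_(-1, 1)  -> RuntimeError: a leaf Variable that requires grad
#                         is being used in an in-place operation
# but nn.init functions run under torch.no_grad(), so this works:
nn.init.xavier_uniform_(p)
print(p.requires_grad)  # True: initialization leaves the autograd flag untouched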
torch.nn.init.xavier_uniform_ fills the input tensor with values drawn from a uniform distribution, following the method described by Glorot, X. and Bengio, Y. in *Understanding the Difficulty of Training Deep Feedforward Neural Networks*. The values of the resulting tensor are sampled from $U(-a, a)$, where:

$$a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$$
This method is also known as Glorot initialization.
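As a quick sanity check (using the same $3 \times 5$ shape as the example below): for a $3 \times 5$ weight matrix with $\text{gain} = 1$, we have $\text{fan\_in} = 5$ and $\text{fan\_out} = 3$, so

$$a = 1 \times \sqrt{\frac{6}{5 + 3}} = \sqrt{0.75} \approx 0.866$$

and every element of the tensor is drawn uniformly from $(-0.866, 0.866)$.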
Syntax
torch.nn.init.xavier_uniform_(tensor, gain=1)
Parameters
tensor: [Tensor] An $N$-dimensional torch.Tensor
gain: [float] An optional scaling factor
Return value
The input torch.Tensor, filled in place.
Example
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
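In practice the initializer is usually applied to a layer's weight matrix. A minimal sketch, assuming a standard nn.Linear layer (zeroing the bias is a common convention alongside Xavier initialization, not part of the scheme itself):

import torch
from torch import nn

layer = nn.Linear(5, 3)
# Reinitialize the weight with the Xavier/Glorot scheme, scaled for ReLU
nn.init.xavier_uniform_(layer.weight, gain=nn.init.calculate_gain('relu'))
# Zero the bias (common convention, not required by the method)
nn.init.zeros_(layer.bias)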
Implementation
def xavier_uniform_(tensor: Tensor, gain: float = 1.) -> Tensor:
    r"""Fills the input `Tensor` with values according to the method
    described in `Understanding the difficulty of training deep feedforward
    neural networks` - Glorot, X. & Bengio, Y. (2010), using a uniform
    distribution. The resulting tensor will have values sampled from
    :math:`\mathcal{U}(-a, a)` where

    .. math::
        a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        gain: an optional scaling factor

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
    """
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
    std = gain * math.sqrt(2.0 / float(fan_in + fan_out))
    a = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation

    return _no_grad_uniform_(tensor, -a, a)
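Here `_calculate_fan_in_and_fan_out` treats dimension 1 as the input-feature dimension and dimension 0 as the output-feature dimension, multiplying in the receptive-field size for convolutional weights. A simplified sketch of that computation (the function name fan_in_and_fan_out is hypothetical; this is not the library's exact code):

import torch
from torch import Tensor

def fan_in_and_fan_out(tensor: Tensor):
    # Simplified sketch of PyTorch's fan computation; needs at least 2 dims.
    if tensor.dim() < 2:
        raise ValueError("fan in/out requires a tensor with at least 2 dimensions")
    num_input_fmaps = tensor.size(1)   # e.g. in_features of a Linear weight
    num_output_fmaps = tensor.size(0)  # e.g. out_features of a Linear weight
    receptive_field_size = 1
    for s in tensor.shape[2:]:         # kernel dimensions of conv weights
        receptive_field_size *= s
    return num_input_fmaps * receptive_field_size, num_output_fmaps * receptive_field_size

print(fan_in_and_fan_out(torch.empty(3, 5)))         # (5, 3)
print(fan_in_and_fan_out(torch.empty(16, 8, 3, 3)))  # (72, 144)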