pytorch中的torch.nn.Linear

  torch.nn.Linear是pytorch中的线性层,应该是最常见的网络层了,官方文档:torch.nn.Linear。

torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)

  其中,in_features表示输入的维度;out_features表示输出的维度;bias表示是否包含偏置,默认为True。
  nn.linear的作用其实就是对输入进行了一个线性变换,中学时我们学习的线性变换是y=kx+b,但是对于神经网络来说,我们的输入、输出和权重都是一个矩阵,即: o u t p u t = i n p u t ∗ W + b output=input*W+b output=inputW+b  其中, i n p u t ∈ R n × i input\in R^{n×i} inputRn×i W ∈ R i × o W\in R^{i×o} WRi×o o u t p u t ∈ R n × o output\in R^{n×o} outputRn×o,n为输入向量的行数(通常为batch数),i为输入神经元的个数,o为输出神经元的个数。使用举例:

FC = nn.Linear(20, 40)
input = torch.randn(128, 20) # (128,20)
output = FC(input)
print(output.size())  # (128,40)

  官方源码:

import mathimport torch
from torch import Tensor
from torch.nn.parameter import Parameter, UninitializedParameter
from .. import functional as F
from .. import init
from .module import Module
from .lazy import LazyModuleMixinclass Identity(Module):r"""A placeholder identity operator that is argument-insensitive.Args:args: any argument (unused)kwargs: any keyword argument (unused)Shape:- Input: :math:`(*)`, where :math:`*` means any number of dimensions.- Output: :math:`(*)`, same shape as the input.Examples::>>> m = nn.Identity(54, unused_argument1=0.1, unused_argument2=False)>>> input = torch.randn(128, 20)>>> output = m(input)>>> print(output.size())torch.Size([128, 20])"""def __init__(self, *args, **kwargs):super(Identity, self).__init__()def forward(self, input: Tensor) -> Tensor:return inputclass Linear(Module):r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`This module supports :ref:`TensorFloat32<tf32_on_ampere>`.Args:in_features: size of each input sampleout_features: size of each output samplebias: If set to ``False``, the layer will not learn an additive bias.Default: ``True``Shape:- Input: :math:`(*, H_{in})` where :math:`*` means any number ofdimensions including none and :math:`H_{in} = \text{in\_features}`.- Output: :math:`(*, H_{out})` where all but the last dimensionare the same shape as the input and :math:`H_{out} = \text{out\_features}`.Attributes:weight: the learnable weights of the module of shape:math:`(\text{out\_features}, \text{in\_features})`. The values areinitialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where:math:`k = \frac{1}{\text{in\_features}}`bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.If :attr:`bias` is ``True``, the values are initialized from:math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where:math:`k = \frac{1}{\text{in\_features}}`Examples::>>> m = nn.Linear(20, 30)>>> input = torch.randn(128, 20)>>> output = m(input)>>> print(output.size())torch.Size([128, 30])"""__constants__ = ['in_features', 'out_features']in_features: intout_features: intweight: Tensordef __init__(self, in_features: int, out_features: int, bias: bool = True,device=None, dtype=None) -> None:factory_kwargs = {'device': device, 'dtype': dtype}super(Linear, self).__init__()self.in_features = in_featuresself.out_features = out_featuresself.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))if bias:self.bias = Parameter(torch.empty(out_features, **factory_kwargs))else:self.register_parameter('bias', None)self.reset_parameters()def reset_parameters(self) -> None:# Setting a=sqrt(5) in kaiming_uniform is the same as initializing with# uniform(-1/sqrt(in_features), 1/sqrt(in_features)). For details, see# https://github.com/pytorch/pytorch/issues/57109init.kaiming_uniform_(self.weight, a=math.sqrt(5))if self.bias is not None:fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0init.uniform_(self.bias, -bound, bound)def forward(self, input: Tensor) -> Tensor:return F.linear(input, self.weight, self.bias)def extra_repr(self) -> str:return 'in_features={}, out_features={}, bias={}'.format(self.in_features, self.out_features, self.bias is not None)# This class exists solely to avoid triggering an obscure error when scripting
# an improperly quantized attention layer. See this issue for details:
# https://github.com/pytorch/pytorch/issues/58969
# TODO: fail fast on quantization API usage error, then remove this class
# and replace uses of it with plain Linear
class NonDynamicallyQuantizableLinear(Linear):def __init__(self, in_features: int, out_features: int, bias: bool = True,device=None, dtype=None) -> None:super().__init__(in_features, out_features, bias=bias,device=device, dtype=dtype)[docs]class Bilinear(Module):r"""Applies a bilinear transformation to the incoming data::math:`y = x_1^T A x_2 + b`Args:in1_features: size of each first input samplein2_features: size of each second input sampleout_features: size of each output samplebias: If set to False, the layer will not learn an additive bias.Default: ``True``Shape:- Input1: :math:`(*, H_{in1})` where :math:`H_{in1}=\text{in1\_features}` and:math:`*` means any number of additional dimensions including none. All but the last dimensionof the inputs should be the same.- Input2: :math:`(*, H_{in2})` where :math:`H_{in2}=\text{in2\_features}`.- Output: :math:`(*, H_{out})` where :math:`H_{out}=\text{out\_features}`and all but the last dimension are the same shape as the input.Attributes:weight: the learnable weights of the module of shape:math:`(\text{out\_features}, \text{in1\_features}, \text{in2\_features})`.The values are initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where:math:`k = \frac{1}{\text{in1\_features}}`bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.If :attr:`bias` is ``True``, the values are initialized from:math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where:math:`k = \frac{1}{\text{in1\_features}}`Examples::>>> m = nn.Bilinear(20, 30, 40)>>> input1 = torch.randn(128, 20)>>> input2 = torch.randn(128, 30)>>> output = m(input1, input2)>>> print(output.size())torch.Size([128, 40])"""__constants__ = ['in1_features', 'in2_features', 'out_features']in1_features: intin2_features: intout_features: intweight: Tensordef __init__(self, in1_features: int, in2_features: int, out_features: int, bias: bool = True,device=None, dtype=None) -> None:factory_kwargs = {'device': device, 'dtype': dtype}super(Bilinear, self).__init__()self.in1_features = in1_featuresself.in2_features = in2_featuresself.out_features = out_featuresself.weight = Parameter(torch.empty((out_features, in1_features, in2_features), **factory_kwargs))if bias:self.bias = Parameter(torch.empty(out_features, **factory_kwargs))else:self.register_parameter('bias', None)self.reset_parameters()def reset_parameters(self) -> None:bound = 1 / math.sqrt(self.weight.size(1))init.uniform_(self.weight, -bound, bound)if self.bias is not None:init.uniform_(self.bias, -bound, bound)def forward(self, input1: Tensor, input2: Tensor) -> Tensor:return F.bilinear(input1, input2, self.weight, self.bias)def extra_repr(self) -> str:return 'in1_features={}, in2_features={}, out_features={}, bias={}'.format(self.in1_features, self.in2_features, self.out_features, self.bias is not None)class LazyLinear(LazyModuleMixin, Linear):r"""A :class:`torch.nn.Linear` module where `in_features` is inferred.In this module, the `weight` and `bias` are of :class:`torch.nn.UninitializedParameter`class. They will be initialized after the first call to ``forward`` is done and themodule will become a regular :class:`torch.nn.Linear` module. The ``in_features`` argumentof the :class:`Linear` is inferred from the ``input.shape[-1]``.Check the :class:`torch.nn.modules.lazy.LazyModuleMixin` for further documentationon lazy modules and their limitations.Args:out_features: size of each output samplebias: If set to ``False``, the layer will not learn an additive bias.Default: ``True``Attributes:weight: the learnable weights of the module of shape:math:`(\text{out\_features}, \text{in\_features})`. The values areinitialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where:math:`k = \frac{1}{\text{in\_features}}`bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.If :attr:`bias` is ``True``, the values are initialized from:math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where:math:`k = \frac{1}{\text{in\_features}}`"""cls_to_become = Linear  # type: ignore[assignment]weight: UninitializedParameterbias: UninitializedParameter  # type: ignore[assignment]def __init__(self, out_features: int, bias: bool = True,device=None, dtype=None) -> None:factory_kwargs = {'device': device, 'dtype': dtype}# bias is hardcoded to False to avoid creating tensor# that will soon be overwritten.super().__init__(0, 0, False)self.weight = UninitializedParameter(**factory_kwargs)self.out_features = out_featuresif bias:self.bias = UninitializedParameter(**factory_kwargs)def reset_parameters(self) -> None:if not self.has_uninitialized_params() and self.in_features != 0:super().reset_parameters()def initialize_parameters(self, input) -> None:  # type: ignore[override]if self.has_uninitialized_params():with torch.no_grad():self.in_features = input.shape[-1]self.weight.materialize((self.out_features, self.in_features))if self.bias is not None:self.bias.materialize((self.out_features,))self.reset_parameters()
# TODO: PartialLinear - maybe in sparse?

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/783884.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

大模型LLM论文整理

Gemini&#xff1a;一族功能强大的多模态模 论文名称&#xff1a;Gemini: A Family of Highly Capable Multimodal Models 论文地址&#xff1a;https://arxiv.org/pdf/2312.11805 会议&#xff1a; 论文方法&#xff1a;该论文介绍了一种新的多模态模型系列&#xff0c;Gem…

C++心决之命名空间、重载函数和引用

目录 1. C关键字(C98) 2. 命名空间 2.1 命名空间定义 2.2 命名空间使用 3. C输入&输出 4. 缺省参数 4.1 缺省参数概念 4.2 缺省参数分类 5. 函数重载 5.1 函数重载概念 5.2 C支持函数重载的原理--名字修饰(name Mangling) 6. 引用 6.1 引用概念 6.2 引用特性…

代码随想录第九天: 字符串完结

语言: Java 参考资料: 代码随想录、ChatGPT3.5 28. 实现 strStr() 力扣题目链接(opens new window) 实现 strStr() 函数。 给定一个 haystack 字符串和一个 needle 字符串&#xff0c;在 haystack 字符串中找出 needle 字符串出现的第一个位置 (从0开始)。如果不存在&#…

问题大全——C语言及数据结构篇(自用)

目录 printf函数中那些%d %f等等的作用 运算符的优先级 选择排序 冒泡排序 快速排序 折半查找 归并排序 strcat strcpy scanf可以限制输入格式么 c语言实现链式队列&#xff0c;输入数字入队&#xff0c;输入字符出队。 统计一个范围内的素数 打印水仙花数 打印杨…

Android Q(10)黑暗模式适配的实现

一、引言 随着 AndroidQ&#xff08;10&#xff09;的发布&#xff0c;黑暗模式成为了系统级别的特性。为了满足用户在不同环境下的使用需求&#xff0c;应用程序需要及时进行黑暗模式的适配。本文将详细介绍如何在 AndroidQ&#xff08;10&#xff09;上实现黑暗模式的适配&a…

【架构篇】看完这篇,不要再说你不会性能调优了

调优方向 系统调优是一个复杂且多方面的任务&#xff0c;主要目标是提高系统的性能和效率。它可以针对不同的层面和组件进行&#xff0c;包括硬件、操作系统、网络、软件应用程序等。以下是一些常见的系统调优方面&#xff1a; 硬件优化&#xff1a; CPU性能优化&#xff1a;…

基于spark的大数据分析预测地震受灾情况的系统设计

基于spark的大数据分析预测地震受灾情况的系统设计 在本篇博客中,我们将介绍如何使用Apache Spark框架进行地震受灾情况的预测。我们将结合数据分析、特征工程、模型训练和评估等步骤,最终建立一个预测模型来预测地震造成的破坏程度,同时使用可视化大屏的方式展示数据的分布…

补代码随想录算法训练营第39天 | 62.不同路径、63. 不同路径 II

慢慢开始有点感觉了 初始化很巧妙 62.不同路径 本题大家掌握动态规划的方法就可以。 数论方法 有点非主流&#xff0c;很难想到。 代码随想录 视频讲解&#xff1a;动态规划中如何初始化很重要&#xff01;| LeetCode&#xff1a;62.不同路径_哔哩哔哩_bilibili 63. 不同路径…

DreamSim技术小结

paperhttps://arxiv.org/abs/2306.09344codehttps://github.com/ssundaram21/dreamsimorgMiT个人博客主页http://myhz0606.com/article/dream_sim 1 Motivation 目前较为成熟度量图片相似性的做法是通过模型将图片转为embedding&#xff0c;再用余弦相似度来度量相似性。虽然…

【数据分析面试】1. 计算年度收入百分比(SQL)

题目 你需要为公司的营收来源生成一份年度报告。计算截止目前为止&#xff0c;在表格中记录的第一年和最后一年所创造的总收入百分比。将百分比四舍五入到两位小数。 示例&#xff1a; 输入&#xff1a; annual_payments 表 列名类型amountINTEGERcreated_atDATETIMEstatusV…

Linux企业级别日志的查找

企业级别日志的查找 查看mysql数据库的日志错误日志&#xff08;Error Log&#xff09;查询日志&#xff08;General Query Log&#xff09;慢查询日志&#xff08;Slow Query Log&#xff09;事务日志&#xff08;Transaction Log&#xff09;二进制日志&#xff08;Binary Lo…

Thread 之start 和run 的区别

Java Thread 之start 和run 的区别 用start方法来启动线程&#xff0c;真正实现了多线程运行&#xff0c;这时无需等待run方法体代码执行完毕而直接继续执行下面的代码。通过调用Thread类的start()方法来启动一个线程&#xff0c;这时此线程处于就绪&#xff08;可运行&#x…

P1739 表达式括号匹配

题目:P1739 表达式括号匹配 - 洛谷 | 计算机科学教育新生态 (luogu.com.cn) 代码&#xff1a; #include<bits/stdc.h> using namespace std; int main() {char c;stack<char> s;while(cin>>c&&c!){if(c()s.push(c);if(c)){if(!s.empty())s.pop();e…

【MATLAB源码-第23期】基于matlab的短时傅里叶STFT信号变换仿真,得到信号的时频曲线图。

操作环境&#xff1a; MATLAB 2022a 1、算法描述 短时傅里叶变换&#xff08;Short-Time Fourier Transform&#xff0c;STFT&#xff09;是傅里叶变换的一种扩展&#xff0c;用于分析信号在时域和频域上的变化。描述如下&#xff1a; 1. **时域与频域分析**&#xff1a; …

【Chapter2】进程、线程与作业,计算机操作系统教程,第四版,左万利,王英

文章目录 [toc] 一、多道程序设计1.1单道程序设计的缺点1.2多道程序设计的提出1.3多道程序设计存在的问题 二、进程的引入2.1进程的概念2.2进程的组成2.2.1进程控制块2.2.2程序 2.3进程的类型及特征2.3.1进程的类型2.3.2进程的特征 2.4进程的状态及转换2.4.1进程的状态创建态就…

【对比golang和java的区别】

&#x1f308;个人主页:程序员不想敲代码啊 &#x1f3c6;CSDN优质创作者&#xff0c;CSDN实力新星&#xff0c;CSDN博客专家 &#x1f44d;点赞⭐评论⭐收藏 &#x1f91d;希望本文对您有所裨益&#xff0c;如有不足之处&#xff0c;欢迎在评论区提出指正&#xff0c;让我们共…

【LeetCode: 330. 按要求补齐数组 + 贪心 + 构造区间】

&#x1f680; 算法题 &#x1f680; &#x1f332; 算法刷题专栏 | 面试必备算法 | 面试高频算法 &#x1f340; &#x1f332; 越难的东西,越要努力坚持&#xff0c;因为它具有很高的价值&#xff0c;算法就是这样✨ &#x1f332; 作者简介&#xff1a;硕风和炜&#xff0c;…

Beans模块之工厂模块DisposableBean

博主介绍&#xff1a;✌全网粉丝5W&#xff0c;全栈开发工程师&#xff0c;从事多年软件开发&#xff0c;在大厂呆过。持有软件中级、六级等证书。可提供微服务项目搭建与毕业项目实战&#xff0c;博主也曾写过优秀论文&#xff0c;查重率极低&#xff0c;在这方面有丰富的经验…

JS:错误捕获(try...catch/window.onerror/window.addEventListener)

一、try...catch 1.在同步任务中 <script>let a 0;try {//要执行的代码console.log(b);} catch (e) {//如果有错误&#xff0c;执行这里的代码console.log(e); //ReferenceError: b is not defined}</script> 2.在异步任务中 2.1 promise中 <script>new …

【STM32嵌入式系统设计与开发】——12IWDG(独立看门狗应用)

这里写目录标题 一、任务描述二、任务实施1、ActiveBeep工程文件夹创建2、函数编辑&#xff08;1&#xff09;主函数编辑&#xff08;2&#xff09;USART1初始化函数(usart1_init())&#xff08;3&#xff09;USART数据发送函数&#xff08; USART1_Send_Data&#xff08;&…