[PyTorch in Depth] Gradient flow through transforms.functional

Environment:

torch.__version__
Out[3]: '1.12.1+cu113'

First, a quick test:

import torch
from torchvision.transforms import functional as F
from torch.autograd import Function
img = torch.randn(1, 3, 224, 224)
startpoints = torch.FloatTensor([[0., 0.], [0., 224.], [224., 0.], [224., 224.]])
endpoints = torch.FloatTensor([[0., 0.], [0., 200.], [200., 0.], [200., 200.]])
t = F.perspective(img, startpoints, endpoints)
print(t.requires_grad)

This prints False: the output carries no gradient.
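Before digging into the source, it helps to remember that autograd only tracks an output when at least one input requires grad, and `img`, `startpoints` and `endpoints` above are all created without `requires_grad`. A minimal sketch in plain PyTorch (the tensor names here are illustrative, not from the test above):

```python
import torch

# Like img above: created without requires_grad, so nothing is tracked.
x = torch.randn(1, 3, 8, 8)
y = x * 2.0
print(y.requires_grad)          # False

# Same operation, but the input is tracked by autograd.
x_tracked = torch.randn(1, 3, 8, 8, requires_grad=True)
y_tracked = x_tracked * 2.0
print(y_tracked.requires_grad)  # True
```

So the interesting question is not this first False, but whether the output is tracked once the points themselves require grad.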

Look at the source of torchvision.transforms.functional.perspective. It is the public entry point of the transform and handles input types and similar concerns:

def perspective(
    img: Tensor,
    startpoints: List[List[int]],
    endpoints: List[List[int]],
    interpolation: InterpolationMode = InterpolationMode.BILINEAR,
    fill: Optional[List[float]] = None,
) -> Tensor:
    """Perform perspective transform of the given image.
    If the image is torch Tensor, it is expected
    to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.

    Args:
        img (PIL Image or Tensor): Image to be transformed.
        startpoints (list of list of ints): List containing four lists of two integers corresponding to four corners
            ``[top-left, top-right, bottom-right, bottom-left]`` of the original image.
        endpoints (list of list of ints): List containing four lists of two integers corresponding to four corners
            ``[top-left, top-right, bottom-right, bottom-left]`` of the transformed image.
        interpolation (InterpolationMode): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.BILINEAR``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
            For backward compatibility integer values (e.g. ``PIL.Image[.Resampling].NEAREST``) are still accepted,
            but deprecated since 0.13 and will be removed in 0.15. Please use InterpolationMode enum.
        fill (sequence or number, optional): Pixel fill value for the area outside the transformed
            image. If given a number, the value is used for all bands respectively.

            .. note::
                In torchscript mode single int/float value is not supported, please use a sequence
                of length 1: ``[value, ]``.

    Returns:
        PIL Image or Tensor: transformed Image.
    """
    if not torch.jit.is_scripting() and not torch.jit.is_tracing():
        _log_api_usage_once(perspective)

    coeffs = _get_perspective_coeffs(startpoints, endpoints)

    # Backward compatibility with integer value
    if isinstance(interpolation, int):
        warnings.warn(
            "Argument 'interpolation' of type int is deprecated since 0.13 and will be removed in 0.15. "
            "Please use InterpolationMode enum."
        )
        interpolation = _interpolation_modes_from_int(interpolation)

    if not isinstance(interpolation, InterpolationMode):
        raise TypeError("Argument interpolation should be a InterpolationMode")

    if not isinstance(img, torch.Tensor):
        pil_interpolation = pil_modes_mapping[interpolation]
        return F_pil.perspective(img, coeffs, interpolation=pil_interpolation, fill=fill)

    return F_t.perspective(img, coeffs, interpolation=interpolation.value, fill=fill)
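The first thing `perspective` does with the points is hand them to `_get_perspective_coeffs`, and the result is later passed around as a plain list of floats, so any link back to the points is gone at that step. As a rough sketch of what a differentiable replacement could look like, the function below solves the same kind of 8x8 linear system using only tensor operations. The name `perspective_coeffs`, the use of `torch.linalg.solve`, and the float64 dtype are my own assumptions for illustration; this is not torchvision's code.

```python
import torch


def perspective_coeffs(startpoints: torch.Tensor, endpoints: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: solve for the 8 coefficients that map each
    endpoint (output pixel) back to its startpoint (input pixel)."""
    rows = []
    for (ex, ey), (sx, sy) in zip(endpoints, startpoints):
        one, zero = torch.ones_like(ex), torch.zeros_like(ex)
        # Two equations per point pair, built with torch.stack so the
        # autograd graph back to the points is preserved.
        rows.append(torch.stack([ex, ey, one, zero, zero, zero, -sx * ex, -sx * ey]))
        rows.append(torch.stack([zero, zero, zero, ex, ey, one, -sy * ex, -sy * ey]))
    a_matrix = torch.stack(rows)        # shape (8, 8)
    b_vector = startpoints.reshape(8)   # shape (8,)
    return torch.linalg.solve(a_matrix, b_vector)


# float64 is an assumption made here to keep the solve numerically clean.
startpoints = torch.tensor(
    [[0.0, 0.0], [0.0, 224.0], [224.0, 0.0], [224.0, 224.0]], dtype=torch.float64
)
endpoints = torch.tensor(
    [[0.0, 0.0], [0.0, 200.0], [200.0, 0.0], [200.0, 200.0]],
    dtype=torch.float64,
    requires_grad=True,
)
coeffs = perspective_coeffs(startpoints, endpoints)
coeffs.sum().backward()
print(coeffs.requires_grad, endpoints.grad.shape)
```

For these points the solution is a pure scaling by 224 / 200 = 1.12, and because the result stays a tensor, `backward()` reaches `endpoints`.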

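Next, look at torchvision.transforms.functional_tensor.perspective, the tensor version of the functional interface. It does two things:

It calls _perspective_grid to generate the sampling grid for the perspective transform.

It calls _apply_grid_transform to apply that grid to the original image, which performs the actual warp.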

def perspective(
    img: Tensor, perspective_coeffs: List[float], interpolation: str = "bilinear", fill: Optional[List[float]] = None
) -> Tensor:
    if not (isinstance(img, torch.Tensor)):
        raise TypeError("Input img should be Tensor.")

    _assert_image_tensor(img)

    _assert_grid_transform_inputs(
        img,
        matrix=None,
        interpolation=interpolation,
        fill=fill,
        supported_interpolation_modes=["nearest", "bilinear"],
        coeffs=perspective_coeffs,
    )

    ow, oh = img.shape[-1], img.shape[-2]
    dtype = img.dtype if torch.is_floating_point(img) else torch.float32
    grid = _perspective_grid(perspective_coeffs, ow=ow, oh=oh, dtype=dtype, device=img.device)
    return _apply_grid_transform(img, grid, interpolation, fill=fill)

Here, _apply_grid_transform warps the image on the given grid. It casts the input image to the grid's dtype, samples it with grid_sample, and handles the fill color.

def _apply_grid_transform(img: Tensor, grid: Tensor, mode: str, fill: Optional[List[float]]) -> Tensor:

    img, need_cast, need_squeeze, out_dtype = _cast_squeeze_in(img, [grid.dtype])

    if img.shape[0] > 1:
        # Apply same grid to a batch of images
        grid = grid.expand(img.shape[0], grid.shape[1], grid.shape[2], grid.shape[3])

    # Append a dummy mask for customized fill colors, should be faster than grid_sample() twice
    if fill is not None:
        dummy = torch.ones((img.shape[0], 1, img.shape[2], img.shape[3]), dtype=img.dtype, device=img.device)
        img = torch.cat((img, dummy), dim=1)

    img = grid_sample(img, grid, mode=mode, padding_mode="zeros", align_corners=False)

    # Fill with required color
    if fill is not None:
        mask = img[:, -1:, :, :]  # N * 1 * H * W
        img = img[:, :-1, :, :]  # N * C * H * W
        mask = mask.expand_as(img)
        len_fill = len(fill) if isinstance(fill, (tuple, list)) else 1
        fill_img = torch.tensor(fill, dtype=img.dtype, device=img.device).view(1, len_fill, 1, 1).expand_as(img)
        if mode == "nearest":
            mask = mask < 0.5
            img[mask] = fill_img[mask]
        else:  # 'bilinear'
            img = img * mask + (1.0 - mask) * fill_img

    img = _cast_squeeze_out(img, need_cast, need_squeeze, out_dtype)
    return img
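The "dummy mask" trick in this function is easy to miss: an extra all-ones channel is sampled together with the image, so wherever the grid points outside the image the zero padding turns that channel into 0, and it can then be used to blend in the fill color. A stripped-down sketch of the bilinear branch only; `sample_with_fill` is a hypothetical name, and the dtype casting and the "nearest" branch are left out:

```python
import torch
import torch.nn.functional as nnF


def sample_with_fill(img: torch.Tensor, grid: torch.Tensor, fill) -> torch.Tensor:
    """Hypothetical, simplified version of the bilinear fill path."""
    n, _, h, w = img.shape
    # Extra channel of ones; after sampling it is 1 inside the image, 0 outside.
    ones = torch.ones((n, 1, h, w), dtype=img.dtype, device=img.device)
    sampled = nnF.grid_sample(
        torch.cat((img, ones), dim=1), grid,
        mode="bilinear", padding_mode="zeros", align_corners=False,
    )
    mask = sampled[:, -1:, :, :]
    out = sampled[:, :-1, :, :]
    fill_img = torch.tensor(fill, dtype=img.dtype, device=img.device).view(1, -1, 1, 1)
    return out * mask + (1.0 - mask) * fill_img


img = torch.rand(1, 3, 4, 4)

# Every grid location lies far outside [-1, 1], so only the fill color remains.
outside = torch.full((1, 4, 4, 2), 5.0)
filled = sample_with_fill(img, outside, [0.5, 0.5, 0.5])

# An identity grid samples every pixel center, so the image comes back unchanged.
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
identity = nnF.affine_grid(theta, size=[1, 3, 4, 4], align_corners=False)
same = sample_with_fill(img, identity, [0.5, 0.5, 0.5])
```

Nothing in this step blocks gradients: grid_sample is differentiable with respect to both the image and the grid, which is why the search for the break moves on to how the grid is built.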

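That brings us to how the grid is generated.

_perspective_grid builds the perspective sampling grid. Its inputs are the perspective coefficients (coeffs) and the width (ow) and height (oh) of the output image. theta1 and theta2 are used to compute the transformed x and y coordinates: theta1 holds the numerator coefficients and theta2 the denominator coefficients. A base grid (base_grid) containing every pixel position of the output image is multiplied with theta1 and theta2 (via bmm) to produce the transformed positions. Note that theta1 and theta2 are created with torch.tensor from Python floats, so they are new leaf tensors with no connection to the original points.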

def _perspective_grid(coeffs: List[float], ow: int, oh: int, dtype: torch.dtype, device: torch.device) -> Tensor:
    # https://github.com/python-pillow/Pillow/blob/4634eafe3c695a014267eefdce830b4a825beed7/src/libImaging/Geometry.c#L394
    #
    # x_out = (coeffs[0] * x + coeffs[1] * y + coeffs[2]) / (coeffs[6] * x + coeffs[7] * y + 1)
    # y_out = (coeffs[3] * x + coeffs[4] * y + coeffs[5]) / (coeffs[6] * x + coeffs[7] * y + 1)
    #
    theta1 = torch.tensor(
        [[[coeffs[0], coeffs[1], coeffs[2]], [coeffs[3], coeffs[4], coeffs[5]]]], dtype=dtype, device=device
    )
    theta2 = torch.tensor([[[coeffs[6], coeffs[7], 1.0], [coeffs[6], coeffs[7], 1.0]]], dtype=dtype, device=device)

    d = 0.5
    base_grid = torch.empty(1, oh, ow, 3, dtype=dtype, device=device)
    x_grid = torch.linspace(d, ow * 1.0 + d - 1.0, steps=ow, device=device)
    base_grid[..., 0].copy_(x_grid)
    y_grid = torch.linspace(d, oh * 1.0 + d - 1.0, steps=oh, device=device).unsqueeze_(-1)
    base_grid[..., 1].copy_(y_grid)
    base_grid[..., 2].fill_(1)

    rescaled_theta1 = theta1.transpose(1, 2) / torch.tensor([0.5 * ow, 0.5 * oh], dtype=dtype, device=device)
    output_grid1 = base_grid.view(1, oh * ow, 3).bmm(rescaled_theta1)
    output_grid2 = base_grid.view(1, oh * ow, 3).bmm(theta2.transpose(1, 2))

    output_grid = output_grid1 / output_grid2 - 1.0
    return output_grid.view(1, oh, ow, 2)
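Written out as formulas, the grid computation above looks as follows. The symbols are my own notation: $c_0,\dots,c_7$ stand for `coeffs[0]` to `coeffs[7]`, $(x, y)$ is an output pixel center (so $x = i + 0.5$, $y = j + 0.5$), and $(g_x, g_y)$ is the normalized coordinate handed to grid_sample.

```latex
\[
x_{\mathrm{out}} = \frac{c_0 x + c_1 y + c_2}{c_6 x + c_7 y + 1},
\qquad
y_{\mathrm{out}} = \frac{c_3 x + c_4 y + c_5}{c_6 x + c_7 y + 1}
\]
\[
g_x = \frac{x_{\mathrm{out}}}{0.5 \cdot ow} - 1,
\qquad
g_y = \frac{y_{\mathrm{out}}}{0.5 \cdot oh} - 1
\]
% Worked check with identity coefficients (c_0 = c_4 = 1, all others 0),
% ow = 224 and the first pixel center x = 0.5:
\[
g_x = \frac{0.5}{112} - 1 \approx -0.9955
\]
```

Every step here is a ratio of linear functions of the coefficients, so the mapping itself is differentiable with respect to them; whether gradients actually flow depends on how the coefficients enter the computation.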

Further reading (the Pillow implementation referenced in the source comment):

https://github.com/python-pillow/Pillow/blob/4634eafe3c695a014267eefdce830b4a825beed7/src/libImaging/Geometry.c#L394

A differentiable approach

https://github.com/pytorch/vision/pull/7925

Modify torchvision.transforms.functional_tensor._perspective_grid:

    theta1 = torch.tensor([[[coeffs[0], coeffs[1], coeffs[2]], [coeffs[3], coeffs[4], coeffs[5]]]], dtype=dtype, device=device)
    theta2 = torch.tensor([[[coeffs[6], coeffs[7], 1.0], [coeffs[6], coeffs[7], 1.0]]], dtype=dtype, device=device)
    # Replace the two lines above with the following, in an attempt to make them differentiable
    theta1 = torch.reshape(coeffs[:6], (1, 2, 3)).to(dtype=dtype, device=device)
    theta2 = torch.reshape(torch.cat((coeffs[6:], torch.ones(1, device=coeffs.device))), (1, 1, 3)).expand(-1, 2, -1).to(dtype=dtype, device=device)

After this change, an error is raised:

C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional.py:629: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  b_matrix = torch.tensor(startpoints, dtype=torch.float).view(8)
Traceback (most recent call last):
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\IPython\core\interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-087b1cddce7b>", line 1, in <module>
    F.perspective(img, startpoints, endpoints)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional.py", line 688, in perspective
    return F_t.perspective(img, coeffs, interpolation=interpolation.value, fill=fill)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional_tensor.py", line 748, in perspective
    grid = _perspective_grid(perspective_coeffs, ow=ow, oh=oh, dtype=dtype, device=img.device)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional_tensor.py", line 709, in _perspective_grid
    theta1 = torch.reshape(coeffs[:6], (1, 2, 3)).to(dtype=dtype, device=device)
TypeError: reshape(): argument 'input' (position 1) must be Tensor, not list

Change it to:

theta1 = torch.reshape(torch.tensor(coeffs[:6]), (1, 2, 3)).to(dtype=dtype, device=device)

This raises another error:

C:\conda\envs\CUDA110_torch\python.exe D:\myGit\ipad_attack\mydata\test_torch自带的transform丢失梯度问题.py 
C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional.py:629: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  b_matrix = torch.tensor(startpoints, dtype=torch.float).view(8)
Traceback (most recent call last):
  File "D:\myGit\ipad_attack\mydata\test_torch自带的transform丢失梯度问题.py", line 30, in <module>
    t = F.perspective(img, startpoints, endpoints)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional.py", line 688, in perspective
    return F_t.perspective(img, coeffs, interpolation=interpolation.value, fill=fill)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional_tensor.py", line 748, in perspective
    grid = _perspective_grid(perspective_coeffs, ow=ow, oh=oh, dtype=dtype, device=img.device)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional_tensor.py", line 711, in _perspective_grid
    (coeffs[6:], torch.ones(1, device=coeffs.device))), (1, 1, 3)).expand(-1, 2, -1).to(dtype=dtype, device=device)
AttributeError: 'list' object has no attribute 'device'

Process finished with exit code 1
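Both errors point at the same fact: by the time `coeffs` reaches `_perspective_grid` it is a plain Python list. Wrapping the list in `torch.tensor(...)` silences the type error, but it cannot restore the autograd graph, because a list of floats remembers nothing about the tensors it came from. A tiny sketch of that round trip (the variable names are made up for illustration):

```python
import torch

points = torch.tensor([0.0, 200.0], requires_grad=True)
scaled = points * 1.12            # still connected to `points`
as_list = scaled.tolist()         # plain Python floats, graph is gone
rebuilt = torch.tensor(as_list)   # brand-new leaf tensor

print(scaled.requires_grad, scaled.grad_fn is not None)    # True True
print(rebuilt.requires_grad, rebuilt.grad_fn is not None)  # False False
```

So any fix that only touches `_perspective_grid` comes too late; the coefficients would have to stay tensors all the way from the points.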

Printing coeffs shows that it is a plain Python list of floats:

[1.1200001239776611,2.2111888142717362e-07,-2.181137097068131e-05,-2.3566984452827455e-07,1.1200006008148193,9.300353667640593e-06,-5.271598157996493e-10,1.9876917889405377e-09]

Change it to:

Run the test code:

import torch
from torchvision.transforms import functional as F
from torch.autograd import Function
img = torch.randn(1, 3, 224, 224)
startpoints = torch.tensor([[0., 0.], [0., 224.], [224., 0.], [224., 224.]], requires_grad=True)
endpoints = torch.tensor([[0., 0.], [0., 200.], [200., 0.], [200., 200.]], requires_grad=True)
print(startpoints.requires_grad)
t = F.perspective(img, startpoints, endpoints)
y = t**3
y.backward()
print(t.requires_grad,y)

The result:

C:\conda\envs\CUDA110_torch\lib\site-packages\torchvision\transforms\functional.py:629: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  b_matrix = torch.tensor(startpoints, dtype=torch.float).view(8)
Traceback (most recent call last):
  File "D:\myGit\ipad_attack\mydata\test_torch自带的transform丢失梯度问题.py", line 33, in <module>
    t.backward()
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "C:\conda\envs\CUDA110_torch\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[12], line 11
      9 t = F.perspective(img, startpoints, endpoints)
     10 y = t**3
---> 11 y.backward()
     12 print(t.requires_grad,y)

File C:\conda\envs\CUDA110_torch\lib\site-packages\torch\_tensor.py:396, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    387 if has_torch_function_unary(self):
    388     return handle_torch_function(
    389         Tensor.backward,
    390         (self,),
   (...)
    394         create_graph=create_graph,
    395         inputs=inputs)
--> 396 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)

File C:\conda\envs\CUDA110_torch\lib\site-packages\torch\autograd\__init__.py:173, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    168     retain_graph = create_graph
    170 # The reason we repeat same the comment below is that
    171 # some Python versions print out the first line of a multi-line function
    172 # calls in the traceback and some print out the last line
--> 173 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    174     tensors, grad_tensors_, retain_graph, create_graph, inputs,
    175     allow_unreachable=True, accumulate_grad=True)

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
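The error says the output has no `grad_fn`, i.e. nothing on the path from the points to the output was tracked. For comparison, here is a self-contained sketch of a warp that does keep the graph, written in plain PyTorch rather than by patching torchvision. It assumes the eight coefficients are already available as a tensor; `warp_perspective` is a hypothetical name, and the 3x3 matrix formulation is my own choice, not the library's implementation.

```python
import torch
import torch.nn.functional as nnF


def warp_perspective(img: torch.Tensor, coeffs: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: warp img (N, C, H, W) with 8 coefficients kept as a tensor."""
    n, _, oh, ow = img.shape
    # 3x3 matrix [[c0, c1, c2], [c3, c4, c5], [c6, c7, 1]], built with
    # differentiable ops only (cat/reshape), never via a Python list.
    matrix = torch.cat((coeffs, coeffs.new_ones(1))).reshape(3, 3)
    # Output pixel centers in homogeneous coordinates (x, y, 1).
    xs = torch.arange(ow, dtype=img.dtype, device=img.device) + 0.5
    ys = torch.arange(oh, dtype=img.dtype, device=img.device) + 0.5
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    base = torch.stack((xx, yy, torch.ones_like(xx)), dim=-1)   # (oh, ow, 3)
    mapped = base @ matrix.T                                    # (oh, ow, 3)
    src = mapped[..., :2] / mapped[..., 2:]                     # source pixel coordinates
    # Normalize to [-1, 1] as expected by grid_sample with align_corners=False.
    scale = torch.tensor([0.5 * ow, 0.5 * oh], dtype=img.dtype, device=img.device)
    grid = (src / scale - 1.0).unsqueeze(0).expand(n, -1, -1, -1)
    return nnF.grid_sample(img, grid, mode="bilinear", padding_mode="zeros", align_corners=False)


img = torch.randn(1, 3, 32, 32)
coeffs = torch.tensor([1.12, 0.0, 0.0, 0.0, 1.12, 0.0, 0.0, 0.0], requires_grad=True)
out = warp_perspective(img, coeffs)
loss = out.pow(3).sum()   # backward() needs a scalar, so reduce first
loss.backward()
print(out.requires_grad, coeffs.grad.shape)
```

The `.sum()` matters as well: calling `backward()` directly on a non-scalar tensor such as `t**3` would raise a different error even once the graph is intact.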
