1. Converting a torch-trained yolov5 model to trt fails as follows:
Using CUDA device0 _CudaDeviceProperties(name='NVIDIA GeForce RTX 3080', total_memory=10017MB)
Find Pytorch weight
Traceback (most recent call last):
  File "export.py", line 243, in <module>
    ckpt = torch.load(opt.weight, map_location=device)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 592, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 851, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models'
2. Solution:
First use yolov5's bundled export.py to convert the checkpoint to an .onnx model, then convert the ONNX model to trt; this resolves the problem.
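For reference, the core of that pt-to-onnx step boils down to something like the sketch below. This is not the actual export.py; it assumes it is run from the yolov5 repo root (so the models package is importable) and uses the input/output names and opset 11 seen in the log that follows:

import torch

# Run from the yolov5 repo root (or add it to sys.path) so that the
# pickled 'models.*' classes inside the checkpoint can be resolved.
device = torch.device('cuda:0')
ckpt = torch.load('best.pt', map_location=device)   # a fully pickled model, not a state_dict
model = (ckpt.get('ema') or ckpt['model']).float()  # prefer the EMA weights when present
model.fuse().eval()                                 # fuse Conv+BN layers for inference

im = torch.zeros(1, 3, 640, 640, device=device)     # dummy input at the training resolution
torch.onnx.export(model, im, 'best.onnx',
                  opset_version=11,                 # matches the opset reported in the TRT log
                  input_names=['images'],
                  output_names=['output'])

The subsequent ONNX-to-trt run then logs: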
Find ONNX weight
TensorRT: starting export with TensorRT 8.4.0.6...
[08/24/2023-18:57:25] [TRT] [I] [MemUsageChange] Init CUDA: CPU +359, GPU +0, now: CPU 426, GPU 401 (MiB)
[08/24/2023-18:57:26] [TRT] [I] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 444 MiB, GPU 401 MiB
[08/24/2023-18:57:27] [TRT] [I] [MemUsageSnapshot] End constructing builder kernel library: CPU 819 MiB, GPU 523 MiB
[08/24/2023-18:57:27] [TRT] [I] ----------------------------------------------------------------
[08/24/2023-18:57:27] [TRT] [I] Input filename: ../best.onnx
[08/24/2023-18:57:27] [TRT] [I] ONNX IR version: 0.0.6
[08/24/2023-18:57:27] [TRT] [I] Opset version: 11
[08/24/2023-18:57:27] [TRT] [I] Producer name: pytorch
[08/24/2023-18:57:27] [TRT] [I] Producer version: 1.9
[08/24/2023-18:57:27] [TRT] [I] Domain:
[08/24/2023-18:57:27] [TRT] [I] Model version: 0
[08/24/2023-18:57:27] [TRT] [I] Doc string:
[08/24/2023-18:57:27] [TRT] [I] ----------------------------------------------------------------
[08/24/2023-18:57:27] [TRT] [W] onnx2trt_utils.cpp:365: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
TensorRT: Network Description:
TensorRT: input "images" with shape (1, 3, 640, 640) and dtype DataType.FLOAT
TensorRT: output "output" with shape (1, 25200, 20) and dtype DataType.FLOAT
TensorRT: building FP16 engine in ../best.engine
[08/24/2023-18:57:29] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 11.3.0
[08/24/2023-18:57:29] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +637, GPU +268, now: CPU 1545, GPU 791 (MiB)
[08/24/2023-18:57:29] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +356, GPU +258, now: CPU 1901, GPU 1049 (MiB)
[08/24/2023-18:57:29] [TRT] [W] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.0.5
[08/24/2023-18:57:29] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[08/24/2023-18:58:37] [TRT] [I] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
[08/24/2023-19:06:05] [TRT] [I] Detected 1 inputs and 4 output network tensors.
[08/24/2023-19:06:08] [TRT] [I] Total Host Persistent Memory: 218880
[08/24/2023-19:06:08] [TRT] [I] Total Device Persistent Memory: 1197056
[08/24/2023-19:06:08] [TRT] [I] Total Scratch Memory: 0
[08/24/2023-19:06:08] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 48 MiB, GPU 2470 MiB
[08/24/2023-19:06:08] [TRT] [I] [BlockAssignment] Algorithm ShiftNTopDown took 29.1457ms to assign 9 blocks to 142 nodes requiring 25804804 bytes.
[08/24/2023-19:06:08] [TRT] [I] Total Activation Memory: 25804804
[08/24/2023-19:06:08] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +40, GPU +42, now: CPU 40, GPU 42 (MiB)
export.py:172: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.fernet import Fernet
TensorRT: export success, saved as ../best.engine
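The onnx-to-trt step above can also be reproduced directly with the TensorRT Python API, independent of export.py. A minimal sketch for TensorRT 8.x (file names are placeholders):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
# Explicit-batch network, as required by the ONNX parser
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open('best.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError('failed to parse the ONNX model')

config = builder.create_builder_config()
# Deprecated in newer TRT in favor of config.set_memory_pool_limit(); raising it
# enables the tactics flagged as "insufficient workspace" in the log above.
config.max_workspace_size = 4 << 30  # 4 GiB
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)  # build an FP16 engine, as in the log

engine = builder.build_serialized_network(network, config)
with open('best.engine', 'wb') as f:
    f.write(engine)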
3. Root cause and other solutions
A quick search online shows the main cause: the trained model was saved with torch.save(model, path), i.e. the whole model object was pickled, and then loaded with model = torch.load(path). The pt-loading code in export.py is as follows:
if pt:
    logger.info("Find Pytorch weight")
    ckpt = torch.load(opt.weight, map_location=device)
    if opt.noema:
        model = ckpt['model']
    else:
        model = ckpt['ema'] if ckpt.get('ema') else ckpt['model']
    meta = get_meta_data(ckpt, model, meta)
    if opt.int8:
        zero_scale_fix(model, device)
        if model.__name__ != "EfficentYolo":
            for sub_fusion_list in op_concat_fusion_list[model.__name__]:
                ops = [get_module(model, op_name) for op_name in sub_fusion_list]
                concat_quant_amax_fuse(ops)
        for sub_fusion_list in op_concat_fusion_list[model.type]:
            ops = [get_module(model, op_name) for op_name in sub_fusion_list]
            concat_quant_amax_fuse(ops)
    model.float()
    if not opt.int8:
        model.fuse()
    model.to(device)
    model.eval()
    if opt.int8:
        quant_nn.TensorQuantizer.use_fb_fake_quant = True
    im = torch.zeros(1, 3, *imgsz).to(device)
    # changes to the detect layer required to support ONNX export
    # model.detect.inplace = False
    if not (hasattr(model, 'type') and model.type in ['anchorfree', 'anchorbase']):
        model.type = 'anchorbase'
    model.detect.dynamic = dynamic
    model.detect.export = True  # reduce the number of outputs
    # verify that the torch model runs correctly
    for _ in range(2):
        y = model(im)  # dry runs
    # read the labels from the model and save them to labels.txt
    labels = str({i: l for i, l in enumerate(model.labels)})
    with open(file.parents[0] / 'labels.txt', 'w') as f:
        f.write(labels)
    logger.info("the torch model is very successful, it's no possible!")
    if 'onnx' in opt.include or 'trt' in opt.include:
        try:
            import tensorrt as trt
            if model.type == 'anchorfree':
                export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)
            elif model.type == 'anchorbase':
                if int(trt.__version__[0]) == 7:  # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
                    model.detect.inplace = False
                    grid = model.detect.anchor_grid
                    model.detect.anchor_grid = [a[..., :1, :1, :] for a in grid]
                    export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)  # opset 12
                    model.detect.anchor_grid = grid
                else:  # TensorRT >= 8
                    export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)  # opset 13
        except:
            logger.info("TRT ERROR, will custom onnx!")
            export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)
        onnx_file = file.with_suffix('.onnx')
        add_meta_to_model(onnx_file, meta)
        if opt.int8:
            get_remove_qdq_onnx_and_cache(file.with_suffix('.onnx'))
            add_meta_to_model(str(onnx_file).replace('.onnx', '_wo_qdq.onnx'), meta)
        if 'trt' in opt.include:
            if opt.old:
                meta = False
            export_engine(onnx_file, None, meta=meta, half=opt.half, int8=opt.int8,
                          workspace=opt.worker, encode=opt.encode, verbose=opt.verbose)
else:
    logger.info("Find ONNX weight")
    if not opt.old:
        meta = get_meta_data(file, None, meta)
        meta['half'] = opt.half
        meta['int8'] = opt.int8
        meta['encode'] = opt.encode
    if opt.old:
        meta = False
The guess that the checkpoint stores extra information tied to its original environment can be made precise: torch.save(model, path) pickles the entire model object, and pickle records the fully qualified class paths of its modules (e.g. models.yolo.Model). torch.load(path) must import those classes again, so when the checkpoint is moved to another machine, or loaded by a script that is not run from inside the yolov5 repo (as when converting to trt here), the models package cannot be found and unpickling fails with the ModuleNotFoundError above. Converting to a self-contained ONNX model first and then to trt sidesteps this dependency, as do the two workarounds sketched below.
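A sketch of both workarounds; the yolov5 path and the Model constructor call are placeholders, not verified against this repo:

import sys
import torch

# Workaround 1: make the pickled class paths resolvable before loading.
# torch.save(model, path) records references such as 'models.yolo.Model',
# so the repo that contains the 'models' package must be importable.
sys.path.insert(0, '/path/to/yolov5')  # placeholder: your yolov5 repo root
ckpt = torch.load('best.pt', map_location='cpu')

# Workaround 2 (more portable): save only the weights and rebuild the
# model from code when loading, so no class paths are pickled at all.
torch.save(ckpt['model'].state_dict(), 'best_state.pt')
# later, on any machine that has the model class available:
# model = Model(cfg='yolov5s.yaml')  # hypothetical constructor call
# model.load_state_dict(torch.load('best_state.pt'))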