YOLOv10 Explained and Put into Practice (Datasets Included)

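Table of Contents

  • Abstract
  • Model Details
  • Hands-On Practice
    • Training on the COCO Dataset
      • Downloading the Dataset
    • Converting COCO to a YOLO-Format Dataset (Works for V4, V5, V6, V7, V8)
      • Setting Up the YOLOv10 Environment
      • Training
      • Resuming Training
      • Testing
    • Training on a Custom Dataset
      • Labelme Dataset
      • Format Conversion
      • Training
      • Testing
  • Summary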

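Abstract

Model Details

Hands-On Practice

Training on the COCO Dataset

This walkthrough uses the 2017 release of the COCO dataset to show how to train and run predictions with YOLOv10.

Downloading the Dataset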

Images:

  • 2017 Train images [118K/18GB]: http://images.cocodataset.org/zips/train2017.zip
  • 2017 Val images [5K/1GB]: http://images.cocodataset.org/zips/val2017.zip
  • 2017 Test images [41K/6GB]: http://images.cocodataset.org/zips/test2017.zip

Annotations:

  • 2017 annotations_trainval2017 [241MB]:http://images.cocodataset.org/annotations/annotations_trainval2017.zip
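
For readers who would rather script the download, here is a minimal sketch that uses only the Python standard library. The helper names `archive_name` and `download_all` and the target folder `coco` are assumptions made for illustration; the URLs are the train, val and annotation archives listed above.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Archives needed for training and validation (URLs from the list above)
COCO_URLS = [
    "http://images.cocodataset.org/zips/train2017.zip",
    "http://images.cocodataset.org/zips/val2017.zip",
    "http://images.cocodataset.org/annotations/annotations_trainval2017.zip",
]


def archive_name(url):
    """Return the file name at the end of a download URL."""
    return Path(urlparse(url).path).name


def download_all(dest="coco"):
    """Download every archive into dest, skipping files that already exist.

    Hypothetical helper: it does not verify checksums or resume partial files.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    for url in COCO_URLS:
        target = dest_dir / archive_name(url)
        if not target.exists():
            urlretrieve(url, str(target))
    return [dest_dir / archive_name(url) for url in COCO_URLS]
```

Calling `download_all()` fetches roughly 19 GB, so it is deliberately not invoked at import time. Unpack the archives afterwards so that the images and annotations sit under the `coco` folder.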

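Converting COCO to a YOLO-Format Dataset (Works for V4, V5, V6, V7, V8)

The original research paper lists 91 object categories for COCO. The first release in 2014, however, published labeled and segmented images for only 80 of them. A follow-up release came out in 2017. The full category list is as follows: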

ID | OBJECT (PAPER) | OBJECT (2014 REL.) | OBJECT (2017 REL.) | SUPER CATEGORY
1 | person | person | person | person
2 | bicycle | bicycle | bicycle | vehicle
3 | car | car | car | vehicle
4 | motorcycle | motorcycle | motorcycle | vehicle
5 | airplane | airplane | airplane | vehicle
6 | bus | bus | bus | vehicle
7 | train | train | train | vehicle
8 | truck | truck | truck | vehicle
9 | boat | boat | boat | vehicle
10 | traffic light | traffic light | traffic light | outdoor
11 | fire hydrant | fire hydrant | fire hydrant | outdoor
12 | street sign | - | - | outdoor
13 | stop sign | stop sign | stop sign | outdoor
14 | parking meter | parking meter | parking meter | outdoor
15 | bench | bench | bench | outdoor
16 | bird | bird | bird | animal
17 | cat | cat | cat | animal
18 | dog | dog | dog | animal
19 | horse | horse | horse | animal
20 | sheep | sheep | sheep | animal
21 | cow | cow | cow | animal
22 | elephant | elephant | elephant | animal
23 | bear | bear | bear | animal
24 | zebra | zebra | zebra | animal
25 | giraffe | giraffe | giraffe | animal
26 | hat | - | - | accessory
27 | backpack | backpack | backpack | accessory
28 | umbrella | umbrella | umbrella | accessory
29 | shoe | - | - | accessory
30 | eye glasses | - | - | accessory
31 | handbag | handbag | handbag | accessory
32 | tie | tie | tie | accessory
33 | suitcase | suitcase | suitcase | accessory
34 | frisbee | frisbee | frisbee | sports
35 | skis | skis | skis | sports
36 | snowboard | snowboard | snowboard | sports
37 | sports ball | sports ball | sports ball | sports
38 | kite | kite | kite | sports
39 | baseball bat | baseball bat | baseball bat | sports
40 | baseball glove | baseball glove | baseball glove | sports
41 | skateboard | skateboard | skateboard | sports
42 | surfboard | surfboard | surfboard | sports
43 | tennis racket | tennis racket | tennis racket | sports
44 | bottle | bottle | bottle | kitchen
45 | plate | - | - | kitchen
46 | wine glass | wine glass | wine glass | kitchen
47 | cup | cup | cup | kitchen
48 | fork | fork | fork | kitchen
49 | knife | knife | knife | kitchen
50 | spoon | spoon | spoon | kitchen
51 | bowl | bowl | bowl | kitchen
52 | banana | banana | banana | food
53 | apple | apple | apple | food
54 | sandwich | sandwich | sandwich | food
55 | orange | orange | orange | food
56 | broccoli | broccoli | broccoli | food
57 | carrot | carrot | carrot | food
58 | hot dog | hot dog | hot dog | food
59 | pizza | pizza | pizza | food
60 | donut | donut | donut | food
61 | cake | cake | cake | food
62 | chair | chair | chair | furniture
63 | couch | couch | couch | furniture
64 | potted plant | potted plant | potted plant | furniture
65 | bed | bed | bed | furniture
66 | mirror | - | - | furniture
67 | dining table | dining table | dining table | furniture
68 | window | - | - | furniture
69 | desk | - | - | furniture
70 | toilet | toilet | toilet | furniture
71 | door | - | - | furniture
72 | tv | tv | tv | electronic
73 | laptop | laptop | laptop | electronic
74 | mouse | mouse | mouse | electronic
75 | remote | remote | remote | electronic
76 | keyboard | keyboard | keyboard | electronic
77 | cell phone | cell phone | cell phone | electronic
78 | microwave | microwave | microwave | appliance
79 | oven | oven | oven | appliance
80 | toaster | toaster | toaster | appliance
81 | sink | sink | sink | appliance
82 | refrigerator | refrigerator | refrigerator | appliance
83 | blender | - | - | appliance
84 | book | book | book | indoor
85 | clock | clock | clock | indoor
86 | vase | vase | vase | indoor
87 | scissors | scissors | scissors | indoor
88 | teddy bear | teddy bear | teddy bear | indoor
89 | hair drier | hair drier | hair drier | indoor
90 | toothbrush | toothbrush | toothbrush | indoor
91 | hair brush | - | - | indoor

As the table shows, the 2014 and 2017 releases contain the same object list: 80 of the 91 categories from the original paper. The category IDs therefore have to be remapped during conversion. The mapping function is:

def coco91_to_coco80_class():
    # Maps the 91-index category IDs (paper) to the 80-index class IDs (val2014).
    # Entry i is the 80-class index for COCO category_id i + 1, or None when that
    # category was never released.
    # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
    # a = np.loadtxt('data/coco.names', dtype='str', delimiter='\n')
    # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n')
    # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)]  # darknet to coco
    # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)]  # coco to darknet
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
         None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
         None, 73, 74, 75, 76, 77, 78, 79, None]
    return x
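
As a sanity check on the remapping, the same list can be rebuilt from the 11 category IDs that appear in the paper but have no released annotations (the rows marked with dashes in the table). This is only a sketch; the names `MISSING_IDS` and `build_coco91_to_coco80` are made up for illustration.

```python
# Category IDs from the paper that were never released (rows with "-" in the table)
MISSING_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83, 91}


def build_coco91_to_coco80():
    """Return a 91-entry list: entry i maps category_id i + 1 to a 0-based
    80-class index, or None for categories that were never released."""
    mapping = []
    next_index = 0
    for category_id in range(1, 92):
        if category_id in MISSING_IDS:
            mapping.append(None)
        else:
            mapping.append(next_index)
            next_index += 1
    return mapping


mapping = build_coco91_to_coco80()
print(mapping[13 - 1])  # "stop sign" has category_id 13 and becomes class 11
print(mapping[12 - 1])  # "street sign" was never released, so it maps to None
```

The lookup uses `category_id - 1` because COCO category IDs start at 1 while the list is indexed from 0.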

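Next, start the format conversion. The project directory contains:

  • coco: holds the extracted dataset.
  • out: holds the output.
  • coco2yolo.py: the conversion script.

The conversion code is as follows: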

import glob
import json
import os
import shutil
from pathlib import Path

import numpy as np
from tqdm import tqdm


def make_folders(path='../out/'):
    # Create a clean output folder with labels/ and images/ subfolders
    if os.path.exists(path):
        shutil.rmtree(path)  # delete output folder
    os.makedirs(path)  # make new output folder
    os.makedirs(path + os.sep + 'labels')  # make new labels folder
    os.makedirs(path + os.sep + 'images')  # make new images folder
    return path


def convert_coco_json(json_dir='./coco/annotations_trainval2017/annotations/'):
    # Only the instances_*.json files carry detection boxes; the caption and
    # keypoint files in the same folder have a different structure.
    jsons = glob.glob(json_dir + 'instances_*.json')
    coco80 = coco91_to_coco80_class()

    for json_file in sorted(jsons):
        split = Path(json_file).stem.replace('instances_', '')  # e.g. train2017
        fn = 'out/labels/%s/' % split  # label folder
        fn_images = 'out/images/%s/' % split  # image folder
        os.makedirs(fn, exist_ok=True)
        os.makedirs(fn_images, exist_ok=True)
        with open(json_file) as f:
            data = json.load(f)
        print(fn)

        # Create image dict
        images = {'%g' % x['id']: x for x in data['images']}

        # Write labels file
        for x in tqdm(data['annotations'], desc='Annotations %s' % json_file):
            if x['iscrowd']:
                continue

            img = images['%g' % x['image_id']]
            h, w, f = img['height'], img['width'], img['file_name']
            file_path = 'coco/' + split + '/' + f

            # The COCO bounding box format is [top left x, top left y, width, height]
            box = np.array(x['bbox'], dtype=np.float64)
            box[:2] += box[2:] / 2  # xy top-left corner to center
            box[[0, 2]] /= w  # normalize x
            box[[1, 3]] /= h  # normalize y

            if (box[2] > 0.) and (box[3] > 0.):  # if w > 0 and h > 0
                with open(fn + Path(f).stem + '.txt', 'a') as file:
                    file.write('%g %.6f %.6f %.6f %.6f\n' % (coco80[x['category_id'] - 1], *box))
                file_path_t = fn_images + f
                if not os.path.exists(file_path_t):  # copy each image only once
                    shutil.copy(file_path, file_path_t)


def coco91_to_coco80_class():
    # Maps the 91-index category IDs (paper) to the 80-index class IDs (val2014)
    # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
         None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
         None, 73, 74, 75, 76, 77, 78, 79, None]
    return x


if __name__ == '__main__':
    convert_coco_json()
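
The core of the conversion is the box arithmetic: COCO stores `[top-left x, top-left y, width, height]` in pixels, while YOLO expects the box center and size normalized by the image dimensions. A small worked sketch of that step follows; the function name `coco_box_to_yolo` and the sample numbers are made up for illustration.

```python
def coco_box_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to the
    YOLO form (x_center, y_center, width, height), normalized to 0..1."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return x_center, y_center, w / img_w, h / img_h


# A 200x100 box whose top-left corner is at (100, 50) in a 640x480 image
box = coco_box_to_yolo([100, 50, 200, 100], img_w=640, img_h=480)
print(" ".join("%.6f" % v for v in box))  # 0.312500 0.208333 0.312500 0.208333
```

Each label row is then the class index followed by these four numbers.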

Run the script.

Once the conversion finishes, verify the result:

import os

import cv2


def draw_box_in_single_image(image_path, txt_path):
    # Read the image
    image = cv2.imread(image_path)

    # Read the label file: one "class x_center y_center width height" row per object
    def read_list(txt_path):
        pos = []
        with open(txt_path, 'r') as file_to_read:
            for line in file_to_read:
                if not line.strip():
                    continue
                # split() with no argument splits on whitespace; pass ',' for comma-separated data
                pos.append([float(i) for i in line.split()])
        return pos

    # Convert a normalized YOLO row to pixel corner coordinates
    def convert(size, box):
        xmin = (box[1] - box[3] / 2.) * size[1]
        xmax = (box[1] + box[3] / 2.) * size[1]
        ymin = (box[2] - box[4] / 2.) * size[0]
        ymax = (box[2] + box[4] / 2.) * size[0]
        return (int(xmin), int(ymin), int(xmax), int(ymax))

    pos = read_list(txt_path)
    print(pos)
    for i in range(len(pos)):
        label = str(int(pos[i][0]))
        print('label is ' + label)
        box = convert(image.shape, pos[i])
        image = cv2.rectangle(image, (box[0], box[1]), (box[2], box[3]), (0, 0, 255), 2)
        cv2.putText(image, label, (box[0], box[1] - 2), 0, 1, [0, 0, 255], thickness=2, lineType=cv2.LINE_AA)

    if pos:
        # os.path handles both Windows and POSIX separators
        stem = os.path.splitext(os.path.basename(image_path))[0]
        cv2.imwrite('./Data/see_images/{}.png'.format(stem), image)
    else:
        print('None')


img_folder = "./out/images/val2017"
img_list = os.listdir(img_folder)
img_list.sort()

label_folder = "./out/labels/val2017"
label_list = os.listdir(label_folder)
label_list.sort()

if not os.path.exists('./Data/see_images'):
    os.makedirs('./Data/see_images')

for i in range(len(img_list)):
    image_path = os.path.join(img_folder, img_list[i])
    txt_path = os.path.join(label_folder, label_list[i])
    draw_box_in_single_image(image_path, txt_path)
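
The verification step runs the conversion backwards: it turns a normalized YOLO row into pixel corners so the box can be drawn. Here is a standalone sketch of that inverse with sample numbers; the function name `yolo_row_to_pixel_box` is hypothetical, and it rounds rather than truncates so that floating-point noise cannot shift a corner by one pixel.

```python
def yolo_row_to_pixel_box(row, img_w, img_h):
    """Convert (class, x_center, y_center, width, height), normalized to 0..1,
    into integer pixel corners (x_min, y_min, x_max, y_max)."""
    _, cx, cy, w, h = row
    x_min = round((cx - w / 2) * img_w)
    y_min = round((cy - h / 2) * img_h)
    x_max = round((cx + w / 2) * img_w)
    y_max = round((cy + h / 2) * img_h)
    return x_min, y_min, x_max, y_max


# Class 0, centered at (200, 100) px with size 200x100 px in a 640x480 image
row = (0, 200 / 640, 100 / 480, 200 / 640, 100 / 480)
print(yolo_row_to_pixel_box(row, img_w=640, img_h=480))  # (100, 50, 300, 150)
```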

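The script writes the annotated images to ./Data/see_images.

Setting Up the YOLOv10 Environment

You can install every library listed in requirements.txt in one go:

pip install -r requirements.txt

If you would rather not install that many libraries, run the code and install whichever library is reported as missing.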
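
If you take the install-on-demand route, a short check can list what is missing before you start a long run. This is a sketch that uses only the standard library; the module list is an assumption based on the imports in this article's scripts, and note that a module name can differ from its pip package name (for example, `cv2` is installed as `opencv-python`).

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Assumed list, taken from the imports used by the scripts in this article
REQUIRED = ["torch", "torchvision", "cv2", "numpy", "tqdm", "sklearn"]
print("missing:", find_missing(REQUIRED))
```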

Training

Get the source code from https://github.com/THU-MIG/yolov10 by downloading the repository.
Next, create a training script. The model can be built from a YAML file, for example:

from ultralytics import YOLOv10

if __name__ == '__main__':
    model = YOLOv10(model="ultralytics/cfg/models/v10/yolov10l.yaml")  # build a new model from scratch
    # If you want to finetune the model with pretrained weights, you could load the
    # pretrained weights like below
    # model = YOLOv10.from_pretrained('jameslahm/yolov10{n/s/m/b/l/x}')
    # or
    # wget https://github.com/THU-MIG/yolov10/releases/download/v1.1/yolov10{n/s/m/b/l/x}.pt
    # model = YOLOv10('yolov10{n/s/m/b/l/x}.pt')

    # Use the model
    results = model.train(data="VOC.yaml", patience=0, epochs=150, device='0', batch=8, seed=42)  # train the model

The model files are under ultralytics/cfg/models/v10.

You can also create the model from pretrained weights. For example:

model = YOLOv10('yolov10n.pt')

Then start training.

# Use the model
model.train(data="coco128.yaml", epochs=3)  # train the model

The dataset configuration files are under ultralytics/cfg/datasets/.

That is all it takes.

Next, we set up our own configuration.
Step 1: Find the file ultralytics/cfg/datasets/coco.yaml.

Then copy it to the project root directory.

Change the paths in it to:

# Ultralytics YOLO 🚀, GPL-3.0 license
# COCO 2017 dataset http://cocodataset.org by Microsoft
# Example usage: yolo train data=coco.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco  ← downloads here (20.1 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
train: ./coco/images/train2017  # train images (relative to 'path') 118287 images
val: ./coco/images/val2017  # val images (relative to 'path') 5000 images
test: test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

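Feel free to experiment with the dataset paths yourself. After several attempts I found that the framework prepends a datasets folder on its own, so a setting of ./coco/images/train2017 resolves to datasets/coco/images/train2017.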
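
To check a path before launching a run, the observation above can be written down as a tiny helper. This sketch only mirrors the behavior described in this article (a `datasets` folder being prepended); it is not the library's own resolution logic, and the name `resolve_dataset_dir` is hypothetical.

```python
from pathlib import Path


def resolve_dataset_dir(yaml_entry, datasets_root="datasets"):
    """Return the folder a relative YAML entry is expected to point at,
    assuming the datasets folder is prepended as observed above."""
    return Path(datasets_root) / yaml_entry


train_dir = resolve_dataset_dir("./coco/images/train2017")
# Prints the expected location and whether it exists on this machine
print(train_dir.as_posix(), train_dir.is_dir())
```

If the second value printed is False, the images are not where training is expected to look for them.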
Step 2: Create a train.py script.

from ultralytics import YOLOv10

if __name__ == '__main__':
    model = YOLOv10(model="ultralytics/cfg/models/v10/yolov10l.yaml")  # build a new model from scratch
    # If you want to finetune the model with pretrained weights, you could load the
    # pretrained weights like below
    # model = YOLOv10.from_pretrained('jameslahm/yolov10{n/s/m/b/l/x}')
    # or
    # wget https://github.com/THU-MIG/yolov10/releases/download/v1.1/yolov10{n/s/m/b/l/x}.pt
    # model = YOLOv10('yolov10{n/s/m/b/l/x}.pt')

    # Use the model
    results = model.train(data="coco.yaml", epochs=3, device='3')  # train the model

Then run train.py.
To train on multiple GPUs, list them in device. For example, with four GPUs:

results = model.train(data="coco.yaml", epochs=3,device='0,1,2,3')  # 训练模型

Step 3: Adjust the parameters. The available options are listed in ultralytics/cfg/default.yaml. For example:

# Train settings -------------------------------------------------------------------------------------------------------
model:  # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data:  # path to data file, i.e. coco128.yaml
epochs: 100  # number of epochs to train for
patience: 50  # epochs to wait for no observable improvement for early stopping of training
batch: 16  # number of images per batch (-1 for AutoBatch)
imgsz: 640  # size of input images as integer or w,h
save: True  # save train checkpoints and predict results
save_period: -1 # Save checkpoint every x epochs (disabled if < 1)
cache: False  # True/ram, disk or False. Use cache for data loading
device:  # device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
workers: 8  # number of worker threads for data loading (per RANK if DDP)
project:  # project name
name:  # experiment name, results saved to 'project/name' directory
exist_ok: False  # whether to overwrite existing experiment
pretrained: False  # whether to use a pretrained model
optimizer: SGD  # optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp']
verbose: True  # whether to print verbose output
seed: 0  # random seed for reproducibility
deterministic: True  # whether to enable deterministic mode
single_cls: False  # train multi-class data as single-class
image_weights: False  # use weighted image selection for training
rect: False  # support rectangular training if mode='train', support rectangular evaluation if mode='val'
cos_lr: False  # use cosine learning rate scheduler
close_mosaic: 10  # disable mosaic augmentation for final 10 epochs
resume: False  # resume training from last checkpoint

These are the parameters most often used during training. You can override any of them when you call the train function.
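
One way to keep overrides tidy is to collect them in a dictionary and reject misspelled names before they reach the trainer. The sketch below does that for a handful of the defaults shown in the excerpt; `TRAIN_DEFAULTS` and `build_train_args` are hypothetical names, and the final call is left as a comment because it needs the YOLOv10 package and a dataset.

```python
# A few defaults copied from the default.yaml excerpt above
TRAIN_DEFAULTS = {
    "epochs": 100,
    "patience": 50,
    "batch": 16,
    "imgsz": 640,
    "workers": 8,
    "optimizer": "SGD",
    "seed": 0,
}


def build_train_args(**overrides):
    """Merge overrides into the defaults, refusing names not listed above."""
    unknown = set(overrides) - set(TRAIN_DEFAULTS)
    if unknown:
        raise KeyError("unknown training option(s): %s" % sorted(unknown))
    return {**TRAIN_DEFAULTS, **overrides}


args = build_train_args(epochs=150, batch=8, seed=42)
print(args)
# With a model loaded as in train.py, the arguments would be passed like this:
# results = model.train(data="coco.yaml", **args)
```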
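Once training finishes, the results are displayed.

Resuming Training

Training is sometimes interrupted unexpectedly. To pick up where it left off, set resume to True. The code is as follows: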

from ultralytics import YOLOv10

if __name__ == '__main__':
    # Load the last checkpoint of the interrupted run
    model = YOLOv10("runs/detect/train8/weights/last.pt")
    print(model.model)

    # Use the model
    results = model.train(data="VOC.yaml", epochs=100, device='0', batch=16, workers=0, resume=True)  # resume training
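
The run folder name (train8 above) changes from run to run, so it can help to look up the most recent last.pt instead of hard-coding it. A minimal sketch, assuming the runs/detect/<run name>/weights/last.pt layout used in this article; the helper name `find_latest_checkpoint` is hypothetical.

```python
from pathlib import Path


def find_latest_checkpoint(runs_dir="runs/detect"):
    """Return the most recently modified <run>/weights/last.pt under runs_dir,
    or None when no checkpoint exists."""
    candidates = list(Path(runs_dir).glob("*/weights/last.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


checkpoint = find_latest_checkpoint()
print(checkpoint)  # a path such as runs/detect/<run name>/weights/last.pt, or None
```

The returned path can be passed to the model constructor in place of the hard-coded string.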

Then click Run to continue training.

Testing

Create a test script, test.py:

from ultralytics import YOLOv10

# Load the trained weights
model = YOLOv10("runs/detect/train11/weights/best.pt")

results = model.predict(source="ultralytics/assets", device='3')  # predict on the images in a folder
print(results)

The results object holds all of the prediction results.

The parameters for predict can also be found in ultralytics/cfg/default.yaml. For example:

# Prediction settings --------------------------------------------------------------------------------------------------
source:  # source directory for images or videos
show: False  # show results if possible
save_txt: False  # save results as .txt file
save_conf: False  # save results with confidence scores
save_crop: False  # save cropped images with results
hide_labels: False  # hide labels
hide_conf: False  # hide confidence scores
vid_stride: 1  # video frame-rate stride
line_thickness: 3  # bounding box thickness (pixels)
visualize: False  # visualize model features
augment: False  # apply image augmentation to prediction sources
agnostic_nms: False  # class-agnostic NMS
classes:  # filter results by class, i.e. class=0, or class=[0,2,3]
retina_masks: False  # use high-resolution segmentation masks
boxes: True  # Show boxes in segmentation predictions

Training on a Custom Dataset

Labelme Dataset

The dataset is one I annotated myself some time ago. Download link:
https://download.csdn.net/download/hhhhhhhhhhwwwwwwwwww/63242994
The classes are: ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
'an-70', 'su-24', 'tu-22', 'il-76']

Format Conversion

Convert the Labelme dataset to a YOLOv10-format dataset with the following code:

import json
import os
import shutil
from glob import glob

import cv2
import numpy as np
from sklearn.model_selection import train_test_split


def convert(size, box):
    # size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels.
    # Returns the normalized (x_center, y_center, width, height).
    # Labelme coordinates are 0-based, so the center needs no 1-pixel correction.
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def change_2_yolo5(files, txt_Name):
    # Write one YOLO label file next to each Labelme JSON and return the image names
    imag_name = []
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        with open(json_filename, "r", encoding="utf-8") as f:
            json_file = json.load(f)
        imag_name.append(json_file_ + '.jpg')
        height, width, channels = cv2.imread(labelme_path + json_file_ + ".jpg").shape
        with open('%s/%s.txt' % (labelme_path, json_file_), 'w') as out_file:
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                xmin = min(points[:, 0]) if min(points[:, 0]) > 0 else 0
                xmax = max(points[:, 0]) if max(points[:, 0]) > 0 else 0
                ymin = min(points[:, 1]) if min(points[:, 1]) > 0 else 0
                ymax = max(points[:, 1]) if max(points[:, 1]) > 0 else 0
                label = multi["label"].lower()
                if xmax <= xmin or ymax <= ymin:
                    continue  # skip degenerate boxes
                cls_id = classes.index(label)
                b = (float(xmin), float(xmax), float(ymin), float(ymax))
                bb = convert((width, height), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    return imag_name


def image_txt_copy(files, scr_path, dst_img_path, dst_txt_path):
    """
    :param files: list of image file names
    :param scr_path: folder that holds the images
    :param dst_img_path: folder the images are copied to
    :param dst_txt_path: folder the matching txt files are copied to
    :return:
    """
    for file in files:
        img_path = scr_path + file
        print(file)
        shutil.copy(img_path, dst_img_path + file)
        stem = os.path.splitext(file)[0]  # safe for file names that contain dots
        shutil.copy(scr_path + stem + '.txt', dst_txt_path + stem + '.txt')


if __name__ == '__main__':
    classes = ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
               'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
               'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
               'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
               'an-70', 'su-24', 'tu-22', 'il-76']
    # 1. Path to the Labelme annotations
    labelme_path = "USA-Labelme/"
    isUseTest = True  # whether to create a test set
    # 2. Collect the files to process
    files = glob(labelme_path + "*.json")
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    for i in files:
        print(i)
    # 3. Split into train, val and test
    trainval_files, test_files = train_test_split(files, test_size=0.1, random_state=55)
    train_files, val_files = train_test_split(trainval_files, test_size=0.1, random_state=55)
    train_name_list = change_2_yolo5(train_files, "train")
    print(train_name_list)
    val_name_list = change_2_yolo5(val_files, "val")
    test_name_list = change_2_yolo5(test_files, "test")
    # 4. Create the dataset folders and copy the files
    file_List = ["train", "val", "test"]
    for file in file_List:
        if not os.path.exists('./VOC/images/%s' % file):
            os.makedirs('./VOC/images/%s' % file)
        if not os.path.exists('./VOC/labels/%s' % file):
            os.makedirs('./VOC/labels/%s' % file)
    image_txt_copy(train_name_list, labelme_path, './VOC/images/train/', './VOC/labels/train/')
    image_txt_copy(val_name_list, labelme_path, './VOC/images/val/', './VOC/labels/val/')
    image_txt_copy(test_name_list, labelme_path, './VOC/images/test/', './VOC/labels/test/')

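To make the shape-to-box step concrete, here is a small standalone sketch that takes the point list of one Labelme shape and produces the normalized YOLO box. The function name `labelme_points_to_yolo` and the sample numbers are made up for illustration; like the script, it clamps negative coordinates to 0 and skips degenerate boxes.

```python
def labelme_points_to_yolo(points, img_w, img_h):
    """Turn a Labelme point list [[x, y], ...] into a normalized
    (x_center, y_center, width, height) tuple, or None for an empty box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, xmax = max(min(xs), 0), max(max(xs), 0)  # clamp negatives to 0
    ymin, ymax = max(min(ys), 0), max(max(ys), 0)
    if xmax <= xmin or ymax <= ymin:
        return None
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


# A rectangle drawn from (100, 200) to (300, 400) in an 800x600 image
print(labelme_points_to_yolo([[100, 200], [300, 400]], img_w=800, img_h=600))
```

The same function works for polygons, because only the extreme x and y values matter.
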
When the script finishes, you have the dataset in YOLOv10 format.
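
The script holds out 10% of the files for testing and then 10% of the remainder for validation, which works out to roughly an 81/9/10 split. The sketch below does that arithmetic with integers only; it assumes the held-out count is rounded up, which is my understanding of how scikit-learn treats a fractional test_size, and `split_counts` is a hypothetical helper.

```python
def split_counts(n_files, holdout_percent=10):
    """Return (train, val, test) sizes for two successive hold-out splits."""

    def holdout(n):
        # Ceiling division with integers, so no float rounding is involved
        return -(-n * holdout_percent // 100)

    n_test = holdout(n_files)
    n_trainval = n_files - n_test
    n_val = holdout(n_trainval)
    return n_trainval - n_val, n_val, n_test


print(split_counts(1000))  # (810, 90, 100)
```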

Training

Put the generated YOLO dataset under the datasets folder.

Then create a VOC.yaml file with the following content:


train: ./VOC/images/train # train images
val: ./VOC/images/val # val images
test: ./VOC/images/test # test images (optional)

names: ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2', 'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10', 'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4', 'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33', 'an-70', 'su-24', 'tu-22', 'il-76']
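
Before training, it can be worth checking that every label row fits the class list in VOC.yaml. This is a sketch under the assumption that each row has the form `class x_center y_center width height`, as written by the conversion script; `find_bad_labels` is a hypothetical helper, and 32 is the length of the names list above.

```python
from pathlib import Path


def find_bad_labels(label_dir, num_classes):
    """Return (file name, line number, reason) for every malformed label row."""
    problems = []
    for txt in sorted(Path(label_dir).glob("*.txt")):
        for line_no, line in enumerate(txt.read_text().splitlines(), start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) != 5:
                problems.append((txt.name, line_no, "expected 5 fields"))
                continue
            cls_id = int(fields[0])
            coords = [float(v) for v in fields[1:]]
            if not 0 <= cls_id < num_classes:
                problems.append((txt.name, line_no, "class id out of range"))
            elif not all(0.0 <= v <= 1.0 for v in coords):
                problems.append((txt.name, line_no, "coordinate outside [0, 1]"))
    return problems


# An empty list means every row passed (or the folder does not exist yet)
print(find_bad_labels("datasets/VOC/labels/train", num_classes=32))
```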

Then create train.py and add the code:

from ultralytics import YOLOv10

if __name__ == '__main__':
    # Build the model
    model = YOLOv10("ultralytics/cfg/models/v10/yolov10l.yaml")  # build a new model from scratch
    print(model.model)

    # Use the model
    results = model.train(data="VOC.yaml", epochs=100, device='0', batch=16, workers=0)  # train the model

Now training can begin: click Run to execute train.py.

Results after 150 epochs of training:

YOLOv10l summary (fused): 461 layers, 25765712 parameters, 0 gradients, 126.6 GFLOPs
     Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 15/15 [00:02<00:00,  7.06it/s]
       all        230       1412      0.847       0.94      0.978      0.711
       c17        230        131      0.842      0.977       0.98       0.79
        c5        230         68      0.835      0.941      0.958      0.788
helicopter        230         43      0.804      0.953      0.948      0.564
      c130        230         85      0.967      0.953      0.984      0.642
       f16        230         57      0.732      0.912      0.918      0.589
        b2        230          2      0.397          1      0.995      0.696
     other        230         86      0.805       0.93      0.954       0.51
       b52        230         70      0.893      0.957      0.962      0.795
      kc10        230         62      0.985      0.968      0.983      0.803
   command        230         40      0.831          1      0.987      0.778
       f15        230        123      0.837      0.959       0.98      0.648
     kc135        230         91      0.879      0.989      0.969      0.662
       a10        230         27      0.885      0.963       0.93      0.435
        b1        230         20      0.666          1      0.985      0.699
       aew        230         25      0.816          1      0.989      0.789
       f22        230         17      0.844          1       0.99      0.692
        p3        230        105      0.957      0.971      0.993      0.803
        p8        230          1      0.902          1      0.995      0.697
       f35        230         32       0.82      0.938      0.967      0.558
       f18        230        125      0.938      0.984      0.986      0.796
       v22        230         41      0.959          1      0.995      0.714
     su-27        230         31      0.915          1      0.995      0.853
     il-38        230         27       0.94          1      0.994      0.817
    tu-134        230          1          1          0      0.995      0.895
     su-33        230          2      0.785          1      0.995      0.697
     an-70        230          2      0.728          1      0.995      0.697
     tu-22        230         98       0.91       0.99      0.986      0.783

Testing

Create a test.py script and add the code:

from ultralytics import YOLOv10

# Load the trained weights
model = YOLOv10("runs/detect/train/weights/best.pt")
results = model.predict(source="datasets/VOC/images/test", device='0', save=True)  # predict on the test images

The prediction parameters are as follows:

# Prediction settings --------------------------------------------------------------------------------------------------
source:  # source directory for images or videos
show: False  # show results if possible
save_txt: False  # save results as .txt file
save_conf: False  # save results with confidence scores
save_crop: False  # save cropped images with results
hide_labels: False  # hide labels
hide_conf: False  # hide confidence scores
vid_stride: 1  # video frame-rate stride
line_thickness: 3  # bounding box thickness (pixels)
visualize: False  # visualize model features
augment: False  # apply image augmentation to prediction sources
agnostic_nms: False  # class-agnostic NMS
classes:  # filter results by class, i.e. class=0, or class=[0,2,3]
retina_masks: False  # use high-resolution segmentation masks
boxes: True  # Show boxes in segmentation predictions

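Unlike YOLOv5, this list shows no parameter for saving the predicted images. A look at the source code turns up the save parameter, so setting save to True saves the predicted images.

If the official wrapper feels too heavily encapsulated and not flexible enough, you can use the following inference code instead: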

import random
import time

import cv2
import numpy as np
import torch
import torchvision


def load_model(model_path):
    # Load the checkpoint on the CPU, then move the network to the GPU in FP32
    model = torch.load(model_path, map_location='cpu')
    category_list = model.get('CLASSES', model.get('model').names)
    model = (model.get('ema') or model['model']).to("cuda:0").float()  # FP32 model
    model.__setattr__('CLASSES', category_list)
    model.fuse().eval()
    return model


def data_preprocess(model, img, img_scale):
    # Stride, and whether to pad only up to the next stride multiple
    stride, auto = 32, True
    # Use the model's largest stride, but never less than 32
    stride = max(int(model.stride.max()), 32)
    # Resize and pad the image to fit the model input
    img = letterbox(img, new_shape=img_scale, stride=stride, auto=auto)[0]  # padded resize
    img = np.ascontiguousarray(img.transpose((2, 0, 1))[::-1])  # HWC to CHW, BGR to RGB, contiguous
    img = torch.from_numpy(img).to("cuda:0")  # ndarray to tensor on the GPU
    img = img.float()  # uint8 to fp32
    img /= 255  # 0 - 255 to 0.0 - 1.0
    # A single image has 3 dims; add a leading batch dim
    if len(img.shape) == 3:
        img = img[None]
    return img


def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Current shape [height, width]
    shape = im.shape[:2]
    # If new_shape is an integer, turn it into a (height, width) tuple
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    # Only scale down, never up, when scaleup is False (for better validation mAP)
    if not scaleup:
        r = min(r, 1.0)

    # Compute the resized dimensions and the padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle: pad only up to the next multiple of stride
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)
    elif scaleFill:  # stretch to fill, no padding
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    # Split the padding evenly between the two sides
    dw /= 2
    dh /= 2

    # Resize only if the size actually changes
    if shape[::-1] != new_unpad:
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    # Add the border computed above
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)  # padded image, scale ratios, padding


def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False, multi_label=False,
                        labels=(), max_det=300, nc=0, max_time_img=0.05, max_nms=30000, max_wh=7680, ):
    # Checks
    assert 0 <= conf_thres <= 1, f'Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0'
    assert 0 <= iou_thres <= 1, f'Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0'
    if isinstance(prediction, (list, tuple)):  # YOLOv8 model in validation model, output = (inference_out, loss_out)
        prediction = prediction[0]  # select only inference output

    device = prediction.device
    mps = 'mps' in device.type  # Apple MPS
    if mps:  # MPS not fully supported yet, convert tensors to CPU before NMS
        prediction = prediction.cpu()
    bs = prediction.shape[0]  # batch size
    nc = nc or (prediction.shape[1] - 4)  # number of classes
    nm = prediction.shape[1] - nc - 4
    mi = 4 + nc  # mask start index
    xc = prediction[:, 4:mi].amax(1) > conf_thres  # candidates

    # Settings
    # min_wh = 2  # (pixels) minimum box width and height
    time_limit = 0.5 + max_time_img * bs  # seconds to quit after
    multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)

    prediction = prediction.transpose(-1, -2)  # shape(1,84,6300) to shape(1,6300,84)
    prediction[..., :4] = xywh2xyxy(prediction[..., :4])  # xywh to xyxy

    t = time.time()
    output = [torch.zeros((0, 6 + nm), device=prediction.device)] * bs
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[:, 2:4] < min_wh) | (x[:, 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence

        # Cat apriori labels if autolabelling
        if labels and len(labels[xi]):
            lb = labels[xi]
            v = torch.zeros((len(lb), nc + nm + 4), device=x.device)
            v[:, :4] = xywh2xyxy(lb[:, 1:5])  # box
            v[range(len(lb)), lb[:, 0].long() + 4] = 1.0  # cls
            x = torch.cat((x, v), 0)

        # If none remain process next image
        if not x.shape[0]:
            continue

        # Detections matrix nx6 (xyxy, conf, cls)
        box, cls, mask = x.split((4, nc, nm), 1)

        if multi_label:
            i, j = torch.where(cls > conf_thres)
            x = torch.cat((box[i], x[i, 4 + j, None], j[:, None].float(), mask[i]), 1)
        else:  # best class only
            conf, j = cls.max(1, keepdim=True)
            x = torch.cat((box, conf, j.float(), mask), 1)[conf.view(-1) > conf_thres]

        # Filter by class
        if classes is not None:
            x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

        # Check shape
        n = x.shape[0]  # number of boxes
        if not n:  # no boxes
            continue
        if n > max_nms:  # excess boxes
            x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence and remove excess boxes

        # Batched NMS
        c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
        boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
        i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
        i = i[:max_det]  # limit detections

        output[xi] = x[i]
        if mps:
            output[xi] = output[xi].to(device)
        if (time.time() - t) > time_limit:
            print(f'WARNING ⚠️ NMS time limit {time_limit:.3f}s exceeded')
            break  # time limit exceeded

    return output


def xywh2xyxy(x):
    """
    Convert bounding box coordinates from (x, y, width, height) format to (x1, y1, x2, y2) format where (x1, y1) is the
    top-left corner and (x2, y2) is the bottom-right corner.

    Args:
        x (np.ndarray | torch.Tensor): The input bounding box coordinates in (x, y, width, height) format.
    Returns:
        y (np.ndarray | torch.Tensor): The bounding box coordinates in (x1, y1, x2, y2) format.
    """
    assert x.shape[-1] == 4, f'input shape last dimension expected 4 but input shape is {x.shape}'
    y = torch.empty_like(x) if isinstance(x, torch.Tensor) else np.empty_like(x)  # faster than clone/copy
    dw = x[..., 2] / 2  # half-width
    dh = x[..., 3] / 2  # half-height
    y[..., 0] = x[..., 0] - dw  # top left x
    y[..., 1] = x[..., 1] - dh  # top left y
    y[..., 2] = x[..., 0] + dw  # bottom right x
    y[..., 3] = x[..., 1] + dh  # bottom right y
    return y


def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None, padding=True):
    """
    Rescales bounding boxes (in the format of xyxy) from the shape of the image they were originally specified in
    (img1_shape) to the shape of a different image (img0_shape).

    Args:
        img1_shape (tuple): The shape of the image that the bounding boxes are for, in the format of (height, width).
        boxes (torch.Tensor): the bounding boxes of the objects in the image, in the format of (x1, y1, x2, y2)
        img0_shape (tuple): the shape of the target image, in the format of (height, width).
        ratio_pad (tuple): a tuple of (ratio, pad) for scaling the boxes. If not provided, the ratio and pad will be
            calculated based on the size difference between the two images.
        padding (bool): If True, assuming the boxes is based on image augmented by yolo style. If False then do regular
            rescaling.

    Returns:
        boxes (torch.Tensor): The scaled bounding boxes, in the format of (x1, y1, x2, y2)
    """
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain  = old / new
        pad = round((img1_shape[1] - img0_shape[1] * gain) / 2 - 0.1), round(
            (img1_shape[0] - img0_shape[0] * gain) / 2 - 0.1)  # wh padding
    else:
        gain = ratio_pad[0][0]
        pad = ratio_pad[1]

    if padding:
        boxes[..., [0, 2]] -= pad[0]  # x padding
        boxes[..., [1, 3]] -= pad[1]  # y padding
    boxes[..., :4] /= gain
    clip_boxes(boxes, img0_shape)
    return boxes


def clip_boxes(boxes, shape):
    """
    Takes a list of bounding boxes and a shape (height, width) and clips the bounding boxes to the shape.

    Args:
        boxes (torch.Tensor): the bounding boxes to clip
        shape (tuple): the shape of the image
    """
    if isinstance(boxes, torch.Tensor):  # faster individually
        boxes[..., 0].clamp_(0, shape[1])  # x1
        boxes[..., 1].clamp_(0, shape[0])  # y1
        boxes[..., 2].clamp_(0, shape[1])  # x2
        boxes[..., 3].clamp_(0, shape[0])  # y2
    else:  # np.array (faster grouped)
        boxes[..., [0, 2]] = boxes[..., [0, 2]].clip(0, shape[1])  # x1, x2
        boxes[..., [1, 3]] = boxes[..., [1, 3]].clip(0, shape[0])  # y1, y2


def plot_result(det_cpu, dst_img, category_names, image_name):
    for i, item in enumerate(det_cpu):
        rand_color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
        # Draw the box
        box_x1, box_y1, box_x2, box_y2 = item[0:4].astype(np.int32)
        cv2.rectangle(dst_img, (box_x1, box_y1), (box_x2, box_y2), color=rand_color, thickness=2)
        # Draw the label
        label = category_names[int(item[5])]
        score = item[4]
        org = (min(box_x1, box_x2), min(box_y1, box_y2) - 8)
        text = '{}|{:.2f}'.format(label, score)
        cv2.putText(dst_img, text, org=org, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.8, color=rand_color,
                    thickness=2)

    cv2.imshow('result', dst_img)
    cv2.waitKey()
    cv2.imwrite(image_name, dst_img)


if __name__ == '__main__':
    img_path = "./ultralytics/assets/bus.jpg"
    image_name = img_path.split('/')[-1]
    ori_img = cv2.imread(img_path)

    # Load the model
    model = load_model("runs/detect/train2/weights/best.pt")

    # Preprocess the image
    img = data_preprocess(model, ori_img, [640, 640])

    # Inference
    result = model(img, augment=False)
    preds = result[0]

    # NMS
    det = non_max_suppression(preds, conf_thres=0.35, iou_thres=0.45, nc=len(model.CLASSES))[0]

    # Map the boxes back to the original image size
    det[:, :4] = scale_boxes(img.shape[2:], det[:, :4], ori_img.shape)

    category_names = model.CLASSES

    # Show and save the result
    plot_result(det.cpu().numpy(), ori_img, category_names, image_name)
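
The letterbox step is easier to follow with concrete numbers. The sketch below repeats only its arithmetic, with no image involved, for a hypothetical 1280x720 frame and a 640 target; `letterbox_geometry` is an illustrative name and not part of the inference script.

```python
def letterbox_geometry(height, width, new_shape=640, stride=32, auto=True):
    """Return ((resized_w, resized_h), (pad_w_per_side, pad_h_per_side))."""
    r = min(new_shape / height, new_shape / width)  # scale so the longer side fits
    new_w, new_h = int(round(width * r)), int(round(height * r))
    dw, dh = new_shape - new_w, new_shape - new_h  # total padding needed
    if auto:
        # Minimum rectangle: pad only up to the next multiple of stride
        dw, dh = dw % stride, dh % stride
    return (new_w, new_h), (dw / 2, dh / 2)


print(letterbox_geometry(720, 1280))              # ((640, 360), (0.0, 12.0))
print(letterbox_geometry(720, 1280, auto=False))  # ((640, 360), (0.0, 140.0))
```

With auto enabled the frame becomes 640x384 (360 plus 12 pixels of padding on the top and on the bottom), while with auto disabled it is padded to the full 640x640 square.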

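Summary

This article explained the YOLOv10 model and walked through hands-on practice: training and testing on the COCO dataset, and training and testing on a custom Labelme dataset.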

