Training YOLOv10 on a Custom Dataset

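The previous post covered YOLO inference. In this post we train YOLOv10 on our own dataset. The process has three steps:

Prepare the dataset (images and annotations)
Train
Run inference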

Dataset preparation

The dataset used here is an internal one annotated in VOC format, so we first need to convert the VOC annotations into YOLO-format label files.

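VOC annotation files

VOC annotations are XML files. Each bounding box is stored as the pixel coordinates of its top-left and bottom-right corners.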
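To make the layout concrete, here is a minimal sketch that reads the boxes out of a VOC-style annotation. The XML content (file name, image size, class name, and box) is a made-up example, not taken from the internal dataset, and `read_voc_boxes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal VOC annotation; all values are illustrative only.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>480</xmin><ymin>270</ymin><xmax>960</xmax><ymax>810</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return (width, height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        # Corners are read in the order top-left (xmin, ymin), bottom-right (xmax, ymax)
        coords = tuple(int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text, *coords))
    return width, height, boxes


print(read_voc_boxes(SAMPLE_XML))
```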

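YOLO annotation files

Each line of a YOLO label file describes one object with five space-separated fields:

Class ID: an integer identifying the object class.
Center X: the x coordinate of the box center, normalized by the image width (between 0 and 1).
Center Y: the y coordinate of the box center, normalized by the image height (between 0 and 1).
Width: the box width, normalized by the image width (between 0 and 1).
Height: the box height, normalized by the image height (between 0 and 1).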
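The relation between the two formats can be sketched as a small function. The box and image size below are illustrative values, and `voc_box_to_yolo` is a hypothetical helper name; the six-decimal output formatting is a choice made here, not a requirement stated in the post.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert corner coordinates in pixels to normalized (cx, cy, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    box_w = (xmax - xmin) / img_w
    box_h = (ymax - ymin) / img_h
    return x_center, y_center, box_w, box_h


# Illustrative box (480, 270)-(960, 810) in a 1920x1080 image, class ID 0
cx, cy, w, h = voc_box_to_yolo(480, 270, 960, 810, 1920, 1080)
label_line = f"0 {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
print(label_line)  # 0 0.375000 0.500000 0.250000 0.500000
```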


Converting VOC to YOLO format

The following Python script converts the VOC annotations to YOLO format.

import os
import xml.etree.ElementTree as ET


# Read class names from a file and build a class-name -> index mapping
def load_classes(file_path):
    # utf-8 so that non-ASCII class names are read correctly
    with open(file_path, 'r', encoding='utf-8') as file:
        classes = [line.strip() for line in file if line.strip()]
    return {cls: idx for idx, cls in enumerate(classes)}


# Convert one VOC XML annotation into a YOLO-format label file
def convert_voc_to_yolo(xml_file, output_dir, class_mapping):
    tree = ET.parse(xml_file)
    root = tree.getroot()

    image_filename = root.find('filename').text
    label_path = os.path.join(output_dir, os.path.splitext(image_filename)[0] + '.txt')

    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)

    with open(label_path, 'w') as yolo_file:
        for obj in root.findall('object'):
            class_name = obj.find('name').text
            if class_name not in class_mapping:
                continue  # skip classes that are not listed in classes.txt
            class_id = class_mapping[class_name]

            xmlbox = obj.find('bndbox')
            xmin = int(xmlbox.find('xmin').text)
            ymin = int(xmlbox.find('ymin').text)
            xmax = int(xmlbox.find('xmax').text)
            ymax = int(xmlbox.find('ymax').text)

            # Compute the YOLO center coordinates, width and height (all normalized)
            x_center = (xmin + xmax) / 2.0 / width
            y_center = (ymin + ymax) / 2.0 / height
            bbox_width = (xmax - xmin) / float(width)
            bbox_height = (ymax - ymin) / float(height)

            yolo_file.write(f"{class_id} {x_center} {y_center} {bbox_width} {bbox_height}\n")


voc_annotation_dir = 'datasets/TG/TJHVOC2007/Annotations/'
yolo_output_dir = 'datasets/TG/TJHVOC2007/labels'
classes_file = 'classes.txt'

os.makedirs(yolo_output_dir, exist_ok=True)

class_mapping = load_classes(classes_file)

for xml_file in os.listdir(voc_annotation_dir):
    if xml_file.endswith('.xml'):
        convert_voc_to_yolo(os.path.join(voc_annotation_dir, xml_file), yolo_output_dir, class_mapping)

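classes.txt lists one class name per line; edit it to match your own classes. The names must match the name fields in the XML annotations exactly, because objects with unlisted names are skipped:

pedestrian
car
traffic light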
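After conversion, a quick sanity check on the generated label lines can catch problems early. The sketch below assumes the five-field format described above; `check_yolo_line` is a hypothetical helper, not part of any YOLO library.

```python
def check_yolo_line(line, num_classes):
    """Return True if the line has a valid class ID and four values in [0, 1]."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    if not 0 <= class_id < num_classes:
        return False
    return all(0.0 <= v <= 1.0 for v in values)


print(check_yolo_line("0 0.375 0.5 0.25 0.5", num_classes=3))  # valid line
print(check_yolo_line("3 0.375 0.5 0.25 0.5", num_classes=3))  # class ID out of range
```

A failed check usually points to a box that extends past the image border or to a class missing from classes.txt.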

Splitting into training and validation sets

Next, split the converted dataset into a training set and a validation set:

import os
import shutil
import random

# Paths to the images and annotations
image_dir = './images'
annotation_dir = './labels'

# Paths for the train and val directories
train_image_dir = '../images/train'
val_image_dir = '../images/val'
train_annotation_dir = '../labels/train'
val_annotation_dir = '../labels/val'

# Create directories if they don't exist
os.makedirs(train_image_dir, exist_ok=True)
os.makedirs(val_image_dir, exist_ok=True)
os.makedirs(train_annotation_dir, exist_ok=True)
os.makedirs(val_annotation_dir, exist_ok=True)

# List all images
images = sorted(f for f in os.listdir(image_dir) if os.path.isfile(os.path.join(image_dir, f)))

# Pair each image with the label file that has the same base name.
# Zipping two separate os.listdir() results can mis-pair files, because the listing order is arbitrary.
data = []
for image in images:
    annotation = os.path.splitext(image)[0] + '.txt'
    assert os.path.isfile(os.path.join(annotation_dir, annotation)), f"Missing label for {image}"
    data.append((image, annotation))

# Shuffle the data
random.shuffle(data)

# Split ratio: 80% training, the remaining 20% validation
train_ratio = 0.8
num_train_samples = int(train_ratio * len(data))

# Split the data into training and validation sets
train_data = data[:num_train_samples]
val_data = data[num_train_samples:]

# Copy files to the respective directories
for image, annotation in train_data:
    shutil.copy(os.path.join(image_dir, image), os.path.join(train_image_dir, image))
    shutil.copy(os.path.join(annotation_dir, annotation), os.path.join(train_annotation_dir, annotation))

for image, annotation in val_data:
    shutil.copy(os.path.join(image_dir, image), os.path.join(val_image_dir, image))
    shutil.copy(os.path.join(annotation_dir, annotation), os.path.join(val_annotation_dir, annotation))

print(f"Training data: {len(train_data)} samples")
print(f"Validation data: {len(val_data)} samples")
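If the split needs to be reproducible across runs, one option is to seed the shuffle. The sketch below works on file names only (no files are copied); `split_pairs`, its `seed` parameter, and the example file names are assumptions for illustration.

```python
import os
import random


def split_pairs(image_names, label_names, train_ratio=0.8, seed=0):
    """Pair images with labels by base name, then split with a seeded shuffle."""
    labels_by_stem = {os.path.splitext(name)[0]: name for name in label_names}
    pairs = []
    for image in sorted(image_names):
        stem = os.path.splitext(image)[0]
        if stem in labels_by_stem:  # images without a label file are left out
            pairs.append((image, labels_by_stem[stem]))
    random.Random(seed).shuffle(pairs)  # same seed gives the same order
    num_train = int(train_ratio * len(pairs))
    return pairs[:num_train], pairs[num_train:]


# Hypothetical file names; the label list is deliberately in a different order
images = [f"img{i}.jpg" for i in range(10)]
labels = [f"img{i}.txt" for i in reversed(range(10))]
train, val = split_pairs(images, labels)
print(len(train), len(val))
```

Using a dedicated `random.Random(seed)` instance keeps the split stable without changing the global random state.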

Running training

Create the configuration file VOC.yaml, then start training.

## VOC.yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: TG
train: # train images (relative to 'path')  16551 images
  - TJHVOC2007
val: # val images (relative to 'path')  4952 images
  - TJHVOC2007
test: # test images (optional)
  - TJHVOC2007

# Classes (edit as needed)
names:
  0: person

## Run (model is a YOLOv10 model object created beforehand)
results = model.train(data="./VOC.yaml", epochs=20, imgsz=1920, batch=4)
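When the class list changes often, the dataset file can be generated instead of edited by hand. The sketch below is an assumption-laden illustration: it assumes an Ultralytics-style dataset YAML with `path`, `train`, `val`, and `names` keys, and it assumes the images live under `images/train` and `images/val` relative to `path`. `build_dataset_yaml` is a hypothetical helper.

```python
def build_dataset_yaml(root, class_names, train="images/train", val="images/val"):
    """Return the text of a minimal dataset YAML (layout is an assumption)."""
    lines = [
        f"path: {root}",
        f"train: {train}",
        f"val: {val}",
        "names:",
    ]
    # One "index: name" entry per class, in the same order as classes.txt
    lines += [f"  {idx}: {name}" for idx, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml("TG", ["person"])
print(yaml_text)
```

The class order must be the same as in classes.txt, since the label files store only the numeric index.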

Results

Precision is 71% and recall is 89%. That is not very high; training for more epochs should improve the results somewhat.
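For reference, precision and recall are computed from true positives (TP), false positives (FP), and false negatives (FN). The counts in this sketch are hypothetical, chosen only to produce numbers of a similar magnitude; they are not the actual counts from this training run.

```python
def precision_recall(tp, fp, fn):
    """precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts: 89 correct detections, 36 false alarms, 11 missed objects
p, r = precision_recall(tp=89, fp=36, fn=11)
print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.712 recall=0.890
```

A recall higher than precision, as here, means the model finds most objects but also produces a fair number of false detections.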

Inference

Point the model path at the trained weights, then run inference.

YOLOv10('runs/detect/train25/weights/best.pt')

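Summary

Training is straightforward. This post used a model with relatively few parameters. Because the images are large, too large a batch size causes out-of-memory errors, so adjust the batch size to fit your GPU memory.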
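As a rough illustration of why image size and batch size matter, the sketch below estimates the size of the input batch alone, assuming square images, 3 channels, and float32 values (all assumptions). Activations and gradients inside the network take far more memory than this, so the number is a lower bound, not a prediction of actual GPU usage.

```python
def input_batch_bytes(batch, imgsz, channels=3, bytes_per_value=4):
    """Bytes needed for one input batch of square float32 images."""
    return batch * channels * imgsz * imgsz * bytes_per_value


for batch in (4, 8, 16):
    mib = input_batch_bytes(batch, 1920) / 2**20
    print(f"batch={batch}: {mib:.2f} MiB for the input tensor alone")
```

The cost grows linearly with the batch size and quadratically with the image side length, which is why halving `imgsz` frees more memory than halving `batch`.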
