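I recently needed a dataset in COCO format, but most of the conversion methods I found online were rather cumbersome, so here is a quick note on what worked for me!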
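1. Adjust the directory structure (using the GC10-DET dataset as an example)
The YOLO-format dataset has the following directory structure:

GC10_yolo
├── images
│   ├── train
│   ├── val
│   └── test
└── labels
    ├── train
    ├── val
    └── test

In short, the images folder contains three subfolders (train, val, and test), each holding images;
the labels folder contains the same three subfolders (train, val, and test), each holding the matching annotation files!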
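Before converting, it can help to confirm that the folder really has this shape. Below is a minimal sketch of such a check; the helper name `find_missing_dirs` is hypothetical (it is not part of the original script), and the split names are the ones described above.

```python
import os


def find_missing_dirs(dataset_root, splits=("train", "val", "test")):
    """Return the expected YOLO sub-folders that are missing under dataset_root.

    Expected layout (as described above):
        dataset_root/images/{train,val,test}
        dataset_root/labels/{train,val,test}
    """
    missing = []
    for kind in ("images", "labels"):
        for split in splits:
            path = os.path.join(dataset_root, kind, split)
            if not os.path.isdir(path):
                # Report the path relative to the dataset root
                missing.append(os.path.join(kind, split))
    return missing
```

Calling `find_missing_dirs` on your YOLO dataset folder should return an empty list when all six subfolders are in place.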
2. Run the conversion code! (Just change the two paths and replace the category names with those of your own dataset.)
import os
import json
from PIL import Image

# Dataset paths
output_dir = "D:\\AAAAA\\GC10_coco"  # change to where the COCO-format annotation files should be written
dataset_path = "D:\\AAAAA\\GC10_yolo"  # change to the path of your YOLO-format dataset
images_path = os.path.join(dataset_path, "images")
labels_path = os.path.join(dataset_path, "labels")

# Category mapping
# Note: ids start at 1 so that they match the `int(category_id) + 1`
# used when the annotations are written further down.
categories = [
    {"id": 1, "name": "1_chongkong"},
    {"id": 2, "name": "2_hanfeng"},
    {"id": 3, "name": "3_yueyawan"},
    {"id": 4, "name": "4_shuiban"},
    {"id": 5, "name": "5_youban"},
    {"id": 6, "name": "6_siban"},
    {"id": 7, "name": "7_yiwu"},
    {"id": 8, "name": "8_yahen"},
    {"id": 9, "name": "9_zhehen"},
    {"id": 10, "name": "10_yaozhe"},
    # add more categories here
]

# Convert a YOLO box to a COCO box
def convert_yolo_to_coco(x_center, y_center, width, height, img_width, img_height):
    # YOLO: normalized center + size; COCO: absolute top-left corner + size in pixels
    x_min = (x_center - width / 2) * img_width
    y_min = (y_center - height / 2) * img_height
    width = width * img_width
    height = height * img_height
    return [x_min, y_min, width, height]

# Initialize the COCO data structure
def init_coco_format():
    return {
        "images": [],
        "annotations": [],
        "categories": categories
    }

# Process each dataset split
for split in ['train', 'test', 'val']:
    coco_format = init_coco_format()
    annotation_id = 1
    for img_name in os.listdir(os.path.join(images_path, split)):
        if img_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            img_path = os.path.join(images_path, split, img_name)
            # Swap only the file extension, so .png and .jpeg images also find their label file
            label_path = os.path.join(labels_path, split, os.path.splitext(img_name)[0] + ".txt")
            with Image.open(img_path) as img:
                img_width, img_height = img.size
            image_info = {
                "file_name": img_name,
                "id": len(coco_format["images"]) + 1,
                "width": img_width,
                "height": img_height
            }
            coco_format["images"].append(image_info)
            if os.path.exists(label_path):
                with open(label_path, "r") as file:
                    for line in file:
                        if not line.strip():
                            continue  # skip blank lines
                        category_id, x_center, y_center, width, height = map(float, line.split())
                        bbox = convert_yolo_to_coco(x_center, y_center, width, height, img_width, img_height)
                        annotation = {
                            "id": annotation_id,
                            "image_id": image_info["id"],
                            # YOLO class indices start at 0; +1 gives 1-based ids,
                            # which must match the ids listed in `categories`
                            "category_id": int(category_id) + 1,
                            "bbox": bbox,
                            "area": bbox[2] * bbox[3],
                            "iscrowd": 0
                        }
                        coco_format["annotations"].append(annotation)
                        annotation_id += 1
    # Save one JSON file per split
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, f"{split}_coco_format.json"), "w") as json_file:
        json.dump(coco_format, json_file, indent=4)
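To make the bounding-box step concrete, here is a small worked example. YOLO stores a box as a normalized center point plus a normalized width and height, while COCO stores the absolute top-left corner plus the width and height in pixels. The function name `yolo_box_to_coco` and the sample numbers are made up purely for illustration.

```python
def yolo_box_to_coco(x_center, y_center, width, height, img_width, img_height):
    """Convert one normalized YOLO box to a COCO [x_min, y_min, w, h] box in pixels."""
    x_min = (x_center - width / 2) * img_width
    y_min = (y_center - height / 2) * img_height
    box_w = width * img_width
    box_h = height * img_height
    return [x_min, y_min, box_w, box_h]


# A box centered in a 640x480 image, a quarter of the width and half of the height
box = yolo_box_to_coco(0.5, 0.5, 0.25, 0.5, 640, 480)
print(box)  # [240.0, 120.0, 160.0, 240.0]
```

The `area` field written by the script is simply the product of the last two values, here 160 * 240 = 38400 pixels.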
3. After the conversion, just move the images over to the COCO dataset folder
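If you would rather script this step too, the sketch below copies the images split by split. Note the assumptions: the post does not say which folder layout the COCO side should have, so placing the images under `<coco_root>/<split>` is my own guess, and the helper name `copy_split_images` is hypothetical. Adjust the destination to whatever your training framework expects.

```python
import os
import shutil

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def copy_split_images(yolo_images_dir, coco_root, splits=("train", "val", "test")):
    """Copy images from <yolo_images_dir>/<split> to <coco_root>/<split>.

    The destination layout is an assumption, not something the COCO format requires.
    Returns a dict mapping each split name to the number of images copied.
    """
    copied = {}
    for split in splits:
        src_dir = os.path.join(yolo_images_dir, split)
        if not os.path.isdir(src_dir):
            copied[split] = 0  # nothing to copy for a missing split
            continue
        dst_dir = os.path.join(coco_root, split)
        os.makedirs(dst_dir, exist_ok=True)
        count = 0
        for name in os.listdir(src_dir):
            if name.lower().endswith(IMAGE_EXTENSIONS):
                shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))
                count += 1
        copied[split] = count
    return copied
```

Copying (rather than moving) keeps the original YOLO dataset intact, at the cost of extra disk space.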
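🎈All done! The conversion is not fully automatic, but it is fairly simple and painless!
🤞The code is adapted from a blog post I read a long time ago, and I can no longer find the link to it!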