目标检测YOLO数据集的三种格式及转换

目标检测YOLO数据集的三种格式

在目标检测领域,YOLO(You Only Look Once)算法是一个流行的选择。为了训练和测试YOLO模型,需要将数据集格式化为YOLO可以识别的格式。以下是三种常见的YOLO数据集格式及其特点和转换方法。

1. YOLO的TXT格式

YOLO的TXT格式是最简单的数据集格式之一。它要求图片和标签分别存放在两个文件夹中,并按照一定的比例分为训练集和验证集。每个TXT文件包含目标的类别和边界框的坐标,格式如下:
在这里插入图片描述如图中所示,以.txt存放边界框信息

类别 [x_center, y_center, w, h]
  • 类别:目标的类别编号。
  • x_center, y_center, w, h:边界框的中心坐标和宽度、高度,这些值都是相对于整张图片的比例,小于1。

例如,使用makesense.ai标注后直接输出的就是TXT标签文件。为了训练,需要编写一个custom.yaml配置文件,指定路径和类别名称,然后使用官方的训练脚本。

2. VOC格式

VOC格式是一种更为复杂的数据集格式,它包含图片、标注和数据集分割。VOC格式由以下部分组成:
在这里插入图片描述VOC格式是以xml方式給出标注信息的

  • JPEGImages:存放数据集图片的文件夹。
  • Annotations:存放与图片对应的XML标注文件的文件夹。
  • ImageSets/Main:包含train.txt和val.txt文件,用于区分训练集和验证集。
<annotation><folder>17</folder> # 图片所处文件夹<filename>77258.bmp</filename> # 图片名<path>~/frcnn-image/61/ADAS/image/frcnn-image/17/77258.bmp</path><source>  #图片来源相关信息<database>Unknown</database>  </source><size> #图片尺寸<width>640</width><height>480</height><depth>3</depth></size><segmented>0</segmented>  #是否有分割label<object> 包含的物体<name>car</name>  #物体类别<pose>Unspecified</pose>  #物体的姿态<truncated>0</truncated>  #物体是否被部分遮挡(>15%)<difficult>0</difficult>  #是否为难以辨识的物体, 主要指要结体背景才能判断出类别的物体。虽有标注, 但一般忽略这类物体<bndbox>  #物体的bound box<xmin>2</xmin>     #左<ymin>156</ymin>   #上<xmax>111</xmax>   #右<ymax>259</ymax>   #下</bndbox></object>
</annotation>
'''
XML文件记录的是像素坐标,不是比例,因此在图像尺寸变化时可能会有轻微误差。可以使用Python脚本来将TXT格式的标注转换为XML格式,并进一步生成用于VOC格式的train.txt和val.txt文件。#### 3. COCO格式
COCO格式是一种广泛使用的数据集格式,它将多个TXT文件转换为单个JSON文件。COCO格式的转换涉及以下步骤:- 创建一个包含图像、标注和类别信息的字典。
- 将类别信息添加到JSON字典中。
- 将图像信息添加到JSON字典中。
- 将标注信息添加到JSON字典中。COCO格式的转换通常通过编写Python脚本完成,该脚本读取TXT标注文件,并将它们转换为JSON格式。转换后,可以使用YOLOv4或YOLOr等YOLO变体进行训练。### 转换示例
以下是一些用于转换数据集格式的Python代码示例:#### TXT转XML
```python
def makexml(picPath, txtPath, xmlPath):# 省略了函数的具体实现,用于将YOLO格式的TXT标注文件转换为VOC格式的XML文件pass

格式转换

代码来源于知乎高赞文章,亲测好用无bug~

coco转voc

from pycocotools.coco import COCO
import os
from lxml import etree, objectify
import shutil
from tqdm import tqdm
import sys
import argparse
'''# 将类别名字和id建立索引
def catid2name(coco):classes = dict()for cat in coco.dataset['categories']:classes[cat['id']] = cat['name']return classes# 将标签信息写入xml
def save_anno_to_xml(filename, size, objs, save_path):E = objectify.ElementMaker(annotate=False)anno_tree = E.annotation(E.folder("DATA"),E.filename(filename),E.source(E.database("The VOC Database"),E.annotation("PASCAL VOC"),E.image("flickr")),E.size(E.width(size['width']),E.height(size['height']),E.depth(size['depth'])),E.segmented(0))for obj in objs:E2 = objectify.ElementMaker(annotate=False)anno_tree2 = E2.object(E.name(obj[0]),E.pose("Unspecified"),E.truncated(0),E.difficult(0),E.bndbox(E.xmin(obj[1]),E.ymin(obj[2]),E.xmax(obj[3]),E.ymax(obj[4])))anno_tree.append(anno_tree2)anno_path = os.path.join(save_path, filename[:-3] + "xml")etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)# 利用cocoAPI从json中加载信息
def load_coco(anno_file, xml_save_path):if os.path.exists(xml_save_path):shutil.rmtree(xml_save_path)os.makedirs(xml_save_path)coco = COCO(anno_file)classes = catid2name(coco)imgIds = coco.getImgIds()classesIds = coco.getCatIds()for imgId in tqdm(imgIds):size = {}img = coco.loadImgs(imgId)[0]filename = img['file_name']width = img['width']height = img['height']size['width'] = widthsize['height'] = heightsize['depth'] = 3annIds = coco.getAnnIds(imgIds=img['id'], iscrowd=None)anns = coco.loadAnns(annIds)objs = []for ann in anns:object_name = classes[ann['category_id']]# bbox:[x,y,w,h]bbox = list(map(int, ann['bbox']))xmin = bbox[0]ymin = bbox[1]xmax = bbox[0] + bbox[2]ymax = bbox[1] + bbox[3]obj = [object_name, xmin, ymin, xmax, ymax]objs.append(obj)save_anno_to_xml(filename, size, objs, xml_save_path)def parseJsonFile(data_dir, xmls_save_path):assert os.path.exists(data_dir), "data dir:{} does not exits".format(data_dir)if os.path.isdir(data_dir):data_types = ['train2017', 'val2017']for data_type in data_types:ann_file = 'instances_{}.json'.format(data_type)xmls_save_path = os.path.join(xmls_save_path, data_type)load_coco(ann_file, xmls_save_path)elif os.path.isfile(data_dir):anno_file = data_dirload_coco(anno_file, xmls_save_path)if __name__ == '__main__':"""脚本说明:该脚本用于将coco格式的json文件转换为voc格式的xml文件参数说明:data_dir:json文件的路径xml_save_path:xml输出路径"""parser = argparse.ArgumentParser()parser.add_argument('-d', '--data-dir', type=str, default='./data/labels/coco/train.json', help='json path')parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')opt = parser.parse_args()print(opt)if len(sys.argv) > 1:parseJsonFile(opt.data_dir, opt.save_path)else:data_dir = './data/labels/coco/train.json'xml_save_path = './data/convert/voc'parseJsonFile(data_dir=data_dir, xmls_save_path=xml_save_path)

coco转yolo

from pycocotools.coco import COCO
import os
import shutil
from tqdm import tqdm
import sys
import argparseimages_nums = 0
category_nums = 0
bbox_nums = 0# 将类别名字和id建立索引
def catid2name(coco):classes = dict()for cat in coco.dataset['categories']:classes[cat['id']] = cat['name']return classes# 将[xmin,ymin,xmax,ymax]转换为yolo格式[x_center, y_center, w, h](做归一化)
def xyxy2xywhn(object, width, height):cat_id = object[0]xn = object[1] / widthyn = object[2] / heightwn = object[3] / widthhn = object[4] / heightout = "{} {:.5f} {:.5f} {:.5f} {:.5f}".format(cat_id, xn, yn, wn, hn)return outdef save_anno_to_txt(images_info, save_path):filename = images_info['filename']txt_name = filename[:-3] + "txt"with open(os.path.join(save_path, txt_name), "w") as f:for obj in images_info['objects']:line = xyxy2xywhn(obj, images_info['width'], images_info['height'])f.write("{}\n".format(line))# 利用cocoAPI从json中加载信息
def load_coco(anno_file, xml_save_path):if os.path.exists(xml_save_path):shutil.rmtree(xml_save_path)os.makedirs(xml_save_path)coco = COCO(anno_file)classes = catid2name(coco)imgIds = coco.getImgIds()classesIds = coco.getCatIds()with open(os.path.join(xml_save_path, "classes.txt"), 'w') as f:for id in classesIds:f.write("{}\n".format(classes[id]))for imgId in tqdm(imgIds):info = {}img = coco.loadImgs(imgId)[0]filename = img['file_name']width = img['width']height = img['height']info['filename'] = filenameinfo['width'] = widthinfo['height'] = heightannIds = coco.getAnnIds(imgIds=img['id'], iscrowd=None)anns = coco.loadAnns(annIds)objs = []for ann in anns:object_name = classes[ann['category_id']]# bbox:[x,y,w,h]bbox = list(map(float, ann['bbox']))xc = bbox[0] + bbox[2] / 2.yc = bbox[1] + bbox[3] / 2.w = bbox[2]h = bbox[3]obj = [ann['category_id'], xc, yc, w, h]objs.append(obj)info['objects'] = objssave_anno_to_txt(info, xml_save_path)def parseJsonFile(json_path, txt_save_path):assert os.path.exists(json_path), "json path:{} does not exists".format(json_path)if os.path.exists(txt_save_path):shutil.rmtree(txt_save_path)os.makedirs(txt_save_path)assert json_path.endswith('json'), "json file:{} It is not json file!".format(json_path)load_coco(json_path, txt_save_path)if __name__ == '__main__':"""脚本说明:该脚本用于将coco格式的json文件转换为yolo格式的txt文件参数说明:json_path:json文件的路径txt_save_path:txt保存的路径"""parser = argparse.ArgumentParser()parser.add_argument('-jp', '--json-path', type=str, default='./data/labels/coco/train.json', help='json path')parser.add_argument('-s', '--save-path', type=str, default='./data/convert/yolo', help='txt save path')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)parseJsonFile(opt.json_path, opt.save_path)# print("image nums: {}".format(images_nums))# print("category nums: {}".format(category_nums))# print("bbox nums: {}".format(bbox_nums))else:json_path = './data/labels/coco/train.json'  # r'D:\practice\compete\goodsDec\data\train\train.json'txt_save_path = './data/convert/yolo'parseJsonFile(json_path, txt_save_path)# print("image nums: {}".format(images_nums))# print("category nums: {}".format(category_nums))# print("bbox nums: {}".format(bbox_nums))

voc转coco

import xml.etree.ElementTree as ET
import os
import json
from datetime import datetime
import sys
import argparsecoco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []category_set = dict()
image_set = set()category_item_id = -1
image_id = 000000
annotation_id = 0def addCatItem(name):global category_item_idcategory_item = dict()category_item['supercategory'] = 'none'category_item_id += 1category_item['id'] = category_item_idcategory_item['name'] = namecoco['categories'].append(category_item)category_set[name] = category_item_idreturn category_item_iddef addImgItem(file_name, size):global image_idif file_name is None:raise Exception('Could not find filename tag in xml file.')if size['width'] is None:raise Exception('Could not find width tag in xml file.')if size['height'] is None:raise Exception('Could not find height tag in xml file.')image_id += 1image_item = dict()image_item['id'] = image_idimage_item['file_name'] = file_nameimage_item['width'] = size['width']image_item['height'] = size['height']image_item['license'] = Noneimage_item['flickr_url'] = Noneimage_item['coco_url'] = Noneimage_item['date_captured'] = str(datetime.today())coco['images'].append(image_item)image_set.add(file_name)return image_iddef addAnnoItem(object_name, image_id, category_id, bbox):global annotation_idannotation_item = dict()annotation_item['segmentation'] = []seg = []# bbox[] is x,y,w,h# left_topseg.append(bbox[0])seg.append(bbox[1])# left_bottomseg.append(bbox[0])seg.append(bbox[1] + bbox[3])# right_bottomseg.append(bbox[0] + bbox[2])seg.append(bbox[1] + bbox[3])# right_topseg.append(bbox[0] + bbox[2])seg.append(bbox[1])annotation_item['segmentation'].append(seg)annotation_item['area'] = bbox[2] * bbox[3]annotation_item['iscrowd'] = 0annotation_item['ignore'] = 0annotation_item['image_id'] = image_idannotation_item['bbox'] = bboxannotation_item['category_id'] = category_idannotation_id += 1annotation_item['id'] = annotation_idcoco['annotations'].append(annotation_item)def read_image_ids(image_sets_file):ids = []with open(image_sets_file, 'r') as f:for line in f.readlines():ids.append(line.strip())return idsdef parseXmlFilse(data_dir, json_save_path, split='train'):assert os.path.exists(data_dir), "data path:{} does not exist".format(data_dir)labelfile = split + ".txt"image_sets_file = os.path.join(data_dir, "ImageSets", "Main", labelfile)xml_files_list = []if os.path.isfile(image_sets_file):ids = read_image_ids(image_sets_file)xml_files_list = [os.path.join(data_dir, "Annotations", f"{i}.xml") for i in ids]elif os.path.isdir(data_dir):# 修改此处xml的路径即可# xml_dir = os.path.join(data_dir,"labels/voc")xml_dir = data_dirxml_list = os.listdir(xml_dir)xml_files_list = [os.path.join(xml_dir, i) for i in xml_list]for xml_file in xml_files_list:if not xml_file.endswith('.xml'):continuetree = ET.parse(xml_file)root = tree.getroot()# 初始化size = dict()size['width'] = Nonesize['height'] = Noneif root.tag != 'annotation':raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))# 提取图片名字file_name = root.findtext('filename')assert file_name is not None, "filename is not in the file"# 提取图片 size {width,height,depth}size_info = root.findall('size')assert size_info is not None, "size is not in the file"for subelem in size_info[0]:size[subelem.tag] = int(subelem.text)if file_name is not None and size['width'] is not None and file_name not in image_set:# 添加coco['image'],返回当前图片IDcurrent_image_id = addImgItem(file_name, size)print('add image with name: {}\tand\tsize: {}'.format(file_name, size))elif file_name in image_set:raise Exception('file_name duplicated')else:raise Exception("file name:{}\t size:{}".format(file_name, size))# 提取一张图片内所有目标object标注信息object_info = root.findall('object')if len(object_info) == 0:continue# 遍历每个目标的标注信息for object in object_info:# 提取目标名字object_name = object.findtext('name')if object_name not in category_set:# 创建类别索引current_category_id = addCatItem(object_name)else:current_category_id = category_set[object_name]# 初始化标签列表bndbox = dict()bndbox['xmin'] = Nonebndbox['xmax'] = Nonebndbox['ymin'] = Nonebndbox['ymax'] = None# 提取box:[xmin,ymin,xmax,ymax]bndbox_info = object.findall('bndbox')for box in bndbox_info[0]:bndbox[box.tag] = int(box.text)if bndbox['xmin'] is not None:if object_name is None:raise Exception('xml structure broken at bndbox tag')if current_image_id is None:raise Exception('xml structure broken at bndbox tag')if current_category_id is None:raise Exception('xml structure broken at bndbox tag')bbox = []# xbbox.append(bndbox['xmin'])# ybbox.append(bndbox['ymin'])# wbbox.append(bndbox['xmax'] - bndbox['xmin'])# hbbox.append(bndbox['ymax'] - bndbox['ymin'])print('add annotation with object_name:{}\timage_id:{}\tcat_id:{}\tbbox:{}'.format(object_name,current_image_id,current_category_id,bbox))addAnnoItem(object_name, current_image_id, current_category_id, bbox)json_parent_dir = os.path.dirname(json_save_path)if not os.path.exists(json_parent_dir):os.makedirs(json_parent_dir)json.dump(coco, open(json_save_path, 'w'))print("class nums:{}".format(len(coco['categories'])))print("image nums:{}".format(len(coco['images'])))print("bbox nums:{}".format(len(coco['annotations'])))if __name__ == '__main__':""""""parser = argparse.ArgumentParser()parser.add_argument('-d', '--voc-dir', type=str, default='data/label/voc', help='voc path')parser.add_argument('-s', '--save-path', type=str, default='./data/convert/coco/train.json', help='json save path')parser.add_argument('-t', '--type', type=str, default='train', help='only use in voc2012/2007')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)parseXmlFilse(opt.voc_dir, opt.save_path, opt.type)else:# voc_data_dir = r'D:\dataset\VOC2012\VOCdevkit\VOC2012'voc_data_dir = './data/labels/voc'json_save_path = './data/convert/coco/train.json'split = 'train'parseXmlFilse(data_dir=voc_data_dir, json_save_path=json_save_path, split=split)

voc转yolo

import os
import json
import argparse
import sys
import shutil
from lxml import etree
from tqdm import tqdmcategory_set = set()
image_set = set()
bbox_nums = 0def parse_xml_to_dict(xml):"""将xml文件解析成字典形式,参考tensorflow的recursive_parse_xml_to_dictArgs:xml: xml tree obtained by parsing XML file contents using lxml.etreeReturns:Python dictionary holding XML contents."""if len(xml) == 0:  # 遍历到底层,直接返回tag对应的信息return {xml.tag: xml.text}result = {}for child in xml:child_result = parse_xml_to_dict(child)  # 递归遍历标签信息if child.tag != 'object':result[child.tag] = child_result[child.tag]else:if child.tag not in result:  # 因为object可能有多个,所以需要放入列表里result[child.tag] = []result[child.tag].append(child_result[child.tag])return {xml.tag: result}def write_classIndices(category_set):class_indices = dict((k, v) for v, k in enumerate(category_set))json_str = json.dumps(dict((val, key) for key, val in class_indices.items()), indent=4)with open('class_indices.json', 'w') as json_file:json_file.write(json_str)def xyxy2xywhn(bbox, size):bbox = list(map(float, bbox))size = list(map(float, size))xc = (bbox[0] + (bbox[2] - bbox[0]) / 2.) / size[0]yc = (bbox[1] + (bbox[3] - bbox[1]) / 2.) / size[1]wn = (bbox[2] - bbox[0]) / size[0]hn = (bbox[3] - bbox[1]) / size[1]return (xc, yc, wn, hn)def parser_info(info: dict, only_cat=True, class_indices=None):filename = info['annotation']['filename']image_set.add(filename)objects = []width = int(info['annotation']['size']['width'])height = int(info['annotation']['size']['height'])for obj in info['annotation']['object']:obj_name = obj['name']category_set.add(obj_name)if only_cat:continuexmin = int(obj['bndbox']['xmin'])ymin = int(obj['bndbox']['ymin'])xmax = int(obj['bndbox']['xmax'])ymax = int(obj['bndbox']['ymax'])bbox = xyxy2xywhn((xmin, ymin, xmax, ymax), (width, height))if class_indices is not None:obj_category = class_indices[obj_name]object = [obj_category, bbox]objects.append(object)return filename, objectsdef parseXmlFilse(voc_dir, save_dir):assert os.path.exists(voc_dir), "ERROR {} does not exists".format(voc_dir)if os.path.exists(save_dir):shutil.rmtree(save_dir)os.makedirs(save_dir)xml_files = [os.path.join(voc_dir, i) for i in os.listdir(voc_dir) if os.path.splitext(i)[-1] == '.xml']for xml_file in xml_files:with open(xml_file) as fid:xml_str = fid.read()xml = etree.fromstring(xml_str)info_dict = parse_xml_to_dict(xml)parser_info(info_dict, only_cat=True)with open(save_dir + "/classes.txt", 'w') as classes_file:for cat in sorted(category_set):classes_file.write("{}\n".format(cat))class_indices = dict((v, k) for k, v in enumerate(sorted(category_set)))xml_files = tqdm(xml_files)for xml_file in xml_files:with open(xml_file) as fid:xml_str = fid.read()xml = etree.fromstring(xml_str)info_dict = parse_xml_to_dict(xml)filename, objects = parser_info(info_dict, only_cat=False, class_indices=class_indices)if len(objects) != 0:global bbox_numsbbox_nums += len(objects)with open(save_dir + "/" + filename.split(".")[0] + ".txt", 'w') as f:for obj in objects:f.write("{} {:.5f} {:.5f} {:.5f} {:.5f}\n".format(obj[0], obj[1][0], obj[1][1], obj[1][2], obj[1][3]))if __name__ == '__main__':parser = argparse.ArgumentParser()parser.add_argument('--voc-dir', type=str, default='./data/labels/voc')parser.add_argument('--save-dir', type=str, default='./data/convert/yolo')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)parseXmlFilse(**vars(opt))print("image nums: {}".format(len(image_set)))print("category nums: {}".format(len(category_set)))print("bbox nums: {}".format(bbox_nums))else:voc_dir = './data/labels/voc'save_dir = './data/convert/yolo'parseXmlFilse(voc_dir, save_dir)print("image nums: {}".format(len(image_set)))print("category nums: {}".format(len(category_set)))print("bbox nums: {}".format(bbox_nums))

yolo转coco

import argparse
import json
import os
import sys
import shutil
from datetime import datetimeimport cv2coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []category_set = dict()
image_set = set()image_id = 000000
annotation_id = 0def addCatItem(category_dict):for k, v in category_dict.items():category_item = dict()category_item['supercategory'] = 'none'category_item['id'] = int(k)category_item['name'] = vcoco['categories'].append(category_item)def addImgItem(file_name, size):global image_idimage_id += 1image_item = dict()image_item['id'] = image_idimage_item['file_name'] = file_nameimage_item['width'] = size[1]image_item['height'] = size[0]image_item['license'] = Noneimage_item['flickr_url'] = Noneimage_item['coco_url'] = Noneimage_item['date_captured'] = str(datetime.today())coco['images'].append(image_item)image_set.add(file_name)return image_iddef addAnnoItem(object_name, image_id, category_id, bbox):global annotation_idannotation_item = dict()annotation_item['segmentation'] = []seg = []# bbox[] is x,y,w,h# left_topseg.append(bbox[0])seg.append(bbox[1])# left_bottomseg.append(bbox[0])seg.append(bbox[1] + bbox[3])# right_bottomseg.append(bbox[0] + bbox[2])seg.append(bbox[1] + bbox[3])# right_topseg.append(bbox[0] + bbox[2])seg.append(bbox[1])annotation_item['segmentation'].append(seg)annotation_item['area'] = bbox[2] * bbox[3]annotation_item['iscrowd'] = 0annotation_item['ignore'] = 0annotation_item['image_id'] = image_idannotation_item['bbox'] = bboxannotation_item['category_id'] = category_idannotation_id += 1annotation_item['id'] = annotation_idcoco['annotations'].append(annotation_item)def xywhn2xywh(bbox, size):bbox = list(map(float, bbox))size = list(map(float, size))xmin = (bbox[0] - bbox[2] / 2.) * size[1]ymin = (bbox[1] - bbox[3] / 2.) * size[0]w = bbox[2] * size[1]h = bbox[3] * size[0]box = (xmin, ymin, w, h)return list(map(int, box))def parseXmlFilse(image_path, anno_path, save_path, json_name='train.json'):assert os.path.exists(image_path), "ERROR {} dose not exists".format(image_path)assert os.path.exists(anno_path), "ERROR {} dose not exists".format(anno_path)if os.path.exists(save_path):shutil.rmtree(save_path)os.makedirs(save_path)json_path = os.path.join(save_path, json_name)category_set = []with open(anno_path + '/classes.txt', 'r') as f:for i in f.readlines():category_set.append(i.strip())category_id = dict((k, v) for k, v in enumerate(category_set))addCatItem(category_id)images = [os.path.join(image_path, i) for i in os.listdir(image_path)]files = [os.path.join(anno_path, i) for i in os.listdir(anno_path)]images_index = dict((v.split(os.sep)[-1][:-4], k) for k, v in enumerate(images))for file in files:if os.path.splitext(file)[-1] != '.txt' or 'classes' in file.split(os.sep)[-1]:continueif file.split(os.sep)[-1][:-4] in images_index:index = images_index[file.split(os.sep)[-1][:-4]]img = cv2.imread(images[index])shape = img.shapefilename = images[index].split(os.sep)[-1]current_image_id = addImgItem(filename, shape)else:continuewith open(file, 'r') as fid:for i in fid.readlines():i = i.strip().split()category = int(i[0])category_name = category_id[category]bbox = xywhn2xywh((i[1], i[2], i[3], i[4]), shape)addAnnoItem(category_name, current_image_id, category, bbox)json.dump(coco, open(json_path, 'w'))print("class nums:{}".format(len(coco['categories'])))print("image nums:{}".format(len(coco['images'])))print("bbox nums:{}".format(len(coco['annotations'])))if __name__ == '__main__':"""脚本说明:本脚本用于将yolo格式的标注文件.txt转换为coco格式的标注文件.json参数说明:anno_path:标注文件txt存储路径save_path:json文件输出的文件夹image_path:图片路径json_name:json文件名字"""parser = argparse.ArgumentParser()parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/yolo', help='yolo txt path')parser.add_argument('-s', '--save-path', type=str, default='./data/convert/coco', help='json save path')parser.add_argument('--image-path', default='./data/images')parser.add_argument('--json-name', default='train.json')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)parseXmlFilse(**vars(opt))else:anno_path = './data/labels/yolo'save_path = './data/convert/coco'image_path = './data/images'json_name = 'train.json'parseXmlFilse(image_path, anno_path, save_path, json_name)

yolo转voc

import argparse
import os
import sys
import shutilimport cv2
from lxml import etree, objectify# 将标签信息写入xml
from tqdm import tqdmimages_nums = 0
category_nums = 0
bbox_nums = 0def save_anno_to_xml(filename, size, objs, save_path):E = objectify.ElementMaker(annotate=False)anno_tree = E.annotation(E.folder("DATA"),E.filename(filename),E.source(E.database("The VOC Database"),E.annotation("PASCAL VOC"),E.image("flickr")),E.size(E.width(size[1]),E.height(size[0]),E.depth(size[2])),E.segmented(0))for obj in objs:E2 = objectify.ElementMaker(annotate=False)anno_tree2 = E2.object(E.name(obj[0]),E.pose("Unspecified"),E.truncated(0),E.difficult(0),E.bndbox(E.xmin(obj[1][0]),E.ymin(obj[1][1]),E.xmax(obj[1][2]),E.ymax(obj[1][3])))anno_tree.append(anno_tree2)anno_path = os.path.join(save_path, filename[:-3] + "xml")etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)def xywhn2xyxy(bbox, size):bbox = list(map(float, bbox))size = list(map(float, size))xmin = (bbox[0] - bbox[2] / 2.) * size[1]ymin = (bbox[1] - bbox[3] / 2.) * size[0]xmax = (bbox[0] + bbox[2] / 2.) * size[1]ymax = (bbox[1] + bbox[3] / 2.) * size[0]box = [xmin, ymin, xmax, ymax]return list(map(int, box))def parseXmlFilse(image_path, anno_path, save_path):global images_nums, category_nums, bbox_numsassert os.path.exists(image_path), "ERROR {} dose not exists".format(image_path)assert os.path.exists(anno_path), "ERROR {} dose not exists".format(anno_path)if os.path.exists(save_path):shutil.rmtree(save_path)os.makedirs(save_path)category_set = []with open(anno_path + '/classes.txt', 'r') as f:for i in f.readlines():category_set.append(i.strip())category_nums = len(category_set)category_id = dict((k, v) for k, v in enumerate(category_set))images = [os.path.join(image_path, i) for i in os.listdir(image_path)]files = [os.path.join(anno_path, i) for i in os.listdir(anno_path)]images_index = dict((v.split(os.sep)[-1][:-4], k) for k, v in enumerate(images))images_nums = len(images)for file in tqdm(files):if os.path.splitext(file)[-1] != '.txt' or 'classes' in file.split(os.sep)[-1]:continueif file.split(os.sep)[-1][:-4] in images_index:index = images_index[file.split(os.sep)[-1][:-4]]img = cv2.imread(images[index])shape = img.shapefilename = images[index].split(os.sep)[-1]else:continueobjects = []with open(file, 'r') as fid:for i in fid.readlines():i = i.strip().split()category = int(i[0])category_name = category_id[category]bbox = xywhn2xyxy((i[1], i[2], i[3], i[4]), shape)obj = [category_name, bbox]objects.append(obj)bbox_nums += len(objects)save_anno_to_xml(filename, shape, objects, save_path)if __name__ == '__main__':"""脚本说明:本脚本用于将yolo格式的标注文件.txt转换为voc格式的标注文件.xml参数说明:anno_path:标注文件txt存储路径save_path:json文件输出的文件夹image_path:图片路径"""parser = argparse.ArgumentParser()parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/yolo', help='yolo txt path')parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')parser.add_argument('--image-path', default='./data/images')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)parseXmlFilse(**vars(opt))print("image nums: {}".format(images_nums))print("category nums: {}".format(category_nums))print("bbox nums: {}".format(bbox_nums))else:anno_path = './data/labels/yolo'save_path = './data/convert/voc1'image_path = './data/images'parseXmlFilse(image_path, anno_path, save_path)print("image nums: {}".format(images_nums))print("category nums: {}".format(category_nums))print("bbox nums: {}".format(bbox_nums))

图片可视化

coco格式,图片可视化

import argparse
import os
import sys
from collections import defaultdict
from xml import etree
from pycocotools.coco import COCOimport cv2
import matplotlibmatplotlib.use('TkAgg')
import matplotlib.pyplot as plt
from tqdm import tqdmcategory_set = dict()
image_set = set()
every_class_num = defaultdict(int)category_item_id = -1def addCatItem(name):global category_item_idcategory_item = dict()category_item_id += 1category_item['id'] = category_item_idcategory_item['name'] = namecategory_set[name] = category_item_idreturn category_item_iddef draw_box(img, objects, draw=True):for object in objects:category_name = object[0]every_class_num[category_name] += 1if category_name not in category_set:category_id = addCatItem(category_name)else:category_id = category_set[category_name]xmin = int(object[1])ymin = int(object[2])xmax = int(object[3])ymax = int(object[4])if draw:def hex2rgb(h):  # rgb order (PIL)return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))hex = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB','2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')palette = [hex2rgb('#' + c) for c in hex]n = len(palette)c = palette[int(category_id) % n]bgr = Falsecolor = (c[2], c[1], c[0]) if bgr else ccv2.rectangle(img, (xmin, ymin), (xmax, ymax), color)cv2.putText(img, category_name, (xmin, ymin), cv2.FONT_HERSHEY_SIMPLEX, 1, color)return img# 将类别名字和id建立索引
def catid2name(coco):classes = dict()for cat in coco.dataset['categories']:classes[cat['id']] = cat['name']return classesdef show_image(image_path, anno_path, show=False, plot_image=False):assert os.path.exists(image_path), "image path:{} dose not exists".format(image_path)assert os.path.exists(anno_path), "annotation path:{} does not exists".format(anno_path)if not anno_path.endswith(".json"):raise RuntimeError("ERROR {} dose not a json file".format(anno_path))coco = COCO(anno_path)classes = catid2name(coco)imgIds = coco.getImgIds()classesIds = coco.getCatIds()for imgId in tqdm(imgIds):size = {}img = coco.loadImgs(imgId)[0]filename = img['file_name']image_set.add(filename)width = img['width']height = img['height']size['width'] = widthsize['height'] = heightsize['depth'] = 3annIds = coco.getAnnIds(imgIds=img['id'], iscrowd=None)anns = coco.loadAnns(annIds)objs = []for ann in anns:object_name = classes[ann['category_id']]# bbox:[x,y,w,h]bbox = list(map(int, ann['bbox']))xmin = bbox[0]ymin = bbox[1]xmax = bbox[0] + bbox[2]ymax = bbox[1] + bbox[3]obj = [object_name, xmin, ymin, xmax, ymax]objs.append(obj)file_path = os.path.join(image_path, filename)img = cv2.imread(file_path)if img is None:continueimg = draw_box(img, objs, show)if show:cv2.imshow(filename, img)cv2.waitKey()cv2.destroyAllWindows()if plot_image:# 绘制每种类别个数柱状图plt.bar(range(len(every_class_num)), every_class_num.values(), align='center')# 将横坐标0,1,2,3,4替换为相应的类别名称plt.xticks(range(len(every_class_num)), every_class_num.keys(), rotation=90)# 在柱状图上添加数值标签for index, (i, v) in enumerate(every_class_num.items()):plt.text(x=index, y=v, s=str(v), ha='center')# 设置x坐标plt.xlabel('image class')# 设置y坐标plt.ylabel('number of images')# 设置柱状图的标题plt.title('class distribution')plt.savefig("class_distribution.png")plt.show()if __name__ == '__main__':"""脚本说明:该脚本用于coco标注格式(.json)的标注框可视化参数明说:image_path:图片数据路径anno_path:json标注文件路径show:是否展示标注后的图片plot_image:是否对每一类进行统计,并且保存图片"""parser = argparse.ArgumentParser()parser.add_argument('-ip', '--image-path', type=str, default='./data/images', help='image path')parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/coco/train.json', help='annotation path')parser.add_argument('-s', '--show', action='store_true', help='weather show img')parser.add_argument('-p', '--plot-image', action='store_true')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)show_image(opt.image_path, opt.anno_path, opt.show, opt.plot_image)print(every_class_num)print("category nums: {}".format(len(category_set)))print("image nums: {}".format(len(image_set)))print("bbox nums: {}".format(sum(every_class_num.values())))else:image_path = './data/images'anno_path = './data/labels/coco/train.json'show_image(image_path, anno_path, show=True, plot_image=True)print(every_class_num)print("category nums: {}".format(len(category_set)))print("image nums: {}".format(len(image_set)))print("bbox nums: {}".format(sum(every_class_num.values())))

voc格式,图片可视化

import os
import cv2
import matplotlib.pyplot as plt
from tqdm import tqdm
from lxml import etree
from collections import defaultdict
import argparse
import syscategory_set = dict()
image_set = set()
every_class_num = defaultdict(int)category_item_id = -1def draw_box(img, objects, draw=True):for object in objects:category_name = object['name']every_class_num[category_name] += 1if category_name not in category_set:category_id = addCatItem(category_name)else:category_id = category_set[category_name]xmin = int(object['bndbox']['xmin'])ymin = int(object['bndbox']['ymin'])xmax = int(object['bndbox']['xmax'])ymax = int(object['bndbox']['ymax'])if draw:def hex2rgb(h):  # rgb order (PIL)return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))hex = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB','2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')palette = [hex2rgb('#' + c) for c in hex]n = len(palette)c = palette[int(category_id) % n]bgr = Falsecolor = (c[2], c[1], c[0]) if bgr else ccv2.rectangle(img, (xmin, ymin), (xmax, ymax), color)cv2.putText(img, category_name, (xmin, ymin), cv2.FONT_HERSHEY_SIMPLEX, 1, color)return imgdef addCatItem(name):global category_item_idcategory_item = dict()category_item_id += 1category_item['id'] = category_item_idcategory_item['name'] = namecategory_set[name] = category_item_idreturn category_item_iddef parse_xml_to_dict(xml):"""将xml文件解析成字典形式,参考tensorflow的recursive_parse_xml_to_dictArgs:xml: xml tree obtained by parsing XML file contents using lxml.etreeReturns:Python dictionary holding XML contents."""if len(xml) == 0:  # 遍历到底层,直接返回tag对应的信息return {xml.tag: xml.text}result = {}for child in xml:child_result = parse_xml_to_dict(child)  # 递归遍历标签信息if child.tag != 'object':result[child.tag] = child_result[child.tag]else:if child.tag not in result:  # 因为object可能有多个,所以需要放入列表里result[child.tag] = []result[child.tag].append(child_result[child.tag])return {xml.tag: result}def show_image(image_path, anno_path, show=False, plot_image=False):assert os.path.exists(image_path), "image path:{} dose not exists".format(image_path)assert os.path.exists(anno_path), "annotation path:{} does not exists".format(anno_path)anno_file_list = [os.path.join(anno_path, file) for file in os.listdir(anno_path) if file.endswith(".xml")]for xml_file in tqdm(anno_file_list):if not xml_file.endswith('.xml'):continuewith open(xml_file) as fid:xml_str = fid.read()xml = etree.fromstring(xml_str)xml_info_dict = parse_xml_to_dict(xml)filename = xml_info_dict['annotation']['filename']image_set.add(filename)file_path = os.path.join(image_path, filename)if not os.path.exists(file_path):continueimg = cv2.imread(file_path)if img is None:continueimg = draw_box(img, xml_info_dict['annotation']['object'], show)if show:cv2.imshow(filename, img)cv2.waitKey()cv2.destroyAllWindows()if plot_image:# 绘制每种类别个数柱状图plt.bar(range(len(every_class_num)), every_class_num.values(), align='center')# 将横坐标0,1,2,3,4替换为相应的类别名称plt.xticks(range(len(every_class_num)), every_class_num.keys(), rotation=90)# 在柱状图上添加数值标签for index, (i, v) in enumerate(every_class_num.items()):plt.text(x=index, y=v, s=str(v), ha='center')# 设置x坐标plt.xlabel('image class')# 设置y坐标plt.ylabel('number of images')# 设置柱状图的标题plt.title('class distribution')plt.savefig("class_distribution.png")plt.show()if __name__ == '__main__':脚本说明:该脚本用于voc标注格式(.xml)的标注框可视化参数明说:image_path:图片数据路径anno_path:xml标注文件路径show:是否展示标注后的图片plot_image:是否对每一类进行统计,并且保存图片"""parser = argparse.ArgumentParser()parser.add_argument('-ip', '--image-path', type=str, default='./data/images', help='image path')parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/voc', help='annotation path')parser.add_argument('-s', '--show', action='store_true', help='weather show img')parser.add_argument('-p', '--plot-image', action='store_true')opt = parser.parse_args()if len(sys.argv) > 1:print(opt)show_image(opt.image_path, opt.anno_path, opt.show, opt.plot_image)print(every_class_num)print("category nums: {}".format(len(category_set)))print("image nums: {}".format(len(image_set)))print("bbox nums: {}".format(sum(every_class_num.values())))else:image_path = './data/images'anno_path = './data/convert/voc'show_image(image_path, anno_path, show=True, plot_image=True)print(every_class_num)print("category nums: {}".format(len(category_set)))print("image nums: {}".format(len(image_set)))print("bbox nums: {}".format(sum(every_class_num.values())))

总结

选择合适的数据集格式对于训练和部署YOLO模型至关重要。TXT格式简单易用,适合初学者和快速原型开发;VOC格式适合需要更详细标注信息的项目;而COCO格式则因其标准化和通用性,成为许多研究和实际应用的首选。了解这些格式及其转换方法,可以帮助研究人员和开发者更有效地处理目标检测任务。

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/bicheng/1397.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

计算机系统结构(二) (万字长文建议收藏)

计算机系统结构 (二) 本文首发于个人博客网站&#xff1a;http://www.blog.lekshome.top/由于CSDN并不是本人主要的内容输出平台&#xff0c;所以大多数博客直接由md文档导入且缺少审查和维护&#xff0c;如果存在图片或其他格式错误可以前往上述网站进行查看CSDN留言不一定能够…

大话设计模式-里氏代换原则

里氏代换原则&#xff08;Liskov Substitution Principle&#xff0c;LSP&#xff09; 概念 里氏代换原则是面向对象设计的基本原则之一&#xff0c;由美国计算机科学家芭芭拉利斯科夫&#xff08;Barbara Liskov&#xff09;提出。这个原则定义了子类型之间的关系&#xff0…

【人工智能基础】经典逻辑与归结原理

本章节的大部分内容与离散数学的命题、谓词两章重合。 假言推理的合式公式形式 R,R→P⇒PR,R∨P⇒P 链式推理 R→P,P→Q⇒R→QR∨P,P∨Q⇒R∨Q 互补文字&#xff1a;P和P 亲本子句&#xff1a;含有互补文字的子句 R∨P,P∨Q为亲本子句 注意&#xff1a; 必须化成析取范式…

命理八字之电子木鱼的代码实现

#uniapp# #电子木鱼# 不讲废话&#xff0c;上截图 目录结构如下图 功能描述&#xff1a; 点击一下&#xff0c;敲一下&#xff0c;伴随敲击声&#xff0c;可自动点击。自动点击需看视频广告&#xff0c;或者升级VIP会员。 疑点解答&#xff1a; 即animation动画的时候&…

Window中Jenkins部署asp/net core web主要配置

代码如下 D: cd D:\tempjenkins\src\ --git工作目录 dotnet restore -s "https://nuget.cdn.azure.cn/v3/index.json" --nuget dotnet build dotnet publish -c release -o %publishPath% --发布路径

Day08React——第八天

useEffect 概念&#xff1a;useEffect 是一个 React Hook 函数&#xff0c;用于在React组件中创建不是由事件引起而是由渲染本身引起的操作&#xff0c;比如发送AJAx请求&#xff0c;更改daom等等 需求&#xff1a;在组件渲染完毕后&#xff0c;立刻从服务器获取频道列表数据…

Java:二叉树(1)

从现在开始&#xff0c;我们进入二叉树的学习&#xff0c;二叉树是数据结构的重点部分&#xff0c;在了解这个结构之前&#xff0c;我们先来了解一下什么是树型结构吧&#xff01; 一、树型结构 1、树型结构简介 树是一种非线性的数据结构&#xff0c;它是由n&#xff08;n>…

Matlab无基础快速上手1(遗传算法框架)

本文用经典遗传算法框架模板&#xff0c;对matlab新手友好&#xff0c;快速上手看懂matlab代码&#xff0c;快速应用实践&#xff0c;源代码在文末给出。 基本原理&#xff1a; 遗传算法&#xff08;Genetic Algorithm&#xff0c;GA&#xff09;是一种受生物学启发的优化算法…

在Gtiee搭建仓库传代码/多人开发/个人代码备份---git同步---TortoiseGit+TortoiseSVN

文章目录 前言1.安装必要软件2. Gitee建立新仓库git同步2.1 Gitee建立新仓库2.2 Gitee仓库基本配置2.3 Git方式进行同步 3. TortoiseGitTortoiseSVN常用开发方式3.1 秘钥相关3.2 TortoiseGit拉取代码TortoiseGit提交代码 4. 其他功能探索总结 前言 正常企业的大型项目都会使用…

TR5 - Transformer的位置编码

&#x1f368; 本文为&#x1f517;365天深度学习训练营 中的学习记录博客&#x1f356; 原作者&#xff1a;K同学啊 目录 前言什么是位置编码1. 定义2. 三角函数3. 位置编码公式4. 位置编码示例 可视化理解位置编码1. 代码实现2. 观察不同位置对应的曲线3. 整句话的位置编码可…

排序 “贰” 之选择排序

目录 ​编辑 1. 选择排序基本思想 2. 直接选择排序 2.1 实现步骤 2.2 代码示例 2.3 直接选择排序的特性总结 3. 堆排序 3.1 实现步骤 3.2 代码示例 3.3 堆排序的特性总结 1. 选择排序基本思想 每一次从待排序的数据元素中选出最小&#xff08;或最大&#xff09;的一个…

Guitar Pro简谱输入方法 Guitar Pro简谱音高怎么调整,Guitar Pro功能介绍

一、新版本特性概览 Guitar Pro v8.1.1 Build 17在保留了前版本强大功能的基础上&#xff0c;进一步优化了用户体验和功能性能。新版本主要更新包括以下几个方面&#xff1a; 界面优化&#xff1a;新版界面更加简洁美观&#xff0c;操作更加便捷&#xff0c;即使是初学者也能快…

在线拍卖系统,基于SpringBoot+Vue+MySql开发的在线拍卖系统设计和实现

目录 一. 系统介绍 二. 功能模块 2.1. 管理员功能模块 2.2. 用户功能模块 2.3. 前台首页功能模块 2.4. 部分代码实现 一. 系统介绍 随着社会的发展&#xff0c;社会的各行各业都在利用信息化时代的优势。计算机的优势和普及使得各种信息系统的开发成为必需。 在线拍卖系…

Docker - 简介

原文地址&#xff0c;使用效果更佳&#xff01; Docker - 简介 | CoderMast编程桅杆https://www.codermast.com/dev-tools/docker/docker-introduce.html Docker是什么&#xff1f; Docker 是一个开源的应用容器引擎&#xff0c;基于 Go 语言 并遵从 Apache2.0 协议开源。 D…

vulfocus靶场couchdb 权限绕过 (CVE-2017-12635)

Apache CouchDB是一个开源数据库&#xff0c;专注于易用性和成为"完全拥抱web的数据库"。它是一个使用JSON作为存储格式&#xff0c;JavaScript作为查询语言&#xff0c;MapReduce和HTTP作为API的NoSQL数据库。应用广泛&#xff0c;如BBC用在其动态内容展示平台&…

串口RS485

1.原理 全双工&#xff1a;在同一时刻可以同时进行数据的接收和数据的发送&#xff0c;两者互不影响 半双工&#xff1a;在同一时刻只能进行数据的接收或者数据的发送&#xff0c;两者不能同时进行 差分信号幅值相同&#xff0c;相位相反&#xff0c;有更强的抗干扰能力。 干…

vlan的学习笔记1

vlan&#xff1a; 1.一般情况下:以下概念意思等同: 一个vlan一个广播域 一个网段 一个子网 2.一般情况下: &#xff08;1&#xff09;相同vlan之间可以直接通信&#xff0c;不同vlan之间不能直接通信! &#xff08;2&#xff09;vlan技术属于二层技术&…

C语言中, 文件包含处理,#include< > 与 #include ““的区别

文件包含处理 指一个源文件可以将另外一个文件的全部内容包含进来 &#xff23;语言提供了#include命令用来实现文件包含的操作 #include< > 与 #include ""的区别 <> 表示系统直接按系统指定的目录检索 "" 表示系统先在 "" 指定…

Rust序列化和反序列化

Rust 编写python 模块 必备库 docker 启动 nginx 服务 NGINX 反向代理配置

MySQL下载与安装

文章目录 1&#xff1a;MySQL下载与安装2&#xff1a;配置环境变量3&#xff1a;验证是否安装成功 1&#xff1a;MySQL下载与安装 打开MySQL官网&#xff0c;MySQL 下载链接选择合适的版本和操作系统&#xff0c;页面跳转之后选择No thanks, just start my download.等待下载即…