【Parsing VOC-format XML files】Python

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/4/26 12:49
# @Author  : @linlianqin
# @Site    : 
# @File    : test1.py
# @Software: PyCharm
# @description:
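import xml.etree.ElementTree as ET


def xmli(xmlpath):
    xmlTree = ET.parse(xmlpath)  # parse the XML file
    root = xmlTree.getroot()  # get the root node of the XML tree
    size = root.find('size')  # look up the size node (not used below)
    # Note: use findall here to get every object node, i.e. every annotated box.
    # find returns only the first match, or None when nothing matches.
    objects = root.findall('object')  # find all object nodes
    for obj in objects:
        bbox = obj.find('bndbox')
        # Modify the values of the corresponding nodes (222 is just a placeholder value)
        bbox.find('ymin').text = str(222)
        bbox.find('ymax').text = str(222)
    return xmlTree  # return the updated XML tree


xmlTree = xmli(r'test.xml')
xmlTree.write('_flip_updown.xml')  # save the new XML file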
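
The script above writes a fixed placeholder value into ymin and ymax. Since the output file is named _flip_updown.xml, here is a minimal sketch of what an actual vertical flip of the boxes might look like. The function name flip_boxes_updown, the sample annotation, and the formula (new ymin = height - old ymax) are assumptions for illustration, not the original author's code; the formula also assumes 0-based coordinates, so adjust by one if your data uses a different convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation, used only for this illustration.
SAMPLE_XML = """<annotation>
    <size><width>500</width><height>375</height><depth>3</depth></size>
    <object>
        <name>dog</name>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
</annotation>"""


def flip_boxes_updown(root):
    """Mirror every bndbox vertically, in place, using the height stored in <size>."""
    height = int(root.find('size').find('height').text)
    for obj in root.findall('object'):
        bbox = obj.find('bndbox')
        ymin = int(bbox.find('ymin').text)
        ymax = int(bbox.find('ymax').text)
        # After a vertical flip the old bottom edge becomes the new top edge.
        # Assumption: 0-based coordinates measured from the top of the image.
        bbox.find('ymin').text = str(height - ymax)
        bbox.find('ymax').text = str(height - ymin)
    return root


root = flip_boxes_updown(ET.fromstring(SAMPLE_XML))
flipped = root.find('object').find('bndbox')
print(flipped.find('ymin').text, flipped.find('ymax').text)
```

The x coordinates are left untouched because a vertical flip only changes the y axis. The flipped image itself would have to be produced separately.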

The following is reposted from: "Parsing VOC-format XML files in Python"

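There are three common ways to parse XML in Python:

    The xml.dom.* modules implement the W3C DOM API; they are a good fit when you need to work with the DOM API.
    The xml.sax.* modules implement the SAX API, trading convenience for speed and lower memory use. SAX is an event-based API, so it can process very large documents "on the fly" without loading them fully into memory.
    The xml.etree.ElementTree module (ET for short) provides a lightweight, Pythonic API. ET is much faster than DOM and has a pleasant API. Compared with SAX, ET.iterparse also offers "on the fly" processing, so the whole document does not have to be loaded into memory. ET's average performance is roughly on par with SAX, but its API is higher-level and more convenient to use.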
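
To make the "on the fly" processing mentioned for ET.iterparse concrete, here is a minimal sketch that counts objects per class while streaming through a document. The function name count_objects_streaming and the sample data are assumptions for illustration; a single VOC annotation file is small enough that ET.parse is usually fine, so this mainly pays off for very large XML files.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample annotation, used only for this illustration.
SAMPLE_XML = b"""<annotation>
    <object><name>dog</name></object>
    <object><name>cat</name></object>
    <object><name>dog</name></object>
</annotation>"""


def count_objects_streaming(source):
    """Count <object> elements per class name without keeping their subtrees around."""
    counts = {}
    # 'end' events fire once an element has been read completely.
    for event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == 'object':
            name = elem.findtext('name')
            counts[name] = counts.get(name, 0) + 1
            elem.clear()  # drop the children we no longer need
    return counts


counts = count_objects_streaming(io.BytesIO(SAMPLE_XML))
print(counts)
```

ET.iterparse accepts either a file name or a file object, so the same function would also work with a path to an annotation file on disk.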

#!/usr/bin/python
# -*- coding: UTF-8 -*-
# get annotation object bndbox location
try:
    import xml.etree.cElementTree as ET  # C implementation of the XML parser (removed in Python 3.9)
except ImportError:
    import xml.etree.ElementTree as ET

##get object annotation bndbox loc start
def GetAnnotBoxLoc(AnotPath):  # AnotPath: path to the VOC annotation file
    tree = ET.ElementTree(file=AnotPath)  # open the file and parse it into a tree structure
    root = tree.getroot()  # get the root of the tree
    ObjectSet = root.findall('object')  # find every object element; each one holds an annotated target
    ObjBndBoxSet = {}  # dict keyed by class name, with a list of bounding boxes as the value
    for Object in ObjectSet:
        ObjName = Object.find('name').text
        BndBox = Object.find('bndbox')
        # The commented-out -1 would shift the coordinates to the 0-based indexing used by the program
        x1 = int(BndBox.find('xmin').text)  #-1
        y1 = int(BndBox.find('ymin').text)  #-1
        x2 = int(BndBox.find('xmax').text)  #-1
        y2 = int(BndBox.find('ymax').text)  #-1
        BndBoxLoc = [x1, y1, x2, y2]
        if ObjName in ObjBndBoxSet:
            # The class is already in the dict: append this box to its list
            ObjBndBoxSet[ObjName].append(BndBoxLoc)
        else:
            # First box for this class: start a new list
            ObjBndBoxSet[ObjName] = [BndBoxLoc]
    return ObjBndBoxSet
##get object annotation bndbox loc end
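
As a usage illustration, the sketch below applies the same grouping idea to an in-memory sample so it can be run without an annotation file on disk. It is a variant, not the code above: the name boxes_by_class, the sample XML, the use of collections.defaultdict, and the tolerance for float-formatted coordinates are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample annotation, used only for this illustration.
SAMPLE_XML = """<annotation>
    <object>
        <name>dog</name>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
    </object>
    <object>
        <name>dog</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>30</xmax><ymax>40</ymax></bndbox>
    </object>
</annotation>"""


def boxes_by_class(root):
    """Group [xmin, ymin, xmax, ymax] boxes by class name."""
    boxes = defaultdict(list)  # missing keys start as an empty list
    for obj in root.findall('object'):
        bndbox = obj.find('bndbox')
        if bndbox is None:
            continue  # skip objects without a box instead of failing
        # int(float(...)) also accepts values written as "48.0" (an assumption
        # about how some tools may format coordinates).
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes[obj.findtext('name')].append(coords)
    return dict(boxes)


result = boxes_by_class(ET.fromstring(SAMPLE_XML))
print(result)
```

With defaultdict the if/else branch for a class seen for the first time is no longer needed, which is the main design difference from GetAnnotBoxLoc.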