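In the previous post, 《FCN-数据篇》 (FCN: Data), we covered how to build labels in the PASCAL VOC style, and much of it dealt with how to colorize the labels and how to convert them to indexed images. Because some of that material was taken from online sources, I was unsure about the meaning of several of the operations myself. Not everything online is correct; a lot of it simply repeats what others have said, so it needs to be checked carefully. I found one such error, and this post walks through it from the beginning.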
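- The PASCAL VOC dataset
When you download the PASCAL VOC2012 dataset, you will find that the annotations in the SegmentationClass folder are all color images.
However, their file properties report a bit depth of 8.
A color image is normally stored as RGB, which would be 24 bits per pixel. Here it is 8 bits, which indicates that the images are stored in indexed format.
Each 8-bit value is an index in the range [0, 255], and the image also carries a map (colormap) that gives the color for each index.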
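To make the idea of an indexed image concrete, here is a minimal sketch in plain Python (not the MATLAB used in the rest of this post). The names `palette`, `indices` and `indexed_to_rgb` and the toy data are made up for illustration; the point is only that the pixels store small integers and the colors live in a separate table.

```python
# Hypothetical toy data: a 3-entry palette and a 2x3 indexed image.
palette = [
    (0, 0, 0),      # index 0: black (background)
    (128, 0, 0),    # index 1: dark red
    (0, 128, 0),    # index 2: dark green
]
indices = [
    [0, 1, 1],
    [2, 2, 0],
]

def indexed_to_rgb(indices, palette):
    """Replace every stored index with the palette color it points to."""
    return [[palette[i] for i in row] for row in indices]

rgb = indexed_to_rgb(indices, palette)
print(rgb[0][1])  # color of the pixel at row 0, column 1
```

Each pixel costs one byte instead of three, and changing a palette entry recolors every pixel that uses that index without touching the index data.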
We can inspect the colormap in MATLAB:
>> im=imread('H:\data\VOCtrainval_11-May-2012\VOCdevkit\VOC2012\SegmentationClass\2007_000346.png');
>> info=imfinfo('H:\data\VOCtrainval_11-May-2012\VOCdevkit\VOC2012\SegmentationClass\2007_000346.png');
>> info.Colormap
ans =
         0         0         0
    0.5020         0         0
         0    0.5020         0
    0.5020    0.5020         0
         0         0    0.5020
    0.5020         0    0.5020
         0    0.5020    0.5020
    0.5020    0.5020    0.5020
    0.2510         0         0
    0.7529         0         0
    0.2510    0.5020         0
    0.7529    0.5020         0
    0.2510         0    0.5020
    0.7529         0    0.5020
    0.2510    0.5020    0.5020
    0.7529    0.5020    0.5020
         0    0.2510         0
    0.5020    0.2510         0
         0    0.7529         0
    0.5020    0.7529         0
         0    0.2510    0.5020
    0.5020    0.2510    0.5020
         0    0.7529    0.5020
    0.5020    0.7529    0.5020
    0.2510    0.2510         0
    0.7529    0.2510         0
    0.2510    0.7529         0
    0.7529    0.7529         0
    0.2510    0.2510    0.5020
    0.7529    0.2510    0.5020
    0.2510    0.7529    0.5020
    0.7529    0.7529    0.5020
         0         0    0.2510
    0.5020         0    0.2510
         0    0.5020    0.2510
    0.5020    0.5020    0.2510
         0         0    0.7529
    0.5020         0    0.7529
         0    0.5020    0.7529
    0.5020    0.5020    0.7529
    0.2510         0    0.2510
    0.7529         0    0.2510
    0.2510    0.5020    0.2510
    0.7529    0.5020    0.2510
    0.2510         0    0.7529
    0.7529         0    0.7529
    0.2510    0.5020    0.7529
    0.7529    0.5020    0.7529
         0    0.2510    0.2510
    0.5020    0.2510    0.2510
         0    0.7529    0.2510
    0.5020    0.7529    0.2510
         0    0.2510    0.7529
    0.5020    0.2510    0.7529
         0    0.7529    0.7529
    0.5020    0.7529    0.7529
    0.2510    0.2510    0.2510
    0.7529    0.2510    0.2510
    0.2510    0.7529    0.2510
    0.7529    0.7529    0.2510
    0.2510    0.2510    0.7529
    0.7529    0.2510    0.7529
    0.2510    0.7529    0.7529
    0.7529    0.7529    0.7529
    0.1255         0         0
    0.6275         0         0
    0.1255    0.5020         0
    0.6275    0.5020         0
    0.1255         0    0.5020
    0.6275         0    0.5020
    0.1255    0.5020    0.5020
    0.6275    0.5020    0.5020
    0.3765         0         0
    0.8784         0         0
    0.3765    0.5020         0
    0.8784    0.5020         0
    0.3765         0    0.5020
    0.8784         0    0.5020
    0.3765    0.5020    0.5020
    0.8784    0.5020    0.5020
    0.1255    0.2510         0
    0.6275    0.2510         0
    0.1255    0.7529         0
    0.6275    0.7529         0
    0.1255    0.2510    0.5020
    0.6275    0.2510    0.5020
    0.1255    0.7529    0.5020
    0.6275    0.7529    0.5020
    0.3765    0.2510         0
    0.8784    0.2510         0
    0.3765    0.7529         0
    0.8784    0.7529         0
    0.3765    0.2510    0.5020
    0.8784    0.2510    0.5020
    0.3765    0.7529    0.5020
    0.8784    0.7529    0.5020
    0.1255         0    0.2510
    0.6275         0    0.2510
    0.1255    0.5020    0.2510
    0.6275    0.5020    0.2510
    0.1255         0    0.7529
    0.6275         0    0.7529
    0.1255    0.5020    0.7529
    0.6275    0.5020    0.7529
    0.3765         0    0.2510
    0.8784         0    0.2510
    0.3765    0.5020    0.2510
    0.8784    0.5020    0.2510
    0.3765         0    0.7529
    0.8784         0    0.7529
    0.3765    0.5020    0.7529
    0.8784    0.5020    0.7529
    0.1255    0.2510    0.2510
    0.6275    0.2510    0.2510
    0.1255    0.7529    0.2510
    0.6275    0.7529    0.2510
    0.1255    0.2510    0.7529
    0.6275    0.2510    0.7529
    0.1255    0.7529    0.7529
    0.6275    0.7529    0.7529
    0.3765    0.2510    0.2510
    0.8784    0.2510    0.2510
    0.3765    0.7529    0.2510
    0.8784    0.7529    0.2510
    0.3765    0.2510    0.7529
    0.8784    0.2510    0.7529
    0.3765    0.7529    0.7529
    0.8784    0.7529    0.7529
         0    0.1255         0
    0.5020    0.1255         0
         0    0.6275         0
    0.5020    0.6275         0
         0    0.1255    0.5020
    0.5020    0.1255    0.5020
         0    0.6275    0.5020
    0.5020    0.6275    0.5020
    0.2510    0.1255         0
    0.7529    0.1255         0
    0.2510    0.6275         0
    0.7529    0.6275         0
    0.2510    0.1255    0.5020
    0.7529    0.1255    0.5020
    0.2510    0.6275    0.5020
    0.7529    0.6275    0.5020
         0    0.3765         0
    0.5020    0.3765         0
         0    0.8784         0
    0.5020    0.8784         0
         0    0.3765    0.5020
    0.5020    0.3765    0.5020
         0    0.8784    0.5020
    0.5020    0.8784    0.5020
    0.2510    0.3765         0
    0.7529    0.3765         0
    0.2510    0.8784         0
    0.7529    0.8784         0
    0.2510    0.3765    0.5020
    0.7529    0.3765    0.5020
    0.2510    0.8784    0.5020
    0.7529    0.8784    0.5020
         0    0.1255    0.2510
    0.5020    0.1255    0.2510
         0    0.6275    0.2510
    0.5020    0.6275    0.2510
         0    0.1255    0.7529
    0.5020    0.1255    0.7529
         0    0.6275    0.7529
    0.5020    0.6275    0.7529
    0.2510    0.1255    0.2510
    0.7529    0.1255    0.2510
    0.2510    0.6275    0.2510
    0.7529    0.6275    0.2510
    0.2510    0.1255    0.7529
    0.7529    0.1255    0.7529
    0.2510    0.6275    0.7529
    0.7529    0.6275    0.7529
         0    0.3765    0.2510
    0.5020    0.3765    0.2510
         0    0.8784    0.2510
    0.5020    0.8784    0.2510
         0    0.3765    0.7529
    0.5020    0.3765    0.7529
         0    0.8784    0.7529
    0.5020    0.8784    0.7529
    0.2510    0.3765    0.2510
    0.7529    0.3765    0.2510
    0.2510    0.8784    0.2510
    0.7529    0.8784    0.2510
    0.2510    0.3765    0.7529
    0.7529    0.3765    0.7529
    0.2510    0.8784    0.7529
    0.7529    0.8784    0.7529
    0.1255    0.1255         0
    0.6275    0.1255         0
    0.1255    0.6275         0
    0.6275    0.6275         0
    0.1255    0.1255    0.5020
    0.6275    0.1255    0.5020
    0.1255    0.6275    0.5020
    0.6275    0.6275    0.5020
    0.3765    0.1255         0
    0.8784    0.1255         0
    0.3765    0.6275         0
    0.8784    0.6275         0
    0.3765    0.1255    0.5020
    0.8784    0.1255    0.5020
    0.3765    0.6275    0.5020
    0.8784    0.6275    0.5020
    0.1255    0.3765         0
    0.6275    0.3765         0
    0.1255    0.8784         0
    0.6275    0.8784         0
    0.1255    0.3765    0.5020
    0.6275    0.3765    0.5020
    0.1255    0.8784    0.5020
    0.6275    0.8784    0.5020
    0.3765    0.3765         0
    0.8784    0.3765         0
    0.3765    0.8784         0
    0.8784    0.8784         0
    0.3765    0.3765    0.5020
    0.8784    0.3765    0.5020
    0.3765    0.8784    0.5020
    0.8784    0.8784    0.5020
    0.1255    0.1255    0.2510
    0.6275    0.1255    0.2510
    0.1255    0.6275    0.2510
    0.6275    0.6275    0.2510
    0.1255    0.1255    0.7529
    0.6275    0.1255    0.7529
    0.1255    0.6275    0.7529
    0.6275    0.6275    0.7529
    0.3765    0.1255    0.2510
    0.8784    0.1255    0.2510
    0.3765    0.6275    0.2510
    0.8784    0.6275    0.2510
    0.3765    0.1255    0.7529
    0.8784    0.1255    0.7529
    0.3765    0.6275    0.7529
    0.8784    0.6275    0.7529
    0.1255    0.3765    0.2510
    0.6275    0.3765    0.2510
    0.1255    0.8784    0.2510
    0.6275    0.8784    0.2510
    0.1255    0.3765    0.7529
    0.6275    0.3765    0.7529
    0.1255    0.8784    0.7529
    0.6275    0.8784    0.7529
    0.3765    0.3765    0.2510
    0.8784    0.3765    0.2510
    0.3765    0.8784    0.2510
    0.8784    0.8784    0.2510
    0.3765    0.3765    0.7529
    0.8784    0.3765    0.7529
    0.3765    0.8784    0.7529
    0.8784    0.8784    0.7529
Alternatively:
[im,map]=imread('H:\data\VOCtrainval_11-May-2012\VOCdevkit\VOC2012\SegmentationClass\2007_000346.png');
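To go further and display the indexed image, or convert it to RGB:
[cdata, map] = imread(filename);   % read the indexed image file
if ~isempty(map)
    rgb = ind2rgb(cdata, map);     % convert the indexed image data to RGB image data
else
    rgb = cdata;                   % no colormap: the file is not indexed, use it as read
end
imshow(rgb)
imshow(cdata, map)   % displaying the indexed data with its map also works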
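The map printed above is exactly the PASCAL VOC colormap given in 《FCN-数据篇》.
One more thing to note: the values in cdata start at 0, and index 0 corresponds to the first row of map.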
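The printed map follows a regular bit pattern, so it can be regenerated instead of copied. The sketch below is in plain Python rather than MATLAB, and `voc_colormap` is a name chosen here for illustration. It spreads the bits of each index over the red, green and blue channels, which reproduces the rows listed above once the values are divided by 255.

```python
def voc_colormap(n=256):
    """Return n (r, g, b) tuples in 0..255 built by spreading index bits over the channels."""
    cmap = []
    for index in range(n):
        r = g = b = 0
        c = index
        # Fill each channel from its most significant bit downwards,
        # consuming three bits of the index per step.
        for shift in range(7, -1, -1):
            r |= (c & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        cmap.append((r, g, b))
    return cmap

cmap = voc_colormap()
# MATLAB shows the map scaled to [0, 1], so divide by 255 to compare with the listing.
for index in (0, 1, 2, 8, 255):
    print(index, [round(v / 255, 4) for v in cmap[index]])
```

Note that the list is 0-based: `cmap[0]` is the color for index 0, which MATLAB displays as the first row of the map.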
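- Custom data
The method from 《FCN-数据篇》 for producing custom label data consists of:
1. Annotate the images with labelme to produce grayscale label images.
2. Convert the grayscale label images to 24-bit RGB images with the PASCAL VOC colormap, using the function label2rgb.
3. Convert the 24-bit PNG images to 8-bit PNG images, which yields indexed images.
Step 3 is critical, and the material I had found online got it wrong. The earlier code was: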
dirs=dir('F:/xxx/*.png');
for n=1:numel(dirs)
    strname=strcat('F:/xxx/',dirs(n).name);
    img=imread(strname);
    [x,map]=rgb2ind(img,256);
    newname=strcat('F:/xxx/',dirs(n).name);
    imwrite(x,map,newname,'png');
end
This call also returns a map, and at the time I wondered why yet another map was being produced here.
So I ran a quick experiment:
>> im=imread('G:\deeplearning\FCN_train-master\xxx.png');
>> [a,map2]=rgb2ind(im,256);
>> map2
map2 =
         0         0         0
    0.5020         0         0
         0         0    0.5020
    0.5020         0    0.5020
         0    0.5020         0
    0.5020    0.5020         0
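This map clearly does not match the PASCAL VOC map. For example, its third row is (0, 0, 0.5020), whereas the third row of the VOC map is (0, 0.5020, 0). The index values written to the file therefore no longer correspond to the VOC class indices, which is a serious problem.
The correct approach is: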
>> im=imread('G:\deeplearning\FCN_train-master\xxx.png');
>> map=labelcolormap(256);
>> x=rgb2ind(im,map);
>> imshow(x,map)
>> imwrite(x,map,'test.png','png')
The second command generates the PASCAL VOC colormap; labelcolormap can be found in 《FCN-数据篇》.
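The difference between the two approaches can be shown with a small plain-Python sketch. It is a simplified stand-in, not a port of rgb2ind: `derive_palette` is a hypothetical helper that builds a per-image palette in order of first appearance (the real function uses its own quantization and ordering), and `to_indices` only does exact color matches. The first rows of `VOC_PALETTE` are taken from the map printed earlier, scaled to 0..255.

```python
# First rows of the PASCAL VOC colormap as 0..255 RGB triples.
VOC_PALETTE = [
    (0, 0, 0),        # index 0
    (128, 0, 0),      # index 1
    (0, 128, 0),      # index 2
    (128, 128, 0),    # index 3
    (0, 0, 128),      # index 4
]

def derive_palette(pixels):
    """Hypothetical per-image palette: distinct colors in order of first appearance."""
    palette = []
    for color in pixels:
        if color not in palette:
            palette.append(color)
    return palette

def to_indices(pixels, palette):
    """Exact-match lookup of each pixel in a given palette (KeyError on unknown colors)."""
    lookup = {color: index for index, color in enumerate(palette)}
    return [lookup[color] for color in pixels]

# A toy "image" flattened to a list of pixels: background, VOC color 4, VOC color 1.
pixels = [(0, 0, 0), (0, 0, 128), (128, 0, 0), (0, 0, 128)]

per_image = to_indices(pixels, derive_palette(pixels))
fixed = to_indices(pixels, VOC_PALETTE)
print("per-image palette:", per_image)
print("fixed VOC palette:", fixed)
```

With a palette derived from the image, the indices depend on which colors happen to be present, so the same color can get different indices in different files. With the fixed palette, a color always maps to the same index, which is what training on class indices requires.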
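- Creating label data for binary segmentation
For a two-class problem, once the image semantics have been annotated, what we usually end up with is a black-and-white grayscale image, for example with the foreground drawn as 255 and the background as 0. The file may be 24-bit or 8-bit.
In that case two things need to be done:
1. Convert the image to 24-bit.
2. Generate the 8-bit indexed image.
The code is as follows: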
root='H:\data\IrsData\iris_ground-truth\MICHE_subset\';
input_dir=strcat(root,'ground truth\');
output_dir=strcat(root,'temp\');
src_type='tiff';
files = dir([input_dir, '*.', src_type]);
n = length(files);
for i = 1:n
    [filename, type] = strtok(files(i).name, '.');
    im_src = imread([input_dir, files(i).name]);
    info = imfinfo([input_dir, files(i).name]);
    if info.BitDepth==8   % convert to 24-bit RGB
        im_src = cat(3, im_src, im_src, im_src);
    end
    [x,map] = rgb2ind(im_src, 2);
    newname = strcat(output_dir, filename, '.png');
    imwrite(x, map, newname, 'png');
end
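The resulting images are indexed images for two classes: the foreground is white, the background is black, and the index values are 0 and 1.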
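For comparison, here is a minimal plain-Python sketch of the same idea with an explicit two-entry palette. It is not a port of the MATLAB loop above: `binary_mask_to_indices`, `BINARY_PALETTE` and the `threshold` of 128 are assumptions made for illustration, and the mask is a small nested list rather than an image file.

```python
# Fixed palette: index 0 is the black background, index 1 the white foreground.
BINARY_PALETTE = [(0, 0, 0), (255, 255, 255)]

def binary_mask_to_indices(mask, threshold=128):
    """Turn a grayscale mask (values 0..255) into 0/1 indices; threshold is an assumed cut-off."""
    return [[1 if value >= threshold else 0 for value in row] for row in mask]

# Toy mask: 255 marks the foreground, 0 the background.
mask = [
    [0, 0, 255],
    [0, 255, 255],
]
indices = binary_mask_to_indices(mask)
print(indices)
print(sorted({i for row in indices for i in row}))  # distinct index values used
```

Because the palette is written out by hand, the background is index 0 by construction, without relying on how a color quantizer happens to order its map. It is worth checking the map of a generated file in the same way as for the VOC labels earlier in this post.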