Studying the Keras Faster R-CNN Code (IoU, RPN), Part 1

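I recently started using Keras for deep learning and found that building models with it is indeed more convenient than with MXNet or Caffe, which makes it a good fit for beginners. So I picked up a Keras implementation of Faster R-CNN, the classic object detection model, to practice on. The main body of the code walkthrough is reposted from the Zhihu column Learning Machine by 张潇捷; the original articles are:
Keras Faster R-CNN explained in detail (1. RPN computation)
Keras Faster R-CNN explained in detail (2. RoI computation and the rest)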

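On top of that, I go through details such as the loss computation and the config settings. The posts in this series:
Studying the Keras Faster R-CNN Code (IoU, RPN), Part 1
Studying the Keras Faster R-CNN Code (Batch Normalization), Part 2
Studying the Keras Faster R-CNN Code (loss, XML parsing), Part 3
Studying the Keras Faster R-CNN Code (RoI pooling, ResNet/VGG), Part 4
Studying the Keras Faster R-CNN Code (measure_map, train/test), Part 5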

config.py

from keras import backend as K
import math


class Config:

    def __init__(self):

        self.verbose = True

        self.network = 'resnet50'

        # setting for data augmentation
        self.use_horizontal_flips = False
        self.use_vertical_flips = False
        self.rot_90 = False

        # anchor box scales
        self.anchor_box_scales = [128, 256, 512]

        # anchor box ratios
        self.anchor_box_ratios = [[1, 1], [1./math.sqrt(2), 2./math.sqrt(2)], [2./math.sqrt(2), 1./math.sqrt(2)]]

        # size to resize the smallest side of the image
        self.im_size = 600

        # image channel-wise mean to subtract
        self.img_channel_mean = [103.939, 116.779, 123.68]
        self.img_scaling_factor = 1.0

        # number of ROIs at once
        self.num_rois = 4

        # stride at the RPN (this depends on the network configuration)
        self.rpn_stride = 16

        self.balanced_classes = False

        # scaling the stdev
        self.std_scaling = 4.0
        self.classifier_regr_std = [8.0, 8.0, 4.0, 4.0]

        # overlaps for RPN
        self.rpn_min_overlap = 0.3
        self.rpn_max_overlap = 0.7

        # overlaps for classifier ROIs
        self.classifier_min_overlap = 0.1
        self.classifier_max_overlap = 0.5

        # placeholder for the class mapping, automatically generated by the parser
        self.class_mapping = None

        # location of pretrained weights for the base network
        # weight files can be found at:
        # https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_th_dim_ordering_th_kernels_notop.h5
        # https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5

        self.model_path = 'model_frcnn.vgg.hdf5'

This file configures the parameters that the rest of the code needs.
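To make the anchor settings concrete, here is a minimal sketch that expands `anchor_box_scales` and `anchor_box_ratios` into the nine (width, height) pairs they describe. The helper name `anchor_shapes` is made up for this illustration and does not exist in the original code; the two lists are copied from the config above.

```python
import math

# Values copied from Config above.
anchor_box_scales = [128, 256, 512]
anchor_box_ratios = [[1, 1],
                     [1. / math.sqrt(2), 2. / math.sqrt(2)],
                     [2. / math.sqrt(2), 1. / math.sqrt(2)]]


def anchor_shapes(scales, ratios):
    """Hypothetical helper: (width, height) for every scale/ratio pair,
    with the scale as the outer loop and the ratio as the inner loop."""
    return [(s * rx, s * ry) for s in scales for rx, ry in ratios]


shapes = anchor_shapes(anchor_box_scales, anchor_box_ratios)
for w, h in shapes:
    print(f"{w:6.1f} x {h:6.1f}")
```

Because (1/sqrt(2)) * (2/sqrt(2)) = 1, the 1:2 and 2:1 anchors cover the same area as the square anchor of the same scale; only the aspect ratio changes.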

data_augment.py

import cv2
import numpy as np
import copy


# takes the image record, the augmentation config, and a flag for whether to augment
def augment(img_data, config, augment=True):
    assert 'filepath' in img_data
    assert 'bboxes' in img_data
    assert 'width' in img_data
    assert 'height' in img_data

    img_data_aug = copy.deepcopy(img_data)

    img = cv2.imread(img_data_aug['filepath'])

    if augment:
        rows, cols = img.shape[:2]

        # horizontal flip with probability 50%; the x-coordinates of each bbox are mirrored too
        if config.use_horizontal_flips and np.random.randint(0, 2) == 0:
            img = cv2.flip(img, 1)
            for bbox in img_data_aug['bboxes']:
                x1 = bbox['x1']
                x2 = bbox['x2']
                bbox['x2'] = cols - x1
                bbox['x1'] = cols - x2

        # vertical flip with probability 50%; the y-coordinates of each bbox are mirrored too
        if config.use_vertical_flips and np.random.randint(0, 2) == 0:
            img = cv2.flip(img, 0)
            for bbox in img_data_aug['bboxes']:
                y1 = bbox['y1']
                y2 = bbox['y2']
                bbox['y2'] = rows - y1
                bbox['y1'] = rows - y2

        # rotation by a multiple of 90 degrees, drawn uniformly from 0/90/180/270;
        # the bbox corners are rotated to match
        if config.rot_90:
            angle = np.random.choice([0,90,180,270],1)[0]
            if angle == 270:
                img = np.transpose(img, (1,0,2))
                img = cv2.flip(img, 0)
            elif angle == 180:
                img = cv2.flip(img, -1)
            elif angle == 90:
                img = np.transpose(img, (1,0,2))
                img = cv2.flip(img, 1)
            elif angle == 0:
                pass

            for bbox in img_data_aug['bboxes']:
                x1 = bbox['x1']
                x2 = bbox['x2']
                y1 = bbox['y1']
                y2 = bbox['y2']
                if angle == 270:
                    bbox['x1'] = y1
                    bbox['x2'] = y2
                    bbox['y1'] = cols - x2
                    bbox['y2'] = cols - x1
                elif angle == 180:
                    bbox['x2'] = cols - x1
                    bbox['x1'] = cols - x2
                    bbox['y2'] = rows - y1
                    bbox['y1'] = rows - y2
                elif angle == 90:
                    bbox['x1'] = rows - y2
                    bbox['x2'] = rows - y1
                    bbox['y1'] = x1
                    bbox['y2'] = x2
                elif angle == 0:
                    pass

    img_data_aug['width'] = img.shape[1]
    img_data_aug['height'] = img.shape[0]
    return img_data_aug, img

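The coordinate arithmetic is a little convoluted. It can be worked out with matrices or with a drawing, and the drawing is the clearer of the two. A box is stored as its two diagonal corners, one holding the smallest coordinates and one the largest: for [x1, y1, x2, y2] we have x1 < x2 and y1 < y2, and the area is (x2 - x1)(y2 - y1). After a flip, the transformed corners are no longer in (min, max) order, so they have to be reassigned by finding the smallest and the largest x and y. For example, after a horizontal flip of [x1, y1, x2, y2] we get cols - x2 < cols - x1 while y1 < y2 still holds, so the new box is [cols - x2, y1, cols - x1, y2]. The other cases work the same way.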
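The horizontal-flip rule can be checked on a toy grid. The sketch below is not from the original code: `hflip_bbox` and `paint` are names invented for this illustration, and the image is a plain list of lists so that no OpenCV call is needed (reversing each row has the same effect as `cv2.flip(img, 1)`). Boxes are treated as half-open pixel ranges.

```python
def hflip_bbox(bbox, cols):
    """Mirror an (x1, y1, x2, y2) box in an image that is `cols` pixels wide.
    The old right edge becomes the new left edge, so x1 < x2 is preserved."""
    x1, y1, x2, y2 = bbox
    return (cols - x2, y1, cols - x1, y2)


def paint(bbox, rows, cols):
    """Binary mask with ones inside the box (x in [x1, x2), y in [y1, y2))."""
    x1, y1, x2, y2 = bbox
    return [[1 if (x1 <= c < x2 and y1 <= r < y2) else 0 for c in range(cols)]
            for r in range(rows)]


rows, cols = 6, 10
box = (2, 1, 5, 4)
flipped_box = hflip_bbox(box, cols)

# Flip the painted mask itself and compare it with the mask of the flipped box.
mask_flipped = [row[::-1] for row in paint(box, rows, cols)]

print(flipped_box)                                      # (5, 1, 8, 4)
print(mask_flipped == paint(flipped_box, rows, cols))   # True
```

The same painted-mask comparison can be used to convince yourself of the vertical-flip and 90-degree rotation cases.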

IoU and RPN computation
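As orientation for this section, here is a small worked example of IoU and of how a single anchor/ground-truth pair would be labelled with the thresholds from the config (`rpn_min_overlap = 0.3`, `rpn_max_overlap = 0.7`). It is a condensed sketch, not the original code: `iou` folds the intersection and union steps into one function, and `label_pair` is a hypothetical helper that considers only one pair, whereas the real labelling looks at every ground-truth box for each anchor.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes; degenerate boxes give 0.0."""
    if a[0] >= a[2] or a[1] >= a[3] or b[0] >= b[2] or b[1] >= b[3]:
        return 0.0
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    inter = 0 if (w < 0 or h < 0) else w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return float(inter) / float(union + 1e-6)


def label_pair(iou_value, min_overlap=0.3, max_overlap=0.7):
    """Hypothetical single-pair labelling: 'pos', 'neutral' or 'neg'."""
    if iou_value > max_overlap:
        return 'pos'
    if min_overlap < iou_value < max_overlap:
        return 'neutral'
    return 'neg'


gt = (0, 0, 100, 100)
candidates = {
    'half overlap': (50, 0, 150, 100),    # 5000 / 15000, about 0.33
    'small shift':  (10, 0, 110, 100),    # 9000 / 11000, about 0.82
    'disjoint':     (200, 200, 300, 300), # no intersection
}
for name, box in candidates.items():
    value = iou(gt, box)
    print(f"{name}: IoU = {value:.3f} -> {label_pair(value)}")
```

The `1e-6` in the denominator only guards against division by zero; it shifts the result by a negligible amount.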

from __future__ import absolute_import
import numpy as np
import cv2
import random
import copy
from . import data_augment
import threading
import itertools


# union area
def union(au, bu, area_intersection):
    area_a = (au[2] - au[0]) * (au[3] - au[1])
    area_b = (bu[2] - bu[0]) * (bu[3] - bu[1])
    area_union = area_a + area_b - area_intersection
    return area_union


# intersection area
def intersection(ai, bi):
    x = max(ai[0], bi[0])
    y = max(ai[1], bi[1])
    w = min(ai[2], bi[2]) - x
    h = min(ai[3], bi[3]) - y
    if w < 0 or h < 0:
        return 0
    return w*h


# intersection over union
def iou(a, b):
    # a and b should be (x1,y1,x2,y2)

    if a[0] >= a[2] or a[1] >= a[3] or b[0] >= b[2] or b[1] >= b[3]:
        return 0.0

    area_i = intersection(a, b)
    area_u = union(a, b, area_i)

    return float(area_i) / float(area_u + 1e-6)


# image resize: scale so that the shorter side equals img_min_side
def get_new_img_size(width, height, img_min_side=600):
    if width <= height:
        f = float(img_min_side) / width
        resized_height = int(f * height)
        resized_width = img_min_side
    else:
        f = float(img_min_side) / height
        resized_width = int(f * width)
        resized_height = img_min_side

    return resized_width, resized_height


class SampleSelector:
    def __init__(self, class_count):
        # ignore classes that have zero samples
        self.classes = [b for b in class_count.keys() if class_count[b] > 0]
        self.class_cycle = itertools.cycle(self.classes)
        self.curr_class = next(self.class_cycle)

    def skip_sample_for_balanced_class(self, img_data):

        class_in_img = False

        for bbox in img_data['bboxes']:

            cls_name = bbox['class']

            if cls_name == self.curr_class:
                class_in_img = True
                self.curr_class = next(self.class_cycle)
                break

        if class_in_img:
            return False
        else:
            return True


def calc_rpn(C, img_data, width, height, resized_width, resized_height, img_length_calc_function):

    downscale = float(C.rpn_stride)
    anchor_sizes = C.anchor_box_scales
    anchor_ratios = C.anchor_box_ratios
    num_anchors = len(anchor_sizes) * len(anchor_ratios)

    # calculate the output map size based on the network architecture
    (output_width, output_height) = img_length_calc_function(resized_width, resized_height)

    n_anchratios = len(anchor_ratios)

    # initialise empty output objectives
    y_rpn_overlap = np.zeros((output_height, output_width, num_anchors))
    y_is_box_valid = np.zeros((output_height, output_width, num_anchors))
    y_rpn_regr = np.zeros((output_height, output_width, num_anchors * 4))

    num_bboxes = len(img_data['bboxes'])

    num_anchors_for_bbox = np.zeros(num_bboxes).astype(int)
    best_anchor_for_bbox = -1*np.ones((num_bboxes, 4)).astype(int)
    best_iou_for_bbox = np.zeros(num_bboxes).astype(np.float32)
    best_x_for_bbox = np.zeros((num_bboxes, 4)).astype(int)
    best_dx_for_bbox = np.zeros((num_bboxes, 4)).astype(np.float32)

    # get the GT box coordinates, and resize to account for image resizing
    # note the storage order in gta: [x1, x2, y1, y2]
    gta = np.zeros((num_bboxes, 4))
    for bbox_num, bbox in enumerate(img_data['bboxes']):
        gta[bbox_num, 0] = bbox['x1'] * (resized_width / float(width))
        gta[bbox_num, 1] = bbox['x2'] * (resized_width / float(width))
        gta[bbox_num, 2] = bbox['y1'] * (resized_height / float(height))
        gta[bbox_num, 3] = bbox['y2'] * (resized_height / float(height))

    # rpn ground truth
    for anchor_size_idx in range(len(anchor_sizes)):
        for anchor_ratio_idx in range(n_anchratios):
            anchor_x = anchor_sizes[anchor_size_idx] * anchor_ratios[anchor_ratio_idx][0]
            anchor_y = anchor_sizes[anchor_size_idx] * anchor_ratios[anchor_ratio_idx][1]

            for ix in range(output_width):
                # x-coordinates of the current anchor box
                x1_anc = downscale * (ix + 0.5) - anchor_x / 2
                x2_anc = downscale * (ix + 0.5) + anchor_x / 2

                # ignore boxes that go across image boundaries
                if x1_anc < 0 or x2_anc > resized_width:
                    continue

                for jy in range(output_height):

                    # y-coordinates of the current anchor box
                    y1_anc = downscale * (jy + 0.5) - anchor_y / 2
                    y2_anc = downscale * (jy + 0.5) + anchor_y / 2

                    # ignore boxes that go across image boundaries
                    if y1_anc < 0 or y2_anc > resized_height:
                        continue

                    # bbox_type indicates whether an anchor should be a target
                    bbox_type = 'neg'

                    # this is the best IOU for the (x,y) coord and the current anchor
                    # note that this is different from the best IOU for a GT bbox
                    best_iou_for_loc = 0.0

                    for bbox_num in range(num_bboxes):

                        # get IOU of the current GT box and the current anchor box
                        curr_iou = iou([gta[bbox_num, 0], gta[bbox_num, 2], gta[bbox_num, 1], gta[bbox_num, 3]], [x1_anc, y1_anc, x2_anc, y2_anc])
                        # calculate the regression targets if they will be needed
                        if curr_iou > best_iou_for_bbox[bbox_num] or curr_iou > C.rpn_max_overlap:
                            cx = (gta[bbox_num, 0] + gta[bbox_num, 1]) / 2.0
                            cy = (gta[bbox_num, 2] + gta[bbox_num, 3]) / 2.0
                            cxa = (x1_anc + x2_anc)/2.0
                            cya = (y1_anc + y2_anc)/2.0

                            tx = (cx - cxa) / (x2_anc - x1_anc)
                            ty = (cy - cya) / (y2_anc - y1_anc)
                            tw = np.log((gta[bbox_num, 1] - gta[bbox_num, 0]) / (x2_anc - x1_anc))
                            th = np.log((gta[bbox_num, 3] - gta[bbox_num, 2]) / (y2_anc - y1_anc))

                        if img_data['bboxes'][bbox_num]['class'] != 'bg':

                            # all GT boxes should be mapped to an anchor box, so we keep track of which anchor box was best
                            if curr_iou > best_iou_for_bbox[bbox_num]:
                                best_anchor_for_bbox[bbox_num] = [jy, ix, anchor_ratio_idx, anchor_size_idx]
                                best_iou_for_bbox[bbox_num] = curr_iou
                                best_x_for_bbox[bbox_num,:] = [x1_anc, x2_anc, y1_anc, y2_anc]
                                best_dx_for_bbox[bbox_num,:] = [tx, ty, tw, th]

                            # we set the anchor to positive if the IOU is >0.7 (it does not matter if there was another better box, it just indicates overlap)
                            if curr_iou > C.rpn_max_overlap:
                                bbox_type = 'pos'
                                num_anchors_for_bbox[bbox_num] += 1
                                # we update the regression layer target if this IOU is the best for the current (x,y) and anchor position
                                if curr_iou > best_iou_for_loc:
                                    best_iou_for_loc = curr_iou
                                    best_regr = (tx, ty, tw, th)

                            # if the IOU is >0.3 and <0.7, it is ambiguous and not included in the objective
                            if C.rpn_min_overlap < curr_iou < C.rpn_max_overlap:
                                # gray zone between neg and pos
                                if bbox_type != 'pos':
                                    bbox_type = 'neutral'

                    # turn on or off outputs depending on IOUs
                    if bbox_type == 'neg':
                        y_is_box_valid[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 1
                        y_rpn_overlap[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 0
                    elif bbox_type == 'neutral':
                        y_is_box_valid[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 0
                        y_rpn_overlap[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 0
                    elif bbox_type == 'pos':
                        y_is_box_valid[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 1
                        y_rpn_overlap[jy, ix, anchor_ratio_idx + n_anchratios * anchor_size_idx] = 1
                        start = 4 * (anchor_ratio_idx + n_anchratios * anchor_size_idx)
                        y_rpn_regr[jy, ix, start:start+4] = best_regr

    # we ensure that every bbox has at least one positive RPN region

    for idx in range(num_anchors_for_bbox.shape[0]):
        if num_anchors_for_bbox[idx] == 0:
            # no box with an IOU greater than zero ...
            if best_anchor_for_bbox[idx, 0] == -1:
                continue
            y_is_box_valid[
                best_anchor_for_bbox[idx,0], best_anchor_for_bbox[idx,1],
                best_anchor_for_bbox[idx,2] + n_anchratios * best_anchor_for_bbox[idx,3]] = 1
            y_rpn_overlap[
                best_anchor_for_bbox[idx,0], best_anchor_for_bbox[idx,1],
                best_anchor_for_bbox[idx,2] + n_anchratios * best_anchor_for_bbox[idx,3]] = 1
            start =  # (the listing is cut off at this point in the source)
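The regression targets `tx, ty, tw, th` computed inside `calc_rpn` can be tried out in isolation. The sketch below is an illustration under stated assumptions, not the original code: `encode` repeats the formulas from the listing but takes boxes as (x1, y1, x2, y2) tuples (the listing stores ground truth as [x1, x2, y1, y2]), and `decode` is a hypothetical inverse that the listing does not contain, added only to show that the encoding is reversible.

```python
import math


def encode(gt, anchor):
    """Targets (tx, ty, tw, th) of a ground-truth box relative to an anchor.
    Both boxes are (x1, y1, x2, y2)."""
    gx1, gy1, gx2, gy2 = gt
    ax1, ay1, ax2, ay2 = anchor
    aw, ah = ax2 - ax1, ay2 - ay1
    tx = ((gx1 + gx2) / 2.0 - (ax1 + ax2) / 2.0) / aw  # centre shift in anchor widths
    ty = ((gy1 + gy2) / 2.0 - (ay1 + ay2) / 2.0) / ah  # centre shift in anchor heights
    tw = math.log((gx2 - gx1) / aw)                    # log width ratio
    th = math.log((gy2 - gy1) / ah)                    # log height ratio
    return tx, ty, tw, th


def decode(t, anchor):
    """Hypothetical inverse of encode: recover (x1, y1, x2, y2) from targets."""
    tx, ty, tw, th = t
    ax1, ay1, ax2, ay2 = anchor
    aw, ah = ax2 - ax1, ay2 - ay1
    cx = tx * aw + (ax1 + ax2) / 2.0
    cy = ty * ah + (ay1 + ay2) / 2.0
    w, h = aw * math.exp(tw), ah * math.exp(th)
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


anchor = (100, 100, 228, 228)   # a 128 x 128 anchor
gt = (120, 80, 280, 240)        # a 160 x 160 ground-truth box

targets = encode(gt, anchor)
print(targets)                  # tx = 0.28125, ty = -0.03125, tw = th = log(1.25)
print(decode(targets, anchor))  # approximately (120, 80, 280, 240)
```

Normalising the centre offset by the anchor size and taking the log of the size ratio keeps the targets in a similar numeric range whether the anchor is 128 or 512 pixels wide.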

Reposted from: https://www.cnblogs.com/kekexuanxaun/p/9459368.html
