Object Detection with TensorFlow

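Object detection is a computer vision technique that lets a software system detect, locate, and track objects in a given image or video. What sets it apart is that it both identifies an object's class (person, table, chair, and so on) and reports where the object sits in the image. That position is usually indicated by drawing a bounding box around the object. A bounding box may or may not localize the object accurately, and how well an algorithm localizes objects within the image is a key measure of its performance. Face detection is one example of object detection.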

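Typically, an object detection task breaks down into three steps:

Generate small segments of the input, as shown in the figure below. As you can see, a large number of bounding boxes cover the entire image.
Extract features from each rectangular segment to predict whether the rectangle contains a valid object.
Merge overlapping boxes into a single bounding rectangle (non-maximum suppression).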
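The first and third steps above can be sketched in plain Python. This is only an illustration, not the method the TensorFlow models use internally: the function names, the (xmin, ymin, xmax, ymax) box layout, the fixed window size, and the 0.5 overlap threshold are all assumptions made for the example. The scores passed to the suppression step stand in for whatever the feature-extraction step would predict.

```python
def sliding_windows(image_width, image_height, window, stride):
    """Step 1 (illustrative): yield square candidate boxes covering the image."""
    for y in range(0, image_height - window + 1, stride):
        for x in range(0, image_width - window + 1, stride):
            yield (x, y, x + window, y + window)


def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    inter = max(0, inter_w) * max(0, inter_h)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Step 3 (illustrative): keep the best-scoring box of each overlapping group.

    Returns the indices of the boxes that survive, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, given two heavily overlapping boxes and one separate box, non_max_suppression keeps the higher-scoring box of the overlapping pair plus the separate one.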

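TensorFlow is an open-source library for numerical computation and large-scale machine learning. It simplifies acquiring data, training models, serving predictions, and refining future results. TensorFlow bundles machine learning and deep learning models and algorithms, exposes them through a convenient Python front end, and runs them efficiently in optimized C++.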

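To do object detection with TensorFlow, you do not necessarily need to understand neural networks or machine learning, because we mostly rely on the files that the API provides. All we need is some Python knowledge and the enthusiasm to finish the project.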

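Follow these steps:

Step 1: Create a folder named ObjectDetection and open it in VS Code.

Step 2: Download the TensorFlow API from its GitHub repository by entering the following command in the VS Code terminal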

git clone https://github.com/tensorflow/models

Step 3: Set up a virtual environment

python -m venv --system-site-packages .\venv

Activate the environment

.\venv\Scripts\activate

Upgrade pip to the latest version

python -m pip install --upgrade pip
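Before installing anything, it can help to confirm that the interpreter VS Code is using really is the one inside the virtual environment. The snippet below is a minimal sketch using only the standard library; the function name in_virtualenv is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True when this interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment folder while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints False after activation, the terminal is most likely still using the system Python.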

Step 4: Install the dependencies

Install and upgrade TensorFlow

pip install tensorflow
pip install --upgrade tensorflow

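Install matplotlib and the other supporting packages (opencv-python provides the cv2 module that the detection script imports)

pip install pillow Cython lxml jupyter matplotlib opencv-python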
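A quick way to see whether the installs above succeeded is to ask Python which modules it can find. This is a small sketch, not part of the TensorFlow API: the helper name missing_modules and the REQUIRED list are assumptions, chosen to match the third-party modules that the detection script in this article imports.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the detection script (assumed list for this check).
# Note that import names can differ from pip package names:
# pillow is imported as PIL, opencv-python as cv2.
REQUIRED = ["tensorflow", "numpy", "PIL", "matplotlib", "cv2"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```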

Navigate to the research subfolder inside the models folder.

cd models\research

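Step 5: Now we need to download Protocol Buffers (Protobuf), Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think of it as XML, but smaller, faster, and simpler.

Extract the contents of the downloaded Protobuf zip into the research subfolder of the models folder, open the bin folder, and copy the path of that folder, which contains protoc.exe.
Then open "Edit the system environment variables" and click "Environment Variables".
(i) Under System variables, select 'Path' and click Edit.
(ii) Click New and paste the copied path.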
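After editing the environment variables, open a new terminal so the change is picked up, then check that protoc can be found. The sketch below uses only the standard library; the helper name tool_version is made up for this example, and it relies on protoc accepting the --version flag.

```python
import shutil
import subprocess


def tool_version(name="protoc"):
    """Return the tool's version string, or None if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = tool_version("protoc")
    if version is None:
        print("protoc was not found on PATH; recheck the Path entry.")
    else:
        print("Found", version)
```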
Step 6: Then run this command in the VS Code terminal

protoc object_detection/protos/*.proto --python_out=.
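With --python_out, protoc writes a Python module named name_pb2.py for each name.proto it compiles. If the wildcard was not expanded by your shell, some of those modules may be missing. The following sketch lists .proto files that have no generated counterpart; the helper name protos_without_pb2 is an assumption for this example.

```python
from pathlib import Path


def protos_without_pb2(protos_dir):
    """Return names of .proto files lacking a generated *_pb2.py module."""
    protos_dir = Path(protos_dir)
    return sorted(
        proto.name
        for proto in protos_dir.glob("*.proto")
        if not (protos_dir / f"{proto.stem}_pb2.py").exists()
    )


if __name__ == "__main__":
    # Run from the research folder, where the protoc command was executed.
    leftover = protos_without_pb2("object_detection/protos")
    print("Not compiled:", leftover if leftover else "none")
```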

Step 7: In the same folder, create a new Python file named detect.py and paste in the code below:

import os
import pathlib

import cv2
import numpy as np
import tensorflow as tf

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Move up until the working directory is no longer inside the models folder
while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')


# Download a pretrained model and load it
def load_model(model_name):
    base_url = 'http://download.tensorflow.org/models/object_detection/'
    model_file = model_name + '.tar.gz'
    model_dir = tf.keras.utils.get_file(
        fname=model_name,
        origin=base_url + model_file,
        untar=True)
    model_dir = pathlib.Path(model_dir) / "saved_model"
    model = tf.saved_model.load(str(model_dir))
    return model


# Path to the label file
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'
# Build the category index (class id -> display name)
category_index = label_map_util.create_category_index_from_labelmap(
    PATH_TO_LABELS, use_display_name=True)

# Model name
model_name = 'ssd_inception_v2_coco_2017_11_17'
# Load the detection model
detection_model = load_model(model_name)


# Run inference on a single image
def run_inference_for_single_image(model, image):
    image = np.asarray(image)
    # The input needs to be a tensor; convert it with `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]

    # Run inference
    model_fn = model.signatures['serving_default']
    output_dict = model_fn(input_tensor)

    # All outputs are batched tensors.
    # Convert them to numpy arrays and take index [0] to drop the batch dimension.
    # We are only interested in the first num_detections detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key: value[0, :num_detections].numpy()
                   for key, value in output_dict.items()}
    output_dict['num_detections'] = num_detections

    # detection_classes should be integers.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    # Handle models that output masks:
    if 'detection_masks' in output_dict:
        # Reframe the box masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()

    return output_dict


# Run detection on a frame and draw the results on it
def show_inference(model, frame):
    # Take the frame from the camera and convert it to an array
    image_np = np.array(frame)
    # Actual detection.
    output_dict = run_inference_for_single_image(model, image_np)
    # Visualize the detection results.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=5)
    return image_np


# Now open the camera and start detecting objects
video_capture = cv2.VideoCapture(0)
while True:
    # Capture frame by frame
    ret, frame = video_capture.read()
    if not ret:
        # Stop if the camera did not return a frame
        break
    image_np = show_inference(detection_model, frame)
    cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()
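The script above only draws the results, but the dictionary returned by run_inference_for_single_image can also be used directly. Its detection_boxes entries are normalized [ymin, xmin, ymax, xmax] values between 0 and 1. The sketch below, which is not part of the Object Detection API, keeps the detections above a score threshold and converts their boxes to pixel coordinates; the helper name filter_detections, the 0.5 default threshold, and the output layout are assumptions made for this example.

```python
def filter_detections(output_dict, image_height, image_width, min_score=0.5):
    """Return confident detections with boxes in pixel coordinates.

    Each result holds the class id, the score, and the box as
    (xmin, ymin, xmax, ymax) in pixels.
    """
    results = []
    for box, class_id, score in zip(
        output_dict["detection_boxes"],
        output_dict["detection_classes"],
        output_dict["detection_scores"],
    ):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            "box_px": (
                round(float(xmin) * image_width),
                round(float(ymin) * image_height),
                round(float(xmax) * image_width),
                round(float(ymax) * image_height),
            ),
        })
    return results
```

Inside show_inference, it could be called as filter_detections(output_dict, image_np.shape[0], image_np.shape[1]), and each class_id can be looked up in category_index to get a display name.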