TensorFlow Object Detection API

Environment: Windows 10, Visual Studio 2012, TensorFlow 2.0 (CPU version)

I. Environment Setup

        Make sure the environment above is already set up (any comparable platform works too); the key point is that TensorFlow can actually run. I won't go into setup details here — the focus is on trying out the TensorFlow Object Detection API and working through its pitfalls. A quick check follows.
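        Before going further, it is worth confirming that TensorFlow itself imports and runs. The snippet below is a minimal sanity check of my own, not part of the official tutorial:

import tensorflow as tf

# Print the installed version; with the environment above this should be 2.0.x.
print(tf.__version__)
# A tiny eager-mode computation; if this prints 4.0, TensorFlow is working.
print(tf.reduce_sum(tf.ones([2, 2])).numpy())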

II. Download the Required Files and Tools

1. Download the API files

        Download link: https://codeload.github.com/tensorflow/models/zip/master (about 420 MB)

        After the download finishes, extract the archive into the folder where you keep your projects. Don't put it on the desktop; if you move it later, you will have to fix the broken paths.

2. Install the dependencies

numpy, os, urllib, sys, tarfile, tensorflow, zipfile, collections, io, matplotlib, PIL

        Install the third-party packages (numpy, tensorflow, matplotlib, and pillow for PIL) however you like, e.g. with pip install; the rest of the list (os, urllib, sys, tarfile, zipfile, collections, io) is part of the Python standard library and needs no installation. I used the Python package-management feature of the VS 2012 Python extension: type the package name, press Enter, and wait for the installation to finish. A quick import check is shown below.
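        To confirm everything is importable before running the tutorial, here is a small sketch of my own (note that the PIL module is provided by the pillow package):

import importlib

# Try to import every module the tutorial uses; a missing package raises ImportError here.
modules = ["numpy", "tensorflow", "matplotlib", "PIL",
           "os", "urllib", "sys", "tarfile", "zipfile", "collections", "io"]
for name in modules:
    importlib.import_module(name)
    print("OK:", name)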


3. Download Protobuf (protoc), the tool that converts .proto files to .py files

        In addition, you need a tool to compile all of the .proto files in object_detection/protos into .py files. Download link: https://github.com/protocolbuffers/protobuf/releases/download/v3.4.0/protoc-3.4.0-win32.zip . Since we are on Windows, grab the win32 build, extract it into your project folder, and then add it to the PATH environment variable so that protoc.exe in protoc-3.4.0-win32\bin can be found from any terminal:

        1) Press Win+R, type sysdm.cpl, and press Enter

        2) Advanced -> Environment Variables


        3) Edit the Path variable and append an entry such as C:\Users\2hi9uo\source\pythonProjects\protoc-3.4.0-win32\bin, replacing the path with wherever you just extracted the protoc tool.

4. Add a "PYTHONPATH" environment variable

        1) As before, open the environment variable settings

        2) This time, create a new entry

        Variable name: PYTHONPATH

        Variable value: C:\Users\2hi9uo\source\pythonProjects\models-master\research;C:\Users\2hi9uo\source\pythonProjects\models-master\research\slim

        Again, replace these paths with the location where you extracted the models download.


        3) After confirming, restart the computer; the change only takes effect after a restart. A quick way to verify the setting after rebooting is shown below.
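        A minimal check of my own (not part of the official instructions) to confirm that PYTHONPATH was picked up:

import os

# Should print both ...\models-master\research and ...\models-master\research\slim
print(os.environ.get("PYTHONPATH"))

# Raises ImportError if the research folder is not on the import path.
import object_detection
print("object_detection found at:", object_detection.__file__)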

 

5. Convert the .proto files to .py files

        Open a CMD window, cd into the models-master\research folder, and run:

protoc object_detection/protos/*.proto --python_out=.

        If there are no errors and a number of .py files appear in object_detection/protos, it worked. Otherwise, check the following (a quick verification sketch follows this list):

        1) Is the PATH entry set up correctly? Run protoc --version in CMD to confirm;

        2) Did you cd into models-master\research before running the command?
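        As an extra check of my own, the following sketch (run from models-master\research) counts the generated *_pb2.py files:

import glob

# After a successful protoc run there should be several dozen *_pb2.py files here.
generated = glob.glob("object_detection/protos/*_pb2.py")
print(len(generated), "generated files")
for path in generated[:5]:
    print(path)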

 

III. Test the API

1. Open the models folder in VS 2012

2. Go to object_detection and create a new file named object_detection_tutorial.py with the following content:


import numpy as np
import os
import urllib.request
import sys
import tarfile
import tensorflow as tf
import zipfile
from collections import defaultdict
from io import StringIO
import matplotlib as mpl
mpl.use('TkAgg')  # the default Agg backend cannot open a window; TkAgg can (see section IV)
import matplotlib.pyplot as plt
from PIL import Image
# # This is needed to display the images when running inside a notebook.
# %matplotlib inline
 
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
 
# from utils import label_map_util
# from utils import visualization_utils as vis_util
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
 
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
 
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
 
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')
 
NUM_CLASSES = 90
 
# Download the model. If you have already downloaded ssd_mobilenet_v1_coco_11_06_2017,
# comment out the two opener lines below.
# The download may fail with an HTTP 302 error; in that case download the archive
# manually and place it in the object_detection folder.
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
 
# Load a (frozen) Tensorflow model into memory.
# The frozen-graph API is not in the TF 2.x top-level namespace, so under
# TensorFlow 2.0 these calls go through tf.compat.v1.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.compat.v1.GraphDef()
    with tf.compat.v1.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.compat.v1.import_graph_def(od_graph_def, name='')
# Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)
 
 
# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
 
 
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3)]
 
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
 
with detection_graph.as_default():
    with tf.compat.v1.Session(graph=detection_graph) as sess:
        # Define the input and output Tensors for detection_graph.
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for each of the objects.
        # The score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            # The array based representation of the image will be used later in order to prepare the
            # result image with boxes and labels on it.
            image_np = load_image_into_numpy_array(image)
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            plt.figure(figsize=IMAGE_SIZE)
            plt.imshow(image_np)
            plt.show()

3. Set object_detection_tutorial.py as the startup file and run it!

        If there are no errors, the test images with detections drawn on them will appear after a short wait.

        If an error or warning about the matplotlib rendering backend appears, it can be ignored; switching to the TkAgg backend, as the script already does, resolves it.


        If you've made it this far without problems, congratulations — that's the end of the API walkthrough.

IV. Pitfalls and Notes

        The code in object_detection_tutorial.py above has already been modified from the official version, and the official tutorial expects you to run it in a Jupyter notebook. Since some readers may not have Jupyter set up, I ran it directly from VS 2012 instead. A few things to note:

        1. matplotlib's default backend is Agg, but TkAgg is needed here to actually render the images in a window, so I added mpl.use('TkAgg') right after importing matplotlib:

        import matplotlib as mpl

        mpl.use('TkAgg')

        Side note: rendering with cv2 (OpenCV) would be faster, since all it needs to do is stamp the labels and confidence percentages onto the image; a minimal sketch is given below.
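        A minimal sketch of that idea (my own addition; it assumes the opencv-python package is installed and reuses the image_np array produced inside the detection loop above):

import cv2

# image_np already has boxes and labels drawn on it by vis_util.
# OpenCV expects BGR channel order, so convert from RGB before displaying.
cv2.imshow('detections', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)
cv2.destroyAllWindows()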

        2. The utils folder inside object_detection holds the label-map helpers and the visualization tools. Two lines in the code import from it:

        from object_detection.utils import label_map_util

        from object_detection.utils import visualization_utils as vis_util

        These import two modules from that folder. Note that object_detection.utils is the path to utils relative to your import root, so if your layout differs you will need to adjust it, e.g. from research.object_detection.utils import ... or from utils import .... A path-independent workaround is sketched below.
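        If you would rather not depend on how the script is launched, one option (my own sketch, not from the tutorial) is to put the research folder on sys.path explicitly before the imports:

import os
import sys

# Assumes this file lives in models-master\research\object_detection;
# adding its parent (the research folder) lets "from object_detection.utils import ..." resolve.
RESEARCH_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
sys.path.insert(0, RESEARCH_DIR)

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util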

        3. About the command protoc object_detection/protos/*.proto --python_out=.

        Make sure the PATH entry for protoc.exe is configured before running it; protoc --version is a quick way to check. If it still fails, look at other posts covering the specific error.

        4. To be continued...

        If I've missed anything, corrections are welcome!

