Training an Object Detection Model with TensorFlow and the Object Detection API

1. Environment Setup

1.1 Version Information

Item                    Version
OS                      Windows 10 Pro 1909
CPU                     Intel Core i5-7200U
RAM                     8 GB
GPU                     NVIDIA GeForce 940MX
GPU memory              2 GB
Python                  3.7.6
CUDA                    10.2
TensorFlow              1.14.0
Object Detection API    1.13.0

1.2 Directory Layout

[Figure: directory layout]

1.3 Installing the Object Detection API

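(1) Download the repository archive from tensorflow/models and extract it. I placed it in the models folder.
(2) Go to D:/anaconda3/models/research/object_detection/protos. It contains 31 items, most of them .proto files, which must be compiled into .py files. Open a command prompt in D:/anaconda3/models/research and run the following command. The .proto files import one another through paths that begin with object_detection/, so protoc has to be run from the research directory.

protoc object_detection/protos/*.proto --python_out=.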

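After the command succeeds, the protos directory contains a .py file for each .proto file.
The command cannot run if the protobuf package is not installed.
I use an Anaconda environment, so protobuf can be installed directly from the command line with: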

conda install protobuf  
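
To confirm that the compilation step worked, one can check that every .proto file has a matching _pb2.py file next to it. The following is a minimal sketch using only the standard library; the helper name find_uncompiled_protos is hypothetical and is not part of the Object Detection API.

```python
import os


def find_uncompiled_protos(protos_dir):
    """Return the .proto files in protos_dir that lack a matching *_pb2.py file."""
    names = set(os.listdir(protos_dir))
    missing = []
    for name in sorted(names):
        if name.endswith(".proto"):
            # protoc names its Python output <stem>_pb2.py
            compiled = name[:-len(".proto")] + "_pb2.py"
            if compiled not in names:
                missing.append(name)
    return missing
```

Calling find_uncompiled_protos("D:/anaconda3/models/research/object_detection/protos") should return an empty list once protoc has processed every file.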

(3) Add the following paths to the Python module search path.

D:/anaconda3/models/research
D:/anaconda3/models/research/slim

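Because my Python environment is managed by Anaconda, it is enough to create a .pth file in D:/anaconda3/Lib/site-packages and list the two paths in it, one per line.
(4) Open a command prompt in D:/anaconda3/models/research/slim and run the following commands in order.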
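
The .pth file can also be created from Python. This is a minimal sketch; the function name write_pth_file and the default file name object_detection.pth are assumptions made for illustration, since any file name ending in .pth works. At startup, Python's site module reads each .pth file in site-packages and appends every line that names an existing directory to sys.path.

```python
import os


def write_pth_file(site_packages_dir, search_paths, filename="object_detection.pth"):
    """Write one search path per line into a .pth file and return its location."""
    pth_path = os.path.join(site_packages_dir, filename)
    with open(pth_path, "w", encoding="utf-8") as f:
        for path in search_paths:
            f.write(path + "\n")
    return pth_path
```

For the setup in this article, the call would be write_pth_file("D:/anaconda3/Lib/site-packages", ["D:/anaconda3/models/research", "D:/anaconda3/models/research/slim"]). The new paths take effect the next time the interpreter starts.
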

python setup.py build 
python setup.py install

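If the following error appears, delete the BUILD file in that directory and run the commands again. File names on Windows are case-insensitive, so the BUILD file clashes with the build directory that setup.py tries to create.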

error: could not create 'build'

(5) To test whether the environment is configured correctly, open a command prompt in D:/anaconda3/models/research and run the following command.

python object_detection/builders/model_builder_test.py 

Output like the following indicates that the setup succeeded.

----------------------------------------------------------------------
Ran 22 tests in 0.520s

OK (skipped=1)

2. Converting VOC Data to TFRecord Format

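(1) Download the VOC2007 dataset and extract it to D:/anaconda3/models/research/object_detection/voc, following the directory layout in the figure shown earlier. In the paths below, voc always refers to D:/anaconda3/models/research/object_detection/voc.
(2) Open a command prompt in D:/anaconda3/models/research/object_detection and run the following two commands to convert the training and validation splits of the VOC dataset to TFRecord format.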

# Convert VOC -> TFRecord (train split)
python dataset_tools/create_pascal_tf_record.py --data_dir voc/VOCdevkit --year=VOC2007  --set=train --output_path=voc/pascal_train.record 
# Convert VOC -> TFRecord (val split)
python dataset_tools/create_pascal_tf_record.py --data_dir voc/VOCdevkit --year=VOC2007  --set=val --output_path=voc/pascal_val.record 
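
To make the conversion less of a black box: as far as I understand it, the script reads each Pascal VOC XML annotation and stores the bounding boxes as coordinates normalized by the image width and height. The sketch below reproduces only that parsing and normalization step with the standard library. It does not write a TFRecord, and the sample annotation and the function name parse_voc_annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in Pascal VOC layout, used only for illustration.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>250</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>50</xmin><ymin>25</ymin><xmax>250</xmax><ymax>125</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (filename, [(class_name, xmin, ymin, xmax, ymax), ...]).

    Box coordinates are divided by the image size, so they fall in [0, 1].
    """
    root = ET.fromstring(xml_text.strip())
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.find("name").text,
            float(box.find("xmin").text) / width,
            float(box.find("ymin").text) / height,
            float(box.find("xmax").text) / width,
            float(box.find("ymax").text) / height,
        ))
    return root.find("filename").text, boxes
```

Running parse_voc_annotation on one of the files under the dataset's Annotations folder is a quick way to check that an annotation is well-formed before converting the whole dataset.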

3. Training the Model

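(1) Download a pre-trained model from the COCO-trained models list, extract it, and save it under voc/pretrained. I chose ssd_mobilenet_v1_0.75_depth_coco.
(2) Edit the configuration file, which the training process requires. Because its format is fixed, you can copy ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync.config from object_detection/samples/configs and modify the copy. Save the result as voc/config/pipeline_config.config. The changes are as follows: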

Line    Change                          Note
13      90 -> 20                        Number of object classes to train on
188     8000 -> 328                     Number of images used during evaluation
146     voc/pretrained/model.ckpt
180     voc/pascal_train.record
182     voc/pascal_label_map.pbtxt
193     voc/pascal_val.record
195     voc/pascal_label_map.pbtxt
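
Editing by line number is fragile if the sample config changes between releases, so the same changes can be applied by field name instead. The sketch below rewrites three fields with regular expressions. It assumes that the values on lines 13, 188, and 146 belong to the fields num_classes, num_examples, and fine_tune_checkpoint, and the function name patch_pipeline_config is hypothetical. The input_path and label_map_path entries appear once for training and once for evaluation, so they are left for manual editing here.

```python
import re


def patch_pipeline_config(text, num_classes, num_eval_examples, checkpoint):
    """Return the config text with three fields rewritten.

    Replacement functions are used instead of replacement strings so that
    backslashes in a Windows path are not treated as regex escapes.
    """
    text = re.sub(r"num_classes:\s*\d+",
                  lambda m: "num_classes: %d" % num_classes, text)
    text = re.sub(r"num_examples:\s*\d+",
                  lambda m: "num_examples: %d" % num_eval_examples, text)
    text = re.sub(r'fine_tune_checkpoint:\s*"[^"]*"',
                  lambda m: 'fine_tune_checkpoint: "%s"' % checkpoint, text)
    return text
```

A typical use is to read the copied sample config, pass its text through patch_pipeline_config(text, 20, 328, "voc/pretrained/model.ckpt"), and write the result to voc/config/pipeline_config.config.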

(3) Open a command prompt in D:/anaconda3/models/research/object_detection and run the following command to start training.

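# Train
python model_main.py --model_dir voc/train_dir/ --pipeline_config_path voc/config/pipeline_config.config

4. Saving the Model

The trained model is saved to the specified directory as checkpoints, which cannot be used directly; they must be converted into a format that can be loaded for inference. When I first ran the training command with --train_dir instead of --model_dir, the console output showed that the model had been saved to a randomly named folder under C:\Users\new\AppData\Local\Temp rather than to voc/train_dir. The most likely reason is that model_main.py reads its output directory from --model_dir; --train_dir is not one of its flags, so it was ignored and a temporary directory was used instead.
If that happens, move the trained model into voc/train_dir. Then create the voc/export folder.
Open a command prompt in D:/anaconda3/models/research/object_detection and run the following command to start the conversion.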

# Export the model
python export_inference_graph.py --input_type image_tensor --pipeline_config_path voc/config/pipeline_config.config --trained_checkpoint_prefix voc/train_dir/model.ckpt-0 --output_directory voc/export
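
The value passed to --trained_checkpoint_prefix ends in the training step of the checkpoint (model.ckpt-0 in the command above), so it has to match a checkpoint that actually exists in the training directory. The following sketch finds the newest one by looking for model.ckpt-<step>.index files; the function name latest_checkpoint_prefix is hypothetical. TensorFlow's own tf.train.latest_checkpoint should give a similar answer when the checkpoint state file is intact.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    prefix = os.path.join(train_dir, "model.ckpt-%d" % max(steps))
    # Use forward slashes so the result can be pasted into the command line as-is.
    return prefix.replace("\\", "/")
```

For example, latest_checkpoint_prefix("voc/train_dir") returns the prefix to pass to the export command, or None if training has not written any checkpoint yet.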

5. Testing the Model

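Use the trained model to run detection on the sample images.
Open a command prompt in D:/anaconda3/models/research/object_detection and run the following command to start the test.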

python voc/detect_test.py

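The contents of detect_test.py are shown below. Most of it comes from object_detection_tutorial.ipynb in the object_detection directory; the only change is that the sections that download and extract the model are commented out.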

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
 
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
 
 
# # This is needed to display the images.
# %matplotlib inline
 
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
 
# from utils import label_map_util
# from utils import visualization_utils as vis_util
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
'''
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
''' 
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = 'voc/export/frozen_inference_graph.pb'
 
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'pascal_label_map.pbtxt')
 
NUM_CLASSES = 20
'''
# download model
opener = urllib.request.URLopener()
# Download the model; comment out the next line if it has already been downloaded
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
 '''
# Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
# Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)
 
 
# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
 
 
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3)]
 
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
 
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Look up the input and output tensors of the detection graph once.
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for the corresponding object.
        # The score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            # The array-based representation of the image is used later to draw
            # the boxes and labels on the result image.
            image_np = load_image_into_numpy_array(image)
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            plt.figure(figsize=IMAGE_SIZE)
            plt.imshow(image_np)
            plt.show()
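
The script only draws the results, but the raw outputs can also be used directly. The boxes come back as normalized [ymin, xmin, ymax, xmax] values, so they must be scaled by the image size to get pixel coordinates. The helper below is a sketch, not part of the tutorial code; the name to_pixel_boxes and the default threshold of 0.5 are assumptions chosen for illustration.

```python
def to_pixel_boxes(boxes, scores, classes, image_width, image_height, min_score=0.5):
    """Keep detections scoring at least min_score and convert their boxes
    from normalized [ymin, xmin, ymax, xmax] to pixel (left, top, right, bottom)."""
    results = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            "class_id": int(cls),
            "score": float(score),
            "box": (int(round(xmin * image_width)), int(round(ymin * image_height)),
                    int(round(xmax * image_width)), int(round(ymax * image_height))),
        })
    return results
```

Inside the detection loop it could be called as to_pixel_boxes(np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes), *image.size), since PIL's image.size is (width, height). The class_id values can then be looked up in category_index.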

6. Annotating Images and Converting Formats

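This section is meant to make it easier to train on your own data: annotate your images, save the annotations in Pascal VOC format, and then convert them to produce .tfrecord files ready for training.
(1) I used the labelImg tool for annotation. After downloading it, follow the instructions in its README.rst to install it for your operating system.
(2) There are many blog posts on format conversion. Following this article, you can adapt the code yourself to convert VOC-format data to TFRecord format.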
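
Training on your own classes also needs a label map in the same pbtxt layout as pascal_label_map.pbtxt. The sketch below generates one from a list of class names; the function name make_label_map is hypothetical. Ids start at 1 because 0 is reserved for the background class.

```python
def make_label_map(class_names):
    """Return label map text in pbtxt form, with ids starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return "\n".join(items)
```

The names must match the labels used during annotation exactly, and the number of entries should equal the num_classes value in the pipeline config.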

7. Summary

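I ran into many small but time-consuming problems while working through the steps above.

The first was the version of the Object Detection API. Most of the tutorials I found did not state which version of the API they used, so I initially downloaded the latest one. As a result, several problems had no solution I could find. For example, the test in step (5) of section 1.3 finished without printing anything, so I could not tell whether the installation had succeeded.

The second was the search path. I did not know how Python's module search path is configured, so I first added the paths to the system Path variable, which had no effect. Only then did I switch to the .pth file approach.

Next was the VOC-to-TFRecord conversion. Before the API was installed correctly, I could not call the dedicated conversion script and had to rely on scripts written by other people, which did not work very well.

There was also a problem with the Ubuntu kernel. I once tried running everything on Ubuntu, but while setting up CUDA I found that the kernel version was too new, and I spent some more time installing an older kernel.

Finally, my hardware is not powerful. Training quit outright when GPU memory ran out, so I switched to a smaller model and reduced the scale of training on the training set.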
