I have been studying Mask R-CNN for quite a while, and only today did I finally get to train it on my own dataset, again inside Docker, starting from the very basics of installing pycocotools. I hit plenty of pitfalls along the way, most of which I have already forgotten. Over these two days I learned how to train on my own data. The dataset I chose is the DeepFashion clothing dataset; since this is only a demo, I kept just three categories with 20 images each, 60 images in total, and I will expand the dataset later.
1. All of my data was annotated by hand with the labelme tool.
Installation is straightforward: a plain pip install is enough, and there are plenty of tutorials.
I modified the conversion script so that all of the json files can be turned into datasets in one batch.
You can simply follow this post, which describes the batch json_to_dataset method in detail:
Batch labelme json_to_dataset — yql_617540298, CSDN: https://blog.csdn.net/yql_617540298/article/details/81110685
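For reference, a minimal batch-conversion sketch looks like the following. The json directory name is an assumption about where you saved the labelme annotations; labelme installs a labelme_json_to_dataset command alongside the GUI, and each call converts one annotation file.

import glob
import os
import subprocess

json_dir = 'train_data/json'  # assumption: where the labelme .json files were saved
for json_file in sorted(glob.glob(os.path.join(json_dir, '*.json'))):
    # labelme ships a json_to_dataset console command; one call per annotation file.
    # Depending on the labelme version, the output lands in a "<name>_json" folder.
    subprocess.run(['labelme_json_to_dataset', json_file], check=True)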
After conversion there are 60 folders, one per image, and each folder holds the five files labelme generates (including img.png, label.png and info.yaml).
Inside train_data I created four folders, mirroring the layout used in the referenced blog:
pic holds the original images; json holds the labelme .json annotation files; labelme_json holds the folders produced by json_to_dataset; cv2_mask holds the 8-bit label.png masks, one per image, in which each object class is drawn with its own label value.
One thing to check: different labelme versions write label.png with different bit depths, and 8-bit is what the training code expects. If your version produces something else you will need an extra conversion step; mine was already 8-bit, so I did not have to write any C++ conversion code.
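If you are not sure which bit depth your labelme produced, a quick check with Pillow looks like this (just a sketch; the path is an example and should point at one of your _json folders):

from PIL import Image
import numpy as np

p = 'labelme_json/img_0001_json/label.png'   # example path, adjust to one of your folders
mask = Image.open(p)
print(mask.mode)   # 'P' or 'L' means 8-bit; 'I' or 'I;16' means it needs converting
if mask.mode not in ('P', 'L'):
    # the label values are small integers, so a plain uint8 cast keeps them intact
    Image.fromarray(np.asarray(mask).astype(np.uint8)).save(p)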
Since every one of those _json folders contains a label.png, I wrote a small script that renames each label.png after its parent folder and moves it into cv2_mask.
Here is the script:

import os

path = 'labelme_json'      # folders produced by json_to_dataset, e.g. img_0001_json/
out_dir = 'cv2_mask'
os.makedirs(out_dir, exist_ok=True)

for folder in os.listdir(path):
    # folder names end in "_json"; strip that suffix to recover the image name
    new_name = folder[:-5]
    src = os.path.join(path, folder, 'label.png')   # pick label.png explicitly
    dst = os.path.join(out_dir, new_name + '.png')
    print(src, '->', dst)
    os.rename(src, dst)                             # rename also moves the file

With everything laid out neatly, I could move on to training.
My training data covers three classes, Blazer, Blouse and Coat, with 20 images per class; once the files were all in place I wrote the training script.
Under samples I created a new folder called clothes and put the training script in it; the directory layout I assume is sketched below.
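Roughly, the layout looks like this (the script name is just an example, and train_data has to sit in whatever directory you launch the script from, since section 4 refers to it with a relative path):

Mask_RCNN/
├── samples/clothes/train_clothes.py   # this training script (name is an example)
├── mask_rcnn_coco.h5                  # COCO weights, downloaded automatically if missing
├── logs/                              # checkpoints written during training
└── train_data/
    ├── pic/            # original images
    ├── json/           # labelme .json annotation files
    ├── labelme_json/   # xxx_json folders produced by json_to_dataset
    └── cv2_mask/       # renamed 8-bit label.png masks, one per image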
The code is as follows:
1 Imports and paths

# -*- coding: utf-8 -*-

import os
import sys
import random
import math
import re
import time
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
ROOT_DIR = os.path.abspath("../../")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn.config import Config
#import utils
from mrcnn import model as modellib,utils
from mrcnn import visualize
import yaml
from mrcnn.model import log
from PIL import Image
MODEL_DIR = os.path.join(ROOT_DIR, "logs")



iter_num=0

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)



2 Override the Config class

class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Train on 1 GPU and 8 images per GPU. We can put multiple images on each
    # GPU because the images are small. Batch size is 8 (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2

    # Number of classes (including background)
    NUM_CLASSES = 1 + 3  # background + 3 shapes

    # Use small images for faster training. Set the limits of the small side
    # the large side, and that determines the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 384

    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels

    # Reduce training ROIs per image because the images are small and have
    # few objects. Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 100

    # Use a small epoch since the data is simple
    STEPS_PER_EPOCH = 100

    # use small validation steps since the epoch is small
    VALIDATION_STEPS = 50


config = ShapesConfig()
config.display()
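A quick sanity check on these numbers for the 60-image demo set (just arithmetic, nothing the training code needs):

batch_size = config.GPU_COUNT * config.IMAGES_PER_GPU   # 1 * 2 = 2 images per step
steps_per_pass = 60 // batch_size                        # 30 steps to see every training image once
print(batch_size, steps_per_pass)                        # STEPS_PER_EPOCH = 100 just repeats images within an "epoch"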


3 Override the Dataset class

class DrugDataset(utils.Dataset):
    # Number of instances (objects) in the image: the mask pixels are labeled
    # 1..n, so the maximum value is the instance count
    def get_obj_index(self, image):
        n = np.max(image)
        return n

    # Parse the info.yaml written by labelme to get the class label of each mask layer
    def from_yaml_get_class(self, image_id):
        info = self.image_info[image_id]
        with open(info['yaml_path']) as f:
            # newer PyYAML versions require an explicit Loader
            temp = yaml.load(f.read(), Loader=yaml.FullLoader)
            labels = temp['label_names']
            del labels[0]  # drop the background entry
        return labels

    # Rewrite draw_mask: fill one mask layer per instance from the label image
    def draw_mask(self, num_obj, mask, image,image_id):
        #print("draw_mask-->",image_id)
        #print("self.image_info",self.image_info)
        info = self.image_info[image_id]
        #print("info-->",info)
        #print("info[width]----->",info['width'],"-info[height]--->",info['height'])
        for index in range(num_obj):
            for i in range(info['width']):
                for j in range(info['height']):
                    #print("image_id-->",image_id,"-i--->",i,"-j--->",j)
                    #print("info[width]----->",info['width'],"-info[height]--->",info['height'])
                    at_pixel = image.getpixel((i, j))
                    if at_pixel == index + 1:
                        mask[j, i, index] = 1
        return mask

    # Rewrite load_shapes to register our own classes (more can be added the
    # same way) and to store path, mask_path and yaml_path in self.image_info.
    # The folder layout it expects matches section 4 below:
    #   dataset_root_path = "train_data/"
    #   img_floder  = dataset_root_path + "pic"
    #   mask_floder = dataset_root_path + "cv2_mask"
    def load_shapes(self, count, img_floder, mask_floder, imglist, dataset_root_path):
        """Load `count` images and register their metadata with the dataset.
        count: number of images to load.
        """
        # Add classes; extend this list to support more categories
        self.add_class("shapes", 1, "Blazer")
        self.add_class("shapes", 2, "Blouse")
        self.add_class("shapes", 3, "Coat")
        for i in range(count):
            # Read img.png to get this image's width and height

            filestr = imglist[i].split(".")[0]
            #print(imglist[i],"-->",cv_img.shape[1],"--->",cv_img.shape[0])
            #print("id-->", i, " imglist[", i, "]-->", imglist[i],"filestr-->",filestr)
            #filestr = filestr.split("_")[1]
            mask_path = mask_floder + "/" + filestr + ".png"
            yaml_path = dataset_root_path + "labelme_json/" + filestr + "_json/info.yaml"
            print(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")
            cv_img = cv2.imread(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")

            self.add_image("shapes", image_id=i, path=img_floder + "/" + imglist[i],
                           width=cv_img.shape[1], height=cv_img.shape[0], mask_path=mask_path, yaml_path=yaml_path)

    # Rewrite load_mask to build the instance masks and class IDs from label.png and info.yaml
    def load_mask(self, image_id):
        """Generate instance masks for shapes of the given image ID.
        """
        global iter_num
        print("image_id",image_id)
        info = self.image_info[image_id]
        count = 1  # number of object
        img = Image.open(info['mask_path'])
        num_obj = self.get_obj_index(img)
        mask = np.zeros([info['height'], info['width'], num_obj], dtype=np.uint8)
        mask = self.draw_mask(num_obj, mask, img,image_id)
        occlusion = np.logical_not(mask[:, :, -1]).astype(np.uint8)
        for i in range(count - 2, -1, -1):
            mask[:, :, i] = mask[:, :, i] * occlusion

            occlusion = np.logical_and(occlusion, np.logical_not(mask[:, :, i]))
        labels = []
        labels = self.from_yaml_get_class(image_id)
        labels_form = []
        for i in range(len(labels)):
            if labels[i].find("Blazer") != -1:
                # print "box"
                labels_form.append("Blazer")
            elif labels[i].find("Blouse")!=-1:
                #print "column"
                labels_form.append("Blouse")
            elif labels[i].find("Coat")!=-1:
                #print "package"
                labels_form.append("Coat")
        class_ids = np.array([self.class_names.index(s) for s in labels_form])
        return mask, class_ids.astype(np.int32)


def get_ax(rows=1, cols=1, size=8):
    """Return a Matplotlib Axes array to be used in
    all visualizations in the notebook. Provide a
    central point to control graph sizes.

    Change the default size attribute to control the size
    of rendered images
    """
    _, ax = plt.subplots(rows, cols, figsize=(size * cols, size * rows))
    return ax
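A side note on the draw_mask method above: the per-pixel getpixel loop is very slow on larger images. An equivalent NumPy version (just a sketch; it relies on np.asarray returning the label image as a [height, width] array, which matches the mask[j, i, index] indexing used above) could replace the method inside DrugDataset:

    def draw_mask(self, num_obj, mask, image, image_id):
        # np.asarray(PIL image) is indexed [row (y), column (x)], so one
        # comparison per instance id fills a whole mask layer at once
        arr = np.asarray(image)
        for index in range(num_obj):
            mask[:, :, index] = (arr == index + 1).astype(np.uint8)
        return mask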


4 Basic setup and dataset preparation

# Basic setup
dataset_root_path="train_data/"
img_floder = dataset_root_path + "pic"
mask_floder = dataset_root_path + "cv2_mask"
#yaml_floder = dataset_root_path
imglist = os.listdir(img_floder)
count = len(imglist)

# Prepare the train and val datasets
dataset_train = DrugDataset()
dataset_train.load_shapes(count, img_floder, mask_floder, imglist,dataset_root_path)
dataset_train.prepare()

#print("dataset_train-->",dataset_train._image_ids)
dataset_val = DrugDataset()
dataset_val.load_shapes(7, img_floder, mask_floder, imglist,dataset_root_path)
dataset_val.prepare()

print("dataset_val-->",dataset_val._image_ids)



5 Training mode and initial weights

# Load and display random samples
#image_ids = np.random.choice(dataset_train.image_ids, 4)
#for image_id in image_ids:
#    image = dataset_train.load_image(image_id)
#    mask, class_ids = dataset_train.load_mask(image_id)
#    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)

# Create model in training mode
model = modellib.MaskRCNN(mode="training", config=config,
                          model_dir=MODEL_DIR)

# Which weights to start with?
init_with = "coco"  # imagenet, coco, or last

if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip layers that
    # are different due to the different number of classes
    # See README for instructions to download the COCO weights
    model.load_weights(COCO_MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
elif init_with == "last":
    # Load the last model you trained and continue training
    model.load_weights(model.find_last()[1], by_name=True)

# Train the head branches
# Passing layers="heads" freezes all layers except the head
# layers. You can also pass a regular expression to select
# which layers to train by name pattern.

6 Start training

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=20,
            layers='heads')
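If the head-only training looks good, a common follow-up (this is the pattern from the Matterport shapes example; I did not run it for this small demo) is to fine-tune all layers at a lower learning rate:

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=40,          # total epoch count, including the 20 head-only epochs above
            layers="all")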

[Screenshot of the training log]
7 Testing

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "imagess")
print(IMAGE_DIR)
class InferenceConfig(ShapesConfig):
    # Reuse the training configuration (including NUM_CLASSES = 1 + 3);
    # only the batch size changes: run inference on one image at a time.
    # Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()

# Create the model object in inference mode and load the weights we just trained
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

# find_last() locates the newest checkpoint under logs/ (same convention as the
# "last" branch in section 5; drop the [1] if your Mask R-CNN version returns a plain path)
model_path = model.find_last()[1]
model.load_weights(model_path, by_name=True)
 
# Only two new imports are needed for this step; everything else was imported above
import skimage.io
from datetime import datetime
# Class names for this model. A class's index in the list is its ID;
# index 0 is always the background.
class_names = ['BG', 'Blazer', 'Blouse', 'Coat']
# Load a random image from the images folder
file_names = next(os.walk(IMAGE_DIR))[2]
image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))
 
a = datetime.now()
# Run detection
results = model.detect([image], verbose=1)
b = datetime.now()
# Visualize results
print("detection time (s):", (b - a).seconds)
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'], 
                            class_names, r['scores'])

The test results are as follows:
[Screenshot of the detection results]
For this small demo I only simulated training: the dataset is tiny and the loss came out small. I ran into a few problems along the way; the fix for the main one is described here:
Mask_RCNN: ValueError: Dimension 1 in both shapes must be equal, but are 8 and 324 — qq_15969343, CSDN: https://blog.csdn.net/qq_15969343/article/details/80559154
With this post plus the references below, you should be able to complete your own training. Next I plan to collect more data, annotate it myself, build a much larger dataset, finish training on it, and finally use my own trained weights to test my own images.
References:
1. Step-by-step: training MASK_RCNN on your own dataset on Windows 10 (building the dataset with labelme) and the pitfalls to watch for — wangzhwsme, CSDN: https://blog.csdn.net/u012746060/article/details/82143285#commentBox
2. Mask R-CNN in practice: training on your own dataset — qq_30625217, CSDN: https://blog.csdn.net/qq_36810544/article/details/83582397#commentBox
Postscript
I used to have no idea what "training" even meant, but little by little I kept learning, and now I can train on my own dataset. I hope things work out well: that I can sort out my algorithm problems, write up my paper, reimplement some of the classic detection algorithms myself, and then settle into working on object detection in video. I started as a complete beginner, and the more I found I could actually do, the more confident I became. I have always felt I learn slowly, but I gradually work my way into a topic, find my feet, and eventually get these programs running on my own. Solving problems in Python is also improving my programming; the one downside is that I have read relatively few papers lately. I hope I can finish everything that needs doing soon. I go home in a week, and before then I would like to get my short paper done.
