1. Code Download and Environment Setup

First, follow the commands below to download the ST-GCN training code and set up the environment.

git clone https://github.com/yysijie/st-gcn.git
cd st-gcn
pip install -r requirements.txt
cd torchlight
python setup.py install
cd ..
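
If the torchlight setup succeeded, the package should now be importable. A quick sanity check (assuming the package name torchlight from the repo's setup.py):

python -c "import torchlight"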

2. Preparing the Action Data

Before training, you need to extract action data for the targets in your own dataset following the format of the kinetics-skeleton dataset. The figure below shows the composition of the kinetics-skeleton dataset. The kinetics_train and kinetics_val folders store the action information for each video clip, including the pose in every frame and the action label. The other two JSON files list all file names in the corresponding folder together with the mapping between action labels and indices. Related sample files are available for download.

Figure: composition of the kinetics-skeleton dataset

The method for extracting and generating a kinetics-skeleton style action dataset is described at the following link:
How to extract action data in kinetics-skeleton format
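
For reference, each per-video JSON in kinetics_train/kinetics_val roughly follows the structure below. This is a sketch based on the public kinetics-skeleton release; the contents may differ slightly in your own extraction. Shown as a Python dict for readability:

# Sketch of one per-video JSON file (kinetics-skeleton style).
video_info = {
    "data": [                       # one entry per frame
        {
            "frame_index": 1,
            "skeleton": [           # one entry per detected person
                {
                    "pose": [0.5, 0.3, ...],   # x/y coordinates, one pair per keypoint
                    "score": [0.8, ...],       # confidence score per keypoint
                },
            ],
        },
    ],
    "label": "walking",             # action class name
    "label_index": 1,               # index of the class
}
# The accompanying kinetics_train_label.json / kinetics_val_label.json map each
# file name to its label and label_index.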

3. Data Conversion

The ST-GCN training code ships with a conversion script, tools/kinetics_gendata.py. Use this script to convert the kinetics-skeleton dataset into the npy and pkl files used for training.

python tools/kinetics_gendata.py

Note that you have to adapt kinetics_gendata.py to your own dataset: num_person_in, num_person_out, and max_frame on lines 37-39, the number of keypoints on line 55, and the data-reading paths on lines 72~83. feeder_kinetics.py has to be modified accordingly.

	# kinetics_gendata.py, lines 37-39 and 55
	num_person_in=5,    # observe the first 5 persons
	num_person_out=2,   # then choose the 2 persons with the highest score
	max_frame=300):
	shape=(len(sample_name), 3, max_frame, 18, num_person_out))  # 18 = number of keypoints

	# feeder_kinetics.py: output data shape (N, C, T, V, M)
	self.N = len(self.sample_name)  # sample
	self.C = 3                      # channel
	self.T = 300                    # frame
	self.V = 18                     # joint
	self.M = self.num_person_out    # person

After these modifications, run the script; the generated npy and pkl files are shown in the figure.

Figure: generated npy and pkl files
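
A quick way to sanity-check the conversion is to load the generated files and print their shapes. A minimal sketch, assuming the stock output paths of kinetics_gendata.py (adjust to your own) and that the pkl stores a (sample_names, labels) tuple as in the stock script:

import pickle
import numpy as np

data = np.load('./data/Kinetics/kinetics-skeleton/val_data.npy')
with open('./data/Kinetics/kinetics-skeleton/val_label.pkl', 'rb') as f:
    sample_names, labels = pickle.load(f)

print(data.shape)                      # expected (N, C, T, V, M), e.g. (N, 3, 300, 18, 2)
print(len(sample_names), len(labels))  # both should equal N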

4. Adding a Layout

Add an elif branch to the get_edge function in net/utils/graph.py, where num_node is the number of keypoints, self_link is the list of self-loops, and neighbor_link is the list of connections between keypoints. The snippet below adds a 'my_pose' layout with 20 keypoints (the default pose layout has 18).

Note that if a default layout already matches your own pose definition, no change is needed; otherwise define a new one. If your pose does not use 18 keypoints, the subsequent conversion steps may also need matching changes to stay consistent.

elif layout == 'my_pose':
    self.num_node = 20
    self_link = [(i, i) for i in range(self.num_node)]
    neighbor_link = [(0, 1), (0, 3), (1, 2), (3, 4), (0, 5), (0, 11),
                     (5, 6), (6, 7), (11, 12), (12, 13),
                     (5, 8), (11, 14), (8, 9), (9, 10), (14, 15), (15, 16),
                     (17, 18), (8, 19), (14, 19), (17, 5), (17, 8), (17, 11),
                     (17, 14)]
    self.edge = self_link + neighbor_link
    self.center = 1
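
To check that the new layout builds a valid adjacency tensor, the graph can be instantiated directly (assuming the stock constructor signature Graph(layout=..., strategy=...) in net/utils/graph.py):

from net.utils.graph import Graph

graph = Graph(layout='my_pose', strategy='spatial')
print(graph.A.shape)  # expected (K, V, V), i.e. (3, 20, 20) with the spatial strategy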

5. Modifying the Training Parameters

  • Modify the relevant parameters in config/st_gcn/kinetics-skeleton/train.yaml (an illustrative excerpt follows this list).

  • Change data_path and label_path to the paths of the files generated earlier;

  • Change num_class to the number of action classes in your own dataset;

  • Change the layout parameter to the layout added earlier;

  • Set strategy to spatial;

  • Multi-GPU training raised errors for me, so I set device: [0];

  • window_size can be increased as appropriate;

  • Adjust batch_size, the learning rate, the number of epochs, and so on.
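
Put together, the relevant parts of train.yaml might look like the excerpt below. This is an illustrative sketch: the paths, class count, and schedule values are placeholders for your own dataset, and the remaining keys follow the stock structure of the repo's config.

# train.yaml (illustrative excerpt)
work_dir: ./work_dir/recognition/my_dataset

# feeder
feeder: feeder.feeder.Feeder
train_feeder_args:
  random_choose: True
  random_move: True
  window_size: 150
  data_path: ./data/my_dataset/train_data.npy
  label_path: ./data/my_dataset/train_label.pkl
test_feeder_args:
  data_path: ./data/my_dataset/val_data.npy
  label_path: ./data/my_dataset/val_label.pkl

# model
model: net.st_gcn.Model
model_args:
  in_channels: 3
  num_class: 3
  edge_importance_weighting: True
  graph_args:
    layout: 'my_pose'
    strategy: 'spatial'

# training
device: [0]
batch_size: 32
base_lr: 0.1
num_epoch: 300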

6. Start Training

Run the training command:

python main.py recognition -c config/st_gcn/kinetics-skeleton/train.yaml

The trained models are saved under work_dir; by default, a checkpoint is saved every 10 epochs.

7. Model Testing

The standalone script below loads a trained checkpoint and evaluates it on the validation data, printing per-class and overall accuracy. The Model and st_gcn classes mirror net/st_gcn.py (hence the commented-out import); make sure the graph construction matches the layout and strategy used for training.

import pickle

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

from net.utils.graph import Graph
from net.utils.tgcn import ConvTemporalGraphical
# from net.st_gcn import Model
class Model(nn.Module):
    r"""Spatial temporal graph convolutional networks.
    Args:
        in_channels (int): Number of channels in the input data
        num_class (int): Number of classes for the classification task
        graph_args (dict): The arguments for building the graph
        edge_importance_weighting (bool): If ``True``, adds a learnable
            importance weighting to the edges of the graph
        **kwargs (optional): Other parameters for graph convolution units
    Shape:
        - Input: :math:`(N, in_channels, T_{in}, V_{in}, M_{in})`
        - Output: :math:`(N, num_class)` where
            :math:`N` is a batch size,
            :math:`T_{in}` is a length of input sequence,
            :math:`V_{in}` is the number of graph nodes,
            :math:`M_{in}` is the number of instances in a frame.
    """
    def __init__(self, in_channels=3, num_class=3,
                 edge_importance_weighting=True, **kwargs):
        super().__init__()
        # load graph; the layout/strategy here must match the ones used for training
        self.graph = Graph()
        A = torch.tensor(self.graph.A, dtype=torch.float32, requires_grad=False)
        self.register_buffer('A', A)
        # build networks
        spatial_kernel_size = A.size(0)
        temporal_kernel_size = 9
        kernel_size = (temporal_kernel_size, spatial_kernel_size)
        self.data_bn = nn.BatchNorm1d(in_channels * A.size(1))
        kwargs0 = {k: v for k, v in kwargs.items() if k != 'dropout'}
        self.st_gcn_networks = nn.ModuleList((
            st_gcn(in_channels, 64, kernel_size, 1, residual=False, **kwargs0),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 128, kernel_size, 2, **kwargs),
            st_gcn(128, 128, kernel_size, 1, **kwargs),
            st_gcn(128, 128, kernel_size, 1, **kwargs),
            st_gcn(128, 256, kernel_size, 2, **kwargs),
            st_gcn(256, 256, kernel_size, 1, **kwargs),
            st_gcn(256, 256, kernel_size, 1, **kwargs),
        ))
        # initialize parameters for edge importance weighting
        if edge_importance_weighting:
            self.edge_importance = nn.ParameterList([
                nn.Parameter(torch.ones(self.A.size()))
                for i in self.st_gcn_networks
            ])
        else:
            self.edge_importance = [1] * len(self.st_gcn_networks)
        # fcn for prediction
        self.fcn = nn.Conv2d(256, num_class, kernel_size=1)
    def forward(self, x):
        # data normalization
        N, C, T, V, M = x.size()
        x = x.permute(0, 4, 3, 1, 2).contiguous()
        x = x.view(N * M, V * C, T)
        x = self.data_bn(x)
        x = x.view(N, M, V, C, T)
        x = x.permute(0, 1, 3, 4, 2).contiguous()
        x = x.view(N * M, C, T, V)
        # forward
        for gcn, importance in zip(self.st_gcn_networks, self.edge_importance):
            x, _ = gcn(x, self.A * importance)
        # global pooling
        x = F.avg_pool2d(x, x.size()[2:])
        x = x.view(N, M, -1, 1, 1).mean(dim=1)
        # prediction
        x = self.fcn(x)
        x = x.view(x.size(0), -1)
        return x
    def extract_feature(self, x):
        # data normalization
        N, C, T, V, M = x.size()
        x = x.permute(0, 4, 3, 1, 2).contiguous()
        x = x.view(N * M, V * C, T)
        x = self.data_bn(x)
        x = x.view(N, M, V, C, T)
        x = x.permute(0, 1, 3, 4, 2).contiguous()
        x = x.view(N * M, C, T, V)
        # forward
        for gcn, importance in zip(self.st_gcn_networks, self.edge_importance):
            x, _ = gcn(x, self.A * importance)
        _, c, t, v = x.size()
        feature = x.view(N, M, c, t, v).permute(0, 2, 3, 4, 1)
        # prediction
        x = self.fcn(x)
        output = x.view(N, M, -1, t, v).permute(0, 2, 3, 4, 1)
        return output, feature

class st_gcn(nn.Module):
    r"""Applies a spatial temporal graph convolution over an input graph sequence.
    Args:
        in_channels (int): Number of channels in the input sequence data
        out_channels (int): Number of channels produced by the convolution
        kernel_size (tuple): Size of the temporal convolving kernel and graph convolving kernel
        stride (int, optional): Stride of the temporal convolution. Default: 1
        dropout (int, optional): Dropout rate of the final output. Default: 0
        residual (bool, optional): If ``True``, applies a residual mechanism. Default: ``True``
    Shape:
        - Input[0]: Input graph sequence in :math:`(N, in_channels, T_{in}, V)` format
        - Input[1]: Input graph adjacency matrix in :math:`(K, V, V)` format
        - Output[0]: Output graph sequence in :math:`(N, out_channels, T_{out}, V)` format
        - Output[1]: Graph adjacency matrix for output data in :math:`(K, V, V)` format
        where
            :math:`N` is a batch size,
            :math:`K` is the spatial kernel size, as :math:`K == kernel_size[1]`,
            :math:`T_{in}/T_{out}` is a length of input/output sequence,
            :math:`V` is the number of graph nodes.
    """
    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size,
                 stride=1,
                 dropout=0,
                 residual=True):
        super().__init__()
        assert len(kernel_size) == 2
        assert kernel_size[0] % 2 == 1
        padding = ((kernel_size[0] - 1) // 2, 0)
        self.gcn = ConvTemporalGraphical(in_channels, out_channels,
                                         kernel_size[1])
        self.tcn = nn.Sequential(
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                out_channels,
                out_channels,
                (kernel_size[0], 1),
                (stride, 1),
                padding,
            ),
            nn.BatchNorm2d(out_channels),
            nn.Dropout(dropout, inplace=True),
        )
        if not residual:
            self.residual = lambda x: 0
        elif (in_channels == out_channels) and (stride == 1):
            self.residual = lambda x: x
        else:
            self.residual = nn.Sequential(
                nn.Conv2d(
                    in_channels,
                    out_channels,
                    kernel_size=1,
                    stride=(stride, 1)),
                nn.BatchNorm2d(out_channels),
            )
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x, A):
        res = self.residual(x)
        x, A = self.gcn(x, A)
        x = self.tcn(x) + res
        return self.relu(x), A

if __name__ == '__main__':
    # set params
    weights_path = './work_dir/recognition/pig18/epoch300_model.pt'
    label_list = ['standing', 'walking', 'laying']
    data_path = "./data/pig18/val_data.npy"
    label_path = "./data/pig18/val_label.pkl"

    model = Model().to('cuda:0')
    weights = torch.load(weights_path)
    model.load_state_dict(weights)
    model.eval()

    val_data = np.load(data_path)
    with open(label_path, 'rb') as f:
        label_data = pickle.load(f)  # (sample_names, labels)
    Num_true = [0] * (len(label_list) + 1)   # per-class correct counts; last slot is overall
    Num_total = [0] * (len(label_list) + 1)  # per-class sample counts; last slot is overall
    '''
    data1 = torch.tensor(val_data[0]).unsqueeze(0)
    data1 = data1.float().to("cuda:0")
    traced_model = torch.jit.trace(model, data1)  # trace the model
    '''
    for data in val_data:
        data = torch.tensor(data).unsqueeze(0)
        data = data.float().to("cuda:0")
        output = model(data).data.cpu().numpy()[0].tolist()
        pred_index = output.index(max(output))
        label_index = label_data[1][Num_total[-1]]  # Num_total[-1] counts samples processed so far
        print("Label/Pred:  {}/{}".format(label_list[label_index], label_list[pred_index]))

        for l_idx in range(len(label_list)):
            if label_index == l_idx:
                Num_total[l_idx] += 1
                if pred_index == label_index:
                    Num_true[l_idx] += 1
                    Num_true[-1] += 1
        Num_total[-1] += 1

    for idx in range(len(label_list)):
        print("Accuracy for {}-{}: {}, TP: {}, Total_num: {}".format(
            idx, label_list[idx], Num_true[idx] / Num_total[idx], Num_true[idx], Num_total[idx]))
    print("Total Accuracy: {}".format(Num_true[-1] / Num_total[-1]))
    # traced_model.save("stgcn_torchscript.pt")  # save the traced model