
Overview

Paper: Progressive sample mining and representation learning for one-shot person re-identification
Abbreviation: PSMA
Venue / year: Pattern Recognition, 2021
Baseline: EUG — Y. Wu, Y. Lin, X. Dong, Y. Yan, W. Ouyang, and Y. Yang, "Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning," CVPR 2018, pp. 5177–5186
Backbone: ResNet-50
Datasets: Market1501, DukeMTMC-reID

Paper online: https://www.sciencedirect.com/science/article/pii/S0031320320304179?via%3Dihub
Source code: https://github.com/detectiveli/PSMA

Work Summary

We propose to iteratively guess pseudo labels for the unlabelled image samples, which are later used to update the re-identification model together with the labelled samples. A new sampling mechanism is designed to promote unlabelled samples to pseudo-labelled samples based on the distance matrix, and to form a training triplet batch including both labelled samples and pseudo-labelled samples. We also design an HSoften-Triplet-Loss to soften the negative impact of incorrect pseudo labels, considering the unreliable nature of pseudo-labelled samples. Finally, we deploy an adversarial learning method to expand the image samples to different camera views.

Results Summary

Our framework achieves a new state-of-the-art one-shot Re-ID performance on the Market-1501 (mAP 42.7%) and DukeMTMC-reID (mAP 40.3%) datasets.

Method Details

Method Highlights

  1. Proposes a new sampling mechanism that forms training triplet batches.
  2. Designs an HSoften-Triplet-Loss that softens the impact of incorrect pseudo labels.
  3. Applies adversarial learning to expand image samples to different camera views.

Framework


Fig. 2. Overview of our method. Our training process takes several iterations. Each iteration has two main steps: 1) Add pseudo labelled images for each labelled image. 2) Train the model with both CE loss and HSoft-triplet loss. After each iteration, the model should be more discriminative for feature representation and more reliable to generate the next similarity matrix. This is demonstrated by the fact that image features of the same person are clustered in a more compact manner, and features of different persons move apart. The new similarity matrix is used to sample more pseudo labelled images for the next iteration of training. Best viewed in color.

Algorithm Description


Implementation Details

  • Sampling mechanism. For each class center, first select a fixed number of nearest samples; then decide whether each selected sample is a positive or a negative depending on whether it belongs to the same class as that center. This paves the way for the triplet loss.
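The selection step can be sketched as follows; `mine_pseudo_labels`, the array shapes, and `k` are illustrative names for this note, not the paper's actual code:

```python
import numpy as np

def mine_pseudo_labels(centers, unlabelled_feats, k):
    """For each class center, take the k nearest unlabelled samples
    (by feature distance) as pseudo-labelled candidates for that class.
    Illustrative sketch of the paper's sampling mechanism."""
    # distance matrix: (num_classes, num_unlabelled)
    d = np.linalg.norm(centers[:, None, :] - unlabelled_feats[None, :, :], axis=2)
    # indices of the k nearest unlabelled samples per class
    return {c: np.argsort(d[c])[:k].tolist() for c in range(len(centers))}
```

As I understand it, a sample mined near a center then enters the triplet batch as a positive for that class and as a negative for the other classes.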

  • Adversarial learning. The original one-shot samples are augmented according to the number of cameras, using CycleGAN, so the number of samples with real labels becomes the original count multiplied by the number of cameras.
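The expansion can be sketched like this; `expand_by_camera` and the stub generators are hypothetical stand-ins for the trained CycleGAN models, shown only to illustrate the camera-fold growth of the labelled set:

```python
def expand_by_camera(labelled, generators):
    """Expand every labelled one-shot sample into one copy per camera
    style. Each generators[cam] stands in for a trained CycleGAN
    generator (a stub here); with C cameras, the labelled set grows to
    C times its original size while keeping the real identity labels."""
    augmented = []
    for img, pid in labelled:
        for cam in sorted(generators):
            augmented.append((generators[cam](img), pid))
    return augmented
```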

  • After data augmentation, the sampling mechanism is applied again; note that the class center now becomes the mean of all labelled samples.
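With the augmented copies included, the updated class centers are just per-identity feature means; a minimal sketch (function and variable names are mine, feature extraction is not shown):

```python
import numpy as np

def class_centers(feats, labels):
    """Class center = mean feature over all labelled samples of an
    identity, i.e. the real one-shot image plus its camera-style
    CycleGAN copies."""
    feats = np.asarray(feats, dtype=float)
    labels = np.asarray(labels)
    return {int(pid): feats[labels == pid].mean(axis=0) for pid in np.unique(labels)}
```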

  • New batch formation rule. Samples are drawn randomly from both the labelled and the pseudo-labelled samples. (I did not fully understand this point.)

  • The loss function consists of two parts: a softmax loss and the proposed HSoften-Triplet-Loss.


  • HSoften-Triplet-Loss builds on MSML (margin sample mining loss) by replacing the hardest positive pair with a soften positive pair; concretely, the original sample in the pair is replaced by the center of that class's sample points.
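The idea can be sketched in a few lines of numpy; this is my simplified reading (a per-sample hinge over the batch), not the paper's exact batch construction or formula, and `margin` is a hypothetical hyper-parameter:

```python
import numpy as np

def hsoften_triplet_loss(feats, labels, margin=0.3):
    """HSoften-style triplet loss sketch: the hardest-positive distance
    of MSML is replaced by the distance between each sample and its
    class center (the 'soften' positive pair); the negative side still
    uses the hardest (closest) negative."""
    feats = np.asarray(feats, dtype=float)
    labels = np.asarray(labels)
    loss = 0.0
    for i in range(len(feats)):
        same = labels == labels[i]
        center = feats[same].mean(axis=0)           # soften positive: class center
        d_pos = np.linalg.norm(feats[i] - center)
        d_neg = np.linalg.norm(feats[~same] - feats[i], axis=1).min()  # hardest negative
        loss += max(0.0, d_pos - d_neg + margin)    # hinge with margin
    return loss / len(feats)
```

Using the class center instead of the single hardest positive means one mislabelled pseudo sample shifts the positive target only slightly, which is the "softening" the paper is after.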


Experimental Results

This is currently the best-performing one-shot re-ID method.

Overall Assessment

  • The final results are indeed strong, but I think part of the contribution comes from augmenting the one-shot data: if the number of cameras is 5, it is effectively five-shot.
  • The sampling mechanism is well designed. Compared with EUG, which ranks all the pseudo-labelled data together, this per-class grouped ranking is creative, and it also lays a good foundation for using the triplet loss. As for losses, the triplet loss is inherently stronger than the cross-entropy loss.
  • I did not quite get the point of the new batch design.

Citation

@article{DBLP:journals/pr/LiXSLZ21,
  author  = {Hui Li and Jimin Xiao and Mingjie Sun and Eng Gee Lim and Yao Zhao},
  title   = {Progressive sample mining and representation learning for one-shot person re-identification},
  journal = {Pattern Recognit.},
  volume  = {110},
  pages   = {107614},
  year    = {2021}
}

