
Deep-Person: Learning Discriminative Deep Features for Person Re-Identification

Xiang Bai Mingkun Yang Tengteng Huang Zhiyong Dou Rui Yu Yongchao Xu
arXiv [code] [pdf]

Abstract

Recently, many person re-identification (ReID) methods have relied on part-based feature representations to learn discriminative pedestrian descriptors. However, the spatial context between these parts is ignored, since features are extracted independently on each separate part. In this paper, we propose to apply Long Short-Term Memory (LSTM) in an end-to-end way to model a pedestrian as a sequence of body parts from head to foot. Integrating this contextual information strengthens the discriminative ability of the local representation. We also leverage the complementary information between local and global features. Furthermore, we integrate the identification task and the ranking task in one network, where a discriminative embedding and a similarity measurement are learned concurrently. This results in a novel three-branch framework named Deep-Person, which learns highly discriminative features for person ReID. Experimental results demonstrate that Deep-Person outperforms the state-of-the-art methods by a large margin on three challenging datasets: Market-1501, CUHK03, and DukeMTMC-reID. Specifically, combined with a re-ranking approach, we achieve 90.84% mAP on Market-1501 under the single-query setting.

Method

Illustration of the Deep-Person architecture. Given a triplet of images T, each image is fed into a part-based identification branch B_I^p and a global-based identification branch B_I^g. Meanwhile, a distance ranking branch B_R^t is applied to T using the triplet loss function. Note that each image in T is fed into the same backbone network; the backbone is duplicated in the figure for visualization purposes only.
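The three branches are trained jointly: the two identification branches use softmax cross-entropy on person IDs, while the ranking branch applies a margin-based triplet loss on the embedding. A minimal NumPy sketch of this joint objective (the feature dimension, margin value, and equal loss weighting here are illustrative assumptions, not values from the paper):

```python
import numpy as np

def id_loss(logits, label):
    """Softmax cross-entropy identification loss for one sample."""
    z = logits - logits.max()                  # subtract max for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Margin-based ranking loss on Euclidean distances."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

rng = np.random.default_rng(0)
num_ids, feat_dim = 751, 256                   # e.g. Market-1501 has 751 training IDs
fa, fp, fn = (rng.standard_normal(feat_dim) for _ in range(3))

# Hypothetical classifier logits from the part-based and global branches
logits_p = rng.standard_normal(num_ids)
logits_g = rng.standard_normal(num_ids)
label = 42

# Joint objective: two identification losses plus one ranking loss
total = id_loss(logits_p, label) + id_loss(logits_g, label) \
        + triplet_loss(fa, fp, fn)
print(round(float(total), 4))
```

Learning an embedding (via the triplet term) and a classifier (via the identification terms) at the same time is what lets the network optimize both a similarity measurement and a discriminative representation concurrently.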
Each vector in the feature sequence describes the region of the corresponding receptive field in the raw image.
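The head-to-foot modeling can be sketched as follows: the backbone feature map is collapsed along the width into a vertical sequence of part vectors, which an LSTM then processes so that each part feature absorbs context from neighboring body parts. Below is a minimal single-direction NumPy LSTM over such a sequence (the 2048x8x4 map size, 256 hidden units, and random weights are illustrative assumptions; the paper's actual configuration may differ):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(seq, W, U, b):
    """Run a single-layer LSTM over a sequence of feature vectors.
    seq: (T, D); W: (4H, D) input weights; U: (4H, H) recurrent weights; b: (4H,)."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    outs = []
    for x in seq:
        z = W @ x + U @ h + b
        i, f, g, o = np.split(z, 4)            # input, forget, cell, output gates
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
        outs.append(h)
    return np.stack(outs)                      # (T, H) context-aware part features

rng = np.random.default_rng(0)
fmap = rng.standard_normal((2048, 8, 4))       # hypothetical backbone output (C, H, W)
parts = fmap.mean(axis=2).T                    # (8, 2048): head-to-foot part sequence

D, Hdim = 2048, 256
W = rng.standard_normal((4 * Hdim, D)) * 0.01
U = rng.standard_normal((4 * Hdim, Hdim)) * 0.01
b = np.zeros(4 * Hdim)

part_feats = lstm_forward(parts, W, U, b)
print(part_feats.shape)                        # (8, 256)
```

Because each hidden state depends on all previous part vectors, the resulting part features carry the spatial context that independent per-part extractors discard.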

Results

We evaluate our proposed method, Deep-Person, on three widely used large-scale datasets: Market-1501, CUHK03, and DukeMTMC-reID.

Comparison with state-of-the-art results on Market-1501. The 1st/2nd best result is highlighted in red/blue.

 

Quantitative comparison with state-of-the-art methods on the DukeMTMC-reID dataset.

 

Comparison with state-of-the-art part-based models on the Market-1501 dataset.

 

Visualization of feature maps extracted from three variants of Deep-Person. From left to right in (a-d): raw image, the feature map using the B_I^g branch alone, B_I^g + B_I^p without LSTM, and B_I^g + B_I^p with LSTM, respectively.

 

BibTeX

@article{bai2017deep,
  title={Deep-Person: Learning Discriminative Deep Features for Person Re-Identification},
  author={Bai, Xiang and Yang, Mingkun and Huang, Tengteng and Dou, Zhiyong and Yu, Rui and Xu, Yongchao},
  journal={arXiv preprint arXiv:1711.10658},
  year={2017}
}
