Bottom-Up Foreground-Aware Feature Fusion for Practical Person Search
Part of
IEEE Transactions on Circuits and Systems for Video Technology, 2022-01, Vol.32 (1), p.262-274
Place / Publisher
New York: IEEE
Year of Publication
2022
Source
IEEE Electronic Library (IEL)
Descriptions/Notes
The key to efficient person search is jointly localizing pedestrians and learning discriminative representations for person re-identification (re-ID). Some recently developed models are built with separate detection and re-ID branches on top of a shared region feature extraction network. Two factors are detrimental to re-ID feature learning: the background information redundancy that results from the large receptive fields of neurons, and the missing body parts and background clutter caused by inaccurate localization. In this work, a bottom-up fusion (BUF) subnet is proposed to fuse the bounding box features pooled from multiple network stages. With only a few additional parameters, BUF leverages multi-level features with various receptive field sizes to mitigate the background-bias problem. To further suppress non-pedestrian regions, a newly introduced segmentation head generates a foreground probability map that guides the network to focus on foreground regions. The resulting foreground attention module (FAM) enhances the foreground features. Moreover, for robust feature learning in practical person search, we propose to adaptively smooth the labels of the pedestrian boxes according to detection quality. Extensive experiments on PRW and CUHK-SYSU validate the effectiveness of the proposed components. Our Bottom-Up Foreground-Aware Feature Fusion (BUFF) network with adaptive label smoothing (ALS) achieves considerable gains over the state of the art on PRW and competitive performance on CUHK-SYSU.
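The two ideas most amenable to a compact illustration are the foreground attention (weighting box features by a predicted foreground probability map) and the quality-adaptive label smoothing. The following is a minimal NumPy sketch of those two ideas only; the function names, shapes, and the linear quality-to-smoothing mapping are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def foreground_attention(feat, seg_logits):
    """Sketch of the FAM idea: scale each spatial location of a pooled
    box feature map (C, H, W) by its predicted foreground probability
    from a segmentation head's logits (H, W)."""
    prob = 1.0 / (1.0 + np.exp(-seg_logits))  # sigmoid -> foreground map
    return feat * prob  # broadcasts (H, W) over the channel axis

def adaptive_label_smoothing(num_ids, target_id, det_quality):
    """Sketch of the ALS idea: soften a one-hot identity label in
    proportion to detection quality (e.g. box IoU in [0, 1]), so
    poorly localized boxes receive softer supervision.
    The linear mapping eps = 1 - quality is an assumed placeholder."""
    eps = 1.0 - det_quality           # more smoothing for worse boxes
    label = np.full(num_ids, eps / num_ids)
    label[target_id] += 1.0 - eps     # remaining mass on the true identity
    return label                      # sums to 1
```

A well-localized box (`det_quality` near 1) keeps a nearly one-hot label, while a loose box spreads probability mass over all identities, which is the stated intent of smoothing "with consideration of the detection quality".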