Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Transformer-Based Object Detection with Deep Feature Fusion Using Carafe Operator in Remote Sensing Image
Ist Teil von
EAI endorsed transactions on energy web, 2023-08, Vol.10
Ort / Verlag
European Alliance for Innovation (EAI)
Erscheinungsjahr
2023
Quelle
EZB Electronic Journals Library
Beschreibungen/Notizen
Recently, broad applications can be found in optical remote sensing images (ORSI), such as in urban planning, military mapping, field survey, and so on. Target detection is one of its important applications. In the past few years, with the wings of deep learning, the target detection algorithm based on CNN has harvested a breakthrough. However, due to the different directions and target sizes in ORSI, it will lead to poor performance if the target detection algorithm for ordinary optical images is directly applied. Therefore, how to improve the performance of the object detection model on ORSI is thorny. Aiming at solving the above problems, premised on the one-stage target detection model-RetinaNet, this paper proposes a new network structure with more efficiency and accuracy, that is, a Transformer-Based Network with Deep Feature Fusion Using Carafe Operator (TRCNet). Firstly, a PVT2 structure based on the transformer is adopted in the backbone and we apply a multi-head attention mechanism to obtain global information in optical images with complex backgrounds. Meanwhile, the depth is increased to better extract features. Secondly, we introduce the carafe operator into the FPN structure of the neck to integrate the high-level semantics with the low-level ones more efficiently to further improve its target detection performance. Experiments on our well-known public NWPU-VHR-10 and RSOD show that mAP increases by 8.4% and 1.7% respectively. Comparison with other advanced networks also witnesses that our proposed network is effective and advanced.