Details

Author(s) / Contributors
Title
Modality-Specific Cross-Modal Similarity Measurement With Recurrent Attention Network
Is part of
  • IEEE transactions on image processing, 2018-11, Vol.27 (11), p.5585-5599
Place / Publisher
United States: IEEE
Year of publication
2018
Source
IEEE/IET Electronic Library
Descriptions/Notes
  • Nowadays, cross-modal retrieval plays an important role in flexibly finding useful information across different modalities of data. Effectively measuring the similarity between different modalities of data is the key to cross-modal retrieval. Different modalities, such as image and text, have an imbalanced and complementary relationship, and they contain unequal amounts of information when describing the same semantics. For example, images often contain details that cannot be conveyed by textual descriptions, and vice versa. Existing works based on deep neural networks mostly construct one common space for different modalities in order to find the latent alignments between them, which loses their exclusive modality-specific characteristics. Therefore, we propose a modality-specific cross-modal similarity measurement approach that constructs an independent semantic space for each modality and adopts an end-to-end framework to directly generate the modality-specific cross-modal similarity without an explicit common representation. For each semantic space, the modality-specific characteristics within one modality are fully exploited by a recurrent attention network, while the data of the other modality are projected into this space with an attention-based joint embedding, which uses the learned attention weights to guide fine-grained cross-modal correlation learning and captures the imbalanced and complementary relationship between the modalities. Finally, the complementarity between the semantic spaces of the different modalities is exploited by adaptive fusion of the modality-specific cross-modal similarities to perform cross-modal retrieval. Experiments on the widely used Wikipedia, Pascal Sentence, and MS-COCO data sets, as well as our constructed large-scale XMediaNet data set, verify the effectiveness of the proposed approach, which outperforms nine state-of-the-art methods. A minimal code sketch of the high-level idea follows below.
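
  • The sketch below is a hypothetical PyTorch illustration of the abstract's high-level idea, not the authors' implementation: all module names, dimensions, and the simple linear attention module are assumptions standing in for the paper's recurrent attention network and attention-based joint embedding. It shows one similarity computed inside each modality's own semantic space, followed by adaptive fusion with learned weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalitySpecificSimilarity(nn.Module):
    """Cross-modal similarity measured inside one modality's semantic space.

    The 'native' modality is encoded with a simple attention module (a
    stand-in for the paper's recurrent attention network); the other
    modality is projected into the same space by a joint embedding."""
    def __init__(self, native_dim, other_dim, space_dim):
        super().__init__()
        self.native_proj = nn.Linear(native_dim, space_dim)
        self.attn = nn.Linear(space_dim, 1)      # one score per region/word
        self.other_proj = nn.Linear(other_dim, space_dim)

    def forward(self, native_parts, other_feat):
        # native_parts: (batch, n_parts, native_dim), e.g. image regions
        # other_feat:   (batch, other_dim), e.g. a pooled text feature
        h = torch.tanh(self.native_proj(native_parts))   # (B, N, D)
        w = F.softmax(self.attn(h), dim=1)               # attention weights
        native_emb = (w * h).sum(dim=1)                  # attended embedding
        other_emb = self.other_proj(other_feat)          # joint embedding
        return F.cosine_similarity(native_emb, other_emb, dim=-1)

class AdaptiveFusion(nn.Module):
    """Fuses the two modality-specific similarities with learned weights."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))
    def forward(self, sim_in_image_space, sim_in_text_space):
        a = F.softmax(self.logits, dim=0)
        return a[0] * sim_in_image_space + a[1] * sim_in_text_space

# Toy usage with random features; all dimensions are assumptions.
img_space = ModalitySpecificSimilarity(native_dim=2048, other_dim=300, space_dim=512)
txt_space = ModalitySpecificSimilarity(native_dim=300, other_dim=2048, space_dim=512)
fuse = AdaptiveFusion()
imgs = torch.randn(4, 36, 2048)              # 36 region features per image
words = torch.randn(4, 10, 300)              # 10 word features per sentence
s_img = img_space(imgs, words.mean(dim=1))   # similarity in the image space
s_txt = txt_space(words, imgs.mean(dim=1))   # similarity in the text space
score = fuse(s_img, s_txt)                   # final similarity, shape (4,)
```

  • Note the design point the abstract emphasizes: no single common representation is ever built; each similarity is produced end-to-end within one modality's space, and only the scalar similarities are fused.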
