Details

Author(s) / Contributors
Title
Is Cross-Modal Information Retrieval Possible Without Training?
Is part of
  • Advances in Information Retrieval, pp. 377-385
Place / Publisher
Cham: Springer Nature Switzerland
Source
Alma/SFX Local Collection
Descriptions/Notes
  • Encoded representations from a pretrained deep learning model (e.g., BERT text embeddings, penultimate CNN layer activations of an image) convey a rich set of features beneficial for information retrieval. Embeddings for a particular modality of data occupy a high-dimensional space of their own, but that space can be semantically aligned to another by a simple mapping without training a deep neural net. In this paper, we take a simple mapping, computed by least squares and by singular value decomposition (SVD) as a solution to the Procrustes problem, to serve as a means of cross-modal information retrieval. That is, given information in one modality such as text, the mapping helps us locate a semantically equivalent data item in another modality such as image. Using off-the-shelf pretrained deep learning models, we have experimented with the aforementioned simple cross-modal mappings on text-to-image and image-to-text retrieval tasks. Despite their simplicity, our mappings perform reasonably well, reaching a highest recall@10 of 77%, which is comparable to approaches that require costly neural net training and fine-tuning. We have improved the simple mappings by contrastive learning on the pretrained models. Contrastive learning can be thought of as properly biasing the pretrained encoders to enhance the cross-modal mapping quality. We have further improved the performance by using a multilayer perceptron with gating (gMLP), a simple neural architecture.
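
The mapping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`procrustes_map`, `least_squares_map`, `recall_at_k`) are hypothetical, the embeddings are synthetic stand-ins for pretrained text and image features, both modalities are assumed to share the same embedding dimension, and row i of one matrix is assumed to be the true match of row i of the other.

```python
import numpy as np


def procrustes_map(X, Y):
    """Orthogonal W minimizing ||X W - Y||_F (orthogonal Procrustes).

    Assumes X and Y are (n, d) matrices of paired embeddings with the
    same dimension d. The solution is W = U V^T, where U S V^T is the
    SVD of X^T Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def least_squares_map(X, Y):
    """Unconstrained linear W minimizing ||X W - Y||_F."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def recall_at_k(queries, gallery, k):
    """Fraction of queries whose true match (same row index) is among
    the k gallery items with the highest cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T
    top_k = np.argsort(-sims, axis=1)[:, :k]
    hits = (top_k == np.arange(len(q))[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    # Synthetic stand-ins (assumption): "image" embeddings are a rotated,
    # slightly noisy copy of the "text" embeddings.
    rng = np.random.default_rng(0)
    text = rng.normal(size=(500, 32))
    rotation, _ = np.linalg.qr(rng.normal(size=(32, 32)))
    image = text @ rotation + 0.1 * rng.normal(size=(500, 32))

    # Fit on the first 400 pairs, evaluate retrieval on the held-out 100.
    W = procrustes_map(text[:400], image[:400])
    print("text-to-image recall@10:",
          recall_at_k(text[400:] @ W, image[400:], k=10))
```

The orthogonal variant preserves distances and angles within the source space, while the least-squares variant is free to rescale and shear; which one suits a given pair of encoders is an empirical question that this sketch does not settle.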
Language
English
Identifiers
ISBN: 303128237X, 9783031282379
ISSN: 0302-9743
eISSN: 1611-3349
DOI: 10.1007/978-3-031-28238-6_27
Title ID: cdi_springer_books_10_1007_978_3_031_28238_6_27