Multimedia systems, 2022-02, Vol.28 (1), p.195-207
2022

Details

Author(s) / Contributors
Title
Attention based video captioning framework for Hindi
Is part of
  • Multimedia systems, 2022-02, Vol.28 (1), p.195-207
Place / Publisher
Berlin/Heidelberg: Springer Berlin Heidelberg
Year of publication
2022
Source
SpringerNature Journals
Descriptions/Notes
  • Active research is under way to bridge the gap between computer vision and natural language. In this paper, we address the problem of Hindi video captioning. In a linguistically diverse country like India, it is important to provide a means of understanding visual entities in native languages. In this work, we employ a hybrid attention mechanism that extends the soft temporal attention mechanism with semantic attention, so that the system can decide when to focus on the visual context vector and when on the semantic input. The visual context vector of the input video is extracted using a 3D convolutional neural network (3D CNN), and a Long Short-Term Memory (LSTM) recurrent network with an attention module decodes the encoded context vector. We experimented on a dataset built in-house for Hindi video captioning by translating the MSR-VTT dataset followed by post-editing. Our system achieves a CIDEr score of 0.369 and a METEOR score of 0.393 and outperforms other baseline models, including an RMN (Reasoning Module Networks)-based model.
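  • Illustrative note: the hybrid attention idea described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how attention scores or the visual/semantic gate are computed, so the dot-product scoring, the scalar sigmoid gate, and all function names (`temporal_attention`, `hybrid_context`) are hypothetical choices made only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def temporal_attention(frame_features, decoder_state):
    """Soft temporal attention: a weighted average of per-frame features.

    Assumption: scores are plain dot products with the decoder state;
    the paper may use a learned scoring function instead.
    """
    scores = [dot(f, decoder_state) for f in frame_features]
    weights = softmax(scores)
    dim = len(frame_features[0])
    context = [
        sum(w * f[i] for w, f in zip(weights, frame_features))
        for i in range(dim)
    ]
    return context, weights


def hybrid_context(visual_context, semantic_vector, gate_logit):
    """Blend visual and semantic inputs with a scalar sigmoid gate.

    Assumption: a single scalar gate; in a trained model the logit
    would be predicted from the decoder state at every step.
    """
    beta = 1.0 / (1.0 + math.exp(-gate_logit))
    return [beta * v + (1.0 - beta) * s
            for v, s in zip(visual_context, semantic_vector)]


# Toy usage: three 2-dimensional frame features and one decoder state.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [0.5, 0.5]
context, weights = temporal_attention(frames, state)
mixed = hybrid_context(context, [0.0, 0.0], gate_logit=0.0)
```

    A gate value near 1 leans on the visual context vector and a value near 0 leans on the semantic input, which is one simple way to express "deciding when to focus" on each source.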
Language
English
Identifiers
ISSN: 0942-4962
eISSN: 1432-1882
DOI: 10.1007/s00530-021-00816-3
Title ID: cdi_proquest_journals_2623605301
