Details

Author(s) / Contributors
Title
Unified Audio-visual Saliency Model for Omnidirectional Videos with Spatial Audio
Is part of
  • IEEE Transactions on Multimedia, 2024-01, Vol. 26, pp. 1-13
Place / Publisher
IEEE
Year of publication
2024
Source
IEEE/IET Electronic Library (IEL)
Descriptions / Notes
  • Spatial audio is a crucial component of omnidirectional videos (ODVs): it provides an immersive experience by enabling viewers to perceive sound sources in all directions. However, most visual attention modeling work for ODVs focuses only on visual cues, and the audio modality is rarely considered. Additionally, existing audio-visual saliency models for ODVs lack awareness of spatial audio location (i.e., they are sound source location-agnostic) and cannot discriminate between audio content attributes (i.e., they are audio content attribute-agnostic). To this end, we propose a novel audio-visual perception saliency (AVPS) model that is aware of spatial audio location and adapts to audio content attributes, to efficiently address fixation prediction in ODVs. Specifically, we first use an improved group equivariant convolutional neural network (G-CNN) with eidetic 3D LSTM (E3D-LSTM) to extract spatial-temporal visual features. We then perceive sound source locations by computing the audio energy map (AEM) of the audio information in ODVs. Subsequently, we introduce SoundNet to extract audio features with multiple attributes. Finally, we develop an audio-visual feature fusion module that adaptively integrates spatial-temporal visual features and spatial auditory information to generate the final audio-visual saliency map. Extensive experiments in three audio modalities validate the effectiveness of the proposed model, which also outperforms 10 other state-of-the-art saliency models.
Language
English
Identifiers
ISSN: 1520-9210
eISSN: 1941-0077
DOI: 10.1109/TMM.2023.3271022
Title ID: cdi_ieee_primary_10109890
