
Details

Author(s) / Contributors
Title
Unsupervised Video Summarization via Deep Reinforcement Learning With Shot-Level Semantics
Is part of
  • IEEE transactions on circuits and systems for video technology, 2023-01, Vol.33 (1), p.445-456
Place / Publisher
New York: IEEE
Year of publication
2023
Source
IEEE Xplore
Descriptions / Notes
  • Video summarization is a key technique for video retrieval, browsing, and management. It remains a challenging research task due to user subjectivity, excessive redundant information, and a lack of spatio-temporal dependency modeling. In this paper, we propose an unsupervised video summarization approach based on reinforcement learning with shot-level semantics. The method follows an encoder-decoder design: a convolutional neural network, trained on a novel field-size dataset, serves as the encoder and extracts a convolutional feature matrix from the video; a bidirectional LSTM then serves as the decoder and produces probability weights for selecting keyframes, preserving the spatio-temporal dependence of the summary. To reduce the influence of user subjectivity, we design a shot-level semantic reward function that yields more representative summaries. Shot-level semantics are rules followed during the video shooting process, and they do not change with the preferences of individual viewers. Finally, we evaluate the approach on four classical datasets (SumMe, TVSum, CoSum, and VTW). The results suggest that our algorithm outperforms existing methods and achieves satisfactory results.
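  • The pipeline the abstract describes (CNN features in, a bidirectional LSTM producing keyframe-selection probabilities, and a REINFORCE-style update driven by a reward) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions, network sizes, and the simple diversity reward used here are assumptions; the paper's actual shot-level semantic reward is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SummaryScorer(nn.Module):
    """Hypothetical decoder sketch: a BiLSTM maps per-frame CNN
    features to per-frame keyframe-selection probabilities."""

    def __init__(self, feat_dim=1024, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, feats):
        # feats: (batch, frames, feat_dim), e.g. CNN-encoder outputs
        h, _ = self.lstm(feats)                 # (batch, frames, 2*hidden)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # probabilities


def reinforce_step(model, feats, optimizer):
    """One REINFORCE-style update: sample a binary keep/drop action per
    frame, score the resulting summary with a reward, and reinforce the
    sampled actions' log-probabilities. The reward below is a
    placeholder diversity term (mean pairwise dissimilarity of kept
    frames), standing in for the paper's shot-level semantic reward."""
    probs = model(feats)
    dist = torch.distributions.Bernoulli(probs)
    actions = dist.sample()                     # 1 = keep frame
    picked = feats * actions.unsqueeze(-1)      # zero out dropped frames
    normed = F.normalize(picked, dim=-1)
    sim = normed @ normed.transpose(1, 2)       # cosine similarities
    reward = 1.0 - sim.mean()                   # higher = more diverse
    loss = -(dist.log_prob(actions).mean() * reward.detach())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.item()
```

  • In use, one would run the scorer over encoder features and repeat `reinforce_step` over many videos; at inference time, frames with the highest probabilities (grouped by shot) would form the summary.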
