Neural processing letters, 2018-10, Vol.48 (2), p.709-732
2018

Details

Author(s) / Contributors
Title
A Temporal Dependency Based Multi-modal Active Learning Approach for Audiovisual Event Detection
Is part of
  • Neural processing letters, 2018-10, Vol.48 (2), p.709-732
Place / Publisher
New York: Springer US
Year of publication
2018
Source
Alma/SFX Local Collection
Descriptions/Notes
  • In this work, two novel active learning approaches for the annotation and detection of audiovisual events are proposed. The assumption behind the proposed approaches is that events are likely to deviate substantially from the distribution of normal observations and should therefore lie in regions of low density. Thus, it is expected that an event detection model can be trained more efficiently by focusing on samples that appear to be inconsistent with the majority of the dataset. The first approach is a uni-modal method that uses rank aggregation to select informative samples, which have previously been ranked using different unsupervised outlier detection techniques in combination with an uncertainty sampling technique. The information used for the sample selection stems from a single modality (e.g. the video channel). Since most active learning approaches focus on one target channel to select informative samples, and thus do not take advantage of potentially useful and complementary information among correlated modalities, we propose an extension of the uni-modal approach to multiple modalities. From a target pool of instances belonging to a specific modality, the uni-modal approach is used to select and manually label a set of informative instances. Additionally, a second set of automatically labelled instances of the target pool is generated, based on a transfer of information from an auxiliary modality that is temporally dependent on the target one. Both sets of labelled instances (automatically and manually labelled) are used for the semi-supervised training of a classification model to be used in the next active learning iteration. Both methods have been assessed on a set of participants selected from the UUlmMAC dataset and have proven effective in substantially reducing the cost of manual annotation required for training a facial event detection model. The assessment is based on two different methods: Support Vector Data Description and expected similarity estimation. Furthermore, given an appropriate sampling approach, the multi-modal approach outperforms its uni-modal counterpart in most cases.
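
The uni-modal selection step described in the abstract (several rankings of the same pool, merged by rank aggregation) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the aggregation rule, so a simple Borda count is used here as a stand-in, and the function names (`to_borda_points`, `aggregate`, `select_queries`) and the example scores are hypothetical.

```python
from typing import List, Sequence


def to_borda_points(scores: Sequence[float]) -> List[int]:
    """Convert raw scores to Borda points: lowest score gets 0, highest gets n-1.

    Ties are broken by sample index, because Python's sort is stable.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    points = [0] * len(scores)
    for rank, idx in enumerate(order):
        points[idx] = rank
    return points


def aggregate(score_lists: Sequence[Sequence[float]]) -> List[int]:
    """Sum the Borda points each sample receives across all rankers."""
    per_ranker = [to_borda_points(scores) for scores in score_lists]
    return [sum(column) for column in zip(*per_ranker)]


def select_queries(score_lists: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k samples with the highest aggregated rank."""
    totals = aggregate(score_lists)
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    return order[:k]


# Hypothetical scores for a pool of four samples (higher = more informative).
# In practice these would come from outlier detectors and a classifier.
outlier_a = [0.1, 0.9, 0.3, 0.7]     # assumed output of one outlier detector
outlier_b = [0.2, 0.8, 0.1, 0.9]     # assumed output of a second detector
uncertainty = [0.05, 0.6, 0.5, 0.4]  # assumed classifier uncertainty

print(select_queries([outlier_a, outlier_b, uncertainty], k=2))  # prints [1, 3]
```

The selected indices would then be passed to a human annotator; any other rank aggregation rule could be substituted in `aggregate` without changing the rest of the sketch.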
Language
English
Identifiers
ISSN: 1370-4621
eISSN: 1573-773X
DOI: 10.1007/s11063-017-9719-y
Title ID: cdi_proquest_journals_2918341166
