Details

Author(s) / Contributors
Title
Deep Learning-Based End-to-End Speaker Identification Using Time–Frequency Representation of Speech Signal
Is part of
  • Circuits, systems, and signal processing, 2024-03, Vol.43 (3), p.1839-1861
Place / Publisher
New York: Springer US
Year of publication
2024
Source
SpringerLink
Descriptions/Notes
  • Speech-based speaker identification is one alternative to conventional contact-based biometric identification systems. Recent works demonstrate growing interest among researchers in this field and highlight the practical usability of speech for speaker identification across various applications. In this work, we aim to address the limitations of existing state-of-the-art approaches and highlight the usability of convolutional neural networks for speaker identification systems. The present work examines the use of the spectrogram as an input to these spatial networks and its robustness in the presence of noise. To speed up training (computation) and reduce the memory requirement (storage), the SpectroNet model for speech-based speaker identification is introduced. The proposed system was evaluated on VoxCeleb1 and Part 1 of the RSR 2015 databases. Experimental results show a relative improvement of ~16% (accuracy: 96.21%) with the spectrogram and ~10% (accuracy: 98.92%) with the log Mel spectrogram in identifying the speaker compared to existing models. With the cochleagram, the identification accuracy is 99.26%. Analysis of the results shows the applicability of the proposed approach in situations where (i) minimal speech data are available for speaker identification and (ii) the speech data are noisy.
Language
English
Identifiers
ISSN: 0278-081X
eISSN: 1531-5878
DOI: 10.1007/s00034-023-02542-9
Title ID: cdi_proquest_journals_2938247357