
Details

Author(s) / Contributors
Title
Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech
Is part of
  • Sensors (Basel, Switzerland), 2021-02, Vol.21 (4), p.1205
Place / Publisher
Switzerland: MDPI
Year of publication
2021
Source
Elektronische Zeitschriftenbibliothek - Freely accessible e-journals
Descriptions/Notes
  • This study proposes using object detection techniques to recognize sequences of articulatory features (AFs) in speech utterances by treating the AFs of phonemes as multi-label objects in a speech spectrogram. The proposed system, called AFD-Obj, recognizes sequences of multi-label AFs in a speech signal and localizes them. AFD-Obj consists of two main stages. First, we formulate AF detection as an object detection problem and prepare the data to meet the requirements of object detectors by generating a three-channel spectral image from the speech signal and creating the corresponding annotation for each utterance. Second, we use the annotated images to train the system to detect sequences of AFs and their boundaries. We test the system by feeding it spectrogram images, in which it recognizes and localizes multi-label AFs. We also investigate using these AFs to detect the phonemes of an utterance. The YOLOv3-tiny detector is selected because of its real-time performance and its support for multi-label detection. We test AFD-Obj on Arabic and English using the KAPD and TIMIT corpora, respectively. Additionally, we propose using YOLOv3-tiny as an Arabic phoneme detection system (PD-Obj) to recognize and localize a sequence of Arabic phonemes from whole speech utterances. The proposed AFD-Obj and PD-Obj systems achieve excellent results on the Arabic corpus and results comparable to the state-of-the-art method on the English corpus. Moreover, we show that using only one-scale detection is suitable for AF detection or phoneme recognition.
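    The annotation step mentioned in the abstract (turning the AFs of each phoneme into multi-label objects on a spectrogram image) could look roughly like the sketch below. It is not the authors' code: the AF label set `AF_CLASSES`, the function name `segments_to_yolo`, the choice of full-height boxes, and the convention of writing one box per label are all assumptions made for illustration. Only the general YOLO text format (`class x_center y_center width height`, normalized to the image size) is standard.

```python
from typing import List, Tuple

# Hypothetical AF inventory; the label set used in the paper is not listed in this record.
AF_CLASSES = ["voiced", "unvoiced", "nasal", "fricative", "stop", "vowel"]

# One time-aligned phoneme segment: (start in seconds, end in seconds, AF labels).
Segment = Tuple[float, float, List[str]]


def segments_to_yolo(segments: List[Segment], duration_s: float) -> List[str]:
    """Convert time-aligned AF segments into YOLO-style annotation lines.

    Assumption: the spectrogram image spans the whole utterance on the x axis,
    so a segment maps to a box covering its time span and the full image height.
    Multi-label segments are written as one line per label with the same box.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")

    lines = []
    for start, end, labels in segments:
        # Clip the segment to the utterance and skip empty spans.
        start = max(0.0, start)
        end = min(duration_s, end)
        if end <= start:
            continue
        x_center = (start + end) / 2 / duration_s
        width = (end - start) / duration_s
        for label in labels:
            class_id = AF_CLASSES.index(label)
            lines.append(f"{class_id} {x_center:.6f} 0.500000 {width:.6f} 1.000000")
    return lines


if __name__ == "__main__":
    example = [(0.0, 0.5, ["voiced", "vowel"]), (0.5, 1.0, ["unvoiced", "fricative"])]
    for line in segments_to_yolo(example, duration_s=1.0):
        print(line)
```

    Each output line describes one AF label and its horizontal extent, so the boundaries of a detected box correspond directly to the start and end time of the segment.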
Language
English
Identifiers
ISSN: 1424-8220
eISSN: 1424-8220
DOI: 10.3390/s21041205
Title ID: cdi_doaj_primary_oai_doaj_org_article_c49362893ff74650ac41eb702452df37
