Improving Text-Independent Forced Alignment to Support Speech-Language Pathologists with Phonetic Transcription

Sensors (Basel, Switzerland), 2023-12, Vol.23 (24), p.9650

2023

Details

Author(s) / Contributors

Title

Improving Text-Independent Forced Alignment to Support Speech-Language Pathologists with Phonetic Transcription

Is part of

Sensors (Basel, Switzerland), 2023-12, Vol.23 (24), p.9650

Place / Publisher

Switzerland: MDPI AG

Year of publication

2023

Source

Free E-Journal (publisher's openly available content only)

Descriptions/Notes

Phonetic transcription is crucial in diagnosing speech sound disorders (SSDs) but is susceptible to transcriber experience and perceptual bias. Current forced alignment (FA) tools, which annotate audio files to determine spoken content and its placement, often require manual transcription, limiting their effectiveness.

We introduce a novel, text-independent forced alignment model that autonomously recognises individual phonemes and their boundaries, addressing these limitations. Our approach leverages an advanced, pre-trained wav2vec 2.0 model to segment speech into tokens and recognise them automatically. To accurately identify phoneme boundaries, we utilise an unsupervised segmentation tool, UnsupSeg. Labelling of segments employs nearest-neighbour classification with wav2vec 2.0 labels, before connectionist temporal classification (CTC) collapse, determining class labels based on maximum overlap. Additional post-processing, including overfitting cleaning and voice activity detection, is implemented to enhance segmentation.

We benchmarked our model against existing methods using the TIMIT dataset for normal speakers and, for the first time, evaluated its performance on the TORGO dataset containing SSD speakers. Our model demonstrated competitive performance, achieving a harmonic mean score of 76.88% on TIMIT and 70.31% on TORGO.

This research presents a significant advancement in the assessment and diagnosis of SSDs, offering a more objective and less biased approach than traditional methods. Our model's effectiveness, particularly with SSD speakers, opens new avenues for research and clinical application in speech pathology.
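
The maximum-overlap labelling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_segments`, the 20 ms frame length, the millisecond units, and the optional `blank` marker are all assumptions made for illustration. It takes segment boundaries (such as an unsupervised segmenter might produce) and one label per frame (such as frame-level predictions before CTC collapse), and gives each segment the label that covers most of it.

```python
from collections import Counter


def label_segments(boundaries_ms, frame_labels, frame_ms=20, blank=None):
    """Give each segment the frame label with the largest time overlap.

    boundaries_ms: sorted segment boundaries in integer milliseconds.
    frame_labels:  one label per fixed-length frame (before CTC collapse).
    frame_ms:      assumed frame length in milliseconds (hypothetical default).
    blank:         assumed marker for frames to ignore, e.g. a CTC blank.
    """
    labels = []
    for start, end in zip(boundaries_ms[:-1], boundaries_ms[1:]):
        overlap = Counter()
        first = start // frame_ms
        last = min(len(frame_labels) - 1, end // frame_ms)
        for i in range(first, last + 1):
            if frame_labels[i] == blank:
                continue
            frame_start = i * frame_ms
            frame_end = frame_start + frame_ms
            # Time shared between this frame and the current segment.
            shared = min(end, frame_end) - max(start, frame_start)
            if shared > 0:
                overlap[frame_labels[i]] += shared
        # None marks a segment that no usable frame overlaps.
        labels.append(max(overlap, key=overlap.get) if overlap else None)
    return labels


if __name__ == "__main__":
    frames = list("sssiittt")  # eight 20 ms frames
    print(label_segments([0, 50, 110, 160], frames))
```

Integer milliseconds are used here so the overlap arithmetic stays exact; how ties and blank frames are actually resolved is not stated in the abstract.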

Language: English
Identifiers: ISSN: 1424-8220
eISSN: 1424-8220
DOI: 10.3390/s23249650
Title ID: cdi_doaj_primary_oai_doaj_org_article_2a0315a0ad214116bef2ee436b6903b8
