Determining the adaptation data saturation of ASR systems for dysarthric speakers

Al-Qatab, Bassam Ali and Mustafa, Mumtaz Begum and Salim, Siti Salwah and Abdul Sani, Asmiza (2021) Determining the adaptation data saturation of ASR systems for dysarthric speakers. International Journal of Speech Technology, 24 (1, SI). pp. 183-192. ISSN 1381-2416, DOI https://doi.org/10.1007/s10772-020-09788-7.

Full text not available from this repository.

Abstract

Automatic speech recognition (ASR) systems are gradually being accepted as assistive technology for physically impaired individuals, such as speakers with dysarthria. Dysarthria is a motor speech impairment in which the muscles of the speech organs are weak, causing slow or no movement of those muscles. It is often accompanied by neurological conditions such as cerebral palsy, head injury, muscular dystrophy and multiple sclerosis. Using an ASR system to understand the speech of a speaker with dysarthria offers many advantages over the conventional keyboard and mouse. However, the development of an effective ASR system for this condition is often limited by data sparsity, in terms of either the coverage of the language or the size of the speech databases. To overcome data sparsity, researchers have proposed several solutions, including adaptation techniques such as maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) adaptation. In this study, two types of adaptation technique were considered to determine the saturation point of the adaptation data for dysarthric speech: the individual MLLR and MAP adaptation techniques, and the combined adaptation technique (MLLR + MAP sequence, and MAP + MLLR sequence). The saturation point is identified using linear regression between the data size and the recognition accuracy. The results show that the saturation points differ between the individual MLLR and MAP adaptation techniques, and that the order of the sequence in the combined adaptation technique influences the saturation points.
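The abstract does not describe how the regression is applied, so the following is only a minimal sketch of one possible reading: fit a least-squares line over a sliding window of (data size, accuracy) points and report the first data size at which the fitted slope drops below a chosen threshold. The function names, the `min_gain` and `window` parameters, and the numbers are all hypothetical and are not taken from the paper.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def saturation_point(sizes, accuracies, min_gain=0.05, window=3):
    """Return the first data size at which a line fitted over the next
    `window` points gains less than `min_gain` accuracy per unit of data.
    Returns None if the slope never falls below the threshold.
    Both the threshold and the window are assumptions of this sketch."""
    for start in range(len(sizes) - window + 1):
        xs = sizes[start:start + window]
        ys = accuracies[start:start + window]
        if slope(xs, ys) < min_gain:
            return sizes[start]
    return None


# Invented numbers for illustration only (NOT results from the paper):
# amount of adaptation data versus recognition accuracy in percent.
sizes = [10, 20, 30, 40, 50, 60]
accuracies = [40.0, 55.0, 63.0, 66.0, 66.5, 66.8]
print(saturation_point(sizes, accuracies))
```

Under this reading, running the same function on the accuracy curve of each adaptation technique (MLLR, MAP, and each combined sequence) would give one saturation point per technique, which could then be compared.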

Item Type: Article
Funders: None
Uncontrolled Keywords: Dysarthric speech; Speaker adaptation; ASR system; Data saturation; Saturation point; Severity-based adaptation
Subjects: T Technology > TA Engineering (General). Civil engineering (General)
Divisions: Faculty of Computer Science & Information Technology
Depositing User: Ms Zaharah Ramly
Date Deposited: 09 May 2022 01:16
Last Modified: 09 May 2022 01:21
URI: http://eprints.um.edu.my/id/eprint/34966
