Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes

El Amrani, Mohamed Yassine and Rahman, M.M. Hafizur and Wahiddin, Mohamed Ridza and Shah, Asadullah (2016) Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes. Egyptian Informatics Journal, 17 (3). pp. 305-314. ISSN 1110-8665

Official URL: http://www.sciencedirect.com/science/article/pii/S...

Abstract

This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes instead of the commonly used Romanized set, in order to simplify the process of generating the model. When all the audio data was used for both the training and the testing phases, the experiments yielded a very low Word Error Rate (WER) of 1.5%, even though only a very small set of audio files was used for training. However, when using 90% and 80% of the training data, the WER obtained was 50.0% and 55.7%, respectively.

Item Type:	Article (Journal)
Additional Information:	6724/53574
Subjects:	T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices
Kulliyyahs/Centres/Divisions/Institutes:	Kulliyyah of Information and Communication Technology > Department of Computer Science
Depositing User:	Dr. M.M. Hafizur Rahman
Date Deposited:	28 Dec 2016 13:06
Last Modified:	11 Jan 2017 11:01
URI:	http://irep.iium.edu.my/id/eprint/53574
