IIUM Repository

Modern standard Arabic speech corpus for implementing and evaluating automatic continuous speech recognition systems

Abushariah, Mohammad Abd-Alrahman Mahmoud and Raja Zainal Abidin, Raja Noor Ainon and Zainuddin, Roziati and Alqudah, Assal Ali Mustafa and Ahmed, Moustafa Elshafei and Khalifa, Othman Omran (2011) Modern standard Arabic speech corpus for implementing and evaluating automatic continuous speech recognition systems. Journal of the Franklin Institute. ISSN 0016-0032 (In Press)


Abstract

This paper presents our work towards developing a new speech corpus for Modern Standard Arabic (MSA), which can be used for implementing and evaluating Arabic speaker-independent, large-vocabulary, automatic continuous speech recognition systems. The speech corpus was recorded by 40 native Arabic speakers (20 male and 20 female) from 11 countries representing three major regions (Levant, Gulf, and Africa). Three development phases were conducted, based on the size of the training data, the Gaussian mixture distributions, and the tied states (senones). In the third development phase, which used 11 hours of training speech data, the acoustic model is composed of 16 Gaussian mixture distributions, with the state distributions tied to 300 senones. Using three different data sets, the third development phase obtained an average word recognition correctness rate of 94.32% and an average Word Error Rate (WER) of 8.10% for the same speakers with different sentences (testing sentences). For different speakers with the same sentences (training sentences), this work obtained an average word recognition correctness rate of 98.10% and an average WER of 2.67%, whereas for different speakers with different sentences (testing sentences) it obtained an average word recognition correctness rate of 93.73% and an average WER of 8.75%.
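The abstract reports word recognition correctness and WER side by side, and the two do not sum to 100%. The abstract does not define either metric, so the following is only a minimal sketch under an assumed, commonly used convention: correctness counts matched reference words and ignores insertions, while WER counts substitutions, deletions, and insertions. The function names `align_counts` and `rates` are hypothetical and are not taken from the paper.

```python
def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) from a
    minimum-edit-distance alignment of two word lists."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back through the table to count each error type.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def rates(ref, hyp):
    """Return (correctness %, WER %) under the assumed definitions:
    correctness = hits / N, WER = (S + D + I) / N, N = reference length."""
    hits, subs, dels, ins = align_counts(ref, hyp)
    n = len(ref)
    return 100.0 * hits / n, 100.0 * (subs + dels + ins) / n


# Toy example (not data from the paper): one substitution, one insertion.
reference = "a b c d".split()
hypothesis = "a x c d e".split()
print(rates(reference, hypothesis))
```

Under these assumed definitions, inserted words raise the WER without lowering correctness, which is one way the two reported figures can add up to more than 100%.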

Item Type: Article (Journal)
Additional Information: 4119/5625
Subjects: T Technology > T Technology (General)
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Electrical and Computer Engineering
Depositing User: Prof. Dr Othman O. Khalifa
Date Deposited: 25 Oct 2011 10:22
Last Modified: 17 Jan 2012 11:56
URI: http://irep.iium.edu.my/id/eprint/5625
