IIUM Repository (IREP)

Bangla speech-to-text conversion using SAPI

Sultana, Shaheena and Akhand, M. A. H and Das, Prodip Kumar and Rahman, M.M. Hafizur (2012) Bangla speech-to-text conversion using SAPI. In: International Conference on Computer and Communication Engineering (ICCCE 2012), 3-5 July 2012, Seri Pacific Hotel Kuala Lumpur.

Abstract

Speech is the most natural form of communication and interaction between humans, whereas text and symbols are the most common form of transaction in computer systems. Interest in conversion between speech and text for speech-oriented human-computer interaction is therefore growing steadily. Microsoft Corporation developed the Speech Application Program Interface (SAPI) for speech-related tasks in its Windows operating systems, but it includes features for only eight languages, including English. The aim of this study is to investigate Speech-to-Text (STT) conversion for the Bangla language using SAPI. Bangla is an important language with a rich heritage; UNESCO declared 21 February International Mother Language Day in honour of those who died for the language in Bangladesh in 1952. We configured SAPI to match pronunciations from continuous Bangla speech against a precompiled SAPI grammar file; when a match occurs, SAPI returns the Bangla word written in English characters. These words are then used to fetch the corresponding Bangla words from a database, which returns them in true Bangla characters to complete the sentences. Listing several English spellings for a given Bangla word in the SAPI grammar file was found to compensate for variation in speakers' tone and for pronunciation differences across language communities, and was shown to improve the overall performance of the system. The technique was evaluated experimentally on an article from a newspaper, and the recognition rate was approximately 78% on average. Although the achieved performance is promising for STT-related studies, we identified several elements that could be improved to give better accuracy. The approach of this study should also be helpful for Speech-to-Text conversion and similar tasks in other languages.
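
The lookup step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the variant table, the function names, and the sample spellings below are all hypothetical, and an in-memory dictionary stands in for the database mentioned in the abstract. It assumes the recognizer has already returned each matched word in English characters, and shows how several English spellings can resolve to the same Bangla word.

```python
# Hypothetical variant table: several English-character spellings per
# Bangla word, standing in for the grammar file entries and the database.
VARIANTS = {
    "আমি": ["ami", "aami"],
    "বাংলা": ["bangla", "bangala"],
    "ভালোবাসি": ["bhalobashi", "valobasi"],
}


def build_lookup(variants):
    """Invert the table so each English spelling points at its Bangla word."""
    lookup = {}
    for bangla, spellings in variants.items():
        for spelling in spellings:
            lookup[spelling.lower()] = bangla
    return lookup


def to_bangla(tokens, lookup):
    """Map recognized tokens to Bangla; unmatched tokens are kept unchanged."""
    return " ".join(lookup.get(token.lower(), token) for token in tokens)


LOOKUP = build_lookup(VARIANTS)

# Tokens as a recognizer might return them (assumed sample input).
sentence = to_bangla(["aami", "bangla", "valobasi"], LOOKUP)
print(sentence)  # আমি বাংলা ভালোবাসি
```

Keeping unmatched tokens in the output rather than dropping them is a design choice of this sketch, made so that recognition gaps stay visible in the assembled sentence.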

Item Type: Conference or Workshop Item (Full Paper)
Uncontrolled Keywords: Speech, Text, Human-Computer Interaction
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology > Department of Computer Science
Depositing User: Dr. M.M. Hafizur Rahman
Date Deposited: 06 Sep 2012 14:52
Last Modified: 06 Sep 2012 14:52
URI: http://irep.iium.edu.my/id/eprint/24980
