Abu Seman, Muhamad Sadry and Wan Mamat, Wan Ali @ Wan Yusoff and Noordin, Mohamad Fauzan and Othman, Roslina (2019) A model for islamic istilahnet in Malay manuscripts for big data analytics and linguistics consortium. Research Report. UNSPECIFIED. (Unpublished)
PDF (RIGS 2015 Research Report)
Restricted to registered users only (238kB)
Abstract
Information technology research on the content of Malay manuscripts is limited, especially research that takes a statistical approach rather than a rule-based one. This research aims to propose a hybrid model that combines the two approaches for jawi-roman transliteration of Malay manuscript contents. It assesses the quality scores obtained with a prevalent statistical model, Statistical Model Transliteration (SMT), for jawi-roman transliteration, and takes an exploratory approach. The data were extracted from 3 Malay manuscripts acquired from ISTAC: Bidāyat al-Mubtadī bi-Faḍlillāh al-Muhdī, Kashf al-Asrār and Hujjat al-Balighah. Together they yield 3,420 rows of data transliterated into old jawi, modern jawi and roman form. The SMT output is evaluated with two quality scores: the Bilingual Evaluation Understudy (BLEU) score and the word error rate. The findings show that, on the initial data, the E-Jawi.net word error rate is 55.8% for old jawi-roman and 32.42% for modern jawi-roman. Hence, the research opted for a human expert to develop a quality corpus for SMT consisting of multiple transliterations of the manuscript contents in modern jawi and roman. Significantly, the model is dependent on a quality parallel corpus.
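The abstract does not say how the word error rate was computed. As a minimal sketch, assuming the conventional definition (word-level edit distance divided by the number of words in the reference), it could be calculated as below. The function name `word_error_rate` and the placeholder strings are illustrative only and are not taken from the report or its corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenisation; the report's own tokenisation
    and scoring tool are not described in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not manuscript data: one substitution and one
# deletion against a four-word reference.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Under this definition the score is a ratio of edit operations to reference words, so a reported value such as 55.8% would mean roughly 56 word-level edits per 100 reference words.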
Item Type: | Monograph (Research Report) |
---|---|
Additional Information: | 4621/73052 |
Uncontrolled Keywords: | Transliteration, Jawi, Roman, Malay Manuscript, Statistical Model Transliteration, Hybrid Transliteration Model |
Subjects: | T Technology > T Technology (General); Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases |
Kulliyyahs/Centres/Divisions/Institutes: | Kulliyyah of Information and Communication Technology > Department of Information System |
Depositing User: | Dr. Muhamad Sadry Abu Seman |
Date Deposited: | 01 Dec 2019 11:57 |
Last Modified: | 01 Dec 2019 11:57 |
URI: | http://irep.iium.edu.my/id/eprint/73052 |