Awang Abu Bakar, Normi Sham (2020) The development of an integrated corpus for Malay language. In: Sixth International Conference on Computational Science and Technology 2019 (ICCST2019), 29-30 Aug 2019, Kota Kinabalu.
PDF (paper), Published Version. Restricted to repository staff only (566kB).
PDF (scopus) (109kB).
Abstract
A corpus generally serves as the source of data for many types of research, and a number of Malay corpora have been developed to meet researchers' needs. However, these corpora are scattered and not integrated, so words present in one corpus may be missing from another. This paper focuses on developing an integrated corpus that combines the four most comprehensive Malay corpora. The intention is to provide comprehensive coverage of Malay text that would benefit any relevant work.
Item Type: | Conference or Workshop Item (Plenary Papers) |
---|---|
Additional Information: | 3509/75243 |
Subjects: | T Technology > T Technology (General) |
Kulliyyahs/Centres/Divisions/Institutes: | Kulliyyah of Information and Communication Technology > Department of Computer Science |
Depositing User: | Dr. Normi Sham Awang Abu Bakar |
Date Deposited: | 24 Oct 2019 15:56 |
Last Modified: | 24 Oct 2019 15:57 |
URI: | http://irep.iium.edu.my/id/eprint/75243 |