IIUM Repository (IREP)

Towards a Malay derivational lexicon: learning affixes using expectation maximization

Sulaiman, Suriani and Gasser, Michael and Kubler, Sandra (2011) Towards a Malay derivational lexicon: learning affixes using expectation maximization. In: 2nd Workshop on South and Southeast Asian Natural Language Processing (WSSANLP), IJCNLP 2011, 8th-13th Nov. 2011, Chiang Mai, Thailand.

Abstract

We propose an unsupervised training method to guide the learning of Malay derivational morphology from a set of morphological segmentations produced by a naïve morphological analyzer. Using a morphology-based language model, we first estimate the probability of a given segmentation. We train the model with EM to find the segmentation that maximizes the probability of each morpheme. We extract the set of affix patterns produced by our algorithm and evaluate them against two references: a list of affix patterns extracted from our hand-segmented derivational wordlist and a derivational history produced by a stemmer.

Item Type: Conference or Workshop Item (Full Paper)
Additional Information: 4615/32082
Uncontrolled Keywords: Malay derivational lexicon
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology > Department of Computer Science
Depositing User: Sr. Suriani Sulaiman
Date Deposited: 26 Dec 2013 11:11
Last Modified: 26 Dec 2013 11:12
URI: http://irep.iium.edu.my/id/eprint/32082
