IIUM Repository

Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators

Htike@Muhammad Yusof, Zaw Zaw and Win, Shoon Lei (2013) Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators. Procedia Computer Science, 23. pp. 60-67. ISSN 1877-0509



The completion of the Human Genome Project in the last decade has generated strong demand for computational analysis techniques that can fully exploit the acquired human genome database. The project produced a vast mass of genetic data, which necessitates automatic genome annotation. There is growing interest in gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Recognizing promoters is therefore one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes, and aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine-learning approach, weightily averaged one-dependence estimators (WAODE), for recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select the relevant nucleotides that are directly responsible for promoter recognition. As a proof of concept, we carried out experiments on a dataset extracted from the biological literature. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend it to recognize promoter sequences in various species of higher eukaryotes.
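
The entropy-based selection of nucleotide positions mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact criterion, so the sketch assumes information gain (the reduction in class entropy after splitting on one sequence position). The function names and the toy sequences are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(sequences, labels, position):
    """Class entropy minus the expected entropy after splitting on one position."""
    groups = {}
    for seq, label in zip(sequences, labels):
        groups.setdefault(seq[position], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_positions(sequences, labels):
    """Return positions sorted by decreasing information gain (ties by index)."""
    gains = [(information_gain(sequences, labels, p), p)
             for p in range(len(sequences[0]))]
    return [p for _, p in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Hypothetical toy data: four equal-length sequences with made-up labels.
toy_sequences = ["TATA", "TACA", "GCGA", "GCTA"]
toy_labels = ["promoter", "promoter", "non-promoter", "non-promoter"]

ranking = rank_positions(toy_sequences, toy_labels)
```

In such a scheme, only the highest-ranked positions would be passed on to the classifier, which is one way to reduce the number of attributes the model has to handle.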

Item Type: Article (Journal)
Additional Information: 6919/34337
Uncontrolled Keywords: genetic sequence classification; promoter recognition; WAODE
Subjects: Q Science > Q Science (General)
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Electrical and Computer Engineering
Depositing User: Mr. Zaw Zaw Htike
Date Deposited: 16 Jan 2014 12:24
Last Modified: 01 Jun 2015 11:29
URI: http://irep.iium.edu.my/id/eprint/34337
