IIUM Repository (IREP)

A filtering algorithm for efficient retrieving of DNA sequence

Abdul Rahman, Mohd Nordin and Mohd. Saman, Md. Yazid and Ahmad, Aziz and Md. Tap, Abu Osman (2009) A filtering algorithm for efficient retrieving of DNA sequence. International Journal of Computer Theory and Engineering, 1 (2). pp. 102-109. ISSN 1793-821X (O), 1793-8201 (P)

Abstract

DNA sequence similarity search is an important task in computational biology. A similarity search is carried out by aligning a query sequence against each target sequence. Optimal alignment based on dynamic programming has been shown to have O(nm) time and space complexity, where n and m are the lengths of the two sequences. Heuristic algorithms align DNA sequences quickly, but with lower sensitivity. Biologists frequently require optimal comparison results so that the evolutionary relationships among living organisms can be reconstructed accurately. The task becomes more challenging as public sequence databases grow very large and continue to increase exponentially each year. The aim of this study is to develop a filtering algorithm that reduces the number of dynamic programming runs, so that a set of similar DNA sequences can be retrieved from a database efficiently. The algorithm filters out database sequences that are expected to be irrelevant before they are passed to the dynamic-programming-based optimal alignment. The proposed filtering process is built on an automaton-based algorithm: a set of random patterns generated from the query sequence is loaded into the automaton, after which exact matching and scoring are performed. Extensive experiments have been carried out on several parameters, and the results show that the developed filtering algorithm removes unrelated target sequences from alignment with the query sequence.
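
The pipeline outlined in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the pattern length, the number of patterns, the scoring rule, or the filtering threshold, so the names and defaults below (sample_patterns, n_patterns, length, min_hits, and the match/mismatch/gap scores) are assumptions. It samples random substrings of the query, loads them into an Aho-Corasick automaton, counts exact pattern hits in each target sequence, and runs a Smith-Waterman local alignment only on targets that reach the hit threshold.

```python
import random
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton as (goto, fail, out) tables."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        state = 0
        for ch in p:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(p)
    # Breadth-first pass: set failure links and merge inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def count_hits(automaton, text):
    """Count every exact occurrence of any pattern in text."""
    goto, fail, out = automaton
    state, hits = 0, 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits += len(out[state])
    return hits


def sample_patterns(query, n_patterns, length, seed=0):
    """Hypothetical sampler: random fixed-length substrings of the query.

    Assumes len(query) >= length.
    """
    rng = random.Random(seed)
    starts = [rng.randrange(len(query) - length + 1) for _ in range(n_patterns)]
    return {query[s:s + length] for s in starts}


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (score only, so two rows suffice)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def filtered_search(query, database, n_patterns=8, length=4, min_hits=2):
    """Align only targets with at least min_hits pattern hits (assumed rule)."""
    automaton = build_automaton(sample_patterns(query, n_patterns, length))
    results = {}
    for name, seq in database.items():
        if count_hits(automaton, seq) >= min_hits:
            results[name] = smith_waterman(query, seq)
    return results
```

Calling filtered_search(query, database) with database given as a dict mapping sequence names to sequence strings returns alignment scores only for the targets that pass the filter. The automaton scan is linear in the target length, whereas the alignment is O(nm), so each target rejected by the filter saves one dynamic programming run. The trade-off is that a threshold set too high may discard a genuinely related sequence.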

Item Type: Article (Journal)
Additional Information: 5996/1205
Uncontrolled Keywords: Exact string matching, Aho-Corasick algorithm, sequence comparison, Smith-Waterman algorithm
Subjects: T Technology > T Technology (General)
T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology
Kulliyyah of Information and Communication Technology > Department of Information System
Depositing User: Prof Dr ABU OSMAN MD TAP
Date Deposited: 28 Jul 2011 16:19
Last Modified: 20 Mar 2012 16:26
URI: http://irep.iium.edu.my/id/eprint/1205
