Sequence homology detection through large scale pattern discovery

Abstract

We describe a new approach for identifying sequence similarity between a query sequence and a database of proteins. The central idea is the use of a set of patterns obtained from the underlying database through a one-time computation. These patterns are subsequently searched for in every query sequence presented to the system. A pattern matched by a region of the query points to a potential local similarity between that region and all the database sequences that also match that pattern. By using a set of prudently chosen patterns, the tool presented in this work is able to discover weak but biologically important similarities.
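
The lookup step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the pattern language or the discovery algorithm, so this sketch assumes that patterns are ordinary regular expressions and that they are supplied by hand rather than discovered. The function names (build_pattern_index, scan_query), the toy sequences, and the toy patterns are all hypothetical.

```python
import re


def build_pattern_index(patterns, database):
    """One-time step: record which database sequences match each pattern.

    Assumption: patterns are regular expressions given by the caller.
    Pattern discovery itself is not shown here.
    """
    index = {}
    for pattern in patterns:
        regex = re.compile(pattern)
        index[pattern] = [name for name, seq in database.items() if regex.search(seq)]
    return index


def scan_query(query, index):
    """Per-query step: find every region of the query that matches a pattern.

    Each hit is (start, end, pattern, names), where names lists the database
    sequences that also match the pattern, i.e. candidates for a local
    similarity with query[start:end].
    """
    hits = []
    for pattern, names in index.items():
        for match in re.finditer(pattern, query):
            hits.append((match.start(), match.end(), pattern, names))
    return sorted(hits)


# Hypothetical toy data, for illustration only.
database = {
    "seqA": "MKTAYIAKQR",
    "seqB": "GGTAYLAKWW",
    "seqC": "PPPPPPPP",
}
patterns = ["TAY.AK", "KQR"]

index = build_pattern_index(patterns, database)  # computed once
hits = scan_query("AATAYVAKQR", index)           # repeated for every query

for start, end, pattern, names in hits:
    print(f"query[{start}:{end}] matches {pattern!r}, shared with {names}")
```

In this sketch the index is built once and reused for every query, which mirrors the one-time computation mentioned in the abstract. Ranking and extending the candidate regions into scored alignments is outside its scope.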

Publication Title

Proceedings of the Annual International Conference on Computational Molecular Biology, RECOMB
