Clustering of Main Orthologs for Multiple Genomes

Zheng Fu*, Tao Jiang

Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA. zfu@cs.ucr.edu

Proc LSS Comput Syst Bioinform Conf. August, 2007. Vol. 6, p. 195-201. Full-Text PDF

*To whom correspondence should be addressed.


The identification of orthologous genes shared by multiple genomes is critical for both functional and evolutionary studies in comparative genomics. While it is usually done by sequence similarity search and reconciled tree construction in practice, recently a new combinatorial approach and a high-throughput system MSOAR for ortholog identification between closely related genomes based on genome rearrangement and gene duplication have been proposed in 11. MSOAR assumes that orthologous genes correspond to each other in the most parsimonious evolutionary scenario minimizing the number of genome rearrangement and (post-speciation) gene duplication events. However, the parsimony approach used by MSOAR limits it to pairwsie genome comparisons. In this paper, we extend MSOAR to multiple (closely related) genomes and propose an ortholog clustering method, called MultiMSOAR, to infer main orthologs in multiple genomes. As a preliminary experiment, we apply MultiMSOAR to rat, mouse and human genomes, and validate our results using gene annotations and gene function classifications in the public databases. We further compare our results to the ortholog clusters predicted by MultiParanoid, which is an extension of the well-known program Inparanoid for pairwise genome comparisons. The comparison reveals that MultiMSOAR gives more detailed and accurate orthology information since it can effectively distinguish main orthologs from inparalogs.


[CSB2007 Conference Home Page]....[CSB2007 Online Proceedings]....[Life Sciences Society Home Page]