CSB2010 Inferring Haplotypes from Genotypes on a Pedigree with Mutations, Genotyping Errors and Missing Alleles

Wei-Bung Wang*, Tao Jiang

Department of Computer Science, University of California - Riverside, Riverside, CA 92521, USA. weiw@cs.ucr.edu

Proc LSS Comput Syst Bioinform Conf. August, 2010. Vol. 9, p. 192-203.

*To whom correspondence should be addressed.


Inferring the haplotypes of the members of a pedigree from their genotypes has been extensively studied. However, most studies do not consider genotyping errors and de novo mutations. In this paper, we study how to infer haplotypes from genotype data that may contain genotyping errors, de novo mutations and missing alleles. We assume that there are no recombinants in the genotype data, which is usually true for tightly linked markers. We introduce a combinatorial optimization problem, called haplotype configuration with mutations and errors (HCME), which calls for haplotype configurations that are consistent with the given genotypes, incur no recombinants, and require the minimum number of mutations and errors. HCME is NP-hard. To solve the problem, we propose a heuristic algorithm whose core is an integer linear program (ILP) that uses a system of linear equations over the Galois field GF(2). Our algorithm can detect and locate genotyping errors that cannot be detected by simply checking the Mendelian law of inheritance. The algorithm also corrects errors in genotypes/haplotypes rather than just detecting inconsistencies and deleting the involved loci. Our experimental results show that the algorithm can infer haplotypes with very high accuracy and recover 65%–94% of genotyping errors, depending on the pedigree topology.
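
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of a single-locus Mendelian consistency check on a father-mother-child trio. It is not the authors' algorithm; the function name `mendelian_consistent` and the encoding of genotypes as unordered pairs of allele labels (here 1 and 2) are assumptions made for illustration only.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """Return True if the child's unordered genotype can be formed by
    taking one allele from the father and one from the mother.

    Each genotype is a pair of allele labels at a single locus
    (hypothetical encoding, not taken from the paper).
    """
    target = sorted(child)
    return any(sorted((a, b)) == target for a, b in product(father, mother))


# Consistent: the child can inherit 1 from the father and 2 from the mother.
print(mendelian_consistent((1, 1), (1, 2), (1, 2)))

# Inconsistent: neither parent carries allele 2.
print(mendelian_consistent((1, 1), (1, 1), (1, 2)))

# Passes the check even if the child's true genotype were (1, 2)
# and was misread as (1, 1): both readings are compatible with these parents.
print(mendelian_consistent((1, 2), (1, 2), (1, 1)))
```

The last call illustrates why such a check is limited: a misread genotype that remains compatible with the parents' genotypes at that locus is not flagged. Catching such cases requires additional information, such as the no-recombinant assumption across linked markers that the abstract describes.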

