Tools and Code Available

These tools/code are available for noncommercial use free of charge. For any other use, please contact Dr. Li (jingli@case.edu). All rights reserved.

PedPhase

PedPhase is a suite of computer programs for haplotype inference from genotypes on pedigree data. It consists of several different algorithms that are designed based on a combinatorial formulation of haplotype inference, namely the minimum-recombinant haplotype configuration (MRHC) problem, and are effective for different types of data. For haplotype and IBD inference of family data with many untyped individuals, please use PedIBD.
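To make the term "recombinant" concrete, here is a minimal sketch in Python. It is not one of the PedPhase algorithms. It assumes the haplotypes are already known and counts the fewest crossovers needed for one parent to transmit a given haplotype to a child. The full MRHC problem is harder: it must choose haplotypes for every pedigree member from unphased genotypes so that the total number of such recombinants is minimized. The function name and the string encoding of alleles are hypothetical, chosen only for illustration.

```python
def min_recombinants(parent_hap_a, parent_hap_b, child_hap):
    """Fewest crossovers between a parent's two haplotypes that can
    produce child_hap, or None if no transmission is consistent.

    Haplotypes are equal-length sequences of alleles, e.g. "0110".
    (Illustrative helper only; not part of PedPhase.)
    """
    if not (len(parent_hap_a) == len(parent_hap_b) == len(child_hap)):
        raise ValueError("haplotypes must have the same number of loci")

    inf = float("inf")
    # cost[s] = fewest crossovers so far if the current locus is copied
    # from parental haplotype s (0 = a, 1 = b).
    cost = [0, 0]
    for a, b, c in zip(parent_hap_a, parent_hap_b, child_hap):
        matches = (a == c, b == c)
        new_cost = [inf, inf]
        for s in (0, 1):
            if matches[s]:
                # Either keep copying from s, or cross over from the other.
                new_cost[s] = min(cost[s], cost[1 - s] + 1)
        cost = new_cost

    best = min(cost)
    return None if best == inf else best


if __name__ == "__main__":
    print(min_recombinants("0011", "1100", "0011"))  # no crossover needed
    print(min_recombinants("0011", "1100", "0000"))  # one crossover
```

The dynamic program keeps only two running costs, so it runs in time linear in the number of loci. A zero-recombinant solution, the setting targeted by algorithms for data with no or very few recombinations, corresponds to every transmission in the pedigree having a count of 0.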

PedPhase 3.0

PedPhase 3.0 implements the DSS algorithm, which is described in detail in An almost linear time algorithm for a general haplotype solution on tree pedigrees with no recombination and its extensions. JBCB, 7(3): 521-545, 2009. It was released in Aug 2009 as a standalone program for Windows and Linux platforms, with a brief description of the algorithm. The algorithm works effectively for any pedigree structure and a large number of SNPs, provided the data contain no or very few recombinations and a moderate amount of missing data. It is intended to replace the Block-extension algorithm and the Constraint-finding algorithm implemented in version 1.0. Please download PedPhase 3.0 from here.

PedPhase 2.1

PedPhase 2.0/2.1 implements the ILP algorithm, which is described in detail in Computing the Minimum Recombinant Haplotype Configuration from Incomplete Genotype Data on a Pedigree by Integer Linear Programming. Journal of Computational Biology, 12:719-739, 2005. It is an exact algorithm. Version 2.1 fixes some bugs reported by users. Because of the ILP solver it relies on, Version 2.1 is only available on Windows. Please download PedPhase 2.1 from here.

PedPhase 1.0

PedPhase 1.0 implements four algorithms, which are described in these two papers (Efficient Inference of Haplotypes from Genotype on a Pedigree and Minimum Recombinant Haplotype Configuration on Tree Pedigrees). Version 1.0 is available on Windows and Linux. Please download PedPhase 1.0 for Windows from here and PedPhase 1.0 for Linux from here.

Citations
  1. Li, X. & Li, J. An almost linear time algorithm for a general haplotype solution on tree pedigrees with no recombination and its extensions. Journal of Bioinformatics and Computational Biology (JBCB), 7(3): 521-545, 2009.
  2. Jing Li and Tao Jiang. Computing the Minimum Recombinant Haplotype Configuration from Incomplete Genotype Data on a Pedigree by Integer Linear Programming. Journal of Computational Biology, 12:719-739, 2005.
  3. Jing Li and Tao Jiang. Efficient Inference of Haplotypes from Genotype on a Pedigree. Journal of Bioinformatics and Computational Biology (JBCB), 1(1):41-69, 2003.
  4. Koichiro Doi, Jing Li and Tao Jiang. Minimum Recombinant Haplotype Configuration on Tree Pedigrees. In Proc. WABI03 339-353.
Dataset

We have applied the integer linear programming algorithm in PedPhase to the genotype data with pedigree information obtained from Gabriel et al. You may find the haplotype solutions here. Please refer to the readme file for more information about the data.

Related References
  1. HapMap Project at National Human Genome Research Institute.
  2. Genotype data of the structure of haplotype blocks discovered by Whitehead Institute/MIT Center for Genome Research.
  3. The SNP Consortium.
  4. DIMACS Workshop on SNP
Related Software
  1. Cyrillic Pedigree Draw Software
  2. Another Pedigree Draw Software by Dave Curtis
  3. SimWalk2, a statistical pedigree analysis package
HapMiner

HapMiner is a computer program for association mapping based on directly mining the haplotypes from case-control data via a density-based clustering algorithm. HapMiner can be applied to whole-genome screens, as well as candidate-gene studies in small genomic regions.

HapMiner 1.1

The current version of HapMiner is 1.1, which adds QTL mapping. HapMiner 1.1 is available on Windows and Linux. Please download HapMiner 1.1 for Windows here and for Linux here.

Citations
  1. Jing Li and Tao Jiang. Haplotype-based linkage disequilibrium mapping via direct data mining. Bioinformatics, 2005 21(24):4384-4393. Please find the supplementary materials of this work here.
  2. Li, J., Zhou, Y. & Elston, R.C. Haplotype-based quantitative trait mapping using a clustering algorithm. BMC Bioinformatics 7, 258 (2006). https://doi.org/10.1186/1471-2105-7-258
Dataset
  1. The CF gene dataset and the FA dataset obtained from Dr. Jun Liu's website.
  2. The HLA dataset obtained from Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory.
  3. The simulated dataset obtained from Dr. Hannu Toivonen.