Structural variation (SV) has been associated with numerous diseases, including cancer. With the advent of next-generation sequencing (NGS) technologies, many types of SV can potentially be identified. We propose a model-based clustering approach that uses a set of features defined for each type of SV event. Our method, termed SVMiner, not only provides a probability score for each candidate but also predicts the heterozygosity of genomic deletions. Extensive experiments on genome-wide deep sequencing data demonstrate that SVMiner is robust to variability in a single cluster feature, and that it performs well when classifying validated SV events with accentuated features.
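
To illustrate the general idea of model-based clustering with a per-candidate probability score, here is a minimal sketch. It is not SVMiner's actual model or feature set: it assumes a single hypothetical feature per candidate (for example, a count of supporting read pairs) and fits a two-component one-dimensional Gaussian mixture by expectation-maximization. The function names and the example values are invented for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(values, n_iter=50):
    """Fit a 1-D two-component Gaussian mixture with EM.

    Returns, for each value, the posterior probability of belonging to the
    component that was initialized at the larger end of the data.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                          # start the components at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    post = [0.5] * len(values)

    for _ in range(n_iter):
        # E-step: posterior probability of the "high" component for each value.
        post = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, means[0], variances[0])
            p1 = weights[1] * normal_pdf(x, means[1], variances[1])
            total = p0 + p1
            post.append(p1 / total if total > 0 else 0.5)

        # M-step: re-estimate weight, mean and variance of each component.
        for k in (0, 1):
            resp = [p if k == 1 else 1.0 - p for p in post]
            mass = sum(resp)
            if mass < 1e-12:
                continue                      # component is empty; keep old parameters
            weights[k] = mass / len(values)
            means[k] = sum(r * x for r, x in zip(resp, values)) / mass
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, values)) / mass
            variances[k] = max(var, 1e-6)     # floor to avoid a degenerate component

    return post


if __name__ == "__main__":
    # Hypothetical feature values for eight candidate events.
    support = [2, 3, 2, 4, 3, 20, 22, 21]
    for value, score in zip(support, fit_two_component_mixture(support)):
        print(value, round(score, 3))
```

Each returned score lies between 0 and 1 and can be read as the probability that a candidate belongs to the high-support cluster. A method that uses several features per SV type would need a multivariate model.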

SVMiner analyzes alignment results provided in SAM/BAM or MAP (MAQ alignment output) format.

It requires MAQ, SAMtools, and MATLAB to be preinstalled on the user's local machine.
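
A quick way to confirm these prerequisites before running SVMiner is to check that each tool is on the PATH. The sketch below is not part of SVMiner. It assumes the executables are named `maq`, `samtools`, and `matlab`, which may differ on some installations, and the helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names; adjust if your installation uses different ones.
REQUIRED_TOOLS = ("maq", "samtools", "matlab")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```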


Matthew Hayes, Yoon Soo Pyon, and Jing Li. "A model-based clustering method for genomic structural variant prediction and genotyping using paired-end sequencing data." In revision.