Variant Filtration

GATK Variant Quality Score Recalibration (VQSR) was used to filter variants.


To train the SNP VQSR model, HapMap3.3 and 1KG Omni2.5 SNP sites were used, and a 99.6% sensitivity threshold was applied to filter variants. To train the indel VQSR model, Mills et al. 1KG gold standard and Axiom Exome Plus sites were used, with a 95.0% sensitivity threshold.
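As a conceptual illustration of what a sensitivity threshold means, the sketch below picks the VQSLOD cutoff that retains a given fraction of truth-set sites. It is not GATK's implementation: the function name `vqslod_cutoff` and the toy scores are assumptions made for this example, and GATK's own tranche calculation may differ in detail.

```python
import math


def vqslod_cutoff(truth_vqslods, sensitivity):
    """Return the VQSLOD cutoff that keeps at least `sensitivity` of truth sites.

    Hypothetical helper for illustration only; not part of GATK.
    A site is retained when its VQSLOD is >= the returned cutoff.
    """
    if not truth_vqslods:
        raise ValueError("need at least one truth-site score")
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    # Rank truth sites from best to worst score.
    ranked = sorted(truth_vqslods, reverse=True)
    # Number of truth sites that must be retained (rounded to dodge float noise).
    keep = max(1, math.ceil(round(sensitivity * len(ranked), 9)))
    # The score of the last retained truth site is the cutoff.
    return ranked[keep - 1]


# Toy scores, invented for the example.
toy_truth_scores = [4.0, 2.5, 0.5, -3.0]
cutoff = vqslod_cutoff(toy_truth_scores, 0.75)
```

A higher sensitivity keeps more truth sites and therefore pushes the cutoff lower, which is why the 99.6% SNP threshold is more permissive than the 95.0% indel threshold.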


https://gist.github.com/hongiiv/a9669470717c9b8b5aa7.js

This resulted in ~80% of bi-allelic singleton SNPs being filtered. Based on analysis of Ti/Tv ratios, singleton transmission in trios, and validated sites, the VQSLOD PASS cutoff was adjusted, which resulted in ~90% of bi-allelic singleton SNPs being filtered.

An additional inbreeding coefficient filter (InbreedingCoeff <= -0.2) was applied to remove sites missed by VQSR filtering. Lastly, a filter labelled AC_Adj0_Filter was introduced to indicate that only low-quality genotype calls containing alternate alleles are present in the release subset.
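The inbreeding coefficient check can be sketched as a small test on a VCF INFO string. This is a minimal illustration, assuming the site carries an `InbreedingCoeff` annotation in its INFO field; the helper names `parse_info` and `fails_inbreeding_filter` are hypothetical and are not taken from the pipeline described here.

```python
def parse_info(info):
    """Parse a VCF INFO string such as 'AC=2;DB;VQSLOD=1.2' into a dict.

    Flag entries without '=' are stored with the value True.
    """
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def fails_inbreeding_filter(info, threshold=-0.2):
    """Return True when InbreedingCoeff is present and <= threshold.

    Sites without the annotation are left alone (assumption for this sketch).
    """
    value = parse_info(info).get("InbreedingCoeff")
    if value is None or value is True:
        return False
    return float(value) <= threshold


# Invented INFO strings for the example.
flagged = fails_inbreeding_filter("AC=2;InbreedingCoeff=-0.35;VQSLOD=1.2")
kept = fails_inbreeding_filter("AC=2;InbreedingCoeff=-0.05;VQSLOD=1.2")
```

Strongly negative values indicate an excess of heterozygous calls relative to Hardy-Weinberg expectations, which is often a sign of systematic artefacts rather than real variation.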