Hardy-Weinberg Equilibrium and Sequencing Errors
More than 200,000 genomes and an even larger number of exomes have been sequenced to date. It is still widely accepted that variants found by next-generation sequencing (NGS) should be validated with the current “gold standard” for DNA sequencing, Sanger dideoxy terminator sequencing. However, several reports suggest that NGS results are at least as accurate as, and in some cases more accurate than, Sanger sequencing, providing preliminary evidence that Sanger validation may not represent best practice for clinical NGS. Massively parallel sequencing technologies have revolutionized medical genetics, but NGS is also prone to both false negative and false positive results. Problems can arise during library preparation: PCR artefacts, for example, can introduce false positives in capture-based libraries, while amplicon-based libraries are prone to allelic dropout, in which variation in the sequence causes selective amplification of a single allele. Errors can also be introduced during bioinformatic analysis; large insertions or deletions, for example, may go undetected. To date there are no validated procedures for detecting sequencing errors, which in general are still identified by chance.
The Hardy-Weinberg equilibrium (HWE) can be considered a valid approach for testing whether our data contain genotyping errors. This approach is not suitable for rare variants, but it may be useful, for example, for identifying amplicons affected by allelic dropout (ADO). Because ADO makes heterozygous samples appear homozygous, polymorphic variants located in regions affected by ADO will show a deficit of heterozygotes and will therefore deviate from Hardy-Weinberg equilibrium.
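As an illustration of how such a check might look in practice, the following minimal Python sketch tests the genotype counts of a single biallelic variant against Hardy-Weinberg expectations. The function name `hwe_chi_square`, the example counts, and the choice of a one-degree-of-freedom chi-square goodness-of-fit test are assumptions made for illustration only; the text above does not prescribe a specific test, and an exact test is often preferred when genotype counts are small.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Hypothetical helper for illustration. Takes the observed counts of the
    three genotypes (AA, AB, BB) at one biallelic variant and returns the
    chi-square statistic and its p-value (1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped sample is required")

    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Classes with zero expected count (monomorphic site) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented example counts: a variant in equilibrium, and the same variant
# with most heterozygotes miscalled as homozygotes, as ADO would produce.
balanced = hwe_chi_square(25, 50, 25)
dropout = hwe_chi_square(45, 10, 45)
print(balanced)
print(dropout)
```

In this sketch, a variant whose p-value falls below a chosen threshold would be flagged, and several flagged variants clustering in the same amplicon would point to allelic dropout in that amplicon rather than to a true population effect. The threshold itself, and any correction for multiple testing, are left to the analyst.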
Gentilini Davide