A deep learning approach to automate refinement of somatic variant calling from cancer sequencing data

Benjamin J. Ainscough, Erica K. Barnell, Peter Ronning, Katie M. Campbell, Alex H. Wagner, Todd A. Fehniger, Gavin P. Dunn, Ravindra Uppaluri, Ramaswamy Govindan, Thomas E. Rohan, Malachi Griffith, Elaine R. Mardis, S. Joshua Swamidass, Obi L. Griffith

Research output: Contribution to journal (Article)

7 Scopus citations

Abstract

Cancer genomic analysis requires accurate identification of somatic variants in sequencing data. Manual review to refine somatic variant calls is required as a final step after automated processing. However, manual variant refinement is time-consuming, costly, poorly standardized, and non-reproducible. Here, we systematized and standardized somatic variant refinement using a machine learning approach. The final model incorporates 41,000 variants from 440 sequencing cases. This model accurately recapitulated manual refinement labels for three independent testing sets (13,579 variants) and accurately predicted somatic variants confirmed by orthogonal validation sequencing data (212,158 variants). The model improves on manual somatic refinement by reducing bias on calls otherwise subject to high inter-reviewer variability.

Original language: English (US)
Journal: Nature Genetics
DOI: https://doi.org/10.1038/s41588-018-0257-y
State: Accepted/In press - Jan 1 2018

ASJC Scopus subject areas

  • Genetics

Cite this

Ainscough, B. J., Barnell, E. K., Ronning, P., Campbell, K. M., Wagner, A. H., Fehniger, T. A., Dunn, G. P., Uppaluri, R., Govindan, R., Rohan, T. E., Griffith, M., Mardis, E. R., Swamidass, S. J., & Griffith, O. L. (Accepted/In press). A deep learning approach to automate refinement of somatic variant calling from cancer sequencing data. Nature Genetics. https://doi.org/10.1038/s41588-018-0257-y