Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions

Timothy G. Bowler, Kith Pradhan, Yu Kong, Matthias Bartenstein, Kerry A. Morrone, Ashwin Shridharan, Rachel M. Kessel, Aditi Shastri, Orsolya Giricz, Tushar D. Bhagat, Shanisha Gordon-Mitchell, Mersedeh Rohanizadegan, Lauren Hooda, Ishan Datt, Bartlomiej P. Przychodzen, Simrit Parmar, Shahina Maqbool, Jaroslaw P. Maciejewski, Ulrich Steidl, John M. Greally, Amit Verma

Research output: Contribution to journal (Article)

Abstract

The MLL3 gene has been shown to be recurrently mutated in many malignancies, including in families with acute myeloid leukemia. We demonstrate that many MLL3 variant calls made by exome sequencing are false positives caused by misalignment of reads between MLL3 and homologous regions, including a region on chr21, and that such calls can only be validated by long-range PCR. Numerous other recurrently mutated genes reported in the COSMIC and TCGA databases have pseudogenes, and their variants likewise cannot be validated by conventional short-read sequencing approaches. Genome-wide identification of pseudogene regions demonstrates that the frequency of these homologous regions increases at sequencing read lengths below 200 bp. To enable identification of poor-quality sequencing variants in prospective studies, we generated novel genome-wide maps of regions with poor mappability that can be used in variant-calling algorithms. Taken together, our findings reveal that pseudogene regions are a source of false-positive mutations in cancers.
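The abstract states that the poor-mappability maps can be used in variant-calling algorithms but does not say how. The following is a minimal sketch of one way such a map might be applied, assuming the map is available as BED-style, 0-based half-open intervals and that variants are given as (chromosome, position) pairs. The function names and every coordinate below are hypothetical placeholders for illustration; they are not taken from the paper or from its maps.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Group half-open (chrom, start, end) intervals by chromosome,
    then sort and merge overlapping intervals for fast lookup."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end:
                # Overlapping or touching: extend the previous interval.
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        starts = [s for s, _ in merged]
        index[chrom] = (starts, merged)
    return index


def in_low_mappability(index, chrom, pos):
    """Return True if the 0-based position lies inside any interval."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    # Find the last interval whose start is <= pos.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos < merged[i][1]


def flag_variants(variants, index):
    """Annotate each (chrom, pos) variant with a low-mappability flag."""
    return [(chrom, pos, in_low_mappability(index, chrom, pos))
            for chrom, pos in variants]


# Hypothetical intervals and variant positions, for illustration only.
regions = [("chr7", 100, 200), ("chr7", 150, 300), ("chr21", 50, 80)]
variants = [("chr7", 250), ("chr7", 300), ("chr21", 60), ("chr1", 10)]

flags = flag_variants(variants, build_index(regions))
for chrom, pos, flagged in flags:
    print(chrom, pos, "low mappability" if flagged else "ok")
```

A flagged call is not necessarily a false positive. In line with the abstract, it would be a candidate for orthogonal validation, such as long-range PCR, rather than for automatic removal.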

Original language: English (US)
Journal: Leukemia and Lymphoma
DOI: https://doi.org/10.1080/10428194.2019.1630620
State: Published - Jan 1 2019

Fingerprint

Pseudogenes
Mutation
Genome
Exome
Neoplasms
Acute Myeloid Leukemia
Genes
Databases
Prospective Studies
Polymerase Chain Reaction

Keywords

  • AML
  • MLL3
  • pseudogenes

ASJC Scopus subject areas

  • Hematology
  • Oncology
  • Cancer Research

Cite this

Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions. / Bowler, Timothy G.; Pradhan, Kith; Kong, Yu; Bartenstein, Matthias; Morrone, Kerry A.; Shridharan, Ashwin; Kessel, Rachel M.; Shastri, Aditi; Giricz, Orsolya; Bhagat, Tushar D.; Gordon-Mitchell, Shanisha; Rohanizadegan, Mersedeh; Hooda, Lauren; Datt, Ishan; Przychodzen, Bartlomiej P.; Parmar, Simrit; Maqbool, Shahina; Maciejewski, Jaroslaw P.; Steidl, Ulrich; Greally, John M.; Verma, Amit.

In: Leukemia and Lymphoma, 01.01.2019.

Bowler, TG, Pradhan, K, Kong, Y, Bartenstein, M, Morrone, KA, Shridharan, A, Kessel, RM, Shastri, A, Giricz, O, Bhagat, TD, Gordon-Mitchell, S, Rohanizadegan, M, Hooda, L, Datt, I, Przychodzen, BP, Parmar, S, Maqbool, S, Maciejewski, JP, Steidl, U, Greally, JM & Verma, A 2019, 'Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions', Leukemia and Lymphoma. https://doi.org/10.1080/10428194.2019.1630620
Bowler, Timothy G. ; Pradhan, Kith ; Kong, Yu ; Bartenstein, Matthias ; Morrone, Kerry A. ; Shridharan, Ashwin ; Kessel, Rachel M. ; Shastri, Aditi ; Giricz, Orsolya ; Bhagat, Tushar D. ; Gordon-Mitchell, Shanisha ; Rohanizadegan, Mersedeh ; Hooda, Lauren ; Datt, Ishan ; Przychodzen, Bartlomiej P. ; Parmar, Simrit ; Maqbool, Shahina ; Maciejewski, Jaroslaw P. ; Steidl, Ulrich ; Greally, John M. ; Verma, Amit. / Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions. In: Leukemia and Lymphoma. 2019.
@article{90bb08e990ca4513873de8c1079138d3,
title = "Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions",
abstract = "The MLL3 gene has been shown to be recurrently mutated in many malignancies including in families with acute myeloid leukemia. We demonstrate that many MLL3 variant calls made by exome sequencing are false positives due to misalignment to homologous regions, including a region on chr21, and can only be validated by long-range PCR. Numerous other recurrently mutated genes reported in COSMIC and TCGA databases have pseudogenes and cannot also be validated by conventional short read-based sequencing approaches. Genome-wide identification of pseudogene regions demonstrates that frequency of these homologous regions is increased with sequencing read lengths below 200 bps. To enable identification of poor quality sequencing variants in prospective studies, we generated novel genome-wide maps of regions with poor mappability that can be used in variant calling algorithms. Taken together, our findings reveal that pseudogene regions are a source of false-positive mutations in cancers.",
keywords = "AML, MLL3, pseudogenes",
author = "Bowler, {Timothy G.} and Kith Pradhan and Yu Kong and Matthias Bartenstein and Morrone, {Kerry A.} and Ashwin Shridharan and Kessel, {Rachel M.} and Aditi Shastri and Orsolya Giricz and Bhagat, {Tushar D.} and Shanisha Gordon-Mitchell and Mersedeh Rohanizadegan and Lauren Hooda and Ishan Datt and Przychodzen, {Bartlomiej P.} and Simrit Parmar and Shahina Maqbool and Maciejewski, {Jaroslaw P.} and Ulrich Steidl and Greally, {John M.} and Amit Verma",
year = "2019",
month = "1",
day = "1",
doi = "10.1080/10428194.2019.1630620",
language = "English (US)",
journal = "Leukemia and Lymphoma",
issn = "1042-8194",
publisher = "Informa Healthcare",

}

TY - JOUR

T1 - Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions

AU - Bowler, Timothy G.

AU - Pradhan, Kith

AU - Kong, Yu

AU - Bartenstein, Matthias

AU - Morrone, Kerry A.

AU - Shridharan, Ashwin

AU - Kessel, Rachel M.

AU - Shastri, Aditi

AU - Giricz, Orsolya

AU - Bhagat, Tushar D.

AU - Gordon-Mitchell, Shanisha

AU - Rohanizadegan, Mersedeh

AU - Hooda, Lauren

AU - Datt, Ishan

AU - Przychodzen, Bartlomiej P.

AU - Parmar, Simrit

AU - Maqbool, Shahina

AU - Maciejewski, Jaroslaw P.

AU - Steidl, Ulrich

AU - Greally, John M.

AU - Verma, Amit

PY - 2019/1/1

Y1 - 2019/1/1

N2 - The MLL3 gene has been shown to be recurrently mutated in many malignancies including in families with acute myeloid leukemia. We demonstrate that many MLL3 variant calls made by exome sequencing are false positives due to misalignment to homologous regions, including a region on chr21, and can only be validated by long-range PCR. Numerous other recurrently mutated genes reported in COSMIC and TCGA databases have pseudogenes and cannot also be validated by conventional short read-based sequencing approaches. Genome-wide identification of pseudogene regions demonstrates that frequency of these homologous regions is increased with sequencing read lengths below 200 bps. To enable identification of poor quality sequencing variants in prospective studies, we generated novel genome-wide maps of regions with poor mappability that can be used in variant calling algorithms. Taken together, our findings reveal that pseudogene regions are a source of false-positive mutations in cancers.

KW - AML

KW - MLL3

KW - pseudogenes

UR - http://www.scopus.com/inward/record.url?scp=85068638138&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068638138&partnerID=8YFLogxK

U2 - 10.1080/10428194.2019.1630620

DO - 10.1080/10428194.2019.1630620

M3 - Article

AN - SCOPUS:85068638138

JO - Leukemia and Lymphoma

JF - Leukemia and Lymphoma

SN - 1042-8194

ER -