Comparing computational methods for identification of allele-specific expression based on next generation sequencing data

Zhi Liu, Jing Yang, Huayong Xu, Chao Li, Zhen Wang, Yuanyuan Li, Xiao Dong, Yixue Li

Research output: Contribution to journal (Article)

8 Citations (Scopus)

Abstract

Allele-specific expression (ASE) studies have wide-ranging implications for genome biology and medicine. Whole-transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying ASE, but it suffers from mapping bias that favors reference alleles. Two categories of methods are currently used to reduce the effect of mapping bias on ASE identification: normalizing the RNA allelic ratio by the parallel genomic allelic ratio (pDNAar), and modifying the reference genome so that reads carrying either allele have the same chance of being mapped (mREF). We compared the sensitivity and specificity of both methods on simulated data and demonstrated that pDNAar, though ideally practical, had lower sensitivity because of its lower mapping rate for reads carrying nonreference (alternative) alleles, whereas mREF achieved higher sensitivity and specificity because it maps reads carrying either allele efficiently. Applying the two methods to real sequencing data showed that mREF identified more ASE loci because of its higher mapping efficiency, and that it corrected some seemingly incorrect ASE loci identified by pDNAar, which arise from pDNAar's inefficiency in mapping reads carrying alternative alleles. Our study provides useful information for processing RNA sequencing data in the identification of ASE.
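The two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which statistical test or reference modification was used, so the exact binomial test, the masking of SNP positions with `N`, and all function names below are assumptions made purely for illustration.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def pdnaar_pvalue(rna_ref: int, rna_alt: int, dna_ref: int, dna_alt: int) -> float:
    """pDNAar-style test (assumed form, not taken from the paper).

    The expected reference-allele fraction is estimated from DNA reads
    at the same heterozygous site, so that mapping bias shared by the
    DNA and RNA data is absorbed into the null hypothesis.
    """
    dna_total = dna_ref + dna_alt
    if dna_total == 0:
        raise ValueError("no DNA reads at this site")
    expected_ref_fraction = dna_ref / dna_total
    return binomial_two_sided_p(rna_ref, rna_ref + rna_alt, expected_ref_fraction)


def mask_reference(sequence: str, snp_positions: list[int], mask: str = "N") -> str:
    """mREF-style modification (one assumed variant: masking).

    Replaces the base at each known SNP position (0-based) so that
    reads carrying either allele mismatch the reference equally.
    """
    bases = list(sequence)
    for pos in snp_positions:
        bases[pos] = mask
    return "".join(bases)


if __name__ == "__main__":
    # Illustrative counts only; they are not data from the study.
    print(pdnaar_pvalue(rna_ref=30, rna_alt=10, dna_ref=22, dna_alt=18))
    print(mask_reference("ACGTACGT", [2, 5]))
```

In this sketch the pDNAar-style function needs matched DNA read counts for every site, whereas the mREF-style function changes the reference once, before alignment; after realignment to the masked reference, an ASE test against an expected fraction of 0.5 would be a natural follow-up under the same assumptions.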

Original language: English (US)
Pages (from-to): 591-598
Number of pages: 8
Journal: Genetic Epidemiology
Volume: 38
Issue number: 7
DOI: 10.1002/gepi.21846
State: Published - Nov 1 2014

Fingerprint

Alleles
RNA Sequence Analysis
Genome
Sensitivity and Specificity
Transcriptome
Medicine
RNA

Keywords

  • Allele-specific expression
  • Next-generation sequencing
  • RNA sequencing

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)

Cite this

Comparing computational methods for identification of allele-specific expression based on next generation sequencing data. / Liu, Zhi; Yang, Jing; Xu, Huayong; Li, Chao; Wang, Zhen; Li, Yuanyuan; Dong, Xiao; Li, Yixue.

In: Genetic Epidemiology, Vol. 38, No. 7, 01.11.2014, p. 591-598.

@article{5462e09204f0472093f7df0d5052737e,
title = "Comparing computational methods for identification of allele-specific expression based on next generation sequencing data",
abstract = "Allele-specific expression (ASE) studies have wide-ranging implications for genome biology and medicine. Whole-transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying ASE, but it suffers from mapping bias that favors reference alleles. Two categories of methods are currently used to reduce the effect of mapping bias on ASE identification: normalizing the RNA allelic ratio by the parallel genomic allelic ratio (pDNAar), and modifying the reference genome so that reads carrying either allele have the same chance of being mapped (mREF). We compared the sensitivity and specificity of both methods on simulated data and demonstrated that pDNAar, though ideally practical, had lower sensitivity because of its lower mapping rate for reads carrying nonreference (alternative) alleles, whereas mREF achieved higher sensitivity and specificity because it maps reads carrying either allele efficiently. Applying the two methods to real sequencing data showed that mREF identified more ASE loci because of its higher mapping efficiency, and that it corrected some seemingly incorrect ASE loci identified by pDNAar, which arise from pDNAar's inefficiency in mapping reads carrying alternative alleles. Our study provides useful information for processing RNA sequencing data in the identification of ASE.",
keywords = "Allele-specific expression, Next-generation sequencing, RNA sequencing",
author = "Zhi Liu and Jing Yang and Huayong Xu and Chao Li and Zhen Wang and Yuanyuan Li and Xiao Dong and Yixue Li",
year = "2014",
month = "11",
day = "1",
doi = "10.1002/gepi.21846",
language = "English (US)",
volume = "38",
pages = "591--598",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "7",

}

TY - JOUR

T1 - Comparing computational methods for identification of allele-specific expression based on next generation sequencing data

AU - Liu, Zhi

AU - Yang, Jing

AU - Xu, Huayong

AU - Li, Chao

AU - Wang, Zhen

AU - Li, Yuanyuan

AU - Dong, Xiao

AU - Li, Yixue

PY - 2014/11/1

Y1 - 2014/11/1

AB - Allele-specific expression (ASE) studies have wide-ranging implications for genome biology and medicine. Whole-transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying ASE, but it suffers from mapping bias that favors reference alleles. Two categories of methods are currently used to reduce the effect of mapping bias on ASE identification: normalizing the RNA allelic ratio by the parallel genomic allelic ratio (pDNAar), and modifying the reference genome so that reads carrying either allele have the same chance of being mapped (mREF). We compared the sensitivity and specificity of both methods on simulated data and demonstrated that pDNAar, though ideally practical, had lower sensitivity because of its lower mapping rate for reads carrying nonreference (alternative) alleles, whereas mREF achieved higher sensitivity and specificity because it maps reads carrying either allele efficiently. Applying the two methods to real sequencing data showed that mREF identified more ASE loci because of its higher mapping efficiency, and that it corrected some seemingly incorrect ASE loci identified by pDNAar, which arise from pDNAar's inefficiency in mapping reads carrying alternative alleles. Our study provides useful information for processing RNA sequencing data in the identification of ASE.

KW - Allele-specific expression

KW - Next-generation sequencing

KW - RNA sequencing

UR - http://www.scopus.com/inward/record.url?scp=84908281598&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84908281598&partnerID=8YFLogxK

U2 - 10.1002/gepi.21846

DO - 10.1002/gepi.21846

M3 - Article

C2 - 25183311

AN - SCOPUS:84908281598

VL - 38

SP - 591

EP - 598

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 7

ER -