Characterization of human pseudogene-derived non-coding RNAs for functional potential

Xingyi Guo, Mingyan Lin, Shira Rockowitz, Herbert M. Lachman, Deyou Zheng

Research output: Contribution to journalArticle

26 Citations (Scopus)

Abstract

Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 pseudogenes that could produce non-coding RNAs in a manner of low abundance but high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes showed mostly positive correlations, suggesting that pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the PTEN pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription-level dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (a H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression. Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.

Original languageEnglish (US)
Article numbere93972
JournalPLoS One
Volume9
Issue number4
DOIs
StatePublished - Apr 3 2014

Fingerprint

Untranslated RNA
Pseudogenes
pseudogenes
RNA
Genes
Transcription
Tissue
Pipelines
Cells
Methyltransferases
Gene expression
Histones
Lysine
Chromatin
non-coding RNA
transcription (genetics)
Epigenomics
epigenetics
genes
cell lines

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Characterization of human pseudogene-derived non-coding RNAs for functional potential. / Guo, Xingyi; Lin, Mingyan; Rockowitz, Shira; Lachman, Herbert M.; Zheng, Deyou.

In: PLoS One, Vol. 9, No. 4, e93972, 03.04.2014.

Research output: Contribution to journalArticle

@article{40c1789ae3324ba5bd1683a504251f07,
title = "Characterization of human pseudogene-derived non-coding RNAs for functional potential",
abstract = "Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 pseudogenes that could produce non-coding RNAs in a manner of low abundance but high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes showed mostly positive correlations, suggesting that pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the PTEN pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription-level dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (a H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression. Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.",
author = "Xingyi Guo and Mingyan Lin and Shira Rockowitz and Lachman, {Herbert M.} and Deyou Zheng",
year = "2014",
month = "4",
day = "3",
doi = "10.1371/journal.pone.0093972",
language = "English (US)",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "4",

}

TY - JOUR

T1 - Characterization of human pseudogene-derived non-coding RNAs for functional potential

AU - Guo, Xingyi

AU - Lin, Mingyan

AU - Rockowitz, Shira

AU - Lachman, Herbert M.

AU - Zheng, Deyou

PY - 2014/4/3

Y1 - 2014/4/3

N2 - Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 pseudogenes that could produce non-coding RNAs in a manner of low abundance but high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes showed mostly positive correlations, suggesting that pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the PTEN pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription-level dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (a H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression. Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.

AB - Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 pseudogenes that could produce non-coding RNAs in a manner of low abundance but high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes showed mostly positive correlations, suggesting that pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the PTEN pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription-level dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (a H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression. Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.

UR - http://www.scopus.com/inward/record.url?scp=84899449348&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899449348&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0093972

DO - 10.1371/journal.pone.0093972

M3 - Article

C2 - 24699680

AN - SCOPUS:84899449348

VL - 9

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 4

M1 - e93972

ER -