Integrated pseudogene annotation for human chromosome 22: Evidence for transcription

Deyou Zheng, Zhaolei Zhang, Paul M. Harrison, John Karro, Nick Carriero, Mark Gerstein

Research output: Contribution to journal › Article

61 Citations (Scopus)

Abstract

Pseudogenes are inheritable genetic elements formally defined by two properties: their similarity to functioning genes and their presumed lack of activity. However, their precise characterization, particularly with respect to the latter quality, has proven elusive. An opportunity to explore this issue arises from the recent emergence of tiling-microarray data showing that intergenic regions (containing pseudogenes) are transcribed to a great degree. Here we focus on the transcriptional activity of pseudogenes on human chromosome 22. First, we integrated several sets of annotation to define a unified list of 525 pseudogenes on the chromosome. To characterize these further, we developed a comprehensive list of genomic features based on conservation in related organisms, expression evidence, and the presence of upstream regulatory sites. Of the 525 unified pseudogenes, we could confidently classify 154 as processed and 49 as duplicated. Using data from tiling microarrays, especially from recent high-resolution oligonucleotide arrays, we found some evidence that up to a fifth of the 525 pseudogenes are potentially transcribed. Expressed sequence tag (EST) comparison further validated a number of these, and overall we found 17 pseudogenes with strong support for transcription. In particular, one of the pseudogenes with both EST and microarray evidence for transcription turned out to be a duplicated pseudogene in the cat eye syndrome critical region. Although we could not identify a meaningful number of transcription factor-binding sites (based on chromatin immunoprecipitation-chip data) near pseudogenes, we did find that ∼12% of the pseudogenes had upstream CpG islands. Finally, analysis of corresponding syntenic regions in the mouse, rat and chimp genomes indicates, as previously suggested, that pseudogenes are less conserved than genes, but more preserved than the intergenic background (all annotation is available from http://www.pseudogene.org).
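The counts quoted in the abstract can be put side by side with a little arithmetic. The sketch below is illustrative only: the four input counts come from the abstract, while the "not confidently classified" remainder, the percentage shares, and the approximate counts implied by "up to a fifth" and "∼12%" are derived here and are not figures reported by the authors. All variable and function names are hypothetical.

```python
# Counts stated in the abstract.
TOTAL = 525            # unified pseudogenes on chromosome 22
PROCESSED = 154        # confidently classified as processed
DUPLICATED = 49        # confidently classified as duplicated
STRONG_SUPPORT = 17    # strong support for transcription


def share(count, total=TOTAL):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Derived values (assumptions of this sketch, not reported in the abstract).
not_classified = TOTAL - PROCESSED - DUPLICATED   # remainder without a confident class
max_transcribed = TOTAL // 5                      # "up to a fifth" as an upper bound
approx_cpg = round(0.12 * TOTAL)                  # "~12%" with upstream CpG islands

summary = {
    "processed_pct": share(PROCESSED),
    "duplicated_pct": share(DUPLICATED),
    "not_classified": not_classified,
    "not_classified_pct": share(not_classified),
    "strong_support_pct": share(STRONG_SUPPORT),
    "max_potentially_transcribed": max_transcribed,
    "approx_with_cpg_island": approx_cpg,
}

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}")
```

Read this way, the 17 strongly supported pseudogenes are a small subset of the roughly one hundred that the microarray data flag as potentially transcribed, which is consistent with the abstract's cautious wording.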

Original language: English (US)
Pages (from-to): 27-45
Number of pages: 19
Journal: Journal of Molecular Biology
Volume: 349
Issue number: 1
DOI: 10.1016/j.jmb.2005.02.072
State: Published - May 27 2005
Externally published: Yes

Fingerprint

Chromosomes, Human, Pair 22
Pseudogenes
Human Chromosomes
Expressed Sequence Tags
CpG Islands
Intergenic DNA
Chromatin Immunoprecipitation
Oligonucleotide Array Sequence Analysis
Genes

Keywords

  • CESCR
  • Chromosome 22
  • Microarray
  • Pseudogene
  • Transcription

ASJC Scopus subject areas

  • Virology

Cite this

Integrated pseudogene annotation for human chromosome 22: Evidence for transcription. / Zheng, Deyou; Zhang, Zhaolei; Harrison, Paul M.; Karro, John; Carriero, Nick; Gerstein, Mark.

In: Journal of Molecular Biology, Vol. 349, No. 1, 27.05.2005, p. 27-45.

@article{48b69b97218c4b3b96e062c7ca0df0ba,
title = "Integrated pseudogene annotation for human chromosome 22: Evidence for transcription",
keywords = "CESCR, Chromosome 22, Microarray, Pseudogene, Transcription",
author = "Deyou Zheng and Zhaolei Zhang and Harrison, {Paul M.} and John Karro and Nick Carriero and Mark Gerstein",
year = "2005",
month = "5",
day = "27",
doi = "10.1016/j.jmb.2005.02.072",
language = "English (US)",
volume = "349",
pages = "27--45",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Integrated pseudogene annotation for human chromosome 22

T2 - Evidence for transcription

AU - Zheng, Deyou

AU - Zhang, Zhaolei

AU - Harrison, Paul M.

AU - Karro, John

AU - Carriero, Nick

AU - Gerstein, Mark

PY - 2005/5/27

Y1 - 2005/5/27

KW - CESCR

KW - Chromosome 22

KW - Microarray

KW - Pseudogene

KW - Transcription

UR - http://www.scopus.com/inward/record.url?scp=18144417788&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18144417788&partnerID=8YFLogxK

U2 - 10.1016/j.jmb.2005.02.072

DO - 10.1016/j.jmb.2005.02.072

M3 - Article

C2 - 15876366

AN - SCOPUS:18144417788

VL - 349

SP - 27

EP - 45

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 1

ER -