SMITE: An R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information

N. Ari Wijetunga, Andrew D. Johnston, Ryo Maekawa, Fabien Delahaye, Netha Ulahannan, Kami Kim, John M. Greally

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Background: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution.

Results: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data.

Conclusions: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states.
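The Results section of the abstract describes combining p-values from non-independent measurements and weighting each data type. The abstract does not state which combination rule SMITE uses, so the following is only a minimal sketch of one standard technique that fits that description: a weighted Stouffer Z-score combination with a correlation adjustment. The function name `combine_pvalues`, its parameters, and the example numbers are assumptions made for illustration and are not part of the SMITE package.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def combine_pvalues(pvalues, weights, corr=None):
    """Combine one-sided p-values with a weighted Stouffer Z-score.

    Hypothetical helper for illustration; not the SMITE API.
    pvalues: p-values strictly between 0 and 1.
    weights: relative contribution of each p-value (e.g. per data type).
    corr:    optional correlation matrix (list of lists) between the
             underlying z-scores; independence is assumed if omitted.
    """
    n = len(pvalues)
    if len(weights) != n:
        raise ValueError("pvalues and weights must have the same length")
    if corr is None:
        # Identity matrix: treat the inputs as independent.
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Convert each p-value to a z-score (small p gives large positive z).
    z = [_STD_NORMAL.inv_cdf(1.0 - p) for p in pvalues]

    # The weighted sum of z-scores has variance w^T R w under the null.
    numerator = sum(w * zi for w, zi in zip(weights, z))
    variance = sum(
        weights[i] * weights[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    combined_z = numerator / variance ** 0.5

    # Convert the combined z-score back to a one-sided p-value.
    return 1.0 - _STD_NORMAL.cdf(combined_z)


if __name__ == "__main__":
    # Two hypothetical signals for one gene, e.g. expression and methylation.
    independent = combine_pvalues([0.05, 0.05], [1.0, 1.0])
    correlated = combine_pvalues(
        [0.05, 0.05], [1.0, 1.0], corr=[[1.0, 0.5], [0.5, 1.0]]
    )
    print(independent, correlated)
```

In this sketch, positive correlation between the inputs inflates the variance term, so two correlated p-values yield weaker combined evidence than two independent ones. Raising a weight increases the influence of that data type on the combined score.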

Original language: English (US)
Article number: 41
Journal: BMC Bioinformatics
Volume: 18
Issue number: 1
DOI: 10.1186/s12859-017-1477-3
State: Published - Jan 18 2017

Keywords

  • Bioinformatics
  • Epigenetic
  • Gene expression
  • Genomic
  • Interaction network
  • Modules

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

SMITE: An R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information. / Wijetunga, N. Ari; Johnston, Andrew D.; Maekawa, Ryo; Delahaye, Fabien; Ulahannan, Netha; Kim, Kami; Greally, John M.

In: BMC Bioinformatics, Vol. 18, No. 1, 41, 18.01.2017.

@article{eb638203e3474a25bc14ae0d9f1f2930,
title = "SMITE: An R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information",
abstract = "Background: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution. Results: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data. Conclusions: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states.",
keywords = "Bioinformatics, Epigenetic, Gene expression, Genomic, Interaction network, Modules",
author = "Wijetunga, {N. Ari} and Johnston, {Andrew D.} and Ryo Maekawa and Fabien Delahaye and Netha Ulahannan and Kami Kim and Greally, {John M.}",
year = "2017",
month = "1",
day = "18",
doi = "10.1186/s12859-017-1477-3",
language = "English (US)",
volume = "18",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - SMITE

T2 - An R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information

AU - Wijetunga, N. Ari

AU - Johnston, Andrew D.

AU - Maekawa, Ryo

AU - Delahaye, Fabien

AU - Ulahannan, Netha

AU - Kim, Kami

AU - Greally, John M.

PY - 2017/1/18

Y1 - 2017/1/18

AB - Background: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution. Results: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data. Conclusions: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states.

KW - Bioinformatics

KW - Epigenetic

KW - Gene expression

KW - Genomic

KW - Interaction network

KW - Modules

UR - http://www.scopus.com/inward/record.url?scp=85009809651&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009809651&partnerID=8YFLogxK

U2 - 10.1186/s12859-017-1477-3

DO - 10.1186/s12859-017-1477-3

M3 - Article

C2 - 28100166

AN - SCOPUS:85009809651

VL - 18

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 1

M1 - 41

ER -