Computational methods using genome-wide association studies to predict radiotherapy complications and to identify correlative molecular processes

Jung Hun Oh, Sarah Kerns, Harry Ostrer, Simon N. Powell, Barry Rosenstein, Joseph O. Deasy

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

The biological cause of clinically observed variability of normal tissue damage following radiotherapy is poorly understood. We hypothesized that machine/statistical learning methods using single nucleotide polymorphism (SNP)-based genome-wide association studies (GWAS) would identify groups of patients of differing complication risk, and furthermore could be used to identify key biological sources of variability. We developed a novel learning algorithm, called pre-conditioned random forest regression (PRFR), to construct polygenic risk models using hundreds of SNPs, thereby capturing genomic features that confer small differential risk. Predictive models were trained and validated on a cohort of 368 prostate cancer patients for two post-radiotherapy clinical endpoints: late rectal bleeding and erectile dysfunction. The proposed method results in better predictive performance compared with existing computational methods. Gene ontology enrichment analysis and protein-protein interaction network analysis are used to identify key biological processes and proteins that were plausible based on other published studies. In conclusion, we confirm that novel machine learning methods can produce large predictive models (hundreds of SNPs), yielding clinically useful risk stratification models, as well as identifying important underlying biological processes in the radiation damage and tissue repair process. The methods are generally applicable to GWAS data and are not specific to radiotherapy endpoints.

Original language: English (US)
Article number: 43381
Journal: Scientific Reports
Volume: 7
DOI: 10.1038/srep43381
State: Published - Feb 24 2017

Fingerprint

Genome-Wide Association Study
Radiotherapy
Single Nucleotide Polymorphism
Biological Phenomena
Protein Interaction Maps
Gene Ontology
Erectile Dysfunction
Prostatic Neoplasms
Proteins
Learning
Radiation
Hemorrhage
Machine Learning

ASJC Scopus subject areas

  • General

Cite this

Computational methods using genome-wide association studies to predict radiotherapy complications and to identify correlative molecular processes. / Oh, Jung Hun; Kerns, Sarah; Ostrer, Harry; Powell, Simon N.; Rosenstein, Barry; Deasy, Joseph O.

In: Scientific Reports, Vol. 7, 43381, 24.02.2017.

@article{f01d4dba4d3a4804a0e18dabd865b931,
title = "Computational methods using genome-wide association studies to predict radiotherapy complications and to identify correlative molecular processes",
abstract = "The biological cause of clinically observed variability of normal tissue damage following radiotherapy is poorly understood. We hypothesized that machine/statistical learning methods using single nucleotide polymorphism (SNP)-based genome-wide association studies (GWAS) would identify groups of patients of differing complication risk, and furthermore could be used to identify key biological sources of variability. We developed a novel learning algorithm, called pre-conditioned random forest regression (PRFR), to construct polygenic risk models using hundreds of SNPs, thereby capturing genomic features that confer small differential risk. Predictive models were trained and validated on a cohort of 368 prostate cancer patients for two post-radiotherapy clinical endpoints: late rectal bleeding and erectile dysfunction. The proposed method results in better predictive performance compared with existing computational methods. Gene ontology enrichment analysis and protein-protein interaction network analysis are used to identify key biological processes and proteins that were plausible based on other published studies. In conclusion, we confirm that novel machine learning methods can produce large predictive models (hundreds of SNPs), yielding clinically useful risk stratification models, as well as identifying important underlying biological processes in the radiation damage and tissue repair process. The methods are generally applicable to GWAS data and are not specific to radiotherapy endpoints.",
author = "Oh, {Jung Hun} and Sarah Kerns and Harry Ostrer and Powell, {Simon N.} and Barry Rosenstein and Deasy, {Joseph O.}",
year = "2017",
month = "2",
day = "24",
doi = "10.1038/srep43381",
language = "English (US)",
volume = "7",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - Computational methods using genome-wide association studies to predict radiotherapy complications and to identify correlative molecular processes

AU - Oh, Jung Hun

AU - Kerns, Sarah

AU - Ostrer, Harry

AU - Powell, Simon N.

AU - Rosenstein, Barry

AU - Deasy, Joseph O.

PY - 2017/2/24

Y1 - 2017/2/24

AB - The biological cause of clinically observed variability of normal tissue damage following radiotherapy is poorly understood. We hypothesized that machine/statistical learning methods using single nucleotide polymorphism (SNP)-based genome-wide association studies (GWAS) would identify groups of patients of differing complication risk, and furthermore could be used to identify key biological sources of variability. We developed a novel learning algorithm, called pre-conditioned random forest regression (PRFR), to construct polygenic risk models using hundreds of SNPs, thereby capturing genomic features that confer small differential risk. Predictive models were trained and validated on a cohort of 368 prostate cancer patients for two post-radiotherapy clinical endpoints: late rectal bleeding and erectile dysfunction. The proposed method results in better predictive performance compared with existing computational methods. Gene ontology enrichment analysis and protein-protein interaction network analysis are used to identify key biological processes and proteins that were plausible based on other published studies. In conclusion, we confirm that novel machine learning methods can produce large predictive models (hundreds of SNPs), yielding clinically useful risk stratification models, as well as identifying important underlying biological processes in the radiation damage and tissue repair process. The methods are generally applicable to GWAS data and are not specific to radiotherapy endpoints.

UR - http://www.scopus.com/inward/record.url?scp=85013851581&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013851581&partnerID=8YFLogxK

U2 - 10.1038/srep43381

DO - 10.1038/srep43381

M3 - Article

VL - 7

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

M1 - 43381

ER -