An automated data analysis pipeline for GC-TOF-MS metabonomics studies

Wenxin Jiang, Yunping Qiu, Yan Ni, Mingming Su, Wei Jia, Xiuxia Du

Research output: Contribution to journal › Article

35 Citations (Scopus)

Abstract

Recent technological advances have made it possible to carry out high-throughput metabonomics studies using gas chromatography coupled with time-of-flight mass spectrometry (GC-TOF-MS). These studies produce large volumes of data, creating a pressing need for algorithms that can process and analyze the data in an equally high-throughput fashion. We present an Automated Data Analysis Pipeline (ADAP) developed for this purpose. ADAP consists of four steps: peak detection, deconvolution, peak alignment, and library search. Data flow through these steps without human intervention. The pipeline features two novel algorithms: clustering is applied in deconvolution to resolve coeluting compounds, which are very common in complex samples, and a two-phase alignment process enhances alignment accuracy. ADAP is written in standard C++ and R and uses parallel computing via the Message Passing Interface (MPI) for fast peak detection and deconvolution. Applied to both mixed-standard samples and serum samples, ADAP successfully identified and quantified metabolites. ADAP is available at http://www.du-lab.org.
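
The idea of using clustering to separate coeluting compounds can be illustrated with a minimal sketch. This is not ADAP's algorithm, which the abstract does not specify. It assumes that peak detection has already produced one apex per extracted ion chromatogram peak, and that fragments of the same compound share nearly the same apex retention time. The names `EicPeak`, `cluster_by_apex`, `component_spectrum`, and the `max_gap` tolerance are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EicPeak:
    """One detected peak in an extracted ion chromatogram (hypothetical type)."""
    mz: float       # mass-to-charge ratio of the chromatogram
    apex_rt: float  # retention time of the peak apex, in seconds
    height: float   # intensity at the apex


def cluster_by_apex(peaks: List[EicPeak], max_gap: float = 0.5) -> List[List[EicPeak]]:
    """Group peaks by apex retention time.

    Peaks are sorted by apex time; a new cluster starts whenever the gap to
    the previous apex exceeds max_gap (an assumed tolerance, in seconds).
    """
    clusters: List[List[EicPeak]] = []
    for peak in sorted(peaks, key=lambda p: p.apex_rt):
        if clusters and peak.apex_rt - clusters[-1][-1].apex_rt <= max_gap:
            clusters[-1].append(peak)
        else:
            clusters.append([peak])
    return clusters


def component_spectrum(cluster: List[EicPeak]) -> List[Tuple[float, float]]:
    """Return the (m/z, height) pairs of one cluster, sorted by m/z."""
    return sorted((p.mz, p.height) for p in cluster)


if __name__ == "__main__":
    # Two compounds eluting about 1.3 s apart, each contributing several fragments.
    peaks = [
        EicPeak(mz=73.0, apex_rt=100.0, height=900.0),
        EicPeak(mz=147.0, apex_rt=100.1, height=400.0),
        EicPeak(mz=217.0, apex_rt=100.2, height=150.0),
        EicPeak(mz=73.0, apex_rt=101.5, height=700.0),
        EicPeak(mz=205.0, apex_rt=101.6, height=300.0),
    ]
    for i, cluster in enumerate(cluster_by_apex(peaks), start=1):
        print(f"component {i}: {component_spectrum(cluster)}")
```

Each resulting cluster is a putative compound whose spectrum could then be passed to a library search. A gap-based grouping is the simplest possible choice; a real pipeline would likely also compare peak shapes, since apex time alone cannot separate compounds that elute almost simultaneously.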

Original language: English (US)
Pages (from-to): 5974-5981
Number of pages: 8
Journal: Journal of Proteome Research
Volume: 9
Issue number: 11
DOI: 10.1021/pr1007703
State: Published - Nov 5 2010
Externally published: Yes

Fingerprint

Metabolomics
Pipelines
Deconvolution
Throughput
Message passing
Parallel processing systems
Metabolites
Gas chromatography
Mass spectrometry
Libraries
Cluster Analysis
Serum

Keywords

  • alignment
  • clustering analysis
  • deconvolution
  • extracted ion chromatogram
  • gas chromatography-mass spectrometry

ASJC Scopus subject areas

  • Biochemistry
  • Chemistry (all)

Cite this

An automated data analysis pipeline for GC-TOF-MS metabonomics studies. / Jiang, Wenxin; Qiu, Yunping; Ni, Yan; Su, Mingming; Jia, Wei; Du, Xiuxia.

In: Journal of Proteome Research, Vol. 9, No. 11, 05.11.2010, p. 5974-5981.

@article{ebfbe6926cee4694bc4c1c6d84aca60c,
title = "An automated data analysis pipeline for GC-TOF-MS metabonomics studies",
keywords = "alignment, clustering analysis, deconvolution, extracted ion chromatogram, gas chromatography-mass spectrometry",
author = "Wenxin Jiang and Yunping Qiu and Yan Ni and Mingming Su and Wei Jia and Xiuxia Du",
year = "2010",
month = "11",
day = "5",
doi = "10.1021/pr1007703",
language = "English (US)",
volume = "9",
pages = "5974--5981",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "11",

}
