Using SAGA and the open science grid to search for aptamers

Kevin Shieh, Pilib Ó Broin, David Rhee, Matthew Levy, Aaron Golden

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

RNA aptamers are small oligonucleotide molecules whose composition and resulting folded structure enable them to bind with high affinity and high selectivity to target ligands and therefore hold great promise as potential therapeutic drugs. Functional aptamers are selected from a large, randomized initial library in a process known as SELEX (systematic evolution of ligands by exponential enrichment). This is an iterative process involving numerous rounds of binding, elution, and amplification against a specific target substrate. During each iteration, or round of selection, we enrich for the species with the highest binding affinity to the target. After multiple rounds, we ideally have an enriched aptamer library suitable for subsequent investigation. Modern techniques employ massively parallel sequencing, enabling the generation of large libraries (~10^6 sequences) in a matter of hours for each round of selection. As RNA is single-stranded, covariance models (CMs) are ideal for representing motifs in their secondary structures, allowing us to discover patterns within functional aptamer populations following each round. CMs have been implemented in Infernal, a program that infers RNA alignments based on RNA sequence and structure. Calibrating a single CM in Infernal can take several hours and is a significant performance bottleneck for our work. However, as each CM calculation is itself independently determined and requires defined processing and memory resources, their computation in parallel offers a potential solution to this problem. In this paper, we describe using the Open Science Grid (OSG) to facilitate the identification of aptamer motifs by running CM calibrations and refinements in parallel across up to ten OSG clients. We use the Simple API for Grid Applications (SAGA) to interface with OSG and manage job submissions and file transfers. When run in parallel, our results show a significant speedup, constrained by typical latencies and QoS associated with nominal OSG usage. Our work demonstrates the ability of SAGA and the OSG to assist in parallelizing solutions to complex sequencing-based biomedical challenges.
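
The parallelization idea in the abstract (independent CM calibrations dispatched concurrently across up to ten clients) can be illustrated with a minimal local sketch. This is not the authors' code and it uses neither SAGA nor OSG: a Python thread pool stands in for the grid clients, and `calibrate_stub`, `run_independent`, and the file names are hypothetical placeholders introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Assumption: mirrors the "up to ten OSG clients" mentioned in the abstract.
MAX_CLIENTS = 10


def calibrate_stub(cm_name):
    """Hypothetical stand-in for one CM calibration.

    It sleeps briefly instead of running Infernal, so the sketch
    stays runnable without any external software.
    """
    time.sleep(0.05)
    return cm_name + ".calibrated"


def run_independent(cm_names, worker=calibrate_stub, max_workers=MAX_CLIENTS):
    """Run one worker call per CM concurrently and return {cm_name: result}.

    Each task depends only on its own input, which is the property
    that makes the calibrations safe to run in parallel.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with cm_names.
        results = list(pool.map(worker, cm_names))
    return dict(zip(cm_names, results))


if __name__ == "__main__":
    names = ["motif_%02d.cm" % i for i in range(20)]  # hypothetical file names
    start = time.perf_counter()
    out = run_independent(names)
    elapsed = time.perf_counter() - start
    print(len(out), "stub calibrations finished in", round(elapsed, 2), "s")
```

On a real grid, each worker call would instead submit a job and wait for its output files, so queueing and transfer latencies (the constraints the abstract mentions) would add to the per-task time.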

Original language: English (US)
Title of host publication: Proceedings of the XSEDE 2014 Conference
Subtitle of host publication: Engaging Communities
Publisher: Association for Computing Machinery
ISBN (Print): 9781450328937
DOIs: https://doi.org/10.1145/2616498.2616517
Publication status: Published - Jan 1 2014
Event: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014 - Atlanta, GA, United States
Duration: Jul 13 2014 → Jul 18 2014

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014
Country: United States
City: Atlanta, GA
Period: 7/13/14 → 7/18/14

Keywords

  • Aptamers
  • HTCondor
  • High-throughput computing
  • Open science grid
  • Parallel computing
  • SAGA

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Shieh, K., Broin, P. Ó., Rhee, D., Levy, M., & Golden, A. (2014). Using SAGA and the open science grid to search for aptamers. In Proceedings of the XSEDE 2014 Conference: Engaging Communities [27] (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/2616498.2616517