A combined computational and experimental approach to the evolution and role of the DNA sequence environment in targeting mutations to antibody V regions

Project: Research project


Project summary

There is a fundamental gap in our understanding of how mutations are preferentially targeted to the variable (V) regions of the immunoglobulin (Ig) loci during somatic hypermutation (SHM). The persistence of this gap has limited our understanding of the mutagenic mechanisms involving activation-induced deaminase (AID) in the immune response and of the role of AID in mis-targeting mutations that lead to B-cell lymphomas and other cancers. The long-term goal of the proposed research is to understand the global targeting of the mutations in immunity that are required to protect us from infections. As high-throughput data from human antibody immune responses have become available, they have provided new opportunities to generate hypotheses that explain the underlying mechanisms of SHM. We now propose to generate further hypotheses using computational models applied to additional databases and to validate these hypotheses using cellular and animal experiments. Our objective is to understand what directs SHM across the many human Ig heavy chain V regions. Our central hypothesis is that the V-region SHM process is highly dependent on one or more DNA sequence signatures that drive mutations in a largely deterministic fashion. This hypothesis is supported by our preliminary results using human in vivo data from a few human V region genes and has begun to be validated using independent databases and experiments in human B cell lines. The rationale is that evaluations of computational data based upon biological mechanisms, together with appropriate biological experiments, will reveal the key differences between IGHV regions (IGHV 3-23, 4-34, 1-18, 1-02, etc.) that lead to the dominance of each of those V regions in the responses to medically important antigens.
Our hypothesis will be tested by pursuing two specific aims: 1) identify the extent to which a DNA signature determines the mutation process in four individual human IGHV genes that are important in disease responses; 2) examine the relationship between AID hotspots and Polη hotspots across all the other human V region genes, thus rigorously defining a mutation targeting signature. Both aims will also entail studying human V region genes, and modifications of them, in human cell lines and in mice expressing a human V region to further confirm the signature and identify molecular mechanisms in vivo. Our approach is innovative because the computational models we are proposing will be mechanistically motivated, focusing on the interaction between AID and Polη hotspots, and will thus test molecular mechanisms, as opposed to classic statistical models that use whole V region sequences and ignore the underlying biology. In addition, to focus on mechanisms, we will leverage new high-throughput data from human V regions that have not undergone antigen selection. Our results will be highly relevant to human IgV repertoire analyses from immune responses that are currently hard to interpret, will help future vaccine and therapeutic antibody development, and will help to explain mutations in human malignancies where AID plays a key role.
Effective start/end date: 3/1/18 to 2/28/23


  • National Institutes of Health: $603,890.00


Immunoglobulin Variable Region
Cell Line
Immunoglobulin Heavy Chains
B-Cell Lymphoma
Statistical Models
Antibody Formation
AICDA (activation-induced cytidine deaminase)


  • Medicine(all)
  • Immunology and Microbiology(all)