Complete Genome Phasing of Family Quartet by Combination of Genetic, Physical and Population-Based Phasing Analysis

Julien Lajugie, Rituparna Mukhopadhyay, Michael Schizas, Nathalie Lailler, Nicolas Fourel, Eric E. Bouhassira

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Phased genome maps are important for understanding genetic and epigenetic regulation and disease mechanisms, particularly parental imprinting defects. Phasing is also critical for assessing the functional consequences of genetic variants and for precisely defining haplotype blocks, which is useful for understanding gene flow and genotype-phenotype association at the population level. Transmission phasing by analysis of a family quartet allows the phasing of 95% of all variants, because the uniformly heterozygous positions cannot be phased. Here, we report a phasing method based on a combination of transmission analysis, physical phasing by paired-end sequencing of libraries of staggered sizes, and population-based analysis. Sequencing of a healthy Caucasian quartet at 120x coverage and combination of physical and transmission phasing yielded the phased genotypes of about 99.8% of the SNPs, indels and structural variants present in the quartet, a phasing rate significantly higher than what can be achieved using any single phasing method. A false positive SNP error rate below 10*E-7 per genome and per base was obtained using a combination of filters. We provide a complete list of SNPs, indels and structural variants, an analysis of haplotype block sizes, and an analysis of the false positive and false negative variant calling error rates. Improved genome phasing and family sequencing will increase the power of genome-wide sequencing as a clinical diagnosis tool and have myriad basic science applications.
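The limitation of transmission phasing mentioned in the abstract can be illustrated with a minimal per-site sketch. This is not the authors' pipeline: the function names `phase_child` and `phase_quartet_site`, the tuple encoding of genotypes (0 = reference allele, 1 = alternate allele), and the site-by-site treatment are all assumptions made for illustration. A full quartet analysis would also use inheritance information across neighbouring sites, so this sketch is stricter than the method described.

```python
from typing import Dict, Optional, Tuple

Genotype = Tuple[int, int]  # unordered allele pair at one biallelic site (assumed encoding)


def phase_child(father: Genotype, mother: Genotype,
                child: Genotype) -> Optional[Tuple[int, int]]:
    """Return (paternal_allele, maternal_allele) for the child.

    Returns None when the site is ambiguous (both assignments are
    possible) or when the genotypes are not Mendelian-consistent.
    """
    a, b = child
    candidates = set()
    # Try both ways of assigning the child's two alleles to the parents.
    for paternal, maternal in ((a, b), (b, a)):
        if paternal in father and maternal in mother:
            candidates.add((paternal, maternal))
    return candidates.pop() if len(candidates) == 1 else None


def phase_quartet_site(father: Genotype, mother: Genotype,
                       child1: Genotype,
                       child2: Genotype) -> Dict[str, Optional[Tuple[int, int]]]:
    """Apply per-site transmission phasing to both children of a quartet."""
    return {
        "child1": phase_child(father, mother, child1),
        "child2": phase_child(father, mother, child2),
    }


if __name__ == "__main__":
    # Informative site: the father is homozygous, so the child's
    # alternate allele must have come from the mother.
    print(phase_quartet_site((0, 0), (0, 1), (0, 1), (0, 0)))
    # Uniformly heterozygous site: neither child can be phased here.
    print(phase_quartet_site((0, 1), (0, 1), (0, 1), (0, 1)))
```

At a uniformly heterozygous site both allele assignments remain possible for each child, which is why transmission analysis alone leaves such positions unphased and why physical or population-based evidence is needed to resolve them.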

Original language: English (US)
Article number: e64571
Journal: PLoS One
Volume: 8
Issue number: 5
DOI: 10.1371/journal.pone.0064571
State: Published - May 31 2013

Fingerprint

Population Genetics
Genes
Genome
Single Nucleotide Polymorphism
Haplotypes
Genomic Imprinting
Gene Flow
DNA libraries
Genetic Association Studies
Population Density
Epigenomics
epigenetics
Libraries
Genotype
methodology

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Complete Genome Phasing of Family Quartet by Combination of Genetic, Physical and Population-Based Phasing Analysis. / Lajugie, Julien; Mukhopadhyay, Rituparna; Schizas, Michael; Lailler, Nathalie; Fourel, Nicolas; Bouhassira, Eric E.

In: PLoS One, Vol. 8, No. 5, e64571, 31.05.2013.

@article{0c2fb391b05b4334a34c1498e443c15a,
title = "Complete Genome Phasing of Family Quartet by Combination of Genetic, Physical and Population-Based Phasing Analysis",
author = "Julien Lajugie and Rituparna Mukhopadhyay and Michael Schizas and Nathalie Lailler and Nicolas Fourel and Bouhassira, {Eric E.}",
year = "2013",
month = "5",
day = "31",
doi = "10.1371/journal.pone.0064571",
language = "English (US)",
volume = "8",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "5",

}

TY - JOUR
T1 - Complete Genome Phasing of Family Quartet by Combination of Genetic, Physical and Population-Based Phasing Analysis
AU - Lajugie, Julien
AU - Mukhopadhyay, Rituparna
AU - Schizas, Michael
AU - Lailler, Nathalie
AU - Fourel, Nicolas
AU - Bouhassira, Eric E.
PY - 2013/5/31
Y1 - 2013/5/31
UR - http://www.scopus.com/inward/record.url?scp=84878618900&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878618900&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0064571
DO - 10.1371/journal.pone.0064571
M3 - Article
VL - 8
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 5
M1 - e64571
ER -