
Hematology and Hemostasis

TOPMed based Imputation in Minority Samples

Authors
Madeline Kowalski, Huijun Qian, Ziyi Hou, Jonathan D. Rosen, Laura M. Raffield, Robert Kaplan, Eric Boerwinkle, Kari E. North, Charles Kooperberg, James G. Wilson, Alex P. Reiner, Yun Li, on behalf of the TOPMed Hematology and Hemostasis Working Group
Name and Date of Professional Meeting
American Society of Human Genetics, October 18, 2018
Associated paper proposal(s)
Working Group(s)
Abstract Text
Background: The NIH/NHLBI Trans-Omics for Precision Medicine (TOPMed) Project generated deep-coverage whole genome sequencing (WGS) on >50,000 individuals from diverse ancestral backgrounds. We anticipated TOPMed sequencing data would improve genotype imputation, particularly for rarer variants and in minority populations.

Methods: We performed imputation with minimac4 using TOPMed data as the reference for individuals from the Jackson Heart Study (JHS, all African Americans [AA]) and the Hispanic Community Health Study/Study of Latinos (HCHS/SOL, all Hispanic/Latino [HL]). For imputation of JHS subjects, we excluded them from the TOPMed data and used the remaining subjects as the reference. Imputation quality was evaluated in 3082 JHS participants at all TOPMed variants not overlapping those on Affymetrix 6.0, and in 12,803 SOL individuals at all imputed MegaArray markers. We used estimated r2 for post-imputation quality control (QC) and dosage r2, also called true r2 (the squared Pearson correlation between imputed dosages and true genotypes), for quality assessment. We compared performance against imputation using the Haplotype Reference Consortium (HRC) or 1000 Genomes phase 3 (1000G) alone as the reference.
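
As an illustration of the quality metric defined in the Methods, the following minimal Python sketch computes the squared Pearson correlation between imputed dosages and true genotypes for a single marker, along with the sample minor allele frequency. The function names and the toy data are assumptions made for this example; they are not taken from the authors' pipeline or from minimac4.

```python
def dosage_r2(dosages, genotypes):
    """Squared Pearson correlation between imputed dosages (values in 0..2)
    and true genotypes (0, 1 or 2 copies of the alternate allele)."""
    n = len(dosages)
    if n != len(genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_d = sum(dosages) / n
    mean_g = sum(genotypes) / n
    cov = sum((d - mean_d) * (g - mean_g) for d, g in zip(dosages, genotypes))
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    if var_d == 0 or var_g == 0:
        # The correlation is undefined when either vector is constant.
        return float("nan")
    return (cov * cov) / (var_d * var_g)


def minor_allele_frequency(genotypes):
    """Sample MAF from alternate-allele counts in diploid individuals."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


# Hypothetical toy marker: four individuals.
true_genotypes = [0, 1, 2, 0]
imputed_dosages = [0.1, 0.9, 1.8, 0.0]
r2 = dosage_r2(imputed_dosages, true_genotypes)
maf = minor_allele_frequency(true_genotypes)
```

In a real evaluation this calculation would be repeated per marker and averaged within MAF or MAC bins, which is how the per-bin averages in the Results are most naturally read.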

Results: In JHS, 51 million (M) markers were well-imputed with standard/lenient QC, including 13.1M with sample minor allele frequency (MAF) <0.05%; in SOL, 60M markers were well-imputed (28M with MAF <0.05%). In contrast, approximately 25M (7M with MAF <0.05%) and 30M (8M with MAF <0.05%) markers were well-imputed with HRC and 1000G, respectively.

The average dosage r2 for markers with sample MAF <0.05% exceeded 82% (JHS) and 66% (SOL) with standard/lenient QC, and exceeded 87% (JHS) and 78% (SOL) with estimated r2 threshold of 0.8. Towards the rare extreme, in JHS, 39% of markers with TOPMed minor allele count (MAC) 10-20 can be well imputed, with average true r2 77% for sample/JHS singletons, and >80% (80-97%) when JHS MAC >1.

Compared with standard reference panels, TOPMed resulted in many more well-imputed rare variants and in higher imputation quality for these rare variants. For example, TOPMed increased the number of well imputed variants with sample MAF <0.05% by >3x and 6x, with 17-20% and 16-24% improvement in average dosage r2 for markers imputed by both panels, compared to 1000G and HRC, respectively.

Conclusion: TOPMed proves to be a much better imputation reference panel for minority populations, in terms of both the number of imputable variants and the quality of the imputed variants.

Leveraging whole genome sequencing to identify novel determinants of platelet function

Authors
Benjamin A.T. Rodriguez, Ali R. Keramati, Lisa Yanek, Ming-Huei Chen, Kathleen Ryan, Brady Gaynor, Jennifer A. Brody, Nauder Faraday, Lewis Becker, Joshua Lewis, Andrew D. Johnson, Rasika Mathias
Name and Date of Professional Meeting
ASHG (October 2018)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Activated platelets provide the link between inflammation, thrombosis, and atherosclerotic cardiovascular disease. Platelet reactivity is highly heritable, yet the number of previously identified loci is limited, and these loci explain a relatively small portion of the estimated heritability. Leveraging the scientific resources of TOPMed, we here report the first association study of platelet aggregation in response to a variety of physiological stimuli using whole genome sequencing (WGS) data. Three extensively phenotyped studies of platelet function, GeneSTAR, the Framingham Heart Study and the Old Order Amish Study, collaborated to (1) refine previously identified GWAS loci and (2) identify novel loci that determine platelet aggregation in response to different doses of collagen, ADP, and epinephrine. Tests for association using a 2-stage inverse-normal transformation after adjusting for age, sex and study were performed for 19 harmonized platelet aggregation phenotypes and ~69.6M variants within a multi-ethnic mega-analysis framework. We identified 19 novel, independent loci reaching genome-wide significance (P<5E-8), two of which may impact clinically actionable genes: 1p36 (P=1.04E-8, MAF=0.077, PINK1) and 1q31 (P=1.96E-9, MAF=0.442, RGS18). Previous knockout studies in mice suggest that RGS18 acts as a brake on persistent or inappropriate platelet activation. PINK1-null mice have previously been shown to have increased platelet reactivity and thrombosis. We developed an innovative approach for thresholding variant effect prediction in the gene-based SKAT framework to further investigate low-frequency or rare coding variants and identified five genes reaching genome-wide significance: SVEP1, CDNF, BCO1, NELFA, IDH3A. Our results for the SVEP1 gene, a risk locus for coronary artery disease including myocardial infarction, are driven by a missense coding variant and thus provide a testable biological mechanism for SVEP1 in heart attack.
Finally, low-frequency or rare non-coding variant SKAT of cell-lineage specific epigenetic regulatory maps identified a megakaryocyte super-enhancer region near the platelet factor gene PEAR1, a known locus of common non-coding variants for platelet reactivity. This indicates that the PEAR1 locus is more functionally complex than previously understood. WGS data coupled with innovative analytical strategies have yielded new loci and a better understanding of the determinants of platelet aggregation.
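
The abstract mentions a 2-stage inverse-normal transformation after covariate adjustment but does not spell out the procedure. The sketch below shows one common reading, offered as an assumption: first regress the phenotype on covariates and keep the residuals, then map the ranked residuals to standard normal quantiles. For brevity it adjusts for a single covariate with simple least squares; the function names, the rank offset `c=0.5`, and the toy data are hypothetical choices for illustration, not the authors' implementation.

```python
from statistics import NormalDist, mean


def residualize(y, x):
    """Residuals of y after simple least-squares regression on one covariate x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def inverse_normal_transform(values, c=0.5):
    """Rank-based inverse-normal transform: z = Phi^-1((rank - c) / (n - 2c + 1))."""
    n = len(values)
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1))
            for r in average_ranks(values)]


# Hypothetical toy data: an aggregation phenotype and age for six participants.
aggregation = [62.0, 35.5, 80.1, 47.3, 71.9, 55.0]
age = [41, 63, 35, 58, 44, 50]

stage1_residuals = residualize(aggregation, age)              # adjust for the covariate
transformed = inverse_normal_transform(stage1_residuals)      # rank-normalize residuals
```

Under this reading, the transformed residuals would then serve as the outcome in the association model, in some formulations with the covariates included again at that second stage.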