
Near-optimal trans-ethnic association and fine mapping of smoking associated genes integrating GWAS and TOPMed sequence data of 1.3 million individuals

Authors
Y. Jiang; TOPMed smoking working group and GSCAN consortium
Name and Date of Professional Meeting
the American Society of Human Genetics annual meeting
Associated paper proposal(s)
Working Group(s)
Abstract Text
Tobacco use is a heritable risk factor for numerous diseases, and 353 genes associated with it have been identified in European samples. Yet its genetic architecture in non-European populations remains elusive. To address this, we assembled TOPMed whole-genome sequences of ~150,000 individuals from diverse US populations, as well as GWAS data from up to 1.2 million individuals. We studied four smoking phenotypes: smoking initiation, cigarettes per day, smoking cessation, and age of smoking initiation.

To analyze these large and diverse datasets, we developed a novel mixed-effect meta-regression method for near-optimal trans-ethnic meta-analysis (MEMO). MEMO summarizes the ancestry of each study using principal components of genome-wide allele frequencies. It models between-study heterogeneity in genetic effects that is due to differences in genetic ancestry as fixed effects, and heterogeneity that is due to non-ancestry differences in exposure as random effects. For each SNP, MEMO adaptively selects the fixed and random effects that best model the genetic effect heterogeneity. It thus combines the strengths of fixed-effect meta-analysis, random-effect meta-analysis, and meta-regression. In simulations, MEMO is consistently the most powerful method (or close to the most powerful) across a wide variety of scenarios, even when the simulated disease model favors alternative methods. We further extend MEMO to fine mapping, where it can distinguish causal variants with homogeneous effects from those with ancestry-specific effects. Because it models multi-ethnic genetic effects more accurately, MEMO considerably improves fine-mapping resolution. Simulations show that the method is well calibrated; on average, the posterior probability of association it estimates for causal variants is 50% higher, and its 95% credible interval for causal variants is ~33% shorter, than those of alternative trans-ethnic fine-mapping methods.
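The general idea of a mixed-effect meta-regression for a single SNP can be illustrated with a minimal sketch. This is not the MEMO implementation: it is a generic weighted least squares fit in which per-study effect estimates are regressed on ancestry principal components (fixed effects) and a between-study variance `tau2` (random effect) is chosen from a small grid by maximum likelihood. The function names, the toy inputs, and the grid values are all assumptions made for illustration, and the per-SNP adaptive selection of which fixed effects to include is not shown.

```python
import numpy as np

def fit_meta_regression(beta, se, pcs, tau2=0.0):
    """Fit beta_k = gamma_0 + pcs_k . gamma + u_k + e_k by weighted least squares.

    beta, se : per-study effect estimates and standard errors (length K)
    pcs      : per-study ancestry principal components, shape (K,) or (K, p)
    tau2     : assumed between-study (random effect) variance
    """
    beta = np.asarray(beta, dtype=float)
    se = np.asarray(se, dtype=float)
    pcs = np.asarray(pcs, dtype=float)
    if pcs.ndim == 1:
        pcs = pcs[:, None]
    X = np.column_stack([np.ones(len(beta)), pcs])  # intercept + PCs
    w = 1.0 / (se**2 + tau2)                        # inverse total variance
    XtWX = X.T @ (w[:, None] * X)
    cov = np.linalg.inv(XtWX)
    gamma = cov @ (X.T @ (w * beta))
    resid = beta - X @ gamma
    loglik = -0.5 * np.sum(np.log(2 * np.pi * (se**2 + tau2)) + w * resid**2)
    chi2 = float(gamma @ XtWX @ gamma)  # Wald statistic for gamma = 0
    return gamma, cov, chi2, loglik

def select_tau2(beta, se, pcs, grid=(0.0, 1e-4, 1e-3, 1e-2)):
    """Pick the grid value of tau2 with the highest log-likelihood."""
    fits = [(fit_meta_regression(beta, se, pcs, t)[3], t) for t in grid]
    return max(fits)[1]

# Toy example: four hypothetical studies whose effects vary linearly with one PC.
pcs = np.array([-1.0, 0.0, 1.0, 2.0])
beta = 0.1 + 0.05 * pcs
se = np.array([0.02, 0.03, 0.025, 0.04])

tau2 = select_tau2(beta, se, pcs)
gamma, cov, chi2, loglik = fit_meta_regression(beta, se, pcs, tau2)
print("tau2:", tau2, "gamma:", gamma, "chi2:", chi2)
```

In this sketch the Wald statistic tests the joint null that the intercept and all ancestry coefficients are zero, so its reference distribution has one degree of freedom per column of the design matrix.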

Applying MEMO, we identified 265 loci with p<5e-9, of which 27 are novel, as well as >400 independent secondary associations. Our fine mapping narrowed the 95% credible interval for causal variants to fewer than 10 variants at 76 loci, 17 of which contain a single SNP. We estimated that 56% of the causal variants show homogeneous effects across ancestries, while another 26% and 12% show African-specific and Hispanic-specific effects, respectively. In conclusion, our results elucidate the genetic architecture of smoking traits, and the methods we developed will be valuable for other studies.
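For readers unfamiliar with how a 95% credible interval of variants is typically formed, the sketch below shows one common construction. It assumes that each variant at a locus already has a posterior probability of being causal and that these sum to one (a single causal variant per locus); how MEMO derives those probabilities is not described here, and the function name and toy values are hypothetical.

```python
import numpy as np

def credible_set(posterior, level=0.95):
    """Return indices of the smallest set of variants whose posterior
    probabilities sum to at least `level`, ranked from most to least probable."""
    pp = np.asarray(posterior, dtype=float)
    pp = pp / pp.sum()                      # normalize within the locus
    order = np.argsort(-pp, kind="stable")  # most probable variant first
    cum = np.cumsum(pp[order])
    n_keep = int(np.searchsorted(cum, level)) + 1
    return order[:n_keep]

# Toy locus with five variants (hypothetical probabilities).
example = credible_set([0.60, 0.25, 0.08, 0.04, 0.03])
print(example)
```

A locus whose top variant alone carries at least 95% of the posterior probability yields a credible set containing a single SNP.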