Abstract Text |
Smoking addiction is a heritable trait and a leading cause of many diseases. A recent breakthrough in addiction genetics came from the GSCAN meta-analysis, which identified 406 loci associated with different smoking behavioral traits in individuals of European ancestry. Yet the genetic architecture in non-European ancestries remains elusive. To address this challenge, the GSCAN consortium aggregated datasets from 101 studies with a total of 3.4 million individuals from diverse ancestries (2,669,029 European, 296,395 Asian, 286,026 Admixed American, 119,589 African American). Together, we identify 2,007 loci, among which 464 are novel. This dataset offers an unprecedented opportunity to advance our understanding of the genetic architecture of smoking behavior in global populations.
We propose an improved meta-regression-based model for trans-ancestry genetic effect distributions. Specifically, we use the principal components (PCs) of genome-wide allele frequencies as proxies for continuously varying cohort-level ancestry. We model the genetic effect from each study as a mixture of models with different numbers of PCs, which can accommodate different extents of heterogeneity for different variants. For example, the model with 0 PCs corresponds to homogeneous effects. Because the 1st PC separates European and Asian ancestry, the model with 1 PC can be interpreted as having heterogeneous effects along the European-Asian cline. By imposing a Dirichlet-Multinomial prior, we borrow strength across variants, learn the genetic architecture, and fine-map causal variants.
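As a rough illustration of the idea of comparing meta-regression models with different numbers of PCs, the following Python sketch regresses per-study effect estimates on the first k ancestry PCs and converts a BIC approximation into posterior weights over k. It is a simplified sketch under stated assumptions, not the method described above: the function names (`fit_meta_regression`, `model_weights`), the use of BIC, and the fixed prior over k (standing in for the Dirichlet-Multinomial prior learned across variants) are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_meta_regression(beta, se, pcs, k):
    """Inverse-variance weighted regression of study effects on the first k PCs.

    beta, se : per-study effect estimates and standard errors, shape (n,)
    pcs      : cohort-level ancestry PCs, shape (n, K)
    k        : number of PCs in the model (k = 0 is the homogeneous-effect model)
    """
    w = 1.0 / se**2
    X = np.column_stack([np.ones(len(beta)), pcs[:, :k]])
    XtW = X.T * w  # scale each study's column by its weight
    coef = np.linalg.solve(XtW @ X, XtW @ beta)
    resid = beta - X @ coef
    # Gaussian log-likelihood, treating the sampling variances se**2 as known
    loglik = -0.5 * np.sum(w * resid**2 + np.log(2 * np.pi * se**2))
    return coef, loglik


def model_weights(beta, se, pcs, max_k, prior=None):
    """Approximate posterior weight of each model k = 0..max_k.

    Assumption: BIC approximates the marginal likelihood, and `prior` is a
    fixed vector over k rather than a prior learned across variants.
    """
    n = len(beta)
    n_models = max_k + 1
    if prior is None:
        prior = np.full(n_models, 1.0 / n_models)
    prior = np.asarray(prior, dtype=float)
    log_post = np.empty(n_models)
    for k in range(n_models):
        _, loglik = fit_meta_regression(beta, se, pcs, k)
        bic = -2.0 * loglik + (k + 1) * np.log(n)
        log_post[k] = -0.5 * bic + np.log(prior[k])
    log_post -= log_post.max()  # stabilize before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()
```

In this sketch, a variant whose weight concentrates on k = 0 would be read as having a homogeneous effect, while weight on k = 1 would suggest an effect that varies along the first PC.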
We perform simulations across scenarios in which variants have homogeneous effects and scenarios in which they have ancestry-specific effects. We show that our method greatly improves fine-mapping resolution and allows us to estimate the fractions of loci that show homogeneous and ancestry-specific effects. We apply our method to the GSCAN study of 4 phenotypes: age of initiation of regular smoking (AgeSmk), cigarettes per day (CigDay), smoking initiation (SmkInit), and smoking cessation (SmkCes). Among the 2,007 significant loci identified, with a median of 3,274 variants per locus, our proposed method fine-maps 34.5% to fewer than 6 variants and 1.51 genes in 90% credible sets, a significant improvement over fine mapping using European ancestry only. On average, 81% of loci show homogeneous effects. 13% of the loci are best supported by the mixture component with 1 PC, which indicates that the variant effects differ along the European-Asian cline but are homogeneous across the other ancestry groups. Our new results and continued research will elucidate the genetic architecture of smoking behavior across global ancestries.
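For readers unfamiliar with credible sets, the sketch below shows one common way a 90% credible set can be constructed from per-variant evidence at a locus. It assumes a single causal variant per locus and equal prior probability for every variant, which may differ from the model used here; the function name `credible_set` and its inputs are hypothetical.

```python
import numpy as np


def credible_set(log_bf, coverage=0.9):
    """Smallest set of variants whose posterior probabilities sum to >= coverage.

    log_bf : per-variant log Bayes factors at one locus, shape (m,)
    Assumes exactly one causal variant and a uniform prior across variants.
    """
    log_bf = np.asarray(log_bf, dtype=float)
    post = np.exp(log_bf - log_bf.max())  # subtract max for numerical stability
    post /= post.sum()
    order = np.argsort(-post)  # variants from most to least probable
    cum = np.cumsum(post[order])
    # first position at which the cumulative probability reaches the coverage
    n_keep = min(int(np.searchsorted(cum, coverage)) + 1, len(post))
    return order[:n_keep], post
```

The size of the returned set is the quantity summarized by statements such as "fewer than 6 variants in 90% credible sets": the sharper the posterior, the fewer variants are needed to reach the coverage.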
|