
A statistical framework to assess replicability of signals from trans-ethnic genome-wide association meta-analysis: Applications to smoking/drinking addiction traits using 3.4 million individuals.

Authors
Chen Wang, GSCAN - the GWAS and Sequencing Consortium of Alcohol and Nicotine Use
Name and Date of Professional Meeting
ASHG Annual Meeting (October 18-22, 2021)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Consortium studies often use genome-wide association meta-analysis (GWAMA) to aggregate summary statistics from multiple studies and empower genetic discovery. It is standard practice to replicate the association signals in an independent dataset. Yet as discovery studies grow larger and more diverse, it becomes difficult to identify a large enough replication sample, especially for studies of non-European ancestry. Without replication, the identified association signals are much more likely to be spurious and to confound downstream studies. To address this challenge, we propose a novel statistical framework, RATES (Replicability Assessment in Trans-Ethnic Studies), to assess replicability without a replication sample. RATES first models the variation of genetic effects across studies using meta-regression with principal components of genome-wide allele frequencies as covariates, which adjusts for genetic effect heterogeneity due to ancestry. Next, RATES leverages the strength and consistency of residual association signals across variants and studies to calculate a “posterior probability of replicability” (PPR), based on the rationale that replicable association signals tend to be significantly associated across multiple studies. We also developed a parametric bootstrap method to evaluate p-values for the PPR. We performed extensive simulations in which 1) the genetic effects are homogeneous across ancestries, 2) the genetic effects are ancestry-specific, and 3) false-positive signals occur in some studies in the meta-analysis. We compared RATES with popular meta-analysis methods, including the fixed effect (FE), random effects (RE and RE2), and binary effect (BE) meta-analysis, and meta-regression (MR-MEGA). We show that when outliers are present, only RATES yields correct type I error, while other methods (e.g., FE or RE2) can have more than 4-fold inflated type I error.
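The meta-regression step described above can be illustrated with a minimal sketch. It is not the RATES implementation: it assumes inverse-variance weighted least squares of per-study effect estimates on an intercept plus ancestry principal components, and the function names (`solve_linear`, `meta_regression`) and the simple standardized residuals are hypothetical choices made for illustration only.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design: too few studies or collinear PCs")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def meta_regression(betas, ses, pcs):
    """Regress per-study effects on ancestry PCs (assumed weights: 1 / se^2).

    betas: per-study effect estimates for one variant
    ses:   their standard errors
    pcs:   one tuple of allele-frequency PCs per study
    Returns (coefficients, standardized residuals). The residuals ignore
    leverage, which is a simplification made for this sketch.
    """
    rows = [[1.0] + list(pc) for pc in pcs]  # intercept + PCs
    weights = [1.0 / se ** 2 for se in ses]
    k = len(rows[0])
    xtwx = [[sum(w * r[a] * r[b] for w, r in zip(weights, rows))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(w * r[a] * y for w, r, y in zip(weights, rows, betas))
            for a in range(k)]
    coef = solve_linear(xtwx, xtwy)
    fitted = [sum(c * x for c, x in zip(coef, r)) for r in rows]
    resid_z = [(y - f) / se for y, f, se in zip(betas, fitted, ses)]
    return coef, resid_z


# Toy input (made-up numbers): four studies, one ancestry PC.
coef, resid_z = meta_regression(
    betas=[-0.38, 0.12, 0.57, 1.15],
    ses=[0.1, 0.2, 0.1, 0.3],
    pcs=[(-1.0,), (0.0,), (1.0,), (2.0,)],
)
```

In this sketch the residuals are what remains of each study's signal after the ancestry-driven trend is removed. How RATES combines such residual signals into a PPR is not specified in the abstract and is not attempted here.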
RATES also gives higher or comparable power in all scenarios, even in simulations that favor alternative methods. For variants with ancestry-specific effects, the power of RATES is 7% to over 400% higher than that of the second-best-performing meta-analysis method. We further applied RATES to smoking/drinking addiction traits using 3.4 million individuals of different ethnic groups. As a first step, RATES confirmed that all reported sentinel variants have PPR > 99%. When comparing the mean chi-square statistics converted from p-values, RATES yields values that are 9% higher than those of the second-best method (RE2). Applying RATES to rare and low-frequency variants that are typically filtered out, we further identified novel signals of biological relevance in addition to those from GWAMA of common variants.
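The mean chi-square comparison mentioned above depends on converting p-values to test statistics. The following is a minimal sketch of one common way to do this, assuming two-sided p-values and 1 degree of freedom (the abstract states neither); the function names are hypothetical.

```python
from statistics import NormalDist


def p_to_chisq(p):
    """Convert a two-sided p-value to a 1-df chi-square statistic (assumed df = 1).

    Uses chi-square = z^2 with z = Phi^{-1}(p / 2). Working in the lower
    tail avoids computing 1 - p / 2, which rounds to 1 for very small p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def mean_chisq(pvalues):
    """Average the converted chi-square statistics over a set of variants."""
    stats = [p_to_chisq(p) for p in pvalues]
    return sum(stats) / len(stats)
```

Comparing methods by `mean_chisq` over the same set of variants puts their p-values on a common scale, where a larger mean indicates stronger average association evidence.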