Abstract Text |
With global initiatives to diversify genomic research, genetic data from diverse populations will become increasingly available. Understanding the extent to which genetic architecture overlaps across populations, and how this overlap affects SNP-based heritability (h2SNP) and cross-ancestry genetic correlation, is important for leveraging these datasets to advance our understanding of the genetic etiology of complex traits. Here we examined the behavior of different analytic approaches in estimating h2SNP and cross-ancestry genetic correlation in diverse populations, considering various cross-ancestry genetic architecture scenarios in which the allele frequency (AF) and allelic effects of causal variants vary across ancestries. Using up to 55,724 whole-genome sequences from diverse populations in the TOPMed consortium, we compared four approaches and different ways of standardizing genotypes, in both simulations and analyses of real anthropometric phenotypes: 1) single-ancestry approach: genomic restricted maximum likelihood (GREML)/Haseman-Elston (HE) regression applied to each ancestry individually; 2) combined-ancestry approach: GREML/HE regression applied to the ancestry-combined sample; 3) bivariate approach: bivariate GREML to estimate h2SNP and cross-ancestry genetic correlation; and 4) GxE approach: GxEMM to estimate h2SNP for shared and ancestry-specific genetic effects. In the simulations, we found that heterogeneity in AF and allelic effects significantly affects heritability and genetic correlation estimates. Enriching causal variants among variants with smaller cross-ancestry AF differences than those of randomly chosen SNPs led to a downward bias in h2SNP estimates, owing to the overall low LD level of these variants. Additionally, enriching causal variants among variants with larger cross-ancestry AF differences led to underestimation of cross-ancestry genetic correlation across all ancestry pairs.
Overall, the bivariate and GxE approaches using within-ancestry standardized genetic relationship matrices (GRMs) yielded robust estimates across a wider range of cross-ancestry architectures than the other methods. In the real phenotype analysis, we observed that h2SNP of height in the African ancestry (AFR) population is enriched in variants with larger AF differences from European ancestry, even after accounting for overall allele frequency and LD level in each ancestry (h2SNP estimates = 0.10 (0.04) vs. 0.19 (0.04) for variants with smaller vs. larger AF differences). This study provides guidance for estimating h2SNP and cross-ancestry genetic correlation using diverse ancestry data and highlights the importance of considering cross-ancestry genetic architectures when interpreting the results of existing methods.
|
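
The difference between pooled and within-ancestry genotype standardization, and how either GRM feeds into HE regression, can be sketched as follows. This is a minimal illustration on simulated data, not the pipeline used in the study: the function names (`standardize`, `grm`, `he_regression`), the sample sizes, the toy allele-frequency model, and the choice to simulate effects on within-ancestry standardized genotypes are all assumptions made for the example. HE regression is written here in its simple no-intercept form, regressing phenotype cross-products on off-diagonal GRM entries.

```python
import numpy as np

def standardize(G, groups=None):
    """Center and scale each SNP column; per group if `groups` is given, else pooled."""
    G = np.asarray(G, dtype=float)
    labels = np.zeros(G.shape[0], dtype=int) if groups is None else np.asarray(groups)
    Z = np.zeros_like(G)
    for g in np.unique(labels):
        rows = labels == g
        sub = G[rows]
        sd = sub.std(axis=0)
        sd[sd == 0] = 1.0  # SNP monomorphic in this group: centered column stays zero
        Z[rows] = (sub - sub.mean(axis=0)) / sd
    return Z

def grm(Z):
    """Genetic relationship matrix from standardized genotypes (individuals x SNPs)."""
    return Z @ Z.T / Z.shape[1]

def he_regression(K, y):
    """No-intercept HE regression slope of y_i * y_j on K_ij over pairs i < j."""
    y = (y - y.mean()) / y.std()
    iu = np.triu_indices_from(K, k=1)
    x = K[iu]
    prod = np.outer(y, y)[iu]
    return float(x @ prod / (x @ x))

# Toy data (assumed): two ancestries whose allele frequencies differ at each SNP.
rng = np.random.default_rng(0)
n, m, h2 = 300, 400, 0.5
af_a = rng.uniform(0.1, 0.9, m)
af_b = np.clip(af_a + rng.normal(0, 0.15, m), 0.05, 0.95)
G = np.vstack([rng.binomial(2, af_a, (n, m)), rng.binomial(2, af_b, (n, m))])
ancestry = np.repeat([0, 1], n)

Z_within = standardize(G, ancestry)  # standardized separately in each ancestry
Z_pooled = standardize(G)            # standardized across the combined sample

# Assumed architecture: shared per-SD effects on within-ancestry standardized genotypes.
beta = rng.normal(0, np.sqrt(h2 / m), m)
y = Z_within @ beta + rng.normal(0, np.sqrt(1 - h2), 2 * n)

print("HE estimate, within-ancestry GRM:", he_regression(grm(Z_within), y))
print("HE estimate, pooled GRM:", he_regression(grm(Z_pooled), y))
```

The two estimates generally differ because pooled standardization leaves ancestry-level allele-frequency differences in the GRM, whereas within-ancestry standardization removes them. Which behavior is preferable depends on the assumed cross-ancestry architecture.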