
Deciphering rare non-coding LDL-C associations in over 246K individuals with whole genome sequencing

Authors
M. Selvaraj, X. Li, Z. Li, X. Lin, G. Peloso, P. Natarajan, TOPMed Lipids Working Group
Name and Date of Professional Meeting
ASHG 2024
Associated paper proposal(s)
Working Group(s)
Abstract Text
Background:
Blood lipid levels, and low-density lipoprotein cholesterol (LDL-C) in particular, are heritable risk factors for cardiovascular disease, a leading cause of death. Recent genome-wide association studies (GWAS) have identified numerous loci associated with blood lipid levels, but the role of rare non-coding variants is less well understood. Whole-genome sequencing (WGS) allows exploration of these variants. We meta-analyzed WGS data from two large datasets (TOPMed, n=72,175, and UK Biobank, n=173,982; 246,157 individuals in total), yielding the largest WGS analysis of LDL-C.
Methods:
We obtained deep-coverage WGS and LDL-C data from UK Biobank and NHLBI TOPMed freeze 10 (n=23 cohorts). We harmonized and normalized lipid measures from each cohort, adjusted for age, sex, cohort-race, and principal components (PCs), and accounted for lipid-lowering medication status. To enable efficient WGS meta-analysis across UK Biobank and TOPMed freeze 10, we implemented the MetaSTAAR workflow. In addition to single-variant analyses, we performed gene-centric coding and non-coding set-based meta-analyses and region-based sliding-window meta-analyses of rare variants (MAF < 1%) for LDL-C. Finally, we replicated our findings in All of Us WGS data.
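As a rough illustration of how set-based rare-variant meta-analysis can work from summary statistics alone, the sketch below sums per-study score vectors and covariance matrices for one variant set and computes a simple burden statistic. This is a generic textbook-style sketch, not the MetaSTAAR implementation: the function name, the unit weights, and the assumption that both studies list the same variants in the same order are all hypothetical choices made for illustration.

```python
import math


def meta_burden_test(scores, covs, weights=None):
    """Combine per-study score statistics for one variant set and run a burden test.

    scores: list of per-study score vectors (one entry per variant in the set)
    covs:   list of per-study covariance matrices of those score vectors
    weights: optional per-variant weights (hypothetical default: all 1.0)
    """
    m = len(scores[0])
    if weights is None:
        weights = [1.0] * m  # unweighted burden; an assumption for illustration

    # Studies are independent, so scores and covariances add across studies.
    u = [sum(s[j] for s in scores) for j in range(m)]
    v = [[sum(c[i][j] for c in covs) for j in range(m)] for i in range(m)]

    # Burden statistic (w'U)^2 / (w'Vw), chi-square with 1 df under the null.
    wu = sum(weights[j] * u[j] for j in range(m))
    wvw = sum(weights[i] * v[i][j] * weights[j] for i in range(m) for j in range(m))
    stat = wu * wu / wvw

    # Survival function of a 1-df chi-square: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy inputs (made-up numbers): two studies, two variants in the set.
toy_scores = [[1.0, 2.0], [0.5, 0.5]]
toy_covs = [[[2.0, 0.0], [0.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]]
toy_stat, toy_p = meta_burden_test(toy_scores, toy_covs)
```

The appeal of this design is that only summary statistics and covariance matrices need to be shared between studies, so individual-level genotypes never have to be pooled.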
Results:
We generated variant summary statistics and covariance matrices for UK Biobank and TOPMed independently. We processed 571M and 660M variants from TOPMed and UK Biobank, respectively, of which 92M variants had a minor allele count > 20. We then conducted the meta-analysis of both studies following the MetaSTAAR workflow. We used 5 gene-centric coding variant masks and 7 non-coding variant masks and identified genome-wide significant aggregates using a Bonferroni-corrected threshold (0.05/(20K*masks)). Before conditional analysis, 70 coding and 111 non-coding aggregates were significantly associated with LDL-C. After adjusting for known common variants, 39 coding and 44 non-coding aggregates remained significant, and we replicated 25 coding and 28 non-coding aggregates. Many important known Mendelian lipid genes, including LDLR, APOB, and PCSK9, were significant, and novel rare variant aggregates in ABCA6 and RELB were also significantly associated with LDL-C.
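The Bonferroni threshold quoted above, 0.05/(20K*masks), can be worked through as a short calculation. The sketch below assumes that "masks" is counted separately for the coding (5) and non-coding (7) categories; the abstract does not say whether the two categories were instead pooled into 12 masks, so treat that choice, and the function name, as assumptions. Under this reading the thresholds come out at 5e-7 for coding and roughly 3.6e-7 for non-coding aggregates.

```python
N_GENES = 20_000  # "20K" genes, as stated in the abstract
ALPHA = 0.05      # family-wise error rate


def bonferroni_threshold(n_masks, n_genes=N_GENES, alpha=ALPHA):
    """Per-test significance threshold for n_genes * n_masks aggregate tests."""
    return alpha / (n_genes * n_masks)


# Assumption: masks are counted per category rather than pooled.
coding_threshold = bonferroni_threshold(5)
noncoding_threshold = bonferroni_threshold(7)

print(f"coding:     {coding_threshold:.3g}")
print(f"non-coding: {noncoding_threshold:.3g}")
```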
Conclusion:
In summary, we extend prior observations of rare non-coding variant associations near Mendelian lipid genes to novel genes that lack prior evidence from common non-coding variants or rare coding variants.