TOPMed harmonized phenotypes | NHLBI Trans-Omics for Precision Medicine

Available datasets

The following is a list of available datasets. Clicking on the dataset name will take you to more information about which phenotypes are included and the number of participants with non-missing information by study.

Dataset name	Version	Date uploaded
Atherosclerosis events incident	1	2019-10-31
Atherosclerosis events prior	1	2019-10-31
Demographic	4	2019-10-29
Baseline Common Covariates	3	2019-10-04
Sleep	1	2019-10-04
Inflammation	1	2019-04-19
Lipids	3	2018-12-13
VTE	1	2018-11-20
Blood Cell Count	3	2018-10-12
Blood Pressure	1	2018-08-27
Atherosclerosis	1	2018-06-01

These datasets are split by study and uploaded to each TOPMed study’s exchange area. They can be found under the “Provisional Files” tab and within the Phenotype/DCC/official folder. An example for one study is shown below:

topmed-dcc
  exchange
    phs000956_TOPMed_WGS_Amish
      Phenotype
        DCC
          official
            topmed_dcc_baseline_common_covariates_v1_phs000956.tar
            topmed_dcc_demographic_v1_phs000956.tar
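
The two file names in the example suggest a naming pattern of dataset name, version, and study accession. The following is a minimal Python sketch for splitting a file name into those parts. The pattern is inferred only from the two names shown above and is an assumption, not a documented specification; the names FILE_NAME_PATTERN, HarmonizedFile, and parse_file_name are hypothetical.

```python
import re
from typing import NamedTuple

# Assumption: pattern inferred from the two example file names,
# i.e. topmed_dcc_<dataset>_v<version>_<accession>.tar
FILE_NAME_PATTERN = re.compile(
    r"^topmed_dcc_(?P<dataset>[a-z0-9_]+?)_v(?P<version>\d+)_(?P<accession>phs\d+)\.tar$"
)


class HarmonizedFile(NamedTuple):
    dataset: str    # e.g. "demographic"
    version: int    # e.g. 1
    accession: str  # e.g. "phs000956"


def parse_file_name(file_name: str) -> HarmonizedFile:
    """Split a harmonized-phenotype tar file name into its parts."""
    match = FILE_NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Unrecognized file name: {file_name!r}")
    return HarmonizedFile(
        dataset=match["dataset"],
        version=int(match["version"]),
        accession=match["accession"],
    )
```

For example, parse_file_name("topmed_dcc_demographic_v1_phs000956.tar") yields the dataset "demographic", version 1, and accession "phs000956". File names that do not follow the assumed pattern raise a ValueError rather than being guessed at.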

The study-specific data files downloaded from the exchange areas can then be combined for cross-study analysis. Phenotypes for studies without a TOPMed exchange area will be uploaded once the exchange area is created at dbGaP.
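
One way to combine study-specific files is sketched below in Python, using only the standard library. It rests on assumptions not stated on this page: that each tar archive contains a tab-delimited text file with a header row, and that the caller knows the name of that file inside the archive. The function names and the added "study" column are hypothetical.

```python
import csv
import io
import tarfile


def read_tab_delimited_member(tar_path, member_name):
    """Read one tab-delimited text file from a tar archive as a list of dicts.

    Assumption: the member is UTF-8 text with a header row.
    """
    with tarfile.open(tar_path) as archive:
        member = archive.extractfile(member_name)
        if member is None:
            raise ValueError(f"{member_name!r} is not a regular file in {tar_path!r}")
        text = io.TextIOWrapper(member, encoding="utf-8", newline="")
        # Read all rows before the archive is closed.
        return list(csv.DictReader(text, delimiter="\t"))


def combine_studies(rows_by_study):
    """Stack per-study rows into one list, tagging each row with its study.

    rows_by_study maps a study identifier to that study's list of row dicts.
    """
    combined = []
    for study, rows in rows_by_study.items():
        for row in rows:
            # "study" is a hypothetical column name added for illustration.
            combined.append({**row, "study": study})
    return combined
```

The sketch keeps every value as a string and does not reconcile studies whose files have different columns; a real analysis would also need to convert types and check that the same dataset version is used for every study.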

Authorship guidelines

If your analysis used phenotypes harmonized by the TOPMed DCC, please see the authorship guidelines for TOPMed-harmonized phenotypes for information about including DCC authors from the phenotype harmonization team.

Available phenotypes by dataset

For each phenotype, an associated age-at-measurement variable is also provided. For example, “weight_baseline_1” is body weight at the baseline exam and “age_at_weight_baseline_1” is the participant’s age when that weight measurement was made. These age variables are not shown in the phenotype lists below but are part of the datasets. The exception is demographic phenotypes (e.g., sex and race), which do not have an associated age; they were derived primarily from baseline information, although later exams were used in some cases.
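
The naming convention in the example above can be sketched in Python as follows. This assumes that every age variable is formed by prefixing the phenotype name with "age_at_", which is inferred from the single example given; the function names are hypothetical, and a row is represented as a plain dict of column names to values.

```python
def age_variable_name(phenotype: str) -> str:
    """Return the assumed name of the age variable paired with a phenotype."""
    return f"age_at_{phenotype}"


def value_with_age(row: dict, phenotype: str):
    """Return (phenotype value, age at measurement) for one participant row.

    The age is None when the row has no matching age column, as is the
    case for demographic phenotypes.
    """
    return row.get(phenotype), row.get(age_variable_name(phenotype))
```

For example, age_variable_name("weight_baseline_1") returns "age_at_weight_baseline_1".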

Atherosclerosis events incident

Phenotype	Description
angina_incident_1	An indicator of whether a subject had an angina event (that was verified by adjudication or by medical professionals) during the follow-up period.
cabg_incident_1	An indicator of whether a subject had a coronary artery bypass graft (CABG) procedure (that was verified by adjudication or by medical professionals) during the follow-up period.
cad_followup_start_age_1	Age of subject at the start of the follow-up period during which atherosclerosis events were reviewed and adjudicated.
chd_death_definite_1	An indicator of whether the cause of death was determined by medical professionals or technicians to be “definite” coronary heart disease for subjects who died during the follow-up period.
chd_death_probable_1	An indicator of whether the cause of death was determined by medical professionals or technicians to be “probable” or “definite” coronary heart disease for subjects who died during the follow-up period.
coronary_angioplasty_incident_1	An indicator of whether a subject had a coronary angioplasty procedure (that was verified by adjudication or by medical professionals) during the follow-up period.
mi_incident_1	An indicator of whether a subject had a myocardial infarction (MI) event (that was verified by adjudication or by medical professionals) during the follow-up period.
pad_incident_1	An indicator of whether a subject had peripheral arterial disease (that was verified by adjudication or by medical professionals) during the follow-up period.

Number of non-missing measurements by study