How to Prepare the dbGaP Study Config and Data files for TOPMed

You just received an invitation to upload to the dbGaP Submission Portal. Now what?

The dbGaP Study Submission Guide contains the most up-to-date dbGaP file templates and may help address any questions.

  1. Prepare the Study Config file.
    1. Under the "Description" section, please use the text below (Appendix A) describing TOPMed, followed by these three components. Please note that these descriptions are needed to guide appropriate analytical strategies, enable rational combining of datasets, and provide documentation for developing study descriptions in publications that use the study data.
      • Study History:  Describe the design of the parent study, including sampling frame or ascertainment; distribution of subjects by age, sex, race and relatedness; field centers and any other relevant design features.  Please cite publications that describe the study design.
      • Study Inclusion / Exclusion Criteria: Describe the inclusion criteria, such as specific criteria for case ascertainment, and any exclusion criteria for subject selection.
      • Study Description: Describe how samples for TOPMed whole-genome sequencing and/or other omics assays were chosen (e.g., stroke cases versus controls; high versus low HDL; key family members to be used for imputing others), and describe their characteristics (e.g., case, control, male, female, race).
    2. For the “Molecular Data” section, please use the following:
      • Type: Whole Genome Sequencing
      • Name and Version: Illumina X10 <or name of other instrument>
      • Vendor: Illumina
      • dbSNP Batch ID: N/A
      • Comments: Sequencing was performed at the <sequencing center name, e.g. Broad Institute of MIT and Harvard, Human Genome Sequencing Center at Baylor College of Medicine, Illumina, McDonnell Genome Institute at Washington University, New York Genome Center, Psomagen, Northwest Genomics Center at the University of Washington>
  2. Once the Study Config file is ready, please submit it by going to the dbGaP Submission Portal and clicking on your TOPMed study’s name.  The dbGaP team will then get started with assigning a phs number and setting up your study’s TOPMed Exchange Area even if the necessary data files (listed below) are still in progress.
  3. Next, your team will need to prepare the following data files and their data dictionaries as part of your TOPMed dbGaP registration. Here is some TOPMed-specific dbGaP file preparation information to get you started on these data files.
    1. Study Consent file
    2. Subject-Sample Mapping file
    3. Sample Attributes file
    4. Phenotype file (if not already deposited in a parent or previous dbGaP study; we want to avoid duplication of data)
    5. Pedigree file (if applicable and if not already deposited in a parent or previous dbGaP study; we want to avoid duplication of data)
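As a planning aid, the conditions attached to the file list above can be written out as a small checklist helper. This is only a sketch for tracking which files your team still needs to prepare; the function name and its parameters are hypothetical, and it does not reflect any dbGaP tool or template format.

```python
from typing import List


def files_to_prepare(phenotype_already_deposited: bool,
                     has_pedigree: bool,
                     pedigree_already_deposited: bool) -> List[str]:
    """Return the data files to prepare, following the list in step 3.

    Hypothetical helper: the parameter names are illustrative only.
    Each returned file also needs its data dictionary.
    """
    # These three files are listed without any conditions.
    files = [
        "Study Consent file",
        "Subject-Sample Mapping file",
        "Sample Attributes file",
    ]
    # Phenotype data is skipped if it already sits in a parent or
    # previous dbGaP study, to avoid duplication of data.
    if not phenotype_already_deposited:
        files.append("Phenotype file")
    # Pedigree data is needed only if applicable and not already deposited.
    if has_pedigree and not pedigree_already_deposited:
        files.append("Pedigree file")
    return files
```

For example, a family-based study whose phenotypes are already in a parent dbGaP study but whose pedigree is not would call `files_to_prepare(True, True, False)` and get the three unconditional files plus the Pedigree file.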

 

Appendix A

Recommended preliminary text for Study Description:

This study is part of the NHLBI Trans-Omics for Precision Medicine (TOPMed) Whole Genome Sequencing Program. TOPMed is part of a broader Precision Medicine Initiative, which aims to provide disease treatments that are tailored to an individual's unique genes and environment. TOPMed will contribute to this initiative through the integration of whole-genome sequencing (WGS) and other -omics (e.g., metabolic profiles, protein and RNA expression patterns) data with molecular, behavioral, imaging, environmental, and clinical data. In doing so, this program aims to uncover factors that increase or decrease the risk of disease, to identify subtypes of disease, and to develop more targeted and personalized treatments. Information about how to identify other TOPMed WGS accessions for cross-study analysis, as well as descriptions of TOPMed methods of data acquisition, data processing and quality control, are provided in the accompanying document, "TOPMed Whole Genome Sequencing Project" [TBD study-specific dbGaP URL to methods document].
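When drafting the "Description" section, it may help to assemble the Appendix A text and the three study-specific components in one place before pasting them into the Study Config file. The sketch below does this in plain text; the function name, the dictionary layout, and the "label: text" output format are assumptions made for illustration, not a dbGaP requirement.

```python
# Component labels taken from the "Description" instructions above.
COMPONENT_LABELS = [
    "Study History",
    "Study Inclusion / Exclusion Criteria",
    "Study Description",
]


def build_description(preamble: str, components: dict) -> str:
    """Join the Appendix A preamble with the three required components.

    Hypothetical drafting helper: raises ValueError if any component
    is missing or blank, so gaps are caught before submission.
    """
    missing = [
        label for label in COMPONENT_LABELS
        if not components.get(label, "").strip()
    ]
    if missing:
        raise ValueError("Missing components: " + ", ".join(missing))

    # Preamble first, then the components in the order the guide lists them.
    parts = [preamble.strip()]
    for label in COMPONENT_LABELS:
        parts.append(f"{label}: {components[label].strip()}")
    return "\n\n".join(parts)
```

Pass the Appendix A paragraph as `preamble` and a dictionary keyed by the three labels; the result is a single block of text with the preamble followed by each component as its own paragraph.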
