Data use regulation and governance in TOPMed
Data sharing in TOPMed will occur primarily through the dbGaP TOPMed Exchange Area (EA). For data obtained through dbGaP (from either the EA or a released study), data usage is regulated by the NHLBI according to the study’s Institutional Certification, which specifies Data Use Limitations. TOPMed Principal Investigators (PIs) are responsible for submitting the Institutional Certification at the time of TOPMed study registration.
Data uses are governed by the participant’s informed consent, as interpreted by each study’s PI(s) and their institutional IRB. The TOPMed PI(s) associated with each study are responsible for making consent types and data use restrictions known to the TOPMed collaborators with whom they share data directly (i.e. outside of dbGaP).
Data use in TOPMed manuscripts
Cross-study paper proposals originate in the TOPMed Working Groups, which may produce manuscripts for publication using data shared in the TOPMed EAs or released in dbGaP. To ensure transparency of data usage and acknowledgment of any Data Use Limitations, authors must do two things before starting work on a manuscript: obtain permission from each study PI to include that study’s data, and submit a paper proposal following the paper proposal instructions.
The paper proposal instructions have two parts:
- Basic information, scientific proposal and selection of TOPMed projects to include, via the Paper Proposal submission form; and
- Selection of specific study-consent groups to be used, via the Request data sets form.
Study-consent groups will not be available for selection until the study has been registered in dbGaP. Therefore, the proposer may need to update their dataset selection when additional study-consent groups become available. When the proposer selects a study-consent group, they agree that its Data Use Limitations will be respected.
Study dataset contacts will be notified when one of their study’s consent groups is selected for a given proposal and will have two weeks to approve or request modification. Reasons for requesting modification might include:
- the study wants to perform study-specific preliminary analyses before joining cross-study analyses;
- the PI was not contacted previously and needs further information from the proposer; or
- issues related to consent, potential stigmatization or harm to participants.
Modification requests at the stage of study-consent selection should be infrequent, since proposers should have already obtained permission from the study PI to include their study prior to proposal submission. Refer to the TOPMed Publications Policy for further information on the publications process, including specific information on “single PI” proposals and manuscripts.
Additional data use considerations
This section applies to data obtained from dbGaP. Different rules may apply to data obtained by other mechanisms (e.g., directly from study investigators).
NIH provides consent group titles and their associated standard Data Use Limitations, accompanied by further interpretation. Note that “General Research Use” permits “research relating to population structure,” while “Health/Medical/Biomedical” excludes “the study of population origins or ancestry” and “Disease specific” includes only “research on a specific disease or related condition.”
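The three rules quoted above can be sketched as a small lookup. This is an illustration only, not NIH's or TOPMed's actual decision logic: the `Consent` enum, the `check_use` function, the use-category strings, and the `allowed_conditions` parameter are all hypothetical names introduced here. The sketch answers only the cases named in the paragraph above and returns `None` for everything else, since real decisions depend on the full Data Use Limitations in the Institutional Certification.

```python
from enum import Enum
from typing import FrozenSet, Optional


class Consent(Enum):
    """Consent group titles quoted in the text (abbreviations used as labels)."""
    GRU = "General Research Use"
    HMB = "Health/Medical/Biomedical"
    DS = "Disease specific"


def check_use(
    consent: Consent,
    use: str,
    disease: Optional[str] = None,
    allowed_conditions: FrozenSet[str] = frozenset(),
) -> Optional[bool]:
    """Return True/False for the cases quoted in the text, else None.

    `use` is a hypothetical category string. `allowed_conditions` is an
    assumed, pre-reviewed set holding the specific disease and any
    conditions already judged to be related; deciding what counts as
    "related" is a human judgment that this sketch does not attempt.
    """
    if consent is Consent.GRU and use == "population_structure":
        return True  # GRU permits research relating to population structure
    if consent is Consent.HMB and use == "population_origins_or_ancestry":
        return False  # HMB excludes the study of population origins or ancestry
    if consent is Consent.DS:
        # DS includes only research on the specific disease or a related condition
        return use == "disease_research" and disease in allowed_conditions
    return None  # not covered by the quoted rules; consult the Data Use Limitations


if __name__ == "__main__":
    print(check_use(Consent.HMB, "population_origins_or_ancestry"))
    print(check_use(Consent.DS, "disease_research", "asthma", frozenset({"asthma"})))
```

Returning `None` rather than guessing keeps the sketch from implying permissions that the quoted text does not state.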
The TOPMed ELSI Committee can advise study PIs on approaches to provide evidence to their IRBs or Institutional Certification boards regarding support for broad use of TOPMed data.
Regarding indirect uses of data, such as imputation reference panels, common controls for association studies, and variant summary statistics:
- Data from an individual with a disease-specific consent will not be used in analyses outside of that restriction, unless specifically allowed by the Institutional Certification.
- Contribution of data to public (non-controlled) access servers is not allowed for individual-level data or for data summaries that could be used for individual identification. Contribution to public servers that protect individual-level data is allowed if specified in the Institutional Certification.
- NHLBI urges all TOPMed investigators to update their Institutional Certification with Data Use Limitations that specify whether data may be used for the following:
  - Contribution of variant summary statistics to public variant servers.
  - Inclusion of individual-level data as reference samples in public imputation servers that protect the individual-level data (i.e., where the reference samples are not accessed by the server’s users).
  - Inclusion of individual-level data as common controls in association studies involving cases with diseases that are outside of the participants’ disease-specific consent.
- Effective November 2018, Genomic Summary Results (GSR) such as allele frequencies and association results are to be publicly available unless a study is designated as “sensitive” and its Institutional Certification is updated accordingly to indicate that GSR should remain under controlled access.
  - TOPMed studies with questions about GSR designation should contact the NHLBI DAC and/or their GPA (see Key Contacts).
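The indirect-use rules in the list above can be sketched as simple checks against a study's Institutional Certification. This is a minimal sketch under stated assumptions: the `Certification` record and its field names are hypothetical and do not correspond to any real dbGaP or NHLBI data format, and the functions cover only the conditions stated in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certification:
    """Hypothetical summary of one study's Institutional Certification."""
    allows_out_of_scope_use: bool = False   # e.g. common controls for other diseases
    allows_protected_server: bool = False   # servers that protect individual-level data
    gsr_sensitive: bool = False             # study designated "sensitive" for GSR


def disease_specific_use_allowed(cert: Certification, within_restriction: bool) -> bool:
    """Disease-specific data stay within their restriction unless the
    Institutional Certification specifically allows otherwise."""
    return within_restriction or cert.allows_out_of_scope_use


def may_contribute_to_protected_server(cert: Certification) -> bool:
    """Contribution to public servers that protect individual-level data
    is allowed only if specified in the Institutional Certification."""
    return cert.allows_protected_server


def gsr_access(cert: Certification) -> str:
    """GSR are public unless the study is designated sensitive."""
    return "controlled" if cert.gsr_sensitive else "public"


if __name__ == "__main__":
    default = Certification()
    print(disease_specific_use_allowed(default, within_restriction=False))
    print(gsr_access(Certification(gsr_sensitive=True)))
```

The defaults are deliberately restrictive for the two permission flags, mirroring the "unless specifically allowed" wording, while GSR default to public access as stated for the period after November 2018.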
For current information on NIH-wide data sharing policies, including GSR, please refer to the NIH Genomic Data Sharing website.