Data Repositories Established as NIH Trusted Partners
The National Institutes of Health (NIH) promotes data sharing as an essential element to facilitate the translation of research results into knowledge, products, and procedures to improve human health. To achieve this goal, NIH has created a central repository model for data storage and distribution through the database for Genotypes and Phenotypes (dbGaP). However, in light of the increasing volume and complexity of the data, which necessitate innovative solutions for storing and presenting the data, NIH is exploring new models for data management resources, including structured partnerships with external organizations or "trusted partners."
A "trusted partner" is defined as a public or private, national or international organization that is able to meet core NIH standards for establishing data quality and data management service protocols for NIH, based on the programmatic need of an NIH funding Institute or Center (IC). A trusted partnership can be established only through a contract mechanism between an NIH funding IC and the trusted partner organization. Contracts are awarded through an IC's standard acquisition and negotiation processes. NIH funding ICs interested in submitting an application to establish a trusted partnership should contact Genomic Data Sharing (GDS) staff at GDS@mail.nih.gov.
NIH-Established Trusted Partners
Cancer Genomics Hub (CGHub): CGHub stores, catalogs, and facilitates research using cancer genome sequences, alignments, and mutation information from The Cancer Genome Atlas (TCGA) consortium and related projects.
NCI Genomic Data Commons: The mission of the National Cancer Institute (NCI) Genomic Data Commons (GDC) is to provide the research community with a unified repository of cancer genomics data and associated clinical information.
NCI Cancer Genomics Cloud Pilots
The National Cancer Institute (NCI) funded three Cancer Genomics Cloud Pilot contracts with the primary objective of fostering innovative solutions that co-locate data from The Cancer Genome Atlas (TCGA) with computational resources. Co-location would give authorized users who lack the resources to download the entire TCGA dataset access to the data and tools.
Broad Institute FireCloud: Broad Institute's FireCloud is a cancer genome analysis platform with co-located TCGA data as well as other public datasets including 1000 Genomes, Cancer Cell Line Encyclopedia (CCLE), and Genotype-Tissue Expression (GTEx). FireCloud will securely track and manage data, metadata, tools, job execution and results and will capture provenance for each run.
Institute for Systems Biology (ISB) Cancer Genomics Cloud: ISB's Cancer Genomics Cloud will host the TCGA data in Google Cloud Storage and BigQuery tables and will provide end-users with tools and services ranging from web-based interactive exploration to cloud-based instances of RStudio and IPython and the means to run Docker containers and pipelines on virtual machines hosted on Google's infrastructure. Cancer researchers will be able to analyze TCGA data in conjunction with their own private data or with other publicly available datasets.
Seven Bridges Genomics Cancer Genomics Cloud: The Cancer Genomics Cloud powered by Seven Bridges allows users to analyze their data alongside data from The Cancer Genome Atlas (TCGA) using custom tools and pipelines or community-contributed apps.
Bionimbus: Bionimbus is a collaboration between the Institute for Genomics and Systems Biology (IGSB) at the University of Chicago and the Open Science Data Cloud to develop open source technology for managing, analyzing, transporting, and sharing large NCI-funded cancer genomics datasets in a secure and compliant fashion.
If you have questions about NIH Trusted Partners, please email GDS@mail.nih.gov.