The Bionimbus Protected Data Cloud (PDC) is an open-source, petabyte-scale cloud designed to manage, analyze, and share large genomic datasets for the research community in a secure and compliant fashion. The Bionimbus PDC now contains all of the data available to date from The Cancer Genome Atlas (TCGA). Today, this is over 600 TB of data, and it will grow to over 2.5 PB over the next two years. This includes both the controlled-access BAM files containing the genomic data and the open-access aggregated data derived from the BAM files.
I’ll be giving a talk today about the Bionimbus PDC at the O’Reilly Strata Health Rx Conference in Boston.
To analyze TCGA data using the Bionimbus PDC, you will need the required approvals from dbGaP. Any researcher authorized to analyze controlled-access TCGA data is welcome to use modest amounts of compute and storage resources on the PDC. If you need additional resources, you can apply for a PDC research allocation.
Please contact us if you would like to contribute some data to the PDC, have a project that would like to join the PDC, or have a biomedical cloud that would like to interoperate with the PDC.