The TCGA dataset comprises "2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients"; it is publicly available and has been used widely by the research community. The datasets were generated by the TCGA Research Network.
From the website:

- "Cancer patients were asked to donate a portion of tumor tissue that has been removed as part of their cancer treatment along with a sample of normal tissue, usually blood. Tissue and fluid used for analysis are called biospecimens.
- "Biospecimen samples used for genomic research need to meet a stringent set of criteria so that the genetic material (DNA and RNA) removed from them can be used by advanced genomic analysis and sequencing technologies.
- "The TCGA Biospecimen Core Resource laboratory processed samples to ensure they met the TCGA biospecimen criteria and prepare them for analysis. Part of the process included coding the biospecimens to remove any information that might connect a sample with a patient's private information."
The purpose of sharing these datasets is to expand cancer research. According to the website, these datasets have been used in at least a thousand cancer studies by independent researchers and TCGA Research Network members.