TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data / Colaprico, A.; Silva, T. C.; Olsen, C.; Garofano, L.; Cava, C.; Garolini, D.; Sabedot, T. S.; Malta, T. M.; Pagnotta, S. M.; Castiglioni, I.; Ceccarelli, M.; Bontempi, G.; Noushmehr, H. - In: NUCLEIC ACIDS RESEARCH. - ISSN 0305-1048. - 44:8(2016), p. e71. [10.1093/nar/gkv1507]
TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data
CECCARELLI, Michele
2016
Abstract
The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA’s research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries.

File | Description | Type | License | Size | Format |
---|---|---|---|---|---|
gkv1507.pdf (open access) | Paper | Publisher's version (PDF) | Public domain | 5.54 MB | Adobe PDF |