We propose a generic framework for gene regulatory network (GRN) inference approached as a feature selection problem. GRNs obtained using machine learning (ML) techniques are often dense, whereas real GRNs are rather sparse. We use an optimal L-curve criterion, inspired by Tikhonov regularization, that uses the edge weight distribution for a given target gene to determine the optimal set of transcription factors (TFs) associated with it. Our proposed framework allows the incorporation of a mechanistic active binding network based on cis-regulatory motif analysis. We evaluate our regularization framework in conjunction with two non-linear ML techniques, namely gradient boosting machines (GBM) and random forests (GENIE), resulting in two regularized feature-selection methods called RGBM and RGENIE, respectively. RGBM has been used to identify the main transcription factors that are causally involved as master regulators of the gene expression signature activated in FGFR3-TACC3-positive glioblastoma. Here, we illustrate that RGBM identifies the main regulators of the molecular subtypes of brain tumors. Our analysis reveals the identity and corresponding biological activities of the master regulators characterizing the difference between the G-CIMP-high and G-CIMP-low subtypes and between PA-like and LGm6-GBM, thus providing a clue to the as yet undetermined nature of the transcriptional events among these subtypes.
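The abstract does not spell out the L-curve criterion, so the following is only a minimal sketch of the general idea of cutting a per-target edge weight distribution at its corner. It assumes a generic elbow heuristic (the point on the sorted weight curve farthest from the chord joining its endpoints) and is not the procedure implemented in RGBM; the function name `select_by_elbow` and the example weights are hypothetical.

```python
def select_by_elbow(weights):
    """Return indices of candidate regulators kept by a simple elbow rule.

    Assumption: this is a generic corner heuristic, not the RGBM criterion.
    """
    # Rank candidates from highest to lowest edge weight.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    w = [weights[i] for i in order]
    n = len(w)
    if n < 3 or w[0] == w[-1]:
        return order  # too few points or a flat curve: nothing to cut

    # Rescale both axes to [0, 1] so the corner does not depend on units.
    xs = [k / (n - 1) for k in range(n)]
    ys = [(v - w[-1]) / (w[0] - w[-1]) for v in w]

    # The chord runs from (0, 1) to (1, 0), i.e. the line x + y - 1 = 0.
    # The signed offset is negative below the chord and positive above it.
    signed = [x + y - 1 for x, y in zip(xs, ys)]
    knee = max(range(n), key=lambda k: abs(signed[k]))
    if abs(signed[knee]) < 1e-12:
        return order  # the curve is a straight line: no corner to cut at

    # Below the chord the corner is the first low weight, so drop it;
    # above the chord the corner is the last high weight, so keep it.
    cut = knee + 1 if signed[knee] > 0 else knee
    return order[:cut]


if __name__ == "__main__":
    # Hypothetical importances of six candidate TFs for one target gene.
    names = ["TF_A", "TF_B", "TF_C", "TF_D", "TF_E", "TF_F"]
    scores = [0.02, 0.90, 0.05, 0.80, 0.03, 0.04]
    print([names[i] for i in select_by_elbow(scores)])
```

In this sketch the two dominant weights survive and the long tail of small weights is discarded, which mirrors the stated goal of turning a dense inferred network into a sparse one.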

RGBM: regularized gradient boosting machines for identification of the transcriptional regulators of discrete glioma subtypes / Mall, R; Cerulo, L; Garofano, L; Frattini, V; Kunji, K; Bensmail, H; Sabedot, Ts; Noushmehr, H; Lasorella, A; Iavarone, A; Ceccarelli, M. - In: NUCLEIC ACIDS RESEARCH. - ISSN 1362-4962. - 46:7(2018), p. 39. [10.1093/nar/gky015]

RGBM: regularized gradient boosting machines for identification of the transcriptional regulators of discrete glioma subtypes

Cerulo L;Ceccarelli M
2018

Files in this record:
RGBM_NAR.pdf (Adobe PDF, 10.27 MB)
Description: Paper
Type: Publisher's version (PDF)
Access: open access
License: Public domain


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/844541