Article (Scientific journals)
Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
Huynh-Thu, Vân Anh; Irrthum, Alexandre; Wehenkel, Louis et al.
2010, in PLoS ONE, 5 (9), p. e12776
Peer Reviewed verified by ORBi
 

Files


Full Text
Huynh-Thu 2010 PLoS One.pdf
Publisher postprint (713.12 kB)

Details



Keywords :
bioinformatics; Machine learning; systems biology
Abstract :
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high-throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (the target gene) is predicted from the expression patterns of all the other genes (the input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms in deciphering the genetic regulatory network of Escherichia coli. GENIE3 makes no assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
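The decomposition described in the abstract (one regression problem per target gene, tree-based importance scores, one global ranking of links) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `genie3_sketch`, `_grow`, and `_sse`, the parameters `n_trees` and `min_size`, and the toy data are all hypothetical, and the trees are a simplified Extra-Trees-style variant (random cut points, no pruning). Rescaling each target to unit variance so that scores are comparable across targets, and trying about sqrt(p - 1) candidate inputs per split, are assumptions of this sketch.

```python
import random


def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def _grow(rows, y, idx, inputs, k, min_size, rng, importance):
    """Split the samples in idx recursively; credit each split's variance reduction to its gene."""
    if len(idx) < min_size:
        return
    total = _sse([y[i] for i in idx])
    if total == 0:
        return
    best = None
    for gene in rng.sample(inputs, min(k, len(inputs))):
        values = [rows[i][gene] for i in idx]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue
        cut = rng.uniform(lo, hi)  # random cut point, Extra-Trees style
        left = [i for i in idx if rows[i][gene] < cut]
        right = [i for i in idx if rows[i][gene] >= cut]
        if not left or not right:
            continue
        gain = total - _sse([y[i] for i in left]) - _sse([y[i] for i in right])
        if best is None or gain > best[0]:
            best = (gain, gene, left, right)
    if best is None:
        return
    gain, gene, left, right = best
    importance[gene] += gain
    _grow(rows, y, left, inputs, k, min_size, rng, importance)
    _grow(rows, y, right, inputs, k, min_size, rng, importance)


def genie3_sketch(expr, n_trees=50, min_size=5, seed=0):
    """expr: list of samples, each a list of p expression values.

    Returns (input_gene, target_gene, score) tuples sorted by decreasing score.
    """
    rng = random.Random(seed)
    n, p = len(expr), len(expr[0])
    links = []
    for target in range(p):  # one regression problem per target gene
        y = [row[target] for row in expr]
        mean = sum(y) / n
        sd = (sum((v - mean) ** 2 for v in y) / n) ** 0.5 or 1.0
        y = [(v - mean) / sd for v in y]  # unit variance (assumption of this sketch)
        inputs = [g for g in range(p) if g != target]
        k = max(1, round(len(inputs) ** 0.5))  # candidate inputs tried per split
        importance = dict.fromkeys(inputs, 0.0)
        for _ in range(n_trees):
            _grow(expr, y, list(range(n)), inputs, k, min_size, rng, importance)
        # Average over trees and samples: the share of target variance credited to each input.
        links += [(g, target, importance[g] / (n_trees * n)) for g in inputs]
    return sorted(links, key=lambda link: link[2], reverse=True)


# Toy data (hypothetical): gene 1 depends non-linearly on gene 0; genes 2 and 3 are noise.
data_rng = random.Random(1)
expr = []
for _ in range(80):
    g0 = data_rng.uniform(-1, 1)
    expr.append([g0, g0 ** 2 + data_rng.gauss(0, 0.05),
                 data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)])
ranked = genie3_sketch(expr)
```

Because each link is scored separately in each direction, the ranking is directed: on this toy data the link from gene 0 to gene 1 is scored independently of the reverse link from gene 1 to gene 0. In practice a library ensemble (for example a Random Forest regressor exposing per-feature importances) would replace the hand-written trees.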
Research center :
Systems and modeling (Dept. of EE and CS) and Bioinformatics and modeling (GIGA-R)
Disciplines :
Computer science
Biochemistry, biophysics & molecular biology
Author, co-author :
Huynh-Thu, Vân Anh ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Irrthum, Alexandre ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis  ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Geurts, Pierre ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
Publication date :
28 September 2010
Journal title :
PLoS ONE
eISSN :
1932-6203
Publisher :
Public Library of Science, San Francisco, United States - California
Volume :
5
Issue :
9
Pages :
e12776
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBi :
since 02 October 2010

Statistics


Number of views
576 (53 by ULiège)
Number of downloads
377 (16 by ULiège)

Scopus citations®
1091
Scopus citations® without self-citations
1077
OpenCitations
1013
