Classifying pairs with trees for supervised biological network inference

Schrynemackers, Marie; Wehenkel, Louis; Madan Babu, Mohan; Geurts, Pierre

doi:10.1039/c5mb00174a

Article (Scientific journals)

2015 • In Molecular Biosystems, 11 (8), p. 2116-2125

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/2268/172436

DOI
10.1039/c5mb00174a

PubMed
26008881

Files

Full Text

schrynemackers-mbs-preprint.pdf

Author preprint (776.96 kB)

Annexes

schrynemackers-mbs-suppl.pdf

Publisher postprint (910.75 kB)

Supplementary material

schrynemackers-version-arxiv-2014.pdf

Publisher postprint (527.23 kB)

Version ArXiv 2014

Details

Keywords :

Network inference; Machine learning; Decision trees

Abstract :

Networks are ubiquitous in biology, and computational approaches to their inference have been widely investigated. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes. Here, we systematically investigate, theoretically and empirically, the use of tree-based ensemble methods within these two approaches for biological network inference. We first formalize network inference as the classification of pairs, unifying homogeneous and bipartite graphs in the process and discussing two main sampling schemes. We then present the global and local approaches, extending the latter to predict interactions between two unseen network nodes, and discuss their specialization to tree-based ensemble methods, highlighting their interpretability and drawing links with clustering techniques. Extensive computational experiments on various biological networks show that these methods are competitive with existing methods.
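
The difference between the two frameworks named in the abstract can be illustrated with a minimal sketch of how their learning samples might be built. Everything here is an assumption made for illustration and is not taken from the paper: the toy node features, the `known_edges` set, the function names `global_sample` and `local_samples`, and the choice to include both orderings of each pair in the global sample. No tree model is trained; the sketch only shows the shape of the data each approach would hand to a classifier.

```python
from itertools import combinations

# Hypothetical toy data: a feature vector per node and a partially known,
# undirected (homogeneous) network given as a set of interacting pairs.
features = {
    "a": [0.1, 1.0],
    "b": [0.2, 0.9],
    "c": [0.8, 0.1],
    "d": [0.9, 0.2],
}
known_edges = {frozenset(("a", "b")), frozenset(("c", "d"))}


def global_sample(features, edges):
    """Global approach: a single learning sample whose objects are pairs.

    Each row concatenates the feature vectors of the two nodes and is
    labeled 1 if the pair is a known interaction, 0 otherwise.
    """
    rows = []
    for u, v in combinations(sorted(features), 2):
        label = int(frozenset((u, v)) in edges)
        # Assumption: add both orderings so the sample does not depend on
        # which node of an undirected pair is listed first.
        rows.append((features[u] + features[v], label))
        rows.append((features[v] + features[u], label))
    return rows


def local_samples(features, edges):
    """Local approach: one learning sample per node.

    For node u, every other node v contributes one row: the features of v,
    labeled 1 if (u, v) is a known interaction, 0 otherwise.
    """
    samples = {}
    for u in sorted(features):
        samples[u] = [
            (features[v], int(frozenset((u, v)) in edges))
            for v in sorted(features)
            if v != u
        ]
    return samples


pairs = global_sample(features, known_edges)
per_node = local_samples(features, known_edges)
```

Under these assumptions, `pairs` would be passed to one classifier, while each entry of `per_node` would be passed to its own classifier, one per network node.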

Disciplines :

Engineering, computing & technology: Multidisciplinary, general & others

Author, co-author :

Schrynemackers, Marie ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore)

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Madan Babu, Mohan

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique

Language :

English

Title :

Classifying pairs with trees for supervised biological network inference

Publication date :

11 May 2015

Journal title :

Molecular Biosystems

ISSN :

1742-206X

eISSN :

1742-2051

Publisher :

Royal Society of Chemistry (RSC), United Kingdom

Volume :

11

Issue :

8

Pages :

2116-2125

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBi :

since 29 September 2014
