Inferring biological networks with output kernel trees

Background: Elucidating biological networks between proteins is one of the most important current challenges in systems biology. Computational approaches to this problem are important to complement high-throughput technologies and to help biologists design new experiments. In this work, we focus on the completion of a biological network from various sources of experimental data.

Results: We propose a new machine learning approach for the supervised inference of biological networks, based on a kernelization of the output space of regression trees. It inherits several features of tree-based algorithms, such as interpretability, robustness to irrelevant variables, and input scalability. We applied this method to the inference of two kinds of networks in the yeast S. cerevisiae: a protein-protein interaction network and an enzyme network. In both cases, we obtained results competitive with existing approaches. We also show that our method provides relevant insights into the input data regarding their potential relationship with the existence of interactions. Furthermore, we confirm the biological validity of our predictions in the context of an analysis of gene expression data.

Conclusion: Output kernel tree-based methods provide an efficient tool for the inference of biological networks from experimental data. Their simplicity and interpretability should make them of great value for biologists.
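The abstract describes the method only as "a kernelization of the output space of regression trees". As an illustrative sketch, not the authors' implementation, the snippet below shows one way such a tree can score a candidate split using only a Gram matrix over the outputs: the variance of a set of outputs in the kernel-induced feature space equals the mean of the diagonal entries minus the mean of all pairwise entries. The function names (`kernel_variance`, `split_score`, `best_threshold`) and the toy block-structured Gram matrix are assumptions made for this example; a kernel derived from the known part of the network would take the place of the toy matrix.

```python
def kernel_variance(gram, idx):
    """Variance of the outputs indexed by idx, computed in the feature
    space induced by the output kernel, using only Gram matrix entries."""
    n = len(idx)
    if n == 0:
        return 0.0
    mean_diag = sum(gram[i][i] for i in idx) / n
    mean_all = sum(gram[i][j] for i in idx for j in idx) / (n * n)
    return mean_diag - mean_all


def split_score(gram, left, right):
    """Variance reduction obtained by splitting left + right into two nodes."""
    n_l, n_r = len(left), len(right)
    n = n_l + n_r
    parent = kernel_variance(gram, left + right)
    return (parent
            - (n_l / n) * kernel_variance(gram, left)
            - (n_r / n) * kernel_variance(gram, right))


def best_threshold(x, gram):
    """Best cut point on a single numeric input feature x.
    Returns (score, threshold); threshold is None if no split is possible."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = (float("-inf"), None)
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no threshold separates equal feature values
        score = split_score(gram, order[:k], order[k:])
        if score > best[0]:
            best = (score, (lo + hi) / 2)
    return best


# Toy data (hypothetical): four proteins forming two groups in output space.
gram = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
x = [0.1, 0.2, 0.8, 0.9]  # one input feature, e.g. an expression measurement
score, threshold = best_threshold(x, gram)
```

On this toy input the selected cut separates the two groups, because that split removes all of the output variance. A full tree would apply the same scoring recursively over all input features.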

Disciplines :

Microbiology
Biochemistry, biophysics & molecular biology
Biotechnology

Author, co-author :

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Touleimat, Nizar; Université d'Evry > IBISC FRE CNRS 2873

Dutreix, Marie; Institut Curie (France)

d'Alché-Buc, Florence; Université d'Evry > IBISC FRE CNRS 2873

Language :

English

Title :

Inferring biological networks with output kernel trees

Publication date :

03 May 2007

Journal title :

BMC Bioinformatics

eISSN :

1471-2105

Publisher :

BioMed Central Ltd, London, United Kingdom

Volume :

8

Issue :

Suppl. 2

Pages :

Peer reviewed :

Peer Reviewed verified by ORBi

Additional URL :

http://www.biomedcentral.com/1471-2105/8/S2/S4

Available on ORBi :

since 13 August 2009

