Cortes, C., Mohri, M., & Weston, J. (2005). A general regression technique for learning transductions. Proceedings of ICML 2005 (pp. 153-160).
Freund, Y., & Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119-139.
Friedman, J. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19, 1-67.
Friedman, J. (2001). Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 29, 1189-1232.
Friedman, J. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38, 367-378.
Geurts, P., Ernst, D., & Wehenkel, L. (2006a). Extremely randomized trees. Machine Learning, 63, 3-42.
Geurts, P., Wehenkel, L., & d'Alché-Buc, F. (2006b). Kernelizing the output of tree-based methods. Proc. of ICML (pp. 345-352).
Mason, L., Baxter, J., Bartlett, P., & Frean, M. (2000). Boosting algorithms as gradient descent. Neural Information Processing Systems (pp. 512-518).
Memisevic, R. (2006). An introduction to structured discriminative learning (Technical Report). Department of Computer Science, University of Toronto.
Rätsch, G., Demiriz, A., & Bennett, K. (2002). Sparse regression ensembles in infinite and finite hypothesis spaces. Machine Learning, 48, 193-221.
Szedmak, S., Shawe-Taylor, J., & Parrado-Hernandez, E. (2005). Learning via linear operators: Maximum margin regression (Technical Report). University of Southampton, UK.
Taskar, B., Chatalbashev, V., Koller, D., & Guestrin, C. (2005). Learning structured prediction models: A large margin approach. Proc. of ICML 2005 (pp. 897-904).
Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. JMLR, 6, 1453-1484.
Weston, J., Chapelle, O., Elisseeff, A., Schoelkopf, B., & Vapnik, V. (2002). Kernel dependency estimation. Advances in Neural Information Processing Systems, 15.
Yamanishi, Y., Vert, J.-P., & Kanehisa, M. (2005). Supervised enzyme network inference from the integration of genomic data and chemical information. Bioinformatics, 21, i468-i477.