Abstract :
The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixed-rank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks.
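As a concrete illustration of the approach, the sketch below performs online gradient descent in the factorized parameterization W = G Gᵀ, with G a d × r matrix, applied to learning a distance function from (pair, target-distance) examples. This is a minimal sketch under stated assumptions: the function names, the squared loss, and the step size are illustrative choices, not the paper's exact algorithm, and the plain Euclidean gradient in G used here corresponds only to the simplest flat geometry, whereas the paper develops the full Riemannian quotient geometry and its invariance properties. Each update costs O(dr) operations, which is the linear complexity in the problem size mentioned in the abstract.

```python
import numpy as np

def distance(G, x, z):
    """Squared distance d_W(x, z) = (x - z)^T W (x - z), with W = G G^T."""
    u = (x - z) @ G          # G^T (x - z), computed in O(d r)
    return u @ u

def sgd_step(G, x, z, y, eta=0.01):
    """One online gradient step on the squared loss (d_W(x, z) - y)^2.

    Since W = G G^T is positive semidefinite of rank at most r by
    construction, the update never leaves the search space.
    Gradient: grad_G (d_W - y)^2 = 4 (d_W - y) v v^T G, with v = x - z,
    so each step costs O(d r) -- linear in the dimension d.
    """
    v = x - z
    u = v @ G                    # G^T v, shape (r,)
    residual = u @ u - y         # d_W(x, z) - y
    G -= eta * 4.0 * residual * np.outer(v, u)
    return G

# Toy usage (hypothetical data): recover a rank-2 metric on 50-dimensional points.
rng = np.random.default_rng(0)
d, r = 50, 2
G = rng.standard_normal((d, r)) / np.sqrt(d)
for _ in range(2000):
    x, z = rng.standard_normal(d), rng.standard_normal(d)
    y = np.sum((x - z)[:2] ** 2)   # target: distance in the first two coordinates
    G = sgd_step(G, x, z, y)
```

The factorization is what makes the feasibility constraint free: any value of G yields a valid fixed-rank positive semidefinite W, so no projection step is needed after the update.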
Disciplines :
Computer science
Author, co-author :
Meyer, Gilles ; Université de Liège - ULiège > Department of Electrical Engineering and Computer Science (Institut Montefiore) > Systems and Modeling
Bonnabel, Silvère ; Mines ParisTech > Robotics Center
Sepulchre, Rodolphe ; Université de Liège - ULiège > Department of Electrical Engineering and Computer Science (Institut Montefiore) > Systems and Modeling
Language :
English
Title :
Regression on fixed-rank positive semidefinite matrices: a Riemannian approach
Publication date :
03 March 2011
Journal title :
Journal of Machine Learning Research
ISSN :
1532-4435
eISSN :
1533-7928
Publisher :
Microtome Publishing, Brookline, Massachusetts, United States