[en] The knowledge of end-to-end network distances is
essential to many Internet applications. As active probing of all
pairwise distances is infeasible in large-scale networks, a natural
idea is to measure a few pairs and to predict the other ones
without actually measuring them. This paper formulates the
prediction problem as matrix completion where the unknown
entries in a pairwise distance matrix constructed from a network
are to be predicted. By assuming that the distance matrix has
a low-rank characteristics, the problem is solvable by lowrank
approximation based on matrix factorization. The new
formulation circumvents the well-known drawbacks of existing
approaches based on Euclidean embedding.
A new algorithm, so-called Decentralized Matrix Factorization
by Stochastic Gradient Descent (DMFSGD), is proposed. By
letting network nodes exchange messages with each other, the
algorithm is fully decentralized and only requires each node
to collect and to process local measurements, with neither
explicit matrix constructions nor special nodes such as landmarks
and central servers. In addition, we compared comprehensively
matrix factorization and Euclidean embedding to demonstrate
the suitability of the former on network distance prediction. We
further studied the incorporation of a robust loss function and
of non-negativity constraints. Extensive experiments on various
publicly-available datasets of network delays show not only the
scalability and the accuracy of our approach, but also its usability
in real Internet applications.
Disciplines :
Computer science
Author, co-author :
Liao, Yongjun ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques
Du, Wei; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Leduc, Guy ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques
Language :
English
Title :
DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction
Publication date :
11 October 2013
Journal title :
IEEE/ACM Transactions on Networking
ISSN :
1063-6692
eISSN :
1558-2566
Publisher :
Institute of Electrical and Electronics Engineers, New York, United States - New York
M. Crovella and B. Krishnamurthy, Internet Measurement: Infrastructure, Traffic and Applications. New York: Wiley, 2006.
S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable content-addressable network," in Proc. ACM SIGCOMM, San Diego, CA, Aug. 2001, pp. 161-172. (Pubitemid 32981962)
F. Dabek, "A distributed hash table," Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, Nov. 2005.
S. Ratnasamy,M. Handley, R. Karp, and S. Shenker, " Topologicallyaware overlay construction and server selection," in Proc. IEEE INFOCOM, New York, NY, Jun. 2002, pp. 1190-1199.
M. J. Freedman, K. Laskhminarayanan, and D. Mazières, "OASIS: Anycast for any service," in Proc. USENIX Symp. Netw. Syst. Design Implement., San Jose, CA, May 2006, pp. 10-23.
T. S. E. Ng and H. Zhang, "Predicting Internet network distance with coordinates-based approaches," in Proc. IEEE INFOCOM, NewYork, NY, Jun. 2002, pp. 170-179.
F. Dabek, R. Cox, F. Kaashoek, and R. Morris, "Vivaldi: A decentralized network coordinate system," in Proc. ACM SIGCOMM, Portland, OR, Aug. 2004, pp. 15-26. (Pubitemid 40954866)
Y. Mao, L. Saul, and J. M. Smith, "IDES: An Internet distance estimation service for large networks," IEEE J. Sel. Areas Commun., vol. 24, no. 12, pp. 2273-2284, Dec. 2006.
Y. Liao, P. Geurts, and G. Leduc, "Network distance prediction based on decentralized matrix factorization," in Proc. IFIP Netw. Conf., Chennai, India, May 2010, pp. 15-26.
B. Donnet, B. Gueye, and M. A. Kaafar, "A survey on network coordinates systems, design, and security," IEEE Commun. Surveys Tuts., vol. 12, no. 4, pp. 488-503, 4th Quart., 2010.
J. Ledlie, P. Gardner, and M. I. Seltzer, "Network coordinates in the wild," in Proc. USENIX Symp. Netw. Syst. Design Implement., Apr. 2007, pp. 22-35.
H. Zheng, E. K. Lua, M. Pias, and T. Griffin, "Internet routing policies and round-trip times," in Proc. Passive ActiveMeas., Boston, MA,Apr. 2005, pp. 236-250. (Pubitemid 41252750)
S. Lee, Z. Zhang, S. Sahu, and D. Saha, "On suitability of Euclidean embedding of Internet hosts," Perform. Eval. Rev., vol. 34, no. 1, pp. 157-168, 2006. (Pubitemid 44619088)
G. Wang, B. Zhang, and T. S. E. Ng, "Towards network triangle inequality violation aware distributed systems," in Proc. ACM/SIGCOMM Internet Meas. Conf., San Diego, CA, Oct. 2007, pp. 175-188.
S. Banerjee, T. G. Griffin, and M. Pias, "The interdomain connectivity of PlanetLab nodes," in Proc. Passive Active Meas., Antibes Juan-les- Pins, France, Apr. 2004, pp. 73-82.
E. K. Lua, T. Griffin, M. Pias, H. Zheng, and J. Crowcroft, "On the accuracy of embeddings for Internet coordinate systems," in Proc. ACM/ SIGCOMM Internet Meas. Conf., Berkeley, CA, 2005, pp. 1-14.
E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Found. Comput. Math., vol. 9, no. 6, pp. 717-772, 2009.
E. J. Candès and Y. Plan, "Matrix completion with noise," Proc. IEEE, vol. 98, no. 6, pp. 925-936, Jun. 2010.
R. H. Keshavan, S. Oh, and A. Montanari, "Matrix completion from a few entries," Comput. Res. Repository, 2009, abs/0901.3150.
L. Tang and M. Crovella, "Virtual landmarks for the Internet," in Proc. ACM/SIGCOMM Internet Meas. Conf., Miami, FL, Oct. 2003, pp. 143-152. (Pubitemid 40730646)
D. B. Chua, E.D.Kolaczyk, and M. Crovella, "Network kriging," IEEE J. Sel. Areas Commun., vol. 24, no. 12, pp. 2263-2272, Dec. 2006. (Pubitemid 44955766)
Y. Chen,D. Bindel,H. Song, and R. H.Katz, "An algebraic approach to practical and scalable overlay networkmonitoring," Comput. Commun. Rev., vol. 34, no. 4, pp. 55-66, Aug. 2004. (Pubitemid 40954869)
N. S. Nati and T. Jaakkola, "Weighted low-rank approximations," in Proc. Int. Conf.Mach. Learning,Washington, DC, 2003, pp. 720-727.
A. M. Buchanan and A. W. Fitzgibbon, "Damped newton algorithms for matrix factorization with missing data," in Proc. Comput. Vision Pattern Recogn., San Diego, CA, 2005, vol. 2, pp. 316-322.
S. Shalev-Shwartz, A. Gonen, and O. Shamir, "Large-scale convex minimization with a low-rank constraint," in Proc. Int. Conf. Mach. Learning, Bellevue, WA, 2011, pp. 329-336.
L. Bottou, ,D. Saad, Ed., "Online algorithms and stochastic approximations," in Online Learning and Neural Networks. Cambridge, U.K.: Cambridge Univ. Press, 1998.
A. Pathak, H. Pucha, Y. Zhang, Y. C. Hu, and Z. M. Mao, "A measurement study of Internet delay asymmetry," in Proc. Passive Active Meas., Cleveland, OH, Apr. 2008, pp. 182-191. (Pubitemid 351702291)
Y. He, M. Faloutsos, S. Krishnamurthy, B. Huffaker, Y. He, M. Faloutsos, S. Krishnamurthy, and B. Huffaker, "On routing asymmetry in the Internet," in Proc. IEEE GLOBECOM, St. Louis, MO, 2005, vol. 2, pp. 904-909. (Pubitemid 46171264)
B. Wong, A. Slivkins, and E. Sirer, "Meridian: A lightweight network location service without virtual coordinates," in Proc. ACM SIGCOMM, Philadelphia, PA, Aug. 2005, pp. 85-96. (Pubitemid 46323496)
G. H. Golub and C. F. van Loan, Matrix Computations, 3rd ed. Baltimore, MD: Johns Hopkins Univ. Press, 1996.
D. D. Lee and H. S. Seung, "Algorithms for non-negativematrix factorization," in Proc. Adv. Neural Inf. Process. Syst., Vancouver, Canada, 2001, pp. 556-562.
G. Takács, I. Pilászy, B. Németh, and D. Tikk, "Scalable collaborative filtering approaches for large recommender systems," J. Mach. Learning Res., vol. 10, pp. 623-656, June 2009.
D. Bertsekas, Nonlinear Programming. Belmont, MA: Athena Scientific, 1999.
C. Hennig and M. Kutlukaya, "Some thoughts about the design of loss functions," REVSTAT-Statist. J., vol. 5, no. 1, pp. 19-39, 2007.
Q. Ke and T. Kanade, "Robust norm factorization in the presence of outliers and missing data by alternative convex programming," in Proc. Comput. Vision Pattern Recogn., San Diego, CA, 2005, vol. 2, pp. 592-599.
C.-J. Lin, "Projected gradient methods for nonnegative matrix factorization," Neural Comput., vol. 19, no. 10, pp. 2756-2779, Oct. 2007.
K. P. Gummadi, S. Saroiu, and S. D. Gribble, "King: Estimating latency between arbitrary Internet end hosts," in Proc. ACM/SIGCOMM Internet Meas. Workshop, Marseille, France, Nov. 2002, pp. 5-18.
I. Borg and P. Groenen,ModernMultidimensional Scaling: Theory and Applications. New York: Springer, 2005.
G.Wang and T. S. E. Ng, "Distributed algorithms for stable and secure network coordinates," in Proc. ACM/SIGCOMM Internet Meas. Conf., Vouliagmeni, Greece, Oct. 2008, pp. 131-144.
Y. Liao,W. Du, P. Geurts, and G. Leduc, "Decentralized prediction of end-to-end network performance classes," in Proc. CoNEXT, Tokyo, Japan, Dec. 2011, pp. 14:1-14:12.