Grid; P2P; BitTorrent; file sharing; caching; Bag of Tasks; scheduling; Grid computing
Abstract:
The transfer of large input data files in P2P computing Grids often delays Task completion. Existing research on this topic has focused on the spatial grouping of Tasks, i.e. the reuse of available data through data caching and data-aware scheduling. However, spatial grouping tends to decrease the level of parallelism of Task execution. This paper addresses the issue by integrating the BitTorrent P2P file sharing protocol, a novel Task selection scheduling algorithm, an existing online, data-aware Resource selection algorithm (similar to Storage Affinity), and caching support. These algorithms have been implemented in the Lightweight Bartering Grid middleware. The Java implementation relies exclusively on Free and Open Source data transfer software (Azureus, Apache FTP server, edtFTPj). The proposed data transfer architecture requires neither Predictive Communications Ordering nor the explicit deployment of an overlay network, and it is easily deployable. Our main contribution is the joint use of P2P computing and P2P file sharing technologies, enabling a highly scalable and adaptive data transfer architecture to support P2P computing.
Disciplines:
Computer science
Author, co-author:
Briquet, Cyril ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Dalem, Xavier ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Jodogne, Sébastien ; Centre Hospitalier Universitaire de Liège - CHU > Radiothérapie
de Marneffe, Pierre-Arnoul ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Document language:
English
Title:
P2P File Sharing for P2P Computing
Publication/release date:
June 2009
Journal title:
Multiagent and Grid Systems
ISSN:
1574-1702
eISSN:
1875-9076
Publisher:
IOS Press, Amsterdam, Netherlands
Special issue title:
Content management and delivery through P2P-based content networks
C. Briquet, X. Dalem, S. Jodogne and P.A. de Marneffe.
P2P File Sharing for P2P Computing.
In Multiagent and Grid Systems. IOS Press, Volume 5, Issue 2, 2009.