2007 • In Fortino, Giancarlo; Mastroianni, Carlo; Pierre, Guillaume (Eds.) Proceedings of the Second Workshop on Use of P2P, GRID and Agents for the Development of Content Networks
Grid; P2P; BitTorrent; data sharing; caching; replication; Bag of Tasks; scheduling
Abstract :
[en] Scheduling Data-Intensive Bags of Tasks in P2P Grids leads to transfers of large input data files, which cause delays in completion times. We propose to combine several existing technologies and patterns to perform efficient data-aware scheduling:
(1) use of the BitTorrent P2P file sharing protocol to transfer data,
(2) data caching on computational Resources,
(3) use of a data-aware Resource selection scheduling algorithm similar to Storage Affinity,
(4) a new Task selection scheduling algorithm (Temporal Tasks Grouping), based on the temporally grouped scheduling of Tasks sharing input data files.
Data replication is also discussed.
The proposed approach does not need an overlay network or Predictive Communications Ordering, making our operational implementation of a P2P Grid middleware easily deployable in unstructured P2P networks. Experiments show that performance gains are achieved by combining BitTorrent, caching, Storage Affinity and Temporal Tasks Grouping. This work can be summarized as combining P2P Grid computing and P2P data transfer technologies.
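To make items (2) to (4) of the abstract more concrete, here is a minimal Python sketch of how data-aware Resource selection and temporal grouping of Tasks could interact with per-Resource caches. It is an illustration under stated assumptions, not the authors' implementation: the names `group_tasks`, `affinity`, `select_resource` and `schedule` are hypothetical, affinity is assumed to be the number of input bytes already cached on a Resource (in the spirit of Storage Affinity), and Resource load, availability and the BitTorrent transfers themselves are not modelled.

```python
def group_tasks(tasks):
    """Order task ids so that tasks with identical input sets are adjacent.

    Simplifying assumption: tasks are grouped only when their whole input
    set matches, and groups keep the order of first appearance.
    """
    groups = {}
    for task_id, inputs in tasks.items():
        groups.setdefault(frozenset(inputs), []).append(task_id)
    return [task_id for ids in groups.values() for task_id in ids]


def affinity(inputs, cache, sizes):
    """Bytes of a task's input that are already present in a cache."""
    return sum(sizes[f] for f in inputs if f in cache)


def select_resource(inputs, caches, sizes):
    """Pick the resource with the highest affinity.

    Ties are broken by resource name so the result is deterministic.
    """
    return max(sorted(caches), key=lambda r: affinity(inputs, caches[r], sizes))


def schedule(tasks, caches, sizes):
    """Return a list of (task_id, resource) pairs.

    Caches are copied and updated as tasks are placed, so later tasks
    see the files that earlier tasks brought to a resource.
    """
    caches = {r: set(files) for r, files in caches.items()}
    plan = []
    for task_id in group_tasks(tasks):
        resource = select_resource(tasks[task_id], caches, sizes)
        caches[resource] |= set(tasks[task_id])
        plan.append((task_id, resource))
    return plan


if __name__ == "__main__":
    sizes = {"a.dat": 100, "b.dat": 50}          # hypothetical file sizes
    tasks = {"t1": {"a.dat"}, "t2": {"b.dat"}, "t3": {"a.dat"}}
    caches = {"r1": set(), "r2": {"b.dat"}}      # r2 already caches b.dat
    print(schedule(tasks, caches, sizes))
    # [('t1', 'r1'), ('t3', 'r1'), ('t2', 'r2')]
```

In this toy run, `t1` and `t3` share `a.dat` and are scheduled back to back on the same Resource, so the file would only need to be transferred once, while `t2` goes to the Resource that already holds `b.dat`.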
Disciplines :
Computer science
Author, co-author :
Briquet, Cyril ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Dalem, Xavier ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Jodogne, Sébastien ; Centre Hospitalier Universitaire de Liège - CHU > Radiothérapie
de Marneffe, Pierre-Arnoul ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique (ingénierie du logiciel et algorithmique)
Language :
English
Title :
Scheduling Data-Intensive Bags of Tasks in P2P Grids with BitTorrent-enabled Data Distribution
Publication date :
26 June 2007
Event name :
HPDC, UPGRADE-CN'07 Workshop
Event place :
Monterey, United States - California
Event date :
25 June 2007
Audience :
International
Main work title :
Proceedings of the Second Workshop on Use of P2P, GRID and Agents for the Development of Content Networks