Gradient Energy Matching for Distributed Asynchronous Gradient Descent

Hermans, Joeri; Louppe, Gilles

Eprint already available on another site (E-prints, working papers and research blog)

2018

Permalink
https://hdl.handle.net/2268/226232

Files

Full Text

1805.08469.pdf

Author preprint (654.96 kB)

Details

Keywords :

Computer Science - Learning; Computer Science - Distributed, Parallel, and Cluster Computing; Statistics - Machine Learning

Abstract :

Distributed asynchronous SGD has become widely used for deep learning in large-scale systems, but remains notorious for its instability when increasing the number of workers. In this work, we study the dynamics of distributed asynchronous SGD under the lens of Lagrangian mechanics. Using this description, we introduce the concept of energy to describe the optimization process and derive a sufficient condition ensuring its stability as long as the collective energy induced by the active workers remains below the energy of a target synchronous process. Making use of this criterion, we derive a stable distributed asynchronous optimization procedure, GEM, that estimates and maintains the energy of the asynchronous system below or equal to the energy of sequential SGD with momentum. Experimental results highlight the stability and speedup of GEM compared to existing schemes, even when scaling to one hundred asynchronous workers. Results also indicate better generalization compared to the targeted SGD with momentum.

Disciplines :

Computer science

Author, co-author :

Hermans, Joeri ; Université de Liège - ULiège > Doct. sc. (info.)

Louppe, Gilles ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data

Language :

English

Title :

Gradient Energy Matching for Distributed Asynchronous Gradient Descent

Publication date :

22 May 2018

Source :

https://arxiv.org/abs/1805.08469

Additional URL :

https://arxiv.org/abs/1805.08469

Available on ORBi :

since 03 July 2018

Statistics

Number of views

45 (3 by ULiège)

Number of downloads

659 (0 by ULiège)
