
Details

Author(s) / Contributors
Title
Near-optimal sparse allreduce for distributed deep learning
Is part of
  • Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2022, p.135-149
Place / Publisher
New York, NY, USA: ACM
Year of publication
2022
Link to full text
Source
ACM Digital Library Complete
Descriptions/Notes
  • Communication overhead is one of the major obstacles to training large deep learning models at scale. Gradient sparsification is a promising technique to reduce the communication volume. However, it is very challenging to obtain real performance improvement because of (1) the difficulty of achieving a scalable and efficient sparse allreduce algorithm and (2) the sparsification overhead. This paper proposes Ok-Topk, a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proved. To reduce the sparsification overhead, Ok-Topk efficiently selects the top-k gradient values according to an estimated threshold (see the sketch below). Evaluations are conducted on the Piz Daint supercomputer with neural network models from different deep learning domains. Empirical results show that Ok-Topk achieves similar model accuracy to dense allreduce. Compared with the optimized dense and the state-of-the-art sparse allreduces, Ok-Topk is more scalable and significantly improves training throughput (e.g., 3.29x-12.95x improvement for BERT on 256 GPUs).
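
The threshold-based top-k selection described in the abstract can be illustrated with a short sketch. The following Python/NumPy code is a simplified, hypothetical illustration only, not the authors' Ok-Topk implementation; the function names, the adaptation rate, and the fallback to an exact top-k are assumptions made for the example.

# Minimal sketch, assuming NumPy only: select gradient values whose magnitude
# exceeds an estimated threshold, then adapt the threshold so that roughly k
# values pass on the next iteration (illustrative, not the Ok-Topk code).
import numpy as np

def select_by_threshold(grad, k, threshold):
    """Keep entries whose magnitude exceeds the estimated threshold."""
    idx = np.nonzero(np.abs(grad) > threshold)[0]
    num_candidates = idx.size
    if num_candidates > 2 * k:
        # Threshold estimate was too loose; fall back to exact top-k over the candidates.
        order = np.argsort(np.abs(grad[idx]))[-k:]
        idx = idx[order]
    return idx, grad[idx], num_candidates

def adapt_threshold(threshold, num_candidates, k, rate=0.05):
    """Nudge the threshold so that roughly k values are selected next time."""
    return threshold * (1.0 + rate) if num_candidates > k else threshold * (1.0 - rate)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal(1_000_000)   # stand-in for a flattened gradient
    k, threshold = 1_000, 3.0                # initial threshold is a rough guess
    for step in range(5):
        idx, vals, num_candidates = select_by_threshold(grad, k, threshold)
        threshold = adapt_threshold(threshold, num_candidates, k)
        print(f"step {step}: kept {idx.size} values, next threshold {threshold:.3f}")

In a distributed setting, only the selected (index, value) pairs would be exchanged by the sparse allreduce, which is how the communication volume is kept proportional to k rather than to the full gradient size.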
Language
English
Identifiers
ISBN: 9781450392044, 1450392040
DOI: 10.1145/3503221.3508399
Title ID: cdi_acm_books_10_1145_3503221_3508399_brief
