Improving Execution Concurrency of Large-Scale Matrix Multiplication on Distributed Data-Parallel Platforms

IEEE transactions on parallel and distributed systems, 2017-09, Vol.28 (9), p.2539-2552

2017

Details

Author(s) / Contributors

Title

Improving Execution Concurrency of Large-Scale Matrix Multiplication on Distributed Data-Parallel Platforms

Is part of

IEEE transactions on parallel and distributed systems, 2017-09, Vol.28 (9), p.2539-2552

Place / Publisher

New York: IEEE

Year of publication

2017

Source

IEEE Electronic Library (IEL)

Descriptions/Notes

Matrix multiplication is a dominant but very time-consuming operation in many big data analytic applications. Thus, its performance optimization is an important and fundamental research issue. The performance of large-scale matrix multiplication on distributed data-parallel platforms is determined by both computation and IO costs. For existing matrix multiplication execution strategies, when the execution concurrency scales up above a threshold, execution performance deteriorates quickly because the increase in IO cost outweighs the decrease in computation cost. This paper presents a novel parallel execution strategy, CRMM (Concurrent Replication-based Matrix Multiplication), along with a parallel algorithm, Marlin, for large-scale matrix multiplication on data-parallel platforms. The CRMM strategy exploits higher execution concurrency for sub-block matrix multiplication with the same IO cost. To further improve the performance of Marlin, we also propose a number of novel system-level optimizations, including increasing the concurrency of local data exchange by calling the native library in batch, reducing the overhead of block matrix transformation, and reducing disk-heavy shuffle operations by exploiting the semantics of matrix computation. We have implemented Marlin as a library along with a set of related matrix operations on Spark and have also contributed Marlin to the open-source community. For large-sized matrix multiplication, Marlin outperforms existing systems including Spark MLlib, SystemML and SciDB, with speedups of about 1.29×, 3.53× and 2.21× on average, respectively. An evaluation on a real-world DNN workload also indicates that Marlin outperforms the above systems with speedups of about 12.8×, 5.1× and 27.2×, respectively.
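
As a rough illustration of what "sub-block matrix multiplication" and "execution concurrency" can mean in this setting, the following is a minimal single-machine sketch of generic blocked matrix multiplication. It is not the CRMM strategy or the Marlin implementation, neither of which is specified in this record. The function names (`split_blocks`, `multiply_block`, `add_block`, `blocked_matmul`) and the block size are hypothetical, chosen only for this example. Each (i, j, k) block product is an independent task, so the number of such tasks is one simple measure of available concurrency. On a distributed platform each task would also need its two input blocks shipped to it, which is where the IO cost mentioned in the abstract comes from.

```python
from itertools import product


def split_blocks(matrix, block):
    """Split a matrix (list of lists) into a dict {(bi, bj): sub-block}."""
    rows, cols = len(matrix), len(matrix[0])
    blocks = {}
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            blocks[(r // block, c // block)] = [
                row[c:c + block] for row in matrix[r:r + block]
            ]
    return blocks


def multiply_block(a, b):
    """Plain dense product of two small blocks."""
    inner = len(b)
    width = len(b[0])
    return [
        [sum(a_row[t] * b[t][j] for t in range(inner)) for j in range(width)]
        for a_row in a
    ]


def add_block(x, y):
    """Element-wise sum of two equally shaped blocks."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def blocked_matmul(a, b, block):
    """Return (a @ b, number of independent block-product tasks)."""
    a_blocks = split_blocks(a, block)
    b_blocks = split_blocks(b, block)
    n_i = max(i for i, _ in a_blocks) + 1
    n_k = max(k for _, k in a_blocks) + 1
    n_j = max(j for _, j in b_blocks) + 1

    # Every (i, j, k) triple is an independent task; a distributed engine
    # could run all of them concurrently. Here they run sequentially.
    partials = {
        (i, j, k): multiply_block(a_blocks[(i, k)], b_blocks[(k, j)])
        for i, j, k in product(range(n_i), range(n_j), range(n_k))
    }

    # Reduce the partial products over k to obtain each result block.
    result = {}
    for (i, j, _k), part in partials.items():
        result[(i, j)] = add_block(result[(i, j)], part) if (i, j) in result else part

    # Stitch the result blocks back into one matrix.
    out = []
    for i in range(n_i):
        for r in range(len(result[(i, 0)])):
            row = []
            for j in range(n_j):
                row.extend(result[(i, j)][r])
            out.append(row)
    return out, len(partials)
```

In this sketch, a smaller block size yields more independent tasks but also more blocks to move and more partial results to reduce, which loosely mirrors the computation versus IO trade-off the abstract describes.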

Language: English
Identifiers: ISSN: 1045-9219
eISSN: 1558-2183
DOI: 10.1109/TPDS.2017.2686384
Title ID: cdi_proquest_journals_1927631560
