Euro-Par 2023: Parallel Processing, pp. 664-678

Author(s) / Contributors

Title

Faster Segmented Sort on GPUs

Is part of

Place / Publisher

Cham: Springer Nature Switzerland

Source

Alma/SFX Local Collection

Descriptions/Notes

Efficient parallel implementations of various sorting algorithms on modern hardware platforms are essential to numerous application areas. In this paper, we first measure the performance of the leading segmented sort implementation on CUDA-enabled GPUs and determine optimal setups using the resulting runtimes. Subsequently, we propose a number of changes that improve efficiency for segments of specific lengths. Furthermore, we introduce an alternative key-only version that sorts keys alone instead of key-value pairs, which allows for further optimization. Performance is evaluated by comparing runtimes of the original algorithm with our improved version for segments of different lengths, resulting in average speedups between 1.26 and 1.35 on four GPUs of different generations (Pascal, Volta, Ampere, Ada Lovelace). Furthermore, comparison to alternative segmented sort implementations from CUB and ModernGPU results in average speedups of at least 2.2 and 2.5, respectively, across all tested architectures. To illustrate how our improved sorting algorithm can be beneficial in a practical application, we have integrated it into the MetaCache-GPU pipeline for metagenomic DNA classification, resulting in speedups of up to 25.6% for the sorting step. Code is publicly available at https://gitlab.rlp.net/pararch/faster-segmented-sort-on-gpus.

Language: English
Identifiers: ISBN: 3031396979, 9783031396977
ISSN: 0302-9743
eISSN: 1611-3349
DOI: 10.1007/978-3-031-39698-4_45
Title ID: cdi_springer_books_10_1007_978_3_031_39698_4_45

Format: –
Keywords: Bioinformatics, CUDA, GPUs, Massively Parallel Algorithms, Sorting