ACM transactions on architecture and code optimization, 2024-01, Vol.21 (2), p.1-22, Article 26
2024

Details

Author(s) / Contributors
Title
Assessing the Impact of Compiler Optimizations on GPUs Reliability
Is part of
  • ACM transactions on architecture and code optimization, 2024-01, Vol.21 (2), p.1-22, Article 26
Place / Publisher
New York, NY: ACM
Year of publication
2024
Link to full text
Source
EZB Electronic Journals Library
Descriptions/Notes
  • Graphics Processing Unit (GPU) compilers have evolved to support general-purpose programming languages for multiple architectures. The NVIDIA CUDA Compiler (NVCC) has many compilation levels before generating the machine code and applies complex optimizations to improve performance. These optimizations modify how the software is mapped onto the underlying hardware; thus, as we show in this article, they can also affect GPU reliability. We evaluate the effects on the GPU error rate of the optimization flags applied at the NVCC Parallel Thread Execution (PTX) compilation phase by analyzing two NVIDIA GPU architectures (Kepler and Volta) and two compiler versions (NVCC 10.2 and 11.3). We compare and combine fault propagation analysis based on software fault injection, hardware utilization distribution obtained with application-level profiling, and the radiation-induced error rate of machine instructions measured with beam experiments. We consider eight different workloads and 144 combinations of compilation flags, and we show that optimizations can change the GPU error rate by up to an order of magnitude. Additionally, through accelerated neutron beam experiments on an NVIDIA Kepler GPU, we show that the error rate of the unoptimized GEMM (-O0 flag) is lower than that of the optimized GEMM (-O3 flag). When performance is evaluated together with the error rate, we show that the most optimized versions (-O1 and -O3) always produce a higher amount of correct data than the unoptimized code (-O0).
Language
English
Identifiers
ISSN: 1544-3566
eISSN: 1544-3973
DOI: 10.1145/3638249
Title ID: cdi_hal_primary_oai_HAL_hal_04398273v1
