Accurately measuring collective operations at massive scale
Is part of
2008 IEEE International Symposium on Parallel and Distributed Processing, 2008, p.1-8
Place / Publisher
IEEE
Year of publication
2008
Source
IEEE Xplore
Descriptions/Notes
Accurate, reproducible, and comparable measurement of collective operations is a complicated task. Although different measurement schemes are implemented in well-known benchmarks, many of these schemes introduce different systematic errors in their measurements. We characterize these errors and select a window-based approach as the most accurate method. However, this approach complicates measurements significantly and introduces clock synchronization as a new source of systematic errors. We analyze approaches to avoid or correct those errors and develop a scalable synchronization scheme to conduct benchmarks on massively parallel systems. Our results are compared to the window-based scheme implemented in the SKaMPI benchmarks and show a reduction of the synchronization overhead by a factor of 16 on 128 processes.
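The abstract names a window-based scheme and clock synchronization but does not spell out either. The following is a minimal sketch of the general idea only, not the authors' scheme and not SKaMPI's implementation. It assumes a process estimates its clock offset to a reference process from ping-pong timestamps (keeping the sample with the smallest round-trip time), and that every process then starts each repetition at an agreed reference-clock time. All function names and the validity rule are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: names and the validity rule are assumptions,
# not taken from the paper or from SKaMPI.

def estimate_offset(samples):
    """Estimate (reference clock - local clock) from ping-pong samples.

    Each sample is (t_send_local, t_reference, t_recv_local). The sample
    with the smallest round-trip time is assumed to be the least disturbed,
    and the reference timestamp is assumed to fall at its midpoint.
    """
    t_send, t_ref, t_recv = min(samples, key=lambda s: s[2] - s[0])
    return t_ref - (t_send + t_recv) / 2.0


def window_starts(first_start, window, repetitions):
    """Reference-clock start times, one fixed-length window per repetition."""
    return [first_start + i * window for i in range(repetitions)]


def to_local(ref_time, offset):
    """Convert a reference-clock time to this process's local clock."""
    return ref_time - offset


def is_valid(arrival, finish, start, next_start):
    """Keep a repetition only if the process was ready before its window
    opened and the operation finished before the next window opened."""
    return arrival <= start and finish <= next_start


if __name__ == "__main__":
    samples = [(0.0, 5.3, 0.4), (1.0, 6.1, 1.2), (2.0, 7.5, 2.8)]
    offset = estimate_offset(samples)
    for start in window_starts(10.0, 0.5, 3):
        print("local start:", to_local(start, offset))
```

In a sketch like this, the offset error feeds directly into every start time, which is one way to see why the abstract treats clock synchronization as a new source of systematic error.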