2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2017, p.625-636
2017

Details

Author(s) / Contributors
Title
Efficient Sequential Consistency in GPUs via Relativistic Cache Coherence
Is part of
  • 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2017, p.625-636
Place / Publisher
IEEE
Year of publication
2017
Source
IEEE Electronic Library (IEL)
Descriptions / Notes
  • Recent work has argued that sequential consistency (SC) in GPUs can perform on par with weak memory models, provided ordering stalls are made less frequent by relaxing ordering for private and read-only data. In this paper, we address the complementary problem of reducing stall latencies for both read-only and read-write data. We find that SC stalls are particularly problematic for workloads with inter-workgroup sharing, and occur primarily due to earlier stores in the same thread; a substantial part of the overhead comes from the need to stall until write permissions are obtained (to ensure write atomicity). To address this, we propose RCC, a GPU coherence protocol which grants write permissions without stalling but can still be used to implement SC. RCC uses logical timestamps to determine a global memory order and L1 read permissions; even though each core may see a different logical "time," SC ordering can still be maintained. Unlike previous GPU SC proposals, our design does not require invasive core changes or additional per-core storage to classify read-only/private data. For workloads with inter-workgroup sharing, overall performance is 29% better and energy is 25% less than in the best previous GPU SC proposals, and within 7% of the best non-SC design.
Language
English
Identifiers
eISSN: 2378-203X
DOI: 10.1109/HPCA.2017.40
Title ID: cdi_ieee_primary_7920861
