2012 IEEE 18th International Conference on Parallel and Distributed Systems, 2012, p.696-697

Details

Author(s) / Contributors
Title
Supporting User-directed Fault Tolerance over Standard MPI
Is part of
  • 2012 IEEE 18th International Conference on Parallel and Distributed Systems, 2012, p.696-697
Place / Publisher
IEEE
Year of publication
2012
Source
IEEE Electronic Library (IEL)
Descriptions/Notes
  • User-directed means that fault tolerance is carried out dynamically and that users choose the fault tolerance mode based on application requirements. In this paper, we introduce a general scheme based on standard MPI that provides user-directed support for application-level algorithmic fault tolerance. User-directed fault tolerance acts as the connection between applications and algorithmic fault tolerance. As a case study, we incorporated our scheme into HPL combined with a non-blocking ABFT technique. We tested that the scheme's fault tolerance works in a real environment. Our evaluation also shows that when no failure occurs, our support adds only 2.5 percent overhead, and that when failures occur, the algorithmic fault tolerance continues to scale well.
Language
English
Identifiers
ISBN: 9781467345651, 1467345652
ISSN: 1521-9097
eISSN: 2690-5965
DOI: 10.1109/ICPADS.2012.100
Title ID: cdi_ieee_primary_6413632
