A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes

Publication: Contribution to book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

  • Martin Hoffmann
  • Peter Ulbrich
  • Christian Dietrich
  • Horst Schirmeier
  • Daniel Lohmann

External organisations

  • Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU Erlangen-Nürnberg)
  • Technische Universität Dortmund

Details

Original language: English
Title of host publication: 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering
Pages: 33-40
Number of pages: 8
ISBN (electronic): 978-1-4799-3466-9, 978-1-4799-3465-2
Publication status: Published - 6 March 2014
Externally published: Yes
Event: 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering, HASE 2014 - Miami, FL, USA
Duration: 9 Jan. 2014 to 11 Jan. 2014

Abstract

Arithmetic error coding schemes (AN codes) are a well-known and effective technique for soft-error mitigation. Although coding theory is a rich area of mathematics, implementing AN codes seems fairly easy. However, compliance with the theory is easily lost on the way to an actual implementation, ultimately jeopardizing the intended fault-tolerance characteristics. In this paper, we present our experiences and lessons learned from implementing AN codes in the Cored dependable voter. We focus on the challenges and pitfalls in the transition from maths to machine code for a binary computer from a systems perspective. Our results show that practical misconceptions (such as the use of prime numbers) and architecture-dependent implementation glitches occur at every stage of this transition. We identify typical pitfalls and describe practical measures to find and resolve them. Our measures eliminate all remaining silent data corruptions (SDCs) in the Cored voter, which is validated by an extensive fault-injection campaign that covers 100 percent of the fault space for 1-bit and 2-bit errors.
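
For readers unfamiliar with the technique named in the abstract, the basic idea of an AN code can be sketched in a few lines: a value v is stored as the product A * v, and any word that is not a multiple of A is treated as corrupted. The sketch below is an illustration only and is not the implementation described in the paper. The constant A, the 16-bit value range, the 32-bit word width, and all function names are assumptions chosen for this example.

```python
# Minimal AN-code sketch. All names and constants here are illustrative
# assumptions; none of them are taken from this publication record.

A = 58659          # assumed odd encoding constant (hypothetical choice)
VALUE_BITS = 16    # assumed width of unencoded values
WORD_BITS = 32     # assumed width of encoded words; A * (2**16 - 1) < 2**32


def encode(value: int) -> int:
    """Return the code word A * value for a value in the assumed 16-bit range."""
    if not 0 <= value < 2 ** VALUE_BITS:
        raise ValueError("value outside the assumed 16-bit range")
    return A * value


def is_valid(word: int) -> bool:
    """A word is a valid code word only if it is a multiple of A."""
    return word % A == 0


def decode(word: int) -> int:
    """Check the word and recover the original value."""
    if not is_valid(word):
        raise ValueError("corrupted code word detected")
    return word // A


def flip_bit(word: int, bit: int) -> int:
    """Model a single-bit soft error by inverting one bit of the word."""
    return word ^ (1 << bit)


def undetected_single_bit_flips(value: int) -> list[int]:
    """List the bit positions whose flip still yields a valid code word."""
    word = encode(value)
    return [bit for bit in range(WORD_BITS) if is_valid(flip_bit(word, bit))]
```

A single-bit flip changes the word by plus or minus a power of two, which is never a multiple of an odd A greater than 1, so `undetected_single_bit_flips` returns an empty list for every value in range. Addition also stays inside the code, since `encode(a) + encode(b)` equals `encode(a + b)` as long as the sum stays in range. Multi-bit errors are a different matter: how well they are detected depends on the choice of A, which is where the abstract's warning about intuitive choices such as prime numbers applies.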

Cite

A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes. / Hoffmann, Martin; Ulbrich, Peter; Dietrich, Christian et al.
2014 IEEE 15th International Symposium on High-Assurance Systems Engineering. 2014. pp. 33-40.

Hoffmann, M, Ulbrich, P, Dietrich, C, Schirmeier, H, Lohmann, D & Schroder-Preikschat, W 2014, A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes. in 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering. pp. 33-40, 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering, HASE 2014, Miami, FL, USA, 9 Jan. 2014. https://doi.org/10.1109/hase.2014.14
Hoffmann, M., Ulbrich, P., Dietrich, C., Schirmeier, H., Lohmann, D., & Schroder-Preikschat, W. (2014). A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes. In 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering (pp. 33-40) https://doi.org/10.1109/hase.2014.14
Hoffmann M, Ulbrich P, Dietrich C, Schirmeier H, Lohmann D, Schroder-Preikschat W. A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes. in 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering. 2014. pp. 33-40 doi: 10.1109/hase.2014.14
Hoffmann, Martin ; Ulbrich, Peter ; Dietrich, Christian et al. / A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes. 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering. 2014. pp. 33-40
@inproceedings{9ebb9843109e438aa7d42e02d706f2dc,
title = "A Practitioner{\textquoteright}s Guide to Software-based Soft-Error Mitigation Using AN-Codes",
abstract = "Arithmetic error coding schemes (AN codes) are a well-known and effective technique for soft-error mitigation. Although coding theory is a rich area of mathematics, implementing AN codes seems fairly easy. However, compliance with the theory is easily lost on the way to an actual implementation, ultimately jeopardizing the intended fault-tolerance characteristics. In this paper, we present our experiences and lessons learned from implementing AN codes in the Cored dependable voter. We focus on the challenges and pitfalls in the transition from maths to machine code for a binary computer from a systems perspective. Our results show that practical misconceptions (such as the use of prime numbers) and architecture-dependent implementation glitches occur at every stage of this transition. We identify typical pitfalls and describe practical measures to find and resolve them. Our measures eliminate all remaining silent data corruptions (SDCs) in the Cored voter, which is validated by an extensive fault-injection campaign that covers 100 percent of the fault space for 1-bit and 2-bit errors.",
keywords = "AN code, Arithmetic error coding, Fault injection, Redundancy, Soft errors, Software-based fault tolerance",
author = "Martin Hoffmann and Peter Ulbrich and Christian Dietrich and Horst Schirmeier and Daniel Lohmann and Wolfgang Schroder-Preikschat",
year = "2014",
month = mar,
day = "6",
doi = "10.1109/hase.2014.14",
language = "English",
pages = "33--40",
booktitle = "2014 IEEE 15th International Symposium on High-Assurance Systems Engineering",
note = "2014 IEEE 15th International Symposium on High-Assurance Systems Engineering, HASE 2014 ; Conference date: 09-01-2014 Through 11-01-2014",

}

TY - GEN

T1 - A Practitioner’s Guide to Software-based Soft-Error Mitigation Using AN-Codes

AU - Hoffmann, Martin

AU - Ulbrich, Peter

AU - Dietrich, Christian

AU - Schirmeier, Horst

AU - Lohmann, Daniel

AU - Schroder-Preikschat, Wolfgang

PY - 2014/3/6

Y1 - 2014/3/6

AB - Arithmetic error coding schemes (AN codes) are a well-known and effective technique for soft-error mitigation. Although coding theory is a rich area of mathematics, implementing AN codes seems fairly easy. However, compliance with the theory is easily lost on the way to an actual implementation, ultimately jeopardizing the intended fault-tolerance characteristics. In this paper, we present our experiences and lessons learned from implementing AN codes in the Cored dependable voter. We focus on the challenges and pitfalls in the transition from maths to machine code for a binary computer from a systems perspective. Our results show that practical misconceptions (such as the use of prime numbers) and architecture-dependent implementation glitches occur at every stage of this transition. We identify typical pitfalls and describe practical measures to find and resolve them. Our measures eliminate all remaining silent data corruptions (SDCs) in the Cored voter, which is validated by an extensive fault-injection campaign that covers 100 percent of the fault space for 1-bit and 2-bit errors.

KW - AN code

KW - Arithmetic error coding

KW - Fault injection

KW - Redundancy

KW - Soft errors

KW - Software-based fault tolerance

UR - http://www.scopus.com/inward/record.url?scp=84898643057&partnerID=8YFLogxK

U2 - 10.1109/hase.2014.14

DO - 10.1109/hase.2014.14

M3 - Conference contribution

AN - SCOPUS:84898643057

SP - 33

EP - 40

BT - 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering

T2 - 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering, HASE 2014

Y2 - 9 January 2014 through 11 January 2014

ER -