Online Quantization Adaptation for Fault-Tolerant Neural Network Inference

Publication: Chapter in book/report/anthology/conference proceedings; contribution to book/anthology; research; peer-reviewed

Authors

Michael Beyer, Jan Micha Borrmann, Andre Guntoro, Holger Blume

External organizations

  • Robert Bosch GmbH

Details

Original language: English
Title of host publication: Computer Safety, Reliability, and Security
Subtitle: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings
Editors: Jérémie Guiochet, Stefano Tonetta, Friedemann Bitsch
Publisher: Springer International Publishing AG
Pages: 243–256
Number of pages: 14
ISBN (electronic): 978-3-031-40923-3
ISBN (print): 978-3-031-40922-6
Publication status: Published - 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14181 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN’s classification performance. We can preserve a NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 %, confirming the improved fault tolerance of the system.
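The difference between masking and the fail-degraded mode can be illustrated with a toy calculation. The following Python sketch is not the authors' implementation. It assumes a hypothetical fault model in which a faulty processing element can still compute correctly at a reduced bit width, and the names `quantize` and `apply_faults` as well as the 8-bit and 4-bit settings are illustrative choices. It compares zeroing the affected weights with re-quantizing them to fewer bits.

```python
def quantize(values, bits, max_abs):
    """Symmetric uniform quantization onto a signed grid with `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 7 for 4 bits
    scale = max_abs / levels              # step size of the quantization grid
    return [max(-levels, min(levels, round(v / scale))) * scale for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def apply_faults(weights, faulty, mode, full_bits=8, reduced_bits=4):
    """Return the effective weights when the indices in `faulty` are hit.

    mode="mask":    affected weights are dropped (set to zero).
    mode="degrade": affected weights are kept at `reduced_bits` precision
                    (assumed fault model, see the text above).
    """
    max_abs = max(abs(w) for w in weights)
    full = quantize(weights, full_bits, max_abs)
    if mode == "mask":
        return [0.0 if i in faulty else w for i, w in enumerate(full)]
    reduced = quantize(weights, reduced_bits, max_abs)
    return [reduced[i] if i in faulty else w for i, w in enumerate(full)]


# Hypothetical example data: one neuron with six weights, two faulty units.
weights = [0.9, -0.7, 0.5, -0.3, 0.8, 0.1]
inputs = [1.0, 2.0, -1.0, 0.5, 1.5, -2.0]
faulty = {0, 4}

reference = dot(apply_faults(weights, set(), "mask"), inputs)   # fault-free
masked = apply_faults(weights, faulty, "mask")
degraded = apply_faults(weights, faulty, "degrade")

err_mask = abs(dot(masked, inputs) - reference)
err_degrade = abs(dot(degraded, inputs) - reference)
print(f"masking error:       {err_mask:.4f}")
print(f"fail-degraded error: {err_degrade:.4f}")
```

Under these assumptions the degraded dot product stays much closer to the fault-free result than the masked one, because the affected weights keep a coarse approximation of their value instead of vanishing.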

Cite this

Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. / Beyer, Michael; Borrmann, Jan Micha; Guntoro, Andre et al.
Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Ed. / Jérémie Guiochet; Stefano Tonetta; Friedemann Bitsch. Springer International Publishing AG, 2023. pp. 243–256 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14181 LNCS).

Beyer, M, Borrmann, JM, Guntoro, A & Blume, H 2023, Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. in J Guiochet, S Tonetta & F Bitsch (eds), Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14181 LNCS, Springer International Publishing AG, pp. 243–256. https://doi.org/10.1007/978-3-031-40923-3_18
Beyer, M., Borrmann, J. M., Guntoro, A., & Blume, H. (2023). Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. In J. Guiochet, S. Tonetta, & F. Bitsch (Eds.), Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings (pp. 243–256). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14181 LNCS). Springer International Publishing AG. https://doi.org/10.1007/978-3-031-40923-3_18
Beyer M, Borrmann JM, Guntoro A, Blume H. Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. In Guiochet J, Tonetta S, Bitsch F, editors, Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Springer International Publishing AG. 2023. p. 243–256. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2023 Sep 11. doi: 10.1007/978-3-031-40923-3_18
Beyer, Michael ; Borrmann, Jan Micha ; Guntoro, Andre et al. / Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Ed. / Jérémie Guiochet ; Stefano Tonetta ; Friedemann Bitsch. Springer International Publishing AG, 2023. pp. 243–256 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inbook{a408e49d5422442eaaa06e3b97b993b7,
title = "Online Quantization Adaptation for Fault-Tolerant Neural Network Inference",
abstract = "Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN{\textquoteright}s classification performance. We can preserve a NN{\textquoteright}s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 \%, confirming the improved fault tolerance of the system.",
keywords = "Approximate Computing, Automotive, Fault Tolerance, Neural Network Hardware, Neural Networks, Quantization",
author = "Michael Beyer and Borrmann, {Jan Micha} and Andre Guntoro and Holger Blume",
note = "This work is supported by the German federal ministry of education and research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062).",
year = "2023",
doi = "10.1007/978-3-031-40923-3_18",
language = "English",
isbn = "978-3-031-40922-6",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer International Publishing AG",
pages = "243--256",
editor = "J{\'e}r{\'e}mie Guiochet and Stefano Tonetta and Friedemann Bitsch",
booktitle = "Computer Safety, Reliability, and Security",
address = "Switzerland",

}

TY  - CHAP
T1  - Online Quantization Adaptation for Fault-Tolerant Neural Network Inference
AU  - Beyer, Michael
AU  - Borrmann, Jan Micha
AU  - Guntoro, Andre
AU  - Blume, Holger
N1  - This work is supported by the German federal ministry of education and research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062).
PY  - 2023
Y1  - 2023
AB  - Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN’s classification performance. We can preserve a NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 %, confirming the improved fault tolerance of the system.
KW  - Approximate Computing
KW  - Automotive
KW  - Fault Tolerance
KW  - Neural Network Hardware
KW  - Neural Networks
KW  - Quantization
UR  - http://www.scopus.com/inward/record.url?scp=85172099424&partnerID=8YFLogxK
U2  - 10.1007/978-3-031-40923-3_18
DO  - 10.1007/978-3-031-40923-3_18
M3  - Contribution to book/anthology
SN  - 978-3-031-40922-6
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 243
EP  - 256
BT  - Computer Safety, Reliability, and Security
A2  - Guiochet, Jérémie
A2  - Tonetta, Stefano
A2  - Bitsch, Friedemann
PB  - Springer International Publishing AG
ER  -
