Online Quantization Adaptation for Fault-Tolerant Neural Network Inference

Research output: Chapter in book/report/conference proceeding › Contribution to book/anthology › Research › peer review

Authors

  • Michael Beyer
  • Jan Micha Borrmann
  • Andre Guntoro
  • Holger Blume

Research Organisations

External Research Organisations

  • Robert Bosch GmbH

Details

Original language: English
Title of host publication: Computer Safety, Reliability, and Security
Subtitle of host publication: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings
Editors: Jérémie Guiochet, Stefano Tonetta, Friedemann Bitsch
Publisher: Springer International Publishing AG
Pages: 243–256
Number of pages: 14
ISBN (electronic): 978-3-031-40923-3
ISBN (print): 978-3-031-40922-6
Publication status: Published - 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14181 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage the fact that many NNs maintain their accuracy when quantized to lower bit widths, and we adapt their quantization configuration at runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced-precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in an NN accelerator and evaluate the impact on an NN’s classification performance. We can preserve an NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10%, confirming the improved fault tolerance of the system.
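
To make the fail-degraded mode concrete: rather than masking computations performed on faulty HW, operands are re-quantized to a narrower bit width that the intact part of the hardware can still process, trading precision for retained compute capability. The following sketch (illustrative Python; the uniform symmetric quantization scheme, bit widths, and function names are assumptions for illustration, not code from the paper) compares a nominal 8-bit dot product with a fail-degraded variant that falls back to 4-bit weights:

import numpy as np

def quantize(x, bits):
    # Uniform symmetric quantization: map x onto signed integers of the given bit width.
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.max(np.abs(x))), 1e-12) / qmax
    return np.round(x / scale).astype(np.int32), scale

def quantized_dot(w, a, w_bits, a_bits):
    # Integer multiply-accumulate followed by rescaling, as in a typical NN accelerator datapath.
    qw, sw = quantize(w, w_bits)
    qa, sa = quantize(a, a_bits)
    return int(np.dot(qw, qa)) * sw * sa

rng = np.random.default_rng(0)
w, a = rng.standard_normal(64), rng.standard_normal(64)

nominal = quantized_dot(w, a, w_bits=8, a_bits=8)   # healthy HW: full 8-bit mode
degraded = quantized_dot(w, a, w_bits=4, a_bits=8)  # fail-degraded mode: 4-bit weights
print(f"fp32 {np.dot(w, a):+.4f} | 8-bit {nominal:+.4f} | 4-bit weights {degraded:+.4f}")

The degraded result is less precise than the nominal one but still usable, which is the trade the fail-degraded operating mode makes: reduced precision instead of lost computation.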

Keywords

    Approximate Computing, Automotive, Fault Tolerance, Neural Network Hardware, Neural Networks, Quantization

Cite this

Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. / Beyer, Michael; Borrmann, Jan Micha; Guntoro, Andre et al.
Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. ed. / Jérémie Guiochet; Stefano Tonetta; Friedemann Bitsch. Springer International Publishing AG, 2023. p. 243–256 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14181 LNCS).

Research output: Chapter in book/report/conference proceeding › Contribution to book/anthology › Research › peer review

Beyer, M, Borrmann, JM, Guntoro, A & Blume, H 2023, Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. in J Guiochet, S Tonetta & F Bitsch (eds), Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14181 LNCS, Springer International Publishing AG, pp. 243–256. https://doi.org/10.1007/978-3-031-40923-3_18
Beyer, M., Borrmann, J. M., Guntoro, A., & Blume, H. (2023). Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. In J. Guiochet, S. Tonetta, & F. Bitsch (Eds.), Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings (pp. 243–256). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14181 LNCS). Springer International Publishing AG. https://doi.org/10.1007/978-3-031-40923-3_18
Beyer M, Borrmann JM, Guntoro A, Blume H. Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. In Guiochet J, Tonetta S, Bitsch F, editors, Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. Springer International Publishing AG. 2023. p. 243–256. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2023 Sep 11. doi: 10.1007/978-3-031-40923-3_18
Beyer, Michael ; Borrmann, Jan Micha ; Guntoro, Andre et al. / Online Quantization Adaptation for Fault-Tolerant Neural Network Inference. Computer Safety, Reliability, and Security: 42nd International Conference, SAFECOMP 2023, Toulouse, France, September 20–22, 2023, Proceedings. editor / Jérémie Guiochet ; Stefano Tonetta ; Friedemann Bitsch. Springer International Publishing AG, 2023. pp. 243–256 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
BibTeX
@inbook{a408e49d5422442eaaa06e3b97b993b7,
title = "Online Quantization Adaptation for Fault-Tolerant Neural Network Inference",
abstract = "Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage the fact that many NNs maintain their accuracy when quantized to lower bit widths, and we adapt their quantization configuration at runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced-precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in an NN accelerator and evaluate the impact on an NN{\textquoteright}s classification performance. We can preserve an NN{\textquoteright}s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10%, confirming the improved fault tolerance of the system.",
keywords = "Approximate Computing, Automotive, Fault Tolerance, Neural Network Hardware, Neural Networks, Quantization",
author = "Michael Beyer and Borrmann, {Jan Micha} and Andre Guntoro and Holger Blume",
note = "This work is supported by the German federal ministry of education and research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062).",
year = "2023",
doi = "10.1007/978-3-031-40923-3_18",
language = "English",
isbn = "978-3-031-40922-6",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "14181 LNCS",
publisher = "Springer International Publishing AG",
pages = "243--256",
editor = "J{\'e}r{\'e}mie Guiochet and Stefano Tonetta and Friedemann Bitsch",
booktitle = "Computer Safety, Reliability, and Security",
address = "Switzerland",
}

RIS

TY - CHAP

T1 - Online Quantization Adaptation for Fault-Tolerant Neural Network Inference

AU - Beyer, Michael

AU - Borrmann, Jan Micha

AU - Guntoro, Andre

AU - Blume, Holger

N1 - This work is supported by the German federal ministry of education and research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062).

PY - 2023

Y1 - 2023

N2 - Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage the fact that many NNs maintain their accuracy when quantized to lower bit widths, and we adapt their quantization configuration at runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced-precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in an NN accelerator and evaluate the impact on an NN’s classification performance. We can preserve an NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10%, confirming the improved fault tolerance of the system.

AB - Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage the fact that many NNs maintain their accuracy when quantized to lower bit widths, and we adapt their quantization configuration at runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced-precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in an NN accelerator and evaluate the impact on an NN’s classification performance. We can preserve an NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10%, confirming the improved fault tolerance of the system.

KW - Approximate Computing

KW - Automotive

KW - Fault Tolerance

KW - Neural Network Hardware

KW - Neural Networks

KW - Quantization

UR - http://www.scopus.com/inward/record.url?scp=85172099424&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-40923-3_18

DO - 10.1007/978-3-031-40923-3_18

M3 - Contribution to book/anthology

SN - 978-3-031-40922-6

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 243

EP - 256

BT - Computer Safety, Reliability, and Security

A2 - Guiochet, Jérémie

A2 - Tonetta, Stefano

A2 - Bitsch, Friedemann

PB - Springer International Publishing AG

ER -
