Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

  • Li Zhang
  • Jialu Fan
  • Wenqian Xue
  • Victor G. Lopez
  • Jinna Li
  • Tianyou Chai
  • Frank L. Lewis

Organisational units

External organisations

  • Northeastern University (NEU)
  • Liaoning Petrochemical University (LNPU)
  • University of Texas at Arlington

Details

Original language: English
Pages (from-to): 3553-3567
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 34
Issue number: 7
Publication status: Published - 18 Oct 2021

Abstract

This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the H∞ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed form of OPFB control algorithm for completely unknown systems. Under the premise that the disturbance attenuation condition is satisfied, conditions for the existence of the optimal OPFB solution are given. The convergence of the proposed Q-learning methods, as well as the differences and equivalence of the two algorithms, are rigorously proven. Moreover, considering the effect of the probing noise required for persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and thus avoids bias in the solution. Simulation results are presented to verify the effectiveness of the proposed approaches.
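
The abstract describes the control setting only at a high level. The minimal Python/NumPy sketch below illustrates the Q-function (game Riccati) recursion that underlies H∞ Q-learning for a linear DT system x_{k+1} = A x_k + B u_k + E w_k, y_k = C x_k. It is an illustration only, not the paper's algorithm: the matrices A, B, E, C, the weights, and the attenuation level gamma are assumed toy values, the recursion is model-based, and the paper's actual contribution (a model-free, off-policy Q-learning scheme using static output feedback data) is not reproduced here.

import numpy as np

# Toy system matrices (assumed, for illustration only)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
C = np.array([[1.0, 0.0]])

Qy = C.T @ C           # output penalty: y'y = x'C'Cx
R = np.array([[1.0]])  # control penalty
gamma = 2.0            # disturbance attenuation level (assumed)

n, m, q = A.shape[0], B.shape[1], E.shape[1]
P = np.zeros((n, n))   # value matrix of the zero-sum game

for _ in range(200):
    # Q-function matrix H over the augmented vector z = [x; u; w]:
    # Q(x,u,w) = x'Qy x + u'R u - gamma^2 w'w + (Ax+Bu+Ew)' P (Ax+Bu+Ew) = z'Hz
    ABE = np.hstack([A, B, E])
    H = (np.block([[Qy, np.zeros((n, m + q))],
                   [np.zeros((m, n)), R, np.zeros((m, q))],
                   [np.zeros((q, n + m)), -gamma**2 * np.eye(q)]])
         + ABE.T @ P @ ABE)

    # Partition H and compute the saddle-point policies u = -K x, w = -L x
    Hxx = H[:n, :n]
    Hxa = H[:n, n:]    # cross terms between x and the stacked action [u; w]
    Haa = H[n:, n:]    # block over [u; w]
    gains = np.linalg.solve(Haa, Hxa.T)   # stacked gains [K; L]

    # Game value update: P <- Hxx - Hxa Haa^{-1} Hxa'
    P_new = Hxx - Hxa @ gains
    if np.max(np.abs(P_new - P)) < 1e-10:
        P = P_new
        break
    P = P_new

K, L = gains[:m, :], gains[m:, :]
print("Control gain K:", K)
print("Worst-case disturbance gain L:", L)

In the paper's setting, the same Q-function matrix is identified from measured input/output/disturbance data without knowledge of A, B, E, and the policies are restricted to static output feedback; this sketch only shows the Bellman recursion such a learning scheme converges to.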

ASJC Scopus subject areas

Cite

Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning. / Zhang, Li; Fan, Jialu; Xue, Wenqian et al.
In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 34, No. 7, 18.10.2021, pp. 3553-3567.


Zhang L, Fan J, Xue W, Lopez VG, Li J, Chai T et al. Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning. IEEE Transactions on Neural Networks and Learning Systems. 2021 Oct 18;34(7):3553-3567. doi: 10.1109/TNNLS.2021.3112457
@article{d9bc7c1487d0485c852325f4a6c51e33,
title = "Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning",
abstract = "This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve H∞ static OPFB control problem of linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm form for completely unknown systems. Under the premise of satisfying disturbance attenuation conditions, the conditions for the existence of the optimal OPFB solution are given. The convergence of the proposed Q-learning methods, and the difference and equivalence of two algorithms are rigorously proven. Moreover, considering the effects brought by probing noise for the persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and avoiding biasedness of solution. Simulation results are presented to verify the effectiveness of the proposed approaches.",
keywords = "H∞ control, off-policy Q-learning, Q-learning, static output feedback (OPFB), zero-sum game",
author = "Li Zhang and Jialu Fan and Wenqian Xue and Lopez, {Victor G.} and Jinna Li and Tianyou Chai and Lewis, {Frank L.}",
note = "Funding Information: This work was supported in part by the NSFC under Grant 61991400, Grant 61991404, Grant 61533015, and Grant 62073158; in part by the 2020 Science and Technology Major Project of Liaoning Province under Grant 2020JH1/10100008; and in part by the Liaoning Revitalization Talents Program under Grant XLYC2007135.",
year = "2021",
month = oct,
day = "18",
doi = "10.1109/TNNLS.2021.3112457",
language = "English",
volume = "34",
pages = "3553--3567",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
number = "7",

}


TY - JOUR

T1 - Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

AU - Zhang, Li

AU - Fan, Jialu

AU - Xue, Wenqian

AU - Lopez, Victor G.

AU - Li, Jinna

AU - Chai, Tianyou

AU - Lewis, Frank L.

N1 - Funding Information: This work was supported in part by the NSFC under Grant 61991400, Grant 61991404, Grant 61533015, and Grant 62073158; in part by the 2020 Science and Technology Major Project of Liaoning Province under Grant 2020JH1/10100008; and in part by the Liaoning Revitalization Talents Program under Grant XLYC2007135.

PY - 2021/10/18

Y1 - 2021/10/18

N2 - This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve H∞ static OPFB control problem of linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm form for completely unknown systems. Under the premise of satisfying disturbance attenuation conditions, the conditions for the existence of the optimal OPFB solution are given. The convergence of the proposed Q-learning methods, and the difference and equivalence of two algorithms are rigorously proven. Moreover, considering the effects brought by probing noise for the persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and avoiding biasedness of solution. Simulation results are presented to verify the effectiveness of the proposed approaches.

AB - This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve H∞ static OPFB control problem of linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm form for completely unknown systems. Under the premise of satisfying disturbance attenuation conditions, the conditions for the existence of the optimal OPFB solution are given. The convergence of the proposed Q-learning methods, and the difference and equivalence of two algorithms are rigorously proven. Moreover, considering the effects brought by probing noise for the persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and avoiding biasedness of solution. Simulation results are presented to verify the effectiveness of the proposed approaches.

KW - H∞ control

KW - off-policy Q-learning

KW - Q-learning

KW - static output feedback (OPFB)

KW - zero-sum game

UR - http://www.scopus.com/inward/record.url?scp=85164272276&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2021.3112457

DO - 10.1109/TNNLS.2021.3112457

M3 - Article

C2 - 34662280

AN - SCOPUS:85164272276

VL - 34

SP - 3553

EP - 3567

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

IS - 7

ER -