The effects on adaptive behaviour of negatively valenced signals in reinforcement learning

Nicolas Navarro-Guerrero; Robert J. Lowe; Stefan Wermter

doi:10.1109/devlrn.2017.8329800

Details

Original language	English
Title of host publication	7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	148-155
Number of pages	8
ISBN (electronic)	9781538637159
Publication status	Published - 2 July 2017
Externally published	Yes
Event	7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 - Lisbon, Portugal. Duration: 18 Sept. 2017 → 21 Sept. 2017

Abstract

Reinforcement learning algorithms and particularly those based on temporal-difference learning are widely adopted and have been successfully applied to a number of problems as well as used to model animal learning. However, they are based on neural pathways involved in reward-seeking behaviour since little is known about punishment-driven learning and less still about the combined effects of both types of reinforcement on learning. This may not only be a shortcoming for computational models of human and animal learning but we have recently shown that it may also carry detrimental effects for machine learning applications, with respect to task performance and convergence speed. Here, we further explore our original results and compare the effects of different functions, i.e. binary, linear, exponential with different variance, for punishment on learning. Our experiments confirm the original finding of punishment signals reducing learning speed. It appears this result generalizes across a number of different functions of punishment reinforcement.

ASJC Scopus subject areas

Computer Science (all)
Artificial Intelligence
Engineering (all)
Mechanical Engineering
Mathematics (all)
Control and Optimization
Neuroscience (all)
Developmental Neuroscience

Cite this

The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. / Navarro-Guerrero, Nicolas; Lowe, Robert J.; Wermter, Stefan.
7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 148-155.

Publication: Chapter in book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Navarro-Guerrero, N, Lowe, RJ & Wermter, S 2017, The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. in 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017. Institute of Electrical and Electronics Engineers Inc., pp. 148-155, 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017, Lisbon, Portugal, 18 Sept. 2017. https://doi.org/10.1109/devlrn.2017.8329800

Navarro-Guerrero, N., Lowe, R. J., & Wermter, S. (2017). The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. In 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 (pp. 148-155). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/devlrn.2017.8329800

Navarro-Guerrero N, Lowe RJ, Wermter S. The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. In 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 148-155. doi: 10.1109/devlrn.2017.8329800

Navarro-Guerrero, Nicolas ; Lowe, Robert J. ; Wermter, Stefan. / The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 148-155.

@inproceedings{2116921fa5e444d0a3afbdc33875394d,
  title = "The effects on adaptive behaviour of negatively valenced signals in reinforcement learning",
  abstract = "Reinforcement learning algorithms and particularly those based on temporal-difference learning are widely adopted and have been successfully applied to a number of problems as well as used to model animal learning. However, they are based on neural pathways involved in reward-seeking behaviour since little is known about punishment-driven learning and less still about the combined effects of both types of reinforcement on learning. This may not only be a shortcoming for computational models of human and animal learning but we have recently shown that it may also carry detrimental effects for machine learning applications, with respect to task performance and convergence speed. Here, we further explore our original results and compare the effects of different functions, i.e. binary, linear, exponential with different variance, for punishment on learning. Our experiments confirm the original finding of punishment signals reducing learning speed. It appears this result generalizes across a number of different functions of punishment reinforcement.",
  author = "Nicolas Navarro-Guerrero and Lowe, {Robert J.} and Stefan Wermter",
  note = "Publisher Copyright: {\textcopyright} 2017 IEEE.; 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 ; Conference date: 18-09-2017 Through 21-09-2017",
  year = "2017",
  month = jul,
  day = "2",
  doi = "10.1109/devlrn.2017.8329800",
  language = "English",
  pages = "148--155",
  booktitle = "7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  address = "United States",
}

TY - GEN

T1 - The effects on adaptive behaviour of negatively valenced signals in reinforcement learning

AU - Navarro-Guerrero, Nicolas

AU - Lowe, Robert J.

AU - Wermter, Stefan

PY - 2017/7/2

Y1 - 2017/7/2

N2 - Reinforcement learning algorithms and particularly those based on temporal-difference learning are widely adopted and have been successfully applied to a number of problems as well as used to model animal learning. However, they are based on neural pathways involved in reward-seeking behaviour since little is known about punishment-driven learning and less still about the combined effects of both types of reinforcement on learning. This may not only be a shortcoming for computational models of human and animal learning but we have recently shown that it may also carry detrimental effects for machine learning applications, with respect to task performance and convergence speed. Here, we further explore our original results and compare the effects of different functions, i.e. binary, linear, exponential with different variance, for punishment on learning. Our experiments confirm the original finding of punishment signals reducing learning speed. It appears this result generalizes across a number of different functions of punishment reinforcement.

UR - http://www.scopus.com/inward/record.url?scp=85050366417&partnerID=8YFLogxK

U2 - 10.1109/devlrn.2017.8329800

DO - 10.1109/devlrn.2017.8329800

M3 - Conference contribution

AN - SCOPUS:85050366417

SP - 148

EP - 155

BT - 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017

Y2 - 18 September 2017 through 21 September 2017

ER -
