<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ignasi Cos</style></author><author><style face="normal" font="default" size="100%">Lola Cañamero</style></author><author><style face="normal" font="default" size="100%">Gillian M Hayes</style></author><author><style face="normal" font="default" size="100%">Andrew Gillies</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Hedonic Value: Enhancing Adaptation for Motivated Agents</style></title><secondary-title><style face="normal" font="default" size="100%">Adaptive Behavior</style></secondary-title></titles><keywords><keyword><style face="normal" font="default" size="100%">Actor-Critic</style></keyword><keyword><style face="normal" font="default" size="100%">Grounding</style></keyword><keyword><style face="normal" font="default" size="100%">Hedonic Value</style></keyword><keyword><style face="normal" font="default" size="100%">Motivation</style></keyword><keyword><style face="normal" font="default" size="100%">Reinforcement Learning</style></keyword></keywords><dates><year><style face="normal" font="default" size="100%">2013</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://journals.sagepub.com/doi/10.1177/1059712313486817</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">SAGE</style></publisher><volume><style face="normal" font="default" size="100%">21</style></volume><pages><style face="normal" font="default" size="100%">465–483</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Reinforcement learning (RL) in the context of artificial agents is typically used to produce behavioural responses as
a function of the reward obtained through interaction with the environment. When the problem consists of learning the shortest path to a goal, it is common to use reward functions that yield a fixed value after each decision, for example a positive value once the target location has been attained and a negative one at each intermediate step. However, this fixed strategy may be too simplistic for agents to adapt to dynamic environments, in which resources may vary over time. By contrast, there is significant evidence that most living beings internally modulate reward value as a function of their context to expand their range of adaptivity. Inspired by the potential of this operation, we review its underlying processes and introduce a simplified formalisation for artificial agents. The performance of this formalism is tested by monitoring the adaptation of an agent, endowed with a motivated actor-critic model that embeds our formalisation of value and is constrained by physiological stability, to environments with different resource distributions.
Our main result shows that the manner in which reward is internally processed as a function of the agent’s motivational state strongly influences the adaptivity of the generated behavioural cycles and the agent’s physiological stability.</style></abstract><issue><style face="normal" font="default" size="100%">6</style></issue><notes><style face="normal" font="default" size="100%">&lt;a href=&quot;https://journals.sagepub.com/doi/10.1177/1059712313486817&quot;&gt;Download&lt;/a&gt;</style></notes></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ignasi Cos-Aguilera</style></author><author><style face="normal" font="default" size="100%">Lola Cañamero</style></author><author><style face="normal" font="default" size="100%">Gillian M Hayes</style></author><author><style face="normal" font="default" size="100%">Andrew Gillies</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Joanna J Bryson</style></author><author><style face="normal" font="default" size="100%">Tony J Prescott</style></author><author><style face="normal" font="default" size="100%">Anil K Seth</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Ecological Integration of Affordances and Drives for Behaviour Selection</style></title><secondary-title><style face="normal" font="default" size="100%">Proc. 
IJCAI 2005 Workshop on Modeling Natural Action Selection</style></secondary-title></titles><dates><year><style face="normal" font="default" size="100%">2005</style></year></dates><pub-location><style face="normal" font="default" size="100%">Edinburgh, Scotland</style></pub-location><pages><style face="normal" font="default" size="100%">225–228</style></pages><isbn><style face="normal" font="default" size="100%">1-902956-40-9</style></isbn><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">This paper presents a study of the integration of physiology and perception in a biologically inspired robotic architecture that learns behavioural patterns through interaction with the environment. The architecture implements a hierarchical view of learning and behaviour selection, in which adaptation is grounded in the relationship between reinforcement and the agent’s internal motivations. This view brings together the basic principles needed to explain how behavioural patterns are learnt and how they change through interaction with the environment. These principles have been tested experimentally, and the results are presented and discussed throughout the paper.</style></abstract></record></records></xml>