<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>5</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Arnaud J Blanchard</style></author><author><style face="normal" font="default" size="100%">Lola Cañamero</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Martin V Butz</style></author><author><style face="normal" font="default" size="100%">Olivier Sigaud</style></author><author><style face="normal" font="default" size="100%">Giovanni Pezzulo</style></author><author><style face="normal" font="default" size="100%">Gianluca Baldassarre</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Anticipating Rewards in Continuous Time and Space: A Case Study in Developmental Robotics</style></title><secondary-title><style face="normal" font="default" size="100%">Anticipatory Behavior in Adaptive Learning Systems: From Brains to Individual and Social Behavior</style></secondary-title><tertiary-title><style face="normal" font="default" size="100%">Lecture Notes in Artificial Intelligence</style></tertiary-title></titles><dates><year><style  face="normal" font="default" size="100%">2007</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://www.springer.com/gp/book/9783540742616</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">Springer</style></publisher><pub-location><style face="normal" font="default" size="100%">Berlin, Heidelberg</style></pub-location><volume><style face="normal" font="default" size="100%">4520</style></volume><pages><style face="normal" font="default" size="100%">267–284</style></pages><isbn><style face="normal" font="default" size="100%">978-3-540-74261-6</style></isbn><language><style face="normal" font="default" 
size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">This paper presents the first basic principles, implementation, and experimental results of what could be regarded as a new approach to reinforcement learning, in which agents (physical robots interacting with objects and other agents in the real world) learn to anticipate rewards from their sensory inputs. Our approach does not require discretization, a notion of events, or classification. Instead of learning rewards for every possible action of an agent in every situation, we propose that agents learn only the main situations worth avoiding or reaching. However, the main focus of our work is not reinforcement learning as such, but modeling cognitive development in a small autonomous robot interacting with an “adult” caretaker, typically a human, in the real world; the control architecture follows a Perception-Action approach incorporating a basic homeostatic principle. This interaction occurs in very close proximity, relies on very coarse and limited sensory-motor capabilities, and affects the “well-being” and affective state of the robot. The type of anticipatory behavior we are concerned with in this context relates to both sensory and reward anticipation. We have applied and tested our model on a real robot.</style></abstract></record></records></xml>