Improving reinforcement learning by using case based heuristics

Reinaldo A.C. Bianchi, Raquel Ros, Ramon Lopez De Mantaras

Research output: Chapter in book, Conference contribution, Peer-reviewed

34 Citations (Scopus)

Abstract

This work presents a new approach that uses cases in a case base as heuristics to speed up Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Reinforcement Learning (RL) techniques. This approach, called Case Based Heuristically Accelerated Reinforcement Learning (CB-HARL), builds upon an emerging technique, Heuristically Accelerated Reinforcement Learning (HARL), in which RL methods are accelerated by making use of heuristic information. CB-HARL is a subset of RL that makes use of a heuristic function derived from a case base, in a Case Based Reasoning manner. An algorithm that incorporates CBR techniques into Heuristically Accelerated Q-Learning is also proposed. Empirical evaluations were conducted in a simulator for the RoboCup Four-Legged Soccer Competition, and the results show that agents using CB-HARL learn faster than agents using either RL or HARL methods.
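
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a case-derived heuristic could bias action selection in Q-learning. Everything in it is an assumption made for illustration: the integer states, the two-action set, the distance-based `retrieve_case`, the parameters `xi` and `eta`, and the form of the heuristic (a bonus just large enough for the suggested action to win the greedy choice), which follows a common HARL-style formulation rather than anything stated on this page.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical action set


def retrieve_case(case_base, state, max_distance=1):
    """Return the action of the nearest stored case, or None if none is close enough.

    Assumption: states are integers and similarity is absolute difference.
    """
    best_action, best_dist = None, max_distance + 1
    for case_state, case_action in case_base:
        dist = abs(case_state - state)
        if dist < best_dist:
            best_action, best_dist = case_action, dist
    return best_action


def heuristic(q, state, suggested, eta=1.0):
    """Assumed heuristic H(s, a): zero everywhere except the suggested action,
    which gets just enough bonus to exceed the current best Q-value by eta."""
    h = {a: 0.0 for a in ACTIONS}
    if suggested is not None:
        best_q = max(q[(state, a)] for a in ACTIONS)
        h[suggested] = best_q - q[(state, suggested)] + eta
    return h


def choose_action(q, state, case_base, epsilon=0.1, xi=1.0, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # exploration step ignores the heuristic
    h = heuristic(q, state, retrieve_case(case_base, state))
    return max(ACTIONS, key=lambda a: q[(state, a)] + xi * h[a])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning update; the heuristic never alters the Q-values."""
    target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)          # Q-table, all values start at 0.0
case_base = [(0, "right")]      # one hypothetical case: in state 0, go right
action = choose_action(q, 0, case_base, epsilon=0.0)
q_update(q, 0, action, reward=1.0, next_state=1)
```

In this sketch the heuristic only affects which action is tried, while the value update stays that of plain Q-learning, so a misleading case can slow learning but does not change what the Q-values converge toward.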

Original language: English
Title of host publication: Case-Based Reasoning Research and Development - 8th International Conference on Case-Based Reasoning, ICCBR 2009, Proceedings
Pages: 75-89
Number of pages: 15
DOIs
Publication status: Published - 2009
Externally published
Event: 8th International Conference on Case-Based Reasoning, ICCBR 2009 - Seattle, WA, United States
Duration: 20 Jul 2009 to 23 Jul 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5650 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 8th International Conference on Case-Based Reasoning, ICCBR 2009
Country/Territory: United States
City: Seattle, WA
Period: 20/07/09 to 23/07/09
