Title: | Hierarchical Multi-Agent Reinforcement Learning |
Author: | Rajbala Makar, Sridhar Mahadevan, and Mohammad Ghavamzadeh |
Book Title: | Proceedings of the Fifth International Conference on Autonomous Agents |
Pages: | 246--253 |
Year: | 2001 |
Abstract: | In this paper we investigate the use of hierarchical reinforcement learning to speed up the acquisition of cooperative multi-agent tasks. We extend the MAXQ framework to the multi-agent case. Each agent uses the same MAXQ hierarchy to decompose a task into sub-tasks. Learning is decentralized, with each agent learning three interrelated skills: how to perform subtasks, which order to do them in, and how to coordinate with other agents. Coordination skills among agents are learned by using joint actions at the highest level(s) of the hierarchy. The Q nodes at the highest level(s) of the hierarchy are configured to represent the joint task-action space among multiple agents. In this approach, each agent only knows what other agents are doing at the level of sub-tasks, and is unaware of lower level (primitive) actions. This hierarchical approach allows agents to learn coordination faster by sharing information at the level of sub-tasks, rather than attempting to learn coordination taking into account primitive joint state-action values. We apply this hierarchical multi-agent reinforcement learning algorithm to a complex AGV scheduling task and compare its performance and speed with other learning approaches, including flat multi-agent, single agent using MAXQ, selfish multiple agents using MAXQ (where each agent acts independently without communicating with the other agents), as well as several well-known AGV heuristics like "first come first serve", "highest queue first" and "nearest station first". We also compare the tradeoffs in learning speed vs. performance of modeling joint action values at multiple levels in the MAXQ hierarchy. |
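The abstract describes the key structural idea: coordination is learned only at the level of sub-tasks, so only the top of each agent's MAXQ hierarchy sees what the other agents are doing, while primitive actions stay private to each agent. The following sketch (Python, not from the paper; the class name, the two-level value tables, and the one-step updates are illustrative assumptions) shows one way such a value-table layout could look: root-level Q values are keyed on the joint sub-task choice, lower-level values remain local.

from collections import defaultdict
import random

class CooperativeMaxQAgent:
    """Minimal sketch of a MAXQ-style agent that coordinates only at the top level."""

    def __init__(self, subtasks, primitive_actions, alpha=0.1, gamma=0.95):
        self.subtasks = subtasks                    # high-level sub-tasks (hypothetical names)
        self.primitive_actions = primitive_actions  # low-level actions, never shared
        self.alpha, self.gamma = alpha, gamma
        # Root-level values: keyed by (state, own sub-task, other agents' sub-tasks)
        self.q_root = defaultdict(float)
        # Lower-level values: keyed by (sub-task, state, primitive action) only
        self.q_low = defaultdict(float)

    def choose_subtask(self, state, others_subtasks, epsilon=0.1):
        """Pick a sub-task conditioned on the other agents' current sub-tasks."""
        if random.random() < epsilon:
            return random.choice(self.subtasks)
        return max(self.subtasks,
                   key=lambda t: self.q_root[(state, t, others_subtasks)])

    def update_root(self, state, subtask, others_subtasks, reward,
                    next_state, next_others):
        """Joint task-level update (one-step form for brevity; the paper uses SMDP-style updates)."""
        best_next = max(self.q_root[(next_state, t, next_others)] for t in self.subtasks)
        key = (state, subtask, others_subtasks)
        self.q_root[key] += self.alpha * (reward + self.gamma * best_next - self.q_root[key])

    def update_low(self, subtask, state, action, reward, next_state):
        """Primitive-level update; no information about other agents appears here."""
        best_next = max(self.q_low[(subtask, next_state, a)] for a in self.primitive_actions)
        key = (subtask, state, action)
        self.q_low[key] += self.alpha * (reward + self.gamma * best_next - self.q_low[key])

Because only the root-level table grows with the number of agents, the joint-action blow-up is confined to the small set of high-level sub-tasks rather than the primitive state-action space, which is the source of the speed-up the abstract claims.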
@InProceedings{makar01a,
author = {Rajbala Makar and Sridhar Mahadevan and Mohammad
Ghavamzadeh},
  title =        {Hierarchical Multi-Agent Reinforcement Learning},
booktitle = {Proceedings of the Fifth International Conference on
Autonomous Agents},
pages = {246--253},
year = 2001,
  abstract =     {In this paper we investigate the use of hierarchical
                  reinforcement learning to speed up the acquisition of
                  cooperative multi-agent tasks. We extend the MAXQ
                  framework to the multi-agent case. Each agent uses the
                  same MAXQ hierarchy to decompose a task into
                  sub-tasks. Learning is decentralized, with each agent
                  learning three interrelated skills: how to perform
                  subtasks, which order to do them in, and how to
                  coordinate with other agents. Coordination skills
                  among agents are learned by using joint actions at the
                  highest level(s) of the hierarchy. The Q nodes at the
                  highest level(s) of the hierarchy are configured to
                  represent the joint task-action space among multiple
                  agents. In this approach, each agent only knows what
                  other agents are doing at the level of sub-tasks, and
                  is unaware of lower level (primitive) actions. This
                  hierarchical approach allows agents to learn
                  coordination faster by sharing information at the
                  level of sub-tasks, rather than attempting to learn
                  coordination taking into account primitive joint
                  state-action values. We apply this hierarchical
                  multi-agent reinforcement learning algorithm to a
                  complex AGV scheduling task and compare its
                  performance and speed with other learning approaches,
                  including flat multi-agent, single agent using MAXQ,
                  selfish multiple agents using MAXQ (where each agent
                  acts independently without communicating with the
                  other agents), as well as several well-known AGV
                  heuristics like "first come first serve", "highest
                  queue first" and "nearest station first". We also
                  compare the tradeoffs in learning speed
                  vs. performance of modeling joint action values at
                  multiple levels in the MAXQ hierarchy.},
keywords = {multiagent reinforcement learning},
url = {http://jmvidal.cse.sc.edu/library/makar01a.pdf},
cluster = {2242243449876899562}
}