Engineering Adaptive Multiagent Systems

José M. Vidal

University of South Carolina


In the past twenty years we have seen an increasing emphasis on the study of the dynamics of complex systems such as the human immune system and the economy. Some of this work is reflected in volumes such as [1, 2, 3]. The belief is that we need to understand the relationships between disparate phenomena in order to determine the long-term repercussions of local actions. For example, economic policies like sanctions can have dire consequences on the environment because they might set off a sequence of events that result in undesirable consequences.

The rise of Internet and pervasive computing are prompting the need to understand and engineer systems of similar complexity. These systems, however, are composed of articial agents which implement machine learning techniques and will be refered to as adaptive multiagent systems. Examples range from simple adaptive embeded systems for pervasive computing and systems for dynamic resource allocation under uncertainty to grandiose world-wide electronic marketplaces of agents, global information exchanges, and ``global brains'' [4] that accrue humanity's knowledge.

In the following section I present my plan to study the dynamics of adaptive multiagent systems in order to develop a predictive theory of their dynamics and a methodology for the engineering of utility-based adaptive multiagent systems. I also present an system architecture and development plan for an incentive-compatible information exchange system. This system will use the engineering techniques I develop as well as serve as a first step towards realizing the visionary scenarios described above.

Adaptive Multiagent Systems

An adaptive multiagent system is composed of articial (software) agents that use machine-learning techniques to improve their behavior over time. The behaviors it exhibits are sometimes those of a naturally-occurring complex adaptive systems [5] but, since artificial agents have different characteristics from animals (e.g., they learn in a different way and do not reproduce), their emergent behavior is different. Since the agents in these systems are constantly learning about the effects of their actions, the world they live in, the behaviors of the other agents, and the effect that these behaviors have on the system, it is a challenge to build systems that will exhibit a specified behavior.

A simple example is the use of machine-learning agents in the iterated Prisoner's Dilemma [6]. Another example is the e-commerce system with agents buying and selling from each other and learning about the quality of goods and services received [7]. In this system the agents could learn models of other agents and can make better decisions about who to trade with and who to avoid [8].

Since the agents in an adaptive multiagent system are constantly changing their behavior to incorporate the new learned knowledge and they interact with each other in complex ways, it is hard to determine how the system as a whole will behave. Currently the only general way to find out what will happen is by actually taking the time to build the whole system and running it. However, it is very important to determine beforehand what to expect in terms of system dynamics. For example, we would not want to build a system where the agents end up not communicating with each other, or where they communicate so much they overwhelm the communications infrastructure, or where a few agents are destined to become dictators of the whole system, etc.

Engineering Adaptive Multiagent Systems

The theory I am developing views agents as utility maximizing or rational entities in much the same way economists view humans as rational utility maximizing agents. However my agents must learn which behaviors lead them to the highest payoff. Of course, as one agent adapts by changing its behavior, this might affect the utility that all the other agents receive.

As the agents modify their behavior the system designers will be concerned with the global utility of the system. The designer of the system defines a global utility for the system and then designs the system, to the best of his abilities and within the given constraints, so it will behave in a way that maximizes this utility. The global utility can be as simple as just the sum of the individual agent's utilities, or it can be a more complex function of the agents' behavior.

We can get a better understanding of the dynamics of such a system if we picture a 2-dimensional plane where each point represents a possible behavior for an agent. The points are connected to each other via directed links, each connection representing the possibility that the agent could learn and modify its behavior, moving it from one point to the other. We can then visualize a third dimension that represents the utility the agent will receive if it engages in the given behavior.

An agent's learning can then be viewed as a path in this landscape. Depending on the agent's learning algorithm, we can often assume the agent generally goes ``up'' on the utility landscape. That is, it tries to maximize its utility. However, since the other agents are also changing their behavior, this means the agent's landscape might change after each step and each of its steps might change other agents' landscapes.

Furthermore, we can take a global view where each point in the xy plane is a vector that contains all the agents' behaviors. In this landscape the z axis is the global utility. This is he picture the system designer is most concerned with. Does the system behavior converge to a global optima? What behavior does it engage in? The answers to these questions will depend on the particulars of the learning algorithm used. My research will answer these questions.

My research will develop a general utility-based approach to multiagent systems engineering. The basic steps of this approach we have already hinted at. We first consider each agent as endowed with its own utility function and we describe the desired global behavior using another utility function. Then we decide how the agents will behave. The agents can be either selfish utility-maximizing agents, they can form teams (or ``patches'') and maximize the utility of the group, or they can be made to maximize the global utility. Some of these choices might not be possible in some problem domains. Communication and coordination strategies are chosen in order to align the agents' behavior with the global utility. Finally, a machine learning algorithm might be used in some agents in order to increase their effectiveness over time and alighn their actions with their utility and with the global utility.

Deciding on which type of agent behavior (e.g., selfish or team-oriented), coordination or communication protocol, and machine learning algorithm to use for a particular implementation is still an open problem. Our research, however, will shed some light into this question. Specifically, once we can view the agent interactions as search in an n-dimensional utility landscape and we have the tools to analyze the dynamics of this search, then plotting the effect of any combination of behavior/protocol/learning algorithms should be a mostly automatic task.

That is, when this resarch is finished I will be able to provide tools, either mathematical or programmatic, which will take as input parameters describing the proposed system (e.g., team-based agents with size of five, a contract-net protocol, and reinforcement learning with learning rate of .8), and deliver a description of the expected system dynamics.

Impact of this Research

The theoretical research proposed in this section will enable the development of a wide range of multiagent systems. These include the allocation of weapons to targets in a military domain, the allocation of goods an services in an electronic commerce application , and the development of a manufacturing factory scheduler These systems could be made more robust and dynamic with the use of adaptive agents, if we could predict how local adaptivity affects global behavior. The coming emergence of nanotechnologies and of pervasive computing, where every electronic gadget is on the network, brought about by new technologies like Jini and Bluetooth will also enable the deployment of a mirriad of new adaptive multiagent systems.

The proposed research will also enable us to take one more step towards the goal of a ``global brain'' where the ``sum'' of the knowledge is made accessible to anyone, at any time. The WWW has allowed us to place a lot of information online. We must now endeavor to trasform this information into knowledge. One way to accomplish this is with the use of intelligent learning agents that can share and seek information in an intelligent fashion. For example, a doctor will have an agent that holds his information and answers his queries by asking other agents that it trusts (from past experience), or asking those agents to recommend other agents. Another example is a democracy where every person constantly expresses an opinion, via his agent, on those subjects he cares about so our leaders are always able to carry out the actual will of the people. Imagine a government that makes funding decisions by instantenously aggregating the knowledge of all the appropriate researchers in the field, rather than relying on the knowledge of a handful of reviewers.


  1. Stuart Kauffman. The Origins of Order: Self-Organization and Selection in Evolution. Oxford Univeristy Press. 1993.
  2. Mitchel Resnick. Turtles, Termites and Traffic Jams. The MIT Press. 1994.
  3. Robert Axelrod. The Complexity of Cooperation. p. 10--29, Princeton University Press. 1997.
  4. Michael Brooks. Global Brain. New Scientist, June 24; 2000.
  5. John H. Holland. Hidden Order. Addison-Wesley. 1995.
  6. Robert M. Axelrod. The Evolution of Cooperation. Basic Books. 1984.
  7. Jeffrey O. Kephart, James E. Hanson, and Amy R. Greenwald. Dynamic Pricing by Software Agents. Computer Networks, 32(6):731--752, 2000.
  8. José M. Vidal and Edmund H. Durfee. Learning Nested Models in an Information Economy Journal of Experimental and Theoretical Artificial Intelligence. 10(3):291--308, 1998.

José M Vidal-
This paper is available at
Last modified: Sat Jun 8 11:19:47 EDT 2002