We have presented an algorithm that provides an effective way of pruning a recursive model so that only a few critical nodes need to be expanded in order to get a quality solution. The algorithm does the double task of delivering an answer within a set of payoff/time cost constraints, and pruning unnecessary knowledge from the recursive model of a situation. We also gave a formal framework for talking about strategies, expected strategies, and expected gains from expanding a recursive model.
The experimental results proved that this algorithm works, in the example domain. We expect that the algorithm will provide similar results in other problem domains, provided that they show some correspondence between the physical situation and the strategy played by the agent. The test runs taught us three main lessons. Firstly that it is possible and beneficial to ignore some of the deeper models. Secondly, that the question of which models can safely be ignored, and which ones should not be ignored, can only be answered at run time. Our data showed that the actual nodes chosen for expansion differ from situation to situation (i.e. depending on the relative locations of the predators and prey). Thirdly, they showed us how much memory is actually needed for handling the computation of expected strategies, even without considering the agent's mental state. This seems to suggest that more complete implementations will require a very smart similarity function for detecting which situations are similar to which others if we want to maintain the solution quality. We are currently investigating which techniques (e.g. reinforcement learning, neural networks, case-based reasoning) are best suited for this task.
From our experience with the algorithm, the test results, and our ongoing work, we arrived at some important conclusions. First of all, recursive models appear very frequently when working with Multi-Agent Systems (for another example see ). They are not just artifacts of our modeling techniques but are, in fact, dictated by the fact that ``smart'' agents when interacting with other agents, will usually chose to build models of them. Moreover, if the agents do a reasonable job in producing these models, they will find them useful for predicting the actions of others and, therefore, for determining the best action to take. The success of these models leads to the building of recursive models for predicting the actions of others. This escalation in recursive modeling quickly leads to models that are too unwieldy to use, as argued before. However, our second conclusion points to the fact that one can not pick an arbitrary model size and say that agents will only build recursive models smaller than that, even when given a fixed environment. The problem is that there are times when using deeper models will be profitable and times when the best models are the shallow and quick ones. We see our algorithm as a first step in developing a principled method that an agent can use to decide the depth of the models its should use.
Experience has also shown us that there is a definite set of challenges that lie ahead. We have seen that, in order to apply our techniques for reducing the amount of information the agent considers, we actually need more information. For our experiments, we made some ad-hoc decisions as to what extra information to gather and how to store it. We believe that such decisions will always have to be made and will be domain-dependent. That is why we are trying to divorce our theory from the particulars of the modeling/learning techniques and develop a decision framework that would allow an agent, given its own modeling/learning techniques, to determine what particular models to use and when to use them. Finally, we are looking for ways to apply or ideas to other modeling techniques. Our algorithm is based on the RMM framework which is a formal and clean modeling technique that made it easier for us to formalize our ideas. Unfortunately, these benefits do not come for free, namely, RMM suffers from a combinatorial explosion since it requires knowledge of the payoffs for all the combinations of all the possible actions. We are currently trying to apply the ideas distilled from this work and form a more general approach that can work with different agent modeling techniques such as those based on Beliefs, Desires and Intentions logics .
Next: References Up: Using Recursive Agent Models Previous: Results