Agents in a system where other agents are also learning face a
moving target function.
As an agent's behavior changes, the others' goals change, and
vice versa.
My CLRI theory can predict the expected error of an
agent given its change rate, learning rate, retention rate,
and impact.
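As a rough illustration of the kind of quantity the theory predicts, the sketch below iterates a hypothetical error recurrence in which learning shrinks an agent's current error, imperfect retention lets old error creep back, and the other agents' changes (scaled by their impact) re-introduce error. The update rule and all parameter values are assumptions chosen only for illustration; they are not the CLRI equations themselves.

    # Illustrative only: a hypothetical error recurrence loosely inspired by the
    # CLRI parameters (change, learning, retention, impact). This is NOT the
    # CLRI model; the update rule and the numbers below are assumptions.

    def expected_error(steps, change=0.2, learn=0.4, retain=0.9, impact=0.5, e0=0.8):
        """Iterate a toy recurrence for one agent's expected error.

        change : rate at which the other agents' behavior (the target) moves
        learn  : fraction of the current error the agent corrects each step
        retain : fraction of learned knowledge the agent keeps each step
        impact : how strongly the others' changes affect this agent
        e0     : initial expected error
        """
        e = e0
        trace = [e]
        for _ in range(steps):
            e = (1.0 - learn) * e                   # learning removes part of the error
            e = retain * e + (1.0 - retain) * e0    # forgetting drifts back toward e0
            e = e + impact * change * (1.0 - e)     # the moving target adds new error
            trace.append(e)
        return trace

    if __name__ == "__main__":
        for t, err in enumerate(expected_error(10)):
            print(f"t={t:2d}  E[error] = {err:.3f}")

But this is only a first step: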
What are the dynamics of such systems?
Does the addition of learning really make a system
more robust?
When is learning needed? When is it useful?
How does a selfish agent determine when learning could be a strategic advantage?
Specifically, we are studying the behavior of
reinforcement learning agents.
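As a minimal, self-contained sketch of that setting (not code from this work), the snippet below pits two stateless Q-learning agents against each other in repeated matching pennies. The game, the epsilon-greedy exploration, and the learning rate are all assumptions for illustration; the point is that each agent's best response depends on the other's current policy, so each one is learning against a moving target.

    import random

    ACTIONS = (0, 1)            # 0 = heads, 1 = tails
    ALPHA, EPSILON = 0.1, 0.1   # assumed learning and exploration rates

    def choose(q):
        """Epsilon-greedy action selection over a dict of Q-values."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[a])

    q1 = {a: 0.0 for a in ACTIONS}  # agent 1 wins when the coins match
    q2 = {a: 0.0 for a in ACTIONS}  # agent 2 wins when they differ

    for step in range(20000):
        a1, a2 = choose(q1), choose(q2)
        r1 = 1.0 if a1 == a2 else -1.0
        r2 = -r1
        # Stateless Q-learning: each agent treats the other as part of a
        # (non-stationary) environment, so its learning target keeps moving.
        q1[a1] += ALPHA * (r1 - q1[a1])
        q2[a2] += ALPHA * (r2 - q2[a2])

    print("agent 1 Q-values:", {a: round(v, 3) for a, v in q1.items()})
    print("agent 2 Q-values:", {a: round(v, 3) for a, v in q2.items()})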