<BODY LANG="EN" bgcolor="#ffffff" text="#000000"> <P> <H1><A NAME="SECTION00050000000000000000">Learning a moving target</A></H1> <P> The sample complexities we have discussed give us an idea of the time it would take a learning algorithm to learn a <em>fixed</em> function. In many MASs, however, the function that the agents are trying to learn is constantly changing, mostly because the other agents are also learning and changing their behaviors. Our results still apply: an agent that takes longer to learn a fixed function will also take longer to learn a changing one, since that problem reduces to learning a series of different fixed functions over time. Still, we wish to know more about the relative effectiveness of learning algorithms with different sample complexities when the target functions they learn change at different rates. <P> The first thing we need to do is characterize the rate at which the learning algorithm learns and the rate at which the target function changes. We do this using a difference equation that tells us what the error (as defined in Equation&nbsp;<A HREF="node4.html#error" target="contents">1</A>) of the model will be at time <I>t</I>+1 given the error at time <I>t</I>. <P> The learning algorithm is trying to learn the function <I>f</I>, where <IMG WIDTH=32 HEIGHT=19 ALIGN=MIDDLE ALT="tex2html_wrap_inline1479" SRC="img50.gif"> is merely one of the possible models the agent can have. If this function has an error of 1, none of its mappings is correct (i.e., it takes the wrong action in every state). An agent with such a function at time <I>t</I> will observe the next action and will definitely learn something, since all its mappings are incorrect; its error at time <I>t</I>+1 will drop accordingly.
But as it learns more and more, the probability that the next action it observes is one it has not already learned keeps decreasing, so its error at time <I>t</I>+1 gets closer and closer to its error at time <I>t</I>. We model this behavior by assuming that the error <I>e</I><SUB><I>l</I></SUB> of a learning algorithm <I>l</I> at time <I>t</I>+1 is given by <P> <BR><A NAME="lerror">&#160;</A><IMG WIDTH=298 HEIGHT=12 ALIGN=MIDDLE ALT="displaymath1590" SRC="img51.gif"><BR> <P> where <IMG WIDTH=49 HEIGHT=19 ALIGN=MIDDLE ALT="tex2html_wrap_inline1495" SRC="img52.gif"> is the slope of the learning curve. <IMG WIDTH=7 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1497" SRC="img53.gif"> can be determined from the sample complexity and the learning algorithm used; in general, it is safe to say that bigger sample complexities lead to bigger <IMG WIDTH=7 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1497" SRC="img53.gif"> values. If <IMG WIDTH=28 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1501" SRC="img54.gif"> then the algorithm never learns anything; if <IMG WIDTH=29 HEIGHT=10 ALIGN=BOTTOM ALT="tex2html_wrap_inline1503" SRC="img55.gif"> it learns everything from just one example. We have plotted such a function in Figure&nbsp;<A HREF="node5.html#landmplot" target="contents">1</A> for<A NAME="tex2html10" HREF="footnode.html#421" target="footer"><IMG ALIGN=BOTTOM ALT="gif" SRC="http://ai.eecs.umich.edu/people/jmvidal/icons/foot_motif.gif"></A> <IMG WIDTH=32 HEIGHT=10 ALIGN=BOTTOM ALT="tex2html_wrap_inline1021" SRC="img7.gif">. <P> In a similar way, the moving target introduces error into the current model at each time step. If the current error is 1, no amount of moving can make it bigger (by definition), while as the error gets closer to 0 the moving target function is expected to introduce larger errors.
We can characterize the error function <I>e</I><SUB><I>m</I></SUB> for the moving target <I>M</I> with the difference equation: <BR><A NAME="mterror">&#160;</A><IMG WIDTH=324 HEIGHT=12 ALIGN=MIDDLE ALT="displaymath1592" SRC="img56.gif"><BR> <P> Here <I>v</I> is the intercept of the line (the error the moving target introduces when the current error is 0) and its slope is (1-<I>v</I>). We have also shown this curve in Figure&nbsp;<A HREF="node5.html#landmplot" target="contents">1</A>, for <I>v</I> = .1. It is easy to see that if <I>v</I>=0 then the target function never moves, while if <I>v</I>=1 it moves in the worst possible way. We use <I>v</I> as a measure of the <em>velocity</em> at which the target function moves away from the learned model. <P> In Figure&nbsp;<A HREF="node5.html#landmplot" target="contents">1</A> we trace the successive errors of a model under the two curves, assuming that the curves are defined on the same time scale. We see that the error converges not to a single point but to a pair of points. These points can be determined by combining Equations&nbsp;<A HREF="node5.html#lerror" target="contents">3</A> and&nbsp;<A HREF="node5.html#mterror" target="contents">4</A> into one error function.
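Before solving for those points analytically, the alternating process can be sketched in code. This is a minimal simulation assuming the linear forms implied by the discussion above: the learner scales the error by the slope of its learning curve, while the moving target maps an error <I>e</I> to <I>v</I> + (1-<I>v</I>)<I>e</I>, so an error of 1 stays at 1 and an error of 0 is pushed up to <I>v</I>. The function names and the specific values (those of Figure&nbsp;1, with a learning slope of .8) are illustrative, not taken from the paper's implementation.

```python
# Minimal simulation of the alternating learn/move process, assuming the
# linear recurrences implied by the text:
#   after learning:         e <- b * e          (b is the learning-curve slope)
#   after the target moves: e <- v + (1 - v) * e
# b = 1 never learns anything; b = 0 learns everything from one example.
# v = 0 is a fixed target; v = 1 moves in the worst possible way.

def trace_errors(e0, b, v, steps):
    """Alternate learning and target movement, recording each error."""
    errors = [e0]
    e = e0
    for _ in range(steps):
        e = b * e              # error right after the agent learns
        errors.append(e)
        e = v + (1 - v) * e    # error right after the target moves
        errors.append(e)
    return errors

errors = trace_errors(e0=1.0, b=0.8, v=0.1, steps=50)

# The sequence settles into a two-point cycle: a low error just after
# learning and a higher error just after the target moves.
print(errors[-2], errors[-1])
```

Tracing these values reproduces the staircase pattern of Figure&nbsp;1: the error bounces between the two curves until it reaches a pair of fixed points.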
<BR><IMG WIDTH=322 HEIGHT=12 ALIGN=MIDDLE ALT="displaymath1594" SRC="img57.gif"><BR> <P> The solutions for which <I>e</I>(<I>t</I>+1) = <I>e</I>(<I>t</I>) are <BR><A NAME="emin">&#160;</A><IMG WIDTH=301 HEIGHT=29 ALIGN=MIDDLE ALT="displaymath1596" SRC="img58.gif"><BR> <BR><A NAME="emax">&#160;</A><IMG WIDTH=349 HEIGHT=25 ALIGN=MIDDLE ALT="displaymath1598" SRC="img59.gif"><BR> <P> <P><A NAME="576">&#160;</A><A NAME="landmplot">&#160;</A><IMG WIDTH=306 HEIGHT=223 ALIGN=BOTTOM ALT="figure448" SRC="img60.gif"><BR> <STRONG>Figure:</STRONG> Plot of the errors for the learning function <I>e</I><SUB><I>l</I></SUB> and the moving target function <I>e</I><SUB><I>m</I></SUB> for <IMG WIDTH=32 HEIGHT=10 ALIGN=BOTTOM ALT="tex2html_wrap_inline1021" SRC="img7.gif">, and <I>v</I>=.1.<BR> <P> <P> Equation&nbsp;<A HREF="node5.html#emin" target="contents">6</A> gives us the minimum error we can expect given <I>e</I><SUB><I>l</I></SUB> and <I>e</I><SUB><I>m</I></SUB>; it corresponds to the error right after the agent has tried to learn the target function. Equation&nbsp;<A HREF="node5.html#emax" target="contents">7</A> gives the error right after the target moves. We now have the necessary math to examine the differences in the learning and moving target error functions. Let us say we have two agents with different sample complexities and, therefore, different learning rates <IMG WIDTH=7 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1497" SRC="img53.gif"> and <IMG WIDTH=38 HEIGHT=18 ALIGN=MIDDLE ALT="tex2html_wrap_inline1547" SRC="img61.gif">. Using Equation&nbsp;<A HREF="node5.html#emax" target="contents">7</A> we can calculate the difference in their expected maximum errors.
<BR><A NAME="deltaerror">&#160;</A><IMG WIDTH=359 HEIGHT=25 ALIGN=MIDDLE ALT="displaymath1600" SRC="img62.gif"><BR> <P> This is a rather complicated equation, so we have plotted it in Figure&nbsp;<A HREF="node5.html#deltaerrorplot" target="contents">2</A> for different values of <IMG WIDTH=17 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1549" SRC="img63.gif"> and <IMG WIDTH=32 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1551" SRC="img64.gif">. We notice that, for small values of <I>v</I>, <IMG WIDTH=15 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1019" SRC="img6.gif"> increases roughly linearly as a function of <I>v</I>, but it quickly reaches a maximum and then starts to decrease. That is, the difference in maximum expected error between the agent with the lower sample complexity (i.e., smaller <IMG WIDTH=7 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1497" SRC="img53.gif">) and the one with the higher sample complexity increases as a function of <I>v</I> for small values of <I>v</I>, and decreases for bigger values. Still, we notice that <IMG WIDTH=15 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1019" SRC="img6.gif"> is always positive, and equals 0 only for <I>v</I>=0 and <I>v</I>=1, both unlikely situations in practice.
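The shape just described can be checked numerically. The sketch below assumes the fixed-point form reconstructed from the discussion above, namely a maximum error of <I>v</I>/(1 - <I>b</I>(1-<I>v</I>)) for a learner whose learning curve has slope <I>b</I>, and evaluates the difference for an illustrative pair of slopes .5 and .9 (the actual values behind Figure&nbsp;2 are not stated in the text):

```python
def e_max(b, v):
    """Maximum fixed-point error (just after the target moves), assuming
    the linear recurrences above: e_max = v / (1 - b * (1 - v))."""
    return v / (1 - b * (1 - v))

def delta(b, b_prime, v):
    """Difference in maximum error: slower learner (b') minus faster (b)."""
    return e_max(b_prime, v) - e_max(b, v)

b, b_prime = 0.5, 0.9   # illustrative slopes, b' > b (higher sample complexity)
vs = [i / 100 for i in range(101)]
deltas = [delta(b, b_prime, v) for v in vs]

# The difference vanishes only at v = 0 (fixed target) and v = 1
# (worst-case movement), is positive everywhere in between, and has a
# single interior peak: the fast learner's advantage is largest at
# moderate velocities.
peak_v = vs[deltas.index(max(deltas))]
print(peak_v, max(deltas))
```

Sweeping <I>v</I> this way reproduces the qualitative shape of Figure&nbsp;2: a roughly linear rise for small <I>v</I>, an interior maximum, and a fall back to zero at <I>v</I>=1.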
<P> <P><A NAME="577">&#160;</A><A NAME="deltaerrorplot">&#160;</A><IMG WIDTH=311 HEIGHT=224 ALIGN=BOTTOM ALT="figure469" SRC="img65.gif"><BR> <STRONG>Figure:</STRONG> Difference in the error of an agent with high sample complexity (<IMG WIDTH=38 HEIGHT=18 ALIGN=MIDDLE ALT="tex2html_wrap_inline1547" SRC="img61.gif">) minus that of one with low sample complexity (<IMG WIDTH=7 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1497" SRC="img53.gif">), as given by Equation&nbsp;<A HREF="node5.html#deltaerror" target="contents">8</A>, plotted as a function of <I>v</I>.<BR> <P> <P> <BR><A NAME="thmvolatility">&#160;</A><IMG WIDTH=411 HEIGHT=26 ALIGN=BOTTOM ALT="theorem476" SRC="img66.gif"><BR> <b>Proof</b> Equation&nbsp;<A HREF="node5.html#deltaerror" target="contents">8</A> shows that <IMG WIDTH=37 HEIGHT=19 ALIGN=MIDDLE ALT="tex2html_wrap_inline1579" SRC="img67.gif"> for all legal values of <IMG WIDTH=17 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1549" SRC="img63.gif">, given our assumptions about the learning and moving target error functions. Note that, even though lower sample complexities always lead to smaller errors, only for the smaller values of <I>v</I> do we find a correlation between increased <I>v</I> and an increased difference in error <IMG WIDTH=15 HEIGHT=9 ALIGN=BOTTOM ALT="tex2html_wrap_inline1019" SRC="img6.gif">. <P> This theorem leads us to look for metrics (e.g., <EM>price volatility</EM> in a market system) that can be used to determine how fast the system changes (i.e., the <I>v</I>) and, in turn, how much better the models of the agents with smaller sample complexity will be. For example, the designer of a 1-level agent in a market system might, after some time has passed, notice that the price volatility in the system is close to zero. He should then, using Theorem&nbsp;<A HREF="node5.html#thmvolatility" target="contents">3</A>, consider making his agent a 0-level modeler. <P> <HR> <P><ADDRESS> <I>Jose M.
Vidal <BR> jmvidal@umich.edu <BR> Thu Apr 24 15:00:31 EDT 1997</I> </ADDRESS> </BODY>