Bar Experiment
- We define G(S) to be the sum over all time of the
utilities that all the agents received for their actions. That
is,
- $G(S) = \sum_t \sum_{k=1}^{7} l_k(x_k(S,t))$
where $x_k(S,t)$ is the number of agents that attended
on night $k$ of week $t$ and $l_k(y) = a_k \, y \, \exp(-y/c)$
is the utility derived from that attendance. For $a_k > 0$,
$l_k$ is maximized when $y = c$.
- Two different choices for $a_k$ were explored: one
where attendance on all nights is equally weighted, and one where
we are only concerned with attendance on one specific night.
- Three different reward functions were tested (where
$d_w$ is the night selected by agent $w$).
- Uniform Division reward: $UD = l_{d_w}(x_{d_w}(S,t)) / x_{d_w}(S,t)$
- Global reward: $GR = \sum_{k=1}^{7} l_k(x_k(S,t))$
- Wonderful Life reward: $WL = l_{d_w}(x_{d_w}(S,t)) - l_{d_w}(x_{d_w}(CL_w(S),t))$
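The utility and the three rewards above can be sketched in a few lines of Python. This is only an illustration, not the code used in the experiment: the function names, the value c = 6, and the unit weights are made up for the example. It also assumes that clamping agent w, written CL_w(S), simply removes that agent from the night it chose, so attendance there drops by one.

```python
import math

def utility(y, a=1.0, c=6.0):
    """l_k(y) = a_k * y * exp(-y / c). The value c = 6 is an arbitrary choice."""
    return a * y * math.exp(-y / c)

def global_reward(attendance, weights, c=6.0):
    """GR: sum of l_k(x_k) over the seven nights of one week."""
    return sum(utility(x, a, c) for x, a in zip(attendance, weights))

def uniform_division(attendance, weights, night, c=6.0):
    """UD: utility of the chosen night split evenly among its attendees."""
    x = attendance[night]  # at least 1, since the agent itself attended
    return utility(x, weights[night], c) / x

def wonderful_life(attendance, weights, night, c=6.0):
    """WL: utility of the chosen night minus its utility without this agent.

    Assumption: clamping the agent lowers that night's attendance by one.
    """
    x = attendance[night]
    a = weights[night]
    return utility(x, a, c) - utility(x - 1, a, c)

# Hypothetical week: attendance per night, all nights weighted equally.
attendance = [6, 6, 6, 6, 6, 6, 20]
weights = [1.0] * 7
```

Under these assumptions, WL is positive for an agent on a night with 6 attendees and negative for an agent on the overcrowded night with 20, so it tells each agent whether its own attendance helped or hurt.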
- Each agent is in its own subworld.
- The microlearning algorithm used is a basic reinforcement
learning algorithm with Boltzmann stochastic decisions.
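A minimal sketch of Boltzmann (softmax) action selection for one agent follows. The slide does not give the learner's details, so the running-average update, the learning rate, and all names here are assumptions made for illustration.

```python
import math
import random

def boltzmann_probs(values, temperature):
    """Turn estimated rewards into probabilities proportional to exp(v / T)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def choose_night(values, temperature, rng=random):
    """Sample a night index according to the Boltzmann distribution."""
    probs = boltzmann_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]

def update_estimate(values, night, reward, rate=0.1):
    """Assumed update rule: move the chosen night's estimate toward the reward."""
    values[night] += rate * (reward - values[night])

# Hypothetical usage: one agent, seven nights, all estimates start at zero.
estimates = [0.0] * 7
night = choose_night(estimates, temperature=1.0)
update_estimate(estimates, night, reward=1.0)
```

A high temperature makes the choice nearly uniform, while a low temperature makes the agent pick the night with the highest estimate almost every time.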
José M. Vidal