El Farol Bar Problem

Agents try to decide which night of the week they should go to El Farol Bar.
Each agent makes an independent decision at the beginning of the week.
The agents have the best time if there are c agents in the bar that night.
Let's say the agents use reinforcement learning to learn which night is best. If we, as system designers, want to maximize the sum of all agent utilities, what reward function should we use?
- Should the agent get a reward based on how many agents were there on the night it attended?
- On the nights it did not attend?
- Something else?
COllective INtelligence (COIN) tries to answer this question for the general case.
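
To make the candidate reward functions above concrete, here is a minimal Python sketch. It is not taken from these notes. It assumes a night with y attendees produces a total value of y * exp(-y / c), which peaks at y = c, and that the system-level utility is the sum of this value over all nights. The function names (night_value, world_utility, local_reward, global_reward, difference_reward) and the 7-night week are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def night_value(attendance, c):
    """Assumed total value of one night; maximal when attendance == c."""
    return attendance * math.exp(-attendance / c)

def world_utility(choices, c, nights=7):
    """Sum of night values over the week. choices[i] is agent i's night."""
    counts = Counter(choices)  # nights nobody picked count as 0
    return sum(night_value(counts[n], c) for n in range(nights))

def local_reward(agent, choices, c):
    """Reward based only on attendance on the night the agent attended."""
    counts = Counter(choices)
    return night_value(counts[choices[agent]], c)

def global_reward(agent, choices, c, nights=7):
    """Every agent receives the full system utility, all nights included."""
    return world_utility(choices, c, nights)

def difference_reward(agent, choices, c, nights=7):
    """'Something else': system utility minus the utility obtained
    if this agent had not attended at all."""
    others = choices[:agent] + choices[agent + 1:]
    return world_utility(choices, c, nights) - world_utility(others, c, nights)

if __name__ == "__main__":
    choices = [0, 0, 0, 1]  # three agents on night 0, one on night 1
    c = 2
    for name, fn in [("local", local_reward),
                     ("global", global_reward),
                     ("difference", difference_reward)]:
        print(name, [round(fn(i, choices, c), 3) for i in range(len(choices))])
```

Under these assumptions, the local reward ignores the other nights, the global reward is identical for every agent and so says little about any one agent's contribution, and the difference reward isolates the effect of a single agent's choice on the system utility.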