Each buyer can now keep a history of the qualities it ascribes to the goods returned by each seller. She can, in fact, remember the last N qualities returned by some seller s for some good g, and use them to define a probability density function over the qualities x returned by s (i.e., a function that gives the probability that s returns an instance of good g with quality x). She can then compute the expected value of this function for each seller and buy from the seller for which it is highest.
The buyer does not need to model other buyers, since their actions do not affect the value she gets.
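As a rough illustration of the history-based buyer described above, the following Python sketch keeps the last N qualities per seller and good, builds an empirical distribution from them, and picks the seller with the highest expected quality. It rests on several assumptions that are not in the text: qualities are discrete numbers, the probability of a quality is estimated by its relative frequency in the window, and the buyer's value is taken to be the quality itself (price is ignored). The names `HistoryBuyer`, `observe`, `distribution`, `expected_quality`, and `best_seller` are hypothetical.

```python
from collections import Counter, defaultdict, deque


class HistoryBuyer:
    """Hypothetical buyer that remembers the last N qualities per (seller, good)."""

    def __init__(self, window):
        self.window = window  # N, the number of remembered qualities
        # Each (seller, good) pair maps to a bounded queue; old entries fall off.
        self.history = defaultdict(lambda: deque(maxlen=self.window))

    def observe(self, seller, good, quality):
        """Record the quality the buyer ascribes to a good returned by a seller."""
        self.history[(seller, good)].append(quality)

    def distribution(self, seller, good):
        """Empirical probability of each quality x within the remembered window."""
        observed = self.history.get((seller, good))
        if not observed:
            return {}
        counts = Counter(observed)
        return {x: c / len(observed) for x, c in counts.items()}

    def expected_quality(self, seller, good):
        """Expected value of the empirical distribution, or None with no history."""
        dist = self.distribution(seller, good)
        if not dist:
            return None
        return sum(x * p for x, p in dist.items())

    def best_seller(self, good, sellers):
        """Seller with the highest expected quality among those with a history."""
        scored = [(self.expected_quality(s, good), s) for s in sellers]
        scored = [(e, s) for e, s in scored if e is not None]
        if not scored:
            return None
        return max(scored, key=lambda pair: pair[0])[1]


# Usage: only sellers are tracked; other buyers are never modeled.
buyer = HistoryBuyer(window=3)
for q in [1, 2, 2, 4]:          # the oldest quality (1) falls out of the window
    buyer.observe("s1", "g", q)
buyer.observe("s2", "g", 3)
print(buyer.best_seller("g", ["s1", "s2"]))  # prints "s2"
```

Using a bounded queue means the estimate tracks a seller's recent behavior, so a seller whose quality changes over time is re-evaluated after at most N purchases. A seller with no history is simply skipped here; how the buyer should explore such sellers is a separate design choice that this sketch does not address.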