
Generating and Adapting to Diverse Ad-Hoc Partners in Hanabi

Rodrigo Canaan, Xianbo Gao, Julian Togelius, Andy Nealen, Stefan Menzel, "Generating and Adapting to Diverse Ad-Hoc Partners in Hanabi", IEEE Transactions on Games, 2022.


Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect. In this paper, we focus on ad-hoc settings with no previous coordination between partners. We introduce a “Bayesian Meta-Agent” that maintains a belief distribution over hypotheses of partner policies. The policies that serve as initial hypotheses are generated using MAP-Elites to ensure behavioral diversity. We evaluate an “Adaptive” version of the agent, which selects a response policy based on the updated belief distribution, and a “Generalist” version, which selects a response based on the uniform prior. In short episodes of 10 games with a consistent partner, the “Adaptive” version outperforms the “Generalist” when the training and evaluation populations are the same. This is a first step towards an agent that can model its partner and adapt within a time frame that is compatible with human interaction.
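
The difference between the “Adaptive” and “Generalist” versions can be illustrated with a minimal sketch of a Bayesian belief update over partner hypotheses. This is not the authors' implementation: the function names `update_belief` and `select_response`, the three hypotheses, the likelihood values, and the payoff matrix are all assumptions made up for illustration, and the numbers are not results from the paper.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """One Bayes step: posterior is proportional to prior * P(observation | hypothesis)."""
    posterior = np.asarray(belief, dtype=float) * np.asarray(likelihoods, dtype=float)
    total = posterior.sum()
    if total == 0.0:
        # No hypothesis explains the observation; fall back to the prior.
        return np.asarray(belief, dtype=float)
    return posterior / total

def select_response(belief, payoff):
    """Return the index of the response policy with the highest expected score.

    payoff[r, h] is the (hypothetical) expected score of response r
    when paired with a partner that follows hypothesis h.
    """
    expected = np.asarray(payoff, dtype=float) @ np.asarray(belief, dtype=float)
    return int(np.argmax(expected))

# Three hypothetical partner policies with a uniform prior.
prior = np.full(3, 1.0 / 3.0)

# Illustrative payoffs: responses 0-2 each specialize in one partner,
# response 3 is a compromise that does moderately well with all of them.
payoff = np.array([
    [18.0, 10.0, 10.0],
    [10.0, 18.0, 10.0],
    [10.0, 10.0, 18.0],
    [15.0, 15.0, 15.0],
])

# "Generalist": choose a response under the uniform prior.
generalist_choice = select_response(prior, payoff)

# "Adaptive": update the belief from observed partner actions first.
# Each entry is the assumed probability of one observed action under each hypothesis.
belief = prior
for likelihoods in ([0.7, 0.2, 0.1], [0.7, 0.2, 0.1]):
    belief = update_belief(belief, likelihoods)
adaptive_choice = select_response(belief, payoff)
```

In this toy setup the uniform prior favors the compromise response, while two observations that fit the first hypothesis shift enough probability mass for the specialized response to win. How the paper computes likelihoods and response payoffs is not described in the abstract, so both are placeholders here.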
