Rationality of Learning Algorithms in Repeated Games
Principal Investigator:
Vijay Gupta
Shivam Bajaj, Pranoy Das, Yevgeniy Vorobeychik, Vijay Gupta
Abstract
Many learning algorithms are known to converge to an equilibrium for specific classes of games if the same learning algorithm is adopted by all agents. However, in applications in which AI and humans work together, a natural question is whether a (human) agent has an incentive to unilaterally deviate to an alternative learning algorithm. We capture such incentives through an algorithm's rationality ratio: the ratio of the highest payoff an agent can obtain by unilaterally deviating from a learning algorithm to the payoff it obtains by following it. We define a learning algorithm to be c-rational if its rationality ratio is at most c irrespective of the game. We show that popular learning algorithms such as fictitious play and regret-matching are not c-rational for any constant c. We then propose a framework that builds upon any existing learning algorithm and establish that the resulting algorithm (i) is c-rational for a given c and (ii) ensures, with high probability, that the strategies of the agents converge to an equilibrium.
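To make the rationality ratio concrete, the following is a minimal simulation sketch, not the paper's construction: two fictitious-play agents play a repeated 2x2 bimatrix game, and we compare the row player's average payoff from following fictitious play against one candidate unilateral deviation (committing to a fixed action). The game's payoff matrices are illustrative assumptions, and since we search over only a single deviation, the printed ratio is merely a lower bound on the true rationality ratio.

```python
import numpy as np

def simulate(A, B, row_policy, T=1000):
    """Play T rounds of the bimatrix game (A: row payoffs, B: column payoffs).
    The column player runs fictitious play; the row player follows row_policy,
    a function of the row player's empirical model of the column player.
    Returns the row player's average per-round payoff."""
    m, n = A.shape
    row_counts = np.ones(m)  # column player's empirical counts of row actions
    col_counts = np.ones(n)  # row player's empirical counts of column actions
    total = 0.0
    for _ in range(T):
        i = row_policy(col_counts)
        j = int(np.argmax(row_counts @ B))  # column's best response to empirical row play
        total += A[i, j]
        row_counts[i] += 1
        col_counts[j] += 1
    return total / T

def fp_row(col_counts, A):
    """Fictitious-play action for the row player: best response to the
    empirical distribution of the column player's past actions."""
    return int(np.argmax(A @ (col_counts / col_counts.sum())))

# Illustrative 2x2 game (payoffs assumed for this sketch, not from the paper):
# row action 0 strictly dominates, so fictitious play locks onto (0, 0),
# but committing to action 1 steers the learning opponent toward column 1.
A = np.array([[2.0, 4.0], [1.0, 3.0]])  # row player's payoffs
B = np.array([[1.0, 0.0], [0.0, 1.0]])  # column player wants to match the row

on_path = simulate(A, B, lambda c: fp_row(c, A))  # both follow fictitious play
deviation = simulate(A, B, lambda c: 1)           # row commits to action 1
print(f"on-path payoff {on_path:.2f}, deviation payoff {deviation:.2f}, "
      f"empirical ratio >= {deviation / on_path:.2f}")
```

In this assumed game the ratio comes out to roughly 1.5: the deviation exploits the opponent's learning dynamics even though the committed action is strictly dominated in the one-shot game, which is the kind of unilateral incentive the rationality ratio is designed to measure.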