Decentralized Nash Equilibria Learning for Online Game With Bandit Feedback

Abstract

This article studies distributed online bandit learning of generalized Nash equilibria for online games, where the cost functions of all players and coupled constraints are time-varying. The function values, rather than full information about cost and local constraint functions, are revealed to local players with time delays. The goal of each player is to selfishly minimize its own cost function with no future information, subject to a strategy set constraint and time-varying coupled inequality constraints. To this end, a distributed online algorithm based on mirror descent and one-point delayed bandit feedback is designed for seeking generalized Nash equilibria in the online game. It is shown that the devised online algorithm achieves sublinear expected regrets and accumulated constraint violation if the path variation of the generalized Nash equilibrium sequence is sublinear. Simulations are presented to illustrate the efficiency of the theoretical result.
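
The abstract names two building blocks, mirror descent and one-point bandit feedback. The following is a minimal single-player sketch of how they can fit together, not the article's algorithm: it omits the time delays, the coupled inequality constraints, and the communication between players. It assumes a Euclidean mirror map (so the mirror step reduces to a projected gradient step) and a ball-shaped strategy set. All names here (`one_point_gradient`, `bandit_mirror_descent_step`, `step`, `delta`, `radius`) are hypothetical.

```python
import math
import random


def unit_sphere_sample(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def one_point_gradient(cost_value, direction, delta):
    """One-point bandit estimate: (dim / delta) * f(x + delta * u) * u."""
    dim = len(direction)
    scale = dim * cost_value / delta
    return [scale * u for u in direction]


def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius (assumed strategy set)."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [radius * c / norm for c in x]


def bandit_mirror_descent_step(x, cost_fn, step, delta, radius, rng):
    """One update that uses a single function value instead of a gradient.

    Assumption: Euclidean mirror map, so the mirror step is a projected
    gradient step. The iterate is kept in a ball shrunk by delta so that
    the perturbed query point x + delta * u stays inside the strategy set.
    """
    u = unit_sphere_sample(len(x), rng)
    query = [xi + delta * ui for xi, ui in zip(x, u)]
    g = one_point_gradient(cost_fn(query), u, delta)
    y = [xi - step * gi for xi, gi in zip(x, g)]
    return project_ball(y, radius - delta)
```

For example, calling `bandit_mirror_descent_step([0.5, 0.5], cost_fn, 0.01, 0.1, 1.0, random.Random(0))` with any cost function `cost_fn` that takes a list of floats returns the next iterate, which lies inside the shrunk ball of radius 0.9.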

Type: Article
Published: 2024
Language: English
Featured Keywords

Distributed online learning
generalized Nash equilibrium
mirror descent
one-point delayed bandit feedback
online game