Submodular Bandits
Principal Investigator: Vaneet Aggarwal
We investigate the problem of stochastic, combinatorial multi-armed bandits where the learner only has access to bandit feedback and the reward function can be non-linear. We provide a general framework for adapting discrete offline approximation algorithms into sub-linear regret methods that only require bandit feedback. The framework only requires the offline algorithms to be robust to small errors in function evaluation. The adaptation procedure does not even require explicit knowledge of the offline approximation algorithm -- the offline algorithm can be used as black box subroutine. Such approaches are useful in wide variety of setups including social influence maximization, revenue management, market design, product ranking optimization in online platforms, and reserve price optimization in auctions. Further, many of these problems have continuous combinatorial actions, which is also being explored.Â
Â
Representative publications include:
- Mohammad Pedramfar, Christopher John Quinn, and Vaneet Aggarwal, "A Unified Approach for Maximizing Continuous DR-submodular Functions," May 2023
- Mohammad Pedramfar and Vaneet Aggarwal, "Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback," Mar 2023.
- Guanyu Nie, Yididiya Y Nadew, Yanhui Zhu, Vaneet Aggarwal, and Christopher John Quinn, "A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback,"in Proc. ICML, Jul 2023. Â
- Fares Fourati, Vaneet Aggarwal, Christopher Quinn, and Mohamed-Slim Alouini, "Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback," in Proc. AISTATS, Apr 2023.
- Guanyu Nie, Mridul Agarwal, Abhishek Kumar Umrawal, Vaneet Aggarwal, Christopher John Quinn, "An Explore-then-Commit Algorithm for Submodular Maximization Under Full-bandit Feedback," in Proc. UAI, Aug 2022.
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "Stochastic Top K-Subset Bandits with Linear Space and Non-Linear Feedback with Applications to Social Influence Maximization," ACM/IMS Transactions on Data Science, vol. 2, issue 4, Article 38, Nov 2021.
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "DART: aDaptive Accept RejecT for non-linear top-K subset identification," in Proc. AAAI, Feb 2021 (21¬ceptance rate, 1692/7911).
- Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, and Abhishek Umrawal, "Stochastic Combinatorial Bandits with Linear Space and Non-Linear Feedback," in Proc. ALT, Mar 2021 (PMLR 132:306-339, 2021.) (29.3¬ceptance rate, 46/157).