Principal Investigator: Vaneet Aggarwal
Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Amrit Singh Bedi, and Dinesh Manocha, "Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic," in Proc. ICML, Jul 2024.
Qinbo Bai, Washim Uddin Mondal, and Vaneet Aggarwal, "Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes," in Proc. AAAI, Feb 2024.
Qinbo Bai, Washim Uddin Mondal, and Vaneet Aggarwal, "Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm," in Proc. NeurIPS, Dec 2024.
Swetha Ganesh, Washim Uddin Mondal, and Vaneet Aggarwal, "Variance-Reduced Policy Gradient Approaches for Infinite Horizon Average Reward Markov Decision Processes," arXiv, Apr 2024.
Swetha Ganesh and Vaneet Aggarwal, "An Accelerated Multi-level Monte Carlo Approach for Average Reward Reinforcement Learning with General Policy Parametrization," arXiv, Jul 2024.