I am an assistant professor in the Departments of Industrial Engineering & Management Sciences and Computer Science (by courtesy) at Northwestern University (since 2018). I am affiliated with the Centers for Deep Learning and Optimization & Statistical Learning.
The long-term goal of my research is to develop a new generation of data-driven decision-making methods, theory, and systems that tailor artificial intelligence to pressing societal challenges. To this end, my research aims at:
- making deep reinforcement learning more efficient, both computationally and statistically, in a principled manner to enable its applications in critical domains;
- scaling deep reinforcement learning to design and optimize societal-scale multi-agent systems, especially those involving cooperation and/or competition among humans and/or robots.
Selected Recent Papers
Maximize to Explore: A Single Objective Fusing Estimation, Planning, and Exploration
Advances in Neural Information Processing Systems (NeurIPS), 2023 (spotlight) [arXiv]
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
International Conference on Learning Representations (ICLR), 2023 [arXiv]
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
International Conference on Machine Learning (ICML), 2022 [arXiv]
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
(alphabetical) SIAM Journal on Optimization (SIOPT), 2022 [arXiv]
Is Pessimism Provably Efficient for Offline RL?
International Conference on Machine Learning (ICML), 2021 [arXiv]
Principled Exploration via Optimistic Bootstrapping and Backward Induction
International Conference on Machine Learning (ICML), 2021 [arXiv] [GitHub]
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Advances in Neural Information Processing Systems (NeurIPS), 2021 [arXiv]
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Advances in Neural Information Processing Systems (NeurIPS), 2020 (oral) [arXiv]
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
Advances in Neural Information Processing Systems (NeurIPS), 2020 (spotlight) [arXiv]
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
Advances in Neural Information Processing Systems (NeurIPS), 2020 [arXiv] [Demo]
Provably Efficient Exploration in Policy Optimization
International Conference on Machine Learning (ICML), 2020 [arXiv]
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Annual Conference on Learning Theory (COLT), 2020; Mathematics of Operations Research (MOR), 2022 [arXiv]
Provably Efficient Reinforcement Learning with Linear Function Approximation
Annual Conference on Learning Theory (COLT), 2020; Mathematics of Operations Research (MOR), 2022 [arXiv]
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
International Conference on Learning Representations (ICLR), 2020 [arXiv]
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Advances in Neural Information Processing Systems (NeurIPS), 2019 [arXiv]
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Advances in Neural Information Processing Systems (NeurIPS), 2019; Mathematics of Operations Research (MOR), 2022 [arXiv]
A Theoretical Analysis of Deep Q-Learning
(alphabetical) Submitted, 2020 [arXiv]