I am an associate professor in the Departments of Industrial Engineering & Management Sciences and Computer Science at Northwestern University. I am also affiliated with the Center for Deep Learning and the Center for Optimization & Statistical Learning.
The long-term goal of my research is to develop a new generation of data-driven decision-making methods, theory, and systems, which tailor artificial intelligence towards addressing societal challenges. To this end, my research aims at:
- making autonomous learning agents more efficient, both computationally and statistically, in a principled manner to enable their critical applications;
- designing and optimizing societal-scale multi-agent systems, especially those involving cooperation and/or competition among humans and/or robots.
Selected Recent Papers [Overview] [Conference] [Journal] [Citation]
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
International Conference on Machine Learning (ICML), 2024 [Arxiv] [Demo] [GitHub]
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Advances in Neural Information Processing Systems (NeurIPS), 2024 [Arxiv]
Maximize to Explore: A Single Objective Fusing Estimation, Planning, and Exploration
Advances in Neural Information Processing Systems (NeurIPS), 2023 (spotlight) [Arxiv]
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
International Conference on Learning Representations (ICLR), 2023 [Arxiv]
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
International Conference on Machine Learning (ICML), 2022 [Arxiv]
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
(alphabetical) SIAM Journal on Optimization (SIOPT), 2022 [Arxiv]
Is Pessimism Provably Efficient for Offline RL?
International Conference on Machine Learning (ICML), 2021 [Arxiv]
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Advances in Neural Information Processing Systems (NeurIPS), 2021 [Arxiv]
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Advances in Neural Information Processing Systems (NeurIPS), 2020 (oral) [Arxiv]
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
Advances in Neural Information Processing Systems (NeurIPS), 2020 (spotlight) [Arxiv]
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
Advances in Neural Information Processing Systems (NeurIPS), 2020 [Arxiv] [Demo]
Provably Efficient Exploration in Policy Optimization
International Conference on Machine Learning (ICML), 2020 [Arxiv]
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Annual Conference on Learning Theory (COLT), 2020; Mathematics of Operations Research (MOR), 2022 [Arxiv]
Provably Efficient Reinforcement Learning with Linear Function Approximation
Annual Conference on Learning Theory (COLT), 2020; Mathematics of Operations Research (MOR), 2022 [Arxiv]
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
International Conference on Learning Representations (ICLR), 2020 [Arxiv]
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Advances in Neural Information Processing Systems (NeurIPS), 2019 [Arxiv]
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Advances in Neural Information Processing Systems (NeurIPS), 2019; Mathematics of Operations Research (MOR), 2022 [Arxiv]
A Theoretical Analysis of Deep Q-Learning
(alphabetical) Submitted, 2020 [Arxiv]