Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality | S Chen, H Sheen, T Wang, Z Yang | arXiv preprint arXiv:2402.19442, 2024 | 33 | 2024 |
Adaptive model design for Markov decision process | S Chen, D Yang, J Li, S Wang, Z Yang, Z Wang | International Conference on Machine Learning, 3679-3700, 2022 | 12 | 2022 |
Learning to incentivize information acquisition: Proper scoring rules meet principal-agent model | S Chen, J Wu, Y Wu, Z Yang | International Conference on Machine Learning, 5194-5218, 2023 | 8 | 2023 |
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention | H Sheen, S Chen, T Wang, HH Zhou | arXiv preprint arXiv:2403.08699, 2024 | 7 | 2024 |
Wasserstein flow meets replicator dynamics: A mean-field analysis of representation learning in actor-critic | Y Zhang, S Chen, Z Yang, M Jordan, Z Wang | Advances in Neural Information Processing Systems 34, 15993-16006, 2021 | 6 | 2021 |
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems | J He, S Chen, F Zhang, Z Yang | arXiv preprint arXiv:2405.19883, 2024 | 4 | 2024 |
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers | S Chen, H Sheen, T Wang, Z Yang | arXiv preprint arXiv:2409.10559, 2024 | 3 | 2024 |
Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods | X Hu, F Zhang, S Chen, Z Yang | arXiv preprint arXiv:2408.14511, 2024 | 3 | 2024 |
Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks | S Chen, M Wang, Z Yang | arXiv preprint arXiv:2307.14085, 2023 | 3 | 2023 |
A unified framework of policy learning for contextual bandit with confounding bias and missing observations | S Chen, Y Wang, Z Wang, Z Yang | arXiv preprint arXiv:2303.11187, 2023 | 3 | 2023 |
Contractual Reinforcement Learning: Pulling Arms with Invisible Hands | J Wu, S Chen, M Wang, H Wang, H Xu | arXiv preprint arXiv:2407.01458, 2024 | 2 | 2024 |