Siyu Chen
Ph.D. of S&DS, Yale University
Verified email at yale.edu
Title; Cited by; Year
Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality
S Chen, H Sheen, T Wang, Z Yang
arXiv preprint arXiv:2402.19442, 2024
Cited by: 33; Year: 2024
Adaptive model design for Markov decision process
S Chen, D Yang, J Li, S Wang, Z Yang, Z Wang
International Conference on Machine Learning, 3679-3700, 2022
Cited by: 12; Year: 2022
Learning to incentivize information acquisition: Proper scoring rules meet principal-agent model
S Chen, J Wu, Y Wu, Z Yang
International Conference on Machine Learning, 5194-5218, 2023
Cited by: 8; Year: 2023
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
H Sheen, S Chen, T Wang, HH Zhou
arXiv preprint arXiv:2403.08699, 2024
Cited by: 7; Year: 2024
Wasserstein flow meets replicator dynamics: A mean-field analysis of representation learning in actor-critic
Y Zhang, S Chen, Z Yang, M Jordan, Z Wang
Advances in Neural Information Processing Systems 34, 15993-16006, 2021
Cited by: 6; Year: 2021
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
J He, S Chen, F Zhang, Z Yang
arXiv preprint arXiv:2405.19883, 2024
Cited by: 4; Year: 2024
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
S Chen, H Sheen, T Wang, Z Yang
arXiv preprint arXiv:2409.10559, 2024
Cited by: 3; Year: 2024
Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods
X Hu, F Zhang, S Chen, Z Yang
arXiv preprint arXiv:2408.14511, 2024
Cited by: 3; Year: 2024
Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks
S Chen, M Wang, Z Yang
arXiv preprint arXiv:2307.14085, 2023
Cited by: 3; Year: 2023
A unified framework of policy learning for contextual bandit with confounding bias and missing observations
S Chen, Y Wang, Z Wang, Z Yang
arXiv preprint arXiv:2303.11187, 2023
Cited by: 3; Year: 2023
Contractual Reinforcement Learning: Pulling Arms with Invisible Hands
J Wu, S Chen, M Wang, H Wang, H Xu
arXiv preprint arXiv:2407.01458, 2024
Cited by: 2; Year: 2024
Articles 1–11