Kaifeng Lyu
Verified email at princeton.edu
Title
Cited by
Year
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
K Lyu, J Li
2020 International Conference on Learning Representations (ICLR 2020), 2020
Cited by 344; year 2020
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning
Z Li, Y Luo, K Lyu
2021 International Conference on Learning Representations (ICLR 2021), 2021
Cited by 146; year 2021
Theoretical analysis of auto rate-tuning by batch normalization
S Arora, Z Li, K Lyu
2019 International Conference on Learning Representations (ICLR 2019), 2019
Cited by 130; year 2019
Learning gradient descent: Better generalization and longer horizons
K Lv, S Jiang, J Li
34th International Conference on Machine Learning (ICML 2017) 70, 2247-2255, 2017
Cited by 120; year 2017
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
K Lyu, Z Li, R Wang, S Arora
35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021
Cited by 84; year 2021
Understanding the generalization benefit of normalization layers: Sharpness reduction
K Lyu, Z Li, S Arora
36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by 78; year 2022
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Z Li, K Lyu, S Arora
34th Conference on Neural Information Processing Systems (NeurIPS 2020), 2020
Cited by 76; year 2020
DistillSpec: Improving speculative decoding via knowledge distillation
Y Zhou, K Lyu, AS Rawat, AK Menon, A Rostamizadeh, S Kumar, JF Kagy, ...
2024 International Conference on Learning Representations (ICLR 2024), 2023
Cited by 52; year 2023
Fine-grained complexity meets IP = PSPACE
L Chen, S Goldwasser, K Lyu, GN Rothblum, A Rubinstein
30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2019), 1-20, 2019
Cited by 40; year 2019
Understanding incremental learning of gradient descent: A fine-grained analysis of matrix sensing
J Jin, Z Li, K Lyu, SS Du, JD Lee
International Conference on Machine Learning (ICML 2023), 15200-15238, 2023
Cited by 39; year 2023
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
S Malladi, K Lyu, A Panigrahi, S Arora
36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by 35; year 2022
Why (and When) does Local SGD Generalize Better than SGD?
X Gu, K Lyu, L Huang, S Arora
2023 International Conference on Learning Representations (ICLR 2023), 2023
Cited by 24; year 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
K Lyu, J Jin, Z Li, SS Du, JD Lee, W Hu
2024 International Conference on Learning Representations (ICLR 2024), 2023
Cited by 23; year 2023
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
X Qi, A Panda, K Lyu, X Ma, S Roy, A Beirami, P Mittal, P Henderson
arXiv preprint arXiv:2406.05946, 2024
Cited by 18; year 2024
Keeping LLMs aligned after fine-tuning: The crucial role of prompt templates
K Lyu, H Zhao, X Gu, D Yu, A Goyal, S Arora
38th Conference on Neural Information Processing Systems (NeurIPS 2024), 2024
Cited by 15; year 2024
RNNs are not Transformers (yet): The key bottleneck on in-context retrieval
K Wen, X Dang, K Lyu
arXiv preprint arXiv:2402.18510, 2024
Cited by 13; year 2024
Single-Source Bottleneck Path Algorithm Faster than Sorting for Sparse Graphs
R Duan, K Lyu, H Wu, Y Xie
45th International Colloquium on Automata, Languages, and Programming (ICALP …, 2018
Cited by 8; year 2018
The marginal value of momentum for small learning rate SGD
R Wang, S Malladi, T Wang, K Lyu, Z Li
2024 International Conference on Learning Representations (ICLR 2024), 2023
Cited by 7; year 2023
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound
A Gupta, N Saunshi, D Yu, K Lyu, S Arora
36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by 7; year 2022
AI-assisted generation of difficult math questions
V Shah, D Yu, K Lyu, S Park, J Yu, Y He, NR Ke, M Mozer, Y Bengio, ...
arXiv preprint arXiv:2407.21009, 2024
Cited by 5; year 2024