Zhiyuan Li
Verified email at ttic.edu - Homepage
Title · Cited by · Year
Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
S Arora, S Du, W Hu, Z Li, R Wang
International Conference on Machine Learning, 322-332, 2019
1089 · 2019
On exact computation with an infinitely wide neural net
S Arora, SS Du, W Hu, Z Li, RR Salakhutdinov, R Wang
Advances in neural information processing systems 32, 2019
1030 · 2019
Towards understanding the role of over-parametrization in generalization of neural networks
B Neyshabur, Z Li, S Bhojanapalli, Y LeCun, N Srebro
arXiv preprint arXiv:1805.12076, 2018
623 · 2018
An exponential learning rate schedule for deep learning
Z Li, S Arora
arXiv preprint arXiv:1910.07454, 2019
231 · 2019
Harnessing the power of infinitely wide deep nets on small-data tasks
S Arora, SS Du, Z Li, R Salakhutdinov, R Wang, D Yu
197 · 2019
Towards resolving the implicit bias of gradient descent for matrix factorization: Greedy low-rank learning
Z Li, Y Luo, K Lyu
arXiv preprint arXiv:2012.09839, 2020
147 · 2020
Enhanced convolutional neural tangent kernels
Z Li, R Wang, D Yu, SS Du, W Hu, R Salakhutdinov, S Arora
arXiv preprint arXiv:1911.00809, 2019
138 · 2019
Theoretical analysis of auto rate-tuning by batch normalization
S Arora, Z Li, K Lyu
arXiv preprint arXiv:1812.03981, 2018
131 · 2018
Learning in games: Robustness of fast convergence
DJ Foster, Z Li, T Lykouris, K Sridharan, E Tardos
Advances in Neural Information Processing Systems 29, 2016
128 · 2016
Understanding gradient descent on the edge of stability in deep learning
S Arora, Z Li, A Panigrahi
International Conference on Machine Learning, 948-1024, 2022
112 · 2022
Sophia: A scalable stochastic second-order optimizer for language model pre-training
H Liu, Z Li, D Hall, P Liang, T Ma
arXiv preprint arXiv:2305.14342, 2023
109 · 2023
What Happens after SGD Reaches Zero Loss?--A Mathematical Framework
Z Li, T Wang, S Arora
arXiv preprint arXiv:2110.06914, 2021
109 · 2021
Simple and effective regularization methods for training on noisily labeled data with generalization guarantee
W Hu, Z Li, D Yu
International Conference on Learning Representations (ICLR 2020), 2019
99* · 2019
Explaining landscape connectivity of low-cost solutions for multilayer nets
R Kuditipudi, X Wang, H Lee, Y Zhang, Z Li, W Hu, R Ge, S Arora
Advances in neural information processing systems 32, 2019
96 · 2019
On the validity of modeling SGD with stochastic differential equations (SDEs)
Z Li, S Malladi, S Arora
Advances in Neural Information Processing Systems 34, 12712-12725, 2021
91 · 2021
Gradient descent on two-layer nets: Margin maximization and simplicity bias
K Lyu, Z Li, R Wang, S Arora
Advances in Neural Information Processing Systems 34, 12978-12991, 2021
84 · 2021
Understanding the generalization benefit of normalization layers: Sharpness reduction
K Lyu, Z Li, S Arora
Advances in Neural Information Processing Systems 35, 34689-34708, 2022
79 · 2022
Reconciling modern deep learning with traditional optimization analyses: The intrinsic learning rate
Z Li, K Lyu, S Arora
Advances in Neural Information Processing Systems 33, 14544-14555, 2020
76 · 2020
How Does Sharpness-Aware Minimization Minimize Sharpness?
K Wen, T Ma, Z Li
The Eleventh International Conference on Learning Representations, 2023
73 · 2023
Why are convolutional nets more sample-efficient than fully-connected nets?
Z Li, Y Zhang, S Arora
arXiv preprint arXiv:2010.08515, 2020
61 · 2020
Articles 1–20