Resurrecting recurrent neural networks for long sequences A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De International Conference on Machine Learning, 26670-26698, 2023 | 189 | 2023 |
Learning explanations that are hard to vary G Parascandolo, A Neitz, A Orvieto, L Gresele, B Schölkopf International Conference on Learning Representations (2021), 2020 | 167 | 2020 |
A continuous-time perspective for modeling acceleration in Riemannian optimization F Alimisis, A Orvieto, G Bécigneul, A Lucchi International Conference on Artificial Intelligence and Statistics, 1297-1307, 2020 | 62 | 2020 |
Faster single-loop algorithms for minimax optimization without strong concavity J Yang, A Orvieto, A Lucchi, N He International Conference on Artificial Intelligence and Statistics, 5485-5517, 2022 | 56 | 2022 |
Momentum improves optimization on Riemannian manifolds F Alimisis, A Orvieto, G Bécigneul, A Lucchi International Conference on Artificial Intelligence and Statistics, 1351-1359, 2021 | 51* | 2021 |
Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse L Noci, S Anagnostidis, L Biggio, A Orvieto, SP Singh, A Lucchi Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022 | 46 | 2022 |
Anticorrelated noise injection for improved generalization A Orvieto, H Kersting, F Proske, F Bach, A Lucchi International Conference on Machine Learning (ICML), 2022, 2022 | 41 | 2022 |
Continuous-time models for stochastic optimization algorithms A Orvieto, A Lucchi Advances in Neural Information Processing Systems 32 (2019), 2018 | 35 | 2018 |
Achieving a better stability-plasticity trade-off via auxiliary networks in continual learning S Kim, L Noci, A Orvieto, T Hofmann Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 29 | 2023 |
Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution A Orvieto, S Lacoste-Julien, N Loizou Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022 | 27 | 2022 |
Explicit regularization in overparametrized models via noise injection A Orvieto, A Raj, H Kersting, F Bach International Conference on Artificial Intelligence and Statistics, 7265-7287, 2023 | 25 | 2023 |
The role of memory in stochastic optimization A Orvieto, J Kohler, A Lucchi Uncertainty in Artificial Intelligence, 356-366, 2020 | 25 | 2020 |
Shadowing properties of optimization algorithms A Orvieto, A Lucchi Advances in Neural Information Processing Systems 32 (2019), 2019 | 21 | 2019 |
An accelerated DFO algorithm for finite-sum convex functions Y Chen, A Orvieto, A Lucchi International Conference on Machine Learning (ICML), 2020, 2020 | 19 | 2020 |
On the effectiveness of randomized signatures as reservoir for learning rough dynamics EM Compagnoni, A Scampicchio, L Biggio, A Orvieto, T Hofmann, ... 2023 International Joint Conference on Neural Networks (IJCNN), 1-8, 2023 | 18* | 2023 |
Vanishing Curvature in Randomly Initialized Deep ReLU Networks A Orvieto, J Kohler, D Pavllo, T Hofmann, A Lucchi AISTATS, 7942-7975, 2022 | 16* | 2022 |
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues A Orvieto, S De, C Gulcehre, R Pascanu, SL Smith Forty-first International Conference on Machine Learning, 2024 | 15* | 2024 |
An SDE for Modeling SAM: Theory and Insights E Monzio Compagnoni, L Biggio, A Orvieto, FN Proske, H Kersting, ... arXiv e-prints, arXiv: 2301.08203, 2023 | 15* | 2023 |
Theoretical Foundations of Deep Selective State-Space Models N Muca Cirone, A Orvieto, B Walker, C Salvi, T Lyons arXiv e-prints, arXiv: 2402.19047, 2024 | 13* | 2024 |
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization P Zhang, A Orvieto, H Daneshmand, T Hofmann, R Smith International Conference on Artificial Intelligence and Statistics (2021), 2021 | 11 | 2021 |