Follow
Yanping Huang
Title
Cited by
Cited by
Year
Regularized evolution for image classifier architecture search
E Real, A Aggarwal, Y Huang, QV Le
Proceedings of the aaai conference on artificial intelligence 33 (01), 4780-4789, 2019
32082019
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
Journal of Machine Learning Research 25 (70), 1-53, 2024
15252024
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Y Huang, Y Cheng, A Bapna, O Firat, MX Chen, D Chen, HJ Lee, J Ngiam, ...
Advances in Neural Information Processing Systems 32, 103--112, 2019
14352019
Lamda: Language models for dialog applications
R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ...
arXiv preprint arXiv:2201.08239, 2022
11712022
Palm 2 technical report
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
7652023
Gshard: Scaling giant models with conditional computation and automatic sharding
D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat, Y Huang, M Krikun, N Shazeer, ...
International Conference on Learning Representations (ICLR), 2020
6802020
Predictive coding
Y Huang, RPN Rao
Wiley Interdisciplinary Reviews: Cognitive Science 2 (5), 580-593, 2011
6772011
Glam: Efficient scaling of language models with mixture-of-experts
N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ...
International Conference on Machine Learning, 5547-5569, 2022
442*2022
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
1982019
Alpa: Automating Inter-and Intra-Operator Parallelism for Distributed Deep Learning
L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
1762022
H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei. 2022. Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, ...
arXiv preprint arXiv:2210.11416, 2022
158*2022
Just pick a sign: Optimizing deep multitask models with gradient sign dropout
Z Chen, J Ngiam, Y Huang, T Luong, H Kretzschmar, Y Chai, D Anguelov
Advances in Neural Information Processing Systems 33, 2039-2050, 2020
1582020
Bigssl: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition
Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ...
IEEE Journal of Selected Topics in Signal Processing 16 (6), 1519-1532, 2022
1422022
Mixture-of-experts with expert choice routing
Y Zhou, T Lei, H Liu, N Du, Y Huang, V Zhao, AM Dai, QV Le, J Laudon
Advances in Neural Information Processing Systems 35, 7103-7114, 2022
1152022
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
852021
Beyond distillation: Task-level mixture-of-experts for efficient inference
S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, MT Luong, O Firat
arXiv preprint arXiv:2110.03742, 2021
732021
Designing effective sparse expert models
B Zoph, I Bello, S Kumar, N Du, Y Huang, J Dean, N Shazeer, W Fedus
arXiv preprint arXiv:2202.08906 2 (3), 17, 2022
702022
Building machine translation systems for the next thousand languages
A Bapna, I Caswell, J Kreutzer, O Firat, D van Esch, A Siddhant, M Niu, ...
arXiv preprint arXiv:2205.03983, 2022
552022
Sunipa Dev
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vladimir Feinberg, Fangxiaoyu …, 2023
512023
Neurons as Monte Carlo samplers: Bayesian Inference and Learning in Spiking Networks
Y Huang, RPN Rao
Advances in neural information processing systems 27, 1943-1951, 2014
502014
The system can't perform the operation now. Try again later.
Articles 1–20