Follow
Wei-Ning Hsu
Wei-Ning Hsu
Facebook AI Research (FAIR)
Verified email at csail.mit.edu - Homepage
Title
Cited by
Cited by
Year
Hubert: Self-supervised speech representation learning by masked prediction of hidden units
WN Hsu, B Bolte, YHH Tsai, K Lakhotia, R Salakhutdinov, A Mohamed
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3451-3460, 2021
17942021
Data2vec: A general framework for self-supervised learning in speech, vision and language
A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli
International Conference on Machine Learning, 1298-1312, 2022
6072022
An unsupervised autoregressive model for speech representation learning
YA Chung, WN Hsu, H Tang, J Glass
INTERSPEECH, 2019
4242019
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
Thirty-first Conference on Neural Information Processing Systems (NeurIPS), 2017
3892017
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
Seventh International Conference on Learning Representations (ICLR), 2019
283*2019
Unsupervised speech recognition
A Baevski, WN Hsu, A Conneau, M Auli
Advances in Neural Information Processing Systems 34, 27826-27839, 2021
2572021
On generative spoken language modeling from raw audio
K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ...
Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021
2312021
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, ...
INTERSPEECH, 2021
2082021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ...
INTERSPEECH, 2021
2062021
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
1972019
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
1842022
Active learning by learning
WN Hsu, HT Lin
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
1832015
Learning Latent Representations for Speech Generation and Transformation
WN Hsu, Y Zhang, J Glass
INTERSPEECH, 1273-1277, 2017
1752017
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation
WN Hsu, Y Zhang, J Glass
2017 IEEE automatic speech recognition and understanding workshop (ASRU), 16-23, 2017
1572017
Semi-supervised training for improving data efficiency in end-to-end speech synthesis
YA Chung, Y Wang, WN Hsu, Y Zhang, RJ Skerry-Ryan
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
1362019
Disentangling correlated speaker and noise for speech synthesis via data augmentation and adversarial factorization
WN Hsu, Y Zhang, RJ Weiss, YA Chung, Y Wang, Y Wu, J Glass
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
1212019
Direct speech-to-speech translation with discrete units
A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ...
arXiv preprint arXiv:2107.05604, 2021
1082021
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
arXiv preprint arXiv:2305.13516, 2023
932023
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
D Harwath, WN Hsu, J Glass
Eighth International Conference on Learning Representations (ICLR), 2020
922020
Textless speech-to-speech translation on real data
A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ...
arXiv preprint arXiv:2112.08352, 2021
892021
The system can't perform the operation now. Try again later.
Articles 1–20