Ziyang Ma
Title
Cited by
Year
MT4SSL: Boosting self-supervised speech representation learning by integrating multiple targets
Z Ma, Z Zheng, C Tang, Y Wang, X Chen
Proc. Interspeech 2023, 2022
Cited by: 18, Year: 2022
LauraGPT: Listen, attend, understand, and regenerate audio with GPT
Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, W Wang, S Zheng, ...
arXiv preprint arXiv:2310.04673, 2023
Cited by: 11, Year: 2023
Hierarchical deep residual reasoning for temporal moment localization
Z Ma, X Han, X Song, Y Cui, L Nie
Proceedings of the 3rd ACM International Conference on Multimedia in Asia, 1-7, 2021
Cited by: 10, Year: 2021
Leveraging speech PTM, text LLM, and emotional TTS for speech emotion recognition
Z Ma, W Wu, Z Zheng, Y Guo, Q Chen, S Zhang, X Chen
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 4, Year: 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering
Y Song, Z Chen, X Wang, Z Ma, X Chen
arXiv preprint arXiv:2401.07333, 2024
Cited by: 4, Year: 2024
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Z Ma, Z Zheng, G Yang, Y Wang, C Zhang, X Chen
Proc. Interspeech 2023, 2023
Cited by: 4, Year: 2023
TESSP: text-enhanced self-supervised speech pre-training
Z Yao, S Ren, S Chen, Z Ma, P Guo, L Xie
arXiv preprint arXiv:2211.13443, 2022
Cited by: 4, Year: 2022
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Y Guo, C Du, Z Ma, X Chen, K Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 3*, Year: 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
R Yuan, H Lin, Y Wang, Z Tian, S Wu, T Shen, G Zhang, Y Wu, C Liu, ...
arXiv preprint arXiv:2402.16153, 2024
Cited by: 3, Year: 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
W Chen, Y Liang, Z Ma, Z Zheng, X Chen
arXiv preprint arXiv:2401.03497, 2024
Cited by: 3, Year: 2024
Fast-HuBERT: an Efficient Training Framework for Self-Supervised Speech Representation Learning
G Yang, Z Ma, Z Zheng, Y Song, Z Niu, X Chen
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023
Cited by: 3, Year: 2023
Front-end adapter: Adapting front-end input of speech based self-supervised learning for speech recognition
X Chen, Z Ma, C Tang, Y Wang, Z Zheng
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Improving few-shot learning for talking face system with TTS data augmentation
Q Chen, Z Ma, T Liu, X Tan, Q Lu, K Yu, X Chen
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Towards universal speech discrete tokens: A case study for ASR and TTS
Y Yang, F Shen, C Du, Z Ma, K Yu, D Povey, X Chen
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 2, Year: 2024
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Z Ma, Z Zheng, J Ye, J Li, Z Gao, S Zhang, X Chen
arXiv preprint arXiv:2312.15185, 2023
Cited by: 2, Year: 2023
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Z Zheng, Z Ma, Y Wang, X Chen
Proc. Interspeech 2023, 2023
Cited by: 2, Year: 2023
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation
Z Liang, Z Song, Z Ma, C Du, K Yu, X Chen
Proc. Interspeech 2023, 2023
Cited by: 2, Year: 2023
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition
F Yu, H Wang, Z Ma, S Zhang
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 1, Year: 2024
MuPT: A Generative Symbolic Music Pretrained Transformer
X Qu, Y Bai, Y Ma, Z Zhou, KM Lo, J Liu, R Yuan, L Min, X Liu, T Zhang, ...
arXiv preprint arXiv:2404.06393, 2024
Cited by: 1, Year: 2024
Exploring effective distillation of self-supervised speech models for automatic speech recognition
Y Wang, C Tang, Z Ma, Z Zheng, X Chen, WQ Zhang
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-6, 2023
Cited by: 1, Year: 2023