BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1610 | 2023 |
Hate speech dataset from a white supremacy forum O De Gibert, N Perez, A García-Pablos, M Cuadros arXiv preprint arXiv:1809.04444, 2018 | 544 | 2018 |
Are multilingual models the best choice for moderately under-resourced languages? A comprehensive assessment for Catalan J Armengol-Estapé, CP Carrino, C Rodriguez-Penagos, OG Bonet, ... arXiv preprint arXiv:2107.07903, 2021 | 47 | 2021 |
On the multilingual capabilities of very large-scale English language models J Armengol-Estapé, OG Bonet, M Melero arXiv preprint arXiv:2108.13349, 2021 | 28 | 2021 |
Spanish biomedical crawled corpus: A large, diverse dataset for Spanish biomedical language models CP Carrino, J Armengol-Estapé, OG Bonet, A Gutiérrez-Fandiño, ... arXiv preprint arXiv:2109.07765, 2021 | 18 | 2021 |
Spanish biomedical and clinical language embeddings A Gutiérrez-Fandiño, J Armengol-Estapé, CP Carrino, O De Gibert, ... arXiv preprint arXiv:2102.12843, 2021 | 10 | 2021 |
Estrategia multidimensional para la selección de candidatos de traducción automática para posedición N Aranberri, O de Gibert Linguamática 11 (2), 3-16, 2019 | 10 | 2019 |
A New Massive Multilingual Dataset for High-Performance Language Technologies O De Gibert, G Nail, N Arefyev, M Bañón, J Van Der Linde, S Ji, ... arXiv preprint arXiv:2403.14009, 2024 | 8 | 2024 |
Four approaches to low-resource multilingual NMT: The Helsinki submission to the AmericasNLP 2023 shared task O De Gibert, R Vázquez, M Aulamo, Y Scherrer, S Virpioja, J Tiedemann Proceedings of the Workshop on Natural Language Processing for Indigenous …, 2023 | 6 | 2023 |
Quality versus Quantity: Building Catalan-English MT Resources O de Gibert, K Kharitonova, BC Figueras, J Armengol-Estapé, M Melero | 6* | 2022 |
Automatic removal of identifying information in official EU languages for public administrations: The MAPA Project L Gianola, Ē Ajausks, V Arranz, C Bendahman, L Bié, C Borg, A Cerdà, ... Legal Knowledge and Information Systems, 223-226, 2020 | 6 | 2020 |
Findings of the AmericasNLP 2024 Shared Task on Machine Translation into Indigenous Languages A Ebrahimi, O de Gibert, R Vazquez, R Coto-Solano, P Denisov, R Pugh, ... Proceedings of the 4th Workshop on Natural Language Processing for …, 2024 | 4 | 2024 |
The Catalan language club C Rodriguez-Penagos, C Armentano-Oller, M Villegas, M Melero, ... arXiv preprint arXiv:2112.01894, 2021 | 4 | 2021 |
The OPUS-MT Dashboard – A Toolkit for a Systematic Evaluation of Open Machine Translation Models J Tiedemann, O De Gibert Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 3 | 2023 |
Sequence-to-Sequence Resources for Catalan O de Gibert, K Kharitonova, BC Figueras, J Armengol-Estapé, M Melero arXiv preprint arXiv:2202.06871, 2022 | 3 | 2022 |
Spanish Datasets for Sensitive Entity Detection in the Legal Domain O de Gibert, A García-Pablos, M Cuadros, M Melero | 3* | 2022 |
Unsupervised Machine Translation in Real-World Scenarios O de Gibert, I Goenaga, J Armengol-Estapé, O Perez-de-Vinaspre | 2* | 2022 |
To post-edit or to translate... That is the question: a case study of a recommender system for Quality Estimation of Machine Translation based on linguistic features O de Gibert Bonet | 2 | 2018 |
Hybrid distillation from RBMT and NMT: Helsinki-NLP’s submission to the Shared Task on Translation into Low-Resource Languages of Spain O De Gibert, M Aulamo, Y Scherrer, J Tiedemann Proceedings of the Ninth Conference on Machine Translation, 908-917, 2024 | 1 | 2024 |
MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki T Mickus, SA Grönroos, J Attieh, M Boggia, O De Gibert, S Ji, NA Lopi, ... arXiv preprint arXiv:2403.07544, 2024 | 1 | 2024 |