| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| BLOOM: A 176B-parameter open-access multilingual language model | T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | | 1493 | 2023 |
| Exploiting cloze questions for few shot text classification and natural language inference | T Schick, H Schütze | arXiv preprint arXiv:2001.07676 | 1487 | 2020 |
| Toolformer: Language models can teach themselves to use tools | T Schick, J Dwivedi-Yu, R Dessì, R Raileanu, M Lomeli, E Hambro, ... | Advances in Neural Information Processing Systems 36 | 1119 | 2024 |
| Beyond the imitation game: Quantifying and extrapolating the capabilities of language models | A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... | arXiv preprint arXiv:2206.04615 | 1017 | 2022 |
| It's not just size that matters: Small language models are also few-shot learners | T Schick, H Schütze | arXiv preprint arXiv:2009.07118 | 889 | 2020 |
| Atlas: Few-shot learning with retrieval augmented language models | G Izacard, P Lewis, M Lomeli, L Hosseini, F Petroni, T Schick, ... | arXiv preprint arXiv:2208.03299 | 530* | 2022 |
| Augmented language models: a survey | G Mialon, R Dessì, M Lomeli, C Nalmpantis, R Pasunuru, R Raileanu, ... | arXiv preprint arXiv:2302.07842 | 403 | 2023 |
| Self-diagnosis and self-debiasing: A proposal for reducing corpus-based bias in NLP | T Schick, S Udupa, H Schütze | Transactions of the Association for Computational Linguistics 9, 1408-1424 | 307 | 2021 |
| Unnatural instructions: Tuning language models with (almost) no human labor | O Honovich, T Scialom, O Levy, T Schick | arXiv preprint arXiv:2212.09689 | 247 | 2022 |
| Generating datasets with pretrained language models | T Schick, H Schütze | arXiv preprint arXiv:2104.07540 | 202 | 2021 |
| Automatically identifying words that can serve as labels for few-shot text classification | T Schick, H Schmid, H Schütze | arXiv preprint arXiv:2010.13641 | 192 | 2020 |
| Self-alignment with instruction backtranslation | X Li, P Yu, C Zhou, T Schick, O Levy, L Zettlemoyer, J Weston, M Lewis | arXiv preprint arXiv:2308.06259 | 139 | 2023 |
| Few-shot text generation with natural language instructions | T Schick, H Schütze | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing | 120 | 2021 |
| Rare words: A major problem for contextualized embeddings and how to fix it by attentive mimicking | T Schick, H Schütze | Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8766-8774 | 114 | 2020 |
| Few-shot text generation with pattern-exploiting training | T Schick, H Schütze | arXiv preprint arXiv:2012.11926 | 99 | 2020 |
| PEER: A collaborative language model | T Schick, J Dwivedi-Yu, Z Jiang, F Petroni, P Lewis, G Izacard, Q You, ... | arXiv preprint arXiv:2208.11663 | 98 | 2022 |
| Task-aware retrieval with instructions | A Asai, T Schick, P Lewis, X Chen, G Izacard, S Riedel, H Hajishirzi, ... | arXiv preprint arXiv:2211.09260 | 74 | 2022 |
| True few-shot learning with prompts: A real-world perspective | T Schick, H Schütze | Transactions of the Association for Computational Linguistics 10, 716-731 | 55 | 2022 |
| Attentive mimicking: Better word embeddings by attending to informative contexts | T Schick, H Schütze | arXiv preprint arXiv:1904.01617 | 54 | 2019 |
| BERTRAM: Improved word embeddings have big impact on contextualized model performance | T Schick, H Schütze | arXiv preprint arXiv:1910.07181 | 49 | 2019 |