Ryan S Park
Stanford Student
Verified email at stanford.edu
Title
Cited by
Year
Disentangling length from quality in direct preference optimization
R Park, R Rafailov, S Ermon, C Finn
arXiv preprint arXiv:2403.19159, 2024
Cited by: 62 | Year: 2024
From r to Q*: Your Language Model is Secretly a Q-Function
R Rafailov, J Hejna, R Park, C Finn
arXiv preprint arXiv:2404.12358, 2024
Cited by: 55 | Year: 2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
R Rafailov, Y Chittepu, R Park, H Sikchi, J Hejna, B Knox, C Finn, ...
arXiv preprint arXiv:2406.02900, 2024
Cited by: 21 | Year: 2024
Preference Optimization for Molecular Language Models
R Park, R Theisen, N Sahni, M Patek, A Cichońska, R Rahman
arXiv preprint arXiv:2310.12304, 2023
Cited by: 2 | Year: 2023