Boris Belousov
Catching heuristics are optimal control policies
B Belousov, G Neumann, CA Rothkopf, JR Peters
Advances in neural information processing systems, 1426-1434, 2016
f-Divergence constrained policy improvement
B Belousov, J Peters
arXiv preprint arXiv:1801.00056, 2017
Entropic Regularization of Markov Decision Processes
B Belousov, J Peters
Entropy 21 (7), 2019
Belief space model predictive control for approximately optimal system identification
B Belousov, H Abdulsamad, M Schultheis, J Peters
4th Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2019
Entropic Risk Measure in Policy Search
D Nass, B Belousov, J Peters
arXiv preprint arXiv:1906.09090, 2019
Interactive Assemblies: Man-Machine Collaboration through Building Components for As-Built Digital Models
B Wibranek, B Belousov, A Sadybakasov, O Tessmann
Computer-Aided Architectural Design Futures (CAAD Futures), 2019
The Chow-Robbins game with an unknown coin
B Belousov
Mean squared advantage minimization as a consequence of entropic policy improvement regularization
B Belousov, J Peters
14th European Workshop on Reinforcement Learning, 2018
Goal-Directed Reward Generation
A Sadybakasov, B Belousov