Boris Belousov
Catching heuristics are optimal control policies
B Belousov, G Neumann, CA Rothkopf, JR Peters
Advances in Neural Information Processing Systems, 1426-1434, 2016
f-Divergence constrained policy improvement
B Belousov, J Peters
arXiv preprint arXiv:1801.00056, 2017
Receding Horizon Curiosity
M Schultheis, B Belousov, H Abdulsamad, J Peters
arXiv preprint arXiv:1910.03620, 2019
Self-Paced Contextual Reinforcement Learning
P Klink, H Abdulsamad, B Belousov, J Peters
arXiv preprint arXiv:1910.02826, 2019
Building a Library of Tactile Skills Based on FingerVision
B Belousov, A Sadybakasov, B Wibranek, F Veiga, O Tessmann, J Peters
2019 IEEE-RAS 19th International Conference on Humanoid Robots (Humanoids), 2019
HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints
M Lutter, B Belousov, K Listmann, D Clever, J Peters
arXiv preprint arXiv:1909.06153, 2019
Interactive Structure: Robotic Repositioning of Vertical Elements in Man-Machine Collaborative Assembly through Vision-Based Tactile Sensing
B Wibranek, B Belousov, A Sadybakasov, J Peters, O Tessmann
37th eCAADe and 23rd SIGraDi Conference 2, 705-713, 2019
Entropic Regularization of Markov Decision Processes
B Belousov, J Peters
Entropy 21 (7), 2019
Reverse ELBO
B Belousov
Belief space model predictive control for approximately optimal system identification
B Belousov, H Abdulsamad, M Schultheis, J Peters
4th Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2019
Entropic Risk Measure in Policy Search
D Nass, B Belousov, J Peters
arXiv preprint arXiv:1906.09090, 2019
Interactive Assemblies: Man-Machine Collaboration through Building Components for As-Built Digital Models
B Wibranek, B Belousov, A Sadybakasov, O Tessmann
Computer-Aided Architectural Design Futures (CAAD Futures), 2019
The Chow-Robbins game with an unknown coin
B Belousov
Mean squared advantage minimization as a consequence of entropic policy improvement regularization
B Belousov, J Peters
14th European Workshop on Reinforcement Learning, 2018
Goal-Directed Reward Generation
A Sadybakasov, B Belousov