Stephen McAleer
OpenAI
Verified email at openai.com
Title / Cited by / Year
Highly accurate machine fault diagnosis using deep transfer learning
S Shao, S McAleer, R Yan, P Baldi
IEEE Transactions on Industrial Informatics 15 (4), 2446-2455, 2018
Cited by: 1078 | Year: 2018
Solving the Rubik’s cube with deep reinforcement learning and search
F Agostinelli*, S McAleer*, A Shmakov*, P Baldi
Nature Machine Intelligence 1 (8), 356-363, 2019
Cited by: 219 | Year: 2019
Language Models can Solve Computer Tasks
G Kim, P Baldi, S McAleer
Neural Information Processing Systems (NeurIPS), 2023
Cited by: 170 | Year: 2023
Mastering the game of stratego with model-free multiagent reinforcement learning
J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub, V de Boer, ...
Science 378 (6623), 990-996, 2022
Cited by: 156 | Year: 2022
Solving the Rubik's Cube with Approximate Policy Iteration
S McAleer*, F Agostinelli*, A Shmakov*, P Baldi
International Conference on Learning Representations (ICLR), 2018
Cited by: 96* | Year: 2018
Llemma: An Open Language Model for Mathematics
Z Azerbayev, H Schoelkopf, K Paster, M Dos Santos, S McAleer, AQ Jiang, ...
International Conference on Learning Representations (ICLR), 2023
Cited by: 94 | Year: 2023
AI Alignment: A Comprehensive Survey
J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang, Y Duan, Z He, J Zhou, ...
arXiv preprint arXiv:2310.19852, 2023
Cited by: 85 | Year: 2023
Pipeline PSRO: A scalable approach for finding approximate nash equilibria in large games
S McAleer*, J Lanier*, R Fox, P Baldi
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
Cited by: 75 | Year: 2020
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Y Chen, Y Yang, T Wu, S Wang, X Feng, J Jiang, SM McAleer, H Dong, ...
36th Conference on Neural Information Processing Systems (NeurIPS 2022 …, 2022
Cited by: 66 | Year: 2022
Evolutionary reinforcement learning for sample-efficient multiagent coordination
S Majumdar, S Khadka, S Miret, S McAleer, K Tumer
International Conference on Machine Learning (ICML), 2020
Cited by: 63 | Year: 2020
XDO: A double oracle algorithm for extensive-form games
S McAleer, J Lanier, P Baldi, R Fox
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 53 | Year: 2021
Independent Natural Policy Gradient Always Converges in Markov Potential Games
R Fox, S McAleer, W Overman, I Panageas
AISTATS 2022, 2021
Cited by: 47 | Year: 2021
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 46* | Year: 2021
Online Double Oracle
LC Dinh, Y Yang, S McAleer, NP Nieves, O Slumbers, Z Tian, DH Mguni, ...
Transactions on Machine Learning Research, 2021
Cited by: 29 | Year: 2021
Deep-learning-based reconstruction of the neutrino direction and energy for in-ice radio detectors
C Glaser, S McAleer, S Stjärnholm, P Baldi, SW Barwick
Astroparticle Physics 145, 102781, 2023
Cited by: 27* | Year: 2023
White Paper: ARIANNA-200 high energy neutrino telescope
A Anker, P Baldi, SW Barwick, D Bergman, H Bernhoff, DZ Besson, ...
arXiv preprint arXiv:2004.09841, 2020
Cited by: 26 | Year: 2020
Alphazero-like tree-search can guide large language model decoding and training
X Feng, Z Wan, M Wen, S McAleer, Y Wen, W Zhang, J Wang
arXiv preprint arXiv:2309.17179, 2023
Cited by: 22 | Year: 2023
Curiosity-Driven Multi-Criteria Hindsight Experience Replay
J Lanier, S McAleer, P Baldi
NeurIPS 2019 Deep RL Workshop, 2019
Cited by: 22 | Year: 2019
Reducing variance in temporal-difference value estimation via ensemble of deep networks
L Liang, Y Xu, S McAleer, D Hu, A Ihler, P Abbeel, R Fox
International Conference on Machine Learning (ICML), 2022
Cited by: 21* | Year: 2022
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games
S McAleer, JB Lanier, K Wang, P Baldi, R Fox, T Sandholm
International Conference on Learning Representations (ICLR), 2022
Cited by: 20* | Year: 2022