Finite-sample analysis for SARSA with linear function approximation. S. Zou, T. Xu, Y. Liang. Advances in Neural Information Processing Systems 32, 2019. Cited by 141.
Improving sample complexity bounds for actor-critic algorithms. T. Xu, Z. Wang, Y. Liang. arXiv preprint arXiv:2004.12956, 2020. Cited by 94*.
Two time-scale off-policy TD learning: Non-asymptotic analysis over Markovian samples. T. Xu, S. Zou, Y. Liang. Advances in Neural Information Processing Systems 32, 2019. Cited by 71.
CRPO: A new approach for safe reinforcement learning with convergence guarantee. T. Xu, Y. Liang, G. Lan. International Conference on Machine Learning, 11480-11491, 2021. Cited by 58*.
Reanalysis of variance reduced temporal difference learning. T. Xu, Z. Wang, Y. Zhou, Y. Liang. arXiv preprint arXiv:2001.01898, 2020. Cited by 37.
Algorithms for the estimation of transient surface heat flux during ultra-fast surface cooling. Z. F. Zhou, T. Y. Xu, B. Chen. International Journal of Heat and Mass Transfer 100, 1-10, 2016. Cited by 36.
Enhanced first and zeroth order variance reduced algorithms for min-max optimization. T. Xu, Z. Wang, Y. Liang, H. V. Poor. 2020. Cited by 27*.
Non-asymptotic convergence of Adam-type reinforcement learning algorithms under Markovian sampling. H. Xiong, T. Xu, Y. Liang, W. Zhang. Proceedings of the AAAI Conference on Artificial Intelligence 35 (12), 10460…, 2021. Cited by 26.
When will gradient methods converge to max-margin classifier under ReLU models? T. Xu, Y. Zhou, K. Ji, Y. Liang. arXiv preprint arXiv:1806.04339, 2018. Cited by 21*.
Sample complexity bounds for two timescale value-based reinforcement learning algorithms. T. Xu, Y. Liang. International Conference on Artificial Intelligence and Statistics, 811-819, 2021. Cited by 18.
Doubly robust off-policy actor-critic: Convergence and optimality. T. Xu, Z. Yang, Z. Wang, Y. Liang. International Conference on Machine Learning, 11581-11591, 2021. Cited by 17.
Proximal gradient descent-ascent: Variable convergence under KŁ geometry. Z. Chen, Y. Zhou, T. Xu, Y. Liang. arXiv preprint arXiv:2102.04653, 2021. Cited by 15.
Faster algorithm and sharper analysis for constrained Markov decision process. T. Li, Z. Guan, S. Zou, T. Xu, Y. Liang, G. Lan. arXiv preprint arXiv:2110.10351, 2021. Cited by 9.
When will generative adversarial imitation learning algorithms attain global convergence. Z. Guan, T. Xu, Y. Liang. International Conference on Artificial Intelligence and Statistics, 1117-1125, 2021. Cited by 8.
Model-based offline meta-reinforcement learning with regularization. S. Lin, J. Wan, T. Xu, Y. Liang, J. Zhang. arXiv preprint arXiv:2202.02929, 2022. Cited by 4.
PER-ETD: A polynomially efficient emphatic temporal difference learning method. Z. Guan, T. Xu, Y. Liang. arXiv preprint arXiv:2110.06906, 2021. Cited by 2.
Provably efficient offline reinforcement learning with trajectory-wise reward. T. Xu, Y. Liang. arXiv preprint arXiv:2206.06426, 2022. Cited by 1.
A unifying framework of off-policy general value function evaluation. T. Xu, Z. Yang, Z. Wang, Y. Liang. Advances in Neural Information Processing Systems. Cited by 1*.
Deterministic policy gradient: Convergence analysis. H. Xiong, T. Xu, L. Zhao, Y. Liang, W. Zhang. Uncertainty in Artificial Intelligence, 2159-2169, 2022.
Towards the understanding of sample efficient reinforcement learning algorithms. T. Xu. The Ohio State University, 2022.