Hiteshi Sharma
Verified email at microsoft.com
Title
Cited by
Year
Model-free reinforcement learning in infinite-horizon average-reward Markov decision processes
CY Wei, MJ Jahromi, H Luo, H Sharma, R Jain
International conference on machine learning, 10170-10180, 2020
Cited by: 90, Year: 2020
A universal empirical dynamic programming algorithm for continuous state MDPs
WB Haskell, R Jain, H Sharma, P Yu
IEEE Transactions on Automatic Control 65 (1), 115-129, 2019
Cited by: 20, Year: 2019
Fine-tuning language models with advantage-induced policy alignment
B Zhu, H Sharma, FV Frujeri, S Dong, C Zhu, MI Jordan, J Jiao
arXiv preprint arXiv:2306.02231, 2023
Cited by: 16, Year: 2023
Approximate relative value learning for average-reward continuous state MDPs
H Sharma, M Jafarnia-Jahromi, R Jain
Uncertainty in Artificial Intelligence, 956-964, 2020
Cited by: 14, Year: 2020
An empirical relative value learning algorithm for non-parametric MDPs with continuous state space
H Sharma, R Jain, A Gupta
2019 18th European Control Conference (ECC), 1368-1373, 2019
Cited by: 13, Year: 2019
Evaluating cognitive maps and planning in large language models with CogEval
I Momennejad, H Hasanbeig, F Vieira Frujeri, H Sharma, N Jojic, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 10, Year: 2024
Randomized function fitting-based empirical value iteration
WB Haskell, P Yu, H Sharma, R Jain
2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2467-2472, 2017
Cited by: 9, Year: 2017
Evaluating cognitive maps in large language models with CogEval: No emergent planning
I Momennejad, H Hasanbeig, FV Frujeri, H Sharma, RO Ness, N Jojic, ...
Advances in neural information processing systems 37, 2023
Cited by: 7, Year: 2023
An approximately optimal relative value learning algorithm for averaged MDPs with continuous states and actions
H Sharma, R Jain
2019 57th Annual Allerton Conference on Communication, Control, and …, 2019
Cited by: 7, Year: 2019
ALLURE: A systematic protocol for auditing and improving LLM-based evaluation of text using iterative in-context-learning
H Hasanbeig, H Sharma, L Betthauser, FV Frujeri, I Momennejad
arXiv preprint arXiv:2309.13701, 2023
Cited by: 6, Year: 2023
Language models can be logical solvers
J Feng, R Xu, J Hao, H Sharma, Y Shen, D Zhao, W Chen
arXiv preprint arXiv:2311.06158, 2023
Cited by: 4, Year: 2023
An empirical dynamic programming algorithm for continuous MDPs
WB Haskell, R Jain, H Sharma, P Yu
arXiv preprint arXiv:1709.07506, 2017
Cited by: 4, Year: 2017
Optimal spectrum sensing for cognitive radio with imperfect detector
H Sharma, A Patel, SN Merchant, UB Desai
2014 IEEE 79th Vehicular Technology Conference (VTC Spring), 1-5, 2014
Cited by: 4, Year: 2014
Finite Time Guarantees for Continuous State MDPs with Generative Model
H Sharma, R Jain
2020 59th IEEE Conference on Decision and Control (CDC), 3617-3622, 2020
Cited by: 1, Year: 2020
Randomized Policy Learning for Continuous State and Action MDPs
H Sharma, R Jain
arXiv preprint arXiv:2006.04331, 2020
Cited by: 1, Year: 2020
Empirical algorithms for general stochastic systems with continuous states and actions
H Sharma, R Jain, W Haskell
2019 IEEE 58th Conference on Decision and Control (CDC), 6344-6349, 2019
Cited by: 1, Year: 2019
QoS aware optimal base station ON/OFF policy and frequency planning
H Sharma, V Vaid, P Chaporkar, GS Kasbekar
Indian Inst. Technol. Bombay, 2015
Cited by: 1, Year: 2015
A dynamical systems framework for stochastic iterative optimization
WB Haskell, R Jain, H Sharma
2016 IEEE 55th Conference on Decision and Control (CDC), 4504-4509, 2016
Year: 2016
Empirical algorithms for stochastic iterative optimization
WB Haskell, R Jain, H Sharma
2015
Empirical dynamic programming on continuous state spaces
WB Haskell, R Jain, H Sharma
2015