BEVFormer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision C Yang, Y Chen, H Tian, C Tao, X Zhu, Z Zhang, G Huang, H Li, Y Qiao, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 120 | 2023 |
Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe H Li, C Sima, J Dai, W Wang, L Lu, H Wang, J Zeng, Z Li, J Yang, H Deng, ... IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 67 | 2023 |
Ghost in the Minecraft: Generally capable agents for open-world environments via large language models … X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, X Wang, ... arXiv preprint arXiv:2305.17144 2 (3), 5, 2023 | 53 | 2023 |
Ghost in the Minecraft: Generally capable agents for open-world environments via large language models with text-based knowledge and memory X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, ... arXiv preprint arXiv:2305.17144, 2023 | 33 | 2023 |
Unsupervised object detection with lidar clues H Tian, Y Chen, J Dai, Z Zhang, X Zhu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 28 | 2021 |
Ghost in the Minecraft: Generally capable agents for open-world environments … X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, X Wang, Y Qiao, Z Zhang, J Dai 2023 | 24 | 2023 |
DriveMLM: Aligning multi-modal large language models with behavioral planning states for autonomous driving W Wang, J Xie, CY Hu, H Zou, J Fan, W Tong, Y Wen, S Wu, H Deng, Z Li, ... arXiv preprint arXiv:2312.09245, 2023 | 13 | 2023 |
Ghost in the Minecraft: Generally capable agents for open … X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, X Wang, Y Qiao, Z Zhang, J Dai arXiv preprint arXiv:2305.17144, 2023 | 9 | |
Delving into the Devils of Bird’s-eye-view Perception: A Review, Evaluation and Recipe H Li, C Sima, J Dai, W Wang, L Lu, H Wang, E Xie, Z Li, H Deng, H Tian arXiv, 2022 | 7 | 2022 |
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation R Fang, S Yan, Z Huang, J Zhou, H Tian, J Dai, H Li arXiv preprint arXiv:2311.18835, 2023 | 3 | 2023 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ... arXiv preprint arXiv:2404.16821, 2024 | | 2024 |
Ghost in the Minecraft: Hierarchical Agents for Minecraft via Large Language Models with Text-based Knowledge and Memory X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, ... | | 2023 |