Wentong Li 李文通

College of Artificial Intelligence

Nanjing University of Aeronautics and Astronautics (NUAA)

No. 29 Jiangjun Road, Nanjing, China

Office: 1205, No.1 Building

     

About Me

I am an Associate Professor in the College of Artificial Intelligence at Nanjing University of Aeronautics and Astronautics. In August 2025, I was a visiting researcher at the Department of Computing, The Hong Kong Polytechnic University, where I collaborated with my Ph.D. advisor, Prof. Lei Zhang (IEEE Fellow). Previously, I completed my Ph.D. at the College of Computer Science and Technology, Zhejiang University, in June 2024, supervised by Prof. Jianke Zhu and Prof. Lei Zhang. My recent research interests are Visual/Scene Understanding, Embodied AI, and Vision-Language Models, particularly in:

I am looking for self-motivated Master's students, research interns/assistants, and Ph.D. students (co-supervised). Please email me if you are interested.

News

Publications

(*: equal contribution, #: corresponding author, +: project leader)

Preprints

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems
Song Wang, Lingdong Kong, Xiaolu Liu, Hao Shi, Wentong Li, Jianke Zhu, Steven C. H. Hoi
arXiv, 2025

Paper | Code

PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity
Yuqian Yuan, Wenqiao Zhang, Xin Li, Shihao Wang, Kehan Li, Wentong Li#, Jun Xiao, Lei Zhang, Beng Chin Ooi
arXiv, 2025

Selected Publications

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration
Hanxun Yu*, Wentong Li*, Xuan Qu*, Song Wang, Junbo Chen, Jianke Zhu
ICLR, 2026

Paper | Code

MUVR: A Multi-Modal Untrimmed Video Retrieval Benchmark with Multi-Level Visual Correspondence
Yue Feng, Jinwei Hu, Qijia Lu, Jiawei Niu, Li Tan, Shuo Yuan, Ziyi Yan, Yizhen Jia, Qingzhi He, Shiping Ge, Ethan Q. Chen, Wentong Li#, Limin Wang, Jie Qin
NeurIPS (DB Track), 2025

Paper | Code | Data

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
Yuqian Yuan*, Ronghao Dang*, Long Li*, Wentong Li*, Diao Jiao, Xin Li, Deli Zhao, Fan Wang, Wenqiao Zhang, Jun Xiao, Yueting Zhuang
NeurIPS (DB Track), 2025

TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li*, Yuqian Yuan*, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang
IJCV, 2025

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing
CVPR, 2025

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu*, Wentong Li*, Song Wang, Junbo Chen, Jianke Zhu
CVPR, 2025 (Highlight, 2.9%)

Paper | Code

Osprey: Pixel Understanding with Visual Instruction Tuning
Yuqian Yuan*, Wentong Li*+, Jian Liu, Dongqi Tang, Xinjie Luo, Chi Qin, Lei Zhang, Jianke Zhu
CVPR, 2024 (Project Leader)

Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang
T-PAMI, 2024

Research Experiences

Honors

Academic Services

Tech. Talks

Teaching

© Wentong Li | Last update: Jan. 2026