Wentong Li 李文通
College of Artificial Intelligence
Nanjing University of Aeronautics and Astronautics (NUAA)
Nanjing, China
About Me
I am an Associate Professor in the College of Artificial Intelligence at Nanjing University of Aeronautics and Astronautics.
Previously, I completed my Ph.D. at the College of Computer Science and Technology, Zhejiang University, in June 2024, where I was fortunate to be supervised by Prof. Jianke Zhu and Prof. Lei Zhang (PolyU, HK).
My recent research interests are Visual/Scene Understanding, Embodied AI and Multimodal Large Language Models, particularly in:
1. Equipping MLLMs/VLMs for common visual tasks, including visual referring and grounding in images, videos, and 3D scenes.
2. Embodied scene understanding and reasoning, including egocentric image/video analysis, reasoning and interaction, and scene navigation.
3. Efficient and effective MLLMs, including token reduction, lightweight MLLMs, and efficient high-resolution understanding.
Previously, I mainly focused on object detection, image segmentation, and their weakly-supervised/label-efficient approaches. I am also interested in autonomous driving tasks (HD-Map, 3D-Occupancy, etc.) and 3D reconstruction tasks.
News
- [2024.12]: Won the Outstanding Doctoral Dissertation Award of ZJU (浙江大学优秀博士论文).
- [2024.6]: Obtained my Ph.D. degree from ZJU.
Preprints
TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li*, Yuqian Yuan*, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang
ArXiv:2407.02392, 2024
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing
ArXiv:2501.00599, 2025
Scalable Autoregressive Monocular Depth Estimation
Jinhong Wang, Jian Liu, Dongqi Tang, Weiqiang Wang, Wentong Li, Danny Chen, Jian Wu
ArXiv:2411.11361, 2024
ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning
Song Wang, Zhongdao Wang, Jiawei Yu, Wentong Li, Bailan Feng, Junbo Chen, Jianke Zhu
ArXiv:2409.18026, 2024
Selected Publications
Osprey: Pixel Understanding with Visual Instruction Tuning
Yuqian Yuan*, Wentong Li*, Jian Liu, Dongqi Tang, Xinjie Luo, Chi Qin, Lei Zhang, Jianke Zhu
CVPR, 2024. (Project Leader)
Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang
T-PAMI, 2024
Label-efficient Segmentation via Affinity Propagation
Wentong Li*, Yuqian Yuan*, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang
NeurIPS, 2023
Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport
Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang
ICCV, 2023
Box-supervised Instance Segmentation with Level Set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang
ECCV, 2022
Oriented RepPoints for Aerial Object Detection
Wentong Li, Yijie Chen, Kaixuan Hu, Jianke Zhu
CVPR, 2022
Label-efficient Semantic Scene Completion with Scribble Annotations
Song Wang, Jiawei Yu, Wentong Li, Hao Shi, Kailun Yang, Junbo Chen, Jianke Zhu
IJCAI, 2024
MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
Xiaolu Liu, Song Wang, Wentong Li, Ruizi Yang, Junbo Chen, Jianke Zhu
CVPR, 2024
Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Junbo Chen, Jianke Zhu
CVPR, 2024
Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering
Qiqun Gan, Wentong Li, Jinwei Ren, Jianke Zhu
AAAI, 2024
H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection
Xue Yang, Gefan Zhang, Wentong Li, Xuehui Wang, Yue Zhou, Junchi Yan
ICLR, 2023
LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
Song Wang, Wentong Li, Wenyu Liu, Xiaolu Liu, Jianke Zhu
CVPR, 2023
Honors
- Outstanding Doctoral Dissertation Award of Zhejiang University, 2024
- Outstanding Graduate of Zhejiang Province, 2024
- Outstanding Graduate of Zhejiang University, 2024
- Tencent Scholarship, 2023
- Five-A Postgraduate Student, 2023
- Outstanding Postgraduate Student, 2020-2023
- Longhu Scholarship, 2022
- First-class Academic Scholarship, 2018-2023
- National Scholarship, 2016
Research Intern
Academic Services
- Conference Reviewer:
AAAI2025, ICLR2025, CVPR2025, ICML2025
CVPR2024, ICLR2024, ICML2024, ECCV2024, ACM MM2024, NeurIPS2024
CVPR2023, ICCV2023, NeurIPS2023, ACM MM2023
- Journal Reviewer:
Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
Transactions on Circuits and Systems for Video Technology (TCSVT)
Transactions on Multimedia (TMM)
Transactions on Geoscience and Remote Sensing (TGRS)
Pattern Recognition (PR)
ACM Computing Surveys
ISPRS Journal of Photogrammetry and Remote Sensing (P&RS)
Neurocomputing
Tech. Talks
- Fine-grained Image Understanding with MLLMs, ECNU, Visual Perception+X (VPX) Group, 2024/09.
- Osprey: Pixel Understanding with Visual Instruction Tuning, Video, slides, AI TIME, 2024/01.
- Point-supervised Image Segmentation, AntGroup, Machine Intelligence Group, 2023/09.
Teaching Assistant
- Teaching Assistant, Police Brain of Zhejiang Province, Image Processing and Analysis, Fall 2022.
- Teaching Assistant, Zhejiang University, FDS2021: Foundation of Data Structure, Fall 2021.
© Wentong Li | Last update: Jan. 2025