Wentong Li 李文通

PhD

Zhejiang University
Email: liwentong@zju.edu.cn

About Me

I did my Ph.D. at the College of Computer Science and Technology, Zhejiang University, where I was fortunate to be supervised by Prof. Jianke Zhu and Prof. Lei Zhang (PolyU, HK). My recent research interests are visual understanding and multimodal large language models (MLLMs), particularly:
1. Enabling MLLMs to handle downstream vision tasks, including visual referring and grounding for images, videos, and 3D scenes.
2. Efficient and effective MLLMs, including token reduction, lightweight MLLMs, and efficient high-resolution understanding.
Previously, I mainly focused on object detection, image segmentation, and their weakly supervised/label-efficient variants. I am also interested in autonomous driving tasks (HD maps, 3D occupancy, etc.) and 3D reconstruction.

Preprints

TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li*, Yuqian Yuan*, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang
arXiv:2407.02392, 2024

Selected Publications

Osprey: Pixel Understanding with Visual Instruction Tuning
Yuqian Yuan*, Wentong Li*, Jian Liu, Dongqi Tang, Xinjie Luo, Chi Qin, Lei Zhang, Jianke Zhu
CVPR, 2024

Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang
T-PAMI, 2024

Label-efficient Segmentation via Affinity Propagation
Wentong Li*, Yuqian Yuan*, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang
NeurIPS, 2023

Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport
Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang
ICCV, 2023

Box-supervised Instance Segmentation with Level Set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang
ECCV, 2022

Oriented RepPoints for Aerial Object Detection
Wentong Li, Yijie Chen, Kaixuan Hu, Jianke Zhu
CVPR, 2022

MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
Xiaolu Liu, Song Wang, Wentong Li, Ruizi Yang, Junbo Chen, Jianke Zhu
CVPR, 2024

Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Junbo Chen, Jianke Zhu
CVPR, 2024

Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering
Qiqun Gan, Wentong Li, Jinwei Ren, Jianke Zhu
AAAI, 2024

H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection
Xue Yang, Gefan Zhang, Wentong Li, Xuehui Wang, Yue Zhou, Junchi Yan
ICLR, 2023

LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
Song Wang, Wentong Li, Wenyu Liu, Xiaolu Liu, Jianke Zhu
CVPR, 2023

Honors

Research Intern

Academic Services

Tech. Talks

Teaching Assistant

© Wentong Li | Last update: Oct. 2024