Lightweight End-to-End Multimodal Model for Autonomous Driving
Zhijie Qiao1,†, Haowei Li1,†, Zhong Cao1, Henry X. Liu1,2,*

*This research was partially funded by the DARPA TIAMAT Challenge (HR0011-24-9-0429).
1Z. Qiao, H. Li, Z. Cao, and H.X. Liu are with the Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI 48109, USA.
2H.X. Liu is also with the University of Michigan Transportation Research Institute, Ann Arbor, MI 48109, USA.
†These authors contributed equally to this work.
*Corresponding author: Henry X. Liu (henryliu@umich.edu).

Abstract

Vision-Language Models (VLMs) have demonstrated significant potential for end-to-end autonomous driving. However, fully exploiting their capabilities for safe and reliable vehicle control remains an open research challenge. To systematically examine advances and limitations of VLMs in driving tasks, we introduce LightEMMA, a Lightweight End-to-End Multimodal Model for Autonomous driving. LightEMMA provides a unified, VLM-based autonomous driving framework without ad hoc customizations, e...

