Keywords: Autonomous Navigation, Multimodal Fusion, FPGA Acceleration, LLM (Large Language Model), Socially Compliant Path Planning, Dynamic Environment Adaptation

LLM-Driven Adaptive Autonomous Robot Navigation via Multimodal Fusion for Dynamic Environments

This research addresses the challenges of autonomous robot navigation in dynamic, high-density environments (e.g., train stations and shopping malls) by proposing a novel framework that integrates multimodal sensor fusion (LiDAR and vision) with a Large Language Model (LLM). To overcome the limitations of rule-based methods in handling unpredictable human behavior and dynamic obstacles, our system combines FPGA-accelerated real-time sensor processing with LLM-driven, socially compliant path planning. Specifically, LiDAR point clouds and Triple-RGB camera data are fused on an FPGA using the Hungarian algorithm, while the LLM analyzes pedestrian attributes (e.g., age, wheelchair usage) to dynamically adjust navigation priorities. Experimental results demonstrate a 40% reduction in pedestrian prediction error compared to baseline models, with the FPGA processing achieving sub-10 ms latency. Future work includes improving inference accuracy via Q-LoRA fine-tuning and independently verifying the FPGA modules.
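For illustration, the sketch below shows how the Hungarian algorithm can associate projected LiDAR cluster centroids with camera detections, and how LLM-derived pedestrian attributes (age, wheelchair usage) might feed a simple navigation-priority weight. This is a CPU-side sketch only: in the described system the fusion runs on the FPGA, and the function names, gating threshold, and priority weights here are illustrative assumptions rather than values from the study.

```python
# Minimal sketch of LiDAR-camera association via the Hungarian algorithm,
# plus a toy priority weighting from LLM-extracted pedestrian attributes.
# Thresholds and weights are assumed for illustration.
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm


def associate(lidar_centroids_px, camera_boxes, gate_px=50.0):
    """Match projected LiDAR cluster centroids (N x 2, image coordinates)
    to camera bounding boxes (M x 4, [x1, y1, x2, y2]) by minimizing the
    total Euclidean distance between centroids and box centers."""
    box_centers = np.column_stack([
        (camera_boxes[:, 0] + camera_boxes[:, 2]) / 2.0,
        (camera_boxes[:, 1] + camera_boxes[:, 3]) / 2.0,
    ])
    # Cost matrix: pairwise distances between LiDAR centroids and box centers.
    cost = np.linalg.norm(
        lidar_centroids_px[:, None, :] - box_centers[None, :, :], axis=2
    )
    rows, cols = linear_sum_assignment(cost)
    # Reject matches farther apart than the (assumed) gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate_px]


def navigation_priority(attributes):
    """Toy priority weight from pedestrian attributes: a higher weight means
    the planner should yield earlier or keep a wider berth."""
    weight = 1.0
    if attributes.get("wheelchair"):
        weight += 1.0
    if attributes.get("age_group") in ("child", "elderly"):
        weight += 0.5
    return weight


if __name__ == "__main__":
    lidar = np.array([[320.0, 240.0], [100.0, 210.0]])
    boxes = np.array([[300.0, 220.0, 340.0, 260.0],
                      [90.0, 190.0, 120.0, 230.0]])
    print(associate(lidar, boxes))  # [(0, 0), (1, 1)]
    print(navigation_priority({"wheelchair": True, "age_group": "adult"}))  # 2.0
```

The same assignment structure carries over to the FPGA implementation, where the cost-matrix construction and assignment are parallelized to meet the sub-10 ms latency target; the LLM-derived weights would then modulate how strongly each matched pedestrian constrains the planned path.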

