Upcoming Schedule

Safely Leveraging Vision-Language Foundation Models in Robotics: Challenges and Opportunities

May 23, 2025 @ 8:30 am - 5:30 pm

This workshop focuses on the safety implications of using vision-language foundation models (VLMs) in robotics, where these models are increasingly employed in systems such as mobile manipulators and autonomous vehicles. VLMs, pre-trained on internet-scale data, offer the potential to enhance robots’ understanding of complex environments, human feedback, and action planning. However, as robots become more integrated into safety-critical applications, the risks posed by incorrect visual or language interpretations, misaligned behaviors, or slow inference become a serious concern.

The workshop aims to address how to safely leverage VLMs in robotics, exploring safety challenges throughout the lifecycle of these models. Topics include:

  1. Training: Determining the types of embodied data needed to achieve desired robotic capabilities.
  2. Fine-tuning: Aligning models with human input and ensuring they are adaptable to different contexts.
  3. Deployment: Ensuring models can run in real-time, detect out-of-distribution scenarios, and reliably hand over control to fallback strategies when necessary.

The goal is to foster dialogue among academia, industry, and regulators, raising important questions about the core safety challenges, promising strategies, limitations, and regulatory considerations involved in deploying VLMs in robotics.