Recent advances in training large-scale models on internet-scale data show great promise for developing foundation models that enable robots to perform sophisticated tasks in human-centered environments, such as transportation, households, healthcare, and warehouses. These models introduce new paradigms for human-robot interaction, allowing robots to follow complex human instructions, collaborate effectively, and understand and comply with social norms.
However, deploying embodied AI agents in the physical world presents additional challenges compared to deploying AI models that interact with humans virtually. First, these models must be grounded in human representations and understanding of the physical world to ensure effective and trustworthy interactions. Second, the risk of physical harm to humans imposes much stricter requirements on AI safety. Third, acquiring human-robot interaction data is significantly more difficult than accessing the internet-scale data used to train large language models (LLMs).
This workshop aims to bring together interdisciplinary experts in fields such as robot learning, human-robot interaction, natural language processing, cognitive science, and trustworthy AI to discuss these grand challenges. We envision that this exchange of ideas within and across disciplines will help build new bridges toward the development and deployment of large-scale models for interactive embodied AI systems.