
LaMI: Large Language Models for Multi-Modal Human-Robot Interaction

Chao Wang, Stephan Hasler, Daniel Tanneberg, Felix Ocker, Frank Joublin, Antonello Ceravola, Jörg Deigmöller, Michael Gienger, "LaMI: Large Language Models for Multi-Modal Human-Robot Interaction", CHI 2024.

Abstract

In current approaches to designing human-robot interaction, robotics engineers establish rules based on the application scenario and the user's multimodal input in order to define how the robot should react in a specific situation and to generate an output accordingly. This is a challenging task, as manually specifying the robot's interactive behavior for each situation is complex and requires considerable effort from a specialized engineer. Large language models (LLMs), on the other hand, are capable of social interaction with users. In this study, we propose a framework that translates a human's multimodal input and output into text, so that social interaction can be learned by observing human-human interaction; this prior knowledge is later used to drive multimodal output for robot-human interaction.
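The abstract describes a loop in which multimodal signals are serialized into text for an LLM and the LLM's textual reply is mapped back onto the robot's verbal and non-verbal channels. Below is a minimal sketch of such a loop; the observation fields, the SAY/ACT output convention, and all identifiers are illustrative assumptions, not taken from the paper, and the LLM call is replaced by a canned reply.

```python
from dataclasses import dataclass

# Hypothetical multimodal observation; the field names are
# illustrative and not taken from the LaMI paper.
@dataclass
class Observation:
    speech: str       # transcribed user utterance
    gaze_target: str  # e.g. "robot", "table", "door"
    gesture: str      # e.g. "pointing_left", "none"

def observation_to_text(obs: Observation) -> str:
    """Serialize the multimodal input into a textual scene description."""
    return (f'User says: "{obs.speech}". '
            f"User is looking at the {obs.gaze_target} "
            f"and gesturing: {obs.gesture}.")

def parse_llm_output(reply: str) -> dict:
    """Split the LLM reply into verbal and non-verbal channels.
    Assumes the prompt asked for lines like 'SAY: ...' and 'ACT: ...'."""
    actions = {"say": "", "act": "idle"}
    for line in reply.splitlines():
        if line.startswith("SAY:"):
            actions["say"] = line[len("SAY:"):].strip()
        elif line.startswith("ACT:"):
            actions["act"] = line[len("ACT:"):].strip()
    return actions

# Example round trip with a canned LLM reply (no API call is made).
obs = Observation(speech="Can you hand me the cup?",
                  gaze_target="table", gesture="pointing_left")
print(observation_to_text(obs))
fake_reply = "SAY: Sure, here you go.\nACT: pick_and_hand_over(cup)"
print(parse_llm_output(fake_reply))
```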


