
Human-Robot Cooperative Transport


Project Motivation

As robots become more capable assistants, they must be able to collaborate using implicit communication and situational awareness. Human-robot cooperative transport exemplifies a scenario where the robot serves as a valuable teammate, yet it is untenable for the human to issue constant explicit commands. Instead, the robot must observe both the human and the environment and predict where the human is trying to go. In a fully centralized system (all robots), a single governing controller would specify how every agent should move to transport the object to the goal location while 'wasting' as little energy as possible compressing or stretching the object during transport. This compressing and stretching can be characterized as interaction forces (forces that do not contribute to motion), and minimizing them is a common metric for efficient transport. The goal, therefore, is for the robot to leverage its knowledge of the human and the environment to transport objects efficiently.
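One common way to formalize interaction forces is to project each agent's applied force onto the axis joining the two grasp points: components that oppose each other squeeze or stretch the object without moving it. The sketch below illustrates this decomposition for a two-agent team; the function name and the specific decomposition are illustrative assumptions, not the project's exact metric.

```python
import numpy as np

def interaction_force(f1, f2, p1, p2):
    """Estimate the internal (interaction) force for two agents carrying a
    rigid object: the component that compresses or stretches the object
    without contributing to net motion.

    f1, f2: planar forces applied at grasp points p1, p2 (numpy arrays).
    This is one common formalization, used here for illustration: project
    each force onto the axis joining the grasp points; opposing projections
    are internal.
    """
    axis = (p2 - p1) / np.linalg.norm(p2 - p1)  # unit vector from p1 to p2
    a1 = float(f1 @ axis)  # agent 1's force component along the axis
    a2 = float(f2 @ axis)  # agent 2's force component along the axis
    # Opposing signs mean the agents are squeezing or stretching the object.
    if a1 * a2 < 0:
        return min(abs(a1), abs(a2))
    return 0.0

# Example: both agents push inward with 2 N -> 2 N of wasted squeeze;
# pushing in the same direction wastes nothing.
p1, p2 = np.array([0.0, 0.0]), np.array([1.0, 0.0])
squeeze = interaction_force(np.array([2.0, 0.0]), np.array([-2.0, 0.0]), p1, p2)
aligned = interaction_force(np.array([2.0, 0.0]), np.array([2.0, 0.0]), p1, p2)
```

A centralized controller would plan motions that keep this quantity near zero; the decentralized robot must do the same while inferring the human's intent.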

Research Objectives

Research objectives include:

  • Develop a robotic platform capable of modeling the human's intended motion from measured quantities
  • Use observations of the human and the surrounding environment to predict the desired motion
  • Understand how to grasp and re-grasp objects with the human leader for successful placement

Current Students

Related Publications


Active User Studies

Modeling multimodal human behavior accurately has been a key barrier to increasing the level of interaction between human and robot, particularly for collaborative tasks. Our key insight is that, for methods that rely on predicting human behavior, predictive accuracy on physical tasks is bottlenecked by the quality of the behavior model. We present a method for training denoising diffusion probabilistic models on a dataset of collaborative human-human demonstrations and conditioning on past human partner actions to plan sequences of robot actions that synergize well with humans at test time. We demonstrate that the method outperforms other state-of-the-art learning methods on human-robot table carrying, a continuous state-action task, in both simulated and real settings with a human in the loop. Moreover, we qualitatively highlight compelling robot behaviors that arise during evaluation and demonstrate evidence of true human-robot collaboration, including mutual adaptation, shared task understanding, leadership switching, learned partner behaviors, and low levels of wasteful interaction forces arising from dissent.
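The planning step described above can be sketched as standard DDPM ancestral sampling: starting from Gaussian noise, a learned noise-prediction network (conditioned on the human partner's recent actions) iteratively denoises a sequence of robot actions. The noise schedule, network, and dimensions below are toy placeholders, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule; the paper's actual schedule is not specified here.
T = 50
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t, human_history):
    """Placeholder for a trained noise-prediction network. A real model
    would condition on `human_history` (past partner actions) through a
    learned encoder; here we fold it in linearly so the sketch runs."""
    return 0.1 * x_t + 0.01 * human_history.mean()

def sample_robot_actions(human_history, horizon=8, act_dim=2):
    """DDPM ancestral sampling: start from Gaussian noise and iteratively
    denoise it into a sequence of `horizon` robot actions."""
    x = rng.standard_normal((horizon, act_dim))
    for t in reversed(range(T)):
        eps = eps_model(x, t, human_history)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add sampling noise on all but the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

plan = sample_robot_actions(human_history=np.zeros((4, 2)))
```

At test time the conditioning history is refreshed as the human acts, so the sampled plans track the partner's evolving strategy.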

Cooperative table carrying is a complex task due to the continuous nature of the state and action spaces, the multimodality of strategies, and the need for instantaneous adaptation to other agents. In this work, we present a method for predicting realistic motion plans for cooperative human-robot teams on this task. Using a Variational Recurrent Neural Network (VRNN) to model the variation in the trajectory of a human-robot team across time, we capture the distribution over the team's future states while leveraging information from the interaction history. The key to our approach is using human demonstration data to generate trajectories that synergize well with humans at test time in a receding-horizon fashion. Comparing the VRNN planner against a baseline sampling-based planner, RRT (Rapidly-exploring Random Trees), in centralized planning shows that the VRNN generates motion more similar to the distribution of human-human demonstrations than the RRT. Results from a human-in-the-loop user study show that the VRNN planner outperforms decentralized RRT on task-related metrics and is significantly more likely to be perceived as human than the RRT planner. Finally, we demonstrate the VRNN planner on a real robot paired with a human teleoperating another robot.
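The receding-horizon use of a learned trajectory model can be sketched as follows: sample candidate rollouts from the generative model, score them, execute only the first step of the best rollout, then replan. The Gaussian-step sampler and goal-distance cost below are stand-ins for the trained VRNN decoder and the paper's actual cost, included only so the loop runs end to end.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trajectories(state, horizon, n_samples):
    """Stand-in for a learned generative model (e.g. a VRNN decoder):
    sample candidate future state sequences for the team. Here each
    candidate is a random walk from the current state."""
    steps = rng.normal(0.0, 0.1, size=(n_samples, horizon, state.shape[0]))
    return state + np.cumsum(steps, axis=1)

def receding_horizon_step(state, goal, horizon=10, n_samples=32):
    """One planning cycle: sample rollouts, score each by its final
    distance to the goal, and execute only the first step of the best."""
    rollouts = sample_trajectories(state, horizon, n_samples)
    costs = np.linalg.norm(rollouts[:, -1] - goal, axis=1)
    best = rollouts[np.argmin(costs)]
    return best[0]  # replan from this state at the next cycle

state, goal = np.zeros(2), np.array([1.0, 0.0])
for _ in range(20):
    state = receding_horizon_step(state, goal)
```

Replanning every step is what lets the planner fold in the latest observations of the human partner rather than committing to a full open-loop trajectory.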

Preliminary Work: The intended pose of the human leader is estimated implicitly by observing the human's head direction, the wrench applied through the rigid carried object, the human's steps, the twist of the carried object, and the surrounding obstacles. Each of these quantities relates intuitively to the expected pose through integration (e.g., twist integrates once to pose; wrench integrated twice predicts pose), with the surrounding obstacles acting as constraints. However, combining all of these terms is non-trivial and is the main contribution of this work.
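The integrator relationships above can be made concrete with simple Euler integration: a twist (body-frame velocity) integrates once to a pose, while a wrench acts through F = ma and so must be integrated twice to predict position. The sketch below is a minimal illustration with assumed planar dynamics and function names, not the project's estimator.

```python
import numpy as np

def predict_pose_from_twist(pose, twist, dt, steps):
    """Integrate a planar twist (vx, vy, wz) once to extrapolate the pose
    (x, y, theta). Body-frame velocities are rotated into the world frame
    at each Euler step."""
    x, y, th = pose
    vx, vy, wz = twist
    for _ in range(steps):
        x += (vx * np.cos(th) - vy * np.sin(th)) * dt
        y += (vx * np.sin(th) + vy * np.cos(th)) * dt
        th += wz * dt
    return np.array([x, y, th])

def predict_position_from_wrench(pos, vel, force, mass, dt, steps):
    """Integrate a sensed force twice (F = m*a -> velocity -> position)
    to extrapolate where the object is being pushed."""
    pos, vel = np.array(pos, float), np.array(vel, float)
    for _ in range(steps):
        vel += (force / mass) * dt  # first integration: acceleration -> velocity
        pos += vel * dt             # second integration: velocity -> position
    return pos

# A pure forward twist of 1 m/s for 1 s moves the object 1 m forward.
pose = predict_pose_from_twist((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), dt=0.1, steps=10)
# A constant 1 N push on a 1 kg object from rest drifts it forward.
pos = predict_position_from_wrench((0.0, 0.0), (0.0, 0.0),
                                   np.array([1.0, 0.0]), mass=1.0,
                                   dt=0.1, steps=10)
```

Fusing these independent pose predictors, while respecting obstacle constraints, is the non-trivial combination step the paragraph above refers to.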