Visual Robotic Bridge

Paper Title	Affordances from Human Videos as a Versatile Representation for Robotics
Authors	Bahl et al.
Date	2023-06
Link	https://robo-affordances.github.io/resources/vrb_paper.pdf

Paper Review

Short Summary

The paper introduces a representation learning method, together with methods for transferring it, that bridges the gap between deep-learning-based visual models and robotic tasks. The proposed representation consists of the point of contact and the post-contact trajectory, both learned from egocentric videos of humans. The authors also describe how to use the learned representation to bootstrap four different tasks, all centered on focusing the robot’s attention on a narrower set of contact points and action spaces. Finally, the experiments show that the learned representation outperforms current methods on most of the benchmark datasets.
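To make the representation concrete for myself, here is a minimal sketch of what "a point of contact plus a post-contact trajectory" could look like as data. Everything in it is my own illustration, not the authors' code: the `Affordance` class, the `sample_contact_point` and `rollout` helpers, and all the numbers are hypothetical. In particular, the straight-line rollout is only a placeholder for the learned trajectory network; the five waypoints simply match the trajectory length the paper mentions.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # pixel coordinates (x, y)


@dataclass
class Affordance:
    """Hypothetical container for the two quantities described in the summary."""
    contact_point: Point                   # where the hand first touches the object
    post_contact_trajectory: List[Point]   # waypoints of the motion after contact


def sample_contact_point(means, stds, weights, rng):
    """Draw one contact point from an isotropic 2D Gaussian mixture."""
    k = rng.choices(range(len(means)), weights=weights, k=1)[0]
    mx, my = means[k]
    return (rng.gauss(mx, stds[k]), rng.gauss(my, stds[k]))


def rollout(contact, direction, steps=5, step_size=10.0):
    """Placeholder trajectory: equally spaced waypoints along one direction.

    Assumption: a real model would predict these waypoints from the image.
    """
    dx, dy = direction
    return [
        (contact[0] + i * step_size * dx, contact[1] + i * step_size * dy)
        for i in range(1, steps + 1)
    ]


rng = random.Random(0)
# Made-up mixture over likely contact locations in an image.
contact = sample_contact_point(
    means=[(120.0, 80.0), (300.0, 210.0)],
    stds=[5.0, 8.0],
    weights=[0.7, 0.3],
    rng=rng,
)
aff = Affordance(contact, rollout(contact, direction=(1.0, 0.0)))
print(aff.contact_point, len(aff.post_contact_trajectory))
```

The point of the sketch is the narrowing effect: instead of searching over every pixel and every motion, the robot only needs to consider a few likely contact points and a short trajectory starting from each.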

Strengths

A novel, simple, yet effective method for bridging the gap between the success of visual models and robotic tasks.
Leverages existing off-the-shelf tools to collect labels effectively. Many good design choices were made during this step (e.g. using a GMM to collect the contact points, and a clever way to limit the data disparity between human-centric and robot-centric images).
Comprehensive work, from ideation to deployment and experiments in a real-life robotic setting.
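For readers unfamiliar with the GMM mentioned above, the following is a rough sketch of fitting a Gaussian mixture to a set of 2D contact points with the EM algorithm. It is a generic textbook version under simplifying assumptions (isotropic covariances, hand-picked initial means, synthetic points), not the authors' labeling pipeline. The `fit_gmm` function and the data are hypothetical; in practice one would more likely reach for a library such as scikit-learn's `GaussianMixture`.

```python
import math
import random


def fit_gmm(points, init_means, iters=50):
    """EM for an isotropic 2D Gaussian mixture, starting from `init_means`."""
    k = len(init_means)
    means = [tuple(m) for m in init_means]
    variances = [100.0] * k          # initial per-component variance (assumed)
    weights = [1.0 / k] * k
    n = len(points)
    for _ in range(iters):
        # E-step: responsibility of each component for each point,
        # computed in log space for numerical stability.
        resp = []
        for (x, y) in points:
            logp = []
            for j in range(k):
                d2 = (x - means[j][0]) ** 2 + (y - means[j][1]) ** 2
                logp.append(
                    math.log(weights[j])
                    - math.log(2 * math.pi * variances[j])
                    - d2 / (2 * variances[j])
                )
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mx = sum(r[j] * x for r, (x, y) in zip(resp, points)) / nj
            my = sum(r[j] * y for r, (x, y) in zip(resp, points)) / nj
            sq = sum(
                r[j] * ((x - mx) ** 2 + (y - my) ** 2)
                for r, (x, y) in zip(resp, points)
            )
            means[j] = (mx, my)
            variances[j] = max(sq / (2 * nj), 1e-6)  # two dimensions share one variance
            weights[j] = max(nj / n, 1e-12)
    return weights, means, variances


# Synthetic "contact points": two clusters standing in for two graspable regions.
rng = random.Random(0)
points = [(rng.gauss(100, 5), rng.gauss(50, 5)) for _ in range(100)]
points += [(rng.gauss(200, 5), rng.gauss(150, 5)) for _ in range(100)]

weights, means, variances = fit_gmm(points, init_means=[(90, 60), (210, 140)])
print(weights, means, variances)
```

The fitted means summarize noisy per-frame contact detections as a few likely contact regions, which is the kind of compact label a downstream model can be trained on.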

Weaknesses

The setup and training of the affordance model are not described in enough detail (e.g. the output format of the transformer-based trajectory network is unclear; the authors only mention a “trajectory of length 5”).
The authors do not discuss in detail how their approach relates to other bridging approaches, or why they consider theirs superior.

Reflection

I had never thought about this gap between ML models (vision, NLP) and real-world problems, or realized how little of the recent success in deep learning has transferred to the offline world. This paper reminds me of the gap and of the challenges in bridging it.

Most interesting thought/idea from reading this paper

The gap between ML models and real-world, physical problems may create a lot of jobs for ML practitioners in the next few years.
