GuidedVLA

Published in Robotics: Science and Systems (RSS) 2026

GuidedVLA treats the action decoder as a set of functional components rather than a monolithic learner. The method assigns dedicated attention heads to three task-relevant factors: object grounding, temporal skill logic, and spatial geometry.

Project Paper arXiv Talk 机器之心 Code Checkpoint Dataset


Bowen Yang
