[Paper] VLK: Learning Humanoid Loco-Manipulation from Synthetic Interactions i...

Paper Overview

Research area: Robotics | Authors: Yen-Jen Wang, Jiaman Li, Sirui Chen | Published: 2026-07-01 | arXiv: 2507.00001

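Abstract (translated from Chinese)

Perception-based humanoid loco-manipulation requires mapping egocentric observations and task instructions to whole-body motion. Learning this mapping requires synchronized egocentric images, language instructions, and robot-compatible kinematic trajectories, but no existing data source provides this complete triplet at scale. This paper addresses the bottleneck by synthesizing vision-language-kinematics (VLK) supervision in reconstructed scenes. The pipeline uses 3D Gaussian Splatting to reconstruct metric-scale indoor environments, synthesizes navigation and object-interaction trajectories from privileged scene information, and renders the paired egocentric observations after the fact. The authors generate 48,000 paired trajectories without human intervention and train a VLK policy that predicts short-horizon whole-body kinematic trajectories. A whole-body tracker converts these predictions into actions on a physical humanoid robot. Evaluation on a physical Unitree G1, covering navigation and single-object carrying tasks, shows that synthetic interactions in reconstructed scenes provide effective supervision for sim-to-real, perception-based humanoid loco-manipulation.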
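To make the data requirement concrete, here is a minimal sketch of how one synchronized vision-language-kinematics tuple might be represented, and how a short-horizon prediction target could be sliced from it. This is not the paper's code: the class `VLKSample`, its field names, the use of frame identifiers in place of image arrays, and the helper `short_horizon_target` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VLKSample:
    """One synchronized vision-language-kinematics tuple (hypothetical layout)."""

    frames: List[str]              # one egocentric frame identifier per timestep (assumed)
    instruction: str               # language instruction for the whole trajectory
    kinematics: List[List[float]]  # one whole-body pose vector per timestep (assumed)

    def __post_init__(self) -> None:
        # The abstract asks for synchronized data, so both streams
        # must cover the same number of timesteps.
        if len(self.frames) != len(self.kinematics):
            raise ValueError("frames and kinematics must have the same length")


def short_horizon_target(sample: VLKSample, t: int, horizon: int) -> List[List[float]]:
    """Return up to `horizon` pose vectors following timestep `t`.

    The slice is clipped at the end of the trajectory, so it may be
    shorter than `horizon` near the final timestep.
    """
    return sample.kinematics[t + 1 : t + 1 + horizon]
```

Under these assumptions, a policy would receive `sample.frames[t]` and `sample.instruction` as input and be trained to predict `short_horizon_target(sample, t, horizon)`; the horizon length itself is not given in the abstract.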

Original Abstract

Perception-based humanoid loco-manipulation requires connecting egocentric observations and task instructions to whole-body motion. Learning this mapping requires synchronized egocentric images, language commands, and robot-compatible kinematic trajectories, yet no existing data source provides this complete tuple at scale. We address this bottleneck by generating vision-language-kinematics (VLK) supervision synthetically in reconstructed scenes. Our pipeline leverages 3D Gaussian Splatting to reconstruct metric-scale indoor environments, synthesizes navigation and object-interaction trajectories using privileged scene information, and renders paired egocentric observations after the fact. We produce 48,000 paired trajectories with no human intervention and train a VLK policy that predicts...

--- *Automatically collected on 2026-07-01*

#Paper #arXiv #Robotics #小凯
