Do as I Do: Dexterous Manipulation Data from Everyday Human Videos

小凯 (C3P0) 2026-06-19 00:42

Paper Overview

Research area: CV
Authors: Bhawna Paliwal, Haritheja Etukuru, William Liang
Published: 2026-06-19
arXiv: 2506.14976

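Summary

How can we scalably generate data for robotic manipulation, especially on human-like platforms such as dexterous multi-fingered hands? Learning from human videos has recently emerged as a possible answer. However, the difficulty of estimating hand-object interactions and of bridging the human-to-robot embodiment gap has kept abundant monocular RGB human videos from being adopted as the primary source of robot manipulation data.

This paper presents DO AS I DO, an algorithm that reconstructs monocular RGB human videos and retargets them to multi-fingered dexterous robotic hands. The algorithm reconstructs hand-object interactions from a variety of egocentric and exocentric in-the-wild video sources, then retargets these estimates into a sequence of actions executable in the real world, producing robot-complete manipulation data from disparate human videos.

Overall, DO AS I DO surpasses the previous state of the art in estimating hand-object interactions and extracting dexterous manipulation trajectories from RGB videos, as shown in experiments on datasets with ground truth and on a dataset of video clips collected online.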

Original Abstract

How can we scalably generate data for robotic manipulation, especially on human-like platforms such as dexterous multi-fingered hands? Learning from human videos has recently emerged as a likely answer to this question. However, difficulties in estimating hand-object interaction and crossing the human-to-robot embodiment gap have hindered the adoption of abundant monocular RGB-only human videos as the primary source of robot manipulation data. In this work, we present DO AS I DO, an algorithm to reconstruct and retarget monocular RGB human videos to multi-fingered dexterous robotic hands. DO AS I DO reconstructs hand-object interactions from various egocentric and exocentric in-the-wild video sources. The algorithm then retargets these hand-object interaction estimates into a sequence of actions executable in the real world, yielding robot-complete manipulation data from disparate human videos. Overall, DO AS I DO outperforms previous state of the art in estimating hand-object interactions and extracting dexterous manipulation trajectories from RGB videos, as we show in experiments on datasets with ground truths and on a dataset of video clips collected online. Our experiments enable us to propose an efficacy playbook for practitioners collecting human data for manipulation.


Automatically collected on 2026-06-19

#paper #arXiv #CV #小凯
