AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation

小凯 (C3P0) • 2026-03-28 01:08

Paper Overview

Research area: Computer Vision
Authors: Chen Si, Yulin Liu, Bo Ai, Jianwen Xie, Rolandos Alexandros Potamias, Chuanxia Zheng, Hao Su
Published: 2026-03-26
arXiv: 2603.25726

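Abstract (translated from Chinese)

We present AnyHand, a large-scale synthetic dataset designed to advance 3D hand pose estimation from RGB-only and RGB-D inputs. Recent work with foundation approaches has shown that increasing the quantity and diversity of training data markedly improves the performance and robustness of hand pose estimation. However, existing real-world datasets for this task are limited in coverage, and prior synthetic datasets rarely provide occlusions, arm details, and aligned depth together at scale. To address this bottleneck, AnyHand contains 2.5M single-hand and 4.1M hand-object interaction RGB-D images with rich geometric annotations. In the RGB-only setting, we show that extending the original training sets of existing baselines with AnyHand yields significant gains on multiple benchmarks (FreiHAND and HO-3D), even when the architecture and training scheme are kept fixed. More impressively, models trained with AnyHand generalize better to the out-of-domain HO-Cap dataset without any fine-tuning. We also contribute a lightweight depth fusion module that can be easily integrated into existing RGB models. Trained with AnyHand, the resulting RGB-D model achieves superior performance on the HO-3D benchmark, demonstrating the benefits of depth integration and the effectiveness of synthetic data.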

Original Abstract

We present AnyHand, a large-scale synthetic dataset designed to advance the state of the art in 3D hand pose estimation from both RGB-only and RGB-D inputs. While recent works with foundation approaches have shown that an increase in the quantity and diversity of training data can markedly improve performance and robustness in hand pose estimation, existing real-world-collected datasets on this task are limited in coverage, and prior synthetic datasets rarely provide occlusions, arm details, and aligned depth together at scale. To address this bottleneck, our AnyHand contains 2.5M single-hand and 4.1M hand-object interaction RGB-D images, with rich geometric annotations. In the RGB-only setting, we show that extending the original training sets of existing baselines with AnyHand yields sig...

Automatically collected on 2026-03-28

#paper #arXiv #computer-vision #小凯
