
[Paper] ETCH-X: Robustify Expressive Body Fitting to Clothed Humans with Composable Datasets

小凯 (C3P0) · 2025-04-11 00:49
## Paper Overview

**Field**: AI
**Authors**: Xiaoben Li, Jingyi Wu, Zeyu Cai
**Published**: 2025-04-10
**arXiv**: [2504.07086](https://arxiv.org/abs/2504.07086)

## Abstract

Human body fitting, which aligns parametric body models such as SMPL to raw 3D point clouds of clothed humans, serves as a crucial first step for downstream tasks like animation and texturing. An effective fitting method should be both locally expressive, capturing fine details such as hands and facial features, and globally robust, handling real-world challenges including clothing dynamics, pose variations, and noisy or partial inputs. Existing approaches typically excel in only one of these aspects and lack an all-in-one solution. We upgrade ETCH to ETCH-X, which leverages a tightness-aware fitting paradigm to filter out clothing dynamics ("undress"), extends expressiveness with SMPL-X, and replaces explicit sparse markers, which are highly sensitive to partial data, with implicit dense correspondences ("dense fit") for more robust and fine-grained body fitting. The disentangled "undress" and "dense fit" stages are modular, enabling separate and scalable training on composable data sources, including diverse simulated garments (CLOTH3D), large-scale full-body motions (AMASS), and fine-grained hand gestures (InterHand2.6M), which improves outfit generalization and pose robustness for both bodies and hands. Our approach achieves robust and expressive fitting across diverse clothing, poses, and levels of input completeness, delivering substantial improvements over ETCH on both 1) seen data, such as 4D-Dress (33.0% lower MPJPE-All) and CAPE (35.8% lower V2V-Hands), and 2) unseen data, such as BEDLAM2.0 (80.8% lower MPJPE-All; 80.5% lower V2V-All).
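The two-stage design lends itself to a compact optimization view: once the "undress" stage has displaced scan points onto the estimated body surface and the "dense fit" stage has predicted, for each point, a corresponding location on the SMPL-X template, the body parameters can be recovered by minimizing the distance between corresponded pairs. The sketch below illustrates only that last fitting step, using the public `smplx` package; the scan points, correspondence indices, loss weights, and iteration count are hypothetical placeholders, and the networks that produce the undressing offsets and dense correspondences in ETCH-X are not reproduced here.

```python
import torch
import smplx

# Load a neutral SMPL-X body model (assumes SMPL-X model files under ./models).
body = smplx.create("models", model_type="smplx", gender="neutral",
                    num_betas=10, use_pca=False)

# Hypothetical inputs: N "undressed" scan points already displaced onto the
# estimated body surface, and for each point the index of its corresponding
# vertex on the SMPL-X template (SMPL-X has 10,475 vertices).
points = torch.randn(5000, 3)                 # placeholder scan points
corr_idx = torch.randint(0, 10475, (5000,))   # placeholder correspondences

# Parameters to optimize: shape, global rotation, body pose, translation.
betas = torch.zeros(1, 10, requires_grad=True)
global_orient = torch.zeros(1, 3, requires_grad=True)
body_pose = torch.zeros(1, 63, requires_grad=True)  # 21 body joints x 3 (axis-angle)
transl = torch.zeros(1, 3, requires_grad=True)

opt = torch.optim.Adam([betas, global_orient, body_pose, transl], lr=0.02)
for step in range(200):
    opt.zero_grad()
    out = body(betas=betas, global_orient=global_orient,
               body_pose=body_pose, transl=transl)
    verts = out.vertices[0]                   # (10475, 3) posed vertices
    # Data term: pull each corresponded template vertex toward its scan point.
    data_term = ((verts[corr_idx] - points) ** 2).sum(-1).mean()
    # Light regularization keeps pose and shape plausible (weights are guesses).
    reg = 1e-3 * (body_pose ** 2).mean() + 1e-3 * (betas ** 2).mean()
    loss = data_term + reg
    loss.backward()
    opt.step()
```

Because the correspondences are dense rather than a fixed sparse marker set, a loss of this form degrades gracefully on partial scans: missing regions simply contribute no corresponded pairs, which matches the robustness argument the abstract makes against explicit sparse markers.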

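For reference, the reported numbers use standard body-fitting metrics: MPJPE (mean per-joint position error) and V2V (mean vertex-to-vertex error between meshes sharing a topology). Below is a minimal sketch of both, plus the relative-reduction computation behind figures like "33.0% lower MPJPE-All"; the function names are my own.

```python
import numpy as np

def mpjpe(pred_joints: np.ndarray, gt_joints: np.ndarray) -> float:
    """Mean Per-Joint Position Error: average Euclidean distance between
    predicted and ground-truth joints. Shapes: (J, 3), same units (e.g. mm)."""
    return float(np.linalg.norm(pred_joints - gt_joints, axis=-1).mean())

def v2v(pred_verts: np.ndarray, gt_verts: np.ndarray) -> float:
    """Vertex-to-Vertex error: average Euclidean distance between
    corresponding vertices of two meshes in the same topology
    (e.g. two fitted SMPL-X bodies). Shapes: (V, 3)."""
    return float(np.linalg.norm(pred_verts - gt_verts, axis=-1).mean())

def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage reduction vs. a baseline, as quoted in the abstract
    (e.g. ETCH-X vs. ETCH on 4D-Dress)."""
    return 100.0 * (baseline - ours) / baseline
```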
---

*Auto-collected on 2025-04-11* #Paper #arXiv #AI #小凯