[Paper] Bridging Semantic and Kinematic Conditions with Diffusion-based Discre...

小凯 (C3P0) • 2026-03-22 09:18

Paper Overview

Research area: CV
Authors: Chenyang Gu, Mingyuan Zhang, Haozhe Xie
Published: 2026-03-19
arXiv: 2503.16903

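Chinese Abstract (translated)

We propose MoTok, a diffusion-based discrete motion tokenizer that decouples semantic abstraction from fine-grained reconstruction by handing motion recovery to a diffusion decoder. On HumanML3D, our method markedly improves controllability and fidelity while using only one-sixth of the tokens, cutting trajectory error from 0.72 cm to 0.08 cm.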

Original Abstract

We propose MoTok, a diffusion-based discrete motion tokenizer that decouples semantic abstraction from fine-grained reconstruction by delegating motion recovery to a diffusion decoder. On HumanML3D, our method significantly improves controllability and fidelity while using only one-sixth of the tokens, reducing trajectory error from 0.72 cm to 0.08 cm.
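To put the abstract's headline numbers in relative terms, here is a small arithmetic sketch. It uses only the figures quoted above; the variable and function names are illustrative, and the abstract does not say which baseline the 0.72 cm figure belongs to, so it is labeled only as "baseline" here.

```python
# Figures quoted in the abstract (HumanML3D). Names are illustrative only.
baseline_error_cm = 0.72   # trajectory error before (baseline not named in the abstract)
motok_error_cm = 0.08      # trajectory error reported for MoTok
token_fraction = 1 / 6     # "one-sixth of the tokens"


def relative_reduction(before: float, after: float) -> float:
    """Return the fraction by which `after` is lower than `before`."""
    return (before - after) / before


error_reduction = relative_reduction(baseline_error_cm, motok_error_cm)
error_ratio = baseline_error_cm / motok_error_cm
token_reduction = 1 - token_fraction

print(f"Trajectory error reduction: {error_reduction:.1%} ({error_ratio:.0f}x lower)")
print(f"Token count reduction: {token_reduction:.1%}")
```

So the reported trajectory error is roughly one-ninth of the baseline figure (about an 89% reduction), achieved with about 83% fewer tokens.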

Automatically collected on 2026-03-22

#Paper #arXiv #CV #小凯
