## Paper Summary
**Research area**: CV
**Authors**: Yutian Chen, Shi Guo, Renbiao Jin, Tianshuo Yang, Xin Cai, Yawen Luo, Mingxin Yang, Mulin Yu, Linning Xu, Tianfan Xue
**Published**: 2026-04-21
**arXiv**: [2604.19747](https://arxiv.org/abs/2604.19747)
## Abstract (Translated from Chinese)
Sparse-view 3D reconstruction is essential for modeling scenes from casual captures, yet it remains challenging for non-generative reconstruction. Existing diffusion-based methods mitigate the problem by synthesizing novel views, but they typically condition on only one or two captured frames, which restricts geometric consistency and limits scalability to large or diverse scenes. We propose AnyRecon, a scalable framework for reconstruction from arbitrary, unordered sparse inputs that preserves explicit geometric control while supporting flexible conditioning cardinality. To enable long-range conditioning, our method builds a persistent global scene memory via a prepended capture-view cache and removes temporal compression to maintain frame-level correspondence under large viewpoint changes. Beyond a stronger generative model, we find that the interplay between generation and reconstruction is critical for large-scale 3D scenes. We therefore introduce a geometry-aware conditioning strategy that couples generation with reconstruction through an explicit 3D geometric memory and geometry-driven capture-view retrieval. For efficiency, we combine 4-step diffusion distillation with context-window sparse attention to reduce the quadratic complexity. Extensive experiments show that the method achieves robust, scalable reconstruction on unconventional inputs, large viewpoint gaps, and long trajectories.
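The abstract mentions context-window sparse attention as the way to cut the quadratic cost of full attention over many frames. Below is a minimal, illustrative sketch of a local-window attention mask in NumPy; the function name, window parameter, and token layout are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

def local_window_mask(num_tokens: int, window: int) -> np.ndarray:
    """Boolean mask where token i may attend only to tokens j with
    |i - j| <= window. The number of allowed pairs grows linearly
    with num_tokens instead of quadratically (toy illustration)."""
    idx = np.arange(num_tokens)
    return np.abs(idx[:, None] - idx[None, :]) <= window

mask = local_window_mask(6, 1)
# Each token attends to itself and its immediate neighbors:
print(mask.sum(axis=1))  # → [2 3 3 3 3 2]
```

In practice such a mask would be applied inside the attention softmax (setting disallowed logits to negative infinity), but the sparsity pattern itself is the point here.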
## Original Abstract
Sparse-view 3D reconstruction is essential for modeling scenes from casual captures, but remains challenging for non-generative reconstruction. Existing diffusion-based approaches mitigate this issue by synthesizing novel views, but they often condition on only one or two capture frames, which restricts geometric consistency and limits scalability to large or diverse scenes. We propose AnyRecon, a scalable framework for reconstruction from arbitrary and unordered sparse inputs that preserves explicit geometric control while supporting flexible conditioning cardinality. To support long-range conditioning, our method constructs a persistent global scene memory via a prepended capture view cache, and removes temporal compression to maintain frame-level correspondence under large viewpoint changes...
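Both abstracts describe geometry-driven capture-view retrieval, i.e. selecting which cached capture views condition the generation of a target view. A plausible minimal sketch is nearest-neighbor selection by camera-center distance; the function and the distance criterion below are assumptions for illustration, not the paper's actual retrieval rule.

```python
import numpy as np

def retrieve_capture_views(capture_centers: np.ndarray,
                           target_center: np.ndarray,
                           k: int) -> np.ndarray:
    """Return indices of the k capture views whose camera centers lie
    closest (Euclidean distance) to the target view's camera center.
    A toy stand-in for geometry-driven retrieval from a view cache."""
    dists = np.linalg.norm(capture_centers - target_center, axis=1)
    return np.argsort(dists)[:k]

# Four cached capture-view camera centers and one target pose.
centers = np.array([[0.0, 0.0, 0.0],
                    [5.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0],
                    [9.0, 9.0, 0.0]])
print(retrieve_capture_views(centers, np.array([0.5, 0.0, 0.0]), 2))  # → [0 2]
```

A real system would likely also account for viewing direction and scene overlap, but distance-based retrieval conveys the idea of conditioning on geometrically relevant views rather than a fixed one or two frames.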
---
*Automatically collected on 2026-04-23*
#paper #arXiv #CV #小凯