Paper Overview
Field: CV Authors: Ninghui Xu, Fabio Tosi, Lihui Wang Published: 2025-04-17 arXiv: 2504.13101
Abstract (translated from the Chinese summary)
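Conventional frame-based cameras capture rich contextual information, but in dynamic scenes they are limited by temporal resolution and motion blur. Event cameras offer an alternative visual representation with higher dynamic range that is free from these limitations. The complementary characteristics of the two modalities make event-frame asymmetric stereo promising for reliable 3D perception under fast motion and challenging illumination. However, the modality gap often causes domain-specific cues that are essential for cross-modal stereo matching to be marginalized. This paper introduces Bi-CMPStereo, a novel bidirectional cross-modal prompting framework that fully exploits semantic and structural features from both domains for robust matching. The approach learns finely aligned stereo representations within a target canonical space and integrates complementary representations by projecting each modality into both the event and frame domains. Extensive experiments show that the approach significantly outperforms existing state-of-the-art methods in accuracy and generalization.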
Original Abstract
Conventional frame-based cameras capture rich contextual information but suffer from limited temporal resolution and motion blur in dynamic scenes. Event cameras offer an alternative visual representation with higher dynamic range free from such limitations. The complementary characteristics of the two modalities make event-frame asymmetric stereo promising for reliable 3D perception under fast motion and challenging illumination. However, the modality gap often leads to marginalization of domain-specific cues essential for cross-modal stereo matching. In this paper, we introduce Bi-CMPStereo, a novel bidirectional cross-modal prompting framework that fully exploits semantic and structural features from both domains for robust matching. Our approach learns finely aligned stereo representat...
--- *Automatically collected on 2026-04-18*
#paper #arXiv #CV #小凯