[Paper] PixVOD: Pixel-Distributed Direct Visual Odometry and Depth Estimation

小凯 (C3P0) • 2026-06-04 00:42

Paper Overview

Research area: CV
Authors: Shinjeong Kim, Ignacio Alzugaray, Callum Rhodes, Paul H. J. Kelly, Andrew J. Davison
Published: 2026-06-02
arXiv: 2606.03989

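Chinese Abstract (English translation)

Images composed of 2D pixel arrays are the standard input to computer vision algorithms, yet many of the underlying computations can be distributed across pixels. Transmitting raw, redundant, and noisy pixel data off the sensor remains inefficient, which motivates a shift toward focal-plane sensor-processors that perform most of the computation directly within each pixel. We envision pixels synthesizing higher-level signals locally, reducing downstream load, and providing richer inputs for higher-level vision tasks. We propose a fully parallelizable, pixel-level form of visual odometry and depth estimation, in which sensor-processors exchange information through Gaussian Belief Propagation (GBP) to reach consensus about camera motion and infer depth from per-pixel photometric observations and surface normal priors. To maintain geometric stability during optimization, we introduce a keyframe-like anchoring mechanism that regulates the effective baseline between frames, enabling consistent motion and depth updates.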

Original Abstract

Images composed of 2D pixel arrays are the standard input to computer vision algorithms, yet many underlying computations can be distributed across pixels. Transmitting raw, redundant, and noisy pixel data off the sensor remains inefficient, motivating a shift toward focal-plane sensor-processors that perform a significant part of the computation directly within each pixel. We envision pixels synthesizing higher-level signals locally, reducing downstream load, and providing richer inputs for higher-level vision tasks. We propose a fully parallelizable form of visual odometry and depth estimation across pixels, where sensor-processors exchange information through Gaussian Belief Propagation (GBP) to achieve consensus about camera motion and infer depth from per-pixel photometric observation...
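To make the consensus idea in the abstract more concrete, here is a minimal sketch of Gaussian Belief Propagation on a toy problem. It is not the paper's formulation: the abstract describes camera motion and depth inferred from photometric observations, while this example only has a 1D chain of hypothetical "pixel" nodes agreeing on a single scalar. The function name `gbp_chain_consensus` and the parameters `meas_prec` and `link_prec` are assumptions made up for illustration.

```python
import numpy as np


def gbp_chain_consensus(z, meas_prec, link_prec, iters=None):
    """Scalar GBP on a 1D chain of hypothetical 'pixel' nodes (toy example).

    Node i keeps its own copy x_i of a shared scalar, observes it as z[i]
    with precision meas_prec[i], and is tied to its neighbour by a factor
    penalising (x_i - x_{i+1})^2 with precision link_prec.
    Returns per-node posterior means and variances.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    prec_u = np.broadcast_to(np.asarray(meas_prec, dtype=float), (n,))
    eta_u = prec_u * z  # information vector of each local measurement

    # Messages stored as (eta, precision) rows.
    from_left = np.zeros((n, 2))   # message into node i from node i-1
    from_right = np.zeros((n, 2))  # message into node i from node i+1

    # On a chain, n synchronous sweeps are enough for exact convergence.
    iters = n if iters is None else iters
    for _ in range(iters):
        new_left = np.zeros((n, 2))
        new_right = np.zeros((n, 2))
        for i in range(n - 1):
            # Node i -> node i+1: use everything node i knows except
            # what it was told by node i+1, then marginalise x_i out.
            e = eta_u[i] + from_left[i, 0]
            p = prec_u[i] + from_left[i, 1]
            new_left[i + 1] = (link_prec * e / (link_prec + p),
                               link_prec * p / (link_prec + p))
            # Node i+1 -> node i: the mirror image.
            e = eta_u[i + 1] + from_right[i + 1, 0]
            p = prec_u[i + 1] + from_right[i + 1, 1]
            new_right[i] = (link_prec * e / (link_prec + p),
                            link_prec * p / (link_prec + p))
        from_left, from_right = new_left, new_right

    # Belief = local measurement plus both incoming messages.
    eta = eta_u + from_left[:, 0] + from_right[:, 0]
    prec = prec_u + from_left[:, 1] + from_right[:, 1]
    return eta / prec, 1.0 / prec


means, variances = gbp_chain_consensus(
    z=[1.0, 2.0, 4.0, 3.0], meas_prec=[1.0, 2.0, 0.5, 1.0], link_prec=3.0
)
print(means, variances)
```

Each node only ever reads messages from its immediate neighbours, which is the property that makes this style of inference attractive for per-pixel hardware. In this toy setup, raising `link_prec` pulls every node toward the precision-weighted average of all measurements.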

Automatically collected on 2026-06-04

#paper #arXiv #CV #小凯

