[Paper] Policy-based Foveated Imaging and Perception

Paper Overview

Research area: CV Authors: Howard Xiao, Jan Ackermann, Boyang Deng Published: 2026-06-03 arXiv: 2506.00009

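Abstract (translated from Chinese)

Ultra-high-resolution image sensors make it possible to capture the fine spatial detail that many visual perception tasks require, but acquiring and processing every pixel at full resolution is often infeasible under practical bandwidth, latency, and power constraints. Existing methods address this challenge with acquisition strategies such as spatial or temporal downsampling, which irreversibly discard information before its task relevance has been assessed. In this work, we introduce a real-time, predictive, and task-aware foveated imaging system that operates directly at image acquisition time. Using emerging dual-stream sensor architectures, our method dynamically allocates limited pixel bandwidth to task-relevant regions of interest while preserving a low-resolution global context. We formulate foveated acquisition as a sensor attention policy learning problem in which past observations guide the actions that determine future measurements, thereby closing the perception-acquisition loop. Through extensive simulation on multiple perception tasks, we show that our method achieves high task performance under strict pixel budgets and significantly outperforms relevant baselines operating at the same bandwidth. We further validate the system on a 200-megapixel dual-stream sensor, capturing real-world video under realistic bandwidth and latency constraints, which demonstrates the practical feasibility of task-driven, acquisition-time foveated imaging.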

Original Abstract

Ultra-high-resolution image sensors offer the potential to capture fine spatial details critical for many visual perception tasks, but acquiring and processing all pixels at full resolution is often infeasible under realistic bandwidth, latency, and power constraints. Existing approaches address this challenge through acquisition strategies such as spatial or temporal downsampling, which irrevocably discard information before task relevance can be assessed. In this work, we introduce a real-time, predictive, and task-aware foveated imaging system that operates directly at image acquisition time. Leveraging emerging dual-stream sensor architectures, our method dynamically allocates limited pixel bandwidth to task-relevant regions of interest while maintaining a low-resolution global context...
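The dual-stream budget idea in the abstract (a low-resolution global context plus a full-resolution region of interest under a fixed pixel budget) can be illustrated with some simple arithmetic. The following is only a minimal sketch under stated assumptions, not the paper's method: the names `Roi`, `context_pixels`, and `plan_capture`, the square ROI shape, the 16384 x 12288 sensor size, and the 2,000,000-pixel budget are all hypothetical, and the learned policy that would choose the ROI centre is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Axis-aligned region in full-resolution pixel coordinates (hypothetical type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def context_pixels(width: int, height: int, factor: int) -> int:
    """Pixel count of the global stream after downsampling each axis by `factor`."""
    return ceil_div(width, factor) * ceil_div(height, factor)


def plan_capture(width: int, height: int, factor: int,
                 budget: int, cx: int, cy: int) -> Roi:
    """Spend what remains of the per-frame pixel budget on one square
    full-resolution ROI centred as close to (cx, cy) as the sensor allows.

    Assumption: (cx, cy) would come from some attention policy; here it is
    simply an input.
    """
    remaining = budget - context_pixels(width, height, factor)
    if remaining <= 0:
        raise ValueError("pixel budget is consumed by the global context stream")
    # Largest square that fits both the remaining budget and the sensor.
    side = min(math.isqrt(remaining), width, height)
    # Clamp the window so it stays entirely inside the sensor area.
    x = min(max(cx - side // 2, 0), width - side)
    y = min(max(cy - side // 2, 0), height - side)
    return Roi(x, y, side, side)


# Example: an assumed ~200 MP sensor, 16x downsampled context, 2 MP budget.
roi = plan_capture(16384, 12288, factor=16, budget=2_000_000, cx=100, cy=6000)
print(roi)
```

In this toy setting the context stream uses 1024 x 768 pixels and the rest of the budget goes to the ROI, so the total stays below the budget while the ROI keeps native resolution. A real system would also have to account for sensor readout constraints and latency, which this sketch ignores.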

--- *Automatically collected on 2026-06-03*

#Paper #arXiv #CV #小凯
