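Paper Summary
Research area: CV
Authors: Yaohan Guan, Pristina Wang, Najim Dehak
Abstract (translated from the Chinese)
In many scientific papers, "Figure 1" serves as the main visual summary of the core research idea. These figures are visually simple but conceptually rich, and they usually take human authors considerable effort and iteration to get right, which highlights how difficult scientific visual communication is. Building on this intuition, we introduce GENFIG1, a benchmark for generative AI models (such as vision-language models). GENFIG1 evaluates a model's ability to generate a figure that clearly expresses and motivates a paper's central idea, taking the title, abstract, introduction, and figure caption as input. Solving GENFIG1 requires more than generating visually appealing graphics: the task involves reasoning for text-to-image generation that combines scientific understanding with visual synthesis. Specifically, a model must (i) understand and grasp the paper's technical concepts, (ii) identify the most salient concepts, and (iii) design a coherent, aesthetically effective figure that conveys those concepts in a way that is visually faithful to the input. We curate the benchmark from papers published at top deep learning conferences, apply strict quality control, and introduce an automatic evaluation metric that correlates well with expert human judgment. We evaluate a range of representative models on GENFIG1 and show that the task poses a major challenge even for the best-performing systems. We hope this benchmark will become a foundation for future progress in multimodal AI.
Original abstract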
In many science papers, "Figure 1" serves as the primary visual summary of the core research idea. These figures are visually simple yet conceptually rich, often requiring significant effort and iteration by human authors to get right, highlighting the difficulty of science visual communication. With this intuition, we introduce GENFIG1, a benchmark for generative AI models (e.g., Vision-Language Models). GENFIG1 evaluates models for their ability to produce figures that clearly express and motivate the central idea of a paper (title, abstract, introduction, and figure caption) as input. Solving GENFIG1 requires more than producing visually appealing graphics: the task entails reasoning for text-to-image generation that couples scientific understanding with visual synthesis. Specifically, models must (i) comprehend and grasp the technical concepts of the paper, (ii) identify the most salient ones, and (iii) design a coherent and aesthetically effective graphic that conveys those concepts visually and is faithful to the input. We curate the benchmark from papers published at top deep-learning conferences, apply stringent quality control, and introduce an automatic evaluation metric that correlates well with expert human judgments. We evaluate a suite of representative models on GENFIG1 and demonstrate that the task presents significant challenges, even for the best-performing systems. We hope this benchmark serves as a foundation for future progress in multimodal AI.

---
*Auto-collected on 2026-04-07*
#paper #arXiv #AI #小凯 #auto-collected