BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

Paper Overview

Research area: Computer Vision
Authors: Yan Li, Zezi Zeng, Ziwei Zhou, Xin Gao, Muzhao Tian, Yifan Yang, Mingxi Cheng, Qi Dai, Yuqing Yang, Lili Qiu, Zhendong Wang, Zhengyuan Yang, Xue Yang, Lijuan Wang, Ji Li, Chong Luo
Published: 2026-03-26
arXiv: 2603.25732v1

Abstract (translated from the Chinese)

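Recent advances in image generation models have extended their applications from aesthetic imagery to practical visual content creation. However, existing benchmarks focus mainly on natural image synthesis and do not systematically evaluate how models perform under the structured, multi-constraint requirements of real-world commercial design tasks. In this work, we introduce BizGenEval, a systematic benchmark for commercial visual content generation. The benchmark covers five representative document types (slides, charts, webpages, posters, and scientific figures) and evaluates four key capability dimensions (text rendering, layout control, attribute binding, and knowledge-based reasoning), forming 20 diverse evaluation tasks. BizGenEval contains 400 carefully curated prompts and 8000 human-verified checklist questions to rigorously assess whether generated images satisfy complex visual and semantic constraints. We conduct large-scale benchmarking of 26 popular image generation systems, including state-of-the-art commercial APIs and leading open-source models. The results reveal a substantial capability gap between current generative models and the requirements of professional visual content creation. We hope BizGenEval can serve as a standardized benchmark for real-world commercial visual content generation.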
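The task structure described in the abstract can be sketched as a simple grid. This is a minimal illustration, not code from the paper: the variable names are hypothetical, and the per-task and per-prompt figures are averages derived only from the totals in the abstract, which does not state whether prompts or questions are evenly distributed.

```python
from itertools import product

# Document types and capability dimensions as listed in the abstract.
DOCUMENT_TYPES = ["slides", "charts", "webpages", "posters", "scientific figures"]
CAPABILITIES = [
    "text rendering",
    "layout control",
    "attribute binding",
    "knowledge-based reasoning",
]

# Totals reported in the abstract.
TOTAL_PROMPTS = 400
TOTAL_CHECKLIST_QUESTIONS = 8000

# Each (document type, capability) pair is one evaluation task.
tasks = list(product(DOCUMENT_TYPES, CAPABILITIES))
num_tasks = len(tasks)

# Averages only; an even split across tasks is an assumption.
avg_prompts_per_task = TOTAL_PROMPTS / num_tasks
avg_questions_per_prompt = TOTAL_CHECKLIST_QUESTIONS / TOTAL_PROMPTS

print(f"tasks: {num_tasks}")
print(f"average prompts per task: {avg_prompts_per_task}")
print(f"average checklist questions per prompt: {avg_questions_per_prompt}")
```

Five document types crossed with four capability dimensions gives the 20 tasks the abstract mentions; the totals then work out to an average of 20 prompts per task and 20 checklist questions per prompt.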

Original Abstract

Recent advances in image generation models have expanded their applications beyond aesthetic imagery toward practical visual content creation. However, existing benchmarks mainly focus on natural image synthesis and fail to systematically evaluate models under the structured and multi-constraint requirements of real-world commercial design tasks. In this work, we introduce BizGenEval, a systematic benchmark for commercial visual content generation. The benchmark spans five representative document types: slides, charts, webpages, posters, and scientific figures, and evaluates four key capability dimensions: text rendering, layout control, attribute binding, and knowledge-based reasoning, forming 20 diverse evaluation tasks. BizGenEval contains 400 carefully curated prompts and 8000 human-verified check...

--- *Automatically collected on 2026-03-28*

#paper #arXiv #computer-vision #小凯
