1. 📘 Topic and Domain: A comprehensive benchmark framework called OneIG-Bench for evaluating text-to-image (T2I) generation models across multiple dimensions including prompt-image alignment, text rendering, reasoning, stylization, and diversity.
2. 💡 Previous Research and New Ideas: Building on earlier, largely single-dimensional benchmarks such as T2I-CompBench and GenEval, this paper proposes a multi-dimensional evaluation framework with specialized metrics for each dimension.
3. ❓ Problem: The paper addresses the lack of comprehensive evaluation methods for modern text-to-image models, particularly in areas like reasoning ability, text rendering accuracy, and stylization capabilities.
4. 🛠️ Methods: The authors constructed a benchmark of over 1,000 prompts spanning six categories (General Object, Portrait, Anime/Stylization, Text Rendering, Knowledge/Reasoning, Multilingualism) and developed a dedicated quantitative metric for each evaluation dimension.
5. 📊 Results and Evaluation: The evaluation showed that closed-source models generally outperformed open-source ones: GPT-4o led across most dimensions, while Seedream 3.0 stood out specifically in text rendering.
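The per-dimension scoring and ranking described above can be sketched as follows. This is an illustrative outline only, not the authors' code: the model names other than those in the summary, the dimension labels, the score values, and the unweighted-mean aggregation are all simplifying assumptions (OneIG-Bench defines its own metric per dimension).

```python
# Illustrative sketch of multi-dimensional benchmark aggregation.
# Scores below are made-up placeholders, NOT results from the paper.
from statistics import mean

# Hypothetical per-dimension scores in [0, 1] for two placeholder models.
scores = {
    "ModelA": {"alignment": 0.82, "text_rendering": 0.61, "reasoning": 0.70,
               "stylization": 0.75, "diversity": 0.58},
    "ModelB": {"alignment": 0.78, "text_rendering": 0.84, "reasoning": 0.66,
               "stylization": 0.72, "diversity": 0.64},
}

def overall(per_dim: dict) -> float:
    """Aggregate dimension scores by unweighted mean (an assumption;
    a real benchmark may weight dimensions or report them separately)."""
    return mean(per_dim.values())

# Rank models by aggregate score, highest first.
leaderboard = sorted(scores, key=lambda m: overall(scores[m]), reverse=True)
```

Reporting per-dimension scores alongside any aggregate is what lets a benchmark like this surface specialists (e.g., a model that leads only in text rendering) that a single overall number would hide.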