1. 📘 Topic and Domain: The paper focuses on text-to-image generation, specifically improving text rendering capabilities in AI-generated images through data synthesis and model enhancement.
2. 💡 Previous Research and New Ideas: Prior research relied on glyph-based control methods, while this paper proposes a data-centric approach using high-quality synthetic data and prompt enrichment without architectural modifications.
3. ❓ Problem: The paper addresses poor text rendering quality in current text-to-image models, particularly issues with multi-word generation, complex layouts, and text attribute control.
4. 🛠️ Methods: The authors develop the LeX-Art framework, which comprises LeX-10K (a curated dataset of 10K high-quality text-image pairs), LeX-Enhancer (a prompt enrichment model), LeX-FLUX and LeX-Lumina (fine-tuned generation models), and LeX-Bench (an evaluation benchmark).
5. 📊 Results and Evaluation: LeX-Lumina achieved a 79.81% gain in PNED (Pairwise Normalized Edit Distance) on CreateBench, while LeX-FLUX outperformed baselines in color (+3.18%), positional (+4.45%), and font accuracy (+3.81%), indicating improved text rendering quality and aesthetic appeal.
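
The summary does not spell out how PNED is computed. As a rough illustration only, the sketch below assumes the metric compares the words requested in the prompt with the words read back from the generated image by OCR, using a length-normalized edit distance per pair. The greedy matching, the penalty of 1.0 for unmatched words, and the function names are all assumptions made for this example, not the paper's definition.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    if not a and not b:
        return 0.0
    return edit_distance(a, b) / max(len(a), len(b))


def pairwise_score(target_words: list[str], ocr_words: list[str]) -> float:
    """Hypothetical pairwise score; lower means closer to the prompt text.

    Assumption: each target word is greedily paired with its closest
    remaining OCR word, and every unmatched word on either side costs 1.0.
    """
    remaining = list(ocr_words)
    total = 0.0
    for target in target_words:
        if not remaining:
            total += 1.0  # target word missing from the image
            continue
        best = min(remaining, key=lambda w: normalized_edit_distance(target, w))
        total += normalized_edit_distance(target, best)
        remaining.remove(best)
    total += len(remaining)  # extra words that were rendered but not requested
    return total
```

Because words are matched as a set, a rendering that contains the right words in a different reading order is not penalized, while misspelled, missing, or extra words raise the score. Greedy matching is used here only for brevity; an optimal assignment (for example, the Hungarian algorithm) would be a more robust choice.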