Generating images with accurate visual text

25 March 2024

Computer Vision and Pattern Recognition

Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo


Created a benchmark for testing generation of long, uncommon text

Identified limitations of existing text-to-image models

Proposed training-free method to reduce text overlap and fix misspellings

Achieved gains of over 20% in text accuracy metrics

This paper introduces methods for improving the accuracy of text rendered in AI-generated images. The researchers built a benchmark that evaluates text-to-image models on long, uncommon text, and used it to identify limitations of existing models. They then developed a training-free technique that improves existing models by reducing text overlap and correcting misspellings, yielding gains of over 20% in text accuracy metrics.
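
To make "text accuracy" concrete, here is a minimal sketch of one common way to score rendered text: compare the intended string with the string read back from the image (for example by an OCR system) using normalized edit distance. This is an illustrative assumption, not necessarily the metric used in the paper; the function names `edit_distance` and `text_accuracy` are hypothetical, and the OCR step is left out entirely.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def text_accuracy(target: str, rendered: str) -> float:
    """Assumed score: 1 minus normalized edit distance; 1.0 is an exact match."""
    if not target and not rendered:
        return 1.0
    longest = max(len(target), len(rendered))
    return 1.0 - edit_distance(target, rendered) / longest
```

Under this assumed score, a single misspelled character in a four-character word gives 0.75, and longer target strings dilute the penalty of any one error, which is one reason a benchmark of long, uncommon text might also report exact-match rates.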

