Robust image classification with generated datasets

5 February 2023

Computer Vision and Pattern Recognition

Hritik Bansal, Aditya Grover


Generated data alone increases classifier robustness but hurts accuracy

Augmenting real data with generated data improves robustness without hurting accuracy

In-the-wild generative models are better than traditional augmentations

More generated data leads to more robustness

Diverse text prompts produce the most robust classifiers

This paper explores using datasets generated by modern text-to-image models such as Stable Diffusion to improve the robustness of image classifiers. The key finding is that augmenting real ImageNet data with an equal amount of generated data yields models that are more robust to natural distribution shifts, such as sketches and paintings, without sacrificing accuracy on the original dataset.
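
As a rough illustration of the equal-mix idea, the sketch below builds a training list that combines every real sample with a same-sized random subset of generated samples. It is not the authors' pipeline: the function name `mix_real_and_generated`, the `(path, label)` tuple layout, the `ratio` parameter, and the file names are all assumptions made for this example.

```python
import random


def mix_real_and_generated(real, generated, ratio=1.0, seed=0):
    """Combine real samples with ratio * len(real) generated samples.

    `real` and `generated` are lists of (path, label) tuples (an assumed
    layout). Each returned item is (path, label, source), where source is
    "real" or "generated", so the origin of a sample stays traceable.
    """
    n_generated = round(ratio * len(real))
    if n_generated > len(generated):
        raise ValueError("not enough generated samples for the requested ratio")

    rng = random.Random(seed)  # local RNG keeps the mix reproducible
    picked = rng.sample(generated, n_generated)  # sample without replacement

    mixed = [(path, label, "real") for path, label in real]
    mixed += [(path, label, "generated") for path, label in picked]
    rng.shuffle(mixed)  # interleave real and generated samples
    return mixed


# Hypothetical file names, for illustration only.
real = [(f"real_{i}.jpg", i % 3) for i in range(6)]
generated = [(f"gen_{i}.png", i % 3) for i in range(10)]
mixed = mix_real_and_generated(real, generated)
```

With `ratio=1.0` the mix contains as many generated samples as real ones, matching the equal-amounts setting described above. Other ratios would let one probe how the amount of generated data affects robustness.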

