Attention heads improve vision networks

Published on: 25 October 2023

Primary Category: Computer Vision and Pattern Recognition

Paper Authors: Jongbin Ryu, Dongyoon Han, Jongwoo Lim


Key Details

Proposes lightweight multi-head network architecture as alternative to channel expansion

Computes Gramian matrices to enhance heads via pairwise feature similarity

Introduces decorrelation loss to encourage heads to complement each other

Demonstrates accuracy and throughput advantages over CNNs and ViTs on ImageNet-1K

Exhibits strong performance on downstream COCO segmentation

AI generated summary

This paper introduces a network design that uses multiple lightweight attention heads to improve vision network performance, instead of relying solely on channel expansion or additional blocks. It computes Gramian matrices, which capture pairwise feature similarity, to enhance the heads and strengthen their aggregation. The paper also proposes a decorrelation loss, which encourages the heads to complement each other by reducing the correlation between them. In the reported experiments, these Gramian attention heads surpass CNNs and ViTs in both accuracy and throughput on ImageNet-1K. The design also performs strongly on downstream tasks such as COCO segmentation.
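To make the two ideas in the summary concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the summary does not say which axis the Gramian is computed over or what form the decorrelation loss takes. The function names `gram_matrix`, `pearson`, and `decorrelation_loss` are hypothetical, the Gramian is assumed to be the matrix of dot products between feature vectors, and the loss is assumed to be the mean squared Pearson correlation between head outputs.

```python
import math


def gram_matrix(features):
    """Return the N x N Gram matrix of N feature vectors.

    Entry (i, j) is the dot product of vectors i and j, i.e. an
    unnormalized pairwise similarity.
    """
    return [[sum(a * b for a, b in zip(u, v)) for v in features]
            for u in features]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def decorrelation_loss(head_outputs):
    """Mean squared Pearson correlation over all distinct pairs of heads.

    ASSUMPTION: an illustrative penalty only; the paper's actual loss
    may be defined differently.
    """
    k = len(head_outputs)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    if not pairs:
        return 0.0  # a single head has nothing to decorrelate against
    return sum(pearson(head_outputs[i], head_outputs[j]) ** 2
               for i, j in pairs) / len(pairs)


# Toy inputs (made up for illustration).
tokens = [[1.0, 2.0], [3.0, 4.0]]
redundant = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
complementary = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]

print(gram_matrix(tokens))
print(decorrelation_loss(redundant), decorrelation_loss(complementary))
```

Under these assumptions, two heads whose outputs are scaled copies of each other receive a penalty close to 1, while two uncorrelated heads receive a penalty of 0, so minimizing the loss pushes heads toward complementary outputs.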
