Paper Title:
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers
Published on:
8 May 2024
Primary Category:
Machine Learning
Paper Authors:
Jiuxiang Gu,
Yingyu Liang,
Heshan Liu,
Zhenmei Shi,
Zhao Song,
Junze Yin
Proposes a conv basis that decomposes attention matrices into sums of convolution matrices
Shows any attention matrix can be decomposed this way, enabling efficient FFT-based computation
Achieves near-linear-time attention inference without changing model parameters
Also accelerates the attention training forward pass and backward gradient computation
May enable transformers to handle much longer input contexts
Efficient attention computation for transformers
This paper develops a convolution-based method to efficiently approximate attention in transformers, reducing the quadratic complexity to nearly linear. It shows that any attention matrix can be decomposed into a sum of convolution matrices, which enables fast-Fourier-transform-based computation without changing model parameters.
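The speedup from the convolution decomposition rests on a classical fact: multiplying a convolution (lower-triangular Toeplitz) matrix by a vector is a causal convolution, which the FFT computes in O(n log n) instead of the O(n^2) of a dense matrix-vector product. The sketch below illustrates only this building block with NumPy; the function name `conv_matvec_fft` and the toy sizes are illustrative and not from the paper, which further decomposes attention into several such matrices.

```python
import numpy as np

def conv_matvec_fft(c, v):
    """Multiply the lower-triangular Toeplitz (convolution) matrix whose
    first column is c by the vector v, in O(n log n) via FFT, without
    ever forming the n x n matrix."""
    n = len(c)
    # Zero-pad to length 2n so the circular convolution computed by the
    # FFT does not wrap around, then keep the first n (causal) entries.
    m = 2 * n
    out = np.fft.ifft(np.fft.fft(c, m) * np.fft.fft(v, m)).real
    return out[:n]

# Sanity check against the explicit convolution matrix A[i, j] = c[i - j].
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
v = rng.standard_normal(n)
A = np.array([[c[i - j] if i >= j else 0.0 for j in range(n)]
              for i in range(n)])
assert np.allclose(A @ v, conv_matvec_fft(c, v))
```

If attention can be written as a short sum of such convolution matrices (the paper's conv-basis claim), each term costs O(n log n), giving the near-linear overall complexity.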