
Quantized neural network training equivalence

Published on: 8 May 2024

Primary Category: Machine Learning

Paper Authors: Matt Schoenbauer, Daniele Moro, Lukasz Lew, Andrew Howard


Key Details

Many proposed gradient estimators are equivalent to the straight-through estimator (see the sketch after this list)

Equivalence holds after adjusting the learning rate and weight initialization

The result holds for both SGD and Adam optimizers

Demonstrated on both small CNNs and large ResNet models

Suggests that the common concern about 'gradient error' in quantized training is unfounded
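
For reference, below is a minimal sketch of the straight-through estimator written in JAX; the framework choice, the toy loss, and the function names are illustrative assumptions, not taken from the paper. The quantizer rounds in the forward pass, and the backward pass passes the gradient through unchanged, as if rounding were the identity.

```python
import jax
import jax.numpy as jnp

# Minimal straight-through estimator (STE) for a rounding quantizer.
@jax.custom_vjp
def ste_round(x):
    # Forward pass: non-differentiable rounding used during quantized training.
    return jnp.round(x)

def _ste_round_fwd(x):
    return ste_round(x), None

def _ste_round_bwd(_, g):
    # Backward pass: pass the gradient through unchanged, i.e. pretend
    # that round() is the identity function.
    return (g,)

ste_round.defvjp(_ste_round_fwd, _ste_round_bwd)

# Toy usage: the gradient of a quantized loss flows through the rounding.
loss = lambda w: (ste_round(w) * 2.0 - 1.0) ** 2
print(jax.grad(loss)(0.3))  # -4.0, exactly as if round() were the identity
```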

AI generated summary

This paper proves that many proposed complex gradient estimators for quantized neural networks are equivalent to the straight-through estimator (STE). After adjusting the learning rate and weight initialization, models trained with these complex estimators follow nearly identical trajectories to models trained with the STE.
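
As a toy illustration of why such an equivalence can hold, the sketch below considers an assumed special case, not the paper's general construction: a "custom" estimator that merely scales the STE gradient by a constant c. Under plain SGD the scale can be folded into the learning rate, so the two training trajectories coincide exactly; the paper's result covers far more general estimators and also rescales the weight initialization.

```python
import jax.numpy as jnp

# Assumed special case: a "custom" estimator that simply scales the STE
# gradient by a constant c. Under plain SGD the scale can be absorbed into
# the learning rate, so both runs produce the same weights at every step.
# (The paper handles far more general estimators; this only shows the intuition.)
c, lr = 4.0, 0.01

def ste_grad(w):
    # STE gradient of the toy loss L(w) = (round(w) * 2 - 1) ** 2,
    # treating round() as the identity in the backward pass.
    return 2.0 * (jnp.round(w) * 2.0 - 1.0) * 2.0

w_custom = w_ste = jnp.asarray(0.3)
for _ in range(5):
    w_custom = w_custom - lr * (c * ste_grad(w_custom))  # custom estimator, base lr
    w_ste = w_ste - (lr * c) * ste_grad(w_ste)            # plain STE, rescaled lr

print(bool(jnp.allclose(w_custom, w_ste)))  # True: the trajectories coincide
```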
