Mixed precision quantization aware training : Integration of Fracbits in Aidge
This issue aims to integrate the mixed-precision quantization-aware training method FracBits into Aidge.
It is structured as follows. Section "Fracbits Method" introduces the FracBits method. Section "Graph projection of Fracbit method" proposes an operator decomposition that expresses the method as an Aidge graph. Section "Tasks" lists the implementation tasks foreseen for the integration. Finally, section "Future work" suggests an idea to improve the method with little overhead.
Fracbits Method
FracBits formulates bit-widths as continuous parameters that interpolate between two consecutive integers. For example, a distribution quantized with a continuous bit-width λ = 3.6 corresponds to a combination of the distributions quantized with floor(λ) = 3 and ceil(λ) = 4 bits. This makes the quantizer differentiable with respect to the bit-width, so bit-width selection becomes a learnable process. FracBits supports both layer-wise and kernel-wise quantization, and allocates bit-widths under resource constraints on model size and computation cost (measured in BitOPs). According to the paper, the bit-width configurations found this way outperform prior mixed-precision methods.
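The interpolation described above can be sketched as follows. This is a minimal illustration, not the Aidge implementation: the function names are hypothetical, the uniform quantizer on [0, 1] is an assumed stand-in for the actual (SAT-based) quantizer, and the linear weighting by the fractional part of λ is our reading of the paper.

```python
import math


def quantize_uniform(x, bits):
    """Assumed stand-in quantizer: clamp x to [0, 1] and snap it to a uniform
    grid with 2**bits - 1 steps."""
    if bits < 1:
        raise ValueError("bits must be >= 1")
    levels = 2 ** bits - 1
    x = min(max(x, 0.0), 1.0)
    return round(x * levels) / levels


def fracbits_quantize(x, bitwidth):
    """Quantize x with a fractional bit-width by linearly interpolating between
    the quantizers at the two neighbouring integer bit-widths."""
    lo = math.floor(bitwidth)
    frac = bitwidth - lo  # weight given to the higher bit-width
    q_lo = quantize_uniform(x, lo)
    if frac == 0.0:
        # Integer bit-width: no interpolation needed.
        return q_lo
    q_hi = quantize_uniform(x, lo + 1)
    return (1.0 - frac) * q_lo + frac * q_hi


# With λ = 3.6 the result is 0.4 * Q_3(x) + 0.6 * Q_4(x).
value = fracbits_quantize(0.5, 3.6)
```

Because the output is linear in the fractional part of λ, its derivative with respect to λ is simply Q_4(x) - Q_3(x) on the interval (3, 4), which is what lets the bit-width receive gradients.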
Learning pipeline :
- Phase 1 - learn params + bitwidth
- Phase 2 - fix bitwidth
- Phase 3 - finetune weights & alpha (activation) with fixed bitwidth
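As an illustration of phase 2, the sketch below freezes learned fractional bit-widths to integers. The issue does not specify how bit-widths are fixed, so rounding to the nearest integer and clamping to a [min_bits, max_bits] range are assumptions, and `freeze_bitwidths` and its parameters are hypothetical names.

```python
import math


def freeze_bitwidths(learned, min_bits=2, max_bits=8):
    """Map each learned fractional bit-width to a fixed integer bit-width.

    Assumption: round half up, then clamp to [min_bits, max_bits].
    """
    frozen = {}
    for name, bitwidth in learned.items():
        rounded = math.floor(bitwidth + 0.5)  # round half up, unlike built-in round()
        frozen[name] = max(min_bits, min(max_bits, rounded))
    return frozen


# Hypothetical layer names and phase 1 results.
frozen = freeze_bitwidths({"conv1": 3.6, "conv2": 4.5, "fc": 9.2})
```

After this step the bit-widths are no longer trainable, and phase 3 finetunes only the weights and the activation parameter alpha.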
References :
- Fracbits paper : https://arxiv.org/abs/2007.02017
- Fracbits builds on the SAT method : https://arxiv.org/pdf/1912.10207
Graph projection of Fracbit method
Graph projection in Aidge of the weight quantization from Fracbit method :
Graph projection in Aidge of the activation quantization from Fracbit method :
Tasks
- Implement operator LearnableBitwidth_Op
- Implement operator Fracbit Interpolation
- Implement cost function Comp (for weight-only quantization)
- Implement cost function BitOps (weight & activation quantization)
- Implement recipes to add and remove the operators
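To make the cost-function tasks more concrete, here is a rough sketch of resource costs computed from fractional bit-widths. It is not the Aidge API: the layer dictionaries, function names, and the absolute-difference penalty are assumptions, and the exact definition of the Comp cost is left open here. The sketch assumes model size is the number of weights times the weight bit-width, and that BitOPs is the number of multiply-accumulates times the weight and activation bit-widths.

```python
def model_size_bits(layers):
    """Assumed weight-only cost: total storage in bits."""
    return sum(layer["num_weights"] * layer["weight_bits"] for layer in layers)


def bitops(layers):
    """Assumed computation cost: MACs scaled by weight and activation bit-widths."""
    return sum(
        layer["macs"] * layer["weight_bits"] * layer["act_bits"] for layer in layers
    )


def resource_penalty(cost, target, kappa):
    """Assumed regularizer added to the task loss: kappa * |cost - target|."""
    return kappa * abs(cost - target)


# Hypothetical two-layer network with fractional bit-widths.
layers = [
    {"num_weights": 1000, "macs": 10000, "weight_bits": 3.5, "act_bits": 4.0},
    {"num_weights": 2000, "macs": 5000, "weight_bits": 6.0, "act_bits": 8.0},
]
size = model_size_bits(layers)  # 1000 * 3.5 + 2000 * 6.0
cost = bitops(layers)           # 10000 * 3.5 * 4.0 + 5000 * 6.0 * 8.0
penalty = resource_penalty(cost, target=300000, kappa=1e-6)
```

Both costs are differentiable in the bit-widths, so the penalty can steer the learned bit-widths towards the resource target during phase 1.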
Future work
- Adapt the FracBits method to learn the range (i.e. the number of quantization bins) instead of the bit-width.