
Global Quantization Improvements

Benjamin Halimi requested to merge improvements into dev

Description

This merge request reworks several parts of the existing PTQ and QAT code. The changes do not add new features to the quantization module, but rather improve the implementation of the existing ones. They are listed below.

Changes

Regarding the PTQ:

  • simplify the insertCompensationNodes() routine
  • improve the tensor manipulation utility functions, that is:
      • getTensorAbsMax()
      • rescaleTensor()
      • roundTensor()
  • migrate the CLE data types from float to double
  • move the 'PTQMetaOps' files to the operator folder
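To make the PTQ utility changes concrete, here is a minimal sketch of what the three tensor helpers could look like. It operates on a plain `std::vector<double>` rather than Aidge's actual Tensor type, and the signatures are illustrative assumptions, not the real API; note the use of double throughout, in line with the float-to-double migration above.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Illustrative stand-ins for the MR's PTQ utility routines, operating on a
// plain std::vector<double> instead of Aidge's Tensor type (assumption).

// Largest absolute value in the tensor (used to derive scaling factors).
double getTensorAbsMax(const std::vector<double>& t) {
    double m = 0.0;
    for (double v : t) m = std::max(m, std::abs(v));
    return m;
}

// Multiply every element by a scaling factor, in place.
void rescaleTensor(std::vector<double>& t, double factor) {
    for (double& v : t) v *= factor;
}

// Round every element to the nearest integer value, in place.
void roundTensor(std::vector<double>& t) {
    for (double& v : t) v = std::nearbyint(v);
}
```

A typical PTQ use of these helpers is to read the absolute maximum, rescale the tensor into the target integer range, and then round it.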

Regarding the QAT:

  • completely rework the code architecture, that is:
      • use addBeforeForward() instead of adding a calibration step in the workflow
      • rework the input/weight quantizer node insertion routines
      • add a node name sanitizer function at the beginning of setupQuantizer()
  • improve the tensor manipulation utility functions, that is:
      • getTensorAbsMean()
      • getTensorStd()
  • change the initial step-size formula to a simpler one that appears to work better
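The two QAT statistics helpers, and the kind of step-size initialization they feed, can be sketched as follows. Again this is a hedged illustration on a raw `std::vector<double>`, not Aidge's real signatures; the step-size function shows the common LSQ-style initialization (s = 2·E[|x|]/√q_max), since the MR's "simpler" replacement formula is not spelled out here.

```cpp
#include <cmath>
#include <vector>

// Mean of absolute values across the tensor.
double getTensorAbsMean(const std::vector<double>& t) {
    double s = 0.0;
    for (double v : t) s += std::abs(v);
    return s / static_cast<double>(t.size());
}

// Population standard deviation of the tensor values.
double getTensorStd(const std::vector<double>& t) {
    double mean = 0.0;
    for (double v : t) mean += v;
    mean /= static_cast<double>(t.size());
    double var = 0.0;
    for (double v : t) var += (v - mean) * (v - mean);
    return std::sqrt(var / static_cast<double>(t.size()));
}

// Common LSQ-style initialization: s = 2 * E[|x|] / sqrt(qMax).
// This only illustrates the kind of statistic the step size is derived
// from; it is not the MR's actual formula.
double initialStepSize(const std::vector<double>& t, double qMax) {
    return 2.0 * getTensorAbsMean(t) / std::sqrt(qMax);
}
```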

Also, all the verbose std::cout logs have been replaced with Log calls.
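The migration pattern can be illustrated with a minimal stand-in logger; the `Log::info` name and signature below are hypothetical and may differ from Aidge's real Log API.

```cpp
#include <cstdio>
#include <string>

// Minimal stand-in for a leveled logger (hypothetical; the real Log API
// may differ). Shows the std::cout -> Log migration pattern.
struct Log {
    static std::string last;  // last message, captured for demonstration
    static void info(const std::string& msg) {
        last = msg;
        std::printf("[INFO] %s\n", msg.c_str());
    }
};
std::string Log::last;

void reportScale(double scale) {
    // Before: std::cout << " scale = " << scale << std::endl;
    // After: a leveled log call that can be filtered or silenced globally.
    Log::info("scale = " + std::to_string(scale));
}
```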
