Draft: Real quantization for PTQ (Int32 Cast and BitShift Execution)

Related to issue #59

This MR aims to bring real quantization and integer graph execution to the AIDGE quantization module.

Changes so far:

  • PTQ.cpp: Added a CastNetwork routine that directly casts the network into the target type. The routine differentiates between normal mode and single-shift mode, inserting either an IntQuantizer or a BitShift accordingly (a minimal sketch of both rescaling modes follows this list). IntQuantizer is a modified version of the Quantizer operator that allows execution within a fully integer network: this new meta-operator casts its input to Float, applies the quantization arithmetic, and casts the output back to Int.

For now, this functional pipeline slightly degrades accuracy (ResNet: 68.7 → 68.1). Additionally, full functionality requires issue aidge_backend_cpu#40 (closed) to be resolved.

Also, I added a foldGraph flag to the quantizeNetwork() function. When this flag is set to true, the pipeline applies the constant-folding recipe to the graph, making it much simpler and more readable from the user's perspective by 'removing' all the intermediate PTQ nodes that were inserted (a toy illustration of constant folding also follows the list below).

  • PTQMetaOps.(h/c)pp: Added the new meta-operators IntQuantizer and BitShiftQuantizer.
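
To make the two modes concrete, here is a minimal, self-contained C++ sketch of the rescaling arithmetic each one relies on. This is an illustration only, not the actual Aidge operators: the function names and values are hypothetical, and the real meta-operators work on Tensors inside the graph.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>

// IntQuantizer-style rescaling: round-trip through float. The int32
// accumulator is cast to float, multiplied by an arbitrary scaling
// factor, rounded, then cast back to int32.
std::int32_t rescaleViaFloat(std::int32_t acc, float scale) {
    return static_cast<std::int32_t>(std::round(static_cast<float>(acc) * scale));
}

// Single-shift rescaling: the scaling factor has been approximated by a
// power of two (scale ~= 2^-shift), so a BitShift is enough and the whole
// path stays in integer arithmetic. Assumes shift >= 1; the added term
// implements round-half-up instead of plain truncation.
std::int32_t rescaleViaShift(std::int32_t acc, int shift) {
    return (acc + (1 << (shift - 1))) >> shift;
}

int main() {
    std::int32_t acc = 1000;                             // int32 accumulator
    std::cout << rescaleViaFloat(acc, 0.0625f) << '\n';  // 63 = round(62.5)
    std::cout << rescaleViaShift(acc, 4) << '\n';        // 63 = (1000 + 8) >> 4
    return 0;
}
```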
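
And since the constant-folding recipe is what the foldGraph flag triggers, here is a toy sketch of the idea: any subgraph whose inputs are all constants (for PTQ, typically the scale Producers feeding the inserted scaling nodes) is evaluated once and replaced by a single constant node. This is a conceptual illustration, not the Aidge recipe; the Node type and fold() helper are hypothetical.

```cpp
#include <iostream>
#include <memory>

// Toy expression node: 'c' = constant, '*' = multiply, '+' = add.
// A conceptual stand-in for graph nodes, not Aidge's Node class.
struct Node {
    char op;
    double value = 0.0;  // only meaningful when op == 'c'
    std::shared_ptr<Node> lhs, rhs;
};

std::shared_ptr<Node> constant(double v) {
    return std::make_shared<Node>(Node{'c', v, nullptr, nullptr});
}

// Recursively replace constant-only subtrees by a single constant node,
// the way constant folding collapses intermediate PTQ scaling nodes once
// all of their inputs are constant.
std::shared_ptr<Node> fold(const std::shared_ptr<Node>& n) {
    if (n->op == 'c') return n;
    auto l = fold(n->lhs);
    auto r = fold(n->rhs);
    if (l->op == 'c' && r->op == 'c') {
        double v = (n->op == '*') ? l->value * r->value : l->value + r->value;
        return constant(v);
    }
    return std::make_shared<Node>(Node{n->op, 0.0, l, r});
}

int main() {
    // weight * (scale1 * scale2): the constant chain folds to one node.
    auto expr = std::make_shared<Node>(Node{'*', 0.0, constant(0.5),
        std::make_shared<Node>(Node{'*', 0.0, constant(4.0), constant(2.0)})});
    std::cout << fold(expr)->value << '\n';  // prints 4 (0.5 * 8)
    return 0;
}
```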

Results: We obtain exactly the same results for MiniResNet as when using the fake-quantization pipeline (with and without SingleShift). For ResNet18, we observe a slight accuracy drop from 68.7% to 67.9% in Int32 with SingleShift, due to the approximation of the Global Average Pooling operator in integer arithmetic (a numeric illustration of this effect follows below).
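
Regarding the Global Average Pooling approximation, the suspected mechanism is simple to illustrate: in integer arithmetic, the division by the pooling window size truncates, so each averaged activation can lose up to one unit, and those small errors accumulate across the network. A tiny numeric illustration with made-up values:

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Hypothetical 2x2 Global Average Pooling window of int32 activations.
    std::int32_t window[4] = {7, 8, 9, 9};
    std::int32_t sum = 0;
    for (std::int32_t v : window) sum += v;

    float exactAvg = sum / 4.0f;    // 8.25 in float arithmetic
    std::int32_t intAvg = sum / 4;  // 8: integer division truncates

    std::cout << "float avg = " << exactAvg
              << ", int avg = " << intAvg << '\n';
    return 0;
}
```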

There is now also a functional test suite to prevent regressions in the PTQ pipeline (#64).

Quick fix to solve this issue: #19 (comment 3112773)

What remains to be done:

  • Add support for Int8
  • Add an option to the main PTQ function so that the pipeline automatically casts the user's input Tensor into the desired type
  • Examine the Global Average Pooling layers to determine the source of the remaining slight accuracy loss
