Draft: Real quantization for PTQ (Int32 Cast and BitShift Execution)

Related to issue #59

This MR aims to bring real quantization and integer graph execution to the AIDGE quantization module.

Changes so far:

  • PTQ.cpp: Added a CastNetwork routine that directly casts the network into the target type. The routine differentiates between normal mode and single-shift mode, inserting either an IntQuantizer or a BitShift accordingly (a minimal sketch of both rescaling modes follows this list). IntQuantizer is a modified version of the Quantizer operator that allows execution within a fully integer network: this new meta-operator casts its input to Float, applies the quantization arithmetic, and casts the output back to Int.

For now, this functional pipeline slightly degrades accuracy (ResNet: 68.7 → 68.1). Additionally, full functionality requires issue aidge_backend_cpu#40 (closed) to be resolved.

Also, I added a foldGraph flag to the quantizeNetwork() function. When this flag is set to true, the pipeline applies the constant-folding recipe to the graph, making it much simpler and more readable from the user's perspective by 'removing' all the intermediate PTQ nodes that were inserted (a toy illustration of constant folding also follows the list below).

  • PTQMetaOps.(h/c)pp: Added the new meta-operators IntQuantizer and BitShiftQuantizer.
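
To make the two modes concrete, here is a minimal, self-contained C++ sketch of the rescaling arithmetic each one relies on. This is an illustration only, not the actual Aidge operators: the function names and values are hypothetical, and the real meta-operators work on Tensors inside the graph.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>

// IntQuantizer-style rescaling: round-trip through float. The int32
// accumulator is cast to float, multiplied by an arbitrary scaling
// factor, rounded, then cast back to int32.
std::int32_t rescaleViaFloat(std::int32_t acc, float scale) {
    return static_cast<std::int32_t>(std::round(static_cast<float>(acc) * scale));
}

// Single-shift rescaling: the scaling factor has been approximated by a
// power of two (scale ~= 2^-shift), so a BitShift is enough and the whole
// path stays in integer arithmetic. Assumes shift >= 1; the added term
// implements round-half-up instead of plain truncation.
std::int32_t rescaleViaShift(std::int32_t acc, int shift) {
    return (acc + (1 << (shift - 1))) >> shift;
}

int main() {
    std::int32_t acc = 1000;                             // int32 accumulator
    std::cout << rescaleViaFloat(acc, 0.0625f) << '\n';  // 63 = round(62.5)
    std::cout << rescaleViaShift(acc, 4) << '\n';        // 63 = (1000 + 8) >> 4
    return 0;
}
```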
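
And since the constant-folding recipe is what the foldGraph flag triggers, here is a toy sketch of the idea: any subgraph whose inputs are all constants (for PTQ, typically the scale Producers feeding the inserted scaling nodes) is evaluated once and replaced by a single constant node. This is a conceptual illustration, not the Aidge recipe; the Node type and fold() helper are hypothetical.

```cpp
#include <iostream>
#include <memory>

// Toy expression node: 'c' = constant, '*' = multiply, '+' = add.
// A conceptual stand-in for graph nodes, not Aidge's Node class.
struct Node {
    char op;
    double value = 0.0;  // only meaningful when op == 'c'
    std::shared_ptr<Node> lhs, rhs;
};

std::shared_ptr<Node> constant(double v) {
    return std::make_shared<Node>(Node{'c', v, nullptr, nullptr});
}

// Recursively replace constant-only subtrees by a single constant node,
// the way constant folding collapses intermediate PTQ scaling nodes once
// all of their inputs are constant.
std::shared_ptr<Node> fold(const std::shared_ptr<Node>& n) {
    if (n->op == 'c') return n;
    auto l = fold(n->lhs);
    auto r = fold(n->rhs);
    if (l->op == 'c' && r->op == 'c') {
        double v = (n->op == '*') ? l->value * r->value : l->value + r->value;
        return constant(v);
    }
    return std::make_shared<Node>(Node{n->op, 0.0, l, r});
}

int main() {
    // weight * (scale1 * scale2): the constant chain folds to one node.
    auto expr = std::make_shared<Node>(Node{'*', 0.0, constant(0.5),
        std::make_shared<Node>(Node{'*', 0.0, constant(4.0), constant(2.0)})});
    std::cout << fold(expr)->value << '\n';  // prints 4 (0.5 * 8)
    return 0;
}
```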

Results: We obtain exactly the same results for MiniResNet as when using the fake-quantization pipeline (with and without SingleShift). For ResNet18, we observe a slight accuracy drop from 68.7% to 67.9% in Int32 with SingleShift, due to the approximation of the Global Average Pooling operator in integer arithmetic (a numeric illustration of this effect follows below).
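
Regarding the Global Average Pooling approximation, the suspected mechanism is simple to illustrate: in integer arithmetic, the division by the pooling window size truncates, so each averaged activation can lose up to one unit, and those small errors accumulate across the network. A tiny numeric illustration with made-up values:

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Hypothetical 2x2 Global Average Pooling window of int32 activations.
    std::int32_t window[4] = {7, 8, 9, 9};
    std::int32_t sum = 0;
    for (std::int32_t v : window) sum += v;

    float exactAvg = sum / 4.0f;    // 8.25 in float arithmetic
    std::int32_t intAvg = sum / 4;  // 8: integer division truncates

    std::cout << "float avg = " << exactAvg
              << ", int avg = " << intAvg << '\n';
    return 0;
}
```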

There is now also a functional test suite to prevent regressions in the PTQ pipeline (#64).

Quick fix to solve this issue: #19 (comment 3112773)

What remains to be done:

  • Add support for Int8
  • Add an option to the main PTQ function so that the pipeline automatically casts the user's input Tensor into the desired type
  • Examine the Global Average Pooling layers to determine the source of the remaining slight accuracy loss
