CUDA support for the Quantization routines (!18) · Merge requests · Eclipse Projects / aidge / aidge_quantization

Benjamin Halimi requested to merge DevQAT into dev Nov 13, 2024

Description

This MR introduces CUDA support for the post-training quantization (PTQ) and quantization-aware training (QAT) routines.

For the Quantization module, this is a game changer: real-size models (e.g. ResNet18) can now be quantized in only a few minutes instead of several hours.

Regarding QAT, minor modifications were also made to make it functional, but for now it only works on small models.

Changes

Here is an exhaustive list of the changes made to the source files:

Support for Leaky ReLUs in PTQ
Support for the CUDA backend in PTQ
Fix for the CUDA backend of the LSQ/FixedQ nodes
Support for the CUDA backend in the QAT routines
Addition of a recipe submodule to the Quantization module
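
For readers unfamiliar with the LSQ nodes mentioned above, here is a minimal sketch of what a learned-step-size quantizer computes in its forward pass: scale by the step size, clip to the integer range, round, then rescale. This is an illustration only and is not taken from this MR. The function name `lsq_quantize` and its parameters are hypothetical, it is not the aidge_quantization API, and the gradient with respect to the step size (the part that makes the step "learned") is omitted.

```python
def lsq_quantize(values, step, n_bits=8):
    """Fake-quantize floats with a fixed step size over a signed integer range.

    Hypothetical helper for illustration; not part of aidge_quantization.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Signed range for n_bits, e.g. [-8, 7] for 4 bits.
    q_min = -(2 ** (n_bits - 1))
    q_max = 2 ** (n_bits - 1) - 1
    out = []
    for v in values:
        scaled = v / step                           # express the value in units of the step
        clipped = max(q_min, min(q_max, scaled))    # saturate to the representable range
        # Python's round() rounds exact halves to the nearest even integer.
        out.append(round(clipped) * step)           # snap to the grid, back to float scale
    return out


if __name__ == "__main__":
    print(lsq_quantize([0.26, -10.0, 10.0], step=0.5, n_bits=4))
```

With a 4-bit range and a step of 0.5, values outside [-4.0, 3.5] saturate at those bounds, and values inside are snapped to the nearest multiple of 0.5.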

TODO

This MR does not fully enable QAT, which for now only works on small models and datasets.

Future work will also provide unit tests for QAT.
