Add support for QLinearConv
The goal of this issue is to add support for QLinearConv Operator https://github.com/onnx/onnx/blob/main/docs/Operators.md#QLinearConv
To do so, I propose to add a function that fuses each Conv + Scaling operator pair into a QLinearConv MetaOperator. Calling this function will be left to the user, depending on how they want to export their quantized network (exporting each operator independently, or using ONNX's QOperator format).
Then we will need to add support for exporting this new MetaOperator.
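As a numerical reference for what the fused MetaOperator must compute, here is a minimal pure-Python sketch of the QLinearConv requantization formula from the ONNX operator spec, reduced to a 1-D dot product instead of a full convolution (function and parameter names are illustrative, not part of any API):

```python
def saturate(v, lo=0, hi=255):
    """Clamp to the uint8 range, as QLinearConv does on its output."""
    return max(lo, min(hi, v))

def qlinear_conv_1x1(x_q, x_scale, x_zp, w_q, w_scale, w_zp, y_scale, y_zp):
    # Integer accumulation (int32 in the spec): subtract zero-points first.
    acc = sum((xq - x_zp) * (wq - w_zp) for xq, wq in zip(x_q, w_q))
    # Requantize: the real-valued result is acc * x_scale * w_scale,
    # which is mapped onto the output's scale and zero-point.
    y = round(acc * (x_scale * w_scale) / y_scale) + y_zp
    return saturate(y)
```

This is only a semantic sketch for checking expected values by hand; the actual MetaOperator would be built from Aidge graph nodes.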
TODO:
Import
- Add support in the import node for the QLinearConv, QuantizeLinear and DequantizeLinear MetaOperators
- Run inference on both the exported model and the ONNX model and verify that the results match (testing with multiple ORT session options may be needed)
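For reference, the per-tensor quantization formulas that the QuantizeLinear and DequantizeLinear MetaOperators must implement (uint8 case, per the ONNX spec) can be sketched in plain Python:

```python
def quantize_linear(x, scale, zero_point):
    # QuantizeLinear: y = saturate(round(x / scale) + zero_point).
    # Python's round() is round-half-to-even, matching the ONNX spec.
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # saturate to uint8

def dequantize_linear(q, scale, zero_point):
    # DequantizeLinear: y = (q - zero_point) * scale
    return (q - zero_point) * scale
```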
Export
- Add/fuse QuantizeLinear and DequantizeLinear MetaOperators
- Fuse into a QLinearConv MetaOperator
- Modify the graph to match ONNX's quantized format
- Add support in the export node for the QLinearConv, QuantizeLinear and DequantizeLinear MetaOperators
- Run inference on both the exported model and the Aidge model and verify that the results match (testing with multiple ORT session options may be needed)
- Add a method to freeze weights and biases so they can be folded into quantized Producers
- Add a method to remove adjacent QuantizeLinear and DequantizeLinear operators (they cancel each other out)
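The removal step is sound because a DequantizeLinear immediately followed by a QuantizeLinear with the same scale and zero-point is the identity on quantized values, so the pair can be dropped from the graph. A minimal sketch of why (uint8, illustrative names):

```python
def quantize(x, scale, zp):
    return max(0, min(255, round(x / scale) + zp))

def dequantize(q, scale, zp):
    return (q - zp) * scale

def dq_then_q(q, scale, zp):
    # DequantizeLinear -> QuantizeLinear with identical (scale, zero_point)
    # maps every valid uint8 value back to itself.
    return quantize(dequantize(q, scale, zp), scale, zp)
```

Note that the cancellation only holds when both operators share the same scale and zero-point; the removal recipe should check this before dropping the pair.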
Verifications
- Compare imported MetaOperators with exported MetaOperators (after the recipe)
- Verify correct export of an imported quantized ONNX model
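One caveat for the inference comparisons above: quantized kernels may legitimately differ by one quantization step (1 LSB) depending on rounding order, so an exact equality check can fail spuriously. A possible comparison helper (pure Python, illustrative; the tolerance choice is an assumption, not an Aidge API):

```python
def outputs_match(a, b, scale):
    # Allow an absolute difference of one output quantization step (1 LSB),
    # i.e. one output scale, between the two dequantized result vectors.
    return all(abs(x - y) <= scale for x, y in zip(a, b))
```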
Note: since the Scaling node is slated for removal, the regex used in `aidge_core.fuse_to_metaops()` may need to change, but this is a minor change.