[Fix] Export results differ too much from the backend cpu
Context
This issue points out a large difference between the aidge_backend_cpu results and the aidge_export_cpp ones, which has to be reduced.
This MR fixes the differences observed in the quantized exports, but a difference remains in the results of the fp32 exports.
| Model    | fp32 | int8 |
|----------|------|------|
| LeNet    |      |      |
| ResNet18 |      |      |
In the table above, validation is done using the `aidge_cmp` option, which compares the results of the `aidge_backend_cpu` and `aidge_export_cpp` kernels with an absolute tolerance of 10^-5 (for float exports).
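A minimal sketch of the kind of element-wise check `aidge_cmp` performs is shown below; the function name and signature are illustrative, not the actual aidge_export_cpp API.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Illustrative tolerance check: compares the aidge_backend_cpu reference
// against the exported kernel output, element by element, with an
// absolute tolerance (1e-5 for float exports).
bool tensors_match(const float* reference, const float* exported,
                   std::size_t size, float atol = 1e-5f) {
    for (std::size_t i = 0; i < size; ++i) {
        if (std::fabs(reference[i] - exported[i]) > atol) {
            std::printf("Mismatch at %zu: %f vs %f\n",
                        i, reference[i], exported[i]);
            return false;
        }
    }
    return true;
}
```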
Changes
Fix
- The input tensor was not properly transposed, leading to incoherent results;
- The `AddAct` meta-operator was always inheriting from the `QElemWise` one, so it always used the `int32_t` accumulation type, leading to wrong results in the fp32 export (see the accumulation sketch after this list);
- The regex recipes were not correctly defined for the Conv DW-based meta-operators;
- The default bias type was set to `float32_t` instead of `float`;
- The convolution kernel was not working properly with a dilation of 2 (see the indexing sketch after this list);
- The CI was broken by a PyTorch update.
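To illustrate the `AddAct` accumulation issue: inheriting the quantized element-wise behaviour forces an `int32_t` accumulator, which truncates fp32 values before they are summed. The sketch below is hypothetical, not the generated kernel code; only the accumulation-type problem it demonstrates comes from this MR.

```cpp
#include <cstdint>

// Former behaviour: a fixed int32_t accumulator truncates fp32 inputs,
// e.g. add_act_wrong(0.3f, 0.4f) returns 0.0f instead of 0.7f.
float add_act_wrong(float a, float b) {
    int32_t acc = static_cast<int32_t>(a) + static_cast<int32_t>(b);
    return static_cast<float>(acc < 0 ? 0 : acc);  // ReLU on a truncated sum
}

// Parameterizing the accumulation type preserves fp32 precision:
// Acc_T is int32_t for int8 exports, float for fp32 exports.
template <typename Acc_T, typename IO_T>
IO_T add_act(IO_T a, IO_T b) {
    Acc_T acc = static_cast<Acc_T>(a) + static_cast<Acc_T>(b);
    return static_cast<IO_T>(acc < Acc_T(0) ? Acc_T(0) : acc);  // ReLU
}
```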
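As for the dilation bug: with a dilation of 2, each kernel tap must skip every other input element. A minimal 1D sketch of the correct indexing, simplified from what a real export kernel looks like:

```cpp
// 1D convolution with dilation: tap k reads input at i + k * dilation,
// not i + k. The effective kernel span is dilation * (kernel_size - 1) + 1.
void conv1d_dilated(const float* input, int in_size,
                    const float* weights, int kernel_size,
                    int dilation, float* output) {
    const int out_size = in_size - dilation * (kernel_size - 1);
    for (int i = 0; i < out_size; ++i) {
        float acc = 0.0f;
        for (int k = 0; k < kernel_size; ++k) {
            acc += weights[k] * input[i + k * dilation];
        }
        output[i] = acc;
    }
}
```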
Improvement
- The `aidge_cmp` headers are no longer compiled when the `AIDGE_CMP` flag is not set to `true`, reducing the compilation time (see the sketch below).
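The compile-time guard could look like the following sketch; the header path and macro names are hypothetical, as the actual wiring in the generated export may differ.

```cpp
// The aidge_cmp checks only exist in the binary when the export is
// compiled with AIDGE_CMP enabled; otherwise the header is skipped
// entirely and the check calls compile to nothing.
#if AIDGE_CMP
#include "utils/aidge_cmp.hpp"   // hypothetical header name
#define AIDGE_CMP_CHECK(ref, out, size) aidge_cmp(ref, out, size)
#else
#define AIDGE_CMP_CHECK(ref, out, size) ((void)0)
#endif
```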