Onnx operators QlinearConv, QuantizeLinear and DequantizeLinear import (!90) · Merge requests · Eclipse Projects / aidge / aidge_onnx

Lucas Lopez requested to merge lucaslopez/aidge_onnx_ll:qlinearconv_import into dev Nov 15, 2024

Context

ONNX quantization uses the QLinearConv, QuantizeLinear and DequantizeLinear operators to represent quantized models. See the ONNX Runtime quantization description: https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html

QuantizeLinear is an operator that quantizes its input tensor, see:
https://onnx.ai/onnx/operators/onnx__QuantizeLinear.html
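
As a rough illustration of what QuantizeLinear computes, here is a minimal NumPy sketch of per-tensor quantization to uint8. The function name `quantize_linear` and the example values are hypothetical and are not part of this merge request. The sketch assumes the behaviour described in the ONNX operator page: divide by the scale, round half to even, add the zero point, then saturate to the target range.

```python
import numpy as np

def quantize_linear(x, scale, zero_point, dtype=np.uint8):
    """Illustrative per-tensor quantization (hypothetical helper, not Aidge code)."""
    info = np.iinfo(dtype)
    # np.rint rounds halfway cases to the nearest even integer
    q = np.rint(np.asarray(x, dtype=np.float32) / np.float32(scale)) + int(zero_point)
    # Saturate to the representable range of the target integer type
    return np.clip(q, info.min, info.max).astype(dtype)

x = np.array([-1.0, 0.0, 0.75, 1.25, 100.0], dtype=np.float32)
q = quantize_linear(x, scale=0.5, zero_point=128)
print(q)  # [126 128 130 130 255]
```

Note that 0.75 / 0.5 = 1.5 and 1.25 / 0.5 = 2.5 both round to 2 under round-half-to-even, and that 100.0 saturates at 255.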

DequantizeLinear is an operator that dequantizes its input tensor, see:
https://onnx.ai/onnx/operators/onnx__DequantizeLinear.html

QLinearConv is ONNX's quantized convolution, see:
https://onnx.ai/onnx/operators/onnx__QLinearConv.html

These operators are not present in the Aidge representation, so they are replicated as metaoperators built from basic operations.
For example: DequantizeLinear Metaop = Sub -> Cast -> Mul
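
The Sub -> Cast -> Mul decomposition can be sketched in NumPy as follows. This is only an illustration of the arithmetic, not the Aidge metaoperator itself: the function name `dequantize_linear` is hypothetical, and widening to int32 before the subtraction is an assumption made here to avoid uint8 wrap-around.

```python
import numpy as np

def dequantize_linear(x_q, scale, zero_point):
    """Illustrative Sub -> Cast -> Mul chain (hypothetical helper, not Aidge code)."""
    # Sub: remove the zero point (widened to int32 here, an assumption of this sketch)
    shifted = x_q.astype(np.int32) - np.int32(zero_point)
    # Cast: move from the integer domain to floating point
    as_float = shifted.astype(np.float32)
    # Mul: apply the scale to recover approximate real values
    return as_float * np.float32(scale)

x_q = np.array([126, 128, 130, 255], dtype=np.uint8)
x = dequantize_linear(x_q, scale=0.5, zero_point=128)
print(x)  # [-1.   0.   1.  63.5]
```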

This merge request is related to #42

This merge request depends on aidge_backend_cpu's merge request !108 and aidge_core's merge request !251

Files added

dequantizelinear.py, qlinearconv.py and quantizelinear.py: new files that import the operators and create the corresponding Aidge metaoperators;
onnx_import.py: fixed the import of initializer data types;
generic_export.py: added a RuntimeError raised when a GenericOperator is encountered while enable_custom_op is False.

Possible differences between onnx and aidge

During testing, some differences were observed between Aidge's and ONNX's QLinearConv outputs.
These differences result from the following:

ONNX Runtime session configuration: ONNX Runtime may optimize, accelerate or otherwise modify the original model. These modifications can give drastically different results from the raw model. For testing purposes, ONNX Runtime optimization was turned off.
Implementation differences + rounding operator: Aidge and ONNX Runtime may implement some operators differently, which can cause very small differences in outputs. These differences are normally negligible, but a rounding operator can amplify them. For example: if Aidge has an output of 15.5 before a rounding operator, it will be rounded up to 16. But if ONNX has an output of 15.49999999999999999.., it will be rounded down to 15, which could cause bigger differences down the line.
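
The amplification effect described above can be reproduced with a short NumPy sketch. The values are illustrative only and do not come from the actual tests: `np.nextafter` is used here to build the largest float64 strictly below 15.5, standing in for a result that differs by a single floating-point step.

```python
import numpy as np

a = np.float64(15.5)        # e.g. the value produced by one implementation
b = np.nextafter(a, 0.0)    # largest float64 strictly below 15.5

diff_before = a - b                   # about 1.8e-15, negligible
diff_after = np.rint(a) - np.rint(b)  # 16.0 - 15.0 = 1.0 after rounding

print(diff_before, diff_after)
```

A difference of one unit in the last place before rounding becomes a full quantization step afterwards, which later layers can then propagate.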

Edited Nov 15, 2024 by Lucas Lopez
