ONNX operators QLinearConv, QuantizeLinear and DequantizeLinear import
Context
ONNX quantization uses the QLinearConv, QuantizeLinear and DequantizeLinear operators to represent quantized models. See the ONNX Runtime quantization documentation: https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html
QuantizeLinear is an operator that quantizes its input tensor, see:
https://onnx.ai/onnx/operators/onnx__QuantizeLinear.html
DequantizeLinear is an operator that dequantizes its input tensor, see:
https://onnx.ai/onnx/operators/onnx__DequantizeLinear.html
QLinearConv is ONNX's quantized convolution, see:
https://onnx.ai/onnx/operators/onnx__QLinearConv.html
These operators are not present in the Aidge representation, so they are replicated using basic operations.
For example: DequantizeLinear Metaop = Sub -> Cast -> Mul
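The decomposition above can be sketched in plain Python. This is only an illustration of the arithmetic, not the Aidge implementation: the function names, the per-tensor (scalar) scale and zero point, and the uint8 range are assumptions. The quantization formula follows the ONNX operator definition, `saturate(round(x / scale) + zero_point)`, with round-half-to-even.

```python
def dequantize_linear(x_q, scale, zero_point):
    """Per-tensor dequantization written as Sub -> Cast -> Mul."""
    out = []
    for q in x_q:
        shifted = q - zero_point      # Sub: remove the zero point (integer domain)
        as_float = float(shifted)     # Cast: integer -> floating point
        out.append(as_float * scale)  # Mul: apply the scale
    return out


def quantize_linear(x, scale, zero_point, qmin=0, qmax=255):
    """Per-tensor quantization; qmin/qmax assume a uint8 output."""
    out = []
    for v in x:
        # Python's round() rounds halves to the nearest even integer.
        q = round(v / scale) + zero_point
        out.append(max(qmin, min(qmax, q)))  # saturate to the output range
    return out


quantized = [0, 128, 255]
restored = dequantize_linear(quantized, scale=0.5, zero_point=128)
print(restored)                                              # [-64.0, 0.0, 63.5]
print(quantize_linear(restored, scale=0.5, zero_point=128))  # [0, 128, 255]
```

The values in this example are exactly representable, so the round trip returns the original integers. In general, quantization loses information.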
This merge request is related to #42
This merge request depends on aidge_backend_cpu's merge request !108 and aidge_core's merge request !251
Files added and modified
- dequantizelinear.py, qlinearconv.py and quantizelinear.py: added files; import of the operators and creation of the Aidge metaoperators;
- onnx_import.py: fixed the import of initializer datatypes;
- generic_export.py: added a RuntimeError raised when a GenericOperator is encountered while enable_custom_op is False;
Possible differences between onnx and aidge
During testing, some differences were observed between Aidge's and ONNX's QLinearConv outputs.
These differences are the result of the following:
- ONNX Runtime session configuration: ONNX Runtime may optimize, accelerate or otherwise modify the original model. These modifications can give results that differ drastically from those of the raw model. For testing purposes, ONNX Runtime optimization was turned off.
- Implementation differences + rounding operator: Aidge and ONNX Runtime may implement some operators differently, which can cause very small differences in outputs. These differences are normally negligible, but when they are followed by a rounding operator they can be amplified. For example: if Aidge has an output of 15.5 before a rounding operator, it is rounded up to 16, but if ONNX has an output of 15.49999999999999999.., it is rounded down to 15, which can cause bigger differences down the line.
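For the first point above, here is a minimal sketch of how graph optimizations can be turned off when building a reference ONNX Runtime session. The function name `make_reference_session` and the choice of the CPU execution provider are assumptions for illustration. This merge request does not show the exact test configuration that was used.

```python
def make_reference_session(model_path):
    """Build an ONNX Runtime session that runs the model graph as stored."""
    # Imported inside the function so this sketch can be loaded
    # even when onnxruntime is not installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    # Disable every graph-level rewrite (operator fusion, constant folding, ...).
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    return ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=["CPUExecutionProvider"],  # assumption: CPU reference run
    )
```

A session built this way would then be run with `session.run(None, {input_name: input_array})` and its output compared against Aidge's.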
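The rounding effect described in the second point can be reproduced with two values that differ by a single floating-point step. This is a hedged illustration: the variable names and the scale are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math

aidge_value = 15.5
# Largest double strictly below 15.5, standing in for a slightly
# different accumulation order in another implementation.
onnx_value = math.nextafter(15.5, 0.0)

gap_before = aidge_value - onnx_value                # a single floating-point step
gap_after = round(aidge_value) - round(onnx_value)   # 16 - 15

print(gap_before < 1e-14)  # True
print(gap_after)           # 1

# After dequantization the gap is a full quantization step.
scale = 0.05  # hypothetical output scale
print(gap_after * scale == scale)  # True
```

A difference far below any reasonable tolerance before rounding becomes a full quantization step afterwards, and later layers carry that step forward.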