Rounding issue with singleshift mode in ScalingImpl_cpu_forward_kernel
What commit version of aidge do you use
- aidge_core: allowNoInputProducer: commit 7776fc6a421d8d07780a822b17bb9153e38c843e
- aidge_backend_cpu: allowNoInputProducer: commit 684f7bc6
- aidge_quantization: DevQAT: commit b431c59670e1ca4b41680d8b544718262b593924
- aidge_onnx: allowNoInputProducer: commit b97899532c78f24c7b0c0e4cf8fa2a452dcfb9e7
Context description
I export models to an 8-bit architecture. To match the target data resolution, I quantize the network to 8 bits in single-shift mode.
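The snippets below assume the model and its input dimensions are already available; here is a sketch of that setup (the loading call, the quantize_network import and the dimensions are placeholders on my side, not necessarily what my script does):
import numpy as np
import aidge_core
import aidge_onnx
from aidge_quantization import quantize_network

model = aidge_onnx.load_onnx("model.onnx")  # attached model
input_dims = (3, 224, 224)                  # placeholder, not the real dimensions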
# Configuration for the model
model.set_datatype(aidge_core.dtype.float32)
model.set_backend("cpu")
model.compile("cpu", aidge_core.dtype.float32, dims=[[1, input_dims[0], input_dims[1], input_dims[2]]])
# Create Aidge subset for PTQ
nb_calib = 10
calib_aidge_tensors = []
for i in range(nb_calib):
    input_data = np.random.rand(1, input_dims[0], input_dims[1], input_dims[2]).astype(np.float32)
    input_tensor = aidge_core.Tensor(input_data)
    calib_aidge_tensors.append(input_tensor)
# Quantize model (via PTQ)
quantize_network(model, 8, calib_aidge_tensors, apply_rounding=True, optimize_signs=False, single_shift=True)
To debug and verify my export, I run the Aidge forward pass as a golden model and compare its results with the execution on my architecture.
# Create golden_log_outputs
scheduler.forward(True, [input_tensor])
os.system("rm -rf golden_log_outputs && mkdir golden_log_outputs")
model.log_outputs(os.getcwd()+"/golden_log_outputs")
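To locate the diverging pixels, I then compare the logged tensors with the dumps from my architecture; a sketch of that comparison (the file names and the plain-text format of the dumps are assumptions, to be adapted):
import numpy as np

golden = np.loadtxt("golden_log_outputs/conv_output.log")  # hypothetical golden dump
pneuro = np.loadtxt("pneuro_outputs/conv_output.log")      # hypothetical dump from the target
diff = golden - pneuro
print("mismatching values:", np.count_nonzero(diff), "max abs diff:", np.abs(diff).max())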
Problem description
I observe rounding errors on some pixels of my convolution outputs due to the scaling approximation. Both the Aidge forward pass and the PNeuro export produce the same final accumulator value. In PNeuro, the scaling nodes added after a convolution are implemented as a right shift of the accumulator value followed by linear saturation. To compensate for the error introduced by the shift, the bias is increased by half of the shift step (see the code below; this is what the N2D2 framework used).
# Half-shift compensation
shift = 10
bias = bias + (1 << (shift-1))
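For reference, the compensation can be checked with plain Python integers; a minimal sketch (shift_scale is just an illustrative name, not code from N2D2 or PNeuro):
def shift_scale(acc, shift):
    # Half-shift compensation followed by an arithmetic right shift,
    # equivalent to floor(acc / 2**shift + 0.5)
    return (acc + (1 << (shift - 1))) >> shift

print(shift_scale(-28160, 10))  # -27 (compensated)
print(-28160 >> 10)             # -28 (a plain shift simply floors -27.5)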
Reproducible example code
Attached files: ScalingImpl_kernels.hpp, model.onnx, export.py
By adding debug output in aidge_backend_cpu/include/aidge/backend/cpu/operator/ScalingImpl_kernels.hpp, I identified that rounding causes the difference between the Aidge forward pass and the PNeuro execution. As shown in the debug trace below, the scaling multiplication ends up at -27.5 (-28160 * 0.000976562 = -27.49998592, which is rounded to -27.5 in float32). The final rounding then produces -28 instead of -27.
# ScalingImpl_cpu_forward_kernel Debug
scalingFactor:0.000976562 quantizedNbBits:8 isOutputUnsigned:0
input[i] = -28160
# output[i] = static_cast<O>(input[i] * static_cast<I>(scalingFactor));
output[i] = -27.5
# output[i] = saturate(std::round(output[i]), quantizedNbBits, isOutputUnsigned);
output[i] = -28
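The float32 path can be reproduced with numpy; this is a sketch under the assumption (not verified) that the stored scalingFactor is exactly 2**-10 and that 0.000976562 is just its 6-significant-digit printout:
import numpy as np

acc = np.float32(-28160.0)
scale = np.float32(2.0 ** -10)  # assumed exact single-shift factor, prints as 0.000976562
out = acc * scale               # float32 multiply as in ScalingImpl_cpu_forward_kernel
print(out)                      # -27.5, exactly representable in float32
print(np.round(out))            # -28.0 (np.round ties to even, std::round ties away
                                # from zero; both give -28 for -27.5)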
PNeuro cannot perform floating-point operations, so we use the singleShift mode. As shown below, the same accumulator is scaled with an integer right shift:
# PNeuro Debug
input[i] = -28160
shift = 10
# Half-shift bias correction
# input[i] += (1 << (shift-1))
input[i] = -27648
# output[i] = input[i] >> shift
output[i] = -27
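I believe this is the root of the discrepancy: the half-shift compensation followed by an arithmetic right shift computes floor(acc / 2**shift + 0.5), i.e. ties are rounded toward +infinity, whereas std::round rounds ties away from zero. The two conventions only disagree on negative exact halves such as -27.5, which is precisely the pixel above. A small Python sketch (pneuro_scale and std_round are illustrative names, not actual code from either project):
import math

def pneuro_scale(acc, shift):
    # Export path: half-shift compensation + arithmetic right shift
    # = floor(acc / 2**shift + 0.5), i.e. ties rounded toward +infinity
    return (acc + (1 << (shift - 1))) >> shift

def std_round(x):
    # Behaviour of std::round: ties rounded away from zero
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

shift = 10
for acc in (28160, -28159, -28160):  # scaled values: +27.5, about -27.499, -27.5
    print(acc, pneuro_scale(acc, shift), std_round(acc / 2**shift))
    # agree, agree, disagree (-27 vs -28) on the negative exact half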
Conclusion
When singleShift is enabled, the forward pass should execute exactly the same arithmetic as the export (half-shift compensation followed by an integer right shift). Otherwise, differences such as the one above can be observed between the Aidge forward results and the export results.
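A minimal Python sketch of the arithmetic I would expect from the scaling forward when single_shift is enabled, so that it matches the export bit-exactly (this is an emulation on my side, not actual Aidge code):
def single_shift_scaling_forward(acc_values, shift, nb_bits=8, unsigned=False):
    # Emulates the export arithmetic: half-shift compensation, arithmetic
    # right shift, then saturation to the quantized range.
    lo, hi = (0, 2 ** nb_bits - 1) if unsigned else (-2 ** (nb_bits - 1), 2 ** (nb_bits - 1) - 1)
    return [min(max((acc + (1 << (shift - 1))) >> shift, lo), hi) for acc in acc_values]

print(single_shift_scaling_forward([-28160], 10))  # [-27], matching the PNeuro result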