Model modification for CPP export after quantization
## Required prerequisites

- [x] Make sure you've read the documentation. Your issue may be addressed there.
- [x] Search the issue tracker and discussions to verify that this hasn't already been reported. +1 or comment there if it has.
## What commit version of aidge do you use

- `aidge_core`: dev
## Problem description
I am comparing the CPP export of a network in float32 with its export after int8 quantization.

`ResNet_CIFAR100.onnx` <-- This is the network

It is a simplified ResNet with 2 residual blocks.

When I export the network in float there is no problem, but when I export it after int8 quantization, 2 problems appear:

- the number of outputs is changed,
- some activation layers are not fused.

These errors only appear for a residual network.
### Problem 2: some activation layers are not fused
Here is the structure of the exported float network, from the `forward.cpp` file:
```cpp
convolution_forward<_0_PADCONVACT_0_INPUT_0_NB_CHANNELS,
convolution_forward<_0_PADCONVACT_0_OUTPUT_0_NB_CHANNELS,
pooling_forward<_1_PADCONVACT_1_OUTPUT_0_NB_CHANNELS,
convolution_forward<_2_MAXPOOLING2D_0_OUTPUT_0_NB_CHANNELS,
convolution_forward<_3_PADCONVACT_2_OUTPUT_0_NB_CHANNELS,
elemwise_forward<_5_ADD_0_NB_MAT,
convolution_forward<_5_ADD_0_OUTPUT_0_NB_CHANNELS,
pooling_forward<_6_PADCONVACT_4_OUTPUT_0_NB_CHANNELS,
convolution_forward<_7_MAXPOOLING2D_1_OUTPUT_0_NB_CHANNELS,
pooling_forward<_8_PADCONVACT_5_OUTPUT_0_NB_CHANNELS,
convolution_forward<_9_MAXPOOLING2D_2_OUTPUT_0_NB_CHANNELS,
convolution_forward<_10_PADCONVACT_6_OUTPUT_0_NB_CHANNELS,
elemwise_forward<_12_ADD_1_NB_MAT,
pooling_forward<_12_ADD_1_OUTPUT_0_NB_CHANNELS,
fullyconnected_forward<_13_MAXPOOLING2D_3_OUTPUT_0_NB_CHANNELS,
```
This is coherent with the image above. But here is the structure extracted from the `forward.cpp` file of the quantized CPP export:
```cpp
convolution_forward<_0_PADCONVACT_0_INPUT_0_NB_CHANNELS,
convolution_forward<_0_PADCONVACT_0_OUTPUT_0_NB_CHANNELS,
pooling_forward<_1_PADCONVACT_1_OUTPUT_0_NB_CHANNELS,
activation_forward<_3_QMUL_0_NB_ELTS,              // <-- not expected
convolution_forward<_2_MAXPOOLING2D_0_OUTPUT_0_NB_CHANNELS,
convolution_forward<_4_PADCONVACT_2_OUTPUT_0_NB_CHANNELS,
elemwise_forward<_6_QADD_0_NB_MAT,
convolution_forward<_6_QADD_0_OUTPUT_0_NB_CHANNELS,
pooling_forward<_7_PADCONVACT_4_OUTPUT_0_NB_CHANNELS,
convolution_forward<_8_MAXPOOLING2D_1_OUTPUT_0_NB_CHANNELS,
pooling_forward<_9_PADCONVACT_5_OUTPUT_0_NB_CHANNELS,
convolution_forward<_10_MAXPOOLING2D_2_OUTPUT_0_NB_CHANNELS,
activation_forward<_12_QMUL_1_NB_ELTS,             // <-- not expected
convolution_forward<_11_PADCONVACT_6_OUTPUT_0_NB_CHANNELS,
elemwise_forward<_14_QADD_1_NB_MAT,
pooling_forward<_14_QADD_1_OUTPUT_0_NB_CHANNELS,
fullyconnected_forward<_15_MAXPOOLING2D_3_OUTPUT_0_NB_CHANNELS,
```
### Problem 1: the number of outputs is changed
Here is the beginning and end of the `forward.cpp` of the float export:
```cpp
void model_forward(const float* _0_PadConvAct_0_input_0, float** _14_FC_0_output_0_ptr)
{
    ...
    *_14_FC_0_output_0_ptr = _14_FC_0_output_0;
}
```
and the same for the quantized network:
```cpp
void model_forward(const int8_t* _0_PadConvAct_0_input_0, int8_t** _16_QFC_0_output_0_ptr, int64_t** _2_MaxPooling2D_0_output_1_ptr, int64_t** _10_MaxPooling2D_2_output_1_ptr, int64_t** _15_MaxPooling2D_3_output_1_ptr)
{
    ...
    *_16_QFC_0_output_0_ptr = _16_QFC_0_output_0;
    *_2_MaxPooling2D_0_output_1_ptr = _2_MaxPooling2D_0_output_1;
    *_10_MaxPooling2D_2_output_1_ptr = _10_MaxPooling2D_2_output_1;
    *_15_MaxPooling2D_3_output_1_ptr = _15_MaxPooling2D_3_output_1;
}
```
The float network has only 1 output, whereas the quantized network now has 4 outputs. The same appears in the `main.cpp` file:
```cpp
#include <iostream>

#include "forward.hpp"
#include "data/_0_PadConvAct_0_input_0.h"
#include "data/labels.h"

int main()
{
    // Initialize the output arrays
    int8_t* _16_QFC_0_output_0 = nullptr;
    int64_t* _2_MaxPooling2D_0_output_1 = nullptr;
    int64_t* _10_MaxPooling2D_2_output_1 = nullptr;
    int64_t* _15_MaxPooling2D_3_output_1 = nullptr;

    // Call the forward function
    model_forward(_0_PadConvAct_0_input_0, &_16_QFC_0_output_0, &_2_MaxPooling2D_0_output_1, &_10_MaxPooling2D_2_output_1, &_15_MaxPooling2D_3_output_1);

    // Print the results
    int prediction;
    int confidence;

    prediction = 0;
    confidence = _16_QFC_0_output_0[0];
    for (int o = 0; o < 100; ++o) {
        if (_16_QFC_0_output_0[o] > confidence) {
            prediction = o;
            confidence = _16_QFC_0_output_0[o];
        }
    }
    printf("Prediction out#0: %d (%d)\n", prediction, confidence);
    printf("Label out#0: %d\n", labels[0]);

    prediction = 0;
    confidence = _2_MaxPooling2D_0_output_1[0];
    for (int o = 0; o < 100; ++o) {
        if (_2_MaxPooling2D_0_output_1[o] > confidence) {
            prediction = o;
            confidence = _2_MaxPooling2D_0_output_1[o];
        }
    }
    printf("Prediction out#1: %d (%d)\n", prediction, confidence);
    printf("Label out#1: %d\n", labels[1]);

    prediction = 0;
    confidence = _10_MaxPooling2D_2_output_1[0];
    for (int o = 0; o < 100; ++o) {
        if (_10_MaxPooling2D_2_output_1[o] > confidence) {
            prediction = o;
            confidence = _10_MaxPooling2D_2_output_1[o];
        }
    }
    printf("Prediction out#2: %d (%d)\n", prediction, confidence);
    printf("Label out#2: %d\n", labels[2]);

    prediction = 0;
    confidence = _15_MaxPooling2D_3_output_1[0];
    for (int o = 0; o < 100; ++o) {
        if (_15_MaxPooling2D_3_output_1[o] > confidence) {
            prediction = o;
            confidence = _15_MaxPooling2D_3_output_1[o];
        }
    }
    printf("Prediction out#3: %d (%d)\n", prediction, confidence);
    printf("Label out#3: %d\n", labels[3]);

    return 0;
}
```