Draft: [issue 24] Refactoring kernels
Context
To optimize ONNX operators for ARM embedded targets, they need to be refactored. Following this related issue, the following kernels have been reworked:
- Add
- Atan
- BatchNorm
- Concat
- Div
- MatMul
- Mul
- ReLU
- Reshape
- Sigmoid
- Slice
- Softmax
- Sub
Modifications
- For the kernels listed above, the associated aidge_kernel.h has been created. These files use C++ templates and, where needed, factor conditionals out of loops to lighten the MCU's workload.
- The results of all new implementations have been compared and validated against Python NumPy and the legacy kernels as references.
- A standalone benchmark between the legacy and refactored kernels, measuring the execution-time delta, has been run on an STM32H7 (Cortex-M7) DISCO board.
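As an illustration of the kind of change described in this list, here is a minimal sketch, not the code from this merge request. It assumes that "factoring conditionals" means hoisting a per-element test out of the inner loop. The names `legacy_add`, `sketch_add` and `matches_legacy`, the scalar-broadcast case, and the 1e-6 tolerance are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical legacy-style kernel: the broadcast test runs on every iteration.
static void legacy_add(const float* a, const float* b, float* out,
                       std::size_t size, std::size_t b_size)
{
    for (std::size_t i = 0; i < size; ++i) {
        out[i] = a[i] + ((b_size == 1) ? b[0] : b[i]);
    }
}

// Hypothetical refactored kernel: types are template parameters and the
// broadcast test is done once, leaving two loops without a per-element branch.
template <typename T, typename Size_T>
__attribute__((always_inline)) inline static
void sketch_add(const T* __restrict a, const T* __restrict b,
                T* __restrict out, Size_T size, Size_T b_size)
{
    if (b_size == 1) {
        const T scalar = b[0];
        for (Size_T i = 0; i < size; ++i) {
            out[i] = a[i] + scalar;
        }
    } else {
        for (Size_T i = 0; i < size; ++i) {
            out[i] = a[i] + b[i];
        }
    }
}

// Validation in the spirit of the comparison described above: the legacy
// output is the reference, checked within a small absolute tolerance.
static bool matches_legacy(const float* a, const float* b,
                           std::size_t size, std::size_t b_size)
{
    float ref[8];
    float out[8];
    assert(size <= 8);  // the scratch buffers of this sketch hold 8 values
    legacy_add(a, b, ref, size, b_size);
    sketch_add<float, std::size_t>(a, b, out, size, b_size);
    for (std::size_t i = 0; i < size; ++i) {
        if (std::fabs(ref[i] - out[i]) > 1e-6f) {
            return false;
        }
    }
    return true;
}
```

Calling `matches_legacy(a, b, 4, 4)` checks the element-wise case and `matches_legacy(a, s, 4, 1)` checks the scalar-broadcast case. Because of `__restrict`, the output buffer must not overlap either input.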
TODO
It could be useful to add a Python script, integrated with this benchmark ticket (Add Maxence benchmark ticket url), to make these benchmarks more scalable and faster to run if more kernels need to be reworked in the future.
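Whatever script ends up driving the benchmarks, the per-kernel measurement could be a single reusable helper, so that adding a kernel only means adding one call. The sketch below is a host-side illustration using `std::chrono`. The name `mean_time_ns` is hypothetical, and on the board itself a hardware time source would likely replace the clock used here (an assumption, not something this merge request states).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `kernel` `repeats` times and returns the mean
// duration of one call in nanoseconds. `repeats` must be greater than zero.
template <typename Kernel>
double mean_time_ns(Kernel&& kernel, std::size_t repeats)
{
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < repeats; ++i) {
        kernel();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(repeats);
}
```

A legacy kernel and its refactored version can each be wrapped in a lambda and passed to this helper with the same `repeats`; the difference between the two results is the execution-time delta.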
Activity
assigned to @vbaudelet
requested review from @pineapple
template <
    typename T,
    typename Dim_T,
    typename Size_T
>
__attribute__((always_inline)) inline static
void aidge_matmul(T* __restrict input_a,
                  T* __restrict input_b,
                  T* __restrict output,
                  Dim_T* __restrict dim_a,

template <
    typename T,
    typename MeanVar_T,
    typename ScaleBias_T,
    typename SpatialDims_T,
    unsigned int NB_Channels,
    unsigned int NB_SpatialDims
>
__attribute__((always_inline)) inline static
void aidge_batchnorm(T* __restrict inputs,
                     T* __restrict outputs,
                     MeanVar_T* __restrict input_mean,
                     MeanVar_T* __restrict input_var,
                     ScaleBias_T* __restrict scale,
                     ScaleBias_T* __restrict bias,
                     SpatialDims_T* __restrict spatial_dims,

#include <math.h>

#define MAX_DIMS_AXIS_SIZE 128 /** TODO : is 128 enough or to big ? | Other possibility is to use a shared buffer as param, but this could have a side effect on Aidge's overall mechanics **/
float exps[MAX_DIMS_AXIS_SIZE];

template <
    typename T,
    typename Dim_T,
    typename Size_T
>
__attribute__((always_inline)) inline static
void aidge_softmax(T* __restrict input,
                   T* __restrict output,
                   Dim_T* __restrict dims,

The following arguments should be template parameters.
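To make the suggestion concrete, here is a minimal sketch of a softmax whose sizes are template parameters. The excerpt does not show which arguments the reviewer meant, so this assumes the dimension sizes. The name `sketch_softmax`, the row-major `[NB_Rows, AxisSize]` layout, and softmax over the last axis are all hypothetical simplifications. With the axis size known at compile time, the exponentials can be written straight into the output, so the global `exps` buffer and the `MAX_DIMS_AXIS_SIZE` question go away in this variant.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical variant: the number of rows and the axis size are template
// parameters, so loop bounds are compile-time constants.
template <typename T, std::size_t NB_Rows, std::size_t AxisSize>
__attribute__((always_inline)) inline static
void sketch_softmax(const T* __restrict input, T* __restrict output)
{
    static_assert(AxisSize > 0, "softmax needs at least one element per row");
    assert(input != nullptr && output != nullptr);

    for (std::size_t r = 0; r < NB_Rows; ++r) {
        const T* in = input + r * AxisSize;
        T* out = output + r * AxisSize;

        // Subtract the row maximum before exponentiating, for numerical stability.
        T max_val = in[0];
        for (std::size_t i = 1; i < AxisSize; ++i) {
            if (in[i] > max_val) {
                max_val = in[i];
            }
        }

        // Store the exponentials directly in the output row and accumulate their sum.
        T sum = T(0);
        for (std::size_t i = 0; i < AxisSize; ++i) {
            out[i] = std::exp(in[i] - max_val);
            sum += out[i];
        }

        // Normalize so that each row sums to one.
        for (std::size_t i = 0; i < AxisSize; ++i) {
            out[i] /= sum;
        }
    }
}
```

The trade-off is that every distinct shape instantiates its own copy of the function, which may matter for flash usage on the MCU. Because of `__restrict`, the kernel cannot be run in place.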
Edited by Olivier BICHLER
added StatusInactive label