Draft: [issue 24] Refactoring kernels
Context
To optimize ONNX operators for ARM embedded targets, their kernels need to be refactored. As described in this related issue, the following kernels have been reworked:
- Add
- Atan
- BatchNorm
- Concat
- Div
- MatMul
- Mul
- ReLU
- Reshape
- Sigmoid
- Slice
- Softmax
- Sub
Modifications
- For each of the kernels listed above, the associated aidge_kernel.h has been created. These files use C++ templates and, where needed, factor conditionals out of loops to lighten the MCU's workload.
- The results of all new implementations have been compared against, and validated with, Python NumPy and the legacy kernels as references.
- A standalone benchmark comparing the legacy kernels with the refactored kernels, measuring the execution time delta, has been run on an STM32H7 Cortex-M7 DISCO board.
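As an illustration of the template approach described in the first point, here is a minimal sketch of how a conditional can be moved out of the inner loop and resolved at compile time. It is not taken from the actual headers: the names `Activation` and `sketch_add` are hypothetical, and it assumes a C++17 toolchain (for `if constexpr`), which may not match the project's build settings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; they do not come from the
// actual aidge kernel headers.
enum class Activation { None, ReLU };

// Element-wise addition with an optional fused activation.
// The activation is a template parameter, so the choice is made at
// compile time instead of being tested on every loop iteration.
template <Activation ACT, typename T>
void sketch_add(const T* a, const T* b, T* out, std::size_t size)
{
    assert(a != nullptr && b != nullptr && out != nullptr);

    for (std::size_t i = 0; i < size; ++i) {
        T v = a[i] + b[i];

        // Assumes C++17: this branch is discarded at compile time when
        // ACT is Activation::None, leaving no runtime test in the loop.
        if constexpr (ACT == Activation::ReLU) {
            v = (v > T(0)) ? v : T(0);
        }

        out[i] = v;
    }
}
```

A caller would select the variant explicitly, for example `sketch_add<Activation::ReLU>(a, b, out, n)`, and each instantiation yields a loop specialized for that configuration, at the cost of some extra code size per instantiated variant.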
TODO
It could be useful to add a Python script, integrated into this [Add Maxence benchmark ticket url], to make these benchmarks more scalable and faster to run if more kernels need to be reworked in the future.