Optimizing data traversal on Tensor V2
As mentioned here: https://gitlab.eclipse.org/eclipse/aidge/aidge_backend_cpu/-/blame/fix/TensorPIMPL/include/aidge/backend/cpu/data/TensorImpl.hpp#L150, data traversal can be optimized on Tensor V2 when some active area dimensions are the same as the corresponding full tensor dimensions.
The algorithm consists in computing the largest subtensor that is identical in the active area and the full area. This can be done by multiplying the dimensions from the last one to the first one, for both the active area and the full area, until the values differ.
The last dimension for which both subtensor sizes are the same determines which subtensor data can be traversed without jumps, and thus with a fast loop.
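A minimal sketch of this computation, assuming row-major layout and `std::vector<std::size_t>` dimension vectors (the name `contiguousChunkSize` is illustrative, not part of the actual API):

```cpp
#include <cstddef>
#include <vector>

// Illustrative helper (name and signature are assumptions): number of elements
// in the largest trailing subtensor that is identical between the active area
// and the full tensor, i.e. the size of one contiguous chunk.
std::size_t contiguousChunkSize(const std::vector<std::size_t>& fullDims,
                                const std::vector<std::size_t>& activeDims)
{
    std::size_t chunk = 1;
    // Walk the dimensions from the last (innermost) to the first (outermost),
    // multiplying as long as the active and full extents match.
    for (std::size_t i = fullDims.size(); i-- > 0;) {
        if (fullDims[i] != activeDims[i]) {
            break;
        }
        chunk *= fullDims[i];
    }
    return chunk;
}
```

For example, a full tensor of dims {2, 3, 4, 5} with an active area of dims {2, 2, 4, 5} gives a chunk of 4 × 5 = 20 elements that can be traversed with a single fast inner loop.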
In the link above, it would determine the largest possible memcpy for this extract.
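The sketch below shows how such an extract could use one memcpy per contiguous chunk; all names are assumptions (not the actual TensorImpl API), and `firstIdx` is assumed to be the coordinate of the active area's first element within the full tensor:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative sketch only: extract the active area of a dense, row-major
// float tensor into a packed destination buffer, one memcpy per chunk.
void extractActiveArea(const float* src, float* dst,
                       const std::vector<std::size_t>& fullDims,
                       const std::vector<std::size_t>& activeDims,
                       const std::vector<std::size_t>& firstIdx)
{
    const std::size_t nbDims = fullDims.size();

    // Same computation as above: trailing dimensions that match between the
    // active area and the full tensor form one contiguous chunk.
    std::size_t contiguousFrom = nbDims;
    std::size_t chunk = 1;
    while (contiguousFrom > 0
           && fullDims[contiguousFrom - 1] == activeDims[contiguousFrom - 1]) {
        --contiguousFrom;
        chunk *= fullDims[contiguousFrom];
    }

    // Iterate over the remaining (outer) active-area coordinates, copying one
    // contiguous chunk per iteration.
    std::vector<std::size_t> coord(contiguousFrom, 0);
    float* out = dst;
    bool done = false;
    do {
        // Flat, row-major offset of the current chunk in the full tensor.
        std::size_t offset = 0;
        for (std::size_t d = 0; d < nbDims; ++d) {
            const std::size_t c = (d < contiguousFrom) ? coord[d] : 0;
            offset = offset * fullDims[d] + firstIdx[d] + c;
        }
        std::memcpy(out, src + offset, chunk * sizeof(float));
        out += chunk;

        // Advance the outer coordinates; stop once they all wrap around.
        done = true;
        for (std::size_t d = contiguousFrom; d-- > 0;) {
            if (++coord[d] < activeDims[d]) { done = false; break; }
            coord[d] = 0;
        }
    } while (!done);
}
```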
This optimization can be used by all operators that support restriction to the active area (by default, using an operator on a tensor with a restricted active area is UB).