Backward failure when using DataProvider with drop_last set to False
What commit version of aidge do you use
Current development version of core and CUDA modules
Problem description
When using DataProvider with drop_last set to False, the last batch is smaller than the previous ones whenever the dataset size is not a multiple of the batch size. The tensor dimensions must therefore be updated correctly during training.
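To make the size mismatch concrete, here is a minimal pure-Python sketch of how batch sizes evolve over one epoch. It does not use the Aidge API. The helper name `batch_sizes` and the numbers (100 samples, batch size 32) are hypothetical and chosen only for illustration.

```python
def batch_sizes(num_samples: int, batch_size: int, drop_last: bool) -> list[int]:
    """Return the size of each batch produced over one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # With drop_last=False, the leftover samples form a final, smaller batch.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# Hypothetical numbers, not taken from the tutorial.
print(batch_sizes(100, 32, drop_last=False))  # [32, 32, 32, 4]
print(batch_sizes(100, 32, drop_last=True))   # [32, 32, 32]
```

With drop_last set to False, the final batch here holds 4 samples instead of 32, so every tensor whose first dimension is the batch size changes shape on the last iteration.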
Unfortunately, when scheduler.backward() is called for this last batch, the following CUDA error is generated: "RuntimeError: CUDNN failure: CUDNN_STATUS_NOT_SUPPORTED (9)".
This error occurs in the Aidge::ReLUImpl_cuda::backward_ method, when cudnnActivationBackward is called with the CUDA descriptors / dimensions of the input and output tensors, which may not have been updated correctly for the smaller batch.
Reproducible example code
Use the tutorial notebook /examples/tutorials/Learning/learn.ipynb with the following modifications:
- BACKEND is set to "cuda",
- In the definition of "aidge_dataprovider", "drop_last" is set to "False",
- The break point after the first 5 iterations (at the end of the file) is removed.