Scheduler Backward and Input Tensor Gradient
Context
I'm working on enabling CUDA-based training in AIDGE on any device index.
Important note: this problem only appears when attempting to train a model on the `cuda` backend with a device index different from zero.
Description
At the moment, the scheduler's `backward()` does not handle the case where the input tensor's gradient does not exist. In fact, if we follow the standard training pipeline, this input tensor gradient is never created! As a result, the following loop ends up crashing after the `backward()` call (at the end of the first iteration):
```python
for _ in range(10 ** 6):
    # get samples
    x, y_h = get_batch_pair()
    # forward
    scheduler.forward(True, [x])
    y = get_output_tensor()
    # loss
    l = loss(y, y_h)
    # backward
    scheduler.backward()
    # optimizer
    optimizer.update()  # <- crashes around here!
    optimizer.reset_grad(classifier)
```
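To make the failure mechanism concrete, here is a minimal, self-contained Python sketch. It is not AIDGE code: `Tensor` and `backward_add` are made-up names for illustration. The idea is that the backward kernels write into a gradient buffer directly, so if that buffer was never allocated (only an explicit `grad()` call allocates it lazily), the write fails.

```python
# Illustrative sketch only (not AIDGE code); all names here are hypothetical.

class Tensor:
    def __init__(self, data):
        self.data = data
        self._grad = None          # gradient storage is NOT allocated up front

    def grad(self):
        # Lazily allocate the gradient buffer on first access.
        if self._grad is None:
            self._grad = [0.0] * len(self.data)
        return self._grad

def backward_add(out, inp):
    # A backward step that writes into inp._grad directly:
    # it fails if the gradient buffer was never allocated.
    for i, g in enumerate(out.grad()):
        inp._grad[i] += g          # 'NoneType' error if inp._grad is still None

x = Tensor([1.0, 2.0, 3.0])
y = Tensor([0.0, 0.0, 0.0])
y.grad()                           # the output gradient exists after the loss step
# backward_add(y, x)               # would crash: x._grad was never created
x.grad()                           # the workaround: allocate it explicitly
backward_add(y, x)                 # now succeeds
```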
Workaround
A workaround for this issue is to explicitly create a gradient tensor for the input tensor before calling the `backward()` method. This can be done by simply calling the `grad()` method right after retrieving the input:
```python
x, y_h = get_batch_pair()
x.grad()  # <- this creates the grad tensor and fixes the issue!
...
```
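For graphs with more than one input, the same one-line workaround can be wrapped in a small helper called once per iteration, right before the forward/backward passes. This is only a sketch: `ensure_input_grads` is a hypothetical name, and it relies solely on the `grad()` call shown above.

```python
def ensure_input_grads(inputs):
    # Explicitly create the gradient tensor of every graph input,
    # so that scheduler.backward() finds them all allocated.
    for tensor in inputs:
        tensor.grad()

# usage inside the training loop
x, y_h = get_batch_pair()
ensure_input_grads([x])        # workaround applied to all inputs at once
scheduler.forward(True, [x])
```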
Nevertheless, this is not a very user-friendly solution, so a proper fix is needed!
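A possible direction for that fix, sketched below in Python-like pseudocode, would be for the scheduler's `backward()` to create any missing input gradient itself before traversing the graph. This is only an assumption about the shape of the fix: `graph_inputs()` and `_run_backward_pass()` are hypothetical names, not the actual AIDGE scheduler API.

```python
# Pseudocode sketch of a possible fix inside the scheduler (hypothetical API):
def backward(self):
    for input_tensor in self.graph_inputs():   # hypothetical accessor over graph inputs
        input_tensor.grad()                     # lazily create any missing gradient tensor
    self._run_backward_pass()                   # hypothetical: the existing backward logic
```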