I may have overdone things. After reflection, maybe the best idea would have been to let MSE take a tensor as an argument, since we can then set the gradient on that tensor directly instead of passing the whole graph ...
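To illustrate the idea, here is a plain-NumPy sketch (not the Aidge API, names are hypothetical): given only the prediction tensor and the label, the loss function can compute both the scalar MSE and the gradient d(loss)/d(pred) to seed on that tensor, which is all backward propagation needs.

```python
import numpy as np

def mse_with_grad(pred: np.ndarray, label: np.ndarray):
    """Hypothetical helper: return the MSE loss and the gradient to
    seed on the prediction tensor. Shows why receiving the tensor is
    enough: the gradient set here is then propagated by backward()."""
    diff = pred - label
    loss = np.mean(diff ** 2)
    # d(mean((pred - label)**2)) / d(pred) = 2 * (pred - label) / N
    grad = 2.0 * diff / diff.size
    return loss, grad

pred = np.array([[0.2, 0.8], [0.6, 0.4]])
label = np.array([[0.0, 1.0], [1.0, 0.0]])
loss, grad = mse_with_grad(pred, label)
```

The loss function never needs to see the rest of the graph; setting `grad` on the prediction tensor is sufficient to start backpropagation.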
I have refactored the code in this direction; the training loop now looks like:
```python
# Set up the scheduler for learning
scheduler = aidge_core.SequentialScheduler(model)

# Set up the optimizer
opt = aidge_learning.SGD(momentum=0.9)
learning_rates = aidge_learning.constant_lr(0.001)
opt.set_learning_rate_scheduler(learning_rates)
opt.set_parameters(list(aidge_core.producers(model)))

tot_acc = 0
for i, (input, label) in enumerate(aidge_dataprovider):
    input.init_gradient()
    scheduler.forward(data=[input])
    # Really long line, there should be a faster way ...
    pred = list(model.get_output_nodes())[0].get_operator().get_output(0)
    opt.reset_grad()
    loss = aidge_learning.loss.MSE(pred, label)
    acc = np.sum(np.argmax(pred, axis=1) == np.argmax(label, axis=1))
    tot_acc += acc
    scheduler.backward()
    opt.update()
    print(f"Nb samples {(i+1)*BATCH_SIZE}, loss: {loss[0]}, "
          f"acc: {(acc/BATCH_SIZE)*100}%, "
          f"tot_acc: {(tot_acc/((i+1)*BATCH_SIZE))*100}%")
```