Handling of MetaOperators with Memorize Node as output by Scheduling, when building from micrograph
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and discussions to verify that this hasn't already been reported. +1 or comment there if it has.
What commit version of aidge do you use
- aidge_core: dev
Problem description
The problem is best described by the example given below, in the section Reproducible example code.
The problem arises when I try to schedule a meta-operator that has a Memorize node as output, with another node connected to this output. This is the case for LSTM, but the LSTM unit tests do NOT have a node following the LSTM operator, only a Pop node before it.
The problem itself is that the scheduling pass (in particular the getPriorProducersConsumers() function) cannot find prior nodes when a node has a Memorize node of a meta-operator as parent. This leads to the forward pass producing incoherent results or failing, since some nodes are simply never scheduled.
Reproducible example code
Let us, for the sake of clarity, define a small meta-operator node that we will call Accumulate. It adds its input and its previous output at each time step.
// File: Accumulate.cpp
namespace Aidge {

std::shared_ptr<Node> Accumulate(int seqLength, const std::string& name)
{
    // Nodes of the micro-graph
    auto input = Identity((!name.empty()) ? name + "_input" : "");
    auto hiddenState = Memorize(seqLength, (!name.empty()) ? name + "_hidden_state" : "");
    auto add = Add((!name.empty()) ? name + "_add" : "");

    // input -> add -> hiddenState, with hiddenState fed back into add
    input->addChild(add, 0, 0);
    add->addChild(hiddenState, 0, 0);
    hiddenState->addChild(/*otherNode=*/add, /*outId=*/1, /*otherInId=*/1);

    std::shared_ptr<GraphView> microGraph = std::make_shared<GraphView>();
    microGraph->add(input);
    microGraph->add({hiddenState, add});
    microGraph->setOrderedInputs({{input, 0}, {hiddenState, 1}});
    microGraph->setOrderedOutputs({{hiddenState, 0}});

    auto metaOp = MetaOperator("Accumulate", microGraph, {}, name);
    return metaOp;
}

} // namespace Aidge
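For reference, the recurrence I intend Accumulate to implement (each step adds the current input to the previous output) can be sketched in plain C++, independently of Aidge. This is only my reading of the intended behaviour, not what the Aidge graph above is guaranteed to output; the function name accumulateReference and the flattened std::vector<float> representation of a time step are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference for the intended recurrence (not Aidge code):
//   h_t = x_t + h_{t-1}, with h_{-1} = init.
// Each time step is a flattened tensor slice stored as std::vector<float>.
// Returns the hidden state after every time step.
std::vector<std::vector<float>> accumulateReference(
    const std::vector<std::vector<float>>& inputs,
    const std::vector<float>& init)
{
    std::vector<std::vector<float>> states;
    std::vector<float> hidden = init;
    for (const auto& x : inputs) {
        // Every time step must have the same shape as the hidden state.
        assert(x.size() == hidden.size());
        for (std::size_t i = 0; i < hidden.size(); ++i) {
            hidden[i] += x[i];
        }
        states.push_back(hidden);
    }
    return states;
}
```

Called with two steps and a zero initial state, the second returned state is the element-wise sum of both steps. That is the kind of value I would expect to see downstream of the meta-operator if every node were scheduled.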
Now let us write a test section. (Note: if you really want to compile it, make sure to add the declaration of the Accumulate(...) function to the file MetaOperatorsDefs. Since this test section needs an implementation for Add, it has to be written in aidge_backend_cpu.)
SECTION("Issue")
{
    std::shared_ptr<Tensor> Input = std::make_shared<Tensor>(
        Array3D<float, 2, 3, 2>{{{{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}},
                                 {{2.0, 3.0}, {4.0, 5.0}, {6.0, 7.0}}}});
    std::shared_ptr<Tensor> MemInit =
        std::make_shared<Tensor>(Array2D<float, 3, 2>{
            {{0.0, 0.0}, {0.0, 0.0}, {0.0, 0.0}}});

    auto meta = Accumulate(2, "accumualate");
    auto op = std::static_pointer_cast<MetaOperator_Op>(meta->getOperator());
    auto pop_i = Pop("pop_input");
    // NOTE: could be an Identity/Stack/whatever node you want, this is not the problem here
    auto pop_o = Pop("pop_output");

    // Connect pop_input -> micro-graph input and micro-graph output (Memorize) -> pop_output
    pop_i->getOperator()->associateInput(0, Input);
    pop_i->addChild(op->getMicroGraph()->getOrderedInputs()[0].first, 0, 0);
    op->getMicroGraph()->getOrderedOutputs()[0].first->addChild(pop_o, 0, 0);
    op->associateInput(1, MemInit);

    // Build the graph.
    auto myGraph = std::make_shared<GraphView>();
    myGraph->add(pop_i);
    myGraph->add(op->getMicroGraph());
    myGraph->add(pop_o);
    myGraph->compile("cpu", DataType::Float32);

    // Schedule and run
    auto scheduler = SequentialScheduler(myGraph);
    scheduler.generateScheduling();
    scheduler.graphView()->save("scheduled_graph");
    REQUIRE_NOTHROW(scheduler.forward(true));

    // Print output
    std::static_pointer_cast<OperatorTensor>(pop_o->getOperator())->getOutput(0)->print();
}
Even though the code runs without error, we obtain an incoherent result.
What I believe is happening is that the scheduler starts with two output nodes (the Memorize node from Accumulate, and the pop_output node).
This is the relevant part of getPriorProducersConsumers(), as called for the pop_output node:
for (const auto& parent : node->inputs()) {
    if (parent.first) {
        AIDGE_LOG_CONTEXT("Producer node {} (of type {}) output #{}",
            parent.first->name(), parent.first->type(), parent.second);
        if ((node->getOperator()->getNbConsumedData(inputIdx) + node->getOperator()->getNbRequiredData(inputIdx)) >
            parent.first->getOperator()->getNbProducedData(parent.second))
        {
            // the node needs more data than the current parent has provided yet
            if (!mGraphView->inView(parent.first)) {
                // Do not schedule prior outside the current graph!
                // return PriorProducersConsumers(); // not scheduled
                prior.priorConsumers.insert(node);
            }
            else if (parent.first->type() == Producer_Op::Type) {
                prior.requiredProducers.insert(parent.first);
                prior.priorConsumers.insert(node);
            }
            else if (parent.first->type() == Memorize_Op::Type) {
                // Break cycles
                Log::info("Parent is of type Memorize - Breaking Cycles");
                return PriorProducersConsumers(); // not scheduled
            }
            else {
                const auto& parentPrior = getPriorProducersConsumers(parent.first);
                if (!parentPrior.isPrior) {
                    // only happens in case of cyclic graphs
                    return PriorProducersConsumers(); // not scheduled
                }
                else {
                    prior.requiredProducers.insert(parentPrior.requiredProducers.cbegin(), parentPrior.requiredProducers.cend());
                    prior.priorConsumers.insert(parentPrior.priorConsumers.cbegin(), parentPrior.priorConsumers.cend());
                }
            }
        }
    }
    ++inputIdx;
}
For pop_output, we enter the else if (parent.first->type() == Memorize_Op::Type) branch ("Break cycles"), which returns an empty PriorProducersConsumers() (not scheduled). As a result, the parent node of pop_output (that is, the Memorize node from Accumulate) is not detected as a prior, and is therefore not scheduled.
This is confirmed by the debug logs:
Context: Consumer node pop_output (Pop#1) input #0
not runnable: NbConsumedData is 0:0 + NbRequiredData is 2:1 > NbAvailableData 0:0 for input #0
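To make the early return easier to reason about, here is a heavily simplified, Aidge-independent model of the prior search. Everything in it is hypothetical (ToyNode, ToyPrior, toyPriorSearch, makeChain): the data counters are collapsed into one boolean, the inView branch is omitted, and the behaviour after the loop (marking the node as prior and falling back to the node itself as prior consumer) is my assumption, since it is not part of the excerpt above. It only reproduces the branch structure of that excerpt on an acyclic chain source -> mid -> sink.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy types, not part of Aidge.
enum class Kind { Producer, Memorize, Other };

struct ToyNode {
    std::string name;
    Kind kind;
    std::vector<std::size_t> parents;  // indices into the graph vector
    bool hasProducedEnough;            // stands in for the NbProducedData check
};

struct ToyPrior {
    bool isPrior = false;
    std::set<std::size_t> requiredProducers;
    std::set<std::size_t> priorConsumers;
};

// Mirrors the branch structure of the excerpt; assumes an acyclic toy graph.
ToyPrior toyPriorSearch(const std::vector<ToyNode>& graph, std::size_t node) {
    assert(node < graph.size());
    ToyPrior prior;
    for (std::size_t parent : graph[node].parents) {
        assert(parent < graph.size());
        if (graph[parent].hasProducedEnough) {
            continue;  // this input is already satisfied
        }
        if (graph[parent].kind == Kind::Producer) {
            prior.requiredProducers.insert(parent);
            prior.priorConsumers.insert(node);
        } else if (graph[parent].kind == Kind::Memorize) {
            return ToyPrior{};  // "Break cycles": reported as not scheduled
        } else {
            const ToyPrior parentPrior = toyPriorSearch(graph, parent);
            if (!parentPrior.isPrior) {
                return ToyPrior{};  // not scheduled
            }
            prior.requiredProducers.insert(parentPrior.requiredProducers.begin(),
                                           parentPrior.requiredProducers.end());
            prior.priorConsumers.insert(parentPrior.priorConsumers.begin(),
                                        parentPrior.priorConsumers.end());
        }
    }
    // Assumed post-loop behaviour (not shown in the excerpt).
    prior.isPrior = true;
    if (prior.priorConsumers.empty()) {
        prior.priorConsumers.insert(node);
    }
    return prior;
}

// Builds the chain source(0) -> mid(1) -> sink(2); only the kind of mid varies.
std::vector<ToyNode> makeChain(Kind midKind) {
    return {
        ToyNode{"source", Kind::Other, {}, false},
        ToyNode{"mid", midKind, {0}, false},
        ToyNode{"sink", Kind::Other, {1}, false},
    };
}
```

In this model, searching from sink (index 2) with mid set to Kind::Memorize returns an empty, non-prior result, so nothing upstream is reported. With mid set to Kind::Other, the search recurses up to source and reports it as a prior consumer. This matches what I observe for pop_output, but it is a sketch of my understanding, not a statement about how the scheduler should be fixed.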
Maybe this is not the correct way to build such a graph, and it should be done with getConnectedGraphView instead. If so, why is this construction used in the tests?