Add determinism and completeness to getOrderedNodes and use for getRankedNodes (!223) · Merge requests · Eclipse Projects / aidge / aidge_core

Context

Objective is to reuse getOrderedNodes() (topological node order) in particular for getRankedNodes().

Note that there are other places where refactorizations based on getRankedNodes() could happen. For instance:

save(): to generate mermaid nodes in order and have determinism on the viewed graph
forwardDims(): for one pass forward and to avoid unordered and multiple warnings for the same nodes
add(node set): to simplify current implementation

Though these are not part of this MR.

With respect to getOrderedNodes() this MR brings some extensions in order to:

favor in the topological order the current graph ordered inputs and ordered outputs (was only ordered outputs before)
always provide a complete coverage of the graph even in the presence of weird nodes: nodes with no output or loops of nodes w/o external output
be deterministic to some extend: for this the choice has been made to use a global unique incremental id for the nodes and use it for sorting nodes not connected to any ordered output. This is a reasonable choice as relative creation order of nodes has an understandable meaning. Note though that there is a more involved alternative which would be the insertion order in the graph view, but as of now this information would require a larger rewrite of the graph view implementation. Though it's an option for the future.

With respect to getRankedNodes(), with this MR:

now it is simply implemented as getOrderedNodes() and as we consider with the above assumptions that getOrderedNodes is deterministic, we always return that all nodes have order unicity
the initial intent, of what I understand, was to rank node as a kind of distance to the mRootNode, though it seems to be mainly unused and anyway approximately defined for distances > 1
the only place where it was used was in updateInputsOutputsOrder(), but actually getting a full order of the graph was not necessary and costly (full graph ordering for each deletion), while there we only want either direct connections (i.e. distance == 1) or reset the mRootNode to null.

By the way, as additional notes for future thought:

maintainance of mRootNode brings complexity which could be removed
- mainly mRootNode is useful for graph matchings which may be either top-down or bottom-up and for which it is useful to have the initial seeding node
- though, this is a pure graph matching issue, and it would be better to have a matcher return a nodeset (or graphview) and a root node (i.e. the root node would be a field of the graph matcher)
- otherwise I do not see a real usage of having a mRootNode in the graph which is a kind of "first inserted node" unless some "hard to understand node if it was deleted"
optionally, in place of this: if we want something more meaningful, it would be the full insertion order of nodes in the graph view (ref to point above), which would then cover the "first inserted" (or "first remaining inserted" in case of deletion), and would allow a more located determinism for node visits such as getOrderedNodes() (i.e. use insertion id instead of creation id).

Review and refer to per commit messages for additional details, in commit order (older to newer):

[Node] Add globally unique incremental id to nodes: add unique incremental id to nodes
[GraphView] improve topological order determinism: see discussions above on topological order improvements
[GraphView] preserve relative node ids in clone operation: clone graph nodes in order of creation in order to preserve order in the cloned graph
[GraphView] implement getRankedNode() as a topological order: simply implement getRankedNodes() as getOrderedNodes(), see discussion above

The function getRankedNodes() changed behavior, the initial interface description was kept (ref to the docstring in the API).

To be discussed I suppose whether we really make this change and if so whether we keep the legacy description.

Edited Oct 23, 2024 by Christophe Guillon