Skip to content

[feat] GraphRegEx Lookaround

Context

aidge_onnx#79 (closed)

With current graph matching, if we want to match B and C with B and C having a different input, we would not be able to differenciate between case one and two:

flowchart TD
	subgraph Case two
	A --> B --> C;
	A --> C
	end
	subgraph Case one
	a0["A0"] --> b["B"] --> c["C"];
	a1["A1"] --> c
	end

Indeed to differenciate, we would need to have this kind of pattern .#0->B#1->C#2;.#3->C#2 (case one) or .#0->B#1->C#2;.#0->C#2 (case two). But we don't want to match the extra operators (A, A0, A1)! This is currently not supported. To do so with current impelementation, we need to remove by hand the extra operator.

Proposed solution

I propose to add a new regex symbol based on lookaround

Lookaround Name What it Does
(?=foo) Lookahead Asserts that what immediately follows the current position in the string is foo
(?<=foo) Lookbehind Asserts that what immediately precedes the current position in the string is foo
(?!foo) Negative Lookahead Asserts that what immediately follows the current position in the string is not foo
(?<!foo) Negative Lookbehind Asserts that what immediately precedes the current position in the string is not foo

ref: https://www.rexegg.com/regex-lookarounds.php

This regex allow to match a pattern but to not send it in the result.

Due to the directed nature of graphs, I don't think we need to differenciate between lookbehind and lookahead.

Negative look around can be handled by the $ symbol.

To implement this I propose to add the symbol ^() with everything prefixed by this symbol matched but not captured.

In regular regex ^ is used to indicate the begining of the sequence, which does not apply in out case due to the directed nature of the graph.

The example given in the context will then become: ^(.#0)->B#1->C#2;^(.#0)->C#2