PLCgen: Optimize generated code size for algebraic variables
From #1043 (closed) and #1055 (closed) it seems to me that use of algebraic variables is creating a large cost in code size.
Not now but perhaps soon, we may need a path to reduce that impact.
Pondering a bit about the discussions of both issues, the current approach of recursively unfolding all algebraic variables to their definition seems costly. Not a complete surprise of course.
Modeling differently
I briefly considered whether you can reduce the impact by modeling differently, but at first sight that looks complicated (please correct me if I am wrong).
CIF uses a single step to make a transition. It's generally impossible to compute needed values in an earlier step, as you could in e.g. Java:
x = f(...);
y = g(..., x, ..., x);
Here x is computed first and its value is used twice in the next computation.
You cannot do this in CIF, since this would translate to 2 transitions, and in general you cannot express in CIF that after the x transition the only allowed next transition is the y transition.
In other words, a modeler is forced into y = g(..., f(...), ..., f(...));
To some extent this is good: as a modeler you don't want to worry about splitting your computations. On the other hand, this inevitably leads to a large forest of nested algebraic variable trees, in order to obtain the values that the modeler wants to have.
This seems non-trivial to fix in CIF, aside from the question of whether it's a good direction for CIF.
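To get a feel for how fast unfolding can grow, here is a minimal Java sketch. Everything in it is made up for illustration (the class UnfoldSize, the chain of variables a0, a1, ..., and counting expression nodes as a stand-in for code size); it is not how PLCgen represents expressions. It compares the size of a fully unfolded expression against the size when each definition is written down once.

```java
import java.util.HashMap;
import java.util.Map;

class UnfoldSize {
    /** Nodes in the expression for 'name' when every algebraic variable is unfolded. */
    static long unfoldedSize(String name, Map<String, String[]> defs) {
        String[] uses = defs.get(name);
        if (uses == null) {
            return 1; // not an algebraic variable: a single leaf
        }
        long size = 1; // the operator node of this definition
        for (String use : uses) {
            size += unfoldedSize(use, defs);
        }
        return size;
    }

    /** Nodes needed when each definition is emitted once and referenced by name. */
    static long sharedSize(Map<String, String[]> defs) {
        long size = 0;
        for (String[] uses : defs.values()) {
            size += 1 + uses.length; // one operator node plus one reference per use
        }
        return size;
    }

    /** Hypothetical model: a0 reads an input, every a(i) reads a(i-1) twice. */
    static Map<String, String[]> chain(int depth) {
        Map<String, String[]> defs = new HashMap<>();
        defs.put("a0", new String[] {"in"});
        for (int i = 1; i <= depth; i++) {
            String prev = "a" + (i - 1);
            defs.put("a" + i, new String[] {prev, prev});
        }
        return defs;
    }

    public static void main(String[] args) {
        Map<String, String[]> defs = chain(10);
        System.out.println("unfolded: " + unfoldedSize("a10", defs));
        System.out.println("shared:   " + sharedSize(defs));
    }
}
```

In this artificial chain the unfolded size roughly doubles with every level, while the shared size grows linearly. Real models will sit somewhere in between, depending on how often each algebraic variable is used.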
Solution direction
But at some point, this may need to be addressed in some way. Currently I can see two paths (or more precisely, one path in two flavors). Likely more paths exist.
- In a single "transition" (which can be smaller than a CIF event transition, e.g. 'compute guard(s)', 'compute updates', or 'compute output value'), compute the result value of each algebraic variable once as a separate PLC statement (sequence). Store the result in a temp variable, and use it for subsequent uses of that value.
- Perform expression analysis, and e.g. find common sub-expressions.
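The first flavor could look roughly like the Java sketch below. It is only an illustration under assumptions: the class TempVarEmitter, the Def type, and the use of plain strings for expressions are all hypothetical, the definitions are assumed to be acyclic, and the question of what counts as "a single transition" is left out entirely. Each algebraic variable that is needed gets one PLC-like assignment, emitted after the variables it depends on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TempVarEmitter {
    /** Hypothetical definition: expression text plus the algebraic variables it reads. */
    static final class Def {
        final String expr;
        final String[] deps;

        Def(String expr, String... deps) {
            this.expr = expr;
            this.deps = deps;
        }
    }

    /** Returns the assignments needed to compute 'root', each variable at most once. */
    static List<String> emit(String root, Map<String, Def> defs) {
        List<String> out = new ArrayList<>();
        visit(root, defs, new HashSet<>(), out);
        return out;
    }

    private static void visit(String name, Map<String, Def> defs, Set<String> done, List<String> out) {
        Def def = defs.get(name);
        if (def == null || !done.add(name)) {
            return; // not an algebraic variable, or already emitted
        }
        for (String dep : def.deps) {
            visit(dep, defs, done, out); // dependencies are emitted first
        }
        out.add(name + " := " + def.expr + ";");
    }

    public static void main(String[] args) {
        Map<String, Def> defs = Map.of(
                "x", new Def("f(in)"),
                "y", new Def("g(x, x)", "x", "x"));
        emit("y", defs).forEach(System.out::println);
    }
}
```

For the x and y example from above, x is assigned once even though y reads it twice.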
The first is conceptually easy, although currently I don't know how to implement "in a single transition" at all. The disadvantage is that it's somewhat crude. If all algebraic variables are used once, nothing is gained. The second approach should detect a superset of what the first approach finds. Typically this is done as a separate step after initial code generation.
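As a rough idea of the second flavor, the Java sketch below does a very simple form of common sub-expression detection on an already unfolded expression tree. The names (CseSketch, Node, lower, the t0, t1, ... temporaries) are hypothetical, and it assumes expressions have no side effects so that equal text means equal value. It also gives every operator node a temporary, including those used only once, which a real implementation would likely want to avoid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CseSketch {
    /** Hypothetical expression node: an operator or function name with arguments. */
    static final class Node {
        final String op;
        final Node[] args;

        Node(String op, Node... args) {
            this.op = op;
            this.args = args;
        }
    }

    private final Map<String, String> tempByKey = new HashMap<>();
    final List<String> statements = new ArrayList<>();

    /** Emits statements for 'n' and returns the name that holds its value. */
    String lower(Node n) {
        if (n.args.length == 0) {
            return n.op; // leaf: a variable or literal, used directly
        }
        List<String> operands = new ArrayList<>();
        for (Node arg : n.args) {
            operands.add(lower(arg));
        }
        // Equal sub-expressions produce the same key, so they share one temporary.
        String key = n.op + "(" + String.join(", ", operands) + ")";
        String temp = tempByKey.get(key);
        if (temp == null) {
            temp = "t" + tempByKey.size();
            tempByKey.put(key, temp);
            statements.add(temp + " := " + key + ";");
        }
        return temp;
    }

    public static void main(String[] args) {
        Node fa = new Node("f", new Node("a"));
        Node fa2 = new Node("f", new Node("a")); // a separate but equal subtree
        CseSketch cse = new CseSketch();
        cse.lower(new Node("g", fa, fa2));
        cse.statements.forEach(System.out::println);
    }
}
```

Because sharing is detected on the expression structure rather than on variable names, this also catches repeated sub-expressions that were never given an algebraic variable of their own.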
Addresses #679