Discuss how to absorb and mature proof-of-principle/concept research contributions
Context
Our project description, among other things, indicates:
The Eclipse ESCET project provides a toolkit for the development of supervisory controllers. The toolkit has a strong focus on model-based design, supervisory controller synthesis, and industrial applicability, for example to cyber-physical systems. The toolkit supports the entire development process of (supervisory) controllers, from modeling, supervisory controller synthesis, simulation-based validation and visualization, and formal verification, to real-time testing and implementation.
We thus have a strong focus on industrial applicability, aiming for mature tools that can be applied by industrial practitioners. We therefore aim for maturity in various places, from the code base to reviewing and testing to documentation.
On the other hand, we also want to be state-of-the-art, which is also essential for application to complex industrial systems. The CIF website, for instance, indicates:
World-class algorithms The CIF toolset features world-class algorithms for automatic synthesis of supervisory controllers. Focus on the what rather than the how!
World-class algorithms and other improvements typically start in academic research projects. There the focus is typically different, e.g., on proving that some new idea works in principle, or can be shown to work conceptually and practically on industrial cases. Maturity is not a goal there. In fact, the opposite is true: speed and agility are essential, to quickly try new things and reject them if they don't work.
Given the different goals and requirements for research and a mature toolkit, we can distinguish different types of 'code bases' (using the term broadly to also include tests, documentation, etc). I often use the following distinction:
- Proof-of-Principle (PoP): The bare minimum is developed to show that some new idea could work. You invest as little as possible, make as little as possible, support as little as possible, to show that the idea is useful and worthwhile to investigate further.
- Proof-of-Concept (PoC): If the proof-of-principle is useful, it can be turned into a proof-of-concept. It is developed further: expanded to cover the most relevant cases, matured a bit, its performance improved, etc. Typically it is developed further, during this stage, until it can be applied to practical cases, showing real-world value. Still, it remains at prototype level.
- Mature/industrialized tool (MT): The proof-of-concept tool is typically still quite rough. It may not cover all corner cases. It may have been developed and grown without regard for the architecture. Or the original architecture and concepts may have been changed or lost along the way, leading to sub-optimal designs and inconsistencies. It may not have good tests and documentation. And so on. It needs to be matured/industrialized to address all such issues.
One could broadly link these to Technology Readiness Levels (TRLs) as defined by NASA:
- Proof-of-Principle (PoP): approximately TRL 1-3.
- Proof-of-Concept (PoC): approximately TRL 4-6.
- Mature/industrialized tool (MT): approximately TRL 7-9.
Open questions
Several questions then arise:
- How do we move from PoP to PoC, and from PoC to MT?
- At what stage do we contribute something to Eclipse ESCET?
- Does it need to be 'MT' before contributing it?
- Can we accept 'PoP' or 'PoC' contributions, and mature them in Eclipse ESCET?
- What is the process for this? How to arrange this?
Example
To make things more concrete, let me take issue #368 (closed) with merge request !320 (closed) as an example.
Context:
- Issue #318 is the overview issue for contributing multi-level synthesis to Eclipse ESCET.
- There is a PoC available (I think it is probably beyond PoP level, but I'm not completely sure, as I haven't seen everything yet). It is to be contributed, in phases.
- Issue #344 (closed) is about contributing DSM/DMM data structures and an application to do clustering. This has been contributed.
- Issue #368 (closed) is about contributing CIF to DSM/DMM/matrix functionality (there is discussion about this naming). This is work in progress, with merge request !320 (closed).
We seem to struggle quite a bit with how to approach this. I'll quote a selection of comments (this is not meant to be complete):
- [UNDERSTANDING] Regarding understanding the contribution to begin with:
- But no idea, I didn't invent the theory.
- I didn't invent the theory, I didn't write the code either. I don't have answers to all questions as many decisions are not mine.
- I don't know why the current approach or limits were chosen. I don't know what happens if you change the limits.
- I don't have the answer, I didn't write this.
- I have no answer.
- [CONCEPTS] Regarding concepts:
- It is also confusing in its terminology and concepts, which makes reviewing it quite difficult.
- Several terms like 'most-refined product system' and 'location alphabet' are not defined, or not defined where you'd expect it.
- [NAMING] Regarding naming of the project:
- Changed the name of the computation to cif2dsm as suggested.
- cif2dsm may be the wrong project name
- renamed everything "dsm" to "matrix".
- I think what you made is actually CIF to DMM.
- [INOUT] Regarding what the project does, in terms of input/output and what that represents:
- I think what you made is actually CIF to DMM. The DMM then still needs to be converted to a DSM, and clustered. From the clustered DSM, a multi-level system is created. Then for each node in the multi-level system, the entire input specification is reduced to the relevant part of it, which is where the sub-plant part comes into play.
- I'm left to wonder what a 'directed' matrix is and what an 'undirected' matrix is. I get directed and undirected graphs. I get symmetric matrices (which match undirected graphs). But I'm not sure what a directed matrix is?
- For multilevel synthesis, the 'direction' of requirements does not matter. But I used the DMM approach also for generating the graphs shown in Chapter 3 of my thesis.
- [DESIGN] Regarding the design of the project, regarding 1 vs 4 DMMs, and the treatment of input variables:
- [...] what does "collect relations between CIF elements" have to do with "most-refined product systems"? This probably needs some thought, not sure how it fits.
- Why are input variables not treated more generally, such that they can be in a most-refined product system with plant automata? Now they seem to be a rather special case, requiring more special handling, making the whole design a lot more complex, it seems. If they were all treated together, the DependencyMatrix also would not need 4 parts anymore, etc.
- If I understand correctly, this is just a DMM, as introduced in Martijn Goorden's thesis. Can this use the `Dmm` class from the `common.dsm` package? Especially if we get rid of the plants vs input variables etc. differences, which seem arbitrary anyway, then it is just a single matrix without 4 distinct parts?
- [...] a DMM relates 1 domain with 1 other domain. At the very least we thus have 4 DMMs.
- In theory we would only need one DMM: on one side we have the plant model domain and on the other side we have the requirement model domain. In CIF, a plant model can be an automaton, input variable, or even an invariant; and a requirement model can be an automaton or an invariant.
- `Dmm` allows arbitrary real values in the matrix; here you only have 1 or 0 as relation.
- Obviously you can express the latter in the former, it just takes more space and perhaps a bit of checking that you don't get arbitrary real values.
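To make the discussion above more concrete, here is a minimal sketch of a single plant/requirement DMM that stores only 0/1 relations in a real-valued matrix. Note that this is purely illustrative: the `BooleanDmm` class and all its names are hypothetical, and it does not use the actual `Dmm` class from the `common.dsm` package.

```java
import java.util.List;

/**
 * Minimal sketch of a single dependency mapping matrix (DMM) relating one
 * plant model domain to one requirement model domain, with only 0/1 entries
 * stored in a real-valued matrix. Hypothetical illustration only; not the
 * 'Dmm' class from the 'common.dsm' package.
 */
public class BooleanDmm {
    private final List<String> plants;       // Row domain: plant models (automata, input variables, invariants).
    private final List<String> requirements; // Column domain: requirement models (automata, invariants).
    private final double[][] matrix;         // Real-valued storage, but only 0.0/1.0 are used.

    public BooleanDmm(List<String> plants, List<String> requirements) {
        this.plants = plants;
        this.requirements = requirements;
        this.matrix = new double[plants.size()][requirements.size()];
    }

    /** Record that a plant model is related to a requirement model. */
    public void relate(String plant, String requirement) {
        matrix[plants.indexOf(plant)][requirements.indexOf(requirement)] = 1.0;
    }

    /** Check the invariant that every entry is exactly 0 or 1. */
    public boolean isBooleanValued() {
        for (double[] row : matrix) {
            for (double value : row) {
                if (value != 0.0 && value != 1.0) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One DMM suffices: all kinds of plant models in one domain, all
        // kinds of requirement models in the other, regardless of CIF
        // element kind.
        BooleanDmm dmm = new BooleanDmm(
                List.of("PlantAut1", "InputVar1"),
                List.of("ReqAut1", "ReqInv1"));
        dmm.relate("PlantAut1", "ReqAut1");
        dmm.relate("InputVar1", "ReqInv1");
        System.out.println("boolean-valued: " + dmm.isBooleanValued());
    }
}
```

This shows the trade-off mentioned above: a real-valued matrix can hold the 0/1 relation, at the cost of some extra space and an explicit check that no other values sneak in.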
- [PLACEMENT] Regarding in which project to place code:
- While the code to compute it is here probably due to hysteric raisins, I am not so sure anymore it should be here.
- I wonder if this class should be here? A lot of it seems to be quite general in its intent [...] Should we move most of the functionality in this class to `CifCollectUtils` instead?
- [...] Then again, adding something to common may also mean an effort to make it used commonly??
- [THEORY] Theory vs implementation:
- I am not familiar with the theories, nor how the code relates to theory.
- Clustering and DSMs/DMMs is not an exact science, there are heuristics involved. There may be good reasons why it's not here. There may also be bad reasons why it's not here.
- Using a different way to get the same result is fine; extending the implementation without extending the theory with the new cases is not.
- This precondition checker seems to have zero conditions about marker predicates? [...] It's not there because the code from research didn't have it.
- I think this is a good example of the available concepts and notions in the underlying theory and the concepts and notions available in the CIF language. There is a gap between those in the sense that the set of theoretical concepts is a subset of the CIF concepts. For example, in theoretical automata there is no such thing as marker predicates, only a set of marked locations.
- So as a method / theoretical algorithm developer, I want to focus first on a subset of CIF concepts that exactly matches with those from the theory when I implement new methods / algorithms. After that, the implementation can be slowly relaxed by considering the larger CIF language. For each new concept, you carefully have to consider how it impacts the new method, as the theory doesn't say anything about that new concept.
- [KNOWLEDGE] Getting the knowledge:
- I think you need detailed theoretical and practical experience with DSMs, clustering, synthesis and the kind of models written in that area to understand all this.
- However I don't know, and I won't know unless I spend a few years understanding the domain, but I don't have that time and likely you don't want to wait that long.
- I didn't start from the paper [...]
- [CORRECTNESS] Regarding functional correctness and gaps:
- This method seems to have some arbitrary things it omits, like the LHS of assignments, [...]
- [...] the precondition checker [...] supports a total of 16 operators [...] this is thus very incomplete [...] Now we skip this [...] can actually get wrong results [...]
- I thus think I may have missed other serious issues, as I probably didn't even understand all of the code.
- The original author of the code has intimate knowledge of everything, so I am pretty sure that 90% of the code is doing what is intended to happen.
- [DETAILS] The smaller details:
- I've just spent over 4 hours reviewing this and have collected over 100 issues so far. I'm not done [...]
- It has many inconsistencies, and a bunch of them lead to functional issues, such as missing dependencies, and thus incorrect dependency matrices.
- It mentions value simplification, but no `SimplifyValues*` preprocessing operation is performed?
- This does not actually create a mapping? [...] the mapping is still there, it's just hidden.
- [...] the code changed a lot so it's not a big surprise it may have some rough corners and a heap of tiny things [...]
- Comment has not been changed [...]
- Small stuff is a major nuisance to me. My focus is about getting the algorithms correct and fast.
- There are bigger fish that need to be done. We will run into the remaining spots at some future time and simply fix them at that time.
- [EFFORT] Regarding how much effort to put into it at what stage, or at all:
- Code is as far as makes sense now.
- I ported the code as-was [...] I also worried more about how to break down the long function while not breaking the tests rather than optimizing layout of resulting code.
- I just took the code that existed, renamed some stuff, fixed the out form, and nothing else. I didn't start from the paper, I didn't implement this, I didn't refactor all code for improvements.
- All in all I feel this branch is rather immature at the moment
- Furthermore, it seems really geared towards the specific application that is considered here, while if it is made a little bit more general, it would be a nicer design and simpler to understand, and it could then even be reused in other contexts.
- Doing that is not pure "porting to ESCET" though [...]
- Possibly. Should that be done now?
- It's fine that you want it, we can discuss it and perhaps you can even get it, but it's not porting anymore at that point.
- If you want all the answers, I pretty much have to write the code from scratch, I think.
- Also, not meeting your needs doesn't make it immature. It's focused [...] but that is not the same as immature.
- That is definitely out of scope here, imho.
- I mean "generalize it" [...] means rewriting / changing / re-designing the algorithm, up to inventing a completely different algorithm. In particular, it stops being a port of existing functionality very soon.
- No doubt they are all lovely goals, but if "moving code into ESCET" means starting from scratch from published papers or beyond, I think it needs some discussion.
- [REVIEW] Review process:
- there is a difference in approach between the author and the reviewer here. A reviewer first wants to eliminate the syntax noise before being able to understand the global issues. An author wants to get the global structure right before considering commas and dots.
- I do have the same problem as you as reviewer. I do mark such noise in papers or code, but once I mostly run out of the small stuff, I start looking at the bigger picture (also because at that point you have seen all code so you can form an idea of how it works).
- we probably solve that currently by the author just skipping all the "noise" issues first, since code restructuring typically trashes existing text and thus its errors.
- Taking over a branch is one option.
- Seen in that way, taking over the branch may cause a lot of time being spent on fixes that ultimately never end up in the resulting code.
- If you have mostly textual issues, writing suggested changes is another way. (Instead of saying "change X to Y" you're actually making the change as a proposal.)
- A third option is to make the changes in a git-controlled environment and create a patch file with the changes. It is basically the above "change as suggestion" in bulk, with more flexibility in what to apply than taking over the branch. (I consider diff files editable too.)
- I'd like to move forward with the branch, preferably with the big stuff first to avoid making changes that get discarded anyway. If that's not possible, wasting some time there isn't a problem either. [...]
- Maybe it is indeed the problem that I'm reviewing it all at once, the big and smaller stuff. Would it then be an idea to review big merge requests like this in two stages? First review the big stuff, the approach, the overall design, etc. Get the big stuff right. Then review the small stuff, either via comments, patch or taking over. We could experiment here.
- My brain seems to really dislike inconsistencies, so I'll surely spot these in the first round as well. But I can try to focus only on the big stuff, and see how that goes?
- We also discussed how to contribute such large contributions, which are not fully finished, in an efficient manner. We concluded that it may be useful to declare such new things as 'experimental'.
- Then not everything has to be up to the same quality standards as the mature parts of the toolkit.
- What this means exactly we don't know yet, I think. We'll have to experiment with this 'experimental' status.
Clearly there is a lot of discussion. In this issue, I'd rather not discuss the specifics of that particular 'CIF to [something]' issue (we can do that in !320 (closed)), but rather focus on the processes. Still, I think these examples show the kind of things we are discussing and/or struggling with. As quite some different things are being discussed, I tried to place the various discussions under some collective, more generic, headers. Hopefully, this can help us in the discussion.
Another example, with a rather different approach, is issue #196 (closed) with merge request !327 (closed). This is about contributing the DCSH variable ordering heuristic algorithm to data-based synthesis. The approach taken here:
- There was no Java-based implementation yet.
- Started from the original paper, reading it completely, and also reading some of the papers it refers to.
- Tried to implement it all, from the paper, staying as close as possible to the papers.
- Realized that it could not be integrated into the current codebase without API changes.
- Changed the API in the branch, to develop the new API, but also see how it fits with the extensions.
- Contributed the API changes separately (see #371 (closed)).
- Rebased the changes upon the branch for #371 (closed).
- After addressing review comments, reworked the branch to be based on the final version of #371 (closed).
- Submitted the 3rd attempt as merge request !327 (closed).
- Left various things open for follow-up, by creating new issues for them.
What can we do?
I don't pretend to know the solution here, but can share some of my thoughts:
- [UNDERSTANDING] My current thinking is that something we don't understand can't be contributed. We must know what a contribution is about, to be able to see how it fits, and to develop it further. If we don't understand it, we must first understand it, before we can continue. This requires an investment, not just from the contributor (which may be one of the committers), but also from the (other) committers, to be able to review it, discuss it, etc. I feel that by reading Martijn Goorden's PhD thesis a bit, I got a good enough understanding of what it is about.
- [CONCEPTS] Here too, I think we must understand the concepts before we can move forward.
- [NAMING] I would hope that if we understand what something does, and what it is about, we could resolve the naming difficulties with ease.
- [INOUT] This too is essential to understand, I think. We must know what goes in and out, and how it relates to (ties in with) the rest of the code base. Everything that follows likely depends on understanding this, so it is important to get this right at the very start.
- [DESIGN] If we understand the purpose of a transformation or tool, its input/output relation, etc., we can think about what would be a good design. For me, it seems essential to get some basic idea of the desired design before we move forward with the contribution. Without understanding this, a review feels pointless, and further maturing can't be done. I realize we don't always do this upfront. But maybe for larger new contributions, we should first read up on it (read the papers, etc.), and discuss the design we want to end up with? Then we have guidance when looking at the actual contribution(s), and we can then define a path to get to where we want to be.
- [PLACEMENT] Like naming, I hope that if we understand the purpose, goal, input/output relation, and design, and agree upon it, the ideal placement of the code should be relatively simple to decide.
- [THEORY] This one is a bit trickier. I think it is good to see where the theory and implementation differ, or have gaps. Still, this can require quite a bit of effort. However, it is essential, I think: we must understand what we're working on/with. Practically, starting small, even smaller (e.g., a more restricted subset of the CIF language, fewer features, etc.) than what an available PoC currently provides, may be useful here. For instance, only what is defined in the (first) paper that describes it. Contributing a basic version, with only the essentials, and extending it afterwards, could help to get the basics in first. This allows more focus, as the contribution is then smaller. We can focus on the essentials, and extend later. This may speed things up. Then later, we can extend it with the rest of what the PoC offers, or what follow-up papers add on top of it. Just like the PoC was developed incrementally, it may be useful to contribute it incrementally to Eclipse ESCET, to build up knowledge/expertise, and mature the basis, before extending.
- [KNOWLEDGE] This strongly relates to the theory part. Building up knowledge on the contribution, and what lies behind it, as well as getting experience in working with and maturing the implementation, are important. This allows moving forward, and in the end reaching 'MT' level. Here too, incremental contributions can help, to not get overwhelmed, and to gradually build up the knowledge. I do feel that by reading Martijn Goorden's PhD thesis a bit, I already got quite a decent level of knowledge.
- [CORRECTNESS] I think it is good to identify the gaps in correctness. Since the end goal is 'MT' level, these issues must be addressed at some point.
- [DETAILS] I think these are the most obvious candidates to leave out at the start, and mature a bit later.
- [EFFORT] The way I see it, ending up at a mature implementation, starting from a PoP or PoC, is always a lot of work, whichever way you approach it. I often hear in my job that going from PoP to PoC triples the investment (in time and/or money), and from PoC to MT the investment triples again. With our end goal of 'MT' level, this is, however, in my opinion something that we must do. We discussed temporarily accepting lesser quality by using an 'experimental' status to label new contributions that are not mature enough yet. I do think that if we go for this, we have to limit it to specific plugins or packages of plugins, to be able to keep track of which parts of the code base are considered experimental, and which parts are mature. And then we need to define the actions required to lift the 'experimental' label. However, I must also say that an 'experimental' label does not have my preference. I find it very difficult to decide what level a first contribution should reach before we merge it. Where is 'the line'? Do we accept whatever is given, do we mature a bit, or do we fully mature? The first feels wrong, and for the second it seems impossible to decide how far to go. If we can discuss more upfront, and work with small contributions with reduced scope, I would hope we can mature them fully (the third option) before moving on to the next contribution that builds upon it or extends it. An approach with patches or so, as we discussed, may help here as well.
- [REVIEW] About this one I'm not sure either. I feel some things should be discussed upfront, rather than during the review. We should 'prepare' for a contribution: reading up on it, discussing whether a PoP or PoC will be contributed, whether it is suitable for contribution (does it require restarting from the paper, or can we further mature it), what should be contributed and in what order, etc. I think we don't do this enough now. For instance, I now actually think it may have been better to start with multi-level synthesis from a manually given multi-level system first. Then you have something that works (it only requires data-based synthesis as a dependency), and can be matured, etc. Then allow computing the multi-level system from a DSM, then clustering a DSM, then converting a DMM to a DSM, and then constructing a DMM from CIF. That way, each newly contributed thing is immediately actionable. Now it started more from the first step to the last step in the chain, but maybe the reverse would have been better. Such discussions we should be able to have by just reading the paper and taking the toolchain figure from it.
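The chain described above can be sketched as a sequence of stages, which makes the reverse contribution order easy to see: the last stage (synthesis from a manually given multi-level system) is usable on its own, and each earlier stage, once contributed, feeds the already-matured ones. All type and method names below are hypothetical placeholders, not actual ESCET APIs.

```java
/**
 * Hypothetical sketch of the multi-level synthesis toolchain as a chain of
 * stages, to illustrate the contribution order discussed above. The record
 * types are empty placeholders; real ESCET data structures are far richer.
 */
public class ToolchainSketch {
    // Placeholder stage results, each just tracking its origin for the demo.
    record CifSpec(String name) {}
    record Dmm(String from) {}
    record Dsm(String from) {}
    record MultiLevelSystem(String from) {}

    // Chain order: CIF -> DMM -> DSM -> clustered DSM -> multi-level system.
    // Contributing back-to-front means each new stage is immediately
    // actionable, because everything downstream of it already exists.
    static Dmm cifToDmm(CifSpec spec) { return new Dmm(spec.name()); }
    static Dsm dmmToDsm(Dmm dmm) { return new Dsm(dmm.from()); }
    static Dsm clusterDsm(Dsm dsm) { return dsm; }
    static MultiLevelSystem toMultiLevelSystem(Dsm dsm) { return new MultiLevelSystem(dsm.from()); }

    public static void main(String[] args) {
        // Full chain, once all stages have been contributed.
        MultiLevelSystem mls = toMultiLevelSystem(
                clusterDsm(dmmToDsm(cifToDmm(new CifSpec("spec")))));
        System.out.println(mls.from());
    }
}
```

The sketch is only meant to show the dependency direction between stages; whether the stages should really be separate tools or one pipeline is exactly the kind of upfront design discussion proposed above.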
As I mentioned above, I also think splitting larger contributions into smaller ones can really help, as well as reducing 'extensions'. For instance, only do CIF to DMM, not CIF to directed graph as well, as that is an extension. This makes it easier to discuss, review, adapt, mature, etc. Finding time to review huge things is difficult for me, as it requires significant uninterrupted time to seriously look into something. With small contributions, this is easier.
I'm not at all sure where 'the line' should be here. I left out many comments when reviewing !320 (closed), but still many of them led to way too detailed discussions for the phase we were in, as well as irritations. Likely I still added too detailed comments. But maybe upfront discussions could have helped here, as then the fundamentals would have been more clear already?
This leads me to the following proposal:
- Discuss larger new contributions upfront, to understand the future contribution, its concepts, and the desired design, order of contribution, etc.
- Maybe even look at all that is to be contributed, in its current form, if it is to be contributed in phases. Maybe add a ZIP to the issue, temporarily dump it in a branch, or on GitHub or so?
- Start with the basics, through small focused contributions. Leave out all extensions, keeping only the essentials.
- Review contributions until they are mature, to avoid the 'experimental' label and confusing or even impossible-to-define 'experimental' levels of maturity.
- Improve upon the review process by patches or merge-request-upon-merge-request or such. We may have to experiment here.
- Use follow-up issues for the next step, extensions, etc.