Implement edge selection heuristics for CIF data-based synthesis workset algorithm
The workset algorithm has multiple steps, including:
- Heuristically pick a transition from the workset and remove it.
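To make the loop concrete, here is a minimal Python sketch of the overall workset fixed-point iteration; the function and parameter names are my own illustration, not the actual CIF synthesis code:

```python
def workset_fixed_point(transitions, dependencies, apply_transition, pick):
    """Repeatedly apply transitions until the workset is empty.

    transitions:      all transition identifiers.
    dependencies:     maps a transition to the transitions whose effect may
                      change after it is applied (so they must be re-checked).
    apply_transition: applies a transition; returns True if it had any effect.
    pick:             heuristic that selects one transition from the workset.
    """
    workset = set(transitions)
    while workset:
        # Heuristically pick a transition and remove it from the workset.
        t = pick(workset)
        workset.remove(t)
        # If applying it changed anything, its dependent transitions may
        # have an effect again, so put them back into the workset.
        if apply_transition(t):
            workset |= set(dependencies[t])
```

The `pick` parameter is where the heuristics discussed below plug in; a trivial deterministic choice like `min` already gives a working (if naive) algorithm.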
Here, I'll try to implement the heuristics from:
- Fei et al., "Symbolic State-Space Exploration and Guard Generation in Supervisory Control Theory", 2013 [link]
From #520 (comment 1114569):
It has two phases:
- Phase 1: From the workset, select the transitions with either the least or the most dependencies. The paper's evaluation shows that selecting the ones with the most dependencies works best.
- Phase 2: From the transitions selected in the first phase, use reinforcement learning to select the best one(s), either on its own or combined with Tabu search. The combination with Tabu search scores best in the evaluation.
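Phase 1 by itself is straightforward. A minimal sketch, assuming the dependency map is precomputed (names are illustrative, not from the paper or the codebase):

```python
def select_most_dependencies(workset, dependencies):
    """Phase 1: from the workset, keep only the transitions whose
    dependency sets are largest (the variant the evaluation favors).

    workset:      the current set of transition identifiers.
    dependencies: maps each transition to the set of transitions that
                  depend on it.
    """
    most = max(len(dependencies[t]) for t in workset)
    return {t for t in workset if len(dependencies[t]) == most}
```

Note that this can return multiple transitions; phase 2 is responsible for narrowing the result down to a single one.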
So, select the transitions with the most dependencies, and then use reinforcement learning combined with Tabu search. Unfortunately, no further details are given for the reinforcement learning and Tabu search parts. There is a reference to a survey paper on reinforcement learning of close to 50 pages, and to a book chapter on Tabu search of nearly 100 pages. I could read them, but they would only explain these algorithms in general.

For reinforcement learning, it is important to know what the actions are, and what the rewards are for the different outcomes of those actions. This is not described for this problem domain. I assume the actions are the chosen transitions, and the outcome is whether applying the transition had any effect. The reward is then likely positive if the transition had an effect, and negative if it didn't. I can try something simple, and we can always make it more advanced later on.
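Something simple could look like the Python sketch below: a per-transition score that is rewarded when applying the transition had an effect and penalized when it didn't, plus a short Tabu list that keeps recently picked transitions from being picked again immediately. The scoring scheme, reward values, and Tabu list size are all my own assumptions, since the paper gives no details:

```python
from collections import defaultdict, deque

class RewardTabuHeuristic:
    """Phase 2 sketch: reinforcement-learning-style scoring plus a Tabu list.

    Assumed scheme (not from the paper): each transition accumulates a score,
    +1 when applying it had an effect, -1 when it didn't. Picking prefers the
    highest-scoring candidate that is not on the Tabu list of recent picks.
    """

    def __init__(self, tabu_size=3):
        self.score = defaultdict(int)
        self.tabu = deque(maxlen=tabu_size)  # most recently picked transitions

    def pick(self, candidates):
        # Exclude Tabu transitions, unless that would leave no candidates.
        allowed = [t for t in candidates if t not in self.tabu] or list(candidates)
        # Sort first so ties break deterministically, then take the best score.
        t = max(sorted(allowed), key=lambda t: self.score[t])
        self.tabu.append(t)
        return t

    def update(self, t, had_effect):
        # Reward transitions whose application changed something.
        self.score[t] += 1 if had_effect else -1
```

The caller would invoke `update` after each application of a transition, feeding back whether it had an effect. Both the reward values and the Tabu list size are knobs that can be tuned or replaced later.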
Addresses #520 (closed)