CIF benchmark HTML overview requires averages and sorting
Recently, I ran a large benchmark (100+ configurations) on a fair number of models, see #379 (comment 1090900). The idea is to find a better default for the data-based synthesis variable ordering option (#379 (closed)). Ideally, we can quickly see which configuration is now best (#379 (comment 1090943)). However, two things are still missing for this:
- Adding an 'Average' column to the HTML page, to show in a single number how a configuration performed. When we benchmark many models, this lets us assess the results for a configuration more quickly. I'm not sure yet whether an unweighted or a weighted average is better, or whether we want both. A weighted average would weigh each model based on its best metric value.
- Adding column sorting to the HTML page. This allows sorting on the average, to see which configurations do better or worse.
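
To make the difference between the two averages concrete, here is a minimal Python sketch. It is not the benchmark tool's actual code: the function names, the `results` layout and the sample numbers are all hypothetical. It also assumes that lower metric values are better, that all metric values are positive, and that "weigh a model based on its best metric value" means using the inverse of the best value as the weight, so that large models do not dominate the average. Other weightings are equally possible.

```python
from statistics import mean


def best_per_model(results):
    """Best (lowest) metric value per model over all configurations."""
    models = next(iter(results.values())).keys()
    return {m: min(cfg[m] for cfg in results.values()) for m in models}


def unweighted_average(metrics):
    """Plain mean of the metric values of one configuration."""
    return mean(metrics.values())


def weighted_average(metrics, best):
    """Weighted mean, with weight 1 / best value per model (an assumption)."""
    weights = {m: 1.0 / best[m] for m in metrics}
    total_weight = sum(weights.values())
    return sum(weights[m] * v for m, v in metrics.items()) / total_weight


def sort_configurations(results, key):
    """Configuration names, ordered from best (lowest key) to worst."""
    return sorted(results, key=key)


# Hypothetical data: configuration -> model -> metric value.
results = {
    "config_a": {"model_1": 100, "model_2": 10},
    "config_b": {"model_1": 80, "model_2": 20},
}
best = best_per_model(results)

by_unweighted = sort_configurations(
    results, key=lambda c: unweighted_average(results[c])
)
by_weighted = sort_configurations(
    results, key=lambda c: weighted_average(results[c], best)
)
print(by_unweighted)
print(by_weighted)
```

With this sample data, the two averages rank the configurations differently: `config_b` wins on the unweighted average because it does better on the larger model, while `config_a` wins on the weighted average because it is relatively closer to the best value on both models. That is an argument for showing both columns and making each sortable.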