Fixed branch prediction for compressed instructions fetched from word-unaligned addresses
Created by: OttG
This performance bug is rare. It is triggered when both of the following conditions hold:
- The fetch is from a word-unaligned address;
- The instruction is a compressed branch.
In this case, the instr_realign module places the instruction at position [0] of its outputs (addr[0], valid[0], instr[0]). The outputs of the BHT and BTB are structured so that position [0] of the prediction always refers to a word-aligned instruction. This conflicts with the realigner, which "shifts" the instruction to position [0], so the wrong prediction is selected from the BTB/BHT (it should come from position [1] instead). The fix checks addr[0][1]: when the instruction has been shifted to position [0], address bit [1] is set because the address is not word aligned, and the prediction is taken from output [1] of the predictors.
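The selection rule can be illustrated with a small Python sketch. This is not the RTL change itself: the function names and the two-entry `predictions` list are hypothetical and only model the idea that predictor output [0] belongs to the word-aligned half of the fetch word and output [1] to the upper half.

```python
# Hypothetical model of the prediction select; names are illustrative,
# not taken from the CVA6 sources.

def prediction_index(instr_addr: int) -> int:
    """Predictor output slot for the instruction in realigner position [0].

    Bit [1] of the address is 0 for a word-aligned instruction and 1 for
    an instruction that starts in the upper half of the fetch word.
    """
    return (instr_addr >> 1) & 1


def select_prediction_buggy(instr_addr: int, predictions: list) -> str:
    # Old behaviour: always pair realigner position [0] with predictor output [0].
    return predictions[0]


def select_prediction_fixed(instr_addr: int, predictions: list) -> str:
    # Fixed behaviour: use address bit [1] to pick the predictor output.
    return predictions[prediction_index(instr_addr)]


# predictions[0] -> word-aligned half, predictions[1] -> upper half
predictions = ["not_taken", "taken"]

aligned_addr = 0x80000000    # bit [1] == 0
unaligned_addr = 0x80000002  # bit [1] == 1, compressed branch in the upper half

print(select_prediction_buggy(unaligned_addr, predictions))
print(select_prediction_fixed(unaligned_addr, predictions))
```

For a word-aligned address both functions return the same entry, which is why the bug only shows up on unaligned fetches.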
This fix is a bit convoluted to explain. To make sure that the changes work properly I used this personal branch (https://github.com/OttG/cva6/tree/bht_test_branch). The folder bht_model there contains a script that analyzes the traces output by an RTL simulation and compares them with a software model of the bimodal predictor currently in CVA6. Other modifications to the core were necessary to make sure that the software model behaves like the RTL version of CVA6. The critical part was stopping the frontend any time a branch is encountered. This is necessary because in the SW model the BHT saturation counters are updated right after a branch is encountered. In the RTL this does not happen, since the instruction has to go through the pipeline and be executed before the counters are updated. Because the frontend is decoupled from the backend, multiple predictions may be made before the tables are updated, meaning that predictions are made with an outdated counter. Stopping the frontend each time a branch is encountered avoids this and makes the SW and RTL behave the same way. With this and the previous fix for the BHT update (https://github.com/openhwgroup/cva6/pull/754), the SW model and the RTL show no differences in their predictions.
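The effect of the update delay can be reproduced with a toy model. The sketch below is not the script from the bht_model folder. It assumes a single 2-bit saturating counter (the counter width, the initial state, and the fixed `delay` parameter are assumptions made for illustration) and compares updating immediately, as the SW model does, with updating only after a number of later branches have already been predicted, as can happen in the RTL.

```python
# Toy model of one bimodal (saturating counter) BHT entry.
# Counter width, initial state, and the fixed delay are assumptions.

class SaturatingCounter:
    def __init__(self, state: int = 1):
        self.state = state  # 0..3; values >= 2 predict "taken"

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def run_immediate(outcomes, state=1):
    """SW-model behaviour: update right after each branch."""
    counter = SaturatingCounter(state)
    predictions = []
    for taken in outcomes:
        predictions.append(counter.predict())
        counter.update(taken)
    return predictions


def run_delayed(outcomes, delay, state=1):
    """RTL-like behaviour: an outcome reaches the counter only after
    `delay` further branches have already been predicted."""
    counter = SaturatingCounter(state)
    predictions = []
    pending = []  # branches predicted but not yet resolved
    for taken in outcomes:
        predictions.append(counter.predict())
        pending.append(taken)
        if len(pending) > delay:
            counter.update(pending.pop(0))
    return predictions


outcomes = [True, True, True, True]
print(run_immediate(outcomes))   # [False, True, True, True]
print(run_delayed(outcomes, 2))  # [False, False, False, True]
```

With `delay=0` the two functions agree, which corresponds to stalling the frontend on every branch so that each prediction sees an up-to-date counter.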
Signed-off-by: Gianmarco Ottavi <gianmarco@openhwgroup.org>