m/sret optimization
Created by: ezelioli
Problem:
With the current implementation, when an sret retires, the pipeline stalls for several cycles (>30) due to i-cache misses that are never resolved (since outstanding cache misses are killed on a pipeline flush).
Proposed solution:
Treat m/sret operations as direct jumps to PC + 0 when computing npc_d. This avoids polluting the instruction cache with useless instructions and generating spurious i-cache misses.
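To illustrate the idea, here is a toy Python model of the fetch-address selection. It is only a sketch and not the actual RTL. It assumes one sequential fetch per cycle, no branch prediction, 4-byte instructions, and a 16-byte cache line. All names in it (next_fetch_pc, lines_touched, LINE_BYTES, INSTR_BYTES) are made up for this example.

```python
# Toy model of fetch-address selection around an m/sret; NOT the actual RTL.
# Every name and constant below is hypothetical and chosen for illustration.

LINE_BYTES = 16   # assumed i-cache line size in bytes
INSTR_BYTES = 4   # assumes uncompressed 32-bit instructions


def next_fetch_pc(pc, is_xret, treat_xret_as_self_jump):
    """Return the next speculative fetch address after the instruction at pc."""
    if is_xret and treat_xret_as_self_jump:
        return pc + 0          # proposed: m/sret acts like a jump to itself
    return pc + INSTR_BYTES    # baseline: keep fetching sequentially


def lines_touched(xret_pc, cycles_until_retire, treat_xret_as_self_jump):
    """Collect the cache lines requested while the m/sret is still in flight."""
    pc = xret_pc
    lines = set()
    for _ in range(cycles_until_retire):
        lines.add(pc // LINE_BYTES)
        is_xret = (pc == xret_pc)
        pc = next_fetch_pc(pc, is_xret, treat_xret_as_self_jump)
    return lines


# The m/sret sits at an arbitrary example address and retires 8 fetches later.
baseline = lines_touched(0x8000_0100, 8, treat_xret_as_self_jump=False)
proposed = lines_touched(0x8000_0100, 8, treat_xret_as_self_jump=True)
```

In this model the baseline run spills into the cache line after the m/sret, while the proposed run only ever requests the line that holds the m/sret itself, so no new miss can be outstanding when the flush arrives.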
The only software workaround I found is to manually insert a j . instruction (a jump to itself) after each m/sret. However, the same effect can be implemented directly as a hardware optimization, which makes ISR performance more reliable.
Assuming no other source of stalling (e.g. a real i-cache miss or a TLB miss), this brings the sret penalty down to that of a pipeline flush (6 cycles).