Wrong (missing) register file bypass usage by pv.cplxconj
Created by: Silabs-ArjanB
The pv.cplxconj instruction does not correctly use the register file bypasses. In the following example the pv.cplxconj.b should get its x12 operand (0xfffe3000) from the preceding lui instruction. However, the register file bypass is not triggered, so the pv.cplxconj.b instruction uses the stale value 0x00000000 as its ALU operand.
Time   Cycles  PC        Instr     Mnemonic
86ns   4       00000180  00000013  nop
96ns   5       00000182  00000013  nop
106ns  6       00000184  fffe3637  lui x12, 0xfffe3000           x12=fffe3000
116ns  7       00000186  5cc19657  pv.cplxconj.b x12, x3, x12    x12=00000000 x3:00000000 x12:00000000
126ns  8       0000018a  00000013  nop
136ns  9       0000018c  00000013  nop
146ns  10      0000018e  00000013  nop
The expected behavior is observed with the following code (where I just inserted some nops so that the RTL does not have to rely on the register file bypass):
.section .text.start
.global _start
.type _start, @function
_start:
.word 0x00010001        # two c.nop (0x0001 each)
lui x12,0xfffe3
nop
nop
nop
.word 0x5cc19657        # pv.cplxconj.b x12, x3, x12
nop
nop
nop
The erroneous behavior is observed with the following code, where the pv.cplxconj.b immediately follows the lui:
.section .text.start
.global _start
.type _start, @function
_start:
.word 0x00010001        # two c.nop (0x0001 each)
lui x12,0xfffe3
.word 0x5cc19657        # pv.cplxconj.b x12, x3, x12
nop
nop
nop
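To make the raw `.word 0x5cc19657` easier to check against the trace, here is a minimal sketch that splits the word into the standard RISC-V R-type register fields. The helper name `decode_r_type` and the `top6` key are made up for illustration; treating bits [31:26] as the value the decoder switches on is an assumption based on the `6'b01011_1` case label, not something verified in the RTL.

```python
def decode_r_type(word):
    """Split a 32-bit instruction word into standard R-type fields."""
    return {
        "opcode": word & 0x7F,          # bits [6:0]
        "rd": (word >> 7) & 0x1F,       # bits [11:7]
        "funct3": (word >> 12) & 0x7,   # bits [14:12]
        "rs1": (word >> 15) & 0x1F,     # bits [19:15]
        "rs2": (word >> 20) & 0x1F,     # bits [24:20]
        # Assumption: the decoder case label 6'b01011_1 matches bits [31:26].
        "top6": (word >> 26) & 0x3F,
    }


fields = decode_r_type(0x5CC19657)
for name, value in fields.items():
    print(f"{name:7s} = {value:#x}")
```

With this word, rd is 12, rs1 is 3 and rs2 is 12, which agrees with the trace's `pv.cplxconj.b x12, x3, x12`. The x12 value written by the lui is therefore read through the rs2 (operand B) port.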
I believe (but didn't check) that the issue is caused by the following RTL in the 'decoder':
6'b01011_1: begin // pv.cplxconj.h
  alu_operator_o = ALU_ABS;
  is_clpx_o = 1'b1;
  scalar_replication_o = 1'b0;
  regb_used_o = 1'b0; // <-- suspected cause
end
My guess (without understanding what this instruction is supposed to do) is that regb_used_o needs to be 1.
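The suspected mechanism can be sketched as a small behavioral model. This is not derived from the RTL: it assumes that the forwarding path for operand B is only considered when the decoder marks register B as used, and the names `select_operand_b`, `ex_fwd` and `regb_used` are hypothetical stand-ins for whatever the core actually uses.

```python
def select_operand_b(rs2, regfile, ex_fwd, regb_used):
    """Return the operand B value the ALU would see.

    ex_fwd is a (rd, value) pair for the result still in flight from the
    previous instruction. Assumption: the bypass is only taken when the
    decoder flags operand B as used.
    """
    fwd_rd, fwd_value = ex_fwd
    if regb_used and fwd_rd == rs2 and fwd_rd != 0:
        return fwd_value          # bypass: take the in-flight result
    return regfile[rs2]           # otherwise read the (possibly stale) register file


regfile = [0] * 32                # x12 has not been written back yet
in_flight = (12, 0xFFFE3000)      # lui x12, 0xfffe3 still in the pipeline

stale = select_operand_b(12, regfile, in_flight, regb_used=False)
forwarded = select_operand_b(12, regfile, in_flight, regb_used=True)
print(f"regb_used=0 -> {stale:#010x}, regb_used=1 -> {forwarded:#010x}")
```

Under these assumptions, `regb_used=False` gives the stale 0x00000000 seen in the trace, and `regb_used=True` gives the expected 0xfffe3000.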
Also note the related ticket https://github.com/openhwgroup/core-v-docs/issues/95. The RTL comment only mentions pv.cplxconj.h, whereas I think the above RTL handles both .h and .b. The documentation only mentions pv.cplxconj and does not mention the existence of pv.cplxconj.b or pv.cplxconj.h.