Core does not boot and debugger does not work when adding an AXI master on the bus
Created by: Juan-Gg
I am trying to connect a custom accelerator to the cva6 AXI bus. The accelerator will have an AXI slave port for configuration and an AXI master port for reading and writing data directly in main memory. As a first step, I am attaching a simple AXI master to the bus that just writes to a fixed memory address, so that I can later check that address's contents with gdb. The accelerator code is as follows:
AXI master code:
// A test to integrate an AXI master in the CVA6 APU
// See axi_mem_if/src/axi2mem.sv for example use of AXI_BUS interface (as a slave that is)
module axi_master_test #(
parameter int unsigned AXI_ID_WIDTH,
parameter int unsigned AXI_ADDR_WIDTH,
parameter int unsigned AXI_DATA_WIDTH,
parameter int unsigned AXI_USER_WIDTH,
parameter logic [31:0] ADDRESS,
parameter logic [31:0] DATA
) (
input logic clk_i,
input logic rst_ni,
AXI_BUS.Master axi_master_port
);
// Default values. See "Table A10-1 Master interface write channel signals and default signal values"
// // AXI write address channel
assign axi_master_port.aw_id = '0;
// assign axi_master_port.aw_addr;
assign axi_master_port.aw_len = '0; // Number of beats in burst - 1
assign axi_master_port.aw_size = 3'b010; // Bytes per beat: 3'b010 = 4 bytes (32b access), 3'b011 would be 8 bytes
assign axi_master_port.aw_burst = 2'b00; // Burst type FIXED (2'b00)
assign axi_master_port.aw_lock = '0;
assign axi_master_port.aw_cache = '0;
assign axi_master_port.aw_prot = 3'b0; // Unprivileged access
assign axi_master_port.aw_qos = '0;
assign axi_master_port.aw_region = '0;
assign axi_master_port.aw_atop = '0; // Configures atomic operations (AXI 5)
assign axi_master_port.aw_user = '0;
// assign axi_master_port.aw_valid;
// assign axi_master_port.aw_ready; // Input
// // AXI write data channel
// assign axi_master_port.w_data;
assign axi_master_port.w_strb = '1; // Strobe, byte enable
assign axi_master_port.w_last = 1'b1; // Single beat
assign axi_master_port.w_user = '0;
// assign axi_master_port.w_valid;
// assign axi_master_port.w_ready; // Input
// // AXI write response channel
// assign axi_master_port.b_id; // Input
// assign axi_master_port.b_resp; // Input
// assign axi_master_port.b_user; // Input
// assign axi_master_port.b_valid; // Input
assign axi_master_port.b_ready = 1'b1; // No error checking
// // AXI read address channel
assign axi_master_port.ar_id = '0;
// assign axi_master_port.ar_addr;
assign axi_master_port.ar_len = '0; // Number of beats in burst - 1
assign axi_master_port.ar_size = 3'b010; // Number of bytes per beat, let 4 bytes for 32b access
assign axi_master_port.ar_burst = 2'b00; // Burst type FIXED
assign axi_master_port.ar_lock = '0;
assign axi_master_port.ar_cache = '0;
assign axi_master_port.ar_prot = 3'b0; // Unprivileged access
assign axi_master_port.ar_qos = '0;
assign axi_master_port.ar_region = '0;
assign axi_master_port.ar_user = '0;
// assign axi_master_port.ar_valid;
// assign axi_master_port.ar_ready; // Input
// // AXI read data channel
// assign axi_master_port.r_id; // Input
// assign axi_master_port.r_data; // Input
// assign axi_master_port.r_resp; // Input
// assign axi_master_port.r_last; // Input
// assign axi_master_port.r_user; // Input
// assign axi_master_port.r_valid; // Input
// assign axi_master_port.r_ready;
// -----------------------------------
assign axi_master_port.w_data = DATA;
assign axi_master_port.aw_addr = ADDRESS;
logic [9:0] timer_aw_q, timer_aw_d;
logic [9:0] timer_w_q, timer_w_d;
always_ff @(posedge clk_i) begin
if(!rst_ni) begin
timer_w_q <= 2;
timer_aw_q <= 2;
end else begin
timer_w_q <= timer_w_d;
timer_aw_q <= timer_aw_d;
end
end
always_comb begin
// Defaults
axi_master_port.aw_valid = 1'b0;
timer_w_d = timer_w_q + 1;
axi_master_port.w_valid = 1'b0;
timer_aw_d = timer_aw_q + 1;
// Write address
if (timer_aw_q == 0) begin
// Wait for transaction
axi_master_port.aw_valid = 1'b1;
if(!axi_master_port.aw_ready) // Wait for ready
timer_aw_d = timer_aw_q;
end
// Write data
if (timer_w_q == 0) begin
// Wait for transaction
axi_master_port.w_valid = 1'b1;
if(!axi_master_port.w_ready) // Wait for ready
timer_w_d = timer_w_q;
end
end
endmodule
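A side note on the listing above: the lines for ar_addr, ar_valid and r_ready are commented out, so as shown these three master outputs have no driver. If the read channels are meant to stay idle, they could be tied off explicitly. This is a minimal sketch under that assumption (the master never issues a read); the lines would go inside axi_master_test. Whether this has anything to do with the hang is not established.

```systemverilog
// Sketch only: tie-offs for the unused read channels of axi_master_test.
// Assumption: this master never issues a read transaction.
assign axi_master_port.ar_addr  = '0;   // no read address is ever presented
assign axi_master_port.ar_valid = 1'b0; // never request a read
assign axi_master_port.r_ready  = 1'b0; // no read data is expected
```

With ar_valid held low, no read request ever reaches the crossbar, so the value chosen for r_ready does not matter in practice.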
I simulated this using an AXI crossbar with the same configuration and memory map as in cva6, with an axi2mem module and a simulated RAM standing in for DRAM. I had to simulate with --no-timing in Verilator; otherwise I got a nice segmentation fault. I’m instantiating the accelerator in ariane_xilinx.sv as follows:
axi_master_test #(
.AXI_ID_WIDTH ( AxiIdWidthSlaves ),
.AXI_ADDR_WIDTH ( AxiAddrWidth ),
.AXI_DATA_WIDTH ( AxiDataWidth ),
.AXI_USER_WIDTH ( AxiUserWidth ),
.ADDRESS('h9000_0000),
.DATA('hABCD)
) axi_master_test_i (
.clk_i (clk ),
.rst_ni (ndmreset_n ),
.axi_master_port (slave[2]) // Slave port in xbar
);
The only other changes I made were:
- Incremented NBSlave from 2 to 3 to accommodate the CPU, the debug module, and my accelerator. This also changes AxiIdWidthSlaves from 5 to 6 bits.
- Incremented NrSlaves from 2 to 3 in ariane_soc_pkg.sv.
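The ID width change can be checked with a small stand-alone module. This is a sketch that assumes the slave-side ID width is the master-side ID width (taken to be 4 here) plus $clog2 of the number of crossbar slave ports, which matches the 5 to 6 bit change described above. The module and function names are hypothetical.

```systemverilog
// Sketch only: reproduces the ID width arithmetic under the stated assumption.
module id_width_check;
  // Assumed master-side ID width; not taken from the post.
  localparam int unsigned AxiIdWidthMaster = 4;

  // Assumed relation: extra ID bits identify the originating xbar slave port.
  function automatic int unsigned slv_id_width(input int unsigned nb_slave);
    return AxiIdWidthMaster + $clog2(nb_slave);
  endfunction

  initial begin
    // NBSlave = 2: CPU and debug module
    assert (slv_id_width(2) == 5) else $error("unexpected width for NBSlave=2");
    // NBSlave = 3: CPU, debug module and the new AXI master
    assert (slv_id_width(3) == 6) else $error("unexpected width for NBSlave=3");
    $display("NBSlave=2 -> %0d bits, NBSlave=3 -> %0d bits",
             slv_id_width(2), slv_id_width(3));
    $finish;
  end
endmodule
```

Any module on the memory side of the crossbar that is parameterized with the ID width has to see the new value too, which is why both NBSlave and NrSlaves were changed together.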
After the aforementioned changes, the core no longer boots (i.e., there is no UART activity, whereas before it printed “Hello World! init SPI …”). Running OpenOCD results in the following errors:
Info : clock speed 1000 kHz
Info : JTAG tap: riscv.cpu tap/device found: 0x43651093 (mfg: 0x049 (Xilinx), part: 0x3651, ver: 0x4)
Info : datacount=2 progbufsize=8
Error: unable to halt hart 0
Error: dmcontrol=0x80000001
Error: dmstatus =0x00000c82
Error: Fatal: Hart 0 failed to halt during examine()
Warn : target riscv.cpu examination failed
Info : starting gdb server for riscv.cpu on 3333
Info : Listening on port 3333 for gdb connections
init routine started
Error: Target not examined yet
Note: I am working with a kc705 board and debugging as I describe in #1803 (closed), using JTAG through Xilinx’s BSCANE2. This should not have any influence on the issue at hand. I would appreciate any ideas as to what I may be doing wrong. Maybe I am somehow preventing the CPU from using the bus? I tried a couple of things, such as using ndmreset_n instead of rst_n as the reset signal (the former seems to be generated from the latter; some modules use one, some use the other). Synthesis takes over an hour on my computer.
Note2: I just noticed that the CPU hangs when my AXI master reads from RAM; reading from other addresses seems to work fine.
Thanks.