01-23-2019 03:09 AM - edited 01-23-2019 07:53 AM
Hi,
my goal is to parse a kernel data structure of a microkernel (Fiasco.OC) and write the data back to RAM. There might be better approaches, but this is more or less a proof of concept.
I use the DMA core for the transfer and a custom IP core (kcap_ng_v1.66) with an AXI Stream Master and an AXI Stream Slave for parsing. The incoming data is an array of 16-byte structs, and the output is an array of 8-byte structs.
Problem:
As you can see in the source code, the number of incoming structs is counted with read_counter. Since the whole stream is 256 bytes, this counter reaches 16, so all of the data is read in.
We assume that every struct is valid, which should result in the same number of outgoing structs. These are counted with valid_counter. But this is not the case, and the outgoing count differs in every run. It is as if some valid structs are randomly ignored at line 87.
I don't know how this is possible. Is it a timing problem? Do I need two buffers, one for incoming data and one for outgoing data? Is there still a problem with the m00_axis_tvalid and s00_axis_tready assignments (lines 54-56)?
`timescale 1 ns / 1 ps

module kcap_ng_v1_0 #
(
    // Parameters of Axi Master Bus Interface M00_AXIS
    parameter integer C_M00_AXIS_TDATA_WIDTH = 64,
    parameter integer C_S00_AXIS_TDATA_WIDTH = 32
)
(
    // Ports of Axi Master Bus Interface M00_AXIS
    input wire m00_axis_aclk,
    input wire m00_axis_aresetn,
    output wire m00_axis_tvalid,
    output wire [C_M00_AXIS_TDATA_WIDTH-1 : 0] m00_axis_tdata,
    output wire [(C_M00_AXIS_TDATA_WIDTH/8)-1 : 0] m00_axis_tstrb,
    output wire m00_axis_tlast,
    input wire m00_axis_tready,

    // Ports of Axi Slave Bus Interface S00_AXIS
    input wire s00_axis_aclk,
    input wire s00_axis_aresetn,
    output wire s00_axis_tready,
    input wire [C_S00_AXIS_TDATA_WIDTH-1 : 0] s00_axis_tdata,
    input wire [(C_S00_AXIS_TDATA_WIDTH/8)-1 : 0] s00_axis_tstrb,
    input wire s00_axis_tlast,
    input wire s00_axis_tvalid,

    output wire [31 : 0] total_kcap_count,
    output wire [31 : 0] valid_kcap_count,
    output wire [31 : 0] wrote_kcap_count
);

    reg [(C_M00_AXIS_TDATA_WIDTH/8)-1 : 0] tstrb = 0;
    reg tlast = 1'b1; // fix, as every outgoing kcap is a single package.

    reg [15:0] m00_badge;
    reg [31:0] m00_kcap;
    reg [15:0] s00_badge;

    reg [31:0] read_counter;   // counts all valid and invalid incoming kcap values
    reg [31:0] uint32_counter; // counter for read in uint32 values
    reg [31:0] wrote_counter;  // counts all outgoing valid kcap values
    reg [31:0] valid_counter;  // counts all received valid kcap values

    // output wires for debugging
    assign valid_kcap_count = valid_counter;
    assign total_kcap_count = read_counter;
    assign wrote_kcap_count = wrote_counter;

    localparam [15:0] UNUSED = 16'h0000;
    localparam [15:0] INVALID_ID = 16'hffff;

    // don't continue reading values until current stored values (m00_kcap, m00_badge) are written.
    /*line 54-56*/
    wire full = wrote_counter != valid_counter;
    assign s00_axis_tready = ~full;
    assign m00_axis_tvalid = full;

    // output wires
    assign m00_axis_tdata = {16'h0000, m00_badge, m00_kcap};
    assign m00_axis_tstrb = tstrb;
    assign m00_axis_tlast = tlast;

    // read logic
    always @(posedge s00_axis_aclk) begin
        if (!s00_axis_aresetn) begin
            uint32_counter <= 0;
            read_counter <= 0;
            valid_counter <= 0;
        end else if (s00_axis_tvalid && s00_axis_tready) begin // data is available
            // every read struct has size of four uint32.
            if (uint32_counter == 3) begin
                uint32_counter <= 0;
            end else begin
                uint32_counter <= uint32_counter + 1;
            end

            // the second read uint32 value contains the `badge` value
            if (uint32_counter == 1) begin
                read_counter <= read_counter + 1;
                s00_badge <= s00_axis_tdata[15:0];

                // filter all valid badges
                /*line 87*/
                if (s00_badge != UNUSED && s00_badge != INVALID_ID) begin
                    m00_kcap <= (read_counter << 12); // calculate kernel capability
                    m00_badge <= s00_badge;           // store badge in register
                    valid_counter <= valid_counter + 1;
                end
            end
        end
    end

    // write logic
    always @(posedge m00_axis_aclk) begin
        if (!m00_axis_aresetn) begin
            wrote_counter <= 0;
        end else if (m00_axis_tvalid && m00_axis_tready) begin
            wrote_counter <= wrote_counter + 1;
        end
    end

endmodule
I created a testbench and passed m00_badge and s00_badge. You can see that s00_badge is set to 4444 at 55 ns. This value is equal to neither UNUSED nor INVALID_ID. So why does the following line not take effect?
m00_badge <= s00_badge;
01-23-2019 04:33 PM
Oh, you nutty software guys...
The "<=" in Verilog is known as the non-blocking assignment operator. When it's placed in a block with a sensitivity list containing a posedge keyword, the block effectively creates registered assignments/transactions. Every assignment in the block is transacted simultaneously when the block is entered--i.e., when a positive edge of the clock occurs.
The "s00_badge <= s00_axis_tdata[15:0]" assignment that you say works happens on the exact same clock edge as the "m00_badge <= s00_badge" assignment that you say doesn't work. When the clock edge (at 55 ns) occurred, s00_badge hadn't been assigned a new value yet; it doesn't take that value until after the clock edge. Naturally, that value couldn't be passed to m00_badge.
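To make the one-edge delay concrete, here is a minimal simulation sketch. The module name nonblocking_demo and the tdata register are made up for illustration (tdata stands in for s00_axis_tdata[15:0]); only the two register names come from the original code.

```verilog
`timescale 1 ns / 1 ps

// Hypothetical demo module, not part of the original design.
module nonblocking_demo;
    reg        clk       = 0;
    reg [15:0] tdata     = 16'h4444;  // stands in for s00_axis_tdata[15:0]
    reg [15:0] s00_badge = 16'h0000;
    reg [15:0] m00_badge = 16'h0000;

    always #5 clk = ~clk;             // 10 ns clock, rising edges at 5, 15, ...

    always @(posedge clk) begin
        s00_badge <= tdata;           // right-hand sides are sampled at the edge
        m00_badge <= s00_badge;       // so this sees the OLD s00_badge
    end

    initial begin
        @(posedge clk); #1;
        $display("after edge 1: s00_badge=%h m00_badge=%h", s00_badge, m00_badge);
        @(posedge clk); #1;
        $display("after edge 2: s00_badge=%h m00_badge=%h", s00_badge, m00_badge);
        $finish;
    end
endmodule
```

After the first edge s00_badge already holds 4444 while m00_badge is still 0000; m00_badge only catches up on the second edge. The same applies to the "if (s00_badge != UNUSED ...)" test at line 87: it evaluates the badge stored from the previous struct, not the one arriving on the bus.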
The entire non-reset clause of this block is qualified by the "s00_axis_tvalid && s00_axis_tready" check. So the assignments therein will only occur when this condition is true. The "m00_badge <= s00_badge" assignment you're waiting on will not happen until s00_badge gets updated with a new value--i.e., when conditions are met that will allow both assignments to occur again.
This is an RTL (register-transfer level) problem. You need to rewrite your HDL so that it accounts for assignments that take effect over successive clock edges.
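One possible way to do that, as a sketch only: test the badge in the word that is on the bus right now instead of the registered copy. The module below is a trimmed-down, hypothetical variant of the read logic (badge_filter_sketch, badge_now and the testbench are invented names; s00_axis_tready is turned into an input here purely to keep the example small). It assumes, as the original code does, that the badge sits in the low 16 bits of the second word. It does not touch the handshake at lines 54-56 or the single-entry output buffering, which may need attention separately.

```verilog
`timescale 1 ns / 1 ps

// Hypothetical, trimmed-down variant of the read logic.
module badge_filter_sketch (
    input  wire        s00_axis_aclk,
    input  wire        s00_axis_aresetn,
    input  wire        s00_axis_tvalid,
    input  wire        s00_axis_tready,   // assumption: driven from outside here
    input  wire [31:0] s00_axis_tdata,
    output reg  [15:0] m00_badge,
    output reg  [31:0] m00_kcap,
    output reg  [31:0] read_counter,
    output reg  [31:0] valid_counter
);
    localparam [15:0] UNUSED     = 16'h0000;
    localparam [15:0] INVALID_ID = 16'hffff;

    reg [1:0] uint32_counter;             // position within the four-word struct

    // Combinational view of the badge in the word currently on the bus.
    wire [15:0] badge_now = s00_axis_tdata[15:0];

    always @(posedge s00_axis_aclk) begin
        if (!s00_axis_aresetn) begin
            uint32_counter <= 0;
            read_counter   <= 0;
            valid_counter  <= 0;
        end else if (s00_axis_tvalid && s00_axis_tready) begin
            uint32_counter <= uint32_counter + 1;   // 2-bit counter wraps 3 -> 0
            if (uint32_counter == 1) begin
                read_counter <= read_counter + 1;
                // Test the incoming badge itself, not the previous struct's register.
                if (badge_now != UNUSED && badge_now != INVALID_ID) begin
                    m00_kcap      <= (read_counter << 12);
                    m00_badge     <= badge_now;
                    valid_counter <= valid_counter + 1;
                end
            end
        end
    end
endmodule

// Hypothetical testbench: streams one struct whose second word carries 0x4444.
module badge_filter_sketch_tb;
    reg         clk = 0, rstn = 0, tvalid = 0;
    reg  [31:0] tdata = 0;
    wire [15:0] m00_badge;
    wire [31:0] m00_kcap, read_counter, valid_counter;
    integer     i;

    badge_filter_sketch dut (
        .s00_axis_aclk(clk), .s00_axis_aresetn(rstn),
        .s00_axis_tvalid(tvalid), .s00_axis_tready(1'b1),
        .s00_axis_tdata(tdata),
        .m00_badge(m00_badge), .m00_kcap(m00_kcap),
        .read_counter(read_counter), .valid_counter(valid_counter)
    );

    always #5 clk = ~clk;

    initial begin
        repeat (2) @(posedge clk);        // hold reset for two edges
        #1 rstn = 1; tvalid = 1;
        for (i = 0; i < 4; i = i + 1) begin
            tdata = (i == 1) ? 32'h0000_4444 : 32'h0;
            @(posedge clk); #1;
        end
        tvalid = 0;
        $display("m00_badge=%h valid_counter=%0d", m00_badge, valid_counter);
        $finish;
    end
endmodule
```

With this change the first valid struct is counted and stored on the same edge on which its badge word is accepted, so the simulation prints m00_badge=4444 and valid_counter=1 after a single struct.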
-Joe G.
02-06-2019 11:46 AM
Thank you very much for that explanation. This was one of the problems.
@Everybody who has similar problems and is working with Genode: the following points helped to solve my problem.
env.ram().alloc(SIZE, Cache_attribute::UNCACHED)