srojas
Observer
786 Views
Registered: ‎05-20-2019

AXI4 full slave read burst RVALID 50%

Hi,

I want to use an AXI4 slave to exchange data between the PS and the PL.

  • Ultrascale MPSoC
  • Vivado 2020.1
  • Interface: M_AXI_HPM0_FPD with a width of 128 bit.
  • AXI Slave data width: 32
  • AXI slave created with the IP Package Manager: default example

I want to burst-read data from the FPGA. Using the System ILA, I noticed that, unlike write transactions, read data comes back with an idle cycle between consecutive beats; see the RVALID signal below. I expected RVALID to stay asserted for the whole burst, as WVALID does.

Screenshot 2021-02-09 180944.png

srojas_0-1612891540488.png

 

Doesn't this reduce the performance for reading data?

Someone had reported a similar behavior in

https://forums.xilinx.com/t5/Memory-Interfaces-and-NoC/MIG-7-series-DDR3-AXI-slave-read-50-duty-cycle/m-p/1053192

 

Does anyone know why this happens, and how the throughput can be improved?

I appreciate your comments.

Kind regards.

 

5 Replies
dgisselq
Scholar
764 Views
Registered: ‎05-21-2015

@srojas ,

I wrote about this and several other bugs prominent in Xilinx's demonstration AXI4 cores some time ago.  The bugs within them have been present since at least 2016, and have not been fixed as of Vivado 2020.2 in spite of bug reports dating back (roughly) two years.

You can find a better/working AXI core here--one that'll get you 100% throughput.

Dan

srojas
Observer
703 Views
Registered: ‎05-20-2019

@dgisselq 

Thank you for your response.

I have read your blog about these bugs. Great material to understand AXI and its workings.

I think there are many users, starting with AXI, who would like to first use the Xilinx example. I was wondering if anyone has been able to figure out what changes have to be done to the code in order to improve the performance when reading.

Taking a look at the AMBA AXI and ACE Protocol Specification, the RVALID signal seems to be set by the slave, so there has to be a way to improve the throughput in the provided example.

 

dpaul24
Scholar
694 Views
Registered: ‎08-07-2014

@srojas ,

I want to burst read data from the FPGA. Using the System ILA I noticed that unlike the write transactions, when reading data, there seems to be an idle between one data and the next one, see signal RVALID below. I was expecting the RVALID to be set the whole time, similar as with WVALID.

It might be due to the AXI4 interconnect being used. I suggest reviewing the interconnect configuration settings once again (you can probably optimize them).

Did you check the RVALID signal driven by the AXI4 slave itself?

If the slave drives RVALID high continuously, but RVALID at the interconnect looks as in your screenshot above, then the problem surely lies in the interconnect (again, it might be a settings issue).

Otherwise, if you see the same RVALID behavior driven by the slave, find out whether your slave is really able to provide data continuously.

Having said that, the best test would be to connect the master and slave without the interconnect and then perform a burst read.


srojas
Observer
468 Views
Registered: ‎05-20-2019

Hi,

With AXI SmartConnect instead of Interconnect I get the same results. I tried removing the interconnect and connecting the AXI-Full slave directly to the PS, but the system stalls after a burst read. 

Since the slave is the one driving the RVALID signal, the example code provided by Xilinx has to be checked against the AMBA AXI and ACE Protocol Specification. Here is the logic setting RVALID:

// axi_rvalid is asserted for one S_AXI_ACLK clock cycle when both 
// S_AXI_ARVALID and axi_arready are asserted. The slave registers 
// data are available on the axi_rdata bus at this instance. The 
// assertion of axi_rvalid marks the validity of read data on the 
// bus and axi_rresp indicates the status of read transaction.axi_rvalid 
// is deasserted on reset (active low). axi_rresp and axi_rdata are 
// cleared to zero on reset (active low).  

always @( posedge S_AXI_ACLK )
begin
  if ( S_AXI_ARESETN == 1'b0 )
	begin
	  axi_rvalid <= 0;
	  axi_rresp  <= 0;
	end 
  else
	begin    
	  if (axi_arv_arr_flag && ~axi_rvalid)
		begin
		  axi_rvalid <= 1'b1;
		  axi_rresp  <= 2'b0; 
		  // 'OKAY' response
		end   
	  else if (axi_rvalid && S_AXI_RREADY)
		begin
		  axi_rvalid <= 1'b0;
		end            
	end
end    
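
To see why this block caps a burst at 50% duty cycle, here is a minimal cycle-level sketch in Python (a behavioural model, not RTL). It assumes the burst has already been accepted, so axi_arv_arr_flag stays high, and that the master holds RREADY high throughout. The function name `xilinx_example_rvalid_trace` is made up for illustration. Because the set condition requires ~axi_rvalid, RVALID can only rise in a cycle after it has fallen, so every beat costs two clocks.

```python
def xilinx_example_rvalid_trace(burst_len):
    """Return the value of RVALID in each clock cycle of one read burst.

    Models only the quoted always block. Assumptions (not from the core):
    axi_arv_arr_flag is held high for the whole burst and RREADY is always 1.
    """
    rvalid = False        # axi_rvalid register, cleared by reset
    arv_arr_flag = True   # assumed: AR handshake already done
    rready = True         # assumed: no backpressure from the master
    trace = []
    beats_done = 0
    while beats_done < burst_len:
        trace.append(rvalid)
        if rvalid and rready:
            beats_done += 1          # a data beat transfers this cycle
        # Next-state logic, mirroring the if / else-if of the Verilog.
        if arv_arr_flag and not rvalid:
            next_rvalid = True
        elif rvalid and rready:
            next_rvalid = False
        else:
            next_rvalid = rvalid
        rvalid = next_rvalid         # non-blocking update at the clock edge
    return trace


trace = xilinx_example_rvalid_trace(4)
print("".join("1" if v else "0" for v in trace))  # 01010101
print(sum(trace) / len(trace))                    # 0.5
```

Even with RREADY permanently high, the else-if branch clears RVALID after every accepted beat, which matches the toggling seen in the ILA capture.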

 

I can't do more testing right now, but I wanted to mention it in case someone else is interested.

Kind regards,

Sebastian

dgisselq
Scholar
445 Views
Registered: ‎05-21-2015

@srojas ,

So, let me recap this conversation so far:

  1. I point you to an article outlining several bugs in Xilinx's demonstration cores, bugs that have been around since before 2016 and have not been fixed as of Vivado 2020.2.  Indeed, your request above starts from a figure copied from that article.  (It's the one marked "Fig 4. An AXI Read Transaction" above, and was generated using wavedrom.  You can find the original in Fig 9 of the article I cited above.)
  2. The article points out that these Xilinx designs could well hang your AXI bus and the rest of your design with it. It recommends not using them.
  3. The article doesn't spend much time discussing duty cycle because, well, who cares if you can achieve 50% or even 100% read duty cycle if the AXI IP will cause your entire design to hang.
  4. Since the article was written, Xilinx has acknowledged these bugs and promised a fix.
  5. Your response to this was, "I think there are many users, starting with AXI [?], who would like to first use the Xilinx example."
    • Okay, so ... you want to proceed with this broken design anyway?  Fair enough.  That's your call.
  6. You then place this design into a Zynq system and the system hangs.  Does this surprise you? It doesn't surprise me at all. It's much what I would expect.  As the article pointed out, Xilinx's design is quite broken. Does it really make a difference whether or not it only gets 50% duty cycle?
  7. The second article cited above discusses how to build a better AXI slave--one that works, where the read logic is replaced with working logic.
    • It doesn't hang the bus
    • It reads from the read address, never the write address, and writes to the write address--never the read address.
    • It properly handles narrow burst addressing
    • It achieves the 100% throughput you were looking for.  (Not even Xilinx's AXI block RAM controller will get 100% throughput ...)
  8. Since posting those two articles, AXI bugs have been found in other Xilinx IP, including two other template designs, their AXI Ethernet-lite controller and (more recently) their AXI Quad SPI IP.
  9. I could post links to articles describing these problems in abundance, but ... I've been warned that posting too many links will get my responses marked as SPAM, so I try to avoid that if possible.
  10. I have heard that Xilinx hasn't posted updates to any of these broken designs because they have no way of reproducing these bugs with their VIP.  This makes verifying any modified, corrected or even updated design a challenge.
  11. But ... you are still asking how to patch this broken design?  Well, okay, but you'll have a long road ahead of you.

The first step to patching the IP would be to make certain that RVALID gets set any time there's a value to return and !RVALID || RREADY. You could do this easily by removing the ~axi_rvalid criterion from the logic block above. If you do this, however, the house of cards is likely to start falling.

  • S_AXI_RLAST won't get set in a timely fashion
  • I'm not sure you'll necessarily be reading the right values from memory. You might find yourself off by a cycle. If you want to check this yourself, you're going to need to check both with and without backpressure. One of the big problems Xilinx's demonstration cores have suffered from is a lack of proper backpressure handling. You'll need to check for this.
  • In general, everything in the read chain following the AR* request should transition on !RVALID || RREADY.  (This was one of the bugs in Xilinx's AXI Ethernet-lite controller, their response waiting on RREADY causing it to hang under certain conditions.  Most Verification IP won't catch this, but it is easily caught via a formal verification check.)
  • Similarly, be forewarned of anything transitioning on xVALID && xREADY && anything_else.  For example, Xilinx's demo core starts a read whenever !ARREADY && ARVALID && !axi_awv_awr_flag && !axi_arv_arr_flag.  Elsewhere, they start on !AXI_ARREADY && ARVALID && !axi_arv_arr_flag.  Frankly, both of these are recipes for disaster.  All transitions on the AR* channel in the slave should be based on ARREADY && ARVALID and NOTHING ELSE.  Any other condition is likely to hang the bus when ARVALID && ARREADY are both true but the other condition isn't.  The problem is, if you don't fix this properly, you'll be (still) stuck reading from the wrong mem_address whenever both reads and writes show up at the same time.
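
The !RVALID || RREADY rule described above can be sketched as a small cycle-level model in Python (behavioural only, not RTL). The function name `rvalid_trace` and the `rready_pattern` argument are hypothetical, and the model deliberately leaves out RLAST, address generation and the AR handshake, which are exactly the parts the bullets warn about. It shows only that updating the R channel whenever it is empty or the current beat is accepted keeps RVALID high across a burst, and holds it high under backpressure.

```python
def rvalid_trace(burst_len, rready_pattern):
    """Return RVALID per clock when the R channel updates on (!RVALID || RREADY).

    rready_pattern is a repeating list of RREADY values, one per cycle,
    used here to emulate master backpressure (an assumption of this sketch).
    """
    if not any(rready_pattern):
        raise ValueError("RREADY never asserts, so the burst cannot finish")
    rvalid = False
    pending = burst_len   # beats not yet placed on the R channel
    delivered = 0         # beats accepted by the master
    trace = []
    cycle = 0
    while delivered < burst_len:
        rready = rready_pattern[cycle % len(rready_pattern)]
        trace.append(rvalid)
        if rvalid and rready:
            delivered += 1
        # The channel may change only when it is empty or the beat is taken.
        if (not rvalid) or rready:
            rvalid = pending > 0
            if rvalid:
                pending -= 1
        cycle += 1
    return trace


def as_bits(trace):
    return "".join("1" if v else "0" for v in trace)


print(as_bits(rvalid_trace(4, [True])))         # 01111
print(as_bits(rvalid_trace(4, [True, False])))  # 011111111
```

In the first run, four beats transfer in four consecutive cycles. In the second, RREADY is low every other cycle and RVALID stays asserted while the beat waits, so throughput is limited by the master rather than by the slave. Any real patch would still need to be checked with and without backpressure, as noted above.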

Are you sure you want to go down this road?

Dan
