rdb98791
Adventurer
1,263 Views
Registered: ‎03-05-2020

Custom AXI-Lite Slave IP causing Microblaze to stall

I made a Microblaze block diagram in Vivado 2019.1, and tested it with a quick "Hello World" application. This was successful: I could step through code with the debugger and see text in the serial port terminal program.

I then added a custom AXI-Lite slave IP that was developed on an earlier version of Vivado (known to work). I added it to the Block Design and the Connection Automation had no issues attaching to the AXI SmartConnect and assigning it an address in the Address Editor. The Block Design validated, and I generated a bitstream with a passing timing score.

Now when I try to run the same "Hello World" application, the debugger never starts. When I click the "Pause" button in SDK, I get the error message "Cannot stop MicroBlaze. Stalled on instruction fetch." Trying to simply "Run" the application (not debug), I never see the serial port printout.

I attached an ILA to both the UART IP and my custom IP to try and diagnose the issue. The only difference I see right after programming is that BREADY and RREADY are both high for the UART IP, while these signals are both low for the custom IP. The master drives this, so I'm not sure why there's a difference. All other signals were zeros on both IP.

The only other slight difference I can find is how the custom IP's labels appear in the Address Editor tab. The working IPs show "Base Name = Reg" while my IP shows "Base Name = S00_AXI_reg". Not sure if this needs to match the other IPs exactly or not:

AddressAssignments.png

10 Replies
dgisselq
Scholar
1,247 Views
Registered: ‎05-21-2015

@rdb98791,

Custom AXI slave peripheral?  Did you generate it from Vivado's AXI-lite slave core generator?  If so, you should know both the AXI and AXI-lite demonstration designs have been broken for upwards of 3 years now and have yet to be fixed.  I first found the bug in the code generated by Vivado 2016.3.  At that time, there were bugs in both the read and write halves of the core.  (You'll want to check your write half for the bug as well!)  While the design has since been updated as of at least 2018.3, the bug in the read half of the core remains as of Vivado 2020.1.  Here you can see what a trace of the bug looks like:

axi-read-fault.png

If this bug is ever hit, the bus will seize, halting any and all subsequent usage of the bus.  The bug is known to pass an AXI VIP check, providing you with a false sense of the design logic working when it doesn't.  Worse, whether or not the bug gets triggered appears to depend upon an interconnect setting--something independent of your logic entirely.

The quick and easy way to check if your core has this bug (at least on the read side) is to check the logic for ARREADY to see whether or not ARREADY is responsive to back pressure.  That is, ARREADY shouldn't be set high (in this design) unless !RVALID || RREADY.  In Xilinx's demonstration core, this check is non-existent which will cause your design to hang with these (or similar) symptoms.  The write side check is similar.

If you look closely at the logic, you'll also see checks for xVALID && xREADY && something else.  This check for something else is also a serious AXI bug.  Ask yourself, what would the design do if the something else weren't true?  If the answer is that it wouldn't respond to the bus, which I've seen often, then you are again inviting a bus lock up like this.
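As a rough illustration of both checks, here is a cycle-level toy model of the read channel in Python. It is not Xilinx's generated code and not the logic from the blog post: the function name `simulate_reads`, its parameters, and the master that re-asserts ARVALID immediately after each accepted request are all assumptions made for this sketch. The `ar_hs and not rvalid` branch plays the role of the "xVALID && xREADY && something else" check.

```python
def simulate_reads(check_backpressure, stall_cycles, n_requests=2, cycles=40):
    """Toy cycle-level model of one AXI-lite read channel (hypothetical).

    Returns (requests accepted by the slave, responses taken by the master).
    """
    arready = False   # slave's registered ARREADY
    rvalid = False    # slave's registered RVALID
    accepted = 0
    responses = 0

    for cycle in range(cycles):
        # Master outputs this cycle: keep requesting until all reads are
        # accepted, and hold RREADY low for the first `stall_cycles` cycles.
        arvalid = accepted < n_requests
        rready = cycle >= stall_cycles

        # Handshakes that complete on this clock edge.
        ar_hs = arvalid and arready
        r_hs = rvalid and rready
        accepted += int(ar_hs)
        responses += int(r_hs)

        # Slave next-state logic.
        if check_backpressure:
            can_accept = (not rvalid) or rready   # !RVALID || RREADY
        else:
            can_accept = True                     # no back-pressure check
        next_arready = (not arready) and arvalid and can_accept

        if ar_hs and not rvalid:   # "VALID && READY && something else"
            next_rvalid = True
        elif r_hs:
            next_rvalid = False
        else:
            next_rvalid = rvalid   # an accepted request can be lost here

        arready, rvalid = next_arready, next_rvalid

    return accepted, responses


print(simulate_reads(check_backpressure=False, stall_cycles=10))  # (2, 1)
print(simulate_reads(check_backpressure=True, stall_cycles=10))   # (2, 2)
print(simulate_reads(check_backpressure=False, stall_cycles=0))   # (2, 2)
```

Without the back-pressure check, the model accepts the second request while RVALID is still stalled and never answers it, so the master is left waiting for a response that will not come. With `stall_cycles=0` the same faulty logic returns `(2, 2)`, which suggests why a test that always holds RREADY high would not notice.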

Fixing the logic is actually fairly easy.  Check this blog post for a better way of handling the AXI signaling logic.  It's also fairly easy to check your design against a formal property set to see if your design is bug free or not (link above).

Dan

rdb98791
Adventurer
1,210 Views
Registered: ‎03-05-2020

Awesome. I'm not familiar with the concept of a formal property file. I'm assuming since the verilog file faxil_slave.v treats all axi signals as inputs, this is basically a line sniffer with protocol checking? 

Is the intent to run the formal property in a simulation environment, or in real time on hardware? 

dgisselq
Scholar
1,200 Views
Registered: ‎05-21-2015

@rdb98791,


@rdb98791 wrote:

Awesome. I'm not familiar with the concept of a formal property file. I'm assuming since the verilog file faxil_slave.v treats all axi signals as inputs, this is basically a line sniffer with protocol checking? 


I suppose it's sort of like a line sniffer with protocol checking.  That particular file could be used within a simulation to do protocol checking that way.  Using it that way would restrict what it can do to the creativity of your simulation generation--part of the problem that gets folks into this mess in the first place.


@rdb98791 wrote:

Is the intent to run the formal property in a simulation environment, or in real time on hardware? 

Formal methods are actually very different from simulation.

simulation-v-formal.png

Vivado doesn't support formal verification, although there are several open source and commercial tools you can use for that purpose on source files you might have.  The article you read above used SymbiYosys for this purpose.

In simulation, you provide the inputs to your module.  With formal, you don't provide the inputs at all, you provide constraints instead.  The formal solver will then search all possible input combinations subject to the constraints you've given to find one combination that will break an assertion.

In your case, your design "worked" before.  It passed a simulation for a given bus driver, but wouldn't have passed with all possible bus drivers.  The simulation didn't check all the different ways the bus might have been driven, but rather only one possible way the bus could've been driven.  Formal methods, on the other hand, check *EVERYTHING* looking for a bug.  This is also their Achilles heel: checking everything can quickly become combinatorially prohibitive.  Indeed, it's exponential in its complexity.  Therefore, you tend to only apply formal methods to smaller design components, and for limited numbers of time steps.

For example:

  1. Many simulation scripts will check reading from or writing to a slave design but never both.  As an example, Xilinx's AXI ethernet-lite controller passed their acceptance tests yet it will still fail if you ever try reading and writing to it on the same clock cycle--the write will get sent to the read address.
  2. Most simulation scripts will raise BREADY or RREADY whenever they are interacting with an AXI slave core.  Why?  Because that core is the only item in the simulation and so there's no reason to stall the design on return.  The AXI protocol, however, doesn't require this.  Instead, the AXI protocol requires that a slave set BVALID or RVALID and then wait for the corresponding ready.  Simulation scripts tend not to catch things like the slave waiting for RREADY before setting RVALID.  In this case, Xilinx's AXI ethernet-lite controller will hang.
  3. Backpressure is the term used to describe BVALID && !BREADY or equivalently RVALID && !RREADY.  It's used when the master isn't able to accept a return on the current clock cycle.  Backpressure can be introduced by a crossbar interconnect as it tries to win arbitration on the return path.  It can also be introduced by the interconnect when crossing clock domains as the answer waits to transfer from a faster clock to a slower one.  Many simulation scripts don't check for backpressure, leaving users with the false sense of belief that their design works--even when they haven't yet checked all cases.  This is one of the problems with both Xilinx's AXI and AXI-lite demo designs--the simulation script never applies backpressure, so you'll never know if the design will work in all cases or not.  Formal methods will check all of the above.
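To make the contrast with simulation a little more concrete, here is a small Python sketch that brute-forces every legal input sequence up to a fixed depth against a toy read-channel slave. This is not SymbiYosys and not how a real solver works internally (real tools use SAT/SMT engines rather than enumeration), and it is not the `faxil_slave.v` property set. The names `step` and `find_counterexample`, the slave logic, and the single assertion (outstanding reads must match RVALID) are all assumptions chosen for illustration.

```python
from itertools import product


def step(state, arvalid, rready, check_backpressure):
    """One clock of a hypothetical AXI-lite read slave.

    state is (arready, rvalid); returns (next_state, ar_hs, r_hs).
    """
    arready, rvalid = state
    ar_hs = arvalid and arready
    r_hs = rvalid and rready
    can_accept = ((not rvalid) or rready) if check_backpressure else True
    next_arready = (not arready) and arvalid and can_accept
    if ar_hs and not rvalid:
        next_rvalid = True
    elif r_hs:
        next_rvalid = False
    else:
        next_rvalid = rvalid
    return (next_arready, next_rvalid), ar_hs, r_hs


def find_counterexample(check_backpressure, depth=6):
    """Try every (ARVALID, RREADY) sequence of length `depth`.

    Returns the first input trace that breaks the assertion, or None.
    """
    inputs = list(product((False, True), repeat=2))
    for trace in product(inputs, repeat=depth):   # 4**depth traces
        state = (False, False)
        outstanding = 0    # accepted reads not yet answered
        pending = False    # ARVALID raised but not yet accepted
        for arvalid, rready in trace:
            # Constraint (assumption): a master may not drop ARVALID
            # before the request has been accepted.
            if pending and not arvalid:
                break      # illegal input sequence, skip this trace
            state, ar_hs, r_hs = step(state, arvalid, rready,
                                      check_backpressure)
            pending = arvalid and not ar_hs
            outstanding += int(ar_hs) - int(r_hs)
            # Assertion: the slave holds exactly one response per
            # outstanding read, so the count must match RVALID.
            if outstanding != int(state[1]):
                return trace
    return None


print(find_counterexample(check_backpressure=False) is not None)  # True
print(find_counterexample(check_backpressure=True))               # None
```

The cost grows as `4**depth` here, which is the exponential blow-up mentioned above. The search needs no hand-written stimulus: it finds an input pattern that loses a request in the slave without the back-pressure check, and finds none at this depth once the check is in place.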

Hopefully that little explanation helps you understand better what's going on.  The reality is that there are several bugs throughout the Xilinx eco-system that I've caught using formal methods.  These are only some of them.

Dan


rdb98791
Adventurer
1,179 Views
Registered: ‎03-05-2020

Thanks for the detailed description. Interesting stuff.

I have some new findings:

1) I replaced my custom IP with another Vivado AXI GPIO module. Everything worked fine with this setup (no microblaze stalling).

2) I created a new AXI-Lite IP using Vivado's "Create and Package New IP." I left all options as defaults, and I didn't modify any of the files. I instantiated this directly in place of the AXI GPIO module (even the same address in the Address Editor). The microblaze is stalling again.

dgisselq
Scholar
1,168 Views
Registered: ‎05-21-2015

@rdb98791,

Absolutely!  Xilinx's AXI GPIO passes a formal verification check, so it is bus compliant.  It gets absolutely horrible performance, but that's irrelevant to bus compliance.

You might find this video on the topic interesting, to know more of what's going on.

Dan

rdb98791
Adventurer
987 Views
Registered: ‎03-05-2020

Well I have adapted the axi logic equations from "wb2axip/demoaxi.v", but the microblaze is still stalling. I'm going to start a new thread since the details have now changed (I'm no longer even trying to get a "custom" axi-lite slave IP to work, but simply just the default/auto-generated code from Vivado).

Do you have a verified AXI IP that can be integrated into Vivado (with correct directory structures, "component.xml", tcl scripts, etc)? Something that you deem to be a correct AXI-lite slave implementation that I can easily plug into the Vivado block design, for testing?

dgisselq
Scholar
975 Views
Registered: ‎05-21-2015

@rdb98791,

No, sorry, 1) I don't place Vivado's source directories in source control since they are tool produced, and 2) I currently just build those as I need them.

Here's another approach you might find valuable: place a firewall between AXI master and slave, either above or below the interconnect.  (Might work better below the interconnect.)  If the design still fails (it might not), you'd at least be able to start narrowing down the problem using Xilinx's internal logic analyzer.

Dan

rdb98791
Adventurer
957 Views
Registered: ‎03-05-2020

Okay thanks. I will give that a try next.

dgisselq
Scholar
945 Views
Registered: ‎05-21-2015

@rdb98791,

Just checking but ... you do realize that the default auto-generated code by Vivado is broken, right?  It has been broken since at least 2016.3 and has yet to be fixed as of 2020.1.

Dan

rdb98791
Adventurer
917 Views
Registered: ‎03-05-2020

Yes, your posts and the articles have explained that clearly. But when I adapted the logic from your code it ended up with the same result, which makes me think something else may be wrong (SDK bug, MicroBlaze debugging module, JTAG, AXI SmartConnect, etc). I'm hoping that it's something very simple and a quick fix. I'm not super interested in doing a deep dive into the guts of AXI unless there's absolutely no other choice. Xilinx isn't paying me to debug their logic. And if I have to do everything from the ground up, I'd probably go with an open core RISC-V with Wishbone or just a simple parallel bus interface. I already have some pretty good simulation results with that. I think you can get some ARM cores nowadays too.
