Explorer
17,359 Views
Registered: 11-13-2007

Design or timing problem somewhere. What is the best way to debug?

I have a Virtex-4 FX design done in EDK 10.1. There is a lot of custom IP attached to the PowerPC.

Sometimes after compiling the design, nothing works. The PowerPC won't even start. Then I make a small change to something trivial (such as changing the revision constant in a register) and it all starts working again.

I cannot continue like this. The problem with debugging this is that I have no idea where in the design to start looking. There are thousands of lines of custom VHDL, plus all the behind-the-scenes work EDK does to connect everything.

I've asked my FAEs this question, and no one seems to have an answer. Does anyone at Xilinx have a methodology that is better than random shots into the sky? The timing report doesn't tell me crap. The FPGA Editor is so awkward to use; it seems like an early GUI bolted onto a DOS tool. I'm not sure where to start looking for this problem.

Professor
Registered: 08-14-2007

If you suspect a timing problem, you can start with the Post Place & Route timing analysis. If, for example, you make a small change and everything breaks, you may have a timing problem. The post place & route timing analysis can show things like unconstrained paths (you need to turn this on by setting "Report Unconstrained Paths" to some large number in the Post P&R static timing report properties). If you are failing any constraints, you can also find that easily and see the paths that are affected.

 

If all of your timing constraints are being met but you still have erratic behavior from run to run, this is often due to asynchronous inputs to state logic. XST converts most state machines to one-hot encoding for performance. Asynchronous inputs to a one-hot state machine can cause the logic to stop by going "zero hot", or to behave badly by asserting more than one state at once. This behavior cannot be fixed in the source simply by assigning behaviors to default states or by using all possible states of a binary state variable, because the state variable gets converted to one bit per state in the synthesis process. You can change the default FSM encoding in the XST properties, but of course the real fix is to make sure all state machine inputs are synchronized to its clock.
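
A minimal sketch of the synchronizing step described above, assuming a single-bit asynchronous input. The entity and signal names (sync_2ff, async_in, sync_out) are made up for illustration and do not come from the original poster's design. A multi-bit bus crossing clock domains generally needs a different technique, so treat this as the one-bit case only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Two-flop synchronizer: brings one asynchronous bit into the clk domain
-- before it is allowed to reach any state machine.
-- All names here are hypothetical.
entity sync_2ff is
  port (
    clk      : in  std_logic;  -- clock of the state machine that uses the signal
    async_in : in  std_logic;  -- input with no timing relationship to clk
    sync_out : out std_logic   -- registered copy for use in clk-domain logic
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta_ff : std_logic := '0';  -- first stage, may go metastable
  signal sync_ff : std_logic := '0';  -- second stage, gives it a cycle to settle
begin
  process (clk) is
  begin
    if rising_edge(clk) then
      meta_ff <= async_in;
      sync_ff <= meta_ff;
    end if;
  end process;

  sync_out <= sync_ff;
end architecture rtl;
```

The state machine then reads sync_out instead of the raw pin, so every state bit sees the same value on a given clock edge.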

 

Beyond that, there is always ChipScope, which puts a logic analyzer into your design. The problem with ChipScope and timing issues is that ChipScope often changes the timing behavior in a way that masks the problem.

-- Gabor

Explorer
Registered: 11-13-2007

Thanks for your reply.

 

I've tried post-P&R timing analysis and looked through hundreds of reported unconstrained nets. These won't fail because they are unconstrained, right?

The bulk of my design runs at 100 MHz, clocked from a 50 MHz input through a DCM. The PowerPC core runs at 300 MHz, also clocked by a DCM. The interface bus (PLB) runs at 100 MHz.

Deep down in my VHDL, I have some auto-generated PLB interface code which I modified to include a 'version' register the PowerPC can read. It looks something like this:

 

  -- implement slave model software accessible register(s) read mux
  SLAVE_REG_READ_PROC : process( slv_reg_read_sel, slv_reg0, slv_reg1, slv_reg2, slv_reg3, slv_reg4, slv_reg5, slv_reg6, slv_reg7, slv_reg8, slv_reg9, slv_reg10, slv_reg11, slv_reg12, slv_reg13, slv_reg14, slv_reg15, slv_reg16, slv_reg17, slv_reg18, slv_reg19, slv_reg20, slv_reg21, slv_reg22, slv_reg23, slv_reg24, slv_reg25, slv_reg26, slv_reg27, slv_reg28, slv_reg29, slv_reg30, slv_reg31 ) is
  begin

    case slv_reg_read_sel is
      when "10000000000000000000000000000000" => slv_ip2bus_data <= slv_reg0;
      when "01000000000000000000000000000000" => slv_ip2bus_data <= slv_reg1;
      when "00100000000000000000000000000000" => slv_ip2bus_data <= slv_reg2;
      when "00010000000000000000000000000000" => slv_ip2bus_data <= slv_reg3;
      when "00001000000000000000000000000000" => slv_ip2bus_data <= slv_reg4;
      when "00000100000000000000000000000000" => slv_ip2bus_data <= slv_reg5;
      when "00000010000000000000000000000000" => slv_ip2bus_data <= slv_reg6;
      when "00000001000000000000000000000000" => slv_ip2bus_data <= slv_reg7;
      when "00000000100000000000000000000000" => slv_ip2bus_data <= slv_reg8;
      when "00000000010000000000000000000000" => slv_ip2bus_data <= slv_reg9;
      when "00000000001000000000000000000000" => slv_ip2bus_data <= slv_reg10;
      when "00000000000100000000000000000000" => slv_ip2bus_data <= slv_reg11;
      when "00000000000010000000000000000000" => slv_ip2bus_data <= slv_reg12;
      when "00000000000001000000000000000000" => slv_ip2bus_data <= slv_reg13;
      when "00000000000000100000000000000000" => slv_ip2bus_data <= slv_reg14;  
      when "00000000000000010000000000000000" => slv_ip2bus_data <= slv_reg15;
      when "00000000000000001000000000000000" => slv_ip2bus_data <= slv_reg16;
      when "00000000000000000100000000000000" => slv_ip2bus_data <= slv_reg17;
      when "00000000000000000010000000000000" => slv_ip2bus_data <= slv_reg18;
      when "00000000000000000001000000000000" => slv_ip2bus_data <= slv_reg19;
      when "00000000000000000000100000000000" => slv_ip2bus_data <= slv_reg20;
      when "00000000000000000000010000000000" => slv_ip2bus_data <= slv_reg21;
      when "00000000000000000000001000000000" => slv_ip2bus_data <= slv_reg22;
      when "00000000000000000000000100000000" => slv_ip2bus_data <= slv_reg23;
      when "00000000000000000000000010000000" => slv_ip2bus_data <= slv_reg24;
      when "00000000000000000000000001000000" => slv_ip2bus_data <= slv_reg25;  
      when "00000000000000000000000000100000" => slv_ip2bus_data <= slv_reg26;  
      when "00000000000000000000000000010000" => slv_ip2bus_data <= slv_reg27;
      when "00000000000000000000000000001000" => slv_ip2bus_data <= slv_reg28;  
      when "00000000000000000000000000000100" => slv_ip2bus_data <= slv_reg29;  
      when "00000000000000000000000000000010" => slv_ip2bus_data <= slv_reg30;      
      when "00000000000000000000000000000001" => slv_ip2bus_data <= X"00010004"; --version 1.0004 
  
      when others => slv_ip2bus_data <= (others => '0');
    end case;
  end process SLAVE_REG_READ_PROC;

 

 

The design works great in general. Now if I change X"00010004" to X"00010005", the design still builds just fine, but the PowerPC will not start, nor will anything else work.

Yes, I agree there is a problem somewhere. The question is how to find it. Using ChipScope in this case is useless because if ChipScope can actually run, then the problem isn't present for that specific build.

 

Historian
Registered: 02-25-2008

akleica wrote:

> Thanks for your reply.
>
> I've tried post-P&R timing analysis and looked through hundreds of reported unconstrained nets. These won't fail because they are unconstrained, right?
>
> The bulk of my design runs at 100 MHz, clocked from a 50 MHz input through a DCM. The PowerPC core runs at 300 MHz, also clocked by a DCM. The interface bus (PLB) runs at 100 MHz.

Are you resetting your DCM?

 

-a

----------------------------
Yes, I do this for a living.

Explorer
Registered: 11-13-2007

That's a good question. I have a pin defined as my reset and it has a pull up resistor on it. It can be toggled by another CPU in the system, but generally it just sits there pulled up by the resistor.

 

A long time ago, as I vaguely remember, I had a problem getting the DCM RST to respond to this pin, so I tied it to GND and checked the 'External reset is active high' box. This has worked for over a year.

Today I tried connecting the DCM RST to the actual pin again, unchecking the 'active high' box, (because the pin is always high), and rebuilt.

 

This change has no effect. For the project that works, it still works; for the project in a 'broken' state, it's still broken. 

Historian
Registered: 02-25-2008

akleica wrote:

> That's a good question. I have a pin defined as my reset and it has a pull up resistor on it. It can be toggled by another CPU in the system, but generally it just sits there pulled up by the resistor.
>
> A long time ago, as I vaguely remember, I had a problem getting the DCM RST to respond to this pin, so I tied it to GND and checked the 'External reset is active high' box. This has worked for over a year.
>
> Today I tried connecting the DCM RST to the actual pin again, unchecking the 'active high' box, (because the pin is always high), and rebuilt.
>
> This change has no effect. For the project that works, it still works; for the project in a 'broken' state, it's still broken.


 

RTFDS. Virtex4 DCM reset needs to be at least 200 ms (yes, milliseconds).

 

Now it isn't clear at all that you need to assert the DCM reset at power-up; none of the stuff created by EDK Base System Builder includes such a DCM reset, which would require a big counter.
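
To give a sense of the size of that counter: with the 50 MHz input clock mentioned earlier in the thread, 200 ms is 10,000,000 cycles, which needs 24 bits. The following is only a rough sketch of such a reset stretcher. The entity name, generics, and port names are hypothetical, and it assumes that register initial values are loaded at configuration and that clkin is the raw input clock rather than a DCM output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Holds the DCM reset high for HOLD_MS milliseconds after configuration.
-- All names and defaults here are illustrative assumptions.
entity dcm_reset_stretch is
  generic (
    CLKIN_HZ : natural := 50_000_000;  -- assumed DCM input clock frequency
    HOLD_MS  : natural := 200          -- hold time quoted in the post above
  );
  port (
    clkin   : in  std_logic;  -- raw input clock, not a DCM output
    dcm_rst : out std_logic   -- active high, drives the DCM RST pin
  );
end entity dcm_reset_stretch;

architecture rtl of dcm_reset_stretch is
  -- 10,000,000 cycles with the default generics
  constant HOLD_CYCLES : natural := (CLKIN_HZ / 1000) * HOLD_MS;
  signal count : natural range 0 to HOLD_CYCLES := 0;
  signal rst_r : std_logic := '1';  -- reset asserted from configuration
begin
  process (clkin) is
  begin
    if rising_edge(clkin) then
      if count = HOLD_CYCLES then
        rst_r <= '0';           -- hold time elapsed, release the DCM
      else
        count <= count + 1;     -- still counting, keep reset asserted
      end if;
    end if;
  end process;

  dcm_rst <= rst_r;
end architecture rtl;
```

The counter stops once it reaches HOLD_CYCLES, so the reset is released once and stays released until the FPGA is reconfigured.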

 

One way to tell if the DCM is running is to clock a simple T flip-flop with each one of the DCM output clocks, bring those flop outputs to pins, and watch them with a 'scope. If they're not toggling, then the DCM isn't running. Also bring the DCM LOCKED output to a pin for easy monitoring.
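
A small sketch of that check for one DCM output clock, assuming spare pins are available. The entity and port names (clk_alive, toggle_pin, locked_pin) are invented for illustration; one instance would be needed per clock being watched.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Toggles a pin on every rising edge of a DCM output clock and passes
-- LOCKED straight through. All names here are hypothetical.
entity clk_alive is
  port (
    dcm_clk    : in  std_logic;  -- one DCM output clock
    dcm_locked : in  std_logic;  -- LOCKED output of the same DCM
    toggle_pin : out std_logic;  -- to a spare pin; toggles at dcm_clk / 2
    locked_pin : out std_logic   -- to a spare pin or LED
  );
end entity clk_alive;

architecture rtl of clk_alive is
  signal toggle_ff : std_logic := '0';
begin
  process (dcm_clk) is
  begin
    if rising_edge(dcm_clk) then
      toggle_ff <= not toggle_ff;  -- T flip-flop with T tied high
    end if;
  end process;

  toggle_pin <= toggle_ff;
  locked_pin <= dcm_locked;
end architecture rtl;
```

For a fast clock such as the 300 MHz PowerPC clock, the toggling pin would run at 150 MHz, so replacing the single flop with a wider divider may make it easier to see on a 'scope.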

 

Check your design to make sure you've got your feedback and the various DCM configurations correct.

 

-a

----------------------------
Yes, I do this for a living.

Explorer
Registered: 11-13-2007

I've read the manual, thanks.

 

You know the drill: compile, download via JTAG, debug, etc...

 

The board stays powered up constantly during this, and the resets (everywhere) are de-asserted. The DCM is running.

 

 

Explorer
Registered: 11-13-2007

Apparently the JTAG port has access into the reset block so hopefully it's doing the right thing.

Explorer
Registered: 11-13-2007

So my FAE says I should try 'simulating' this problem. I don't own the proper tool, so I'm here asking for advice.

 

Do any of the simulators work in a post P&R environment? I don't have a VHDL functional issue, I have something that breaks sometimes after random P&R.

I think it will probably be fixed with a re-write of some VHDL or possibly some constraint to something.

At this point, I have a build of my project which is exhibiting a very weird and predictable behavior. I have my full VHDL hardware project combined with a simple PowerPC printf-type statement writing characters to one of my UARTs. The current build has data bit D2 stuck high in the UART byte, so whatever I write out to the UART becomes garbage.

For example, '+' is represented by 0x2B (0010 1011), and what comes out of my UART is '/', which is 0x2F (0010 1111): the same byte with bit 2 forced high.

There is absolutely no reason for this to be happening. It's the PowerPC talking over PLB4.6 to a Xilinx UART. Of course my VHDL is hanging off the side of this bus too, but it shouldn't affect it this way.

 

I'd like to see what is going on in the chip to have data bit D2 always stuck high.

 

Anyone got any advice?

Explorer
Registered: 11-13-2007

So I have the ChipScope PLBv46 core in my project now, and the serial port is actually NOT working (which is good), so maybe I can watch PLB transactions to it.

My current problem comes after I launch ChipScope. It shows 115 generically labeled data ports and I have no idea which signals are mapped to those ports. I can't find this in the documentation either.

 

Any ideas?

What document should I be reading to explain this??

 

Scholar
Registered: 04-07-2008

Hi,

Did you import the CDC file to get the signal mapping?

Gary

Explorer
Registered: 11-13-2007

Hi Gary, thanks for the reply. I did forget to look and discovered no CDC file was generated.

It's usually at the root of the project and has a name such as chipscope_ila_0.cdc, but this project didn't create one.

 

The PLBv46 IBA doesn't have any connections to "my" IP. It's just a PLB analyzer and I would expect it to know the names of the PLB bits and pieces.

 

Tony

Explorer
Registered: 11-13-2007
(Accepted solution)

Just to follow up for others:

 

The CDC file did exist and was located further down in the project as compared to CDC files I've opened in the past.

It was here:

 

C:\AFolder\AProject\implementation\chipscope_plbv46_iba_0_wrapper\chipscope_plbv46_iba_0.cdc

 

Thanks Gary.

Explorer
Registered: 11-13-2007

So I'm now viewing the PLB using ChipScope, and I happen to have a build where the design doesn't work correctly. I've set up a system that has all of my custom IP, which is attached to the PowerPC only through the PLB. It boots into the bootloader stored in BRAM, and all it does is try to write to a PLB-based UART. When the system is working normally, I can see a known set of characters come out of the serial port. Right now, nothing is coming out of the serial port. I do see the PLB transactions writing to the serial port, and they look good. (I can change the written data in the source code, recompile, download into the FPGA, and see the correct changes in ChipScope.)

This just baffles me! My custom IP is not even being exercised right now. It's a bunch of 'stuff' attached only to the PLB; the PowerPC has access to it only through the PLB, yet somehow when I make 'minor' changes to my IP, the UART sitting on the same PLB fails.

 

Any brilliant FPGA designers out there want to chime in? gszakacs?? bassman59??
