ojay77
Observer
1,805 Views
Registered: ‎04-25-2018

ROM generated from core generator not re-initializing properly after reset...

Hello,

I have a large design prototyped on a KC705 board where my DSP logic operates off of a 300 MHz clock generated by a MMCM/PLL which relies on a 150 MHz reference that is provided externally through a physical PLL. The external PLL also provides a LOCK DETECT signal which I'm using to hold my logic in reset to ensure it runs only when the reference clock is stable.

Part of my FPGA DSP logic is an NCO which relies on two ROMs to generate the sine and cosine values (provided through an initialization .coe file).

I noticed that if I assert my design's global reset, the ROM contents get corrupted. Can you shed some light on this?

0 Kudos
31 Replies
pthakare
Moderator
1,725 Views
Registered: ‎08-08-2017

Hi @ojay77 

What is the configuration of the BMG? I presume it is an independent-clock, block-RAM-based implementation.

Are all the addresses getting corrupted, or only specific addresses?

How are you asserting and deasserting the reset?

Please follow this AR and let me know whether a setup/hold time violation is occurring on the reset signal.

https://www.xilinx.com/support/answers/42571.html

-------------------------------------------------------------------------------------------------------------------------------
Reply if you have any queries, give kudos and accept as solution
-------------------------------------------------------------------------------------------------------------------------------

drjohnsmith
Teacher
1,717 Views
Registered: ‎07-09-2009

Assuming you're using RAM as ROM, the RAM is initialised only at configuration to form the ROM.
You don't have a reset going to the RAM by mistake?
<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>

ojay77
Observer
1,706 Views
Registered: ‎04-25-2018

I didn't check all addresses but most at least seem corrupt. The ROM has no reset. https://forums.xilinx.com/t5/forums/replypage/board-id/OTHERIP/message-id/4630

ojay77
Observer
1,706 Views
Registered: ‎04-25-2018

It doesn't have a reset but I also tried enabling reset and resetting upon restart and it was still corrupted.

drjohnsmith
Teacher
1,695 Views
Registered: ‎07-09-2009

OK, are you certain the ROM is going into a RAM that supports initialisation from the init file? Have you gotten other ROMs working? Try a few small test ROMs and see if they are correct. What do you mean by your global reset? What do you drive with that in your design?

ojay77
Observer
1,691 Views
Registered: ‎04-25-2018

Like I said, the ROM works fine upon initial programming. It is when I reset the external PLL that it gets corrupted.

ojay77
Observer
1,690 Views
Registered: ‎04-25-2018

The core wouldn't even be generated if the .coe file was not compatible

drjohnsmith
Teacher
1,674 Views
Registered: ‎07-09-2009

OK, it's a funny one, so I'm trying to understand, and reading on the tablet it is not always easy to scroll back.

So when the external PLL is reset, what happens? Does the clock to the FPGA stop or glitch?
If so, then that's somewhere to look.

drjohnsmith
Teacher
1,674 Views
Registered: ‎07-09-2009

Why do you mention the COE file?

ojay77
Observer
1,624 Views
Registered: ‎04-25-2018

When the external PLL is reset the reference clock to the FPGA stops which in turn pulls the LOCKED signal on the FPGA PLL low which drives the reset of all the FPGA logic. So when the PLL clock is down, the FPGA logic is held in reset until the external clock is up again and the FPGA PLL is LOCKED. 

ojay77
Observer
1,623 Views
Registered: ‎04-25-2018

Because I use a COE file to initialize my ROM values?

brimdavis
Scholar
1,603 Views
Registered: ‎04-26-2012

@ojay77    "the LOCKED signal on the FPGA PLL low which drives the reset of all the FPGA logic"

One possibility to check for (although it normally produces only sporadic bit errors) is the following:

Violating address/enable setup and hold times on a BRAM can randomly corrupt the BRAM contents, ***even if the BRAM is configured as a ROM with WE permanently inactive***.

See AR21870 and AR42571:

"The setup/hold violation conditions are met when an asynchronous reset (or any other control signal) is deasserted or the clock is lost (even briefly) during device operation (for example, losing input clock to the DCM/MMCM or BUFR/BUFG coming into the BRAM).

In this case the block RAM contents can be corrupted even if the write enables are low"

This corruption is caused by internal bitline decode contention on an address/control setup/hold violation.

Any of the following conditions can cause random memory corruption:

  •  if there is a period of time where the PLL output is unstable before the PLL LOCKED deasserts
  •  if the BRAM is enabled while the internal MMCM is still locking, or while unlocking (runt clock pulses violate setup/hold)
  •  if you are asynchronously resetting registers in the BRAM address/control path with the BRAM enabled

You can work around this with either the BRAM enable line (hold low at startup then synchronously assert/deassert once clocks are stable), or by using a BUFGCE to disable the global clock buffer driving the BRAM until the clocks are known good.
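
The enable-gating workaround could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the entity name `bram_en_gate`, its ports, and the 64-cycle settling count are all hypothetical. Because the logic runs on the very clock it is guarding, it can only cover start-up and orderly resets; it cannot help if the clock degrades before LOCKED drops.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: holds the BRAM enable low until the MMCM has
-- reported LOCKED for a number of consecutive clock cycles.
entity bram_en_gate is
  generic (
    STABLE_CYCLES : positive := 64  -- assumed settling time, in clk cycles
  );
  port (
    clk       : in  std_logic;         -- MMCM output that also clocks the BRAM
    locked_in : in  std_logic;         -- MMCM LOCKED (asynchronous to clk)
    user_en   : in  std_logic;         -- enable requested by the user logic
    bram_en   : out std_logic := '0'   -- connect to the BRAM ENA pin
  );
end entity bram_en_gate;

architecture rtl of bram_en_gate is
  signal locked_sync : std_logic_vector(1 downto 0) := "00";
  signal count       : natural range 0 to STABLE_CYCLES := 0;
  signal clk_good    : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Two-flop synchronizer for the asynchronous LOCKED input.
      locked_sync <= locked_sync(0) & locked_in;

      if locked_sync(1) = '0' then
        -- Lock lost: drop the enable and restart the settling count.
        count    <= 0;
        clk_good <= '0';
      elsif count < STABLE_CYCLES then
        count <= count + 1;
      else
        clk_good <= '1';
      end if;

      -- Registered so the ENA pin only ever changes synchronously to clk.
      bram_en <= user_en and clk_good;
    end if;
  end process;
end architecture rtl;
```

After synthesis it is worth confirming in the schematic that `bram_en` really drives the BRAM enable pin rather than being absorbed into other logic.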

IMPORTANT NOTE:

If your external PLL can unlock ***without advance notice*** after the initial startup (i.e. the PLL output changes frequency, or suffers a phase transient, before LOCKED deasserts), then there is no easy way to work around this in the FPGA. One possibility is to add logic to dynamically reload the BRAMs (which are normally initialized only by the bitstream if configured as a ROM), or reconfigure the FPGA, upon detection of an unexpected PLL or MMCM unlock during operation.

-Brian

ojay77
Observer
1,595 Views
Registered: ‎04-25-2018

Thanks Brian, I initially thought that might be the cause. But this issue happens even when I do a 'soft reset' where I reset the FPGA while the external PLL frequency is maintained.

brimdavis
Scholar
1,589 Views
Registered: ‎04-26-2012

@ojay77   "But this issue happens even when I do a 'soft reset' where I reset the FPGA while the external PLL frequency is maintained."

Does this "soft reset" reset the internal MMCM?

Are the resets for the BRAM address/control line logic synchronous or asynchronous?

-Brian

ojay77
Observer
1,587 Views
Registered: ‎04-25-2018

The soft reset does reset the internal MMCM but its external reference frequency is stable.
The resets are all synchronous.

brimdavis
Scholar
1,572 Views
Registered: ‎04-26-2012

@ojay77    "The soft reset does reset the internal MMCM but its external reference frequency is stable."

An MMCM reset, regardless of the reference stability, will cause MMCM clock output glitches that can corrupt a BRAM-based ROM.

Does your BRAM control logic deassert the BRAM EN signal prior to resetting the MMCM, and hold EN inactive until after the MMCM LOCKED output asserts again?

-Brian

ojay77
Observer
1,524 Views
Registered: ‎04-25-2018

I am not controlling my enable because reset is supposed to be user controlled, so I cannot 'predict' when reset will be asserted and de-assert EN beforehand. That's not really possible.

Also, I should state that I found a working alternative (as mentioned previously) which basically dynamically re-writes the initialization values (declared in VHDL code) to a RAM and enables the read line only when all data is written. This solution works. However, my code is still synchronously reset, so if there's a glitch in the clock output then reading that initialization data would result in corrupt readings, right?

Here's my working code:

 

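-- Assumes ieee.std_logic_unsigned is in use (conv_integer, "+" and "<" on std_logic_vector).
----------------------------------------------------------------------
-- Copy rom_init into rom_block after every reset
----------------------------------------------------------------------
process (clka)
begin
   if (rising_edge(clka)) then
      if (rst_n = '0') then
         wr_complete <= '0';
         addra_wr <= (others => '0');
      elsif ena = '1' then
         if (addra_wr < 4095) then
            rom_block(conv_integer(addra_wr)) <= rom_init(conv_integer(addra_wr));
            addra_wr <= addra_wr + 1;
         elsif wr_complete = '0' then
            -- Last location (4095): write it and flag the copy as done
            rom_block(conv_integer(addra_wr)) <= rom_init(conv_integer(addra_wr));
            wr_complete <= '1';
         end if;
      end if;
   end if;
end process;

----------------------------------------------------------------------
-- Decode Address and Assign Output
----------------------------------------------------------------------
process(clka)
begin
   if (rising_edge(clka)) then
      -- Reads are blocked until the copy above has finished
      if ena = '1' and wr_complete = '1' then
         addra_reg <= addra;
         douta <= rom_block(conv_integer(addra_reg));
      end if;
   end if;
end process;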

drjohnsmith
Teacher
1,520 Views
Registered: ‎07-09-2009

What value does wr_complete take if addra_wr = 4095?

ojay77
Observer
1,518 Views
Registered: ‎04-25-2018

If wr_complete = 0 and addra_wr is 4095, wr_complete is asserted high.

brimdavis
Scholar
1,506 Views
Registered: ‎04-26-2012

@ojay77   "I am not controlling my enable because reset is supposed to be user controlled, so I cannot 'predict' when reset will be asserted and de-assert EN beforehand. That's not really possible."

To preserve initialized BRAMs, at initial startup, EN must start out inactive, and not be asserted until after the clock is known good (external clock present AND internal MMCM locked). It should be fairly straightforward to handle this initial FPGA startup condition in your code that generates ENA.

The tricky bit is handling an external reset that can happen at any time, possibly after the external PLL has already corrupted the internal clock before the external PLL LOCKED signal has deasserted (see below).

> if there's a glitch in the clock output then reading that initialization data would result in corrupt readings, right?

Right - if the clock fails while your initialization ROM is enabled for reading, it could get corrupted as well.

The way I've handled this sort of problem in past designs is upon detection of an external clock fault, use a BUFGMUX to switch operation of the associated BRAM logic over to another, guaranteed present clock and then reload the BRAM, deassert the BRAM EN, and wait for the external clock to return and lock the MMCM before switching the BUFGMUX back over (note that the 'hot spare' memory initialization ROM logic should always be clocked by this guaranteed present clock.)
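
A minimal sketch of the clock-switchover part of this scheme, assuming a Xilinx device with the UNISIM library available. The entity name and the signals `clk_safe` and `use_safe` are hypothetical; the fault detector that drives `use_safe` and the BRAM reload logic are not shown. Whether a plain BUFGMUX can switch away from an input that has stopped completely depends on the device family, so the clocking guide for the target part should be checked before relying on this.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical wrapper: selects the clock that drives the BRAM and its
-- reload logic.
entity bram_clk_select is
  port (
    clk_mmcm : in  std_logic;  -- 300 MHz MMCM output (may glitch or stop)
    clk_safe : in  std_logic;  -- assumed always-present clock
    use_safe : in  std_logic;  -- '1' selects clk_safe (from a fault detector)
    clk_bram : out std_logic   -- clock for the BRAM and its reload logic
  );
end entity bram_clk_select;

architecture rtl of bram_clk_select is
begin
  -- Global clock mux: S = '0' passes I0, S = '1' passes I1.
  clk_mux : BUFGMUX
    port map (
      O  => clk_bram,
      I0 => clk_mmcm,
      I1 => clk_safe,
      S  => use_safe
    );
end architecture rtl;
```

The idea is that `use_safe` goes high on a detected fault, the BRAM is reloaded on `clk_safe` with its enable then held low, and `use_safe` is released only after the MMCM has locked again.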

Also, you appear to be using inference to generate your ROMs, so I would suggest you double-check that your inferred enable signal has actually been synthesized into logic that drives the ENA pin of the BRAMs in the synthesized design (for a 'bulletproof' design I'd probably use the BRAM primitives).

-Brian

drjohnsmith
Teacher
1,498 Views
Registered: ‎07-09-2009

Remember, the tools are trying to connect hardware bits together from your code.

If you want to infer a ROM, there are very well-defined templates to do that

(but sorry, I can't seem to find them on my tablet screen!).

I'd strongly suggest:

  •  one process for the ROM,
  •  one process for the address generator,
  •  one process using the data output of the ROM (via a register).

You might also look at the schematic of the design, both pre- and post-synthesis, to see what the tools have done.

Are they implementing a ROM with an output register?
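
The three-process split suggested above might look something like the sketch below. It is an illustration only, not the vendor template: the entity name, port names, and the four placeholder table values are all made up, and a table this small would normally map to LUTs rather than block RAM. A real sine table would be much deeper.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rom_template is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;  -- synchronous, active high
    en   : in  std_logic;
    dout : out std_logic_vector(7 downto 0) := (others => '0')
  );
end entity rom_template;

architecture rtl of rom_template is
  type rom_t is array (0 to 3) of std_logic_vector(7 downto 0);
  -- Placeholder contents; a real sine/cosine table would go here.
  constant ROM_DATA : rom_t := (x"00", x"5A", x"7F", x"5A");

  signal addr  : unsigned(1 downto 0) := (others => '0');
  signal rom_q : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- Process 1: the ROM itself (synchronous read, gated by en, no reset).
  rom_p : process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        rom_q <= ROM_DATA(to_integer(addr));
      end if;
    end if;
  end process rom_p;

  -- Process 2: the address generator (the only process that sees rst).
  addr_p : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        addr <= (others => '0');
      elsif en = '1' then
        addr <= addr + 1;  -- wraps naturally at 2 bits
      end if;
    end if;
  end process addr_p;

  -- Process 3: the output register that consumes the ROM data.
  out_p : process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        dout <= rom_q;
      end if;
    end if;
  end process out_p;
end architecture rtl;
```

Keeping the reset out of the ROM process makes it easier for the tools to recognise the memory and to absorb the output register.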

  

 

ojay77
Observer
1,492 Views
Registered: ‎04-25-2018




The enable signal I am using in both cases (my working VHDL RAM code and the Vivado-generated ROM IP core) is the same. However, the IP core comes out of reset corrupted, and my code does not. So to me that doesn't explain why they're behaving differently.

 

Also, pay attention to the line of code where I'm writing data in the RAM (by reading the ROM content 'rom_init'). That is a clocked process ROM read and is susceptible to the same 'clock glitches' that may arise from sudden MMCM reset. So why isn't that getting corrupted?

drjohnsmith
Teacher
1,483 Views
Registered: ‎07-09-2009

Re glitch:
as it's a glitch, it's outside the timing parameters of the chip,
so how the chip behaves is undefined;
things like where the logic is placed on the chip could be relevant.

brimdavis
Scholar
1,473 Views
Registered: ‎04-26-2012

@ojay77   " (by reading the ROM content 'rom_init'). That is a clocked process ROM read and is susceptible to the same 'clock glitches' that may arise from sudden MMCM reset. So why isn't that getting corrupted"

The address to your rom_init() BRAM is static and unchanging, other than during the brief interval after reset when the rom_block() is loaded.

( i.e. if all the address/control inputs to the BRAM are unchanging, then there is no setup/hold violation )

Reading 4K samples at 300 MHz takes 13.7 us, so this is a very short window of vulnerability.

I would expect that if you caused a clock glitch during this load interval, you might see the same issue with rom_init().

-Brian

p.s. These sorts of problems are difficult to find and troubleshoot, particularly with only fragments of code and brief written descriptions. I could be missing something, but my interpretation of what you've written is that your problem is mostly consistent with the setup/hold violation corruption problem.

The only thing that strikes me as odd is your earlier statement that most of the locations were corrupted in your original ROM setup; when I've seen this problem under different circumstances, it has exhibited as sporadic bit corruption.

 

ojay77
Observer
1,466 Views
Registered: ‎04-25-2018

That address is the 'write address' which I use to load up the RAM only once. The higher level design uses another periodically incrementing 'address' signal to 'read' from the RAM once it's 'loaded'. This is NOT the same address signal I use to read. That is just the address signal I use to load up the RAM.

brimdavis
Scholar
1,459 Views
Registered: ‎04-26-2012

@ojay77   "That address is the 'write address' which I use to load up the RAM only once."

Yes, that was my whole point!

Your rom_init() BRAM read address is driven by the signal addra_wr

    rom_block(conv_integer(addra_wr)) <= rom_init(conv_integer(addra_wr));

Because addra_wr is only active for 13.7 us after reset, the probability of a clock glitch resulting in rom_init() corruption happening in that time interval is small.

In contrast, your rom_block() read is addressed by a constantly changing value:

     douta <= rom_block(conv_integer(addra_reg));

This constantly changing read address, coupled with a clock glitch, is likely to cause corruption as seen in your original coregen ROM; but your latest design now reloads rom_block() after every reset.

-Brian

 

ojay77
Observer
1,432 Views
Registered: ‎04-25-2018

@brimdavis CORRECTION:

 

What if I use my 'PLL RESET' as the read enable for the rom?

Something like this?

process(clk) begin
  if rising_edge(clk) then
    if reset = '0' then  -- PLL reset inactive
      if ena = '1' then
        dout <= rom(addr);
      end if;
    end if;
  end if;
end process;

ojay77
Observer
1,419 Views
Registered: ‎04-25-2018




In response to your comment on how to ensure RDEN is somehow deasserted BEFORE the clock glitch occurs (i.e., before the PLL is reset), wouldn't driving the RDEN of the RAM off the main PLL RESET effectively achieve that? For the PLL output to glitch the PLL has to be reset (by virtue of losing the lock signal from the external reference PLL or by simply being reset by the user) and so if I drive the ROM rden off that reset then that would ensure the RAM is not read while the clock is glitching... I wrote this code to achieve that (thinking that would fix it):

 

process(clka)
begin
  if (rising_edge(clka)) then
    if rom_reset = '0' then  ---- this is the asynchronous PLL reset
      if rst_n = '1' then  ----- this is the logic reset driven by the PLL lock signal
        if ena = '1' then -- and wr_complete = '1' then
          addra_reg <= addra;
          douta <= rom_init(conv_integer(addra_reg));--rom_block(conv_integer(addra_reg));
        end if;
      end if;
    end if;
  end if;
end process;

 

However, the above code still does not fix the problem....

brimdavis
Scholar
1,405 Views
Registered: ‎04-26-2012

@ojay77   "However, the above code still does not fix the problem"

In the following line of code, does your phrase "PLL reset" mean 1) the external PLL reset, or 2) the internal MMCM/PLL reset ???

if rom_reset = '0' then ---- this is the asynchronous PLL reset

Did you look at the post implementation schematic to see what logic actually got synthesized from that code to drive the BRAM enable pin?


Without knowing the exact, detailed connectivity/design of your PLL, MMCM, reset generation, address generation, etc., at the current iteration, it is difficult to offer advice that amounts to anything beyond "reading the tea leaves".

That said, here are some more notes/comments on the care and feeding of MMCMs with an unreliable external clock source:

Upthread, you initially indicated that your design was clocked internally by a 300 MHz "MMCM/PLL" output and reset by the external PLL's LOCK DETECT:

"my DSP logic operates off of a 300 MHz clock generated by a MMCM/PLL which relies on a 150 MHz reference that is provided externally through a physical PLL. The external PLL also provides a LOCK DETECT signal which I'm using to hold my logic in reset to ensure it runs only when the reference clock is stable."

But later on you described your reset generation as using the MMCM/PLL LOCKED signal:

"When the external PLL is reset the reference clock to the FPGA stops which in turn pulls the LOCKED signal on the FPGA PLL low which drives the reset of all the FPGA logic. So when the PLL clock is down, the FPGA logic is held in reset until the external clock is up again and the FPGA PLL is LOCKED."

If by the phrase "LOCKED signal on the FPGA PLL low" in the quote above you mean the MMCM/PLL output LOCKED signal, it is important to note that the MMCM output clock can glitch or stop ***BEFORE*** LOCKED deasserts, as described in UG472 [1].

Therefore any attempt to use the MMCM/PLL LOCKED output to somehow drive the BRAM enable signal and protect the BRAM contents is doomed to failure for an unexpected LOCKED deassertion.


It is also tricky to derive a synchronous reset signal when the MMCM/PLL clock output clocking said synchronous circuit is unreliable, and possibly no longer running.

Typically there would be a watchdog circuit, running on an independent always-present clock, that monitors the MMCM/PLL LOCKED and CLOCK_STOPPED signals; when an MMCM failure is seen by this watchdog, it would then use an async reset input on the flip-flop driving the fpga logic rst_n net, so that it can reliably assert rst_n in the absence of the MMCM clock; the rst_n net would then be clocked high synchronously once the MMCM/PLL output clock has returned to operation.  [ plus needing the clock switchover described earlier to implement a safe reinitialization of the main ROM ]
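
The reset flip-flop arrangement described here (asynchronous assertion, synchronous release) is commonly built as a small shift register. The sketch below assumes a hypothetical `clk_fault` input that the watchdog, running on its own always-present clock, drives high without glitches while the MMCM clock is bad. The watchdog itself and the clock switchover are not shown, and the entity and signal names are illustrative only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reset generator for the MMCM clock domain.
entity rst_n_gen is
  port (
    clk_mmcm  : in  std_logic;  -- MMCM output clock (may be stopped)
    clk_fault : in  std_logic;  -- '1' while the watchdog reports a clock fault
    rst_n     : out std_logic   -- active-low reset for the clk_mmcm domain
  );
end entity rst_n_gen;

architecture rtl of rst_n_gen is
  signal sync : std_logic_vector(1 downto 0) := "00";
begin
  process (clk_mmcm, clk_fault)
  begin
    if clk_fault = '1' then
      -- Asynchronous assertion: takes effect even if clk_mmcm has stopped.
      sync <= "00";
    elsif rising_edge(clk_mmcm) then
      -- Synchronous release: rst_n rises two clock edges after the fault clears.
      sync <= sync(0) & '1';
    end if;
  end process;

  rst_n <= sync(1);
end architecture rtl;
```

The downstream logic can keep its synchronous resets; only these two flip-flops use the asynchronous clear.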


> For the PLL output to glitch the PLL has to be reset (by virtue of losing the lock signal from the external reference PLL or by simply being reset by the user)

The "external PLL" circuits I've encountered over the years often mangle the clock well before deasserting their LOCK DETECT

-Brian

[1] UG472 description of LOCKED deassertion timing:

 

mmcm_locked.png

 

 
