09-30-2013 07:22 AM
I'm seeing big mismatches in post-translate simulation as soon as I add an IP from Core Generator to my design.
Inside the module I'm working on there are several cascaded blocks; the core I'm trying to instantiate is a CORDIC sqrt, which sits at the very end of the chain. The core is pretty simple: it has one clock input, one data input and one data output, which is connected directly to the module outputs. As strange as it may sound, as soon as I instantiate it, behavioural simulation still works fine, but in post-translate simulation (which works just fine without the IP) NOTHING works anymore. The behaviour of the whole circuit is completely wrong, to the point that I think there must be something wrong with the synthesis/translation process.
Does anyone have a plausible explanation for this?
09-30-2013 09:05 AM
What exactly stops working after you add the CORDIC?
Is it only the CORDIC output, or also the modules that do not depend on it?
09-30-2013 10:27 AM
The problems are before the CORDIC. The module has two counters that generate addresses to read from RAM. Those addresses do not depend on the output of the CORDIC; they're generated at the beginning of the chain and controlled by two inputs, a reset and a range value. When I add the CORDIC those addresses are no longer generated correctly: one is always zero and the other goes back and forth between two values. The rest of the module uses the data retrieved with those addresses so, needless to say, the results are not correct.
BTW, I just tried it on a board: I instantiated the CORDIC, connected its outputs to the LEDs, and it works....
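To pin down the first cycle where the post-translate addresses go wrong, a small software model of the two counters can be compared against the values dumped from the simulator. This is only a sketch in Python: the nesting (inner counter sweeps 0..range-1, outer counter steps when the inner one wraps), the one-cycle reset and the names `expected_addresses` and `first_mismatch` are all assumptions for illustration, not the actual RTL.

```python
def expected_addresses(range_value, n_cycles, reset_cycles=1):
    """Return the (outer, inner) address pair expected on each clock cycle.

    Assumed behaviour (hypothetical, not taken from the real design):
    both counters are held at zero during reset, then the inner counter
    counts 0..range_value-1 and the outer counter increments on each wrap.
    """
    outer = inner = 0
    trace = []
    for cycle in range(n_cycles):
        if cycle < reset_cycles:
            outer = inner = 0
        elif inner == range_value - 1:
            inner = 0
            outer = (outer + 1) % range_value
        else:
            inner += 1
        trace.append((outer, inner))
    return trace


def first_mismatch(expected, observed):
    """Return the index of the first differing cycle, or None if all match."""
    for i, (exp, obs) in enumerate(zip(expected, observed)):
        if exp != obs:
            return i
    return None
```

Feeding `first_mismatch` the address pairs exported from the behavioural and post-translate runs would show whether the two diverge right after reset or only later.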
09-30-2013 02:15 PM
I just tried post-map simulation and amazingly enough, it behaves like it should, all is good and circuit works as expected. At this point I really think this is a bug...
09-30-2013 06:16 PM
09-30-2013 06:28 PM
Good that the design works on board.
I believe your timing simulation flow might have some issues.
What is your BRAM operating frequency, and what else is there in your design?
As Muzaffer asked, are your constraints reliable?
Can you upload your design?
09-30-2013 06:55 PM
10-01-2013 01:04 AM
The only constraint I have at the moment is the one for the clock period, set at 10 ns, which is met since the design can run at up to 250 MHz (after PAR). Inputs (for the test I made) are internal (BRAM) and outputs are LEDs.
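For reference, the margin implied by those two figures can be worked out in a few lines of Python. This is just the arithmetic on the numbers quoted above; the variable names are made up for the example.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


constraint_ns = 10.0                 # period constraint (100 MHz)
achieved_ns = period_ns(250.0)       # period at the reported post-PAR maximum
slack_ns = constraint_ns - achieved_ns

print(f"achieved period: {achieved_ns} ns, slack: {slack_ns} ns")
```

So the 10 ns constraint is met with several nanoseconds to spare.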
Besides, does post-translate simulation need an SDF file? I thought post-translate was not a timing simulation...
10-01-2013 01:26 AM - edited 10-01-2013 01:26 AM
To be frank, I don't know what the post-translate (as opposed to post-map?) state is, and I am not sure I care. The only time non-RTL simulation is needed is post-route, which is a full timing simulation. If you pass timing (STA) and your design works in silicon, you're OK. If you pass STA but your board doesn't work, you do post-route simulation. I am not sure why anything else is necessary or relevant.
10-01-2013 01:58 AM
This module is an autocorrelation/pitch detection block that will be used for a speech compression algorithm.
-It starts by reading data from memory and calculating dot products.
-The dot products sequence is then 'normalized' and this normalized output is fed to the pitch detection logic.
-The pitch detection logic produces its output after having analyzed all the normalized dot products.
-The logic that uses the CORDIC IP does gain extraction and takes as its input the dot product sequence (non-normalized).
so there are two data paths:
data from bram->dot product->normalization->pitch detection
data from bram->dot product->gain extraction
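A software reference model of the two paths makes it easier to check simulation output stage by stage. The following is a minimal Python sketch, not the actual design: the normalization (dividing by the zero-lag dot product), the pitch detector (lag with the largest normalized value) and the gain (integer square root of the zero-lag dot product, standing in for the CORDIC sqrt) are all assumptions, and every function name is hypothetical.

```python
import math


def dot_products(samples, max_lag):
    """Dot products r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    n = len(samples)
    return [
        sum(samples[i] * samples[i + k] for i in range(n - k))
        for k in range(max_lag + 1)
    ]


def normalize(r):
    """Assumed normalization: divide every dot product by the zero-lag one."""
    if r[0] == 0:
        return [0.0] * len(r)
    return [value / r[0] for value in r]


def detect_pitch(normalized, min_lag):
    """Assumed detector: the lag >= min_lag with the largest normalized value."""
    return max(range(min_lag, len(normalized)), key=lambda k: normalized[k])


def extract_gain(r):
    """Assumed gain: integer square root of the zero-lag dot product."""
    return math.isqrt(r[0])


# Path 1: bram -> dot product -> normalization -> pitch detection
# Path 2: bram -> dot product -> gain extraction
samples = [1, 0, -1, 0, 1, 0, -1, 0]   # toy integer input with period 4
r = dot_products(samples, max_lag=4)
pitch_lag = detect_pitch(normalize(r), min_lag=1)
gain = extract_gain(r)
```

Both paths share `dot_products`, so a mismatch that shows up in both outputs points at the memory read or dot product stage rather than at the logic around the CORDIC.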
My check was to connect the output of the pitch detection logic, which is just 7 bits, to the LEDs on the board, and the result was correct. Obviously, for this to work all the logic upstream must function properly, so I took this result as a good sign.
I also did the same for the output of the logic that uses the CORDIC.
I'd also like to stress that while post-translate simulation does not work, post-map simulation does work and produces the very same results as the behavioural one.
Implementing this algorithm on an FPGA is part of my thesis, so I would have no problem posting the code; just let me find a way to post a link instead of pasting it.