05-04-2014 03:25 PM
Hi! I'm getting the same error, Xst 1312, because of a while() loop, and I have only one loop in my code. As far as I understand, it is caused by an infinite loop, but I have a legitimate stop condition inside that loop that ends it when the address reaches the 55th location (always fewer than 64 iterations), and it still gives the error. The code and the error are attached. Please help!
05-04-2014 06:53 PM
05-04-2014 06:58 PM
Synthesis really doesn't like to deal with variable length loops. My suggestion would be to change the loop to a for loop like:
for (i = 0; i < 55; i = i + 1)
    if (i > add_512_block) // pointing to area to be padded
        block_512[i] = 8'd0;
The above loop always iterates 55 times, but it only acts when i is greater than add_512_block.
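For context, here is a minimal sketch of how the guarded loop above might sit inside a complete combinational module. The module name, the ports, and the flat 512-bit bus (used instead of the block_512 array) are assumptions made for illustration; they are not taken from the attached code.

```verilog
// Sketch only: module name, ports, and the flat 512-bit bus are assumptions.
module pad_zero_comb (
    input  wire [511:0] block_in,       // 64 bytes, byte i at bits [8*i +: 8]
    input  wire [5:0]   add_512_block,  // index of the last byte to keep
    output reg  [511:0] block_out
);
    integer i;

    always @* begin
        block_out = block_in;           // default: pass every byte through
        // Constant bound (55), so synthesis can unroll the loop completely.
        for (i = 0; i < 55; i = i + 1)
            // Only iterations past the data-dependent index change anything.
            if (i > add_512_block)
                block_out[8*i +: 8] = 8'd0;
    end
endmodule
```

Every one of the 55 iterations becomes its own comparator and byte-wide multiplexer, which is why this style grows quickly as the bound increases.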
05-04-2014 09:07 PM
No to both answers, sorry! I want to use a while loop. I can't use a for loop for a specific number of iterations; it has to run to the max limit and needs a new register, because I can't change the memory address with a for loop. Can anyone please just look at my code and tell me where I'm going wrong with the while loop?
05-04-2014 10:30 PM
05-04-2014 11:00 PM
05-04-2014 11:06 PM
05-05-2014 01:22 AM
Here is the code for the padder module of SHA256. It's a little rough, but there are no syntax errors. I have tried replacing the while() loop with either a for() loop or a simple if() else (although my logic doesn't really fit these), and then the error XST 1312 goes away. Sorry for not attaching the code file before.
05-05-2014 06:38 AM
@ahmad32 wrote:
No to both answers, sorry! I want to use a while loop. I can't use a for loop for a specific number of iterations; it has to run to the max limit and needs a new register, because I can't change the memory address with a for loop. Can anyone please just look at my code and tell me where I'm going wrong with the while loop?
Maybe the bold word in my first post didn't sink in. For synthesis, loops need to have a constant number of iterations. Period. I understand that for simulation, you can do anything you want, including "while" loops that iterate forever. However you need to think about the logic that this loop will create. Synthesis by definition needs to know what to generate at the time it compiles the code. It is not an "interpreter" that can run constantly looking at the current value of your variables.
The fact that results were not very good when you got the code to synthesize is not very surprising. There are a lot of logic levels required to synthesize that code, and the "memory" structure "block_512" will need to be implemented with fabric flip-flops rather than any type of distributed or block RAM. You might think about re-writing the code so that the zero-padding is done over multiple clock cycles.
05-05-2014 09:07 AM
So what you are saying is that a while() loop is not synthesizable unless it has a constant number of iterations, which makes it equivalent to a for() loop. Bottom line: don't use a while() loop, just use a for() loop.
05-05-2014 11:54 AM
@ahmad32 wrote:
So what you are saying is that a while() loop is not synthesizable unless it has a constant number of iterations, which makes it equivalent to a for() loop. Bottom line: don't use a while() loop, just use a for() loop.
That's half of the statement I made. The other half is that using loops, you can easily describe a very large amount of logic. Loops for synthesis always run in zero time. So everything in all iterations generates a lot of combinatorial logic. In your case, you also assign multiple locations in an array within the loop. This means that you need to be able to write multiple locations in the same clock cycle, and if that multiple is more than two you can't use memory for storage.
In simulation, it's possible to use loops that run over a number of clock cycles or other time delays by adding events within the loop. Delay-based waits are ignored for synthesis, and other events (wait until value or edge) are not synthesizable when inside a loop. If you have time available to run the loop one clock cycle per iteration, then you need to re-write the code to use a signal for the loop count and some extra state logic to know when to start and end the loop in time.
If you really need the code you posted to run in a single clock cycle, then the for loop is probably the best you can do, remembering that a for loop needs to have a constant number of iterations. You work around that by using if clauses within the for loop to determine which iterations of the loop actually affect the output signals. But as you can see, it also means that the constant number of iterations needs to be at least as large as the maximum number of iterations you needed for simulation. That can lead to a very large amount of combinatorial logic and a lot of logic levels, making the best achievable clock speed significantly slower than a pipelined approach.
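To illustrate the multi-cycle alternative described above (a signal for the loop count plus a little state logic to start and end the loop), here is a minimal sketch. The module name, the start/busy/done handshake, and the extra write and read ports are hypothetical and only there to make the example self-contained; block_512 and add_512_block follow the names used earlier in the thread.

```verilog
// Sketch only: module name, ports, and handshake are hypothetical.
module pad_zero_seq (
    input  wire       clk,
    input  wire       rst,            // synchronous, active high
    input  wire       start,          // request to begin padding
    input  wire [5:0] add_512_block,  // last byte index that holds data
    input  wire       wr_en,          // normal data write (when idle)
    input  wire [5:0] wr_addr,
    input  wire [7:0] wr_data,
    input  wire [5:0] rd_addr,
    output reg  [7:0] rd_data,
    output reg        busy,
    output reg        done            // one-cycle pulse when padding ends
);
    reg [7:0] block_512 [0:63];
    reg [5:0] addr;                   // takes the place of the loop variable

    always @(posedge clk) begin
        done    <= 1'b0;
        rd_data <= block_512[rd_addr];
        if (rst) begin
            busy <= 1'b0;
        end else if (busy) begin
            block_512[addr] <= 8'd0;          // one location per clock
            if (addr == 6'd54) begin          // same end point as i < 55
                busy <= 1'b0;
                done <= 1'b1;
            end else begin
                addr <= addr + 6'd1;
            end
        end else if (start) begin
            if (add_512_block < 6'd54) begin
                addr <= add_512_block + 6'd1; // first location to clear
                busy <= 1'b1;
            end else begin
                done <= 1'b1;                 // nothing to pad
            end
        end else if (wr_en) begin
            block_512[wr_addr] <= wr_data;
        end
    end
endmodule
```

Because at most one location is written per clock, the array no longer needs many simultaneous write ports, so it becomes a candidate for distributed or block RAM rather than fabric flip-flops. The cost is latency: clearing N locations takes N clock cycles.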
05-06-2014 11:41 AM
You were right. I tried constant iterations with repeat(54), as you mentioned earlier. It used a lot of hardware logic, and synthesis took all 4 GB of my PC's RAM and ran for more than 20 minutes.
Can you please tell me a way around such loops, so that I can address multiple locations without wasting that much hardware? And sorry, I didn't get the following.
@gszakacs wrote:
If you have time available to run the loop one clock cycle per iteration, then you need to re-write the code to use a signal for the loop count and some extra state logic to know when to start and end the loop in time.
05-06-2014 11:45 AM
@ahmad32 wrote:
Can you please tell me a way around such loops, so that I can address multiple locations without wasting that much hardware? And sorry, I didn't get the following.
Do you need simultaneous access to these multiple locations, or can you access them one at a time?
05-06-2014 11:50 AM
No, totally! I can use multiple cycles (though fewer would be better, as always).
05-06-2014 05:32 PM
There are at least two options.
Option 1.
Code up the module as a pipeline (also google "systolic array"). A pipeline allows you to accept a new incoming sample on every clock. It's a large design (but not as large as your completely unrolled loop). I haven't looked at your algorithm in detail, but I do notice a 'for' loop iterating from 0 to 54. Perhaps each iteration can map to one stage of pipeline.
Option 2.
Code up the module as a state machine. Here, your state machine will keep track of where you are in the calculation. In this case, you're sharing FPGA resources across states, so you can only accept a new input sample once every N clock cycles (55 in your case). Option 2 is probably what a software person is "seeing in his head" as he codes the algorithm. It looks more like a software solution. But you have to get your hands a little dirtier and explicitly deal with state.
These two options are kind of the extreme endpoints. Your chosen solution very likely will be somewhere in between them.
Regards,
Mark
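As a rough illustration of Option 1, the sketch below maps each loop iteration to one registered pipeline stage, so a new block can enter on every clock and comes out STAGES clocks later. Everything here is an assumption made for illustration: the module name, the small STAGES value, the flat bus, and the last_data port (which plays the role of add_512_block). It is not a worked-out design for the posted padder.

```verilog
// Sketch only: names, the small STAGES value, and the flat bus are assumptions.
module pad_zero_pipe #(
    parameter STAGES = 4                    // one stage per loop iteration
) (
    input  wire                clk,
    input  wire [8*STAGES-1:0] block_in,    // byte s at bits [8*s +: 8]
    input  wire [7:0]          last_data,   // last byte index to keep
    output wire [8*STAGES-1:0] block_out
);
    reg [8*STAGES-1:0] data [0:STAGES-1];   // block as it leaves each stage
    reg [7:0]          last [0:STAGES-1];   // index travels with its block
    reg [8*STAGES-1:0] next;                // temporary, not a register
    integer s;

    always @(posedge clk) begin
        // Stage 0 captures the new input (byte 0 is never padded).
        data[0] <= block_in;
        last[0] <= last_data;
        // Stage s performs the work of loop iteration i = s.
        for (s = 1; s < STAGES; s = s + 1) begin
            next = data[s-1];
            if (s > last[s-1])
                next[8*s +: 8] = 8'd0;
            data[s] <= next;
            last[s] <= last[s-1];
        end
    end

    assign block_out = data[STAGES-1];
endmodule
```

Each stage holds a full copy of the block, so the register count grows with STAGES times the block width. That is the area cost of accepting a new input every clock, and it is the trade-off against the state-machine approach of Option 2.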
05-06-2014 10:11 PM
Thanks for your help. I'm already trying states using a case(address) statement, but I'll look into systolic arrays too.