mustafa.homsi
Observer
866 Views
Registered: ‎01-24-2019

Size-Optimized HDL Tricks/Guidelines

Hello,

Is there a book, an app note, or any other reference that covers guidelines or tricks for writing HDL code that synthesizes to a small design? I am relatively new to HDL and have recently written VHDL code for a specialized PWM timer (a counter, output captures, input captures, etc.). The design works as intended, but it takes a relatively large amount of space: ~3.5K LUTs on a Spartan-7, which is bigger than a MicroBlaze or Cortex-M1. Again, I am not an expert, but I wouldn't think it should take that much, and I am wondering if I should revisit the HDL code after reviewing any high-level guidelines. Thanks!

5 Replies
831 Views
Registered: ‎01-22-2015

@mustafa.homsi 

I can’t recommend a book to you.

However, you have already taken a big step towards resource-efficient coding by using VHDL instead of a higher-level approach such as HLS.

For the tasks you describe, 3.5K LUTs seems like a lot. However, a midsize Spartan-7 has about 32K LUTs. So, you've got “lots of LUTs left” (say that three times fast).

The general advice you’ll probably get is to strive for clarity and readability rather than resource-efficiency in your VHDL coding.

Sometimes our best efforts to write resource-efficient code are foiled by synthesis, which tends to have a mind of its own.  However, we can guide synthesis by selecting from different strategies.  As described in UG901 (near Table 1-2) these strategies are usually a tradeoff between “area/resources optimized” and “performance optimized”. 

Also, if you are using Xilinx IPs, then during setup of the IP you can sometimes choose between “area/resources optimized” and “performance optimized”. 

However, if you are running out of LUTs, then using the FPGA architecture differently may help. For example, math can use a lot of LUTs, but it can sometimes be done in the DSP48 slices of the Spartan-7. Likewise, storing data can use lots of LUTs (LUTRAM), but data can also be stored in block RAM (BRAM).

Cheers,
Mark

mustafa.homsi
Observer
810 Views
Registered: ‎01-24-2019

Thanks Mark. The part I am using has about 25K, which I thought would be plenty, but I need four of these relatively big timers, which consume ~60%, and I can barely use the rest to fit a MicroBlaze plus other typical peripherals using standard Xilinx IPs (e.g. UART, I2C, Quad SPI, XADC, etc.).

I really think that something is making these timer IPs use a lot of space. I did try to replace some LUT logic with DSP, since I have more of that on the device. Now, regarding the LUTRAM: I am interfacing each timer IP to the MicroBlaze using 64 32-bit registers. Does this blow up the size? Should I use BRAM instead? By the way, the interface is AXI-Lite. Thanks!

u4223374
Advisor
797 Views
Registered: ‎04-26-2015

@mustafa.homsi That does seem pretty large for a timer. After all, each part of that should only be a couple of LUTs - it's hard to see where 3500 is coming from, unless you've either got an incredibly long timer (e.g. 2048-bit) or it's doing floating-point.

Could you possibly share the code?

Edit: two other things that might be doing it:

- Division. You're not generating a slower timer by performing integer division on a faster timer, are you? Because that really doesn't work well.

- Interfacing. How are these connected? I don't know about the standard AXI Masters, but in HLS an AXI Master is a substantial piece of hardware.
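
To illustrate the division point: rather than dividing a fast counter to get a slow one, the usual approach is a small prescaler that counts clock cycles and lets the slow counter advance once every N cycles, so the hardware needs only a compare and an increment. The sketch below is a behavioural model in Python, not HDL, and the function names and the `ratio` parameter are made up for illustration. It only shows that the two schemes produce the same slow count.

```python
def slow_count_by_division(cycles, ratio):
    """Slow timer obtained by dividing a fast free-running counter.

    In hardware this implies a divider, which tends to be expensive.
    """
    return [fast // ratio for fast in range(cycles)]


def slow_count_by_prescaler(cycles, ratio):
    """Slow timer advanced by an enable tick from a small prescaler counter."""
    prescaler = 0
    slow = 0
    history = []
    for _ in range(cycles):
        history.append(slow)          # value visible during this clock cycle
        if prescaler == ratio - 1:    # terminal count: compare and reset only
            prescaler = 0
            slow += 1                 # slow counter advances once per `ratio` cycles
        else:
            prescaler += 1
    return history


# Both schemes give the same slow count for every cycle.
same = slow_count_by_division(20, 3) == slow_count_by_prescaler(20, 3)
```

In VHDL the prescaler version maps to two small counters and a clock enable, whereas a general divider is likely to cost far more logic, especially if the ratio is not a constant power of two.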

745 Views
Registered: ‎01-22-2015

@mustafa.homsi 

I agree with u4223374's comments: AXI can use up resources. In fact, I2C, SPI, and UART can all be done in pure VHDL. These pure-VHDL solutions will typically be more resource-efficient than using Xilinx IP, which wants you to use AXI.

As u4223374 suggests, can you show us your code or describe in more detail how you are doing things?

Mark

dgisselq
Scholar
729 Views
Registered: ‎05-21-2015

@mustafa.homsi,

I wrote this article some time ago after placing a multitasking "operating system" onto a Spartan 6/LX4 device (CMod S6 from Digilent).  It contains a lot of the lessons I learned in the process.  Perhaps it might help.

As others have suggested, one of the first things to learn is how to estimate the area of various components. This will help you narrow down where your LUT resources are going. Try synthesizing portions of your design separately to see what's happening.
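
As a rough example of such an estimate, consider the 64 x 32-bit register bank mentioned earlier in the thread. The Python sketch below assumes every register is built from flip-flops and that the read-back path is a plain tree of 4:1 multiplexers, each fitting in one 6-input LUT. These are assumptions for illustration, not something the original code is known to do, and the helper names are hypothetical. Synthesis can do better (for example by using the dedicated F7/F8 muxes in the slice, or by trimming unused bits), so treat the result as a ballpark to compare against the utilization report.

```python
import math


def mux_luts_per_bit(n_inputs, lut_mux_width=4):
    """LUTs for one output bit of an n:1 mux built as a tree of 4:1 muxes.

    Assumption: one 6-input LUT implements one 4:1 mux (4 data + 2 select).
    """
    luts = 0
    remaining = n_inputs
    while remaining > 1:
        stage = math.ceil(remaining / lut_mux_width)  # muxes in this tree level
        luts += stage
        remaining = stage                             # outputs feed the next level
    return luts


def register_bank_estimate(n_regs, width):
    """Ballpark flip-flop and read-mux LUT counts for a register bank."""
    return {
        "flip_flops": n_regs * width,                        # one FF per stored bit
        "read_mux_luts": width * mux_luts_per_bit(n_regs),   # one mux tree per bit
    }


estimate = register_bank_estimate(64, 32)
```

Under these assumptions, 64 registers of 32 bits come to 2048 flip-flops and about 672 LUTs for the read mux alone, before any write decoding or timer logic is counted.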

Dan
