nadaumtimuj
Adventurer
Registered: ‎01-29-2021

Implement tanh using 10-bit to 32-bit LUT mapping

What is the best way to implement the function (1 + tanh x)/2, where x runs from -4 to 3.9922 in increments of 0.0078? I tried a 10-bit to 32-bit LUT mapping, but synthesis takes an unusually long time, probably because it has to map 2^10 lines. Is there a better way to do this?

I have attached my code. Thanks.
13 Replies
richardhead
Scholar
Registered: ‎08-01-2012

The number of lines is not the problem. Synthesis takes so long because your lookup table is being built out of FPGA LUTs. It should fit nicely into a BRAM, but because you have no clock the tools cannot infer one, and instead have to use tens of thousands of LUTs.

I suggest you use a clock.

calibra
Scholar
Registered: ‎06-20-2012

Use a constant array and fill it with a function.

== If this was helpful, please feel free to give Kudos, and close if it answers your question ==
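As a rough illustration of "fill it with a function", the table contents can be computed rather than typed by hand. The sketch below is Python, not HDL, and rests on assumptions the thread does not confirm: address 0 maps to x = -4, the step is exactly 1/128 (0.0078125), and the 32-bit output is an unsigned fraction scaled by 2^32 and saturated at the top code. The names `table_entry`, `build_table` and `to_hex_lines` are made up for this example.

```python
import math

ADDR_BITS = 10          # 2**10 = 1024 table entries
DATA_BITS = 32          # assumed unsigned 32-bit fractional output
X_MIN = -4.0            # assumed: address 0 corresponds to x = -4
STEP = 1.0 / 128        # 0.0078125, quoted as 0.0078 in the question


def table_entry(addr):
    """Return the 32-bit code for (1 + tanh x)/2 at the given address."""
    x = X_MIN + addr * STEP
    y = (1.0 + math.tanh(x)) / 2.0          # always strictly between 0 and 1
    code = int(round(y * 2**DATA_BITS))
    return min(code, 2**DATA_BITS - 1)      # saturate at the largest code


def build_table():
    """Compute all 1024 entries in address order."""
    return [table_entry(addr) for addr in range(2**ADDR_BITS)]


def to_hex_lines(table):
    """Format each entry as 8 hex digits, one entry per line."""
    return [format(value, "08x") for value in table]


if __name__ == "__main__":
    for line in to_hex_lines(build_table()):
        print(line)
```

The hex lines could be pasted into a constant array or loaded with something like `$readmemh`. Whether the result maps to LUTs or BRAM still depends on how the array is read, as discussed in the replies.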
richardhead
Scholar
Registered: ‎08-01-2012

@calibra That would make no difference here. The problem is the lack of a clock to make the ROM synchronous; without one, it cannot be placed in a BRAM.

calibra
Scholar
Registered: ‎06-20-2012

@richardhead 

Sure, the design can only be implemented with LUTs, but @nadaumtimuj says "it is taking unusually long time in synthesis". Maybe he cannot use BRAM.

nadaumtimuj
Adventurer
Registered: ‎01-29-2021

@calibra  Are you referring to something similar to this? https://bradpierce.wordpress.com/2010/03/13/systemverilog-constant-arrays-roms/

Can I use the $tanh(x) function directly and get the result as a 32-bit value? Thanks.

nadaumtimuj
Adventurer
Registered: ‎01-29-2021

@richardhead Thank you, I understand, but this part needs to be asynchronous in my project, so unfortunately I cannot use a clock.

richardhead
Scholar
Registered: ‎08-01-2012

@nadaumtimuj Then, whatever method you choose, you will end up with long synthesis times, because you are not actually building a LUT but a huge logic decoder. That holds whether you use the hand-written LUT or a logic function, as they are essentially the same thing.

The only way to speed up synthesis will be to use a clock. FPGAs are designed to be synchronous. Huge async LUTs will take a lot of logic and always take a long time to synthesise.

Why can you not use a clock?

richardhead
Scholar
Registered: ‎08-01-2012

@calibra 

Maybe, but a function and a hand-written LUT are essentially the same thing. Without a clock, the design will never use a BRAM, which is what would speed up synthesis.

calibra
Scholar
Registered: ‎06-20-2012

@nadaumtimuj

Please find attached the code in VHDL.

The size is only 466 LUTs.

nadaumtimuj
Adventurer
Registered: ‎01-29-2021

Hi @calibra, sorry, I didn't get a notification for your reply and only saw it today. I am really interested in trying your code, but I am not familiar with VHDL and my whole system is SV. Is there a tool that can convert it to SystemVerilog?

drjohnsmith
Teacher
Registered: ‎07-09-2009

Two thoughts.

First, SV being a mongrel language and the Xilinx tools being multi-language, you should be able to just use the VHDL code.

Second, if you're concerned about space, most of the range of the tanh is, to a fair resolution, a straight line, which can be calculated on the fly.
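To get a feel for how far straight-line segments can go, here is a minimal Python sketch of a piecewise-linear approximation of (1 + tanh x)/2. The choice of 16 equal segments over [-4, 4] and the names `make_breakpoints`, `pwl` and `max_error` are assumptions made for this illustration; nothing in the thread specifies them.

```python
import math


def f(x):
    """Reference function from the question: (1 + tanh x)/2."""
    return (1.0 + math.tanh(x)) / 2.0


def make_breakpoints(lo=-4.0, hi=4.0, segments=16):
    """Return equally spaced breakpoints and the exact function values there."""
    step = (hi - lo) / segments
    xs = [lo + i * step for i in range(segments + 1)]
    ys = [f(x) for x in xs]
    return xs, ys


def pwl(x, xs, ys):
    """Linear interpolation between breakpoints, held constant outside them."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    step = xs[1] - xs[0]
    i = min(int((x - xs[0]) / step), len(xs) - 2)   # segment index
    t = (x - xs[i]) / step                          # position within segment
    return ys[i] + t * (ys[i + 1] - ys[i])


def max_error(xs, ys):
    """Worst absolute error over the 1024 input points from the question."""
    inputs = [-4.0 + k / 128 for k in range(1024)]
    return max(abs(pwl(x, xs, ys) - f(x)) for x in inputs)


if __name__ == "__main__":
    xs, ys = make_breakpoints()
    print("max abs error with 16 segments:", max_error(xs, ys))
```

With segments this coarse the worst-case error is on the order of 0.01, far from 32-bit accuracy. The error shrinks roughly with the square of the segment width, so this is a trade between table size and precision rather than a drop-in replacement for the full table.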
nadaumtimuj
Adventurer
Registered: ‎01-29-2021

@calibra @drjohnsmith Yes, I just tried using the VHDL directly in my project, but with a slightly different range: x = -64 to 63.875 in steps of 0.125. I got the following synthesis error for line 60 (please see the attached code):

[Synth 8-3512] assigned value '-2147483648' out of range 


I also want to add that I tried @richardhead's idea of adding a clock. It saves some LUTs and uses BRAM, but as I said, my system updates asynchronously, so I lose some updates if I use a slow clock. In reality I would need a very fast clock to catch all the updates, and that could eventually fail to close timing with other paths. It is still a useful idea.
richardhead
Scholar
Registered: ‎08-01-2012

The issue is likely here:

x := x * 2.0**31 ; -- integer limited to 2**31

One quirk of most VHDL implementations is that integer doesn't quite cover the full 32-bit range. It actually covers -2**31+1 to 2**31-1. So here, if x = -1, the scaled value is -2**31, which underflows the integer type.

Quote from VHDL LRM:

"An implementation may restrict the bounds of the range constraint of integer types other than type
universal_integer. However, an implementation shall allow the declaration of any integer type whose range
is wholly contained within the bounds –2147483647 and +2147483647 inclusive."
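The arithmetic behind that error can be checked with a small Python model. This is only an illustration of the range limit quoted above, not the attached VHDL: the clamping workaround and the names `fits_vhdl_integer`, `scale_unclamped` and `scale_clamped` are assumptions introduced here.

```python
# Range that the LRM quote guarantees for VHDL integer types.
VHDL_INT_MIN = -(2**31 - 1)   # -2147483647
VHDL_INT_MAX = 2**31 - 1      # +2147483647


def fits_vhdl_integer(value):
    """True if value lies inside the guaranteed VHDL integer range."""
    return VHDL_INT_MIN <= value <= VHDL_INT_MAX


def scale_unclamped(x):
    """Model of 'x * 2.0**31' converted to an integer, with no range check."""
    return int(x * 2.0**31)


def scale_clamped(x):
    """Hypothetical workaround: saturate the scaled value to the legal range."""
    return max(VHDL_INT_MIN, min(VHDL_INT_MAX, scale_unclamped(x)))


if __name__ == "__main__":
    raw = scale_unclamped(-1.0)
    print(raw, fits_vhdl_integer(raw))            # the value from the error message
    fixed = scale_clamped(-1.0)
    print(fixed, fits_vhdl_integer(fixed))
```

Scaling -1.0 by 2**31 gives exactly the -2147483648 reported by Synth 8-3512, one below the guaranteed minimum. Saturating as above, or scaling by 2**31 - 1 instead, keeps the value in range at the cost of a one-LSB change at the extreme.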