Re: Multiple sine and cosine lookups at high speed

leif@rdos.net

Visitor

05-04-2020 01:08 AM

804 Views

Registered:
12-29-2019

I have a 1 GHz ADC signal that enters the design at 250 MHz, with 4 samples per clock for each of 2 channels. I want to measure the power at a few frequencies in the 45 MHz band. This must run in real time, but latency is not an issue. My preferred solution is to multiply every sample by a sine and a cosine value whose phase advances according to the frequency being analyzed. However, that means I need 8 sine and 8 cosine values at 250 MHz for each frequency I want to analyze.

The DDS can only provide a single value per clock (although it might be able to give both the sine and the cosine at the same time), and does it really work at 250 MHz? If I place the table in block or distributed RAM, I cannot read out more than one value per clock cycle. A more promising solution might be to generate a sine table up to pi / 2 in Verilog code, since that should let me reference it from several blocks at the same time. However, will this work with a table of 64k x 14 entries? I'm using a Kintex-7 (KC705). Resource usage is not a big issue, since I presently use only 10% of the available LUTs and FFs.

I realize I could skip 3 out of 4 samples and use the average of the channels, but this feels like a bad solution with poorer performance. I also would need to duplicate the block RAM or DDS if I want to analyze several frequencies at the same time.

So, is there a way to make multiple references in the same clock cycle to a sine and cosine table at high speed, like at 250 MHz?
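
For reference, the measurement described in the question (multiply each sample by a sine and a cosine at the frequency of interest, accumulate, and look at the power) can be sketched in software roughly as follows. This is only a floating-point model under assumed values: the function name `tone_power`, the 8192-sample block, the test amplitude of 1000 and the 60 MHz comparison frequency are illustrative and do not come from the original post.

```python
import math


def tone_power(samples, f_target, f_sample):
    """Correlate samples with cos/sin at f_target and return I^2 + Q^2."""
    step = 2.0 * math.pi * f_target / f_sample  # phase advance per sample
    i_acc = 0.0
    q_acc = 0.0
    for n, x in enumerate(samples):
        i_acc += x * math.cos(step * n)  # in-phase accumulator
        q_acc += x * math.sin(step * n)  # quadrature accumulator
    return i_acc * i_acc + q_acc * q_acc


F_SAMPLE = 1.0e9   # 1 GHz sample rate, as in the question
BLOCK_LEN = 8192   # assumed accumulation length

# Hypothetical test input: a 45 MHz tone quantized to integers.
samples = [
    round(1000 * math.cos(2.0 * math.pi * 45.0e6 / F_SAMPLE * n))
    for n in range(BLOCK_LEN)
]

on_power = tone_power(samples, 45.0e6, F_SAMPLE)   # frequency present
off_power = tone_power(samples, 60.0e6, F_SAMPLE)  # frequency absent
print(on_power > 1000 * off_power)
```

In hardware, the two accumulators would exist once per channel and per analyzed frequency, and the `cos`/`sin` calls are exactly the lookups the question asks about.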

1 Solution

Accepted Solutions

drjohnsmith

Teacher

05-04-2020 06:27 AM

650 Views

Registered:
07-09-2009

CORDIC

https://pdfs.semanticscholar.org/adab/cb3a48d2ebb2251bf298b5589598f3eefd04.pdf

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>

14 Replies

If, as you say, "latency is not a problem", then what you probably need is proper pipelining, so that the design accepts new inputs every cycle. Will a table of 64k x 14 work? I would try it. If the values are consecutive, I think you can reach one value per clock.

archangel-lightworks

Scholar

05-04-2020 01:20 AM

793 Views

Registered:
07-23-2019

You don't mention (or I missed) your data type. Floating-point multiplies take multiple cycles, so you will need several of them in parallel to sustain one operation per cycle.

archangel-lightworks

Scholar

05-04-2020 01:22 AM

791 Views

Registered:
07-23-2019

leif@rdos.net

Visitor

05-04-2020 02:10 AM

773 Views

Registered:
12-29-2019

Latency is not a problem in the sense that it doesn't matter if the result is available 100 cycles later, but the solution must run in real time, so it must deliver 8 sine and 8 cosine values at 250 MHz.

The idea is to cache sample data in a FIFO until I have decided whether the frequency is present in the data. Having a FIFO with hundreds or thousands of values is not a problem. That way I can skip most of the uninteresting data and only stream the relevant data over PCIe, which allows for much longer sampling periods.

It's possible I could use smaller tables. 8 bits might be enough, and maybe 32k or 16k entries would work too.

Sample values are 14-bit signed integers. The power estimate will also be some type of integer. In fact, it will come down to a yes/no decision once values have been accumulated over an 8k or 16k sequence.

archangel-lightworks

Scholar

05-04-2020 02:30 AM

762 Views

Registered:
07-23-2019

While not trivial, I don't see why it would be impossible to feed and multiply integer data on a Kintex-7 at 250 MHz.

archangel-lightworks

Scholar

05-04-2020 02:36 AM

758 Views

Registered:
07-23-2019

I would do a quick try with HLS.

The DDS gives you the option of a phase output, and it can also generate simple sine/cosine lookup tables. You could have multiple sine/cosine LUTs connected to a single phase accumulator.

bruce_karaffa

Scholar

05-04-2020 02:56 AM

745 Views

Registered:
06-21-2017

calibra

Voyager

05-04-2020 03:42 AM

731 Views

Registered:
06-20-2012

I assume that the two channels are independent because they do not share the same sine / cosine.

Therefore you have two independent phase accumulators.

But the 4 samples of the same channel are multiplied by 4 consecutive sine values from the table.

Is that right?

leif@rdos.net

Visitor

05-04-2020 05:16 AM

699 Views

Registered:
12-29-2019

Yes, the channels are somewhat separate, mostly because they can have phase differences caused by different arrival times of the signals. They come from two antennas placed at different locations.

My original idea was to create a general-purpose sine and cosine table with a fixed number of values up to pi / 2. With that setup, the values for consecutive samples would not be consecutive in the sine and cosine table.

An alternative approach might be to create a more dedicated sine and cosine table that matches the searched frequency closely enough with a much shorter table, tied to the sampling frequency by some factor M / N. For instance, 16 values at a 750 MHz sampling frequency would capture 46.875 MHz, and 22 values at a 1 GHz sampling frequency would capture 45.45 MHz.
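
The two figures above follow from dividing the sampling frequency by the table length. A small sketch of that arithmetic is shown below; the helper name `captured_frequency_mhz` and its `cycles` parameter (standing in for the M of an M / N ratio) are made up for illustration.

```python
from fractions import Fraction


def captured_frequency_mhz(f_sample_mhz, table_len, cycles=1):
    """Frequency (MHz) of a tone that completes `cycles` periods in `table_len` samples."""
    return Fraction(f_sample_mhz) * cycles / table_len


f_a = captured_frequency_mhz(750, 16)    # 16 values at 750 MHz
f_b = captured_frequency_mhz(1000, 22)   # 22 values at 1 GHz

print(float(f_a))            # 46.875
print(round(float(f_b), 2))  # 45.45
```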

calibra

Voyager

05-04-2020 05:41 AM

681 Views

Registered:
06-20-2012

Ok.

So divide the "general-purpose sine and cosine table with a fixed number of values up to pi / 2" into 4 tables, each holding a quarter of the values, interleaved so that consecutive values sit in different tables. You can then read 4 sines in parallel.

Ex.

1 ROM, 8 values: (1,2,3,4,5,6,7,8)

4 ROMs, 2 values each: (1,5) (2,6) (3,7) (4,8)

Same area, but throughput multiplied by 4.
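
The interleaving in the example above can be modeled in a few lines. This is a software sketch only: `read_lanes` and the bank-conflict check are illustrative additions, and the sketch assumes each bank can be given its own row address in the same clock.

```python
LANES = 4
table = [1, 2, 3, 4, 5, 6, 7, 8]  # the 8-value ROM from the example

# Bank k holds every 4th entry starting at index k: (1,5) (2,6) (3,7) (4,8).
banks = [table[k::LANES] for k in range(LANES)]


def read_lanes(start, step=1):
    """Read LANES table entries in one 'clock', one entry from each bank."""
    addrs = [(start + k * step) % len(table) for k in range(LANES)]
    if len({a % LANES for a in addrs}) != LANES:
        # Two addresses landed in the same bank; a single-port ROM cannot serve both.
        raise ValueError("bank conflict")
    return [banks[a % LANES][a // LANES] for a in addrs]


print(banks)          # [[1, 5], [2, 6], [3, 7], [4, 8]]
print(read_lanes(0))  # [1, 2, 3, 4]
print(read_lanes(2))  # [3, 4, 5, 6]
```

Note that the four reads only land in four different banks when the address step between samples is odd (1 in the example). With an even step, such as `read_lanes(0, step=2)`, two lanes hit the same bank and the sketch raises an error.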

leif@rdos.net

Visitor

05-04-2020 07:32 AM - edited 05-04-2020 07:45 AM

623 Views

Registered:
12-29-2019

It seems like the CORDIC IP can produce both sine and cosine values in just one clock cycle, and that it might work at 250 MHz with a Kintex-7 device. It doesn't have huge resource usage requirements, and so adding 8 of these should work. It will definitely be worth a try.

Correction: It should be enough with 4 as both channels can operate on the same sine and cosine phase. It's just the accumulators that need to be separated.
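
One way to picture the four sine/cosine generators sharing a phase across both channels is a single phase accumulator that advances by four steps per clock and hands each generator its own offset. The sketch below is an assumption about how that could be organized, not a description of any IP core; the 32-bit phase width and the names `tuning_word` and `parallel_phases` are hypothetical.

```python
PHASE_BITS = 32                 # assumed accumulator width
MASK = (1 << PHASE_BITS) - 1
LANES = 4                       # 4 samples per 250 MHz clock


def tuning_word(f_target, f_sample):
    """Phase increment per sample, as a fraction of a full turn scaled to 2**PHASE_BITS."""
    return round(f_target / f_sample * (1 << PHASE_BITS))


def parallel_phases(step, cycles):
    """Per clock, emit the LANES phases that a one-sample-per-clock accumulator would visit."""
    acc = 0
    out = []
    for _ in range(cycles):
        out.append([(acc + k * step) & MASK for k in range(LANES)])
        acc = (acc + LANES * step) & MASK  # advance by 4 samples per clock
    return out


step = tuning_word(45.0e6, 1.0e9)
lanes = parallel_phases(step, cycles=3)
serial = [(n * step) & MASK for n in range(3 * LANES)]
print([p for cycle in lanes for p in cycle] == serial)
```

Each of the four phases would feed one sine/cosine generator, and the resulting coefficients would be shared by both channels, with only the accumulators of the products kept separate per channel.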

bruce_karaffa

Scholar

05-04-2020 08:07 AM

614 Views

Registered:
06-21-2017

A CORDIC will require at least one clock cycle per bit of precision at the output.

avrumw

Guide

05-04-2020 08:25 AM

607 Views

Registered:
01-23-2009

A 64k x 14 table would require 32 block RAMs. Each block RAM can be true dual port, so you can use one table to look up two values per clock. To get 8 values per clock you would need 4 of these, for a total of 128 block RAMs. While this is a large number, the Kintex-7 325T (the device in the KC705) has 445 of them, so if you aren't using them heavily for other things, you should be able to use them.

Of course you will have to be careful using RAMs like this - they will be scattered throughout the die so you need to pipeline carefully to use them (including using the output registers of the RAM). But since latency isn't an issue, this shouldn't be a problem.

Avrum
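
The block RAM count above can be reproduced with a short calculation. The sketch assumes each 36 Kb block RAM is configured as 2K x 18, which is one of several possible aspect ratios and is not stated in the post; the helper name `brams_needed` is made up for illustration.

```python
import math


def brams_needed(depth, width, bram_depth, bram_width):
    """Number of block RAMs needed to tile a depth x width table."""
    return math.ceil(depth / bram_depth) * math.ceil(width / bram_width)


# Assumption: 36 Kb block RAM used as 2K entries x 18 bits.
table_brams = brams_needed(64 * 1024, 14, bram_depth=2048, bram_width=18)

values_per_clock = 8
ports_per_bram = 2                                   # true dual port: two reads per clock
table_copies = values_per_clock // ports_per_bram    # 4 copies of the table
total_brams = table_brams * table_copies

print(table_brams, total_brams)  # 32 128
```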

leif@rdos.net

Visitor

05-04-2020 11:14 AM - edited 05-04-2020 01:44 PM

531 Views

Registered:
12-29-2019

From the CORDIC PDF it seems like it can operate in parallel mode and use one pipeline stage per bit, while still being able to output one sine and one cosine per clock. I can just feed it with an increasing phase value and then use the outputs as coefficients without having to know which phase they correspond to.

Tested it with KC705, and it only uses a bit over 1,000 LUTs and FFs with a 256k x 14 configuration. It works at 187.5 MHz and outputs a new value every clock.
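
To illustrate why one pipeline stage per bit still yields one result per clock, here is a floating-point model of a rotation-mode CORDIC. It is a generic textbook sketch, not the behavior of the Xilinx CORDIC IP: the function name `cordic_cos_sin` and the choice of 16 iterations are assumptions, and real hardware would use fixed-point shifts and adds instead of floating-point multiplies.

```python
import math


def cordic_cos_sin(angle, iterations=16):
    """Rotation-mode CORDIC for |angle| <= pi/2; returns (cos, sin)."""
    # Pre-compensate the constant gain that the micro-rotations introduce.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        # Each loop pass corresponds to one pipeline stage in a parallel CORDIC.
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** (-i), y + d * x * 2.0 ** (-i)
        z -= d * math.atan(2.0 ** (-i))
    return x, y


c, s = cordic_cos_sin(math.pi / 6)
print(abs(s - 0.5) < 1e-3, abs(c - math.cos(math.pi / 6)) < 1e-3)
```

Because every stage depends only on the previous stage's output, a new phase can enter the first stage each clock while earlier phases are still moving through the later stages. The latency equals the number of stages, but the throughput stays at one sine/cosine pair per clock.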