07-23-2019 10:47 AM
Throughput is a parameter used to measure performance. When implementing different algorithms in an HDL, the synthesis tools do not report throughput directly; it has to be worked out manually. So, how do you calculate throughput from an HDL simulation tool such as the Xilinx tools?
07-23-2019 11:42 AM
Input bits per second (bps) is the input bandwidth; output bits per second (bps) is the output bandwidth.
The advantage of a register transfer language like Verilog or VHDL (also called a Hardware Description Language - HDL) is that you explicitly define your clock(s) and your data inputs and outputs. As an example, if you have ten 10 Gbps transceivers, you have 100 Gbps in, and 100 Gbps out. One would write their code to handle that, meet all timing, and achieve the needed latency.
So, one does not ask how to calculate the bandwidth, but rather how to write your Verilog or VHDL to meet the required bandwidth. Then one has to figure out how to use the I/O of the device to get the necessary signals in and out, and finally whether the device chosen is big enough to fit the design, whether you can supply enough power, and whether there is enough heat sinking to cool it.
07-27-2019 10:19 AM
Thanks, sir, for your reply, and I'm sorry for this late answer. I have 219,620 lines in my input, and each line is 16 bits. When I simulate I reach 180 MHz, and the formula for the throughput is going to be:
clock speed in Hz / <clock cycles for input / output> .
I just want to know how to determine the clock cycles per input/output; I didn't understand that part.
07-27-2019 02:27 PM
That's pretty meaningless. What is the interface bandwidth on the chip? From a system point of view, this is what you should know at the start, and then design the system to accommodate the interface.
07-28-2019 05:43 AM
@assaad1- The clock cycles per input or output is something that you need to determine from your design. For something like image processing, you're probably aiming for either one or two cycles per input pixel. Or you can measure it in simulation, by checking how frequently your design reads another input.
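To make the formula concrete, here's a small Python sketch. The clock rate and sample width come from this thread (180 MHz, 16-bit inputs); the cycles-per-sample values are assumed examples that you'd replace with whatever you measure in your own simulation.

```python
def throughput_bps(clock_hz, cycles_per_sample, bits_per_sample):
    """Sustained throughput in bits per second:
    (clock rate / clock cycles per sample) * bits per sample."""
    return clock_hz / cycles_per_sample * bits_per_sample

# 180 MHz clock, 16-bit samples, one sample accepted every clock:
print(throughput_bps(180e6, 1, 16))  # 2.88e9 bits/s (2.88 Gbps)

# Same design, but accepting a sample only every other clock:
print(throughput_bps(180e6, 2, 16))  # 1.44e9 bits/s
```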
07-28-2019 03:26 PM
Wow, 219k inputs each with 16 bits? That's a pretty amazing I/O count. Can you tell me which FPGA on which board has this many I/Os?
Or is it that you are building an algorithm that is supposed to work within your board, one that takes that many inputs from a bus somehow rather than from an external device? Say from an AXI interconnect? In that case, I think AXI will support up to 512 bits transferred at a time, although not all of the logic I've seen will support transferring 512 bits per clock cycle.
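For a rough sense of what a bus that wide buys you, peak bandwidth is just data width times clock rate. A quick Python sketch (the 250 MHz clock is an assumed figure, not from any particular board, and this ignores protocol overhead):

```python
def peak_bandwidth_GBps(data_width_bits, clock_hz):
    """Peak bus bandwidth in GB/s, assuming one full-width beat per clock."""
    return data_width_bits / 8 * clock_hz / 1e9

# Hypothetical 512-bit AXI data path at an assumed 250 MHz:
print(peak_bandwidth_GBps(512, 250e6))  # 16.0 GB/s
```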
08-05-2019 12:56 PM
I'm sorry for this late answer. I'm building an algorithm and I'm using a testbench. In general I'm working on images, each one with 219k lines, and the result is the same: 219k.
08-14-2019 04:42 AM
So ... in general, throughput = transfers per clock times clock rate.
I've been fairly successful using SymbiYosys to calculate the throughput of a particular component (assuming all the sources are available) by using the SystemVerilog cover() statement. Specifically, I'd try covering several transactions in a row. When/if you try something like that, you'll quickly discover that there are a lot of cores out there that don't even try to achieve one transfer per clock in the first place.
If only things were even that simple.
If you build an (SDR/DDR?) SDRAM interface, you'll quickly discover that even if the core could handle one beat per clock (and I think the MIG can as I recall), it cannot sustain that rate since it needs to be taken periodically off-line in order to refresh itself.
Some controllers, such as the AXI2MM controller(s) have a derating associated with them as well. Basically it says, take the maximum throughput of the bus, and multiply it by the derating (75% for example), and that's the throughput you should be able to achieve with such a core.
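Those two effects, refresh downtime and a controller derating, can be sketched as simple multipliers. This is illustrative Python only; the refresh numbers below are rough DDR-style values I've assumed, not taken from the MIG documentation, and the 75% derating is just the example figure from above.

```python
def sustained_rate(peak_rate, refresh_time_s, refresh_interval_s, derating):
    """Derate a peak transfer rate by refresh downtime and a bus
    utilization factor: peak * (1 - t_refresh / t_interval) * derating."""
    refresh_loss = refresh_time_s / refresh_interval_s
    return peak_rate * (1.0 - refresh_loss) * derating

# Assumed numbers: 1.6 GB/s peak, ~350 ns refresh every 7.8 us, 75% derating.
print(sustained_rate(1.6e9, 350e-9, 7.8e-6, 0.75) / 1e9)
```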
My approach to the problem would therefore be: build it, measure it, fix it (i.e. rewrite cores that aren't fast enough), sell it (or turn it in if this is a student project). I understand some folks would do this in other orders, but that'd be my own approach.
08-19-2019 12:28 PM
Clock cycles per input is up to you. In your 219K lines of 16b data (a file for simulation), are you issuing one line each clock? Every other clock? That's what folks are trying to understand, in order to help you and others.
For example, if you're issuing 16b of data each clock cycle with a 180 MHz clock, your input BW is 360 MBps (megabytes per second, as every input is 2 bytes). Divide that by your enable rate to get your throughput (as you say, divide by "<clock cycles for input / output>").
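Written out as arithmetic (the every-other-clock enable rate below is just an assumed example, not a measurement of your design):

```python
clock_hz = 180e6         # 180 MHz clock from the simulation
bytes_per_transfer = 2   # 16-bit inputs = 2 bytes

# Peak input bandwidth at one transfer per clock:
peak_MBps = clock_hz * bytes_per_transfer / 1e6
print(peak_MBps)  # 360.0 MB/s

# Example: if the design only accepts an input every other clock,
# divide by the clock cycles per input:
cycles_per_input = 2
print(peak_MBps / cycles_per_input)  # 180.0 MB/s sustained
```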
As @richardhead points out, your system, simulation aside, has to then meet timing requirements to actually work: " design the system to accommodate the interface" required throughput.
Now from your later post "the result is the same 219k". Your transform on the data doesn't have any rate change, so your output rate is the same as your input rate.