alcatraz1372
Visitor
Registered: ‎03-29-2019

problem with dataflow sub-function example

I am using SDAccel 2018.2 on CentOS 7.4 and trying to run [this example](https://github.com/Xilinx/SDAccel_Examples/tree/master/getting_started/dataflow/dataflow_subfunc_ocl).

Without changing anything, after the HW-EMU build I get:
latency: min=138, max=?
interval: min=138, max=?
type=none

After commenting out the dataflow pragma to observe its effect, I get:
latency: min=6, max=?
interval: min=6, max=?

It isn't supposed to be this way, right? I would expect the interval to be smaller when the dataflow pragma is used, not larger.

For further investigation, I launched Vivado HLS for the adder kernel and ran synthesis there; the full build log is included below.

I also noticed a difference between the summary and instance sections of the synthesis report in Vivado HLS. As you can see in the screenshot, the interval reported in the summary section is 138, but in the instance section it is 2. Note that for this report the dataflow pragma is NOT commented out.

(I am fairly new to the Vivado HLS and SDAccel environments, so forgive me if I am asking something obvious :D)

Starting C synthesis ...
/opt/Xilinx/Vivado/2018.2/bin/vivado_hls /home/saaa/00_Xilinx_WS/Ex1-11a/Emulation-HW/adder/adder/adder/adder/solution/csynth.tcl
INFO: [HLS 200-10] Running '/opt/Xilinx/Vivado/2018.2/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'saaa on host 'localhost' (Linux_x86_64 version 3.10.0-693.el7.x86_64) on Fri Mar 29 22:45:20 2019
INFO: [HLS 200-10] On os "CentOS Linux release 7.4.1708 (Core) "
INFO: [HLS 200-10] In directory '/home/saaa/00_Xilinx_WS/Ex1-11a/Emulation-HW/adder/adder/adder'
INFO: [HLS 200-10] Opening project '/home/saaa/00_Xilinx_WS/Ex1-11a/Emulation-HW/adder/adder/adder/adder'.
INFO: [HLS 200-10] Adding design file '../../../../src/adder.cl' to the project
INFO: [HLS 200-10] Opening solution '/home/saaa/00_Xilinx_WS/Ex1-11a/Emulation-HW/adder/adder/adder/adder/solution'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 4ns.
INFO: [SYN 201-201] Setting up clock 'default' with an uncertainty of 1.08ns.
INFO: [HLS 200-10] Setting target device to 'xcvu9p-flgb2104-2-i'
INFO: [XFORM 203-1171] Pipeline the innermost loop with trip count more than 64 or its parent loop when its trip count is less than or equal 64.
INFO: [XFORM 203-1161] The maximum of name length is set into 256.
INFO: [XFORM 203-701] Set default FIFO size in dataflow to 32.
INFO: [XFORM 203-701] Set the default channel type in dataflow to FIFO.
INFO: [XFORM 203-1171] Pipeline the innermost loop with trip count more than 64 or its parent loop when its trip count is less than or equal 64.
INFO: [XFORM 203-1161] The maximum of name length is set into 256.
INFO: [XFORM 203-701] Set default FIFO size in dataflow to 32.
INFO: [XFORM 203-701] Set the default channel type in dataflow to FIFO.
INFO: [HLS 200-10] Starting synthesis with clang3.9 flow ...
INFO: [SCHED 204-61] Option 'relax_ii_for_timing' is enabled, will increase II to preserve clock frequency constraints.
INFO: [HLS 200-10] Analyzing design file '../../../../src/adder.cl' ... 
WARNING: [HLS 200-40] clang: warning: argument unused during compilation: '-I /home/saleh/01_workspace/00_Xilinx_WS/Ex1-11a/src' [-Wunused-command-line-argument]\n
INFO: [HLS 200-111] Finished Linking Time (s): cpu = 00:00:01 ; elapsed = 00:00:01 . Memory (MB): peak = 455.887 ; gain = 0.113 ; free physical = 23650 ; free virtual = 35571
INFO: [HLS 200-111] Finished Checking Pragmas Time (s): cpu = 00:00:01 ; elapsed = 00:00:01 . Memory (MB): peak = 455.887 ; gain = 0.113 ; free physical = 23650 ; free virtual = 35571
INFO: [HLS 200-10] Starting code transformations ...
INFO: [HLS 200-111] Finished Standard Transforms Time (s): cpu = 00:00:04 ; elapsed = 00:00:04 . Memory (MB): peak = 455.887 ; gain = 0.113 ; free physical = 23651 ; free virtual = 35572
INFO: [HLS 200-10] Checking synthesizability ...
INFO: [HLS 200-111] Finished Checking Synthesizability Time (s): cpu = 00:00:04 ; elapsed = 00:00:04 . Memory (MB): peak = 455.887 ; gain = 0.113 ; free physical = 23651 ; free virtual = 35572
INFO: [XFORM 203-510] Pipelining loop 'Loop-2' in function 'write_result_proc.adder.1' automatically.
INFO: [XFORM 203-510] Pipelining loop 'Loop-2' in function 'read_input_proc.adder.1' automatically.
WARNING: [XFORM 203-713] All the elements of global array 'buffer_in'  should be updated in process function 'read_input_proc.adder.13', otherwise it may not be synthesized correctly.
WARNING: [XFORM 203-713] All the elements of global array 'buffer_out'  should be updated in process function 'compute_add_proc', otherwise it may not be synthesized correctly.
INFO: [XFORM 203-712] Applying dataflow to function 'run_subfunc.adder.1', detected/extracted 3 process function(s): 
	 'read_input_proc.adder.13'
	 'compute_add_proc'
	 'write_result_proc.adder.1'.
INFO: [HLS 200-111] Finished Pre-synthesis Time (s): cpu = 00:00:04 ; elapsed = 00:00:04 . Memory (MB): peak = 583.777 ; gain = 128.004 ; free physical = 23634 ; free virtual = 35554
INFO: [HLS 200-111] Finished Architecture Synthesis Time (s): cpu = 00:00:04 ; elapsed = 00:00:04 . Memory (MB): peak = 583.777 ; gain = 128.004 ; free physical = 23608 ; free virtual = 35529
INFO: [HLS 200-10] Starting hardware synthesis ...
INFO: [HLS 200-10] Synthesizing 'adder' ...
WARNING: [SYN 201-103] Legalizing function name 'read_input_proc.adder.13' to 'read_input_proc_adder_13'.
WARNING: [SYN 201-103] Legalizing function name 'write_result_proc.adder.1' to 'write_result_proc_adder_1'.
WARNING: [SYN 201-103] Legalizing function name 'run_subfunc.adder.1' to 'run_subfunc_adder_1'.
WARNING: [SYN 201-107] Renaming port name 'adder/in' to 'adder/in_r' to avoid the conflict with HDL keywords or other object names.
WARNING: [SYN 201-107] Renaming port name 'adder/out' to 'adder/out_r' to avoid the conflict with HDL keywords or other object names.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'read_input_proc_adder_13' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining loop 'Loop 1'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 3.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 4.44 seconds; current allocated memory: 100.880 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 0.24 seconds; current allocated memory: 101.542 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'compute_add_proc' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining loop 'compute'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 2.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 0.14 seconds; current allocated memory: 101.733 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 0.02 seconds; current allocated memory: 101.856 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'write_result_proc_adder_1' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining loop 'Loop 1'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 3.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 0.14 seconds; current allocated memory: 102.612 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 0.23 seconds; current allocated memory: 103.175 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'run_subfunc_adder_1' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 0.07 seconds; current allocated memory: 103.297 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 0.02 seconds; current allocated memory: 103.504 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'adder' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 0.03 seconds; current allocated memory: 103.562 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 0.02 seconds; current allocated memory: 103.664 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'read_input_proc_adder_13' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'read_input_proc_adder_13'.
INFO: [HLS 200-111]  Elapsed time: 0.22 seconds; current allocated memory: 105.314 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'compute_add_proc' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'compute_add_proc'.
INFO: [HLS 200-111]  Elapsed time: 0.12 seconds; current allocated memory: 106.660 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'write_result_proc_adder_1' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'write_result_proc_adder_1'.
INFO: [HLS 200-111]  Elapsed time: 0.21 seconds; current allocated memory: 108.554 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'run_subfunc_adder_1' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'run_subfunc_adder_1'.
INFO: [HLS 200-111]  Elapsed time: 0.13 seconds; current allocated memory: 110.261 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'adder' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-500] Setting interface mode on port 'adder/gmem' to 'm_axi'.
INFO: [RTGEN 206-500] Setting interface mode on port 'adder/in_r' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'adder/out_r' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'adder/inc' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'adder/size' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on function 'adder' to 's_axilite & ap_ctrl_hs'.
INFO: [RTGEN 206-100] Bundling port 'return', 'in_r', 'out_r', 'inc' to AXI-Lite port control.
INFO: [RTGEN 206-100] Finished creating RTL model for 'adder'.
INFO: [HLS 200-111]  Elapsed time: 0.12 seconds; current allocated memory: 111.597 MB.
INFO: [RTMG 210-285] Implementing FIFO 'buffer_in_channel_U(adder_fifo_w32_d32_A)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'size_c_U(adder_fifo_w32_d32_A)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'out_c_U(adder_fifo_w64_d32_A)' using Block RAMs.
INFO: [RTMG 210-285] Implementing FIFO 'inc_c_U(adder_fifo_w32_d32_A)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'buffer_out_channel_U(adder_fifo_w32_d32_A)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'size_c7_U(adder_fifo_w32_d32_A)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'start_for_compute_add_proc_U0_U(adder_start_for_compute_add_proc_U0)' using Shift Registers.
INFO: [RTMG 210-285] Implementing FIFO 'start_for_write_result_proc_adder_1_U0_U(adder_start_for_write_result_proc_adder_1_U0)' using Shift Registers.
INFO: [HLS 200-111] Finished generating all RTL models Time (s): cpu = 00:00:06 ; elapsed = 00:00:07 . Memory (MB): peak = 583.777 ; gain = 128.004 ; free physical = 23583 ; free virtual = 35508
INFO: [SYSC 207-301] Generating SystemC RTL for adder with prefix adder_.
INFO: [VHDL 208-304] Generating VHDL RTL for adder with prefix adder_.
INFO: [VLOG 209-307] Generating Verilog RTL for adder with prefix adder_.
INFO: [HLS 200-112] Total elapsed time: 6.96 seconds; peak allocated memory: 111.597 MB.
Finished C synthesis. 

 

Screenshot from 2019-03-29 22-58-51.png

Accepted Solutions
alcatraz1372
Visitor
Registered: ‎03-29-2019
It turns out that applying the dataflow pragma to the top-level function of the kernel, rather than only to the sub-function, solves the issue.
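For anyone landing here later, this is a minimal sketch of what "dataflow on the top function" might look like. It is not the actual adder.cl from the example: the stage bodies are simplified, BUFFER_SIZE is a made-up value, and calling the three stages directly from the kernel instead of going through run_subfunc is my assumption about the fix. The XCL_DATAFLOW macro and the #ifndef shim are also my additions; they only exist so the same file compiles as plain C for a quick functional check on the host. In the OpenCL build the macro expands to the SDAccel `xcl_dataflow` attribute.

```c
/* Sketch only, not the original adder.cl. */
#ifndef __OPENCL_VERSION__
/* Host-compile shim (assumption): strip OpenCL qualifiers for plain C. */
#define __kernel
#define __global
#define XCL_DATAFLOW
#else
/* In the OpenCL build, request dataflow with the SDAccel attribute. */
#define XCL_DATAFLOW __attribute__((xcl_dataflow))
#endif

#define BUFFER_SIZE 256 /* hypothetical; this sketch assumes size <= BUFFER_SIZE */

/* Stage 1: copy input from global memory into a local buffer. */
void read_input(__global const int *in, int *buffer_in, int size) {
    for (int i = 0; i < size; i++) {
        buffer_in[i] = in[i];
    }
}

/* Stage 2: add the increment to every element. */
void compute_add(const int *buffer_in, int *buffer_out, int inc, int size) {
    for (int i = 0; i < size; i++) {
        buffer_out[i] = buffer_in[i] + inc;
    }
}

/* Stage 3: copy results from the local buffer back to global memory. */
void write_result(__global int *out, const int *buffer_out, int size) {
    for (int i = 0; i < size; i++) {
        out[i] = buffer_out[i];
    }
}

/* Dataflow is requested on the kernel's top-level function itself,
 * so the three stage calls form the dataflow region. */
__kernel XCL_DATAFLOW
void adder(__global int *in, __global int *out, int inc, int size) {
    int buffer_in[BUFFER_SIZE];
    int buffer_out[BUFFER_SIZE];

    read_input(in, buffer_in, size);
    compute_add(buffer_in, buffer_out, inc, size);
    write_result(out, buffer_out, size);
}
```

I have not run this sketch through the tools, so check the interval in the summary section of the HLS report after rebuilding rather than taking the numbers for granted.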

4 Replies
brucey
Xilinx Employee
Registered: ‎03-24-2010

Your understanding is correct: the dataflow version should have a smaller interval than the non-dataflow version.

But the Vivado HLS synthesis report here shows that this is a non-dataflow version (interval = 138).

Regards,
brucey
Capture.PNG
alcatraz1372
Visitor
Registered: ‎03-29-2019

Thanks for your kind reply,

 

I double-checked, and unfortunately that is not the case.

In fact, I created a new separate project and built it without any changes to the source code (so it has the dataflow pragma) and the result is the same.

HLS Report --> Timing --> Summary : Latency = 138  --- Interval = 138

 

The system estimate report for HW-EMU mode is attached for the new project mentioned above.

As you can see in the attached file, the best-case interval for the "adder" module is 138.

 

I am using Xilinx SDx v2018.2 (64-bit) on CentOS 7.4 x64.

alcatraz1372
Visitor
Registered: ‎03-29-2019

Any suggestions? 
