09-18-2020 10:59 AM
Hi all,
I am a beginner trying to get the Xilinx Vitis Graph Library to work on an AWS F1 instance. The code runs in sw_emu but fails in hw_emu. I'm not sure why this is happening or whether it is related to a hardware compatibility issue. This is the library in question: https://github.com/Xilinx/Vitis_Libraries/tree/master/graph/L2/benchmarks
I've followed the build steps and set PLATFORM_REPO_PATHS to point at aws-fpga/Xilinx/. Software emulation works fine, but hardware emulation breaks when the code is actually run. Here's a sample of the output:
INFO: Found Device=xilinx_aws-vu9p-f1_shell-v04261818_201920_2
INFO: Importing build_dir.hw_emu.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/kernel_pagerank_0.xclbin
Loading: 'build_dir.hw_emu.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/kernel_pagerank_0.xclbin'
INFO: Hardware emulation runs simulation underneath. Using a large data set will result in long simulation times. It is recommended that a small dataset is used for faster execution. The flow uses approximate models for DDR memory and interconnect and hence the performance data generated is approximate.
INFO: Kernel has been created
XRT build version: 2.3.0
Build hash: 9e13d57c4563e2c19bf5f518993f6e5a8dadc18a
Build date: 2020-02-06 15:08:44
Git branch: 2019.2
PID: 7596
UID: 1000
HOST: ip-172-31-35-133.ec2.internal
EXE: /home/centos/Vitis_Libraries/graph/L2/benchmarks/pagerank/build_dir.hw_emu.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/host.exe
WARNING: Argument '5' of kernel 'kernel_pagerank_0' is allocated in memory bank 'bank0'; compute unit 'kernel_pagerank_0_1' cannot be used with this argument and is ignored.
ERROR: kernel 'kernel_pagerank_0' has no compute units to support required argument connectivity.
Any help with this would be much appreciated! I'm not sure whether this is because the underlying FPGA is not supported or something else. Thanks!
09-19-2020 01:59 AM
The reference design targets the U250. When you retargeted the design to xilinx_aws-vu9p-f1_shell-v04261818_201920_2, how did you modify the config.ini file to match?
09-19-2020 11:01 AM
hongh,
Thanks for your reply! I actually did not make any modifications to the config.ini file (I assume you mean conn_u200_u250.ini). I wasn't aware that changes were supposed to be made there.
09-20-2020 01:47 PM
For more detail, here are the contents of that file:
[connectivity]
sp = kernel_pagerank_0.offsetCSC:bank0
sp = kernel_pagerank_0.indiceCSC:bank0
sp = kernel_pagerank_0.weightCSC:bank0
sp = kernel_pagerank_0.degreeCSR:bank0
sp = kernel_pagerank_0.cntValFull:bank0
sp = kernel_pagerank_0.buffPing:bank0
sp = kernel_pagerank_0.buffPong:bank0
sp = kernel_pagerank_0.orderUnroll:bank0
slr = kernel_pagerank_0:SLR0
nk = kernel_pagerank_0:1:kernel_pagerank_0
Is it possible that the underlying FPGA that AWS is using does not have "bank0"?
09-20-2020 11:58 PM
Please try assigning the kernel arguments to DDR[0] instead of bank0.
[connectivity]
sp = kernel_pagerank_0.offsetCSC:DDR[0]
sp = kernel_pagerank_0.indiceCSC:DDR[0]
...
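The same substitution can be applied to every sp line at once. Below is a minimal sketch in Python, assuming each legacy target bankN should simply become DDR[N]; that index mapping is an assumption and should be checked against the memory names the target platform actually reports. The function name retarget_sp_lines is made up for illustration.

```python
import re

# An "sp = kernel.arg:bankN" line; other keys such as slr and nk are left alone.
_SP_BANK = re.compile(r"^(\s*sp\s*=\s*\S+):bank(\d+)\s*$")

def retarget_sp_lines(text: str) -> str:
    """Rewrite 'sp = kernel.arg:bankN' lines to 'sp = kernel.arg:DDR[N]'.

    Assumption: bank index N maps one-to-one onto DDR[N] on the target platform.
    """
    out = []
    for line in text.splitlines():
        match = _SP_BANK.match(line)
        if match:
            line = f"{match.group(1)}:DDR[{match.group(2)}]"
        out.append(line)
    return "\n".join(out)

example = """[connectivity]
sp = kernel_pagerank_0.offsetCSC:bank0
sp = kernel_pagerank_0.indiceCSC:bank0
slr = kernel_pagerank_0:SLR0
nk = kernel_pagerank_0:1:kernel_pagerank_0"""

print(retarget_sp_lines(example))
```

Only the sp lines change; the slr and nk lines pass through untouched.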
09-21-2020 02:57 PM
hongh,
I just tried your suggestion. Unfortunately this doesn't seem to work either.
09-21-2020 08:45 PM
I finally got this to work in hardware emulation at least. The issue was in the makefile. I am now encountering another error when running the hw build on the actual FPGA device. Here is the output:
INFO: Found Device=xilinx_aws-vu9p-f1_dynamic_5_0
INFO: Importing build_dir.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/kernel_pagerank_0.xclbin
Loading: 'build_dir.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/kernel_pagerank_0.xclbin'
XRT build version: 2.3.0
Build hash: 9e13d57c4563e2c19bf5f518993f6e5a8dadc18a
Build date: 2020-02-06 15:08:44
Git branch: 2019.2
PID: 1921
UID: 1000
[Tue Sep 22 03:31:46 2020]
HOST: ip-172-31-35-133.ec2.internal
EXE: /home/centos/Vitis_Libraries/graph/L2/benchmarks/pagerank/build_dir.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/host.exe
[XRT] ERROR: See dmesg log for details. err=-22
[XRT] ERROR: Failed to load xclbin.
INFO: Kernel has been created
[XRT] ERROR: std::bad_alloc
INFO: Finish kernel setup
[XRT] ERROR: No program executable for device
[XRT] ERROR: event is nullptr
[XRT] ERROR: std::bad_alloc
INFO: Finish kernel execution
INFO: Finish E2E execution
-------------------------------------------------------
INFO: Data transfer from host to device: 18446744072990316 us
-------------------------------------------------------
INFO: Data transfer from device to host: 18446744072990316 us
-------------------------------------------------------
INFO: Average kernel execution per run: 18446744072990316 us
-------------------------------------------------------
INFO: Average execution per run: 55340232218970948 us
-------------------------------------------------------
resultinPong = 0
iterations = 0
[XRT] ERROR: std::bad_alloc
[XRT] ERROR: std::bad_alloc
[XRT] ERROR: std::bad_alloc
INFO: sum_golden = 4.30706
INFO: sum_pagerank = 0
pagerank i = 0 our = 0 golden = 1
pagerank i = 1 our = 0 golden = 0.72602
pagerank i = 2 our = 0 golden = 1
pagerank i = 3 our = 0 golden = 0.15
pagerank i = 4 our = 0 golden = 0.38683
pagerank i = 5 our = 0 golden = 0.24746
pagerank i = 6 our = 0 golden = 0.23147
pagerank i = 7 our = 0 golden = 0.23341
pagerank i = 8 our = 0 golden = 0.18187
pagerank i = 9 our = 0 golden = 0.15
INFO: Accurate Rate = 0
INFO: Err Geomean = 1.71
INFO: Result is wrong
[XRT] ERROR: std::bad_alloc
[XRT] ERROR: Event '1' is unreferenced but not complete
[XRT] ERROR: std::bad_alloc
terminate called after throwing an instance of 'xrt::error'
what(): event 1 never submitted
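A note on the timing figures in this log: they do not look like real measurements, which fits the xclbin never having loaded. The quick check below, a sketch in Python, shows that the "average execution" value is exactly three times the per-stage value, and that the per-stage value is what a small negative time difference would look like after wrapping to an unsigned 64-bit integer and being divided by 1000. The nanosecond unit and the wraparound explanation are assumptions based only on the numbers; the host code's timing logic was not inspected. The helper name wrapped_us is made up for illustration.

```python
# Values copied from the log above (microseconds as printed).
per_stage_us = 18446744072990316
avg_execution_us = 55340232218970948

# The end-to-end figure is exactly the sum of three identical stage figures.
print(avg_execution_us == 3 * per_stage_us)  # True

# Distance between the printed value (scaled back to ns) and 2**64.
gap_ns = 2**64 - per_stage_us * 1000
print(gap_ns)  # 719235616

def wrapped_us(delta_ns: int) -> int:
    """Reinterpret a signed ns difference as unsigned 64-bit, then convert to us.

    Assumption: timestamps are in ns and are subtracted in unsigned arithmetic.
    """
    return (delta_ns % 2**64) // 1000

# A difference of about -0.72 s reproduces the printed figure.
print(wrapped_us(-gap_ns) == per_stage_us)  # True
```

If this reading is right, the timings are a side effect of the failed load rather than a separate problem, and the first error to chase is "Failed to load xclbin" together with the dmesg log it points to.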