benedetto73
Adventurer
1,402 Views
Registered: ‎09-30-2019

Linux hangs when starting kernel on Alveo U280

My program (which invokes 2 kernels) runs fine in Emulation-HW.

When I start it on the hardware (Alveo U280), Linux hangs after a few seconds.

The dmesg output is attached.

Using Ubuntu 18.04 

 

Ideas?

 

 

13 Replies
mcertosi
Xilinx Employee
1,325 Views
Registered: ‎10-19-2015

Hi @benedetto73 

The dmesg output indicates an AXI Firewall trip. When the card's AXI Firewall trips, XRT issues a reset to the card. Depending on the server you are using, a PCIe reset can cause a kernel panic.

To continue debugging the application, disable the XRT health check, which automatically sends the reset, with the following steps:

  1.  sudo modinfo xclmgmt: This command lists the current configuration of the module and indicates if the health_check parameter is on or off. It also returns the path to the xclmgmt module.

  2.  sudo rmmod xclmgmt: This removes and therefore disables the xclmgmt kernel module.

  3.  sudo insmod <path to module>/xclmgmt.ko health_check=0: This reloads the xclmgmt kernel module with the health check disabled.
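
After reloading the module, it can help to confirm that the parameter actually took effect. Linux conventionally exposes readable module parameters under /sys/module/<module>/parameters/<param>. The sketch below reads that file; the helper name read_module_param and the sysfs_root argument are made up for illustration, and whether xclmgmt exports health_check to sysfs depends on how the driver declares it, so treat a missing file as "unknown" rather than "off".

```python
from pathlib import Path
from typing import Optional


def read_module_param(module: str, param: str,
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the live value of a module parameter, or None if it cannot be read.

    Hypothetical helper: it relies only on the generic Linux sysfs layout
    /sys/module/<module>/parameters/<param>. It does not call any XRT API.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not exported, or no read permission.
        return None


if __name__ == "__main__":
    value = read_module_param("xclmgmt", "health_check")
    if value is None:
        print("health_check not readable (module not loaded or parameter not exported)")
    else:
        print(f"xclmgmt health_check = {value}")
```

If the file is readable, a value of 0 after the insmod step would suggest the health check is disabled.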

 

Relevant kernel messages: 

[ +4,127458] firewall.m firewall.m.9437184: check_firewall: AXI Firewall 3 tripped, status: 0x100000

[ +0,000003] xclmgmt 0000:01:00.0: xclmgmt_reset_pci: Reset PCI
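
For longer logs, lines like the ones above can be picked out automatically. This is a minimal sketch that assumes the message format shown in this thread ("AXI Firewall N tripped, status: 0x..."); the names FirewallTrip and find_firewall_trips are invented here. It only reports which bit positions are set in the status word. It does not name the violations, since the meaning of each bit comes from the AXI Firewall IP documentation.

```python
import re
from typing import List, NamedTuple


class FirewallTrip(NamedTuple):
    index: int           # firewall number printed in the log line
    status: int          # raw status word
    set_bits: List[int]  # positions of the bits set in the status word


# Pattern taken from the dmesg line quoted in this thread; other XRT
# versions may word the message differently.
TRIP_RE = re.compile(r"AXI Firewall (\d+) tripped, status: (0x[0-9a-fA-F]+)")


def find_firewall_trips(dmesg_text: str) -> List[FirewallTrip]:
    """Return every firewall trip found in the given dmesg text."""
    trips = []
    for match in TRIP_RE.finditer(dmesg_text):
        status = int(match.group(2), 16)
        bits = [b for b in range(status.bit_length()) if (status >> b) & 1]
        trips.append(FirewallTrip(int(match.group(1)), status, bits))
    return trips


if __name__ == "__main__":
    sample = ("[ +4,127458] firewall.m firewall.m.9437184: check_firewall: "
              "AXI Firewall 3 tripped, status: 0x100000")
    for trip in find_firewall_trips(sample):
        print(f"firewall {trip.index}: status {trip.status:#x}, bits set: {trip.set_bits}")
    # prints: firewall 3: status 0x100000, bits set: [20]
```

The output of dmesg can be piped into a file and passed to find_firewall_trips as a string.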

 

Questions:

Which shell, XRT, and Xilinx tool versions are you using?

Can you insert ILAs into the kernel? 

Can you insert profiling logic into the kernel?

 

Action:

Turn off the health check to prevent the server from hanging on boot.

Run xbutil query and send us the output. The query should report the AXI Firewall trip violation in plain text (instead of 0x100000).

Add ILAs to the kernel for further debugging.

Add the AXI Lightweight Protocol Checker to the DDR interface, then use xbutil status to determine whether there is a protocol violation in the kernel.

**Note: the directions above use the xbutil command, but in 2019.2 its functionality is split between xbmgmt and xbutil; use the -help option to assist navigation.

 

Regards,

M

benedetto73
Adventurer
1,310 Views
Registered: ‎09-30-2019

Here's another hang:

 

[dic 2 22:00] usb 3-4: USB disconnect, device number 2
[ +14,714145] usb 3-4: new full-speed USB device number 3 using xhci_hcd
[ +0,174787] usb 3-4: New USB device found, idVendor=046d, idProduct=c52b, bcdDevice=12.03
[ +0,000004] usb 3-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ +0,000003] usb 3-4: Product: USB Receiver
[ +0,000002] usb 3-4: Manufacturer: Logitech
[ +0,046244] logitech-djreceiver 0003:046D:C52B.0012: hiddev0,hidraw2: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:05:00.3-4/input2
[ +0,133459] input: Logitech M310/M310t as /devices/pci0000:00/0000:00:01.2/0000:02:00.0/0000:03:08.0/0000:05:00.3/usb3/3-4/3-4:1.2/0003:046D:C52B.0012/0003:046D:4031.0013/input/input28
[ +0,000142] logitech-hidpp-device 0003:046D:4031.0013: input,hidraw3: USB HID v1.11 Mouse [Logitech M310/M310t] on usb-0000:05:00.3-4:1
[ +0,007891] input: Logitech K520 as /devices/pci0000:00/0000:00:01.2/0000:02:00.0/0000:03:08.0/0000:05:00.3/usb3/3-4/3-4:1.2/0003:046D:C52B.0012/0003:046D:2011.0014/input/input29
[ +0,000285] logitech-hidpp-device 0003:046D:2011.0014: input,hidraw0: USB HID v1.11 Keyboard [Logitech K520] on usb-0000:05:00.3-4:2
[ +1,049659] logitech-hidpp-device 0003:046D:4031.0013: HID++ 2.0 device connected.
[ +3,603584] xocl 0000:01:00.1: _xocl_drvinst_open: OPEN 1
[ +0,000003] [drm] creating scheduler client for pid(3094), ret: 0
[ +0,001200] xmc.u xmc.u.11534336: xmc_read_from_peer: reading from peer
[ +0,000053] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 10 via HW
[ +0,000160] mailbox.m mailbox.m.13631488: process_request: received request from peer: 10, passed on
[ +0,000003] xclmgmt 0000:01:00.0: xclmgmt_read_subdev_req: req kind 0
[ +0,000070] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 10 via HW
[ +0,001708] icap.u icap.u.15728640: icap_read_from_peer: reading from peer
[ +0,000007] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 10 via HW
[ +0,000163] mailbox.m mailbox.m.13631488: process_request: received request from peer: 10, passed on
[ +0,000002] xclmgmt 0000:01:00.0: xclmgmt_read_subdev_req: req kind 1
[ +0,000058] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 10 via HW
[ +0,103081] rom.u rom.u.0: verify_timestamp: Shell timestamp: 0x5da8da6e
[ +0,000001] rom.u rom.u.0: verify_timestamp: Verify timestamp: 0x5da8da6e
[ +0,009882] xocl 0000:01:00.1: xocl_axlf_section_header: trying to find section header for axlf section 20
[ +0,000002] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 9
[ +0,000001] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 0
[ +0,000000] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 6
[ +0,000001] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 8
[ +0,000000] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 7
[ +0,000001] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 11
[ +0,000000] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 14
[ +0,000001] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 2
[ +0,000000] xocl 0000:01:00.1: xocl_axlf_section_header: saw section header: 22
[ +0,000001] xocl 0000:01:00.1: xocl_axlf_section_header: could not find section header 20
[ +0,000001] [drm] Finding MEM_TOPOLOGY section header
[ +0,000000] [drm] Section MEM_TOPOLOGY details:
[ +0,000001] [drm] offset = 0x30e9150
[ +0,000000] [drm] size = 0x648
[ +0,000065] icap.u icap.u.15728640: get_axlf_section_hdr: could not find section header 20
[ +0,000003] rom.u rom.u.0: verify_timestamp: Shell timestamp: 0x5da8da6e
[ +0,000001] rom.u rom.u.0: verify_timestamp: Verify timestamp: 0x5da8da6e
[ +0,000001] icap.u icap.u.15728640: icap_download_bitstream_axlf: incoming xclbin: 9ee667a5-99f1-4abc-a641-258a27979bc6
on device xclbin: 00000000-0000-0000-0000-000000000000
[ +0,000092] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 7 via HW
[ +0,000106] mailbox.m mailbox.m.13631488: process_request: received request from peer: 7, passed on
[ +0,009438] icap.m icap.m.15728640: get_axlf_section_hdr: could not find section header 20
[ +0,000005] rom.m rom.m.0: verify_timestamp: Shell timestamp: 0x5da8da6e
[ +0,000000] rom.m rom.m.0: verify_timestamp: Verify timestamp: 0x5da8da6e
[ +0,002816] icap.m icap.m.15728640: icap_download_bitstream_axlf: incoming xclbin: 9ee667a5-99f1-4abc-a641-258a27979bc6
on device xclbin: 00000000-0000-0000-0000-000000000000
[ +0,000004] icap.m icap.m.15728640: get_axlf_section_hdr: section 11 offset: 51291136, size: 410
[ +0,000120] icap.m icap.m.15728640: axlf_set_freqscaling: set 4 freq, data: 260, kernel: 500, sys: 450, sys1: 0
[ +0,000008] icap.m icap.m.15728640: icap_freeze_axi_gate: freezing CL AXI gate
[ +0,000027] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 0, Current 300 Mhz, New 260 Mhz
[ +0,010922] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz waiting for locked signal
[ +0,099457] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(0) 0xd01
[ +0,000003] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(2) 0x5
[ +0,000007] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 1, Current 500 Mhz, New 500 Mhz
[ +0,000007] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 2, Current 450 Mhz, New 450 Mhz
[ +0,000001] icap.m icap.m.15728640: icap_free_axi_gate: freeing CL AXI gate
[ +0,000016] icap.m icap.m.15728640: get_axlf_section_hdr: section 0 offset: 1400, size: 51284947
[ +0,000001] icap.m icap.m.15728640: icap_download_hw: downloading bitstream, length: 51284947
[ +0,000001] icap.m icap.m.15728640: icap_freeze_axi_gate: freezing CL AXI gate
[ +0,000019] icap.m icap.m.15728640: icap_download_clear_bitstream: downloading clear bitstream of length 0x0
[ +0,000021] icap.m icap.m.15728640: bitstream_parse_header: Design "pfm_top_wrapper;PARTIAL=TRUE;COMPRESS=TRUE;UserID=0XFFFFFFFF;Version=2019.2"
[ +0,000002] icap.m icap.m.15728640: bitstream_parse_header: Part "xcu280-fsvh2892-2L-e"
[ +0,000002] icap.m icap.m.15728640: bitstream_parse_header: Timestamp "20:34:31 2019/12/02"
[ +0,000002] icap.m icap.m.15728640: bitstream_parse_header: Raw data size 0x30e8b40
[ +4,788116] icap.m icap.m.15728640: wait_for_done: XHWICAP_SR: 5
[ +0,000010] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 0, Current 260 Mhz, New 260 Mhz
[ +0,010929] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz waiting for locked signal
[ +0,099503] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(0) 0xd01
[ +0,000004] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(2) 0x5
[ +0,000008] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 1, Current 500 Mhz, New 500 Mhz
[ +0,010928] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz waiting for locked signal
[ +0,099516] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(0) 0xa01
[ +0,000004] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(2) 0x2
[ +0,000008] icap.m icap.m.15728640: icap_ocl_freqscaling: Clock 2, Current 450 Mhz, New 450 Mhz
[ +0,010929] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz waiting for locked signal
[ +0,099503] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(0) 0x901
[ +0,000004] icap.m icap.m.15728640: icap_ocl_freqscaling: clockwiz CONFIG(2) 0x2
[ +0,000004] icap.m icap.m.15728640: icap_free_axi_gate: freeing CL AXI gate
[ +0,000023] icap.m icap.m.15728640: get_axlf_section_hdr: could not find section header 1
[ +1,532209] icap.m icap.m.15728640: get_axlf_section_hdr: section 6 offset: 51286352, size: 1608
[ +0,000014] icap.m icap.m.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 6, err: 0
[ +0,000003] icap.m icap.m.15728640: get_axlf_section_hdr: section 8 offset: 51287960, size: 3048
[ +0,000003] icap.m icap.m.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 8, err: 0
[ +0,000008] icap.m icap.m.15728640: get_axlf_section_hdr: section 8 offset: 51287960, size: 3048
[ +0,000056] icap.m icap.m.15728640: get_axlf_section_hdr: section 6 offset: 51286352, size: 1608
[ +0,000006] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: DDR[0]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: DDR[1]
[ +0,000004] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000126] mig.m mig.m.10485760: mig_probe: MIG name: HBM[0], IO start: 0xd2905800, end: 0xd29058ff mig->type 1
[ +0,000048] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485760
[ +0,000003] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485760 is active
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000065] mig.m mig.m.10485761: mig_probe: MIG name: HBM[1], IO start: 0xd2905800, end: 0xd29058ff mig->type 2
[ +0,000029] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485761
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485761 is active
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000054] mig.m mig.m.10485762: mig_probe: MIG name: HBM[2], IO start: 0xd2985800, end: 0xd29858ff mig->type 1
[ +0,000027] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485762
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485762 is active
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000050] mig.m mig.m.10485763: mig_probe: MIG name: HBM[3], IO start: 0xd2985800, end: 0xd29858ff mig->type 2
[ +0,000026] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485763
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485763 is active
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000047] mig.m mig.m.10485764: mig_probe: MIG name: HBM[4], IO start: 0xd2925800, end: 0xd29258ff mig->type 1
[ +0,000028] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485764
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485764 is active
[ +0,000003] xclmgmt 0000:01:00.0: __xocl_subdev_create: creating subdev mig.m
[ +0,000047] mig.m mig.m.10485765: mig_probe: MIG name: HBM[5], IO start: 0xd2925800, end: 0xd29258ff mig->type 2
[ +0,000027] xclmgmt 0000:01:00.0: __xocl_subdev_create: Created subdev mig inst 10485765
[ +0,000002] xclmgmt 0000:01:00.0: __xocl_subdev_create: subdev mig.m inst 10485765 is active
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[6]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[7]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[8]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[9]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[10]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[11]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[12]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[13]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[14]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[15]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[16]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[17]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[18]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[19]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[20]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[21]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[22]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[23]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[24]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[25]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[26]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[27]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[28]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[29]
[ +0,000003] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[30]
[ +0,000002] icap.m icap.m.15728640: icap_create_subdev: ignore ECC controller for: HBM[31]
[ +0,000006] icap.m icap.m.15728640: icap_download_bitstream_axlf: icap_download_bitstream_axlf err: 0
[ +0,001635] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 7 via HW
[ +0,003075] icap.u icap.u.15728640: get_axlf_section_hdr: section 8 offset: 51287960, size: 3048
[ +0,000064] icap.u icap.u.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 8, err: 0
[ +0,000002] icap.u icap.u.15728640: get_axlf_section_hdr: section 6 offset: 51286352, size: 1608
[ +0,000004] icap.u icap.u.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 6, err: 0
[ +0,000003] icap.u icap.u.15728640: get_axlf_section_hdr: section 7 offset: 51291008, size: 124
[ +0,000004] icap.u icap.u.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 7, err: 0
[ +0,000002] icap.u icap.u.15728640: get_axlf_section_hdr: section 9 offset: 816, size: 584
[ +0,000004] icap.u icap.u.15728640: icap_parse_bitstream_axlf_section: icap_parse_bitstream_axlf_section kind 9, err: 0
[ +0,000002] icap.u icap.u.15728640: get_axlf_section_hdr: section 11 offset: 51291136, size: 410
[ +0,000008] icap.u icap.u.15728640: get_axlf_section_hdr: section 8 offset: 51287960, size: 3048
[ +0,000004] icap.u icap.u.15728640: get_axlf_section_hdr: section 6 offset: 51286352, size: 1608
[ +0,000006] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: DDR[0]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: DDR[1]
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000104] mig.u mig.u.10485760: ecc_reset: Unable to reset from userpf
[ +0,000023] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485760
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485760 is active
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000049] mig.u mig.u.10485761: ecc_reset: Unable to reset from userpf
[ +0,000017] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485761
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485761 is active
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000045] mig.u mig.u.10485762: ecc_reset: Unable to reset from userpf
[ +0,000017] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485762
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485762 is active
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000048] mig.u mig.u.10485763: ecc_reset: Unable to reset from userpf
[ +0,000015] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485763
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485763 is active
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000046] mig.u mig.u.10485764: ecc_reset: Unable to reset from userpf
[ +0,000015] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485764
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485764 is active
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_create: creating subdev mig.u
[ +0,000043] mig.u mig.u.10485765: ecc_reset: Unable to reset from userpf
[ +0,000015] xocl 0000:01:00.1: __xocl_subdev_create: Created subdev mig inst 10485765
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_create: subdev mig.u inst 10485765 is active
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[6]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[7]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[8]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[9]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[10]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[11]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[12]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[13]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[14]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[15]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[16]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[17]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[18]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[19]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[20]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[21]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[22]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[23]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[24]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[25]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[26]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[27]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[28]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[29]
[ +0,000003] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[30]
[ +0,000002] icap.u icap.u.15728640: icap_create_subdev: ignore ECC controller for: HBM[31]
[ +0,000005] icap.u icap.u.15728640: icap_download_bitstream_axlf: icap_download_bitstream_axlf err: 0
[ +0,000003] icap.u icap.u.15728640: icap_read_from_peer: reading from peer
[ +0,000005] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 10 via HW
[ +0,000302] mailbox.m mailbox.m.13631488: process_request: received request from peer: 10, passed on
[ +0,000004] xclmgmt 0000:01:00.0: xclmgmt_read_subdev_req: req kind 1
[ +0,007031] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 10 via HW
[ +0,000313] xocl 0000:01:00.1: xocl_init_mem: Topology count = 40, data_length = 1600
[ +0,000009] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[0]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[1]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[2]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x20000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[3]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x30000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[4]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x40000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[5]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x50000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:1
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[6]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[7]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[8]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[9]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[10]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[11]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[12]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[13]
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[14]
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[15]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000002] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[16]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[17]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[18]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[19]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[20]
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[21]
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[22]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[23]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[24]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[25]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[26]
[ +0,000010] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[27]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[28]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[29]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[30]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: HBM[31]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: DDR[0]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x4000000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x400000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: DDR[1]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x8000000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x400000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[0]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x200000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[1]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x200400000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[2]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x200800000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[3]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x200c00000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[4]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x201000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Memory Bank: PLRAM[5]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Base Address:0x201400000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Size:0x20000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Type:2
[ +0,000000] xocl 0000:01:00.1: xocl_init_mem: Used:0
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[0]
[ +0,000002] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x0, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000071] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[1]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x10000000, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000004] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[2]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x20000000, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000012] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[3]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x30000000, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000004] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[4]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x40000000, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000004] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Allocating Memory Bank: HBM[5]
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: base_addr:0x50000000, total size:0x10000000
[ +0,000001] xocl 0000:01:00.1: xocl_init_mem: Found a new memory region
[ +0,000004] xocl 0000:01:00.1: xocl_init_mem: drm_mm_init called
[ +0,000003] xocl 0000:01:00.1: xocl_read_axlf_helper: Loaded xclbin 9ee667a5-99f1-4abc-a641-258a27979bc6
[ +0,000859] icap.u icap.u.15728640: icap_lock_bitstream: bitstream 9ee667a5-99f1-4abc-a641-258a27979bc6 locked, ref=1
[ +0,000002] xocl 0000:01:00.1: exec_reset: exec_reset(0) cfg(0)
[ +0,000000] xocl 0000:01:00.1: exec_reset: exec_reset resets
[ +0,000001] xocl 0000:01:00.1: exec_reset: exec->xclbin(00000000-0000-0000-0000-000000000000),xclbin(9ee667a5-99f1-4abc-a641-258a27979bc6)
[ +0,000003] xocl_mb_sche mb_scheduler.u.4194304: client_ioctl_ctx: CTX add(9ee667a5-99f1-4abc-a641-258a27979bc6, pid 3094, cu_idx 0xffffffff) = 0, ctx=1
[ +0,000015] xocl 0000:01:00.1: exec_cfg_cmd: ert per feature rom = 1
[ +0,000001] xocl 0000:01:00.1: exec_cfg_cmd: dsa52 = 1
[ +0,000003] xocl 0000:01:00.1: cu_reset: configured cu(0) base@0x1800000 poll@0x (null) control(0) ctx(0)
[ +0,000002] xocl 0000:01:00.1: cu_reset: configured cu(1) base@0x1810000 poll@0x (null) control(0) ctx(0)
[ +0,000001] xocl 0000:01:00.1: cu_reset: configured cu(2) base@0x1820000 poll@0x (null) control(0) ctx(0)
[ +0,000002] xocl 0000:01:00.1: cu_reset: configured cu(3) base@0x1830000 poll@0x (null) control(0) ctx(0)
[ +0,000001] xocl 0000:01:00.1: exec_cfg_cmd: configuring embedded scheduler mode
[ +0,000002] xocl 0000:01:00.1: exec_cfg_cmd: scheduler config ert(1), dataflow(0), slots(16), cudma(1), cuisr(0), cdma(0), cus(4)
[ +0,000684] icap.u icap.u.15728640: icap_unlock_bitstream: bitstream 9ee667a5-99f1-4abc-a641-258a27979bc6 unlocked, ref=0
[ +0,000003] xocl 0000:01:00.1: exec_stop: exec_stop(0000000087f54aa8)
[ +0,000008] xocl_mb_sche mb_scheduler.u.4194304: client_ioctl_ctx: CTX del(9ee667a5-99f1-4abc-a641-258a27979bc6, pid 3094, cu_idx 0xffffffff) = 0, ctx=0
[ +0,005085] xmc.u xmc.u.11534336: xmc_read_from_peer: reading from peer
[ +0,000008] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 10 via HW
[ +0,000169] mailbox.m mailbox.m.13631488: process_request: received request from peer: 10, passed on
[ +0,000002] xclmgmt 0000:01:00.0: xclmgmt_read_subdev_req: req kind 0
[ +0,000074] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 10 via HW
[ +0,002783] icap.u icap.u.15728640: icap_lock_bitstream: bitstream 9ee667a5-99f1-4abc-a641-258a27979bc6 locked, ref=1
[ +0,000002] xocl 0000:01:00.1: exec_reset: exec_reset(0) cfg(1)
[ +0,000002] xocl_mb_sche mb_scheduler.u.4194304: client_ioctl_ctx: CTX add(9ee667a5-99f1-4abc-a641-258a27979bc6, pid 3094, cu_idx 0xffffffff) = 0, ctx=1
[ +0,004794] xocl 0000:01:00.1: _xocl_drvinst_open: OPEN 2
[ +0,000004] [drm] creating scheduler client for pid(3094), ret: 0
[ +0,000071] xocl 0000:01:00.1: xocl_native_mmap: successful native mmap @0x0 with size 0x2000000
[ +0,006276] xocl_mb_sche mb_scheduler.u.4194304: client_ioctl_ctx: CTX add(9ee667a5-99f1-4abc-a641-258a27979bc6, pid 3094, cu_idx 0x0) = 0, ctx=2
[ +3,644978] firewall.m firewall.m.9437184: check_firewall: AXI Firewall 3 tripped, status: 0x100000
[ +0,000005] xclmgmt 0000:01:00.0: health_check_cb: firewall tripped, notify peer
[ +0,000233] mailbox.u mailbox.u.13631488: process_request: received request from peer: 6, passed on
[ +0,000004] xocl 0000:01:00.1: xocl_mailbox_srv: received request (6) from peer
[ +0,000002] xocl 0000:01:00.1: xocl_mailbox_srv: firewall tripped, request reset
[ +2,015706] xocl 0000:01:00.1: xocl_hot_reset: resetting device...
[ +0,000008] xocl 0000:01:00.1: xocl_drvinst_kill_proc: kill 3094
[ +3,103908] firewall.m firewall.m.9437184: check_firewall: AXI Firewall 3 tripped, status: 0x100000
[ +0,000004] xclmgmt 0000:01:00.0: health_check_cb: firewall tripped, notify peer
[ +0,000152] mailbox.u mailbox.u.13631488: process_request: received request from peer: 6, passed on
[ +0,000003] xocl 0000:01:00.1: xocl_mailbox_srv: received request (6) from peer
[ +0,000001] xocl 0000:01:00.1: xocl_mailbox_srv: firewall tripped, request reset
[ +1,279834] xocl:xdma_xfer_submit: xfer 0x0000000051836b97,8192, s 0x1 timed out, ep 0x0.
[ +0,000005] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000988c73e7) = 0x1fc00006 (id).
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000af4411c5) = 0x00000001 (status).
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x0000000033f154b4) = 0x00f83e1f (control)
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000daee2aac) = 0xf7c90000 (first_desc_lo)
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x000000003774b65b) = 0x00000000 (first_desc_hi)
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000e209ba90) = 0x00000001 (first_desc_adjacent).
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x000000000d32aa03) = 0x00000000 (completed_desc_count).
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000933d6b03) = 0x00f83e1e (interrupt_enable_mask)
[ +0,000002] xocl:engine_status_dump: SG engine 0-H2C0-MM status: 0x00000001: BUSY
[ +0,000002] xocl:transfer_abort: abort transfer 0x0000000051836b97, desc 2, engine desc queued 0.
[ +0,000019] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: DMA failed, Dumping SG Page Table
[ +0,000005] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 0, 0xf133ef000
[ +0,000003] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 1, 0xf74fc2000
[ +0,225749] [drm] client exits pid(3094)
[ +0,000007] icap.u icap.u.15728640: icap_read_from_peer: reading from peer
[ +0,000056] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 10 via HW
[ +0,000620] mailbox.m mailbox.m.13631488: process_request: received request from peer: 10, passed on
[ +0,000006] xclmgmt 0000:01:00.0: xclmgmt_read_subdev_req: req kind 1
[ +0,006014] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 10 via HW
[ +0,000245] xocl 0000:01:00.1: destroy_client: CTX reclaim (9ee667a5-99f1-4abc-a641-258a27979bc6, 3094, 0)
[ +0,000003] icap.u icap.u.15728640: icap_unlock_bitstream: bitstream 9ee667a5-99f1-4abc-a641-258a27979bc6 unlocked, ref=0
[ +0,000004] xocl 0000:01:00.1: exec_stop: exec_stop(0000000087f54aa8)
[ +0,000010] xocl 0000:01:00.1: xocl_drvinst_close: CLOSE 3
[ +0,000038] [drm] client exits pid(3094)
[ +0,000003] xocl 0000:01:00.1: xocl_drvinst_close: CLOSE 2
[ +0,000002] xocl 0000:01:00.1: xocl_drvinst_close: NOTIFY 00000000500b72ce
[ +0,000068] xocl 0000:01:00.1: xocl_drvinst_kill_proc: return 0
[ +0,000004] xocl 0000:01:00.1: xocl_reset_notify: PCI reset NOTIFY, prepare 1
[ +0,000005] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 0
[ +0,000011] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 1
[ +0,000006] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 2
[ +0,000006] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 3
[ +0,000005] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 4
[ +0,000004] xocl 0000:01:00.1: xocl_cleanup_mem: Taking down DDR : 5
[ +0,000023] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000094] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000038] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000033] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000034] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000032] xocl 0000:01:00.1: __xocl_subdev_destroy: Destroy subdev mig, cdev (null)
[ +0,000050] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev icap, cdev (null)
[ +0,000002] icap.u icap.u.15728640: xocl_drvinst_kill_proc: return 0
[ +0,000024] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev mailbox, cdev 00000000f9608937
[ +0,000087] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev xmc, cdev (null)
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_offline: release driver xmc
[ +0,000221] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev firewall, cdev (null)
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_offline: release driver firewall
[ +0,000043] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev xvc_pub, cdev 00000000f7d35038
[ +0,004258] xocl 0000:01:00.1: __xocl_subdev_offline: release driver xvc_pub
[ +0,000060] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev mb_scheduler, cdev (null)
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_offline: release driver mb_scheduler
[ +0,000014] [drm] /var/lib/dkms/xrt/2.3.1301/build/driver/xocl/userpf/../subdev/mb_scheduler.c:3113 scheduler thread exits with value 0
[ +0,000067] [drm] command scheduler removed
[ +0,000032] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev dma.xdma, cdev (null)
[ +0,000002] xocl 0000:01:00.1: __xocl_subdev_offline: release driver dma.xdma
[ +0,018046] xocl 0000:01:00.1: __xocl_subdev_offline: offline subdev rom, cdev (null)
[ +0,000003] xocl 0000:01:00.1: __xocl_subdev_offline: release driver rom
[ +0,000005] rom.u rom.u.0: feature_rom_remove: Remove feature rom
[ +0,000038] xocl 0000:01:00.1: __xocl_subdev_online: online subdev mailbox, cdev (null)
[ +0,000002] mailbox.u mailbox.u.13631488: mailbox_enable_intr_mode: failed to add intr handler
[ +0,000248] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 11 via HW
[ +0,000092] mailbox.m mailbox.m.13631488: process_request: received request from peer: 11, passed on
[ +0,000006] mailbox.m mailbox.m.13631488: mailbox_post_response: posting response for: 11 via HW
[ +0,911719] xocl 0000:01:00.1: xocl_mb_connect: ch_state 0x3, ret 0
[ +0,000005] mailbox.u mailbox.u.13631488: mailbox_request: sending request: 5 via HW
[ +0,000154] mailbox.m mailbox.m.13631488: process_request: received request from peer: 5, passed on
[ +0,000007] xclmgmt 0000:01:00.0: reset_hot_ioctl: Trying to reset card 256 in slot PCI Bus 0000:01:00:0
[ +2,671922] firewall.m firewall.m.9437184: check_firewall: AXI Firewall 3 tripped, status: 0x100000
[ +0,000005] xclmgmt 0000:01:00.0: health_check_cb: firewall tripped, notify peer
[ +0,000032] xclmgmt 0000:01:00.0: xocl_thread: xclmgmt health thread exit.
[ +0,000053] xclmgmt 0000:01:00.0: xocl_thread_stop: xclmgmt health thread stop ret = 0
[ +0,000004] xclmgmt 0000:01:00.0: xocl_thread_stop: xclmgmt health thread has terminated
[ +0,000005] xmc.m xmc.m.11534336: stop_xmc: Stop Microblaze...
[ +0,000005] xmc.m xmc.m.11534336: stop_xmc_nolock: MB Reset GPIO 0x1
[ +0,000007] xmc.m xmc.m.11534336: stop_xmc_nolock: XMC info, version 0x1ecf81, status 0x30019001, id 0x74736574
[ +0,000004] xmc.m xmc.m.11534336: stop_xmc_nolock: Stopping XMC...
[ +0,000006] xmc.m xmc.m.11534336: stop_xmc_nolock: Stopping scheduler...
[ +0,023789] mailbox.u mailbox.u.13631488: process_request: received request from peer: 6, passed on
[ +0,000005] xocl 0000:01:00.1: xocl_mailbox_srv: received request (6) from peer
[ +0,000002] xocl 0000:01:00.1: xocl_mailbox_srv: firewall tripped, request reset
[ +0,083911] xmc.m xmc.m.11534336: stop_xmc_nolock: XMC/sched Stopped, retry 2
[ +0,000009] xmc.m xmc.m.11534336: stop_xmc_nolock: XMC info, version 0x1ecf81, status 0x30039003, id 0x74736574
[ +0,000005] icap.m icap.m.15728640: icap_freeze_axi_gate: freezing CL AXI gate
[ +0,531785] icap.m icap.m.15728640: icap_free_axi_gate: freeing CL AXI gate
[ +0,511945] xclmgmt 0000:01:00.0: __xocl_subdev_offline: offline subdev icap, cdev 000000006fd48eff
[ +0,000145] icap.m icap.m.15728640: xocl_drvinst_kill_proc: return 0
[ +0,000026] xclmgmt 0000:01:00.0: __xocl_subdev_offline: offline subdev mailbox, cdev 00000000d2fed7f4
[ +0,000058] mgmt_msix dma_msix.m.3145728: user_intr_config: configure intr at 0xffffa4470c780000
[ +0,000031] mgmt_msix dma_msix.m.3145728: user_intr_unreg: intr 11 unreg success, start vec 4
[ +0,000003] xclmgmt 0000:01:00.0: xclmgmt_reset_pci: Reset PCI

 

benedetto73
Adventurer
1,299 Views
Registered: ‎09-30-2019

Thanks @mcertosi.

With the health check turned off, Linux no longer hangs.

Answers to your questions:

Q: What shell, XRT, and Xilinx tool version? 

A: Shell: xilinx_u280_xdma_201920_1
Tools: Xilinx Vitis IDE v2019.2 (64-bit), SW Build 2708876 on Wed Nov 6 21:40:25 MST 2019

Q: Can you insert ILAs into the kernel? 
A: Will do.

Q: Can you insert profiling logic into the kernel?
A: How do I do that?

mcertosi
Xilinx Employee
1,281 Views
Registered: ‎10-19-2015

Hi @benedetto73 

I'm glad that turning off the health check stopped the crash. The card will still be unusable in this state, but we can collect additional debugging information. Can you run xbutil query and attach the output?

You can add ILAs with the --dk switch of v++. First run --dk list_ports to see the interfaces v++ can add ILAs to, then use the format below to add ILAs and lightweight protocol checkers to an interface.

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug1393-vitis-application-acceleration.pdf

System ILAs can be inserted into the design using the v++ --dk option as shown below:
$ v++ --dk chipscope:<compute_unit_name>:<interface_name>

Regards,

M

-------------------------------------------------------------------------
Don’t forget to reply, kudo, and accept as solution.
-------------------------------------------------------------------------
benedetto73
Adventurer
1,232 Views
Registered: ‎09-30-2019

Hi @mcertosi ,
I believe I added chipscope by checking the appropriate flags in the UI.

After rebuilding, the program exhibits the same behavior.
The dmesg logs seem to indicate problems transferring data to the Alveo.

An excerpt from the dmesg log is below; the full log is attached.

[ +0,000001] xocl:transfer_abort: abort transfer 0x0000000075a0e46e, desc 4, engine desc queued 0.
[ +0,000019] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: DMA failed, Dumping SG Page Table
[ +0,000006] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 0, 0xbf8941000
[ +0,000003] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 1, 0xa67532000
[ +0,000003] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 2, 0xb6e21f000
[ +0,000003] dma.xdma.u dma.xdma.u.3145728: xdma_migrate_bo: 3, 0x974ddd000
[ +1,645323] usb 3-4: USB disconnect, device number 5
[ +8,590549] xocl:xdma_xfer_submit: xfer 0x0000000075a0e46e,16384, s 0x1 timed out, ep 0x4000.
[ +0,000006] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x000000006f9a9218) = 0x1fc00006 (id).
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x000000002d6e9398) = 0x00000001 (status).
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x0000000042ac232f) = 0x00f83e1e (control)
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x000000008f3462d6) = 0xf9960000 (first_desc_lo)
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000e264ede8) = 0x00000000 (first_desc_hi)
[ +0,000003] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000581f1d09) = 0x00000001 (first_desc_adjacent).
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x00000000d693cf87) = 0x00000000 (completed_desc_count).
[ +0,000002] xocl:engine_reg_dump: 0-H2C0-MM: ioread32(0x0000000032e500b5) = 0x00f83e1e (interrupt_enable_mask)
[ +0,000003] xocl:engine_status_dump: SG engine 0-H2C0-MM status: 0x00000001: BUSY
[ +0,000002] xocl:transfer_abort: abort transfer 0x0000000075a0e46e, desc 4, engine desc queued 0.

 

benedetto73
Adventurer
1,227 Views
Registered: ‎09-30-2019

Hi @mcertosi, two questions:

1. Once I have added the ILA/ChipScope, how do I use it to debug?
2. xbutil validate returns errors. The log is below.

 

betto@mfgs2:~$ sudo /opt/xilinx/xrt/bin/xbutil validate -d 0000:01:00.0
INFO: Found 1 cards

INFO: Validating card[0]: xilinx_u280_xdma_201920_1
INFO: == Starting AUX power connector check:
AUX power connector not available. Skipping validation
INFO: == AUX power connector check SKIPPED
INFO: == Starting PCIE link check:
LINK ACTIVE, ATTENTION
Ensure Card is plugged in to Gen3x16, instead of Gen3x8
Lower performance may be experienced
WARN: == PCIE link check PASSED with warning
INFO: == Starting verify kernel test:
..Host buffer alignment 4096 bytes
Compiled kernel = /opt/xilinx/xsa/xilinx_u280_xdma_201920_1/test/verify.xclbin
Shell = xilinx_u280_xdma_201920_1
Index = 0
PCIe = GEN3 x 8
OCL Frequency = 300 MHz
DDR Bank = 2
Device Temp = 64 C
MIG Calibration = True
Finished downloading bitstream /opt/xilinx/xsa/xilinx_u280_xdma_201920_1/test/verify.xclbin
CU[0] hello:hello_1 @0x1800000
[0] HBM[0] @0x0
Error: Unable to sync BO

ERROR: == verify kernel test FAILED
INFO: Card[0] failed to validate.

ERROR: Some cards failed to validate.

 

benedetto73
Adventurer
1,220 Views
Registered: ‎09-30-2019

@mcertosi 

Scratch the validate issue; after a reboot it seems fine:

betto@mfgs2:~$ sudo /opt/xilinx/xrt/bin/xbutil validate
INFO: Found 1 cards

INFO: Validating card[0]: xilinx_u280_xdma_201920_1
INFO: == Starting AUX power connector check:
AUX POWER NOT CONNECTED, ATTENTION
Board not stable for heavy acceleration tasks.
WARN: == AUX power connector check PASSED with warning
INFO: == Starting PCIE link check:
LINK ACTIVE, ATTENTION
Ensure Card is plugged in to Gen3x16, instead of Gen3x8
Lower performance may be experienced
WARN: == PCIE link check PASSED with warning
INFO: == Starting verify kernel test:
INFO: == verify kernel test PASSED
INFO: == Starting DMA test:
Buffer Size: 256 MB
Host -> PCIe -> FPGA write bandwidth = 5407.35 MB/s
Host <- PCIe <- FPGA read bandwidth = 5791.2 MB/s
INFO: == DMA test PASSED
INFO: == Starting device memory bandwidth test:
...........
Maximum throughput: 43690 MB/s
INFO: == device memory bandwidth test PASSED
INFO: == Starting PCIE peer-to-peer test:
P2P BAR is not enabled. Skipping validation
INFO: == PCIE peer-to-peer test SKIPPED
INFO: == Starting memory-to-memory DMA test:
M2M is not available. Skipping validation
INFO: == memory-to-memory DMA test SKIPPED
INFO: Card[0] validated with warnings.

INFO: All cards validated successfully but with warnings.

mcertosi
Xilinx Employee
1,186 Views
Registered: ‎10-19-2015

Hi @benedetto73 

Xbutil validate could not complete because the board was already in an error state. This is expected. 

Rebooting the computer clears the error, then validate works. This is also expected. 

Please use xbutil query to determine the AXI firewall trip message for the error state.

 

Once the ILAs are added to the design, you can debug using the Vivado Hardware Manager, the Xilinx Virtual Cable (XVC) over PCIe, and the Vivado Hardware Server. Have you added the ILAs? There are a lot of steps involved, so I don't want to clutter this post with them until you are ready to use them.

When your kernel crashes, could you make sure you handle the error such that the OpenCL objects are released? This allows another level of debug and kernel profiling. Otherwise, when we enable kernel logging, you will see this message:

"Please ensure all OpenCL objects are released by your host code (e.g., clReleaseProgram())."

Regards,

M

martin31821
Visitor
844 Views
Registered: ‎01-08-2020

Hello @mcertosi 

I'm having exactly the same problem. I'm getting:

[16808.633149] xocl:engine_status_dump: SG engine 0-H2C0-MM status: 0x00000001: BUSY
[16808.633151] xocl:transfer_abort: abort transfer 0x000000007f31d62f, desc 1, engine desc queued 0.
[16808.633154] xocl:engine_status_dump: SG engine 0-H2C1-MM status: 0x00000001: BUSY
[16808.633157] xocl:transfer_abort: abort transfer 0x0000000034d1dc04, desc 699, engine desc queued 0.
[16808.633169] dma.xdma.u dma.xdma.u.2097152: dev ffff91600e6ff410, xdma_migrate_bo: DMA failed, Dumping SG Page Table
[16808.633175] dma.xdma.u dma.xdma.u.2097152: dev ffff91600e6ff410, xdma_migrate_bo: 0, 0xd1da51000
[16808.633194] dma.xdma.u dma.xdma.u.2097152: dev ffff91600e6ff410, xdma_migrate_bo: DMA failed, Dumping SG Page Table
[16808.633199] dma.xdma.u dma.xdma.u.2097152: dev ffff91600e6ff410, xdma_migrate_bo: 0, 0xd14c9e000

My kernel uses HBM0 to HBM5, with a bank size of 256 MB each. My OpenCL memory objects are created in the correct memory locations (checked with xbutil query, which shows the correct memory object sizes and counts on each bank).
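
As a quick sanity check on the bank layout, here is a minimal sketch of the address arithmetic. It assumes the 0x10000000-byte bank size and the back-to-back base addresses that xocl_init_mem printed for HBM[0] to HBM[5] in the first dmesg log of this thread (that log was for the 201920_1 shell, so other shells may differ). The helper names are hypothetical.

```cpp
#include <cstdint>

// Bank size as printed by xocl_init_mem ("total size:0x10000000").
constexpr std::uint64_t kHbmBankSize = 0x10000000;

// 0x10000000 bytes is exactly 256 MiB.
static_assert(kHbmBankSize / (1024 * 1024) == 256, "bank size is 256 MiB");

// Base address of an HBM bank, assuming banks are laid out back to back
// starting at device address 0 (as in the dmesg log).
constexpr std::uint64_t hbm_bank_base(unsigned bank) {
    return static_cast<std::uint64_t>(bank) * kHbmBankSize;
}

// Index of the HBM bank that contains a given device address,
// under the same layout assumption.
constexpr unsigned hbm_bank_of(std::uint64_t addr) {
    return static_cast<unsigned>(addr / kHbmBankSize);
}
```

For example, hbm_bank_base(5) evaluates to 0x50000000, which matches the base_addr logged for HBM[5].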

My AXI firewall reliably trips with code 0x100000 (ERRS_BRESP) at level 3, and I get the kernel log above.

The kernel has two AXI interfaces, M_AXI_GMEM and S_AXI_CONTROL. For both of them I have an ILA integrated into my design, and I have a Vivado instance ready which is able to debug the Alveo card.


How should I set up triggers to see what happens underneath?

EDIT: I'm using the xilinx_u280_xdma_201920_3 platform.

Cheers,
Martin

mcertosi
Xilinx Employee
821 Views
Registered: ‎10-19-2015

Hi @martin31821 

This is likely a different root cause than what this thread was debugging.

ERRS_BRESP = Bit 20: A slave must only give a write response after both the write address and the last write data item are transferred, and the BID, if any, must match an outstanding AWID.

The error is saying that someone is misbehaving. Most likely one kernel was expected to send back data but either didn't or sent back too much data.
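
As a minimal sketch of how the reported status word maps onto that bit: 0x100000 is 1 << 20, so only bit 20 is set. The helpers below are hypothetical and only illustrate the bit arithmetic; the only bit name used is ERRS_BRESP at position 20, as quoted above.

```cpp
#include <cstdint>
#include <vector>

// Bit position of the ERRS_BRESP check, as quoted above.
constexpr int kErrsBrespBit = 20;

// Hypothetical helper: positions of all set bits in a firewall status word,
// lowest bit first.
std::vector<int> set_bits(std::uint32_t status) {
    std::vector<int> bits;
    for (int i = 0; i < 32; ++i) {
        if (status & (std::uint32_t{1} << i)) {
            bits.push_back(i);
        }
    }
    return bits;
}

// True if the ERRS_BRESP bit is set in the status word.
bool has_errs_bresp(std::uint32_t status) {
    return ((status >> kErrsBrespBit) & 1u) != 0;
}
```

Calling set_bits(0x100000) returns a single entry, 20, so the status from the kernel log corresponds to ERRS_BRESP alone.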

If all the kernels are the same we should try to simplify the problem and reproduce the error.

We should also add the lightweight protocol checker to the interfaces of your design and see if there are any protocol violations between the kernels and the HBMs.

Since you already have ILAs instrumented, can you tell me where they are?

This is a difficult error to trigger on. You'd be looking for an uninitiated write: the condition would look like a BVALID without an earlier WVALID. This kind of conditional triggering isn't possible.

Have you run your host code in hardware emulation using Vitis?

Can you remove some of the kernels and replicate the error? 

Regards,

M

martin31821
Visitor
813 Views
Registered: ‎01-08-2020

Hi @mcertosi 

I have exactly one kernel, but the problem seems to be related to my host code, since it hangs even before I call `clEnqueueTask()`, during the `clEnqueueMigrateMemObjects()` after my `clCreateProgramWithBinary()`.

I tried to track it down to a specific number or size of buffer objects, but I didn't manage to find a cause.

The LAPC did not report any error.
Hardware emulation works fine.

My host code looks like this:

fs::path kernel_path = fs::current_path().parent_path();
kernel_path /= "kernel/kernel.xclbin";

auto devices = xcl::get_xil_devices();
auto fileBuf = xcl::read_binary_file(kernel_path);
cl::Program::Binaries bins {{ fileBuf.data(), fileBuf.size() }};
auto device = devices[0];
cl::Context context(device);
cl::CommandQueue queue(context, device, CL_QUEUE_PROFILING_ENABLE);

std::vector<unsigned char, aligned_allocator<unsigned char>> st (4096, 42);
cl::Program program(context, { device }, bins, NULL);
cl::Kernel kernel(program, "search");

cl_mem_ext_ptr_t memExt;
memExt.obj = st.data();
memExt.param = 0;
memExt.flags = XCL_MEM_TOPOLOGY | 0; // HBM0
cl::Buffer st_buffer(context, (cl_mem_flags)(CL_MEM_READ_ONLY | CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR) ,st.size()*sizeof(cl_uchar), &memExt);
queue.enqueueMigrateMemObjects({ st_buffer }, 0);
queue.finish(); // <-- This blocks and leads to AXI timeouts/firewall trips.

mcertosi
Xilinx Employee
752 Views
Registered: ‎10-19-2015

Hi @martin31821 

I think the firewall trip may have happened prior to running the host code you've sent over. 

When the firewall is tripped you need to clear it with either a reboot or using xbutil reset. Do either of those until xbutil query does not report a firewall trip, then run your host program again. 

Regards,

M

EW
Newbie
659 Views
Registered: ‎06-13-2020

I found this post for a different reason, but it might be related. I just installed a new U280 card into a machine which had previously contained a different Alveo card for about half a year. I installed the latest U280 shells (xilinx-u280-xdma-201920.3-2789161.x86_64 and corresponding dev) but did not update the XRT, which had been installed and used (successfully) with the previously installed board.

xbutil validate passes but issues the warning "AUX POWER NOT CONNECTED", which is why I came here. (Obviously I had connected the card properly.) My search yielded this post because the original poster's run shows exactly that warning from xbutil.

My 2 cents:

- If the warning is correct and there is in fact a lack of power, then that can obviously lead to all kinds of problems, the least (but a likely one) of which is a hang.
- After I installed the latest XRT (the one listed to go with that version of the shells), xbutil no longer prints the warning and my card validates as expected.
- The oddity of a warning about such a basic issue (power cable not plugged in...) being caused by a version mismatch between XRT and shell could point to a similar cause for the other two issues mentioned in this post (which are, as far as I can tell, so far unresolved).