awmurray
Visitor
1,076 Views
Registered: ‎11-05-2020

XRT 2019.2 Branch w/U50 Error on Board Tests

I have the XRT 2019.2 branch installed on RHEL 7.9 with a U50 card.

I am trying to run the board tests and I'm getting errors.  I may be trying to run the tests incorrectly or from the wrong directory.

The RPM that I built and installed is from XRT/build/Debug/xrt_201920.2.3.0.7.9-xrt.rpm

Build instructions: https://github.com/Xilinx/XRT/blob/2019.2/README.rst

Test instructions: https://github.com/Xilinx/XRT/blob/2019.2/src/runtime_src/doc/toc/test.rst

This is what I've tried (from the testing link above):

[admin@localhost tests]$ /home/admin/alan/XRT/build/board.sh -board U50 -sync
NO XCLBIN : /home/admin/alan/XRT/tests/xrt
NO XCLBIN : /home/admin/alan/XRT/tests/xrt
NO XCLBIN : /home/admin/alan/XRT/tests/xrt
[… repeated for each test/directory]

 

14 Replies
awmurray
Visitor
1,006 Views
Registered: ‎11-05-2020

Update:

Looking at XRT/build/build.sh, I can see I'm missing a directory with the *.xclbin files in it:

 

From build.sh:

sdx=/proj/xbuilds/${rel}_daily_latest/installs/lin64/Scout/${rel}
if [[ $rel < 2019.2 ]]; then
   sdx=/proj/xbuilds/${rel}_daily_latest/installs/lin64/SDx/${rel}
fi


I'm completely missing a /proj directory.  Where is that supposed to come from??
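
The NO XCLBIN lines suggest the test script found no compiled *.xclbin binaries where it looked. A minimal sketch for confirming that from the shell, assuming the messages mean what they appear to mean; `count_xclbin` is a hypothetical helper, not part of XRT:

```shell
# Hypothetical helper (not part of XRT): count *.xclbin files under a directory tree.
# Prints 0 when the directory is missing or contains no xclbin files.
count_xclbin() {
    find "${1:-.}" -type f -name '*.xclbin' 2>/dev/null | wc -l | tr -d ' '
}
```

For example, `count_xclbin /home/admin/alan/XRT/tests` would print the number of xclbin files under the test directory from the session above.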

 

 

emeryw
Xilinx Employee
983 Views
Registered: ‎12-06-2019

Hi @awmurray ,

The test instructions describe the build and test processes used internally by our development team.

As your OS isn't in the list of supported OSes, I think building XRT for your OS was the right way to go. From there, you can follow the steps in UG1370, from installing the shell for your U50 through validating the card: https://www.xilinx.com/support/documentation/boards_and_kits/accelerator-cards/1_6/ug1370-u50-installation.pdf

The validate command will check various settings and conditions on the card and confirm it can download and run a test xclbin.

Out of curiosity, is there a reason you are installing XRT 2019.2 instead of the latest released version? https://www.xilinx.com/products/boards-and-kits/alveo/u50.html#gettingStarted

Best,

-Emery

-------------------------------------------------------------------------

Don’t forget to reply, kudo, and accept as solution.

-------------------------------------------------------------------------

awmurray
Visitor
953 Views
Registered: ‎11-05-2020

I have switched and used the steps you gave in this link:

 https://www.xilinx.com/support/documentation/boards_and_kits/accelerator-cards/1_6/ug1370-u50-installation.pdf

Everything looks good but when I'm trying to flash the firmware it gives "No card found".

[admin@localhost alan]$ sudo /opt/xilinx/xrt/bin/xbmgmt flash --update --shell xilinx_u50_xdma_201920_1
No card is found!

[admin@localhost alan]$ sudo /opt/xilinx/xrt/bin/xbmgmt flash --scan
No card is found!

 

The "lspci -vd 10ee:" shows the device, though.

 

emeryw
Xilinx Employee
942 Views
Registered: ‎12-06-2019

Hi @awmurray ,

Thanks for the update. If xbmgmt can't see the card but lspci can, there may be an issue with XRT and the drivers it uses. Can you please post the output of:

sudo lspci -vd 10ee:

Is this a new card, or has it worked previously in a different configuration?

Per the system requirements of XRT 2019.2 (https://xilinx.github.io/XRT/2019.2/html/system_requirements.html), RHEL 7.9 is not a supported OS. Did you build XRT for your system, or did you acquire a pre-built installation from the product page archive?

Best,

-Emery

Xilinx Support

 

awmurray
Visitor
921 Views
Registered: ‎11-05-2020

I used the prebuilt RPMs from the PDF for this machine (didn't use any custom build/installs).

The lspci is:

awmurray_0-1605209880486.png

 

emeryw
Xilinx Employee
895 Views
Registered: ‎12-06-2019

Hi @awmurray ,

The lspci output looks reasonable, assuming the card is running a golden or recovery image. One thing missing at the bottom that I would expect is the xclmgmt driver shown as loaded, as in this example:

$ sudo lspci -vd 10ee:
05:00.0 Processing accelerators: Xilinx Corporation Device d020
        Subsystem: Xilinx Corporation Device 000e
        Flags: bus master, fast devsel, latency 0, IRQ 84, NUMA node 0
        Memory at 3bffc000000 (64-bit, prefetchable) [size=32M]
        Memory at 3bffe000000 (64-bit, prefetchable) [size=128K]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1c0] #19
        Capabilities: [1f0] Virtual Channel
        Capabilities: [e00] Access Control Services
        Kernel driver in use: xclmgmt
        Kernel modules: xclmgmt


Could you try doing the following:

modprobe xclmgmt

Then see if xbmgmt can see the card? If not, please output dmesg to a log file and attach it so we can see what messages are being issued. 
In addition, could you kindly note what kernel version your RHEL 7.9 system is using?
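
The check described above (does lspci report xclmgmt as the driver in use?) can be scripted. A minimal sketch, assuming lspci prints a `Kernel driver in use:` line as in the example output; `driver_in_use` is a hypothetical helper name:

```shell
# Hypothetical helper: reads `lspci -v` output on stdin and succeeds (exit 0)
# if any device reports the named kernel driver as in use.
driver_in_use() {
    grep -qF "Kernel driver in use: $1"
}

# Intended use on the host (needs the card and root access, so left commented):
#   sudo lspci -vd 10ee: | driver_in_use xclmgmt \
#       && echo "xclmgmt bound" || echo "xclmgmt not bound"
```

The other two items requested can be captured with `dmesg > dmesg.log` (run as root if dmesg is restricted on the system) and `uname -r`.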

You can also get the source from GitHub and compile it for your system. The instructions are here: https://xilinx.github.io/XRT/master/html/build.html and the code is here: https://github.com/Xilinx/XRT

Best,

-Emery

awmurray
Visitor
868 Views
Registered: ‎11-05-2020

I get this:

[admin@localhost alan]$ sudo modprobe xclmgmt

modprobe: FATAL: Module xclmgmt not found.

 

emeryw
Xilinx Employee
852 Views
Registered: ‎12-06-2019

Hi @awmurray ,

Looks like we have an issue with the installation. Let's do the following (and please post or attach the console outputs of each):

1. Uninstall XRT:

sudo yum remove xrt

2. Download XRT and the 201920_3_xdma shell from the U50 product page: https://www.xilinx.com/products/boards-and-kits/alveo/u50.html#gettingStarted

3. Install XRT:

sudo yum install ./xrt_202010.2.7.766_7.4.1708-x86_64-xrt.rpm

4. Unzip, then install the shell for the U50:

Xilinx_u50-gen3x16-xdma-201920.3-2784799_noarch_rpm $ sudo yum install ./*.rpm

5. Flash the shell onto the U50:

sudo /opt/xilinx/xrt/bin/xbmgmt flash --update --shell xilinx_u50_gen3x16_xdma_201920_3

6. Cold boot the system (power it completely off, then start it back up).

At this point, we should now be able to source xrt and validate the card:

source /opt/xilinx/xrt/setup.sh
xbutil validate

 

If you run into errors during any of the steps above, post the output and error you receive, and I'll take a look.

Best,

-Emery

awmurray
Visitor
814 Views
Registered: ‎11-05-2020

I get an error on XRT:

awmurray_0-1605545132035.png

 

awmurray_1-1605545143509.png

 

I thought that kernel header version was correct from the troubleshooting- maybe not now with this new XRT?

Kernel packages:

awmurray_1-1605547194484.png
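
Building a kernel module generally requires headers that match the running kernel exactly. A minimal sketch of that comparison on RHEL, assuming the headers come from the kernel-devel package and that a mismatch is the cause here (the thread does not confirm it); `has_matching_headers` is a hypothetical helper:

```shell
# Hypothetical helper: succeeds if the kernel-devel package for the given
# kernel release appears in the package list read from stdin.
#   $1    - kernel release string, as printed by `uname -r`
#   stdin - installed package names, one per line, as printed by `rpm -q kernel-devel`
has_matching_headers() {
    grep -qxF "kernel-devel-$1"
}

# Intended use on the host (left commented so the block only defines the helper):
#   rpm -q kernel-devel | has_matching_headers "$(uname -r)" \
#       && echo "headers match" || echo "no kernel-devel for $(uname -r)"
```

The match is on the whole line, so a kernel-devel package for a different patch level of the same kernel series does not count as a match.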

 

 

emeryw
Xilinx Employee
790 Views
Registered: ‎12-06-2019

Hi @awmurray ,

Before installing XRT, did you install the dependencies as outlined on page 18 of UG1370 (https://www.xilinx.com/support/documentation/boards_and_kits/accelerator-cards/1_6/ug1370-u50-installation.pdf)?

 

For Redhat:
a. Open a terminal window and enter the following command:
$ sudo yum-config-manager --enable rhel-7-server-optional-rpms
This enables an additional repository on your system.
b. Enter the following command to install EPEL:
$ sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

 

I see the xocl module failed to build. Can you please post the make.log it references?

As this installation has issues, let's do the following.

1. Uninstall XRT

sudo yum remove xrt

2. Acquire and run the xrtdeps.sh script from here: https://github.com/Xilinx/XRT/blob/master/src/runtime_src/tools/scripts/xrtdeps.sh

sudo ./xrtdeps.sh

3. Follow the steps in my previous post starting at Step #3 to try the XRT installation again.

Please keep me posted.

Best,

-Emery

awmurray
Visitor
752 Views
Registered: ‎11-05-2020

I did the steps. I get "no card found" on the flash command.

This is the make.log file:

awmurray_0-1605639404922.png

The only other thing I found in the troubleshooting section is that a USB cable plugged into the card can put it in maintenance mode. I don't have physical access to the machine right now, but I can get it if I need to.

emeryw
Xilinx Employee
736 Views
Registered: ‎12-06-2019

Hi @awmurray ,

Correct. If the card has the debug USB cable plugged in, that can happen. Please check, and ensure the cable is not plugged in.

Do you have the outputs from the steps you tried? If the install output is the same as before, then the xclmgmt and xocl modules did not build and load successfully, so you won't be able to query or flash the card until that is resolved.

I think the next steps are the following:

1. Uninstall XRT

2. Build and install XRT for your OS: https://xilinx.github.io/XRT/2020.1/html/build.html

OR

3. Set up XRT on a host running a supported OS version: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2020_1/ug1451-xrt-release-notes.pdf

 

Best,

-Emery

awmurray
Visitor
703 Views
Registered: ‎11-05-2020

It looks like it isn't installing the firmware as you suggested. I have attached the output of the dependency script (xrtdeps.txt) and of the build (build_sh.txt).

I am on the 2020.1 branch of XRT.

From build.sh.out:

 

-- xrtexec.hpp
CMake Warning at runtime_src/ert/CMakeLists.txt:52 (message):
  ****************************************************************

  No firmware files built or copied, resulting XRT package will be missing
  ERT scheduler firmware.  Use build.sh -ertfw <dir> to specify path to a
  directory with firmware to copy during XRT build.

  

  ****************************************************************

 

 

 

emeryw
Xilinx Employee
633 Views
Registered: ‎12-06-2019

Hi @awmurray ,

Per the instructions for building XRT https://xilinx.github.io/XRT/master/html/build.html:

XRT includes source code for ERT firmware. It needs to be compiled with the MicroBlaze GCC compiler, which is available in Xilinx Vitis™ Software Platform. To generate a complete XRT package, please install Vitis™ Software Platform and setup XILINX_VITIS environment variable. If XILINX_VITIS is not available in the build system, the building and packaging steps for ERT will be skipped. On the deployment system, XRT will try to find the ERT firmware in /lib/firmware/xilinx directory. If it’s not available, errors will be reported


Did you meet the requirements of installing Vitis and setting the environment variable as instructed above?
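
Both conditions in the quoted passage can be checked before rebuilding. A minimal sketch, assuming only what the passage states (the XILINX_VITIS variable and the /lib/firmware/xilinx directory); the function names are hypothetical:

```shell
# Hypothetical helper: reports whether XILINX_VITIS points at an existing directory.
# "available" only means the variable is usable; it does not verify that the
# MicroBlaze GCC compiler is actually present inside that directory.
vitis_env_status() {
    if [ -n "${XILINX_VITIS:-}" ] && [ -d "$XILINX_VITIS" ]; then
        echo "available"
    else
        echo "skipped"   # per the build docs, ERT build and packaging are skipped
    fi
}

# Hypothetical helper: succeeds if the firmware directory exists and is not empty.
# Defaults to the deployment path named in the build documentation.
firmware_present() {
    fw_dir="${1:-/lib/firmware/xilinx}"
    [ -d "$fw_dir" ] && [ -n "$(ls -A "$fw_dir" 2>/dev/null)" ]
}
```

If the firmware was built elsewhere, the CMake warning quoted earlier in the thread names `build.sh -ertfw <dir>` as the way to copy it in during the XRT build.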
The xrtdeps.txt output doesn't appear to show anything glaring, but build_sh.txt does, most notably issues with boost, such as:

 

 

CMakeFiles/core_pcielinux_objects.dir/scan.cpp.o: In function `boost::filesystem::path::path<boost::filesystem::directory_entry>(boost::filesystem::directory_entry const&, boost::enable_if<boost::filesystem::path_traits::is_pathable<boost::decay<boost::filesystem::directory_entry>::type>, void>::type*)':
/usr/include/boost/filesystem/path.hpp:139: undefined reference to `boost::filesystem::path_traits::dispatch(boost::filesystem::directory_entry const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::codecvt<wchar_t, char, __mbstate_t> const&)'
collect2: error: ld returned 1 exit status
make[2]: *** [runtime_src/core/pcie/linux/libxrt_core.so.2.6.0] Error 1
make[1]: *** [runtime_src/core/pcie/linux/CMakeFiles/xrt_core.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

 

 

Also, the file ends abruptly:

 

 

[ 77%] Built target xocl
make: *** [all] Error 2

 

 

Do you have a custom installation of boost on your system, or an environment variable pointing to a different version?

Is this a fresh installation of RHEL 7.9?
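
The undefined reference above involves `std::__cxx11::basic_string`, the string type used by the newer libstdc++ ABI. One possible reading, which is an assumption and not confirmed in the thread, is that the Boost library being linked was built against a different ABI than the XRT objects. A minimal sketch for counting such linker errors in a saved build log; `count_cxx11_undefined` is a hypothetical helper:

```shell
# Hypothetical helper: count "undefined reference" lines in a build log that
# mention __cxx11 symbols. Prints 0 if there are none or the log is unreadable.
count_cxx11_undefined() {
    grep 'undefined reference' "$1" 2>/dev/null | grep -c '__cxx11' || true
}
```

To look for a competing Boost on the host, `env | grep -i boost` lists any related environment variables and `rpm -qa 'boost*'` lists the packaged Boost versions.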

Best,

-Emery
