Observer
361 Views
Registered: ‎11-12-2019

No device found after updating the firmware on the board

 

Hi,

I'm trying to use the U250 board.

After flashing the firmware to the Alveo card and performing a cold reboot, the board no longer shows up in lspci or xbmgmt scan.
(To flash the firmware, I used the command xbmgmt flash --update --shell <shell_name>)

I also tried flashing the board with the MCS file. (https://www.xilinx.com/support/answers/71757.html)

After flashing the board with the MCS file, do I still need to flash the firmware again?

Thanks,

 

8 Replies
Observer
316 Views
Registered: ‎02-12-2020

Re: No device found after updating the firmware on the board

This is strange!  Based on what I know (not a whole lot) the part should default to loading the factory default "golden" image from the flash if there is any problem with the image you loaded into the user region.  And hopefully that "golden" image region is read only so that you can't overwrite it.  Does the part still show up via JTAG?

Observer
305 Views
Registered: ‎02-12-2020

Re: No device found after updating the firmware on the board

I just performed almost the exact same routine (except I programmed the flash with a custom MCS through the Vivado Hardware Manager) and now I no longer see a PCIe device show up on the bus at all, even after a cold boot.  So I'm very curious to see how this turns out, as I suspect I'm experiencing the same problem.

Xilinx Employee
273 Views
Registered: ‎10-19-2015

Re: No device found after updating the firmware on the board

Hi @wonsikleee and @jrwagz 

Can you tell me which operating system, kernel, shell, and XRT are installed on the system? 

Do you have the flash logs or the logs printed to the screen from flashing the card? 

Not showing up in lspci is an issue. Was there ever a state where you were able to see the card with lspci?

What shell was on the card before you updated? 

Regards,

M

-------------------------------------------------------------------------
Don’t forget to reply, kudo, and accept as solution.
-------------------------------------------------------------------------
Observer
256 Views
Registered: ‎02-12-2020

Re: No device found after updating the firmware on the board

@mcertosi 

I am running Fedora 27, and because of that I'm not using XRT, but instead I'm doing the Custom Flow, and using the Hardware Manager to program the MCS file to the flash directly.

When I load the "revert_to_golden.mcs" file that is attached to AR# 71757, I am able to get the card to again show up on the bus as expected, which I'm validating with the following command:

# lspci -vd 10ee:
17:00.0 Processing accelerators: Xilinx Corporation Device d000
        Subsystem: Xilinx Corporation Device 000e
        Flags: fast devsel, IRQ 11, NUMA node 0
        Memory at b2000000 (32-bit, non-prefetchable) [disabled] [size=32M]
        Memory at b4000000 (32-bit, non-prefetchable) [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1c0] #19
        Capabilities: [400] Access Control Services

So I can at least get back to a known functioning baseline.
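The same check can be scripted. This is a minimal sketch, assuming lspci is on the PATH; the helper names parse_lspci and xilinx_devices are made up for illustration, and 10ee is the Xilinx vendor ID used in the command above.

```python
import re
import subprocess

# Device header lines start with a PCI slot such as "17:00.0"
# (optionally prefixed with a domain, e.g. "0000:17:00.0").
# Indented detail lines printed by -v do not match.
SLOT_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$")


def parse_lspci(text):
    """Return a list of (slot, description) for each device header line."""
    devices = []
    for line in text.splitlines():
        match = SLOT_RE.match(line)
        if match:
            devices.append((match.group(1), match.group(2)))
    return devices


def xilinx_devices():
    """Run `lspci -d 10ee:` and return the devices it lists (empty if none)."""
    result = subprocess.run(
        ["lspci", "-d", "10ee:"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_lspci(result.stdout)
```

Calling xilinx_devices() after a boot returns an empty list when the card has not enumerated, which is the failure described in this thread.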

However, what I am attempting to do is to load a custom MCS onto the Customer Programmable region of the flash (which starts at address 0x01002000, as per UG1289 and AR# 71756). 

I followed the steps outlined in UG1289 and in AR# 71756 to generate this custom MCS file. The design is the generated example design for the PCIe DMA IP, generated in 2019.1, WITHOUT tandem PCIe enabled. I made sure that all of the settings outlined in "u200_bitstream_constraints.xdc" were set, and the command I used to generate the MCS file was:

write_cfgmem -format MCS -size 128 -interface SPIx4 -loaddata "up 0x01002000 ./au200_pcie_dma_2019_1_no_tandem_ex.runs/impl_1/au200_pcie_dma_2019_1_no_tandem_ex.bit" au200_pcie_dma_2019_1_no_tandem_ex.mcs
Command: write_cfgmem -format MCS -size 128 -interface SPIx4 -loaddata {up 0x01002000 ./au200_pcie_dma_2019_1_no_tandem_ex.runs/impl_1/au200_pcie_dma_2019_1_no_tandem_ex.bit} au200_pcie_dma_2019_1_no_tandem_ex.mcs
Creating config memory files...
Creating bitstream load up from address 0x01002000
Loading datafile ./au200_pcie_dma_2019_1_no_tandem_ex.runs/impl_1/au200_pcie_dma_2019_1_no_tandem_ex.bit
Writing file ./au200_pcie_dma_2019_1_no_tandem_ex.mcs
Writing log file ./au200_pcie_dma_2019_1_no_tandem_ex.prm
===================================
Configuration Memory information
===================================
File Format        MCS
Interface          SPIX4
Size               128M
Start Address      0x00000000
End Address        0x07FFFFFF

Addr1         Addr2         Date                    File(s)
0x01002000    0x05C741FD    Feb 12 13:17:32 2020    ./au200_pcie_dma_2019_1_no_tandem_ex.runs/impl_1/au200_pcie_dma_2019_1_no_tandem_ex.bit
0 Infos, 0 Warnings, 0 Critical Warnings and 0 Errors encountered.
write_cfgmem completed successfully
write_cfgmem: Time (s): cpu = 00:00:07 ; elapsed = 00:00:08 . Memory (MB): peak = 10785.043 ; gain = 146.895 ; free physical = 12759 ; free virtual = 111801

I really want Tandem PCIe enabled, but I can't get the example design with tandem PCIe enabled to generate (see https://forums.xilinx.com/t5/Alveo-Accelerator-Cards/XDMA-Tandem-PCIe-Example-App-on-u200/td-p/1074329 for details on that issue).

Is it possible that my image isn't loading fast enough on the part, and therefore isn't enumerating on the PCIe bus in time, which would explain why I'm not seeing it via lspci?
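A rough back-of-the-envelope estimate supports that suspicion. The sketch below takes the Addr1/Addr2 values from the write_cfgmem log above and the SPIx4 bus width from the command line. The configuration clock rates are assumptions chosen only for illustration (the thread does not state the actual rate), and the 100 ms figure is the commonly cited PCIe power-up budget, not something measured here.

```python
def bitstream_bytes(start_addr, end_addr):
    """Size of the image in bytes, treating both addresses as inclusive."""
    return end_addr - start_addr + 1


def load_time_seconds(num_bytes, cclk_hz, bus_width_bits=4):
    """Ideal load time: total bits divided by bits transferred per second."""
    return num_bytes * 8 / (bus_width_bits * cclk_hz)


# Addr1 and Addr2 as reported by write_cfgmem in this thread.
size = bitstream_bytes(0x01002000, 0x05C741FD)
print(f"bitstream size: {size} bytes ({size / 2**20:.1f} MiB)")

# Assumed configuration clock rates, NOT taken from the thread.
for cclk_mhz in (50, 100):
    seconds = load_time_seconds(size, cclk_mhz * 1e6)
    print(f"SPIx4 at {cclk_mhz} MHz: about {seconds:.1f} s")
```

Even at the faster assumed clock, an uncompressed full-device image of this size takes on the order of seconds to load, far longer than a budget of roughly 100 ms, so missing enumeration on a cold boot would be expected without tandem configuration.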

Xilinx Employee
246 Views
Registered: ‎10-19-2015

Re: No device found after updating the firmware on the board

Hi @jrwagz 

Thanks for the extra information. Custom shell creation is very difficult and I can't always give the best guidance in that area. 

If you think the bitstream is too big to load before the PCIe link has to come up, warm boot the server. This will keep the FPGA programmed and allow PCIe to re-enumerate.

If that doesn't work, then your design is broken somewhere else. You'd want a PCIe bus analyzer or a smaller, easier-to-debug design. Another option is a partial reconfiguration design where PCIe and some other necessary functionality are the only things in the shell, and you load the acceleration kernel later.

At that point, you have something similar to our current offering. So unless your design specifically needs a feature that we are unable to offer in our shell/platform today, the easier path, and the path that I can guide you down, is to use the acceleration environment.

I'm sure that's not as helpful as you'd like, let me know what else you think I can do. 

Regards,

M

Observer
245 Views
Registered: ‎02-12-2020

Re: No device found after updating the firmware on the board

I further investigated whether or not my design simply wasn't loading fast enough to enumerate the PCIe bus in time, and found out that indeed that was my issue.

I tested that by performing a warm reboot of the machine, rather than a cold boot, and when the server came back up, my custom image enumerated on the PCI bus as expected:

# lspci -vd 10ee:
17:00.0 Serial controller: Xilinx Corporation Device 9034 (prog-if 01 [16450])
        Subsystem: Xilinx Corporation Device 0007
        Flags: fast devsel, IRQ 35, NUMA node 0
        Memory at b5e00000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1c0] #19
        Kernel modules: xdma

So I guess this is a workaround for me for the time being: I need to do a cold boot followed by a warm reboot to get this image to load, until I can get tandem PCIe to work as desired.
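An option that is sometimes tried in place of the second reboot is asking the Linux kernel to rescan the PCI bus through sysfs once the FPGA has finished configuring. This was not tested in this thread and may not work on every platform, since the BIOS may not have reserved resources for a device that was absent at boot; only the warm reboot is confirmed here. A minimal sketch, with request_pci_rescan as a made-up helper name:

```python
# Standard Linux sysfs entry that triggers a rescan of all PCI buses.
PCI_RESCAN_PATH = "/sys/bus/pci/rescan"


def request_pci_rescan(path=PCI_RESCAN_PATH):
    """Write "1" to the rescan entry; requires root on a real system."""
    with open(path, "w") as rescan_file:
        rescan_file.write("1")
```

Run it as root, then repeat the lspci check to see whether the device appeared.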

Observer
241 Views
Registered: ‎02-12-2020

Re: No device found after updating the firmware on the board

@mcertosi , thanks for the idea, that's exactly what it was (PCIe enumeration time).  If you can find the right person to help solve my problem with getting the tandem PCIe example design to work (https://forums.xilinx.com/t5/Alveo-Accelerator-Cards/XDMA-Tandem-PCIe-Example-App-on-u200/td-p/1074329) that would be most appreciated!

Xilinx Employee
238 Views
Registered: ‎10-19-2015

Re: No device found after updating the firmware on the board

Hi @jrwagz 

I'm glad you have a workaround for now. 

Can you try to cross-post or move the post over to the PCIe experts forum? You'll get better traction there. Unfortunately, we all have our limits of expertise.

Regards,

M
