772 Views
Registered: ‎02-19-2019

Issues Setting up LSF

Hello Xilinx board!
I've been trying to set up a test LSF cluster at work, and I've run into a series of problems that I was hoping to find some answers to here.

For a bit of background, my machine is running Ubuntu 16.04 in VirtualBox running on a Windows 10 host machine. I'm using Vivado 2018.2. The cluster currently only consists of an Ubuntu master and an Ubuntu slave running in separate virtual machines.
After working through some permissions issues, I am now able to start the cluster, test it using Vivado (which it passes), and send work to the slave.

The LSF command I'm using for this is:
bsub -R select[type=X86_64] -N -q normal -m slave-vm

Which puts jobs in the "normal" queue and sends them only to my slave-vm (excluding the master from jobs).
I can see directories in the .runs directory get populated:

[Screenshot: 2019-05-28 15_57_25-Ubuntu 16.04 [Running] - Oracle VM VirtualBox.png]

(For my simple test project I only have three out-of-context modules that need to be synthesized.)
Partway through synthesis of any of these modules, however, I receive an error like the following:

 

*** Running vivado
with args -log zcu102_led_controller_simpl_0_0.vds -m64 -product Vivado -mode batch -messageDb vivado.pb -notrace -source zcu102_led_controller_simpl_0_0.tcl


****** Vivado v2018.2 (64-bit)
**** SW Build 2258646 on Thu Jun 14 20:02:38 MDT 2018
**** IP Build 2256618 on Thu Jun 14 22:10:49 MDT 2018
** Copyright 1986-2018 Xilinx, Inc. All Rights Reserved.

source zcu102_led_controller_simpl_0_0.tcl -notrace
Command: synth_design -top zcu102_led_controller_simpl_0_0 -part xczu9eg-ffvb1156-2-e -mode out_of_context
Starting synth_design
Attempting to get a license for feature 'Synthesis' and/or device 'xczu9eg'
INFO: [Common 17-349] Got license for feature 'Synthesis' and/or device 'xczu9eg'
INFO: Launching helper process for spawning children vivado processes
INFO: Helper process launched with PID 7633
---------------------------------------------------------------------------------
Starting RTL Elaboration : Time (s): cpu = 00:00:04 ; elapsed = 00:00:08 . Memory (MB): peak = 1659.742 ; gain = 0.000 ; free physical = 5584 ; free virtual = 9201
---------------------------------------------------------------------------------
INFO: [Synth 8-6157] synthesizing module 'zcu102_led_controller_simpl_0_0' [/home/projects/xilinx-zcu102-2018.2/hardware/xilinx-zcu102-2018.2/xilinx-zcu102-2018.2.srcs/sources_1/bd/zcu102/ip/zcu102_led_controller_simpl_0_0/synth/zcu102_led_controller_simpl_0_0.v:57]
INFO: [Synth 8-6157] synthesizing module 'led_controller_simplified' [/home/projects/xilinx-zcu102-2018.2/hardware/xilinx-zcu102-2018.2/xilinx-zcu102-2018.2.srcs/sources_1/bd/zcu102/ipshared/809c/hdl/led_controller_simplified.v:23]
INFO: [Synth 8-6155] done synthesizing module 'led_controller_simplified' (1#1) [/home/projects/xilinx-zcu102-2018.2/hardware/xilinx-zcu102-2018.2/xilinx-zcu102-2018.2.srcs/sources_1/bd/zcu102/ipshared/809c/hdl/led_controller_simplified.v:23]
INFO: [Synth 8-6155] done synthesizing module 'zcu102_led_controller_simpl_0_0' (2#1) [/home/projects/xilinx-zcu102-2018.2/hardware/xilinx-zcu102-2018.2/xilinx-zcu102-2018.2.srcs/sources_1/bd/zcu102/ip/zcu102_led_controller_simpl_0_0/synth/zcu102_led_controller_simpl_0_0.v:57]
---------------------------------------------------------------------------------
Finished RTL Elaboration : Time (s): cpu = 00:00:05 ; elapsed = 00:00:13 . Memory (MB): peak = 1659.742 ; gain = 0.000 ; free physical = 5445 ; free virtual = 9069
---------------------------------------------------------------------------------
Failed to open file ./.Xil/Vivado-7542-seth-VirtualBox/elab.rtd. Please check the path and rerun synthesis.
invalid command name "NULL"
INFO: [Common 17-83] Releasing license: Synthesis
6 Infos, 0 Warnings, 0 Critical Warnings and 1 Errors encountered.
synth_design failed
ERROR: [Common 17-69] Command failed: Synthesis failed - please see the console or run log file for details
INFO: [Common 17-206] Exiting Vivado at Tue May 28 16:11:37 2019...

 

I have verified that the file it fails to open, elab.rtd, is being created at that path with proper permissions.
This was hard to verify, because as soon as the failure occurs, the .Xil directory is wiped.

In an attempt to simplify the problem and take the intricacies of LSF out of the equation
(since I assumed I was running into a permissions issue of some sort),
I put my small test project on a shared network drive and tried synthesizing it on a single computer (the master VM).

I have the project directory mounted as a Samba share using the following entry in /etc/fstab:
//192.168.7.2/share/Seth/xilinx-zcu102-2018.2 /home/projects/xilinx-zcu102-2018.2 cifs user=***,pass=***,rw,file_mode=0777,dir_mode=0777 0 0

Surprisingly enough, I get exactly the same failure when synthesizing the project from the network drive without using LSF.
If I take the same project and put it locally on my machine, at the exact path where I had mounted the share, it synthesizes just fine.
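
A quick way to confirm which kind of filesystem backs a given project directory is sketched below. It assumes GNU coreutils df (for the --output=fstype option); the function names fs_type and is_network_fstype are made up for illustration and are not part of Vivado or LSF.

```shell
# Print the filesystem type that backs a path (for example ext4, cifs, nfs4).
# Assumes GNU coreutils df, which supports --output=fstype.
fs_type() {
    df --output=fstype "$1" | tail -n 1
}

# Return success if a filesystem type name is one of the common network filesystems.
is_network_fstype() {
    case "$1" in
        nfs|nfs4|cifs|smbfs|smb3) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_network_fstype "$(fs_type /home/projects/xilinx-zcu102-2018.2)" && echo "project is on a network share"` would report whether the path used above is served over the network.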

Is having a project directory stored on a network share something that is supported by Vivado?
Is there something special I need to do to make this work?
Since LSF is an option for remote synthesis, and LSF requires you to have shared directories
between computers in your cluster, I assumed that this was something that was supported by Vivado.

Any tips or hints from people who've tried out or done this configuration are greatly appreciated.
I've been banging my head against the wall trying to get this to work!

Thanks,
~Seth

Accepted Solutions
watari (Mentor)
766 Views
Registered: ‎06-16-2013

Re: Issues Setting up LSF

Hi seth.kramer@spectranetix.com 

 

This looks like a VirtualBox issue or a Samba issue.

I suggest using the latest VirtualBox, and NFS instead of Samba, without the delayed-write option.

 

Best regards,

734 Views
Registered: ‎02-19-2019

Re: Issues Setting up LSF

Thanks for the reply watari!
I'm using a very recent version of VirtualBox (I've since upgraded to 6.0.8, which didn't resolve the issue),
but you're probably correct about the write delays causing my issue.

Unfortunately I'm not a networking person by trade, so I'm probably going to have to do some
spelunking to figure out how I might reduce the delays.

My original test setup used a directory shared via Samba on my master VM
and mounted on my slave VM.
From a brief look at the Samba configuration options, it appears something like:
socket options = IPTOS_LOWDELAY TCP_NODELAY
might help.

Will have to look into setting up NFS as a possible option.

694 Views
Registered: ‎02-19-2019

Re: Issues Setting up LSF

So I tried setting up my Samba server with different config options to minimize the delay, but was unsuccessful with just:
socket options = IPTOS_LOWDELAY TCP_NODELAY

Based on the suggestion above, I then set up an NFS server on my master.
I had some permissions issues getting this to work at first, but worked through them.
This solved my issue, and LSF now works in Vivado for me.
Below is a brief description of setting up NFS for LSF, in the hope that it helps someone trying to configure this in the future.

I set up my initial NFS configuration using the guide here:
https://www.linuxuprising.com/2018/11/easy-nfs-share-setup-in-ubuntu-linux.html
Essentially, it uses Simple NFS GUI to give you a basic implementation.

I then made some modifications on my master and slave to get permissions that would allow
the slave to read, write, and execute files.

Master
I modified the entry for my share made in /etc/exports to look like this:

# Shared folder NFS as Server
/home/build/projects/ 192.168.7.91(rw,all_squash,anonuid=0,anongid=0,sync)

From my understanding, what this does is squash all user account access from clients to
an anonymous user who has root access (anonuid=0,anongid=0).
This could be a large security risk if used as a final implementation, but it works just fine for my testing.
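
If squashing everyone to root is a concern, a less permissive variant might map all client access to an ordinary account instead. This is only a sketch: it assumes a build user with UID and GID 1000 exists on the master and owns /home/build/projects, which is an assumption and not something I tested in this setup.

```
# /etc/exports on the master (hypothetical variant; assumes UID/GID 1000 owns the share)
/home/build/projects/ 192.168.7.91(rw,all_squash,anonuid=1000,anongid=1000,sync)
```

After editing /etc/exports, running `sudo exportfs -ra` re-exports the shares so the change takes effect.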


Slave
I modified the entry created in /etc/fstab:

# Shared folder NFS from Server & mount point
192.168.7.92:/home/build/projects /home/build/projects nfs exec,auto,hard,intr 0 0

I changed the mount path (/home/build/projects in my case) to match that of the master,
since for LSF to work, the paths to the project directory must be the same for all hosts in the cluster.
I also modified the default "users" option to "exec", since "users" defaults to the noexec option.
A good way to check the current options for a given mountpoint is by looking at the entry made for it in /etc/mtab.
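
The /etc/mtab check can be scripted roughly as follows. This is a minimal sketch: the helper names mount_opts and is_exec_mount are hypothetical, and it assumes the usual whitespace-separated mtab layout (device, mount point, type, options, dump, pass).

```shell
# Print the mount options recorded for a mount point.
# Usage: mount_opts <mountpoint> [table-file]   (table file defaults to /etc/mtab)
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/etc/mtab}"
}

# Succeed only if the mount point is listed and is not mounted with noexec.
is_exec_mount() {
    local opts
    opts="$(mount_opts "$@")"
    [ -n "$opts" ] || return 1          # not mounted at all
    case ",$opts," in
        *,noexec,*) return 1 ;;         # noexec present: binaries cannot run from here
        *) return 0 ;;
    esac
}
```

On the slave, `mount_opts /home/build/projects` would print the active options, and `is_exec_mount /home/build/projects || echo "mounted noexec or not mounted"` flags the case that the "users" default would cause.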