Advisor eilert
Advisor
8,844 Views
Registered: ‎08-14-2007

HW-cosimulation speed with ml506.

Hi,

I have a simple FIR-Lowpass of 74th order generated with the FIR compiler 5.0.

The model has just one input and one output gateway.

 

I generated a hw-cosim block for the ML506 board with communication over the network.

Ping-Times: rtt min/avg/max/mdev = 0.120/0.174/0.260/0.045 ms

 

I then measured the run times for three different sample counts, using tic/toc around the sim command in a MATLAB script:

                 SW-Simulation (full)   SW-Simulation (cached)   HW-Cosimulation
2^14 samples     101 s                  12 s                     95 s
2^16 samples     116 s                  38 s                     147 s
2^18 samples     241 s                  161 s                    377 s

 

The SW-Simulation(full) is slow because the cache is cleared.

The times include model loading time, simulation initialisation and the actual simulation.

So there are offsets included.

The Offset for the ACE-Setup and FPGA Configuration is 77s.

 

After plotting these results, one can see that the hw-cosimulation is not just slower: the gap between hw-cosimulation and sw-simulation also grows with the number of samples. So there will be no point at which the hw-cosimulation becomes beneficial for the user.
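
The widening gap can be checked directly from the cached software and hardware timings listed above. The following Python sketch is only an illustration (the variable and function names are made up here, not taken from the original measurement script). It estimates the incremental cost per additional sample for both cases between the smallest and the largest run, which cancels out the fixed start-up offsets:

```python
# Timings in seconds from the first measurement (74th-order FIR).
samples = [2**14, 2**16, 2**18]
sw_cached = [12, 38, 161]
hw_cosim = [95, 147, 377]


def marginal_ms_per_sample(times, counts):
    """Incremental cost per sample between the smallest and largest run, in ms."""
    return 1000.0 * (times[-1] - times[0]) / (counts[-1] - counts[0])


sw_slope = marginal_ms_per_sample(sw_cached, samples)
hw_slope = marginal_ms_per_sample(hw_cosim, samples)

print(f"SW (cached): {sw_slope:.2f} ms per additional sample")
print(f"HW co-sim:   {hw_slope:.2f} ms per additional sample")
# If the HW slope is larger, the HW curve can never cross below the SW curve.
print("HW can catch up:", hw_slope < sw_slope)
```

Because the hardware co-simulation has both the larger offset and the larger slope for this model, adding more samples only increases its disadvantage.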

 

I'm well aware that these numbers depend heavily on the model that is used.

That's why I wonder which kinds of models (or Xilinx blockset blocks) cause high sw-simulation times and would therefore actually benefit from hw-cosimulation, despite the communication bottleneck, which seems to be a limiting factor and needs to be investigated separately.

 

Regards

  Eilert

 

9 Replies
Advisor evgenis1
Advisor
8,832 Views
Registered: ‎12-03-2007

Re: HW-cosimulation speed with ml506.

Hi Eilert,

 

This indicates that the amount of traffic between the ML506 board and the software might be the bottleneck, slowing the co-sim in proportion to the number of samples.

 

One quick test to validate this theory is to measure the amount of data sent over the network (e.g. with Wireshark) as a function of the number of samples.

 

In my experience (which is also common sense), co-sim gives the largest speedup over simulation when a sufficiently large design runs on the board and has little communication overhead (e.g. transaction-style exchanges).

 

Thanks,

Evgeni

Advisor eilert
Advisor
8,828 Views
Registered: ‎08-14-2007

Re: HW-cosimulation speed with ml506.

Hi Evgeni,

thanks for confirming my assumptions.

 

Reducing the communication overhead is a good idea, but how can it be done, and what kind of applications would be suitable for that?

 

At the moment I'm simply streaming the samples and have a design that mainly consists of a high number of DSP48 Blocks. It seems like these blocks are easily simulated by Matlab. The FIR model is kept simple, because it is intended to be used in a presentation. Yet my hope was that the high number of DSP48 elements would cause more load for the simulator than it actually does.

 

 

Regards

  Eilert

Advisor evgenis1
Advisor
8,828 Views
Registered: ‎12-03-2007

Re: HW-cosimulation speed with ml506.

Hi Eilert,

 

One option is to have a synthesizable testbench that includes a self-generating traffic source (e.g. a pseudo-random sample generator). The challenge is always to come up with an "expected value" checker, so that the output is not uploaded but checked on the board as well. If there is a mismatch between the expected and the actual output, an error counter is incremented.

This way the whole system runs on the board and the communication overhead is minimal (only polling for status).
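
As a rough behavioural illustration of this scheme, here is a small Python model. It is only a sketch under assumptions: the function names, the 16-bit LFSR, the 8-bit stimulus width and the tiny coefficient set are all hypothetical and not taken from the thread, and in a real design generator, DUT and checker would be synthesizable hardware. It only shows the data flow: stimulus is generated locally, the DUT output is compared against a reference locally, and the host reads back a single error count.

```python
def lfsr16(state):
    """One step of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def fir(coeffs, history):
    """Reference FIR: dot product of coefficients and the newest-first history."""
    return sum(c * x for c, x in zip(coeffs, history))


def run_on_board(n_samples, coeffs, dut, seed=0xACE1):
    """Generate stimulus, run DUT and reference side by side, count mismatches."""
    state = seed
    history = [0] * len(coeffs)
    errors = 0
    for _ in range(n_samples):
        state = lfsr16(state)
        sample = (state & 0xFF) - 128        # signed 8-bit pseudo-random stimulus
        history = [sample] + history[:-1]    # shift the new sample into the delay line
        if dut(coeffs, history) != fir(coeffs, history):
            errors += 1                      # on-board error counter
    return errors                            # the only value the host has to poll


COEFFS = [1, 2, 3, 2, 1]                     # hypothetical small filter


def faulty_dut(coeffs, history):
    """A deliberately broken DUT, used to show that the counter reacts."""
    return fir(coeffs, history) + 1


errors_good = run_on_board(1000, COEFFS, fir)
errors_bad = run_on_board(1000, COEFFS, faulty_dut)
print(errors_good, errors_bad)
```

In this toy model the reference is the same function as the correct DUT, which sidesteps the hard part mentioned above: finding an independent checker that is cheap enough to run on the board.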

 

I don't know if this scheme is applicable to your design.

 

Thanks,

Evgeni

Advisor eilert
Advisor
8,825 Views
Registered: ‎08-14-2007

Re: HW-cosimulation speed with ml506.

Hi Evgeni,

yes, that sounds like a good idea.

I need to work out some details but it should be possible to create a simple design with the mentioned properties.

*brain shifting into higher gear now*

 

Thanks for the suggestion.

 

Kind regards

  Eilert

 

Advisor eilert
Advisor
8,825 Views
Registered: ‎08-14-2007

Re: HW-cosimulation speed with ml506. #UPDATE#

Hi,

just made another measurement of simulation times.

This time the FIR-LP was of 219th order, almost three times the size of the one in the previous test.

 

Now things have changed drastically:

                 SW-Simulation (full)   SW-Simulation (cached)   HW-Cosimulation
2^14 samples     256 s                  129 s                    95 s
2^16 samples     543 s                  545 s                    143 s
2^18 samples     2237 s                 2197 s                   342 s

Now the Simulink-based simulations are far slower than the hw-cosimulation.

And again, no convergence or crossover point is to be expected as the number of samples rises.

(So communication overhead is no longer the bottleneck in this case.)

 

But the most interesting point is this:

Comparing the hw-cosimulation times of the two models shows that the size of the model seems to have no influence at all. Only the number of samples matters.

However, this might be because the two models are very much alike, except for their size.

And the samples are exchanged as a continuous stream.

Other models might behave differently.

 

Regards

  Eilert

Advisor evgenis1
Advisor
8,816 Views
Registered: ‎12-03-2007

Re: HW-cosimulation speed with ml506. #UPDATE#

Hi Eilert,

 

I assume you don't monitor all the internal signals in the 74th-order and 219th-order FIR, only the interfaces. If that's the case, then the co-sim numbers make sense. The amount of uploaded data remains about the same between the two experiments. The co-sim design on the board runs at a much higher speed than the sw part (MHz vs kHz), so the increase in design size has no noticeable effect on the co-sim runtime.

 

Thanks,

Evgeni

 

Advisor eilert
Advisor
8,811 Views
Registered: ‎08-14-2007

Re: HW-cosimulation speed with ml506. #UPDATE#

Hi Evgeni,

you are right. The model is intended for showing simulation speed enhancement.

Also there's no access to the internal stages of the FIR anyway when using the FIR Compiler 5.0 (or any other).

 

About the speed, remember that in the first experiment it was the sw-simulation that was a lot faster.

Of course it is natural for a simulation to need more time with increasing complexity, since it is computed sequentially. One interesting point here is that increasing the FIR order by a factor of 3 increases the simulation time by a factor of about 12.
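
The factor can be read off the cached sw-simulation times of the two experiments. The Python sketch below (dictionary names are illustrative only) divides the 219th-order times by the 74th-order times for each sample count. Note that these times still include model loading and initialisation, so the ratios are only a rough indication:

```python
# Cached SW-simulation times in seconds, keyed by number of samples.
sw_cached_order74 = {2**14: 12, 2**16: 38, 2**18: 161}
sw_cached_order219 = {2**14: 129, 2**16: 545, 2**18: 2197}

# Slowdown of the larger filter relative to the smaller one.
ratios = {n: sw_cached_order219[n] / sw_cached_order74[n] for n in sw_cached_order74}

for n, r in ratios.items():
    print(f"{n} samples: {r:.1f}x slower")

mean_ratio = sum(ratios.values()) / len(ratios)
print(f"mean slowdown: {mean_ratio:.1f}x")
```

All three ratios land between 10 and 15, which is consistent with the "about 12" quoted above and clearly more than the factor of 3 in filter order.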

 

Apart from the MHz clocking, in hardware it doesn't matter how long a pipeline is: after the initial latency, a result is generated on every clock cycle. It's the parallelism that beats the CPU.

Also, in the hw-cosim setup the DUT probably runs significantly slower than the original clock frequency, due to the synchronisation between the network core and Simulink.

 

   (95s-77s)  /  2^14 Samples = ca. 1 ms/Sample

   (342s-77s) / 2^18 Samples = ca. 1 ms/Sample

 

So the hw-cosim (after subtracting the configuration time of 77 s) consistently needs about 1 ms per sample.

Slight deviations caused by the ping times could be observed too. In the end there's not much left of the MHz the board is clocked with, although the clock is of course still needed to run the Ethernet core properly.
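
The same arithmetic for all three hw-cosimulation runs of the 219th-order filter, as a short Python sketch (names are illustrative; the 77 s offset is the ACE setup and FPGA configuration time stated in the first post):

```python
CONFIG_OFFSET_S = 77  # ACE setup + FPGA configuration, from the first post

# HW co-simulation times in seconds for the 219th-order FIR.
hw_cosim_order219 = {2**14: 95, 2**16: 143, 2**18: 342}

per_sample_ms = {
    n: 1000.0 * (t - CONFIG_OFFSET_S) / n for n, t in hw_cosim_order219.items()
}

for n, ms in per_sample_ms.items():
    print(f"{n} samples: {ms:.2f} ms/sample, about {1000.0 / ms:.0f} samples/s")
```

All three runs come out close to 1 ms per sample, i.e. an effective throughput on the order of a thousand samples per second regardless of the run length.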

 

But I think it's nice to see how much information can be derived from the analysis of such a simple design. Here "simple" refers to the effort spent creating the design with Matlab/Simulink and sysgen, whereas the actual hardware structure is big and quite complex.

 

Kind regards

  Eilert

 

 

 

Contributor
Contributor
6,647 Views
Registered: ‎05-24-2013

Re: HW-cosimulation speed with ml506. #UPDATE#

Can anybody share the example files and describe the process they used?

 

Secondly, can anybody give expert advice on XAPP 1031 and the procedure for measuring simulation run time?

 

 

Sorry for asking in someone else's thread, but I need help.

Visitor jcriand
Visitor
6,355 Views
Registered: ‎05-15-2014

Re: HW-cosimulation speed with ml506. #UPDATE#
