03-21-2014 03:50 AM
We want to use AMP (asymmetric multiprocessing) on our ZC7030-based board, with a custom Linux running on core0 and a standalone application on core1 doing soft real-time work via the PL. The core1 application is NOT loaded by the FSBL. Instead, the running Linux treats it like "firmware": it loads the application and stops, starts, or resets the second core at will, even at runtime (think: updating firmware).
The FSBL leaves core1 alone, except for any "default" initialisation done by the code included in the 14.4 SDK to get CPU1 into a clean wait state.
Problem: The processing system sometimes freezes after core1 is restarted from Linux, roughly half the time. The first upload/run never hangs. On every restart we replace the binary in DDR, even if it is identical. When the PS hangs, it does so shortly (~1 s) after CPU1 has successfully started the binary: I can see a line of output from the standalone app on the serial console, then the PS hangs.
The app communicates with Linux through OCM, using a custom Linux driver. Linux definitely does not use CPU1, and it leaves DDR above 768M alone; part of that region is used by the CPU1 app.
What are we doing wrong here? I assume that either the (complex) core1 application violates a shared resource (SCU, maybe?) we don't know about, or the loading process we implemented is broken.
The rough outline of such a restart is this, currently:
As you can see, this is rather ugly, and might or might not work reliably. The wait loop in particular is a crude hack, and I would also prefer a cleaner solution than this fiddling with 0x00000000. I have found another posting offering a fix that might apply: it would make my CPU1 look for its jump-in address somewhere other than 0x00000000. I would try that out if no one has a pinpoint suggestion for what else to try instead.
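For readers who have not seen the 0x00000000 approach: a minimal sketch, assuming it means placing a tiny jump stub at the reset vector so that CPU1 branches to the freshly loaded binary when it comes out of reset. The function name `write_cpu1_trampoline` and the `vector` pointer are hypothetical; `vector` is assumed to be an uncached mapping of whatever memory CPU1 sees at address 0x0. The instruction word is the standard A32 encoding of `ldr pc, [pc, #-4]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A32 encoding of "ldr pc, [pc, #-4]": PC reads as (instruction address + 8),
 * so this loads PC from the word stored directly after the instruction. */
#define ARM_LDR_PC_NEXT_WORD 0xE51FF004u

/* Hypothetical helper: write a two-word jump stub at the reset vector.
 * 'vector' must point at the memory CPU1 fetches from after reset
 * (assumed here to be address 0x0, mapped uncached). */
void write_cpu1_trampoline(volatile uint32_t *vector, uint32_t entry)
{
    assert(vector != NULL);

    vector[1] = entry;                 /* literal: entry point of the CPU1 app */
    vector[0] = ARM_LDR_PC_NEXT_WORD;  /* instruction executed first after reset */
}
```

This sketch deliberately leaves out cache maintenance and memory barriers. CPU1 leaves reset with its caches off, so if the stub is written through a cached mapping it would have to be flushed to memory before reset is released.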
My questions really boil down to:
Sorry for the lengthy post, I wanted to provide enough details right off the bat.
Thanks a lot!
09-04-2014 11:10 AM
Did you ever solve the problems mentioned in your post from earlier this year? One thing I found recently for AMP mode is that the bare-metal application on CPU1 shouldn't initialize the global timer in the BSP; otherwise Linux on CPU0 "freezes" or stalls for 100-200 seconds. It does recover eventually, but it looks like a hang. John of Xilinx gave the fix last week.
I am still interested in:
- writing WFE address location
- stopping CPU1 (reset + clock)
- writing CPU1 code to DRAM from Linux on CPU0 (shared DRAM; tell Linux via devicetree bootargs that memory is limited to 768M)
- starting CPU1
- deassert reset and clock etc.
However, it doesn't quite work for me yet. Deasserting reset in the A9_CPU_RST_CTRL register causes Linux on CPU0 to misbehave: there is no panic or fault, but the prompt never comes back. I am doing this with a Z7045 on a Zynq ZC705 board.
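The stop/start steps in the list above can be sketched as read-modify-write accesses to A9_CPU_RST_CTRL that touch only the CPU1 bits. This is a sketch under assumptions, not a confirmed fix for the hang: the offsets, the unlock key, and the bit positions are as I read them in the Zynq-7000 TRM (UG585) and should be checked against your revision. The function names and the `slcr` pointer are hypothetical; `slcr` is assumed to be a mapping of the SLCR block (base 0xF8000000), for example from `ioremap` in a driver. Writing a whole constant to the register instead of modifying it would also change the CPU0 reset and clock-stop bits, which is one possible (unverified) way to upset Linux on CPU0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as read from the Zynq-7000 TRM (UG585); verify before use. */
#define SLCR_UNLOCK_OFFSET     0x008u
#define SLCR_UNLOCK_KEY        0xDF0Du
#define A9_CPU_RST_CTRL_OFFSET 0x244u
#define A9_RST1                (1u << 1)  /* CPU1 reset */
#define A9_CLKSTOP1            (1u << 5)  /* CPU1 clock stop */

static uint32_t slcr_read(volatile uint32_t *slcr, uint32_t offset)
{
    return slcr[offset / 4u];
}

static void slcr_write(volatile uint32_t *slcr, uint32_t offset, uint32_t value)
{
    slcr[offset / 4u] = value;
}

/* Hypothetical helper: assert CPU1 reset, then stop its clock.
 * All other bits (including the CPU0 bits) are preserved. */
void cpu1_stop(volatile uint32_t *slcr)
{
    uint32_t reg;

    assert(slcr != NULL);
    slcr_write(slcr, SLCR_UNLOCK_OFFSET, SLCR_UNLOCK_KEY);

    reg = slcr_read(slcr, A9_CPU_RST_CTRL_OFFSET);
    reg |= A9_RST1;
    slcr_write(slcr, A9_CPU_RST_CTRL_OFFSET, reg);
    reg |= A9_CLKSTOP1;
    slcr_write(slcr, A9_CPU_RST_CTRL_OFFSET, reg);
}

/* Hypothetical helper: release CPU1 reset, then restart its clock. */
void cpu1_start(volatile uint32_t *slcr)
{
    uint32_t reg;

    assert(slcr != NULL);
    slcr_write(slcr, SLCR_UNLOCK_OFFSET, SLCR_UNLOCK_KEY);

    reg = slcr_read(slcr, A9_CPU_RST_CTRL_OFFSET);
    reg &= ~A9_RST1;
    slcr_write(slcr, A9_CPU_RST_CTRL_OFFSET, reg);
    reg &= ~A9_CLKSTOP1;
    slcr_write(slcr, A9_CPU_RST_CTRL_OFFSET, reg);
}
```

A restart would then be `cpu1_stop(slcr)`, copy the new binary to DRAM, set up the start address, and `cpu1_start(slcr)`. Taking the pointer as a parameter keeps the register logic testable against a plain array.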