Registered: ‎03-21-2014

ZC7030 AMP Replacing and restarting core1 standalone app from Linux running on core0 during runtime?

Hello everyone!


We want to use AMP (asymmetric multi-processing) on our ZC7030-based board, with a custom Linux running on core0 and a standalone application on core1 doing soft-realtime work via the PL. The core1 application is NOT loaded by the FSBL. Instead, the running Linux treats it like "firmware": it loads the application and stops, starts, or resets the second core whenever needed, even during runtime (think: updating firmware).


The FSBL leaves core1 alone, except for any "default" initialisation done by the code included in 14.4 SDK to get cpu1 into a clean wait state.


Problem: The processing system sometimes freezes after core1 is restarted from Linux, roughly half the time. The first upload/run never hangs. On every restart we replace the binary in DDR, even if it is identical. When the PS hangs, it does so shortly (~1 s) after CPU1 has successfully started running the binary: I can see a line of output from the standalone app on the serial port, and then the PS hangs.


The app communicates with Linux through OCM, using a custom Linux driver. Linux definitely does not use CPU1, and it leaves DDR above 768M alone; that region is partially used by the CPU1 app.

What are we doing wrong here? I assume that either the (complex) core1 application violates a shared resource (SCU, maybe?) we don't know about, or the loading process we implemented is broken.


The rough outline of such a restart is this, currently:

  1. open /dev/mem and map 0x00000000 (cpu1_boot_vector), 0x30000000 (binary base), and 0xF8000000 (SLCR base) properly
  2. Disable SLCR write protection (SET 0xF8000008 0xDF0D)
  3. A9_CPU_RST_CTRL - assert clock stop and SW reset on CPU1 (SET 0xF8000244 0x22)
  4. memcpy() the binary to 0x30000000
  5. Store 3x32bit of data at 0x00000000 to a temporary memory location
  6. Overwrite(!) the location 0x00000000 with "shellcode" to jump to 0x30000000:
    const uint32_t cpu1_jumpcode[] = { 0xe59f0000, 0xe1a0f000, 0x30000000 };
  7. A9_CPU_RST_CTRL - restart CPU1 by first clearing the reset bit, then the stop bit (CLEAR 0xF8000244 0x02, CLEAR 0xF8000244 0x20), without any delay between the two calls.
  8. A wait loop. Yes, seriously. Don't laugh.
    /* give CPU some time to wake up */
    /* TODO: implement status register poll instead of silly wait loop */
  9. Restore the original values at 0x00000000
  10. Clean up allocated memory.
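
The register part of the outline above can be sketched in C roughly as follows. This is only an illustration of steps 2, 3, 5, 6, 7 and 9, not a fix for the hang. The helper names (`cpu1_halt`, `install_stub`, `cpu1_release`, `restore_vector`) are hypothetical, the bit names are taken from the A9_CPU_RST_CTRL description, and "SET"/"CLEAR" are assumed to mean read-modify-write. The pointers are assumed to come from mmap() on /dev/mem; the helpers only take base pointers, so they can also be exercised against ordinary arrays.

```c
#include <stdint.h>

/* Offsets from the SLCR base (0xF8000000), matching the addresses in the post. */
#define SLCR_UNLOCK_OFFSET      0x008u   /* 0xF8000008 */
#define A9_CPU_RST_CTRL_OFFSET  0x244u   /* 0xF8000244 */
#define SLCR_UNLOCK_KEY         0xDF0Du
#define A9_RST1_BIT             0x02u    /* CPU1 software reset */
#define A9_CLKSTOP1_BIT         0x20u    /* CPU1 clock stop */

static void reg_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
    base[off / 4u] = val;
}

static uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4u];
}

/* Steps 2 and 3: unlock the SLCR, then assert reset and clock stop on CPU1. */
static void cpu1_halt(volatile uint32_t *slcr)
{
    reg_write(slcr, SLCR_UNLOCK_OFFSET, SLCR_UNLOCK_KEY);
    uint32_t v = reg_read(slcr, A9_CPU_RST_CTRL_OFFSET);
    reg_write(slcr, A9_CPU_RST_CTRL_OFFSET, v | A9_RST1_BIT | A9_CLKSTOP1_BIT);
}

/* Steps 5 and 6: save three words at the vector and install the jump stub.
 * The stub is: LDR r0, [pc, #0]; MOV pc, r0; .word entry
 * (in ARM state PC reads as "current instruction + 8", so the LDR at
 * offset 0 picks up the literal at offset 8). */
static void install_stub(volatile uint32_t *vec, uint32_t entry, uint32_t saved[3])
{
    const uint32_t stub[3] = { 0xe59f0000u, 0xe1a0f000u, entry };
    for (int i = 0; i < 3; i++) {
        saved[i] = vec[i];
        vec[i] = stub[i];
    }
}

/* Step 7: clear the reset bit first, then the clock-stop bit. */
static void cpu1_release(volatile uint32_t *slcr)
{
    uint32_t v = reg_read(slcr, A9_CPU_RST_CTRL_OFFSET);
    v &= ~A9_RST1_BIT;
    reg_write(slcr, A9_CPU_RST_CTRL_OFFSET, v);
    v &= ~A9_CLKSTOP1_BIT;
    reg_write(slcr, A9_CPU_RST_CTRL_OFFSET, v);
}

/* Step 9: put the original three words back at the vector. */
static void restore_vector(volatile uint32_t *vec, const uint32_t saved[3])
{
    for (int i = 0; i < 3; i++)
        vec[i] = saved[i];
}
```

The sketch deliberately leaves out the memcpy() of the binary, the wait in step 8, and any cache or barrier handling, since those are exactly the parts in question.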


As you can see, this is rather ugly, and might or might not work reliably. The wait loop in particular is a crude hack, and I would prefer a cleaner solution than fiddling with 0x00000000. I have found another posting with a fix that might apply: it would make my CPU1 look for its jump-in address somewhere other than 0x00000000. I will try that out unless someone has a pinpoint suggestion for what else to try instead.


My questions really boil down to:


  • Is the loader sequence basically correct? Using SLCR to pull stop and reset, replace binary in DDR, fix boot vector, release reset, do not wait, release stop? Especially the workaround with using shellcode jumps and rewriting 0x00000000 in vivo seems dangerous to me, and could easily explain this behaviour.
  • To cleanly restart CPU1 while core0 keeps running independently, is pulling the reset pin all that's needed to do? What about cache invalidation and such?
  • Where can I find a complete list of shared resources between cores I have to watch out for when running independent applications?
  • How can I replace the wait loop with a proper check for a running core1? Isn't there some kind of flag I could read? This is really, really annoying me. :)
  • Is there any documentation about the subject of restarting CPU1 from a running CPU0 OS with a new binary, without affecting the CPU0 OS or PL?
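
To make the wait-loop question concrete: what I have in mind is something like the sketch below, which polls a word in shared memory with a timeout instead of sleeping blindly. It assumes the CPU1 application would be changed to write a magic value to an agreed location (for example in OCM) once it is up. `CPU1_ALIVE_MAGIC` and `wait_for_cpu1` are made-up names, and this is a software handshake, not a hardware status flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical value the CPU1 app would write once it is running. */
#define CPU1_ALIVE_MAGIC 0xC0DE0001u

/* Poll *alive_flag until it holds the magic value or timeout_ms elapses.
 * The caller is expected to clear the flag before releasing CPU1, so a
 * stale value from the previous run is not mistaken for a fresh start. */
static bool wait_for_cpu1(volatile const uint32_t *alive_flag, unsigned timeout_ms)
{
    const struct timespec poll_interval = { 0, 1000000L }; /* 1 ms */
    struct timespec start, now;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        if (*alive_flag == CPU1_ALIVE_MAGIC)
            return true;

        clock_gettime(CLOCK_MONOTONIC, &now);
        long elapsed_ms = (long)(now.tv_sec - start.tv_sec) * 1000L
                        + (now.tv_nsec - start.tv_nsec) / 1000000L;
        if (elapsed_ms >= (long)timeout_ms)
            return false;

        nanosleep(&poll_interval, NULL);
    }
}
```

With this, step 9 (restoring 0x00000000) would only run after `wait_for_cpu1()` returns true, and a false return would at least tell the loader that CPU1 never came up.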


Sorry for the lengthy post, I wanted to provide enough details right off the bat.


Thanks a lot!

1 Reply
Registered: ‎03-27-2014

Hello herbrich,


Did you ever solve the problems mentioned in your post from earlier this year? One thing I found recently for AMP mode is that the CPU1 bare-metal BSP shouldn't initialize the global timer; otherwise the Linux on CPU0 "freezes" or stalls for 100-200 seconds. It does recover eventually, but in the meantime it just seems to hang. John of Xilinx gave the fix last week.


I am still interested in:


- writing WFE address location

- stopping CPU1 (reset + clock)

- writing CPU1 code to DRAM from Linux on CPU0 (shared DRAM; tell Linux via the devicetree bootargs that memory is limited to 768M)

- starting CPU1

- deassert reset and clock etc.


However, it doesn't quite work for me yet. Deasserting the reset in the A9_CPU_RST_CTRL register causes the CPU0 Linux to misbehave: there is no panic or fault, but the prompt never comes back. I am doing this with a Z7045 on the Zynq ZC705 board.
