242 Views
Registered: ‎05-23-2018

Stuck on SynchronousInterruptHandler - how to debug?

Hi,

I am developing an AMP application that runs bare-metal on a custom board with a Zynq US+ (a CG-type, the kind with two A53 cores). Both cores execute almost identical code, except for some initialization of the FPGA that only the first core does. Everything went fine for several weeks of development, but now the processor continually gets stuck in the SynchronousInterruptHandler.

I have single stepped the code and seen that it can happen at various places (seemingly completely random!): in the middle of a printf statement or just while reading/writing registers in the FPGA.

Initially, the problem only occurred on the first core. I then tried swapping the base addresses of the two ELFs for the two cores, which made the other core get stuck in the SynchronousInterruptHandler instead. But after making some additions to the code, both cores continuously get stuck in the same manner.

How do I debug this? What can be wrong?

[Image: Both cores get stuck in the SynchronousInterruptHandler]

The linker script sets up the memory like this (the linker variable ELF_BASE_ADDRESS is 0x00000000 for one core and 0x1FF80000 for the other, i.e. placed at the start of, and halfway into, the memory respectively):

MEMORY
{
   psu_ddr_0_MEM_0 : ORIGIN = ELF_BASE_ADDRESS, LENGTH = (0x3FF00000 >> 1) /* ELF_BASE_ADDRESS must be defined as linker symbol */
   psu_ocm_ram_0_MEM_0 : ORIGIN = 0xFFFC0000, LENGTH = 0x40000
   psu_qspi_linear_0_MEM_0 : ORIGIN = 0xC0000000, LENGTH = 0x20000000
}
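
The arithmetic behind this layout can be checked with a few lines of C. This is only a sketch: the `region` type, the `regions_overlap` helper and the `core0_ddr`/`core1_ddr` names are hypothetical, and the constants are copied from the linker script and the two ELF_BASE_ADDRESS values quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define DDR_TOTAL   0x3FF00000u        /* value that the linker script shifts */
#define DDR_LENGTH  (DDR_TOTAL >> 1)   /* DDR bytes given to each core */

/* Hypothetical description of one MEMORY entry. */
typedef struct {
    uint32_t origin;
    uint32_t length;
} region;

/* True if the half-open ranges [origin, origin + length) intersect. */
static bool regions_overlap(region a, region b)
{
    uint64_t a_end = (uint64_t)a.origin + a.length;
    uint64_t b_end = (uint64_t)b.origin + b.length;
    return a.origin < b_end && b.origin < a_end;
}

/* The two DDR regions, one per ELF_BASE_ADDRESS value. */
static const region core0_ddr = { 0x00000000u, DDR_LENGTH };
static const region core1_ddr = { 0x1FF80000u, DDR_LENGTH };
```

With these numbers the second region starts exactly where the first one ends, so the two DDR regions are adjacent but do not overlap. Note that this says nothing about stray writes: a bad pointer can still reach the other core's region, because the linker script only controls where sections are placed.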


Zynq US+ CG
XSDK 2018.3

Sincerely,
Erasmus Cedernaes
Saab

1 Solution

Accepted Solutions
Scholar ericv
Scholar
223 Views
Registered: ‎04-13-2015

Re: Stuck on SynchronousInterruptHandler - how to debug?

@erasmus.cedernaes.saab 

These registers provide information about where the abort happened and what caused it. They are:

ELR_ELn - address (or the one after) where the fault occurred

ESR_ELn - syndrome register

SPSR_ELn - state of the processor when the fault occurred

where n is the exception level at which the abort is handled (not the level the processor was at when the fault occurred).

The registers are described in the Arm Architecture Reference Manual, Armv8, for Armv8-A architecture profile.
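
As an illustration of what SPSR_ELn holds, here is a minimal sketch that splits a raw SPSR value (for example one copied from the debugger's register view) into its main fields. The `spsr_info` type and the `spsr_decode` function are hypothetical names. The bit positions are those of the AArch64 SPSR layout in the Arm ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected fields of SPSR_ELn. The el and used_sp_elx fields are only
 * meaningful when from_aarch32 is false. */
typedef struct {
    bool     from_aarch32; /* M[4]: interrupted code ran in AArch32 state */
    unsigned el;           /* M[3:2]: exception level that was interrupted */
    bool     used_sp_elx;  /* M[0]: 1 = SP_ELx was in use, 0 = SP_EL0 */
    unsigned daif;         /* bits [9:6]: D, A, I, F mask bits */
} spsr_info;

static spsr_info spsr_decode(uint64_t spsr)
{
    spsr_info s;
    s.from_aarch32 = ((spsr >> 4) & 1u) != 0;
    s.el           = (unsigned)((spsr >> 2) & 3u);
    s.used_sp_elx  = (spsr & 1u) != 0;
    s.daif         = (unsigned)((spsr >> 6) & 0xFu);
    return s;
}
```

For example, a value of 0x3CD decodes to EL3 using SP_EL3 with all of D, A, I and F masked. If the decoded level equals the level the handler runs at, the fault was taken without a change in exception level.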

 

7 Replies

187 Views
Registered: ‎05-23-2018

Re: Stuck on SynchronousInterruptHandler - how to debug?

@ericv
Thank you!

Looking in both Arm Architecture Reference Manual - Armv8, for Armv8-A architecture profile and Arm Cortex-A53 MPCore Processor Technical Reference Manual, I was able to interpret the state of the core that was stuck in the interrupt. However, the cause is still unclear. The Exception class indicated an Instruction Abort (0b100000), and the Instruction Specific Syndrome indicated a Translation fault, level 0 (0b0100).

Something that confuses me is that the exception does not occur when single stepping the code! Also, the Exception Link Register just points to the address of the Synchronous Interrupt Handler (i.e. the address that it gets stuck at).
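
For anyone repeating this, the decoding described above can be sketched in C roughly as follows. The macro and function names are hypothetical, and only the exception classes and fault status codes relevant to this thread are listed. The field positions (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], fault status code in ISS[5:0]) are taken from the ESR_ELx description in the Arm ARM.

```c
#include <stdint.h>

#define ESR_EC(esr)   ((unsigned)(((esr) >> 26) & 0x3Fu)) /* bits [31:26] */
#define ESR_IL(esr)   ((unsigned)(((esr) >> 25) & 0x1u))  /* bit 25 */
#define ESR_ISS(esr)  ((uint32_t)((esr) & 0x1FFFFFFu))    /* bits [24:0] */
#define ESR_FSC(esr)  ((unsigned)((esr) & 0x3Fu))         /* ISS[5:0], aborts only */

/* Name of the exception class, limited to instruction and data aborts. */
static const char *esr_ec_name(unsigned ec)
{
    switch (ec) {
    case 0x20: return "Instruction Abort from a lower Exception level";
    case 0x21: return "Instruction Abort without a change in Exception level";
    case 0x24: return "Data Abort from a lower Exception level";
    case 0x25: return "Data Abort without a change in Exception level";
    default:   return "other (see the Arm ARM)";
    }
}

/* Name of the fault status code, limited to translation faults. */
static const char *esr_fsc_name(unsigned fsc)
{
    switch (fsc) {
    case 0x04: return "Translation fault, level 0";
    case 0x05: return "Translation fault, level 1";
    case 0x06: return "Translation fault, level 2";
    case 0x07: return "Translation fault, level 3";
    default:   return "other (see the Arm ARM)";
    }
}
```

A raw value such as 0x82000004 would then report exception class 0x20 and "Translation fault, level 0".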

Scholar ericv
Scholar
176 Views
Registered: ‎04-13-2015

Re: Stuck on SynchronousInterruptHandler - how to debug?

@erasmus.cedernaes.saab 

A translation fault means the processor accessed an address that is tagged as invalid in the MMU table. One thing that is easy to forget: if you update the MMU table after the MMU and caches have been enabled, the MMU has its own small cache (the TLB), so it may not see the updated entries. The MMU table itself is also likely to be in a cached region. If that's your case, single stepping may trigger an update of all that caching.

158 Views
Registered: ‎05-23-2018

Re: Stuck on SynchronousInterruptHandler - how to debug?

@ericv
I am not updating the MMU table in my application, so this shouldn't be a problem. Thanks for the tip anyway.
Moderator
Moderator
120 Views
Registered: ‎09-12-2007

Re: Stuck on SynchronousInterruptHandler - how to debug?

Is this still an open issue? The thread is marked as solved.

114 Views
Registered: ‎05-23-2018

Re: Stuck on SynchronousInterruptHandler - how to debug?

Hi,

I still have an issue with why it gets stuck, but this question was only about how to debug it. I might open a new thread about why the problem occurs in the first place when I get time to look into it more. Thanks for asking!

Best regards,
Erasmus Cedernaes

66 Views
Registered: ‎05-23-2018

Re: Stuck on SynchronousInterruptHandler - how to debug?

I have also managed to solve the actual problem. The error occurred due to an uninitialized pointer.

The code that dereferenced that pointer wrote to memory through it and thus overwrote the lower addresses (0x0000 up to, say, 0x0400). This was fine when only running on one core that did not use this address space. However, when running an application where those addresses were in use, all hell broke loose.

I'll describe the solution a bit more when I get some time.

Edit:

So, the uninitialized pointer was dereferenced, and the memory area it pointed to was written with data instead of instructions. This overwrote the exception/interrupt handlers located at the lower addresses (about 0x200). However, this did not cause an error immediately. The error was triggered when an xil_printf statement triggered a synchronous abort interrupt with error code 0b000111. I believe this is completely expected, but since the instructions at address 0x200 had been overwritten with garbage, the first interrupt got completely stuck and triggered another exception, which also got stuck. And so on.

I caught the error by offsetting the start address of the ELF file a bit. The error then stopped occurring, but another error appeared instead that was much simpler to debug.
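
Another way to catch this kind of corruption (not what I did above, just a sketch) would be to hash the vector table early at startup and compare the hash again at suspicious points in the code. The names `vt_guard`, `vt_guard_init` and `vt_guard_ok` are hypothetical, and FNV-1a is an arbitrary choice of hash. The size of 0x800 bytes assumes the standard AArch64 table of 16 entries of 0x80 bytes each, and the base address is assumed to be whatever VBAR_ELn holds.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_TABLE_BYTES 0x800u /* AArch64: 16 entries of 0x80 bytes */

/* 32-bit FNV-1a hash over a byte range. */
static uint32_t fnv1a(const volatile uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Snapshot of the vector table taken while it is known to be good. */
typedef struct {
    const volatile uint8_t *base;
    uint32_t hash;
} vt_guard;

/* Record the hash of the table, e.g. at the very start of main(). */
static vt_guard vt_guard_init(const volatile void *vector_base)
{
    vt_guard g = { (const volatile uint8_t *)vector_base, 0 };
    g.hash = fnv1a(g.base, VECTOR_TABLE_BYTES);
    return g;
}

/* Returns 1 while the table is unchanged, 0 once it has been overwritten. */
static int vt_guard_ok(const vt_guard *g)
{
    return fnv1a(g->base, VECTOR_TABLE_BYTES) == g->hash;
}
```

Calling vt_guard_ok() before and after a suspect block of code narrows down which block does the overwriting, without having to wait for the next exception to be taken.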

Thanks for all the help!
