whelm
Explorer
5,314 Views
Registered: ‎05-15-2014

Debugger Memory Access and Restrictions


I'm coming here looking for some insight into a puzzle.  I'm working on a solution that uses the L2 cache as RAM.  I tested out a technique and have been able to lock the cache and read and write it from the CPU.  Briefly, it consists of preparing an otherwise unused block of memory as L2 (outer) but not L1 (inner) cacheable, with Write Back and Write Allocate in the MMU table; then enabling, flushing and invalidating the L2 cache; writing a pattern to 512 KB in the prepared block (the writes land in the cache but are never passed on to the nonexistent backing memory); and then locking all 8 ways of the cache.
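In outline, the sequence looks something like the following.  This is a simplified sketch of the idea, not the exact code I'm running; the L2CC (PL310) base address and the lockdown-by-way register offsets are my assumptions from the controller documentation, and the MMU attributes for the region are assumed to be set separately in the translation table.

#include <stdint.h>
#include "xil_cache.h"   /* Xil_L2CacheEnable/Flush/Invalidate */
#include "xil_io.h"      /* Xil_Out32 */

#define L2RAM_BASE      0xF9000000U              /* section mapped outer-cacheable only */
#define L2RAM_SIZE      (512U * 1024U)           /* full 512 KB of L2 */
#define L2CC_BASE       0xF8F02000U              /* PL310 L2 cache controller (assumed) */
#define L2CC_DLOCK_WAY  (L2CC_BASE + 0x900U)     /* data lockdown by way (assumed offset) */
#define L2CC_ILOCK_WAY  (L2CC_BASE + 0x904U)     /* instruction lockdown by way (assumed offset) */

static void lock_l2_as_ram(void)
{
    volatile uint32_t *p = (volatile uint32_t *)L2RAM_BASE;
    uint32_t i;

    Xil_L2CacheEnable();
    Xil_L2CacheFlush();
    Xil_L2CacheInvalidate();

    /* Fill the region.  With outer write-back/write-allocate and no backing
     * memory, every line is allocated in L2 and never written out. */
    for (i = 0; i < L2RAM_SIZE / 4U; i++) {
        p[i] = 0xA5A5A5A5U;
    }
    __asm__ volatile ("dmb" ::: "memory");

    /* Lock all 8 ways so the filled lines can no longer be evicted or
     * replaced by new allocations. */
    Xil_Out32(L2CC_DLOCK_WAY, 0xFFU);
    Xil_Out32(L2CC_ILOCK_WAY, 0xFFU);
}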

 

But here's the problem.  The debugger can't seem to access this block of memory.  My understanding is that the debugger is closely associated with the processor, so it should be able to read and write cacheable memory through the cache.  In this case, all it would be reading or writing is the cache.  But as soon as the debugger tries to load code into that memory region it gets a memory fault.

 

There are several possible explanations:

     1. There is a table the debugger uses to determine what it can read and write, in which case I need to understand how that works so I can modify the table.

     2. The debugger jumps through some hoops to ensure cache synchronization, which may in general be a good idea, but in this case would mess up the whole thing and result in failure.  In that case, I need to understand how the debugger works so I can circumvent this activity.  Ideally it would be driven by a Tcl script that could be edited to change its behavior.

     3. It is possible I don't understand enough about the DAP and it isn't on the CPU side of the cache, in which case I may be stuck.

     4. It is also possible that the debugger has more than one way to access memory, and I need to use a different method.  I mean, I've written debugger trap files that reside in memory and communicate with a debugger using the CPU.  Certainly, if all else failed, one could pass information back and forth between the cache and the debugger through the CPU.  But again, this might require changes to the debugger.

 

I'd appreciate any insight or being pointed to any references that could address these issues.

 

Thanks,

Wilton

 

12 Replies
sadanan
Xilinx Employee
5,276 Views
Registered: ‎10-21-2010

Hi @whelm,

The debugger uses the processor core to access memory, but it also tries to do a VA-to-PA translation to check whether the address is invalid.

 

In the past we had a problem with QSPI memory locked in the L2 cache.  Would you be able to provide a test case so that we can check what the problem is?  Sorry, I have not followed some of your description very well.

whelm
Explorer
5,262 Views
Registered: ‎05-15-2014

Thanks for responding to my request.  I have attached a zip file containing the necessary elements to see the issue.  The application runs on custom hardware, but it is not necessary to run it.  The FSBL, which is the bulk of the zip, should run on whatever you have available.  I'm using a Z007S, but a Z010 or other Zynq should behave the same.  In addition to the full FSBL source, I included the QSPI ROM image (MCS), the BIF that made it, and ldscript_app.ld that linked the application; the last two show how memory and flash were configured.

 

The simplest way to see the problem is to step through the FSBL to after MakeL2RAM() and then open a memory window at 0xF9000000.  It will come up all ???? when it should be showing the 0xA5 pattern.  Also if you step into MakeL2RAM() or set a breakpoint at the end of it, the last three assembly lines read a word at 0xF9000000, showing that the CPU can indeed see the pattern written there.
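For reference, the read-back at the end of MakeL2RAM() amounts to nothing more than this kind of check (illustrative only, not the literal code):

#include <stdint.h>

/* The CPU sees the fill pattern here even while the debugger's memory
 * window shows ???? for the same address. */
static uint32_t read_back_l2ram(void)
{
    volatile uint32_t *p = (volatile uint32_t *)0xF9000000U;
    return *p;    /* expect 0xA5A5A5A5 after the 0xA5 fill */
}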

 

A more thorough test is to flash the MCS file and step to the end of the FSBL, where it hands off to the application.  It will probably be necessary to stop after BootModeRegister is loaded and change the variable to 1.  In order to get there, the board has to be configured for JTAG, but in order to complete, it has to think it's QSPI.  Once at the handoff, it is clear that the CPU is stepping through code: the PC is incrementing and branches are being taken.  But the debugger just shows illegal memory accesses.

 

The MCS file in the attached zip boots and runs just fine on my custom hardware.  It includes an RS-232 debug monitor, and I can read and write anywhere in the L2 cache from 0xF9000000 to 0xF97FFFFF.  However, if I try to run it from the debugger, it fails immediately when it tries to load the code at 0xF9000000.

The following information is background and not essential to seeing the issue.  Take or leave as much as you wish.

 

There are three places where the FSBL has been modified.

     main.c has had the DDR checking stripped out, and possibly a few other things of no interest to the project.  The function MakeL2RAM() was added at the end and is called just before boot-type selection is done.  This means that it is called even for a JTAG boot, so the L2 cache is locked down at 0xF9000000 even for the debugger to use.

     image_mover.c has been modified so that PartitionMove() checks for a destination address in the 0xF9000000 block and uses a simple CPU-based memcpy instead of the PCAP copy, which is based on DMA and can't access the cache (a sketch of this check follows the translation-table entry below).

     translation_table.S has one entry changed to specify the caching mode for the 0xF9000000 block:

/* 0xF9000000 for L2 cache as RAM  Outer Cachable, not inner cacheable, Write Back, Write Allocate */
.word   SECT + 0x15de2          /* S=b1 TEX=b101 AP=b11, Domain=b1111, C=b0, B=b0 */
.set    SECT, SECT+0x100000
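The image_mover.c change mentioned above boils down to a destination-address check along these lines.  The names here are illustrative; the real PartitionMove() arguments and flow differ:

#include <stdint.h>
#include <string.h>

#define L2RAM_BASE  0xF9000000U
#define L2RAM_SIZE  0x00080000U    /* 512 KB locked in L2 */

/* Copy a partition with the CPU when it is destined for the locked L2
 * region; return 0 so the caller can fall back to the normal PCAP/DMA
 * transfer, which bypasses the L2 cache, for all other destinations. */
static int CopyToL2RamIfNeeded(uint32_t DestAddr, const void *Src, uint32_t Len)
{
    if (DestAddr >= L2RAM_BASE && DestAddr < L2RAM_BASE + L2RAM_SIZE) {
        memcpy((void *)(uintptr_t)DestAddr, Src, Len);   /* CPU copy goes through L2 */
        return 1;   /* handled here */
    }
    return 0;       /* caller should use the DMA path instead */
}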

 

The intended memory map loads the FSBL starting at 0 (which is required for the boot ROM to launch it).  Only the vector table is loaded there.  Then there is a gap to 0x20000, where the rest of the code, followed by data, goes, ending before 0x2C000.  At 0x2C000 the translation table occupies the final 16 KB of that block of OCM; it will be used by the application as well as the FSBL.  This ordering is also driven by the fact that the ROM bootloader can only load one partition from flash.  The uninitialized data, including heap and stack, go into the final 64 KB of OCM at the top of memory.

 

The application code loads at 0xF9000000, into the L2 cache that has been prepared and locked for this purpose.  Application data goes at address 0; the vectors there are not needed by that point, and there is 128 KB of space.  Uninitialized data can extend to 0x2BFFF, since it isn't touched until the FSBL is finished.  The OCM at the top will be used for Ethernet buffers, because it is a convenient way to get a reasonably sized piece of non-cached memory for that purpose without resorting to secondary MMU tables.

 

Thanks so much for looking at this.

 

Wilton

 

sadanan
Xilinx Employee
5,248 Views
Registered: ‎10-21-2010

Hi Wilton,

 

Thanks for the additional details.  Please give us some time to look into this.  I will post any additional questions or updates here.

 

sadanan
Xilinx Employee
5,217 Views
Registered: ‎10-21-2010

Hi Wilton,

 

Would you like a temporary patch until we push a more formal fix into an SDK release?  If so, for which version of SDK would you need a patch?

 

Some details about what is causing the problem:

The debugger has some intelligence about the Zynq address map and identifies this address region as invalid, since it doesn't know that the region is locked in the L2CC.  It also has a list of cacheable memory regions.  Since this region is not in that list, the debugger doesn't use the CPU core to access it, but instead uses the AHB-AP in the DAP, which accesses physical memory directly.

whelm
Explorer
5,208 Views
Registered: ‎05-15-2014

Thanks for looking into this.  I suspected something of the sort was the case, and hoped that it would be in a configuration file somewhere so I could change it.

 

Yes, a patch would be greatly appreciated.  I'm using SDK 2018.2 (not .2.1).

 

Besides our own internal needs, I'm planning to release a document for others who might benefit from using the L2 cache in this manner.  There have been a couple of articles previously, but they have been short on explanations that could be used to adapt them.  I'm hoping to clearly cover the process that the FSBL goes through and how to adapt it to XIP, various OCM configurations, and L2 cache in whatever combination a user might want.

 

Wilton

sadanan
Xilinx Employee
5,175 Views
Registered: ‎10-21-2010

Please find attached the 2018.2 patch.  You can use it in one of the following two ways:

 

  1. Extract the contents of the zip to $XILINX_SDK (ex. C:\Xilinx\SDK\2018.2), such that the files in the zip overwrite the files in the SDK installation.
  2. Create a new directory (ex. C:\myvivado), and extract the contents of the zip to this directory. Before starting SDK/hw_server, set the environment variable MYVIVADO to point to this new directory (ex. set MYVIVADO=C:\myvivado).
whelm
Explorer
5,152 Views
Registered: ‎05-15-2014

It appears that we have taken a couple of steps backwards.  I was working on a stripped-down version that isn't actually using the L2 cache.  I have some breakpoints set as I'm tracking down another problem.  I started a debug session, clicked Run at main() and about four more times at various breakpoints.  There was a breakpoint at a function call.  I got an error pop-up which said "'Update Action Enablement' has encountered a problem.  An internal error has occurred."  The details say "An internal error has occurred.  java.lang.InterpretedException".  (Note: this doesn't appear to be fatal.  I dismissed it and could continue debugging.  I re-ran the program and it occurred at a different breakpoint, but again it did not stop me from proceeding.)

 

I also tried to start debugging the application whose code is linked to reside in the L2 cache at 0xF9000000.  I got the usual dreaded "Error while launching program: Memory write error at 0xF9000000.  AHB AP transaction error, DAP status f0000021".  I also tried debugging the FSBL, which sets up the L2 cache.  The test code that reads 0xF9000000 indeed reads back the correct value, but the Memory Monitor still shows ????????.

 

Is there anything I need to configure to tell it where it can and can't use the AHB-AP, and where it is and isn't legal to access memory?

 

Wilton

 

sadanan
Xilinx Employee
5,100 Views
Registered: ‎10-21-2010

Hi Wilton,

 

Sorry for the delayed response. I was travelling and didn't have access to the forums.

 

I'm not familiar with the first issue.  It's certainly not related to the patch I sent you.  Since you say it's not fatal, I will leave it aside for now.

 

Regarding the original issue, I missed a line of code while merging the changes to the 2018.2 branch.  Can you please try the new patch?  Sorry for the confusion.

 

Please note that you need to add a memory map entry for the reserved address space to be able to access it through the debugger.  You can do it like this from the XSCT console in SDK:

 

targets -set -filter {name =~ "APU*"} ;# select the APU target

memmap -addr 0xf9000000 0x1000 ;# change the size based on your design

 

After this, you can access this memory region from the GUI or XSCT, through the A9 target.

 

whelm
Explorer
5,091 Views
Registered: ‎05-15-2014

Thank you very much.  That appears to resolve the problem.

I presume I will need those two XSCT commands anytime I start debugging a project that needs this.

And in case anyone else is following this thread, the second one should be:

     memmap -addr 0xf9000000 -size 0x80000

I am now indeed able to see the pattern I stored in the L2 cache and jump into it.

 

I may have another problem.  I learned on another thread that I need to Run As... the FSBL, and then Debug As... the application.

(The FSBL is what sets up the MMU table and loads and locks the cache.)  I tried that; it says: Existing launch configuration 'System Debugger using Debug_WV_FSBL.elf on Local' conflicts with the newly launched configuration.  It is recommended to terminate the old configuration.  Do you wish to terminate it?  My first reaction is no, because I don't want it reconfiguring anything, but I've tried it both ways.  I have also tried issuing the two XSCT commands before starting to debug the application.  But no matter what combination I try, I end up with DAP status 00000021 because of a memory write error at 0xF9000000.  I presume it is because the memmap change got overwritten in the process of launching the application.  Unfortunately, once I get the 00000021 error, I can't look at any memory, so trying to dig deeper has been unsuccessful.  Maybe I need to put the XSCT commands in the debug configuration somehow, or uncheck some of the steps it is doing?

 

Again, Thanks,

Wilton

sadanan
Xilinx Employee
4,750 Views
Registered: ‎10-21-2010

There are two ways to debug an application.

1. Launch a debug of the app directly, without running the FSBL. The SDK debug config will initialize the PS and start the app. This is not a one-to-one match to the production flow (see below).

2. Run the FSBL, and then launch a debug of the app. However, as I mentioned in #1, the default app debug config in SDK runs the PS init script, so you'll end up initializing the PS twice (PS init is also run by the FSBL). Also, if the app debug config has the reset-system option enabled, all the initialization done by the FSBL is lost, so you have to disable these two options in the app debug config. Since you're relying on the FSBL to initialize the MMU and lock L2, these settings will be lost when you run the app (the BSP reinitializes the MMU, caches, etc.). If you don't need to lock an address range in L2 in the FSBL, I suggest you move this configuration to your application code. Otherwise, you may have to do it again in the app.

 

To run the memmap command during the debug launch, you can put it in a Tcl script and specify it under the 'Execute script' option in the Target Setup tab.

whelm
Explorer
4,593 Views
Registered: ‎05-15-2014

I have been distracted for some time, but have returned to this successfully.  I'm working on a document to help support this methodology, which I will attach once more of the dust settles.  But the short answer is that we have a working system.  It involves:

     A modified FSBL that maps 0xF900_0000 in the MMU as L2 (outer) cacheable, write-back, then fills it with a pattern and locks it.  It also eliminates the DDR tests and the fatal exit on failure.  It modifies the code transfer to use a software loop rather than DMA, since DMA doesn't go through the L2 cache.  And finally, it removes the cache disables from the exit code.

     A modified startup file in the application that doesn't mess with the L2 cache.

     The most recent SDK debugger patch you sent.

     Unchecking "reset CPU" in the debug configuration.

     Replacing the ps7_init handling in the debugger configuration with a script that runs ps7_init, loads and runs the FSBL, and opens the memory region as you described, before allowing the normal application debug flow to load and start the application.

I did not find the "Execute Script" location, but by usurping ps7_init, I was able to create a one-step debug environment that worked.

Moving the setup code to the application was not an option, because the code is what lives in the cache.  In a debug environment, the debugger has to be able to write it there.  In a production environment, FSBL writes it there.  So it must be done before Boot.S.

One final question for you.  At some point there's going to be a 2018.3 or 2019.1 release.  Will this patch be mainstreamed into it?  If not, will the patch work with that release?  We have no need of the supplementary releases, but we would probably switch to the next main release when it is available and will need to be able to continue using this feature.

Thanks,

Wilton

P.S.: Interestingly, I found out the hard way that the cache memory must be accessed through the A9 target.  Accessing it through the APU target fails; presumably the APU target looks at it from the bus perspective rather than the CPU perspective, whereas other memory regions can be accessed either way.

sadanan
Xilinx Employee
4,577 Views
Registered: ‎10-21-2010

Hi Wilton,

Yes, this fix is available in 2018.3 and beyond.

Regarding memory accesses through the APU vs. the A9: the APU target always accesses physical memory through a dedicated AHB/AXI interface, so you will not be able to access the caches through the APU.
