The Zynq PS/PL, Part Seven: Adam Taylor’s MicroZed Chronicles Part 27

by Xilinx Employee ‎04-11-2014 12:20 PM - edited ‎04-11-2014 01:30 PM (115,441 Views)

This latest instalment of Adam Taylor's blog shows the result of a fixed-point math function implementation in the ARM-based Zynq SoC's programmable logic.

 

Having looked in previous posts of the MicroZed Chronicles series at how we can implement fixed-point mathematics within the PL (programmable logic) side of the Zynq SoC, we now focus on implementing these functions within a system, and we will see the rather surprising results of doing so.

 

Before we get to cutting code, we need to determine the scaling factors (the location of the binary point) that we will use in this specific implementation. In this example, the input signal will range between 0 and 10, so we can pack four integer bits and twelve fractional bits into a 16-bit input vector.
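As a rough illustration of this input format, the C sketch below converts a real value to the 16-bit word and back. The helper names are hypothetical and are not part of the peripheral or its driver; only the 4-bit/12-bit split comes from the text above.

```c
#include <stdint.h>

/* Fractional bits in the 16-bit input format described above
   (4 integer bits, 12 fractional bits). */
#define INPUT_FRAC_BITS 12

/* Hypothetical helper: convert a real value in the range 0..10 to the
   16-bit fixed-point input word, rounding to the nearest step. */
uint16_t to_input_fixed(double x)
{
    return (uint16_t)(x * (double)(1u << INPUT_FRAC_BITS) + 0.5);
}

/* Hypothetical helper: convert the fixed-point word back to a real value. */
double from_input_fixed(uint16_t raw)
{
    return (double)raw / (double)(1u << INPUT_FRAC_BITS);
}
```

With twelve fractional bits, one step is 1/4096, and the largest input of 10 maps to 40960, which still fits in 16 bits.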

 

Figure 1.gif

 

We are implementing the above equation, which has three constants A, B, and C:

 

A = -0.0088
B = 1.7673
C = 131.29

 

We need to scale these constants in our implementation. The beauty of doing this in an FPGA is that we can scale each constant differently to optimize performance, as in this table:

 

 Figure 2.gif

 

 

As we implement the above equation, we will need to consider the expansion of the resultant vectors, which for the terms Ax^2 and Bx is defined below:

 

 Figure 3.gif

 

To perform the final addition with constant C, we need to have the binary points aligned. Therefore, we need to divide the results of Ax^2 and Bx by a power of two to align their binary points with that of C. The result will also use this format, which is 8,8 (eight integer bits and eight fractional bits).

 

Figure 4.gif 
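The alignment step can be modelled in software. The sketch below is a C model only, not the VHDL used in the peripheral. The scaling table is not reproduced in this text, so the scalings chosen here for A (2^16) and B (2^14) are assumptions made purely for illustration. The 12 fractional bits on the input and the 8,8 format for C and the result follow the article.

```c
#include <stdint.h>

/* Fractional bits per quantity. X and OUT follow the article;
   A_FRAC_BITS and B_FRAC_BITS are assumptions made for this sketch. */
#define X_FRAC_BITS    12
#define A_FRAC_BITS    16   /* assumed scaling for A */
#define B_FRAC_BITS    14   /* assumed scaling for B */
#define OUT_FRAC_BITS   8   /* 8,8 result format */

#define A_FIXED  (-577)     /* -0.0088 * 2^16 = -576.7, rounded */
#define B_FIXED  28955      /*  1.7673 * 2^14 = 28955.4, rounded */
#define C_FIXED  33610      /*  131.29 * 2^8  = 33610.2, rounded */

/* Evaluate Ax^2 + Bx + C on a fixed-point input and return the
   result in 8,8 format. */
int32_t poly_fixed(uint16_t x)
{
    int64_t x2  = (int64_t)x * x;        /* 24 fractional bits */
    int64_t ax2 = A_FIXED * x2;          /* 16 + 24 = 40 fractional bits */
    int64_t bx  = (int64_t)B_FIXED * x;  /* 14 + 12 = 26 fractional bits */

    /* Divide by a power of two so both terms have 8 fractional bits.
       C division truncates toward zero; a hardware shift would floor. */
    ax2 /= (int64_t)1 << (A_FRAC_BITS + 2 * X_FRAC_BITS - OUT_FRAC_BITS);
    bx  /= (int64_t)1 << (B_FRAC_BITS + X_FRAC_BITS - OUT_FRAC_BITS);

    return (int32_t)(ax2 + bx + C_FIXED);
}
```

Under these assumed scalings, an input of 0 returns 33610 (the constant C alone), and an input of 10 (40960 in the input format) returns 37909, which is about 148.08 when divided by 2^8.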

 

Having calculated the above, we are ready to implement the design within the Vivado peripheral that we created in previous installments. The first implementation step is to open up the block diagram view within Vivado, right click on the peripheral, and select “Edit in IP Packager”. Once the IP Packager opens, we can easily implement within the top-level file a simple process that performs the calculation over a number of clock cycles (five clocks in this example, although you could optimize this further).

 

 Figure 5.gif

 

 

Now we can re-package and rebuild the project within Vivado (remember to update the version number) before exporting the updated hardware to SDK.

 

Once we are within SDK we can use the same approach as before with the exception of using a fixed-point number system now instead of the floating-point system used in the earlier example:

 

 

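  for (i = 0; i < 2560; i = i + 25) {
       /* Reload the private timer and capture its starting count */
       XScuTimer_LoadTimer(&Timer, TIMER_LOAD_VALUE);
       timer_start = XScuTimer_GetCounterValue(&Timer);
       XScuTimer_Start(&Timer);

       /* Write the input to the peripheral (register offset 4),
          then read the result back (register offset 12) */
       ADAMS_PERIHPERAL_mWriteReg(Adam_Low, 4, i);
       result = ADAMS_PERIHPERAL_mReadReg(Adam_Low, 12);

       /* Stop the timer and capture the final count */
       XScuTimer_Stop(&Timer);
       timer_value = XScuTimer_GetCounterValue(&Timer);

       printf("%d,%lu,%lu,%lu, \n\r", i, result, timer_start, timer_value);
  }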

 

 

When we build the above code and run it on the MicroZed board, we see the following result output over the serial link. The result of 33610 equals 131.289 when divided by 2^8, which is correct and in line with the floating-point calculation (see part 5 of this blog, The Zynq PS/PL, Part Five: Adam Taylor’s MicroZed Chronicles Part 25).

 

 

 Figure 6.gif
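To check a raw value read back from the peripheral, we can undo the 8,8 scaling in software. This is a minimal sketch; the helper name is hypothetical and is not part of the generated driver.

```c
#include <stdint.h>

/* The result register uses the 8,8 format: 8 fractional bits. */
#define RESULT_FRAC_BITS 8

/* Hypothetical helper: convert the raw 8,8 register value to a real number. */
double result_to_real(uint32_t raw)
{
    return (double)raw / (double)(1u << RESULT_FRAC_BITS);
}
```

Passing 33610 gives 131.2890625, which is the 131.289 quoted above.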

 

 

Although the numeric result is the same, the big difference is the time it takes to perform the calculation. The peripheral design needs only 5 clocks for the actual computation, yet generating the result consumes 140 clocks, or 420 ns, versus 25 CPU clocks using the ARM Cortex-A9 processor on the PS side of the Zynq SoC.
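The elapsed time can be worked out from the two counter values printed by the test loop. The sketch below assumes a timer clock of about 333 MHz, which is inferred from the figures above (140 clocks in 420 ns is 3 ns per clock) rather than read from the hardware configuration. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed timer clock: 140 ticks = 420 ns implies 3 ns per tick,
   i.e. roughly 333 MHz. Adjust to match the actual configuration. */
#define TIMER_CLOCK_HZ 333333333ULL

/* The SCU private timer counts down, so the elapsed tick count is
   the starting value minus the final value. */
uint32_t elapsed_ticks(uint32_t timer_start, uint32_t timer_value)
{
    return timer_start - timer_value;
}

/* Convert a tick count to nanoseconds using the assumed timer clock. */
uint64_t ticks_to_ns(uint32_t ticks)
{
    return ((uint64_t)ticks * 1000000000ULL) / TIMER_CLOCK_HZ;
}
```

With these assumptions, a difference of 140 ticks converts to 420 ns, matching the measurement.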

 

Why the discrepancy? Shouldn’t the programmable logic be faster?

 

This is a lesson in peripheral I/O overhead. When using the PL side, we must take into account the latency of the AXI bus and the AXI bus frequency, which in this application is 142.8 MHz (150 MHz was requested). The AXI bus overhead accounts for the longer-than-expected computation time. However, all is not lost. We’re just doing it wrong. Offloading tasks to the Zynq SoC’s PL is not intended to be used in this manner, precisely because of this I/O overhead.

 

If we were to take a more reasonable approach, we would send a block of inputs requiring calculation to our peripheral using DMA, as I explained in part 1 of this blog series on PS/PL interfacing. This example establishes why DMA is so important, and in the next blog I will explore how we use this experimental result.

 

 

Please see the previous entries in this MicroZed series by Adam Taylor:

 

The Zynq PS/PL, Part Six: Adam Taylor’s MicroZed Chronicles Part 26

 

The Zynq PS/PL, Part Five: Adam Taylor’s MicroZed Chronicles Part 25

 

The Zynq PS/PL, Part Four: Adam Taylor’s MicroZed Chronicles Part 24

 

The Zynq PS/PL, Part Three: Adam Taylor’s MicroZed Chronicles Part 23

 

The Zynq PS/PL, Part Two: Adam Taylor’s MicroZed Chronicles Part 22

 

The Zynq PS/PL, Part One: Adam Taylor’s MicroZed Chronicles Part 21

 

Introduction to the Zynq Triple Timer Counter Part Four: Adam Taylor’s MicroZed Chronicles Part 20

 

Introduction to the Zynq Triple Timer Counter Part Three: Adam Taylor’s MicroZed Chronicles Part 19

 

Introduction to the Zynq Triple Timer Counter Part Two: Adam Taylor’s MicroZed Chronicles Part 18

 

Introduction to the Zynq Triple Timer Counter Part One: Adam Taylor’s MicroZed Chronicles Part 17

 

The Zynq SoC’s Private Watchdog: Adam Taylor’s MicroZed Chronicles Part 16

 

Implementing the Zynq SoC’s Private Timer: Adam Taylor’s MicroZed Chronicles Part 15

 

MicroZed Timers, Clocks and Watchdogs: Adam Taylor’s MicroZed Chronicles Part 14

 

More About MicroZed Interrupts: Adam Taylor’s MicroZed Chronicles Part 13

 

MicroZed Interrupts: Adam Taylor’s MicroZed Chronicles Part 12

 

Using the MicroZed Button for Input: Adam Taylor’s MicroZed Chronicles Part 11

 

Driving the Zynq SoC's GPIO: Adam Taylor’s MicroZed Chronicles Part 10

 

Meet the Zynq MIO: Adam Taylor’s MicroZed Chronicles Part 9

 

MicroZed XADC Software: Adam Taylor’s MicroZed Chronicles Part 8

 

Getting the XADC Running on the MicroZed: Adam Taylor’s MicroZed Chronicles Part 7

 

A Boot Loader for MicroZed. Adam Taylor’s MicroZed Chronicles, Part 6 

 

Figuring out the MicroZed Boot Loader – Adam Taylor’s MicroZed Chronicles, Part 5

 

Running your programs on the MicroZed – Adam Taylor’s MicroZed Chronicles, Part 4

 

Zynq and MicroZed say “Hello World”-- Adam Taylor’s MicroZed Chronicles, Part 3

 

Adam Taylor’s MicroZed Chronicles: Setting the SW Scene

 

Bringing up the Avnet MicroZed with Vivado

 

Comments
by Participant zwm215
on ‎07-15-2014 11:21 AM

I am a little bit confused with this. According to the code on the PS side, we are writing to reg1, which you defined as an OUT register. We are also reading from reg3, which is defined as an IN register. Should these be switched? I do not see how it makes sense for a write command on the PS side to write to a register that is meant to hold data to be sent out.

 

Thanks for your help,

 

Zach

by Observer taylo_ap
on ‎07-15-2014 01:05 PM

Zach

 

I can see where the confusion may arise; the registers are the inverse of what one might expect. The file Adams_Perihperal_v1_0_S00_AXI.vhd is the one that contains the AXI interface and the registers that are read and written over AXI.

 

To use these registers within the file Adams_Perihperal_v1_0.vhd, I needed to extract the lower three registers so that I can do my mathematics on them and then write the output back via the fourth and final register.

 

I hope this helps

 

Adam 

by Participant zwm215
on ‎07-15-2014 01:20 PM

Adam

 

I will see if it works. While attempting to debug the code, I would get to the MY_PERIPHERAL_mWriteReg(My_BaseAddress, 4, i); line and it would run that forever, I believe because it is unable to write to that register. In your block diagram it was hard to see what was connected to what, but I found something online that showed the connections, although it also included a processor system reset.

 

This is the picture of my block diagram. If you have any suggestions please let me know.

PS_PL with my peripheral.png

 

Thanks,

 

Zach

by Participant zwm215
on ‎07-15-2014 01:26 PM

Adam, thanks for the clarification. While debugging my program on the PS side, I get to MY_PERIPHERAL_mWriteReg(My_BaseAddress, 4, i); and it never gets past this line. One problem might be that my block diagram is incorrect. The one that you posted does not completely show all the connections.

 

Here is mine. Not sure if the processor system reset is necessary, but it got rid of all the validation errors.

PS_PL with my peripheral.png

 

Thanks

 

Zach

by Observer taylo_ap
on ‎07-15-2014 01:29 PM

Hi Zach 

 

That looks like it is OK, and my later examples have the reset block in them. Let me know if you have any issues.

 

Ad

by Participant zwm215
on ‎07-15-2014 01:38 PM

Sorry for the double post.

 

When I try to implement my design, I get this error:

 

Design will not pass DRC check. Router will ignore one driver
CRITICAL WARNING: [Route 35-14] Multi-driver net design_1_i/My_Peripheral_0/U0/My_Peripheral_v1_0_S00_AXI_inst/Q[8] detected. Design will not pass DRC check. Router will ignore one driver

 

Any ideas on what might have caused this?

 

Zach

by Participant zwm215
on ‎07-15-2014 02:10 PM

Problem fixed. Works perfectly!

 

While changing the VHDL code something must have gone wrong that never fixed itself, so I just copied your code back in and now it works.

 

Thanks Adam

About the Author
  • Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He's served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report. He has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.