Visitor netas_fpga1
8,045 Views
Registered: ‎12-29-2015

Zynq PS EMAC Ethernet (bare-metal) problem about performance

Hi Guys,

I am using a ZedBoard. I found some useful code on the Xilinx forum (thanks to "sysseon"):

"https://forums.xilinx.com/t5/Embedded-Development-Tools/Zynq-Ethernet-driver-Sending-several-BdRings/td-p/450544?db=5"

I modified the code for both receive and transmit. My ZedBoard is connected to a PC via an Ethernet cable, and I am running Wireshark on the PC.

While transmitting in an endless loop, I tried to measure the performance and noticed a problem:

 

Wireshark shows a packet arriving every 13 us. Each packet is 1514 bytes, so the rate is about (1514*8)/13 = 931 Mbit/s, which is fine. But occasionally there is a gap of around 0.8 s before the next packet arrives. When I calculate the average, this brings it down to 4 Mbit/s, which is very poor.
I tried many things but could not find a proper solution. However, I noticed that if I add dummy code on the sending side (a loop or a printf that slows the code down), my Ethernet throughput rises :) to about 60 Mbit/s. Any suggestions?
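
As a sanity check on the figures above, here is a minimal sketch of the arithmetic in C. The function names are hypothetical helpers, not part of any Xilinx library. The 24-byte per-frame overhead assumes standard Ethernet framing: 4 bytes FCS, 8 bytes preamble and SFD, and a 12-byte minimum inter-frame gap, none of which are counted in the 1514 bytes that Wireshark reports.

```c
#include <assert.h>

/* Bytes each frame occupies on the wire beyond the captured 1514 bytes:
 * 4 (FCS) + 8 (preamble and SFD) + 12 (minimum inter-frame gap). */
#define WIRE_OVERHEAD_BYTES 24.0

/* Throughput in Mbit/s when one frame of frame_bytes arrives every gap_us.
 * Bits per microsecond is numerically the same as Mbit/s. */
double rate_mbps(double frame_bytes, double gap_us)
{
    assert(gap_us > 0.0);
    return frame_bytes * 8.0 / gap_us;
}

/* Shortest possible frame-to-frame spacing (us) on a link of link_mbps. */
double min_gap_us(double frame_bytes, double link_mbps)
{
    assert(link_mbps > 0.0);
    return (frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0 / link_mbps;
}

/* Average throughput (Mbit/s) when a burst of n_frames spaced gap_us apart
 * is followed by a single stall of stall_us. */
double avg_mbps(double frame_bytes, double n_frames, double gap_us,
                double stall_us)
{
    double total_us = n_frames * gap_us + stall_us;
    assert(total_us > 0.0);
    return n_frames * frame_bytes * 8.0 / total_us;
}
```

For example, rate_mbps(1514, 13) gives about 931.7 and min_gap_us(1514, 1000) gives 12.304, so a 13 us spacing is already close to the line rate of a 1 Gbit/s link. avg_mbps shows how a single long stall after a short burst dominates the average; the burst length to plug in is not stated in the post.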

 

 

P.S. I have already read UG585 Chapter 16 (Gigabit Ethernet Controller) and Appendix B, and I have searched the forums and the web.

        

3 Replies
Xilinx Employee
8,019 Views
Registered: ‎08-02-2007

Re: Zynq PS EMAC Ethernet (bare-metal) problem about performance

hi,

 

It looks like you are trying to evaluate the performance of the GEM in bare-metal mode. In that case, can you use the lwIP library?

 

The bare-metal performance numbers for the GEM are given at the following link:

http://www.wiki.xilinx.com/Zynq-7000+AP+SoC+Performance+%E2%80%93+Gigabit+Ethernet+achieving+the+best+performance

 

--hs

Visitor netas_fpga1
7,986 Views
Registered: ‎12-29-2015

Re: Zynq PS EMAC Ethernet (bare-metal) problem about performance

Hi,

 

I need to send raw Ethernet frames, so I don't want to use the lwIP stack (although I did open and examine its code).
What I actually need is pseudocode for a state machine (FSM) that shows how to manage the GEM efficiently; I don't think the documentation is sufficient. (The lwIP stack and Emac_example_intr_DMA take slightly different approaches.) My code is derived from Emac_example_intr_DMA. You can see the problem packet in the attachment.

 

Here are the pieces of my code related to the EMAC:

 

#define RXBD_CNT 32 /* Number of RxBDs to use */
#define TXBD_CNT 32 /* Number of TxBDs to use */

 

int main()
{
   init_platform();
   int Status;

   //init
   Status = EmacPsDmaIntr(&IntcInstance, &EmacPsInstance, EMACPS_DEVICE_ID, EMACPS_IRPT_INTR);
   if (Status != XST_SUCCESS)
      xil_printf("DMA init error \r\n");

   init_emac_protocol_buffers();
   fill_emac_tx_buffer_for_test();

   XEmacPs_Start(&EmacPsInstance);

   while (TRUE)
      send_emac_tx_buffer_for_test(&EmacPsInstance);
}

 

 

 

int send_emac_tx_buffer_for_test(XEmacPs *EmacPsInstancePtr)
{

   int TxFrameLength;
   unsigned long int bd_index, curr_wr_index;
   EthernetFrame buffer_payload_ptr;
   int Status;
   XEmacPs_Bd *BdTxPtr;
   XEmacPs_Bd *Bd_Aux;

   /////// Calculate the frame length
   TxFrameLength = XEMACPS_HDR_SIZE + PAYLOAD;

   BdTxPtr = MyBdPtr;

   Status = XEmacPs_BdRingAlloc(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, &BdTxPtr);
   if (Status != XST_SUCCESS)
      printf("Error allocating TxBD");

   // Setup 16 BDs
   Bd_Aux = BdTxPtr;


   for (bd_index = 0; bd_index < Num_Free_Bd; bd_index++)
   {

      EthernetFrame *buffer_payload_ptr;
      curr_wr_index = get_emac_curr_empty_tx_buffer();
      buffer_payload_ptr =(EthernetFrame *)(&emac_tx_protocol_buffers[curr_wr_index][0]);

      //add timestamp to package
      *((u32 *)(&emac_tx_protocol_buffers[curr_wr_index][24])) =
            Xil_EndianSwap32(Xil_In32(GLOBAL_TMR_BASEADDR + GTIMER_COUNTER_LOWER_OFFSET));
      //adding code to waste time is increasing performance
      Xil_DCacheFlushRange((u32)buffer_payload_ptr, TxFrameLength);

      XEmacPs_BdSetAddressTx(Bd_Aux, buffer_payload_ptr);
      XEmacPs_BdSetLength(Bd_Aux, TxFrameLength);
      XEmacPs_BdClearTxUsed(Bd_Aux);
      XEmacPs_BdSetLast(Bd_Aux);
      Bd_Aux = XEmacPs_BdRingNext(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Bd_Aux);
   }


   xil_printf("write something");//adding code to waste time is increasing performance

   ///////Enqueue to HW
   Status = XEmacPs_BdRingToHw(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, BdTxPtr);
   if (Status != XST_SUCCESS)
      printf("Error committing TxBD to HW");

   XEmacPs_Transmit(EmacPsInstancePtr);

   do {
      ///x2 instead of doing in interrupt.
      Num_Free_Bd = XEmacPs_BdRingFromHwTx(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), 32, &BdTxPtr);
   } while (Num_Free_Bd == 0);

   XEmacPs_BdRingFree(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, BdTxPtr);

   return XST_SUCCESS;
}

 

In summary, my send pseudocode looks like this.
It runs repeatedly inside the while loop:

 

XEmacPs_BdRingAlloc

for (bd_index = 0; bd_index < Num_Free_Bd; bd_index++)
{
   buffer_payload_ptr = address of the next TX buffer;
   XEmacPs_BdClearTxUsed
   XEmacPs_BdRingNext
}

 

XEmacPs_BdRingToHw
XEmacPs_Transmit
Num_Free_Bd=XEmacPs_BdRingFromHwTx
XEmacPs_BdRingFree
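
The sequence above can be pictured as BDs moving between four groups: free, being prepared by software, owned by hardware, and completed but not yet freed. The following is a minimal sketch of that bookkeeping only. RingModel and the ring_* functions are hypothetical names invented for illustration; this is not the XEmacPs driver and does not touch any hardware. The intended mapping is ring_alloc for XEmacPs_BdRingAlloc, ring_to_hw for XEmacPs_BdRingToHw, ring_from_hw for XEmacPs_BdRingFromHwTx, and ring_free for XEmacPs_BdRingFree.

```c
#include <assert.h>

/* Hypothetical, simplified model: every BD is in exactly one group. */
typedef struct {
    unsigned total;     /* number of BDs in the ring                      */
    unsigned free_cnt;  /* available for allocation                       */
    unsigned pre_cnt;   /* allocated, being filled in by software         */
    unsigned hw_cnt;    /* handed to the controller, possibly in flight   */
    unsigned post_cnt;  /* finished by hardware, waiting to be freed      */
} RingModel;

void ring_init(RingModel *r, unsigned total)
{
    r->total = total;
    r->free_cnt = total;
    r->pre_cnt = r->hw_cnt = r->post_cnt = 0;
}

/* free -> pre. Returns -1 if fewer than n BDs are free. */
int ring_alloc(RingModel *r, unsigned n)
{
    if (n > r->free_cnt) return -1;
    r->free_cnt -= n;
    r->pre_cnt += n;
    return 0;
}

/* pre -> hw. Returns -1 if fewer than n BDs were allocated. */
int ring_to_hw(RingModel *r, unsigned n)
{
    if (n > r->pre_cnt) return -1;
    r->pre_cnt -= n;
    r->hw_cnt += n;
    return 0;
}

/* hw -> post. hw_done is how many in-flight BDs the hardware has finished;
 * limit caps how many are collected in one call. Returns the number moved. */
unsigned ring_from_hw(RingModel *r, unsigned hw_done, unsigned limit)
{
    unsigned n = hw_done;
    if (n > r->hw_cnt) n = r->hw_cnt;
    if (n > limit) n = limit;
    r->hw_cnt -= n;
    r->post_cnt += n;
    return n;
}

/* post -> free. Returns -1 if fewer than n BDs are waiting to be freed. */
int ring_free(RingModel *r, unsigned n)
{
    if (n > r->post_cnt) return -1;
    r->post_cnt -= n;
    r->free_cnt += n;
    return 0;
}

/* Invariant: the four groups always add up to the ring size. */
void ring_check(const RingModel *r)
{
    assert(r->free_cnt + r->pre_cnt + r->hw_cnt + r->post_cnt == r->total);
}
```

In this model, a request to allocate more BDs than are currently free fails, and BDs that the hardware has not finished stay in the hardware group until a later ring_from_hw call collects them. That is why the count returned by one collection pass is not necessarily the count that was submitted.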

 

 

 

 

 

log1.JPG
Visitor netas_fpga1
7,854 Views
Registered: ‎12-29-2015

Re: Zynq PS EMAC Ethernet (bare-metal) problem about performance

Hi Guys,

I finally reached 570 Mbit/s. I was able to send 6.6M packets (1514 bytes each) without loss.

 

But I still have some questions.

I added a delay after my send function (the whole process) and another after XEmacPs_BdRingFree. The delays did not change the performance (still 570 Mbit/s), but if I remove them, the code skips some packets while sending.

Here is the relevant part of my code:

 

while (TRUE)
{
   send_emac_tx_buffer_for_test(&EmacPsInstance);
   EmacPs_update_stats(&EmacPsInstance);
   wait_global_timer_based(wait_value); //delay ~200uS
}

 

void send_emac_tx_buffer_for_test(XEmacPs *EmacPsInstancePtr)
{
   Num_Free_Bd = XEmacPs_BdRingGetFreeCnt(&(XEmacPs_GetTxRing(EmacPsInstancePtr)));

   XEmacPs_BdRingAlloc(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, &BdTxPtr);

   Bd_Aux = BdTxPtr;
   for (bd_index = 0; bd_index < Num_Free_Bd; bd_index++)
   {
      curr_wr_index = get_emac_curr_empty_tx_buffer();

      buffer_payload_ptr = (EthernetFrame *)(&emac_tx_protocol_buffers[curr_wr_index][0]);
      EmacPsUtilFrameHdrIndex(buffer_payload_ptr, Counter_Index++);
      EmacPsUtilFrameHdrTimestamp(buffer_payload_ptr, Timestamp);

      XEmacPs_BdSetAddressTx(Bd_Aux, buffer_payload_ptr);
      XEmacPs_BdSetLength(Bd_Aux, TxFrameLength);
      XEmacPs_BdClearTxUsed(Bd_Aux);
      XEmacPs_BdSetLast(Bd_Aux);
      Bd_Aux = XEmacPs_BdRingNext(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Bd_Aux);

      Xil_DCacheFlushRange((u32)buffer_payload_ptr, 32U);
   }

   XEmacPs_BdRingToHw(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, BdTxPtr);

   XEmacPs_Transmit(EmacPsInstancePtr);

   do
   {
      Num_Free_Bd = XEmacPs_BdRingFromHwTxNETAS(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), (XEMACPS_SEND_BD_CNT/2), &BdTxPtr);
      total_free_bd += Num_Free_Bd;
      Status = XEmacPs_BdRingFree(&(XEmacPs_GetTxRing(EmacPsInstancePtr)), Num_Free_Bd, BdTxPtr);
      wait_global_timer_based(wait_value2); //delay ~100uS
      dsb();
   } while (total_free_bd < (XEMACPS_SEND_BD_CNT/2) && Status != XST_SUCCESS);
}

 

Does anyone have an idea?
Has anyone achieved a higher speed?

Are these delays needed? If so, why?
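
One thing that may be worth checking is the do/while condition in the code above. It joins the two tests with &&, so the loop repeats only while too few BDs have been reclaimed and the last free call did not return XST_SUCCESS. The sketch below isolates that logic in plain C. The function names are hypothetical, STATUS_OK stands in for XST_SUCCESS (defined as 0 in xstatus.h), and the second function is only an assumption about what the loop was meant to do.

```c
#include <assert.h>
#include <stdbool.h>

#define STATUS_OK 0  /* stand-in for XST_SUCCESS */

/* Condition as posted: repeat only while too few BDs were reclaimed
 * AND the last free call reported a failure. */
bool keep_polling_posted(unsigned reclaimed, unsigned target, int status)
{
    return reclaimed < target && status != STATUS_OK;
}

/* Assumed intent: repeat until enough BDs were reclaimed, and stop
 * early if the free call fails. */
bool keep_polling_until_target(unsigned reclaimed, unsigned target, int status)
{
    return reclaimed < target && status == STATUS_OK;
}
```

With the posted condition, a successful free call ends the loop after a single pass even if no BDs were reclaimed. In that case the fixed delay inside the loop, rather than the polling, would be what gives the hardware time to finish. This is only one possible explanation for why removing the delays causes skipped packets.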

 

 

istatistik.png