The data on a PIPE interface is scrambled at Gen3 speed. When debugging PCIe issues, it is helpful to be able to look at the packets on the PCIe link.
To do so, users need a Protocol Link Analyzer. Because of the high cost, few users have access to such a piece of equipment. The packet analysis tool provided with a Protocol Link Analyzer is very extensive and allows for in-depth analysis of the link traffic.
The Xilinx® UltraScale+ Devices Integrated Block for PCI Express® Gen3 IP has a feature that allows you to integrate a descrambler module to descramble the scrambled data on the PIPE interface. This does not provide the same amount of analytic data as a Protocol Link Analyzer does, but it can nevertheless be helpful in identifying a potential issue and in most cases can help in tracking the root cause of the issue.
This blog will provide details on how to analyze the output data from the descrambler module by identifying different types of PCIe packets that are coming from the link and going into the PCIe IP.
The descrambler module is enabled in the PCIe IP configuration GUI as follows:
The descrambler module is supported only in Gen3 mode.
If the checkbox is grayed out, make sure that the link speed in the 'Basic' tab of the configuration GUI is set to 8.0 GT/s. If the option is not available, set 'Mode' to 'Advanced' in the Basic tab.
In order to track the start of a valid PCIe packet on the PIPE interface, the interface provides two signals: *_sync_header and *_start_block.
To confirm whether the data on rx_data is a valid packet or not, check the following:
*_data_valid is asserted
*_start_block is asserted
*_sync_header is either '1' or '2'.
If the value is '1', it indicates the start of an Ordered Set.
If the value is '2', it indicates the start of a TLP or a DLLP packet.
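The checks above can be sketched as a small helper for post-processing a captured waveform (for example, an exported ILA or simulation trace). This is a minimal sketch: the function and enum names are hypothetical, and it assumes the three signals have already been sampled as integers on the same clock cycle.

```python
from enum import Enum


class BlockKind(Enum):
    NOT_A_BLOCK_START = "no valid block start on this cycle"
    ORDERED_SET = "ordered set"
    DATA_BLOCK = "data block (TLP or DLLP)"
    UNEXPECTED = "unexpected sync header value"


def classify_cycle(data_valid: int, start_block: int, sync_header: int) -> BlockKind:
    """Classify one sampled cycle using *_data_valid, *_start_block and *_sync_header."""
    # Both qualifiers must be asserted before the sync header is meaningful.
    if not (data_valid and start_block):
        return BlockKind.NOT_A_BLOCK_START
    if sync_header == 1:
        return BlockKind.ORDERED_SET
    if sync_header == 2:
        return BlockKind.DATA_BLOCK
    return BlockKind.UNEXPECTED


if __name__ == "__main__":
    print(classify_cycle(1, 1, 1).value)
    print(classify_cycle(1, 1, 2).value)
```

Filtering a capture down to the cycles that are not `NOT_A_BLOCK_START` is usually the quickest way to find where packets begin.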
Descrambled data analysis can also be done in simulation.
The waveform below is from the Gen3 example design simulation that is generated along with the IP.
The very first packet after exiting from the 'Recovery.Speed' LTSSM state is EIEOS (Electrical Idle Exit Ordered Set).
FF00FF00 shown in the above waveform is EIEOS.
The very first packet on the descrambled data signal will be EIEOS.
As stated earlier, *_start_block must be asserted and the *_sync_header signal should be '1' as shown in the waveform below.
Once all of the equalization states are complete, just before going into L0 state, you should see 555555E1.
This is the SDS (Start of Data Stream) ordered set. Once you see an SDS, the exchange of ordered sets has finished. The next packets on the interface carry the initial flow control credits, i.e. you should see DLLPs on the interface.
The waveform below shows a DLLP packet. A DLLP packet starts with an SDP (Start of DLLP Packet - ACF0). Data will be distributed across the lanes in the case of a multi-lane design.
InitFC1-P (Initial Flow Control Credit for Posted Data) starts with '40'. The capture below is from the LeCroy Analyzer depicting the identifier for InitFC1-P.
In the waveform below, the DLLP packet is InitFC1-P.
There are four types of DLLP packet formats:
Ack or Nak DLLP Packet Format
Power Management DLLP Packet Format
Flow Control DLLP Packet Format
Vendor Specific DLLP Packet Format
Each DLLP packet is 6 symbols in length. Refer to the PCI Express specification for information on decoding the DLLP packet content. In the waveform below, '60' is InitFC1-Cpl (Cpl for Completion) and '50' is InitFC1-NP (NP for Non-Posted).
UpdateFC-P starts with '80'. The credit values in the descrambled data are in hexadecimal, so they need to be converted into decimal to get the exact number of credits available.
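The hexadecimal-to-decimal conversion of the credit fields can be sketched as follows. The type codes ('40', '50', '60', '80') come from the text above; the bit positions of the VC ID, HdrFC and DataFC fields are an assumption based on the Flow Control DLLP format in the PCI Express Base Specification and should be checked against the spec revision you are using. The function name and the example bytes are hypothetical.

```python
# Type codes mentioned in the text (upper five bits of the first DLLP byte).
FC_DLLP_TYPES = {
    0x40: "InitFC1-P",
    0x50: "InitFC1-NP",
    0x60: "InitFC1-Cpl",
    0x80: "UpdateFC-P",
}


def decode_fc_dllp(dllp: bytes) -> dict:
    """Decode the first 4 bytes of a flow control DLLP (the bytes after the SDP token).

    Assumed layout (verify against the PCIe spec):
      byte 0: type in bits 7:3, VC ID in bits 2:0
      byte 1: HdrFC[7:2] in bits 5:0
      byte 2: HdrFC[1:0] in bits 7:6, DataFC[11:8] in bits 3:0
      byte 3: DataFC[7:0]
    """
    if len(dllp) < 4:
        raise ValueError("need at least 4 DLLP bytes")
    type_code = dllp[0] & 0xF8
    hdr_fc = ((dllp[1] & 0x3F) << 2) | (dllp[2] >> 6)
    data_fc = ((dllp[2] & 0x0F) << 8) | dllp[3]
    return {
        "type": FC_DLLP_TYPES.get(type_code, f"other (0x{dllp[0]:02X})"),
        "vc_id": dllp[0] & 0x07,
        "hdr_credits": hdr_fc,    # already an int, so it prints in decimal
        "data_credits": data_fc,
    }


if __name__ == "__main__":
    # Illustrative bytes only, not taken from a real capture.
    print(decode_fc_dllp(bytes([0x40, 0x08, 0x00, 0x40])))
```

The remaining two DLLP bytes hold the CRC and are not needed to read the credit values.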
Ordered sets will always come per lane. Every lane will have its own ordered set. DLLPs and TLPs are striped across the lanes; it will be one byte per lane.
A DLLP starts in lane-0, lane-4 or lane-8 only, i.e. F0 can only be on lane 0, 4 or 8. TLPs can start on any lane.
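To read a striped DLLP or TLP from per-lane captures, the bytes have to be interleaved back into a single stream. The sketch below assumes each lane's descrambled symbols have been collected into a `bytes` object in time order, with lane 0 carrying the first byte; the function names are hypothetical.

```python
def destripe(lanes: list) -> bytes:
    """Rebuild the byte stream from per-lane symbol sequences.

    lanes[i] holds the symbols seen on lane i, oldest first.
    With one byte per lane per symbol time, stream byte k sits on
    lane (k % width) at symbol time (k // width).
    """
    if not lanes:
        return b""
    depth = len(lanes[0])
    if any(len(lane) != depth for lane in lanes):
        raise ValueError("all lanes must contain the same number of symbols")
    stream = bytearray()
    for t in range(depth):
        for lane in lanes:
            stream.append(lane[t])
    return bytes(stream)


# Lanes on which a DLLP (SDP token, first symbol F0) may start, per the text above.
DLLP_START_LANES = {0, 4, 8}


def sdp_lane_is_legal(lane: int) -> bool:
    return lane in DLLP_START_LANES


if __name__ == "__main__":
    # x4 example with placeholder byte values 0..7.
    x4 = [b"\x00\x04", b"\x01\x05", b"\x02\x06", b"\x03\x07"]
    print(destripe(x4).hex())
```

Ordered sets should not be passed through `destripe`, because each lane carries its own complete ordered set.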
TXRATE indicates which speed the link is operating at. Transition to the Gen3 speed happens in the 'Recovery.Speed' (0C) LTSSM state as shown in the waveform below.
The packets at Gen1/Gen2 speed before the L0 state are not scrambled; before L0, they are scrambled at Gen3 speed only. The ordered sets at Gen1/Gen2 speed on the PIPE interface can therefore be read directly. However, everything is scrambled in the L0 state for all speeds. The waveform below shows a capture at Gen1 speed.
Here, '4A' means it is a TS1 ordered set. The descrambler module is needed only when the speed has changed to Gen3 as indicated by TXRATE.
AAAAAAAA in the waveform below indicates an SKP Ordered Set.
E1 indicates the SKP_END Symbol as defined in the PCIe Specification shown below.
The waveform below shows a TS1 ordered set on a Gen3 link. The '1E' shown indicates a TS1 ordered set at Gen3 speed.
The waveform below shows a TS1 ordered set in each lane. It does not span across multiple lanes. It will be the same for all lanes except for the lane number. In the waveform below, the lane numbers are 00 and 01 respectively.
'0E' here is Symbol-4. Symbol-4 is defined in the spec as follows:
0E= 0000_1110. When we map these bits to the Symbol-4 description in the spec, it indicates that Gen3 speed is supported.
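The bit mapping can be sketched in a few lines. The bit assignments used here (bit 1 = 2.5 GT/s, bit 2 = 5.0 GT/s, bit 3 = 8.0 GT/s) are an assumption based on the Data Rate Identifier definition in the PCI Express Base Specification; the function name is hypothetical.

```python
# Assumed Data Rate Identifier bits in TS1/TS2 Symbol-4 (verify against the spec).
RATE_BITS = {
    1: "2.5 GT/s",
    2: "5.0 GT/s",
    3: "8.0 GT/s",
}


def supported_rates(symbol4: int) -> list:
    """Return the data rates advertised in Symbol-4, lowest rate first."""
    return [name for bit, name in sorted(RATE_BITS.items()) if symbol4 & (1 << bit)]


if __name__ == "__main__":
    print(supported_rates(0x0E))  # ['2.5 GT/s', '5.0 GT/s', '8.0 GT/s']
```

With this mapping, 0E advertises all three rates up to and including Gen3.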
Symbol-6 has a different meaning based on which equalization state the LTSSM is in. In the waveform shown below, LTSSM is '28' which means it is in phase-0.
Symbol-6 is 20, i.e. 0010_0000. Because it is in phase-0, bit 1:0 is set to '00'.
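Splitting Symbol-6 into its fields makes the phase check easier to read. In this sketch only the meaning of bits 1:0 (the equalization phase) comes from the text above; the other field names and positions are an assumption based on the TS1 Symbol-6 definition for 8.0 GT/s in the PCI Express Base Specification and should be verified there. The function name is hypothetical.

```python
def decode_ts1_symbol6(value: int) -> dict:
    """Split a Gen3 TS1 Symbol-6 byte into fields (assumed layout, see lead-in)."""
    return {
        "equalization_phase": value & 0x03,              # bits 1:0
        "reset_eieos_interval_count": (value >> 2) & 1,  # bit 2 (assumed)
        "transmitter_preset": (value >> 3) & 0x0F,       # bits 6:3 (assumed)
        "use_preset": (value >> 7) & 1,                  # bit 7 (assumed)
    }


if __name__ == "__main__":
    fields = decode_ts1_symbol6(0x20)
    print(fields["equalization_phase"])  # 0, i.e. phase-0
```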
The waveform below shows a complete TS1 ordered set.
The waveform below shows a TS2 ordered set at Gen3 speed. The '2D' indicates that it is a TS2 ordered set.
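The Gen3 ordered set identifiers discussed so far can be collected into a small lookup. This is a rough sketch that assumes the descrambled data is viewed as a 32-bit word per lane with Symbol-0 in the least significant byte, which matches how FF00FF00 and 555555E1 appear above; the function name is hypothetical, and ordered sets not covered in this blog are reported as unknown.

```python
def identify_gen3_ordered_set(word: int) -> str:
    """Identify a Gen3 ordered set from the first 32-bit word of a lane.

    Only meaningful on a cycle where *_start_block is asserted and
    *_sync_header is '1'.
    """
    word &= 0xFFFFFFFF
    if word == 0xFF00FF00:
        return "EIEOS"
    if word == 0x555555E1:
        return "SDS"
    symbol0 = word & 0xFF  # assumed: first symbol is the lowest byte
    by_symbol0 = {0xAA: "SKP", 0x1E: "TS1", 0x2D: "TS2"}
    return by_symbol0.get(symbol0, f"unknown (symbol 0 = {symbol0:02X})")


if __name__ == "__main__":
    for w in (0xFF00FF00, 0x555555E1, 0xAAAAAAAA):
        print(f"{w:08X} -> {identify_gen3_ordered_set(w)}")
```

At Gen1/Gen2 speed the identifiers differ (for example, '4A' for TS1), so this lookup applies only once TXRATE shows Gen3.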
So far we have talked about Ordered Sets and DLLPs, but now let's see how to identify TLPs on the PIPE interface.
Every TLP starts with an STP (Start of TLP Packet) token. So in the descrambler output, look for any "nF" (a byte whose lower nibble is F) with *_start_block = 1 and *_sync_header = 2.
Each STP token is 4 symbols and indicates the start of a TLP.
The STP fields are defined as shown in the figure below:
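As a rough companion to the field definition, the four STP symbols can be unpacked as sketched below. The field positions are an assumption based on the STP token layout in the PCI Express Base Specification (length in DWs split across Symbol-0 and Symbol-1, followed by frame parity, frame CRC and the TLP sequence number) and should be verified against the spec. The function name is hypothetical, and the frame CRC and parity are extracted but not checked.

```python
def decode_stp(symbols: bytes) -> dict:
    """Unpack a 4-symbol Gen3 STP token (assumed layout, see lead-in).

    symbols[0]: bits 3:0 = 1111b, bits 7:4 = TLP Length[3:0]
    symbols[1]: bits 6:0 = TLP Length[10:4], bit 7 = Frame Parity
    symbols[2]: bits 3:0 = Sequence Number[11:8], bits 7:4 = Frame CRC
    symbols[3]: Sequence Number[7:0]
    """
    if len(symbols) < 4:
        raise ValueError("an STP token is 4 symbols long")
    if symbols[0] & 0x0F != 0x0F:
        raise ValueError("first symbol is not of the form nF")
    return {
        "length_dw": (symbols[0] >> 4) | ((symbols[1] & 0x7F) << 4),
        "frame_parity": symbols[1] >> 7,
        "frame_crc": symbols[2] >> 4,
        "sequence_number": ((symbols[2] & 0x0F) << 8) | symbols[3],
    }


if __name__ == "__main__":
    # Illustrative bytes only; the CRC and parity fields are not valid values.
    print(decode_stp(bytes([0x6F, 0x00, 0x00, 0x01])))
```

On a multi-lane link the four symbols are striped across lanes, so they need to be gathered in lane order before being passed to a decoder like this one.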
Here is an example of a MemWr (Memory Write) TLP. Note that this is a x4 link, so everything is striped across the four lanes. The waveform below is from the example design simulation.
This waveform shows a memory write transaction coming from the host test bench going into the user logic through the CQ interface of the PCIe hard block. This transaction is interpreted on the PIPE interface as follows: