prasanthvthycaud
Adventurer
Registered: ‎12-24-2013

DDR3 Data Corruption During Rank Switching


Hi All,

 

I am using an MT16KTF1G64HZ-1G6 DDR3 module in my Kintex-7 design, with a 600 MHz DDR3 memory clock and a 150 MHz user interface clock. For my application I have divided the DDR3 for ping-pong based packet processing, in which Rank-0 is the Ping buffer and Rank-1 is the Pong buffer.

 

Case 1: If packet processing uses the ping-pong scheme, i.e., rank switching between writes and reads happens immediately, the packets read from the DDR3 are corrupted. The rank switching can be, for example, Rank-0 Write, Rank-1 Write, Rank-0 Read, Rank-1 Read; this flow is not fixed and can occur in any combination.

 

Case 2: If packet processing is done in only one of the Ping/Pong buffers, i.e., no rank switching happens and packets are written and read on a single rank, the data is not corrupted.

 

Can you please suggest a way to resolve this issue?

 

Regards,

Prasanth

ryana
Moderator
Registered: ‎11-28-2016

Hello @prasanthvthycaud,

 

This sounds like a signal integrity issue with the second rank on the DRAM interface, or something timing/SI related to the rank switching. A quick test you can do is to generate the example design, monitor the Debug signals for the traffic generator, and see if you're able to reproduce the data error.  The Debugging DDR3/DDR2 Designs section starting on page 228 is rather long but goes into all the details on how to do this and how to isolate errors related to the other rank.

Here's a link to the latest version:

https://www.xilinx.com/support/documentation/ip_documentation/mig_7series/v4_1/ug586_7Series_MIS.pdf

 

Try running the interface at the slowest possible rate to see if that reduces or eliminates the error.

Double check the Output Driver Impedance and ODT settings.

Double check all the board layout guidelines starting on page 195.

Go through the General Checks section starting on page 232.

From there follow the debug guide and isolate the issue as either a write error or read error and go from there.

prasanthvthycaud

Hi @ryana

 

As you suggested, I will test the example design and update you with the result.

In the meantime, I wanted to share some more of my test results:

Case 3: The DDR3 module I am using is 8 GB. If packet processing treats the memory as a single buffer and continuously writes/reads packets to and from the DDR3, with the address pointer moving from 0 to 8 GB, the data is not corrupted.

Case 4: This is similar to Case 1, but the rank switching follows a fixed pattern. Packet processing is ping-pong based and rank switching between writes and reads still happens immediately, but always in the order Rank-0 Write, Rank-0 Read, Rank-1 Write, Rank-1 Read. In this case the data is not corrupted.

Case 5 ("Try running the interface at the slowest possible rate to see if that reduces or eliminates the error"): I changed the DDR3 memory clock to 550 MHz and repeated the Case 1 test (immediate rank switching with no fixed pattern). In this case the data is not corrupted.

Can you please guide me on how to verify whether the Output Driver Impedance and ODT settings are correct? In MIG there is a setting, "Internal Termination Impedance", whose default value is 50 ohm. On what basis should this setting be configured?

 

Regards,

Prasanth

prasanthvthycaud

Hi @ryana

 

I have attached screenshots of the MIG settings.

 

Regards,

Prasanth

1.JPG
2.JPG
3.JPG
4.JPG
ryana

Hello @prasanthvthycaud,

 

The optimal driver and ODT settings are determined by your board layout because they need to be matched to your trace impedance.  Right now you have 34-ohm drivers and 60-ohm terminators on the interface.  Check with the person who did your board layout to see what assumptions they made for these; if they differ from your current IP configuration, I would change them and try again.

Based on your tests of running the interface at a slower rate and of reducing the rate at which you're switching ranks, it sounds like a signal integrity issue.  I would double check that all the board layout guidelines I mentioned were followed in this design.

 

 

prasanthvthycaud

Hi @ryana

 

Thanks for your reply.

I will check with the hardware team and confirm the MIG driver and ODT settings.

I have one doubt regarding signal integrity. Since the design works at a slower clock frequency, I agree with your point. However, in my Case 4 test (similar to Case 1, but with rank switching in the fixed pattern Rank-0 Write, Rank-0 Read, Rank-1 Write, Rank-1 Read) the data is not corrupted. Even in this scenario the rank switching happens immediately; only the write/read pattern is fixed. Shouldn't I observe data corruption in this scenario as well? My understanding may be wrong, @ryana, so please correct me.

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

This scenario puts a different type of load on the interface.  You're executing two commands on a rank before switching, which increases the amount of time before you switch ranks.  This may be enough time to let the interface settle so that the signal/power integrity issue doesn't appear.  You may even be able to build another test with an access pattern like Case 1 but with a delay inserted between the first and second commands, and therefore between rank switches; that may make the issue disappear.

 

Also double check all the power rails in the design at the FPGA and at the DDR pins. 

Take a look at AR#62181 since the attached document gives solid guidance on how to make power and signal integrity measurements.

https://www.xilinx.com/support/answers/62181.html
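
To make the difference between the access patterns concrete, here is a minimal Python sketch. It is only a software model, not MIG or controller code: the command tuples, the "IDLE" placeholder and both function names are hypothetical, and the Case 1 order shown is just the one example sequence given in the thread (the real flow is not fixed). It counts how often consecutive commands change rank and shows how a test could pad each rank switch with idle slots.

```python
from typing import List, Optional, Tuple

# (rank, operation); rank is None for an idle slot. Hypothetical model only.
Command = Tuple[Optional[int], str]


def count_rank_switches(cmds: List[Command]) -> int:
    """Count consecutive command pairs that target different ranks."""
    return sum(1 for prev, cur in zip(cmds, cmds[1:]) if prev[0] != cur[0])


def insert_rank_switch_gaps(cmds: List[Command], idle_slots: int) -> List[Command]:
    """Return a copy of cmds with idle_slots idle entries before every rank change."""
    padded: List[Command] = []
    for i, cmd in enumerate(cmds):
        if i > 0 and cmds[i - 1][0] != cmd[0]:
            padded.extend([(None, "IDLE")] * idle_slots)
        padded.append(cmd)
    return padded


# One example ordering for Case 1 (immediate switching, no fixed pattern).
case1 = [(0, "WR"), (1, "WR"), (0, "RD"), (1, "RD")]
# Case 4: fixed pattern, two commands per rank before switching.
case4 = [(0, "WR"), (0, "RD"), (1, "WR"), (1, "RD")]

print(count_rank_switches(case1), count_rank_switches(case4))
print(insert_rank_switch_gaps(case1, idle_slots=2))
```

In this model the Case 4 sequence changes rank less often than the Case 1 example, which matches the explanation above that more time passes before each switch. The padding also shows why throughput drops when delays are added: every rank change costs extra slots.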

prasanthvthycaud

Hi @ryana

 

Thank you for your suggestions.

 

I tried adding a delay between the commands. The data corruption appears to be eliminated, but the system throughput has come down.

I confirmed the board rules with my hardware team: single-ended traces are 50 ohms and differential pairs are 100 ohms. However, I was not able to determine the optimal driver and ODT settings. Can you please comment?

I also have a doubt about the DCI Cascade option, which is currently disabled in my design. On the board, the VRP/VRN resistors are connected on Bank 33. Does the DCI Cascade option have to be enabled in the MIG settings?

 

Regards,

Prasanth

 

 

ryana

Hello @prasanthvthycaud,

 

With 50-Ohm traces you should set the Output Driver Impedance Control to RZQ/6 and set the RTT (nominal) - On Die Termination (ODT) to RZQ/6.  If you only have the VRP/VRN resistors in one bank then the DCI Cascade option must be enabled in the MIG settings and you must add the DCI cascade constraint to your constraint file.

 

Here's an example where bank 33 is the master and banks 32 and 34 are the slave banks:

# Set DCI_CASCADE          
set_property slave_banks {32 34} [get_iobanks 33]

 

Sounds like 33 is your master but you need to update the slave banks as necessary.
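
For readers unfamiliar with the RZQ notation, the short Python sketch below converts an RZQ divisor into ohms. It assumes the standard DDR3 reference resistor value of 240 ohms; the function name is hypothetical, and mapping the 34-ohm and 60-ohm values mentioned earlier in the thread to RZQ/7 and RZQ/4 is an inference, not something stated by either poster.

```python
RZQ_OHMS = 240.0  # DDR3 ZQ reference resistor value (assumed, per the DDR3 standard)


def rzq_setting_ohms(divisor: int) -> float:
    """Nominal impedance in ohms for a setting written as RZQ/<divisor>."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return RZQ_OHMS / divisor


for divisor in (4, 6, 7):
    print(f"RZQ/{divisor} = {rzq_setting_ohms(divisor):.1f} ohm")
```

Under that assumption, RZQ/6 works out to 40 ohms for both the output driver and the nominal ODT setting recommended above.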

prasanthvthycaud

Hi @ryana

 

Thank you for your reply. I will update my MIG settings.

I have one more doubt, about Internal Termination Impedance: which of the four values [OFF, 40, 50, 60] should be selected?

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

Hmm, something doesn't make sense here.

 

DCI is only available for High Performance banks but Internal Termination is only for High Range banks and you can't split an interface between High Range and High Performance banks.

 

If you have the MIG GUI option for Internal Termination then your design is in the High Range banks and you don't need the DCI constraint or the DCI Cascade feature because you can't use them.  Here I would set the Internal Termination to 40-Ohms.

 

Can you confirm your interface is only in High Range banks or let me know which device you're using and which banks you're using?

prasanthvthycaud

Hi @ryana

 

The FPGA I am using is a Xilinx Kintex-7 XC7K160T FFG676-2. The banks used for the DDR3 interface are 32, 33 and 34, which are High Performance banks. In the MIG GUI I can see that Internal Termination is enabled.

For your information, I have attached the DDR3 I/O constraint file.

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

In that case the setting doesn't matter since you're only using HP banks.

You can set it to Off if you'd like.

prasanthvthycaud

Hi @ryana,

 

I was looking at https://www.xilinx.com/support/documentation/user_guides/ug475_7Series_Pkg_Pinout.pdf, which states:

 

VRN : This pin is for the DCI voltage reference resistor of N transistor (per bank, to be pulled High with reference resistor). 

VRP : This pin is for the DCI voltage reference resistor of P transistor (per bank, to be pulled Low with reference resistor).

 

I think I am seeing a difference in my schematics.

Capture.JPG

Do you think this connection is correct? If not, can it cause an issue on the DDR3 interface?

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

Those resistor values are incorrect for 7-Series designs.  The VRP/VRN resistor values need to be 2x the target impedance.  Since your board is laid out with 50-ohm traces, you should be using 100-ohm resistors here.  Using 50-ohm resistors will cause issues.

The VRP/VRN DCI requirements are described in UG471, the 7-Series Select I/O Resource Guide:

https://www.xilinx.com/support/documentation/user_guides/ug471_7Series_SelectIO.pdf
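
The rule of thumb in this reply can be written as a tiny Python check. This is only a sketch of the arithmetic stated above (reference resistor = 2x target impedance); the function names and the 1% tolerance are hypothetical, and UG471 remains the authority for the actual requirements.

```python
def dci_reference_resistor_ohms(target_impedance_ohms: float) -> float:
    """VRP/VRN resistor value per the rule quoted in this reply: 2x the target impedance."""
    return 2.0 * target_impedance_ohms


def resistor_matches(fitted_ohms: float, target_impedance_ohms: float,
                     tolerance: float = 0.01) -> bool:
    """True if the fitted resistor is within `tolerance` (fractional) of the expected value."""
    expected = dci_reference_resistor_ohms(target_impedance_ohms)
    return abs(fitted_ohms - expected) <= tolerance * expected


print(dci_reference_resistor_ohms(50.0))   # expected value for 50-ohm traces
print(resistor_matches(50.0, 50.0))        # the 50-ohm parts discussed in this reply
print(resistor_matches(100.0, 50.0))       # the recommended 100-ohm parts
```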

(Accepted solution)

prasanthvthycaud

Hi @ryana

 

OK, I will change the resistors and check my design.

Are the voltage levels correct? In the KC705 evaluation board schematics, VRN is connected to 1.5V and VRP is grounded. In my hardware schematics they appear to be swapped, i.e., VRN is connected to ground and VRP is connected to 1.5V.

 

ref.jpg

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

Sorry for not noticing that in your schematic but yes, those are incorrect.

VRP goes to ground and VRN goes to 1.5V.

prasanthvthycaud

Hi @ryana,

 

Thank you for confirming.

I will make these hardware corrections, recheck my design with the updated MIG settings, and post the test results.

 

Regards,

Prasanth 

prasanthvthycaud

Hi @ryana,

 

I have made the hardware changes for VRP/VRN as per the Xilinx KC705 evaluation schematics. The MIG settings are also updated accordingly: Output Driver Impedance Control is RZQ/6, On Die Termination (ODT) is RZQ/6, and the DCI cascade constraint is added to the constraint file as set_property slave_banks {32 34} [get_iobanks 33].

With this change, when I load the bit file, the system appears to hang, or goes down and does not recover. If I comment out set_property slave_banks {32 34} [get_iobanks 33] in the constraint file, the bit file loads and the system comes up.

Do you think I am missing some setting needed to use the MIG with DCI cascade?

 

Regards,

Prasanth 

ryana

Hello @prasanthvthycaud,

 

It could be a settings issue, a problem with the syntax, or a duplicated constraint that's confusing the IP.  Make sure the constraints match your design, as in, making sure the DCI cascade option is enabled in the MIG GUI and that the master/slave banks are correct.

prasanthvthycaud

Hi @ryana

 

I verified the MIG settings and confirmed that the DCI Cascade option is enabled in the MIG GUI. I have attached the MIG-generated settings report.

After Vivado implementation I reported the DCI cascade properties for each bank. The details are as follows:

 

 

1. report_property [get_iobanks 32]
Property     Type     Read-only  Value
BANK_TYPE    string   true       BT_HIGH_PERFORMANCE
CLASS        string   true       iobank
DCI_CASCADE  string*  false
IS_MASTER    bool     true       0
IS_SLAVE     bool     true       1
MASTER_BANK  string   true       33
NAME         string   true       32
SLR_INDEX    int      true       0

2. report_property [get_iobanks 33]
Property     Type     Read-only  Value
BANK_TYPE    string   true       BT_HIGH_PERFORMANCE
CLASS        string   true       iobank
DCI_CASCADE  string*  false      34 32
IS_MASTER    bool     true       1
IS_SLAVE     bool     true       0
MASTER_BANK  string   true
NAME         string   true       33
SLR_INDEX    int      true       0

3. report_property [get_iobanks 34]
Property     Type     Read-only  Value
BANK_TYPE    string   true       BT_HIGH_PERFORMANCE
CLASS        string   true       iobank
DCI_CASCADE  string*  false
IS_MASTER    bool     true       0
IS_SLAVE     bool     true       1
MASTER_BANK  string   true       33
NAME         string   true       34
SLR_INDEX    int      true       0

 

In the report I see "DCI_CASCADE string* false". Is this correct? I am using Vivado 2017.4.1.

 

Regards,

Prasanth

ryana

Hello @prasanthvthycaud,

 

That behavior looks correct.

I used the same constraint in an example design and when I got the IOBank properties they reported in a similar fashion.  The DCI_CASCADE string* false was always present but what matters is the IS_SLAVE and MASTER_BANK reporting was correct.
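
As a rough illustration of that check, the Python sketch below parses report text in the whitespace-separated layout posted above and tests the IS_SLAVE and MASTER_BANK rows. The function names are hypothetical, the sample text is the bank 32 report from this thread, and the parser assumes that the third column in the DCI_CASCADE row ("false") is the Read-only flag rather than the property value, which is why it does not indicate that cascading is off.

```python
BANK_32_REPORT = """
Property Type Read-only Value
BANK_TYPE string true BT_HIGH_PERFORMANCE
CLASS string true iobank
DCI_CASCADE string* false
IS_MASTER bool true 0
IS_SLAVE bool true 1
MASTER_BANK string true 33
NAME string true 32
SLR_INDEX int true 0
"""


def parse_report_property(text: str) -> dict:
    """Map each property name to its Value column (empty string if no value is shown)."""
    props = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # not a Property / Type / Read-only row
        props[fields[0]] = " ".join(fields[3:])
    return props


def is_cascade_slave_of(props: dict, master_bank: str) -> bool:
    """True if the bank reports itself as a DCI cascade slave of master_bank."""
    return props.get("IS_SLAVE") == "1" and props.get("MASTER_BANK") == master_bank


bank32 = parse_report_property(BANK_32_REPORT)
print(bank32["DCI_CASCADE"] == "", is_cascade_slave_of(bank32, "33"))
```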

 

With this implemented design are you still seeing the data errors when switching ranks?

prasanthvthycaud

Hi @ryana

 

Thank you for confirming.

With DCI enabled, I have not been able to check the DDR3 packet processing because, as I mentioned before, the device appears to hang. The hardware team is looking into the rework that was done for VRP/VRN.

FYI, we have two devices. On one of them we made the VRP/VRN hardware changes, with the VRP/VRN resistors changed to 100 ohms; when we load the DCI-enabled bit file on this device, it shuts down or hangs. When I load the same bit file on the other device, which does not have the VRP/VRN hardware rework, it loads without any issue.

 

Regards,

Prasanth  

prasanthvthycaud

Hi @ryana,

 

First of all, thank you for the support.

I was able to test the DDR3 with the DCI changes by programming the bit file to the device over JTAG. With these changes the design looks fine: no data corruption is seen during rank switching. I ran a long-duration test of around 5 days, and it passed without any data corruption.

 

Regards,

Prasanth