07-18-2019 02:17 AM
We are developing a DDR3-based Ethernet system in which DDR3 buffers the incoming Ethernet packets. For this system we developed two revisions of custom hardware, each with a Kintex-7 FPGA and a Micron DDR3 ECC memory module (MT18KSF1G72HZ – 8GB). The ECC module was selected to store data. The DDR3 runs at a clock frequency of 500 MHz.
Before describing the issue, here is an outline of the hardware changes. On Rev1, the "DDR3 DM8" signal was mapped to FPGA pin W13 (IO_25_VRP_32), which caused problems during IP core generation because of the I/O assignment. According to the Xilinx I/O guidelines, "DDR3 DM8" has to be connected to FPGA pin AD11 (IO_L19P_T3_33). We developed a new hardware revision, Rev2, with this correction.
To check the DDR3 interface on the Rev1 hardware, we used only 64 bits of the ECC memory and left the ECC-related I/Os unused, so that it behaves as a non-ECC memory. This was done with custom logic: the base part of the module we are using is "MT41K512M8", and with it we could select a non-ECC module (MT16KTF1G64HZ – 8GB) that is also supported by the Xilinx IP core. This way we were able to bring up the DDR3 on the Rev1 hardware and verify the interface apart from the 8 parity bits.
When the Rev2 hardware with the I/O correction arrived, we first ran the Rev1 FPGA build on it and the DDR3 looked fine. However, after we enabled the ECC bits by regenerating the IP core for the exact DDR3 part (MT18KSF1G72HZ – 8GB), the "init_calib_complete" signal from MIG does not assert. As a debug step we compared the output files of the IP cores generated for MT16KTF1G64HZ – 8GB and MT18KSF1G72HZ – 8GB and found a difference in a memory timing parameter: the tRP value changed. If I am correct, the timing parameters should be the same for both cores, since the base part and the operating frequency are the same. Please correct me if I am wrong.
Can you please guide me in finding what the issue could be?
07-18-2019 03:04 AM
Please let me know the Vivado version and I'll double-check this.
07-18-2019 03:18 AM
Thanks for your reply.
I forgot to give the tool details: I am using Vivado v2017.4.1 (64-bit).
I hope the details I gave are clear. Please let me know if you need any further information.
07-18-2019 04:06 AM
After noticing the difference in the timing parameter value, we also tested the Rev2 hardware with the tRP value updated in MIG for the DDR3 ECC part, but even then the "init_calib_complete" signal from MIG does not assert.
We are not sure whether any further changes are needed.
07-26-2019 02:42 PM
Hello @prasanthvthycaud ,
In this scenario the base part of the two devices may be the same (MT41K512M8), but there can be differences between the speed grades of the devices you selected, which then determine the other DDR3 core timing parameters that need to be set for your operating point. Aside from the Data Mask pin assignment, there could have been other unintended or accidental changes to the rest of the hardware that may have affected the DDR3 interface quality, power quality, or clock quality.
From your description, the Rev1 build targeting the non-ECC SODIMM works fine on the Rev2 hardware. When I looked at the differences in the timing parameters between the two parts (both are listed as 1G6 speed grades), the tRP for the non-ECC device was 13.125 ns and the tRP for the ECC device was 13.75 ns. Based on the data sheets for the SODIMMs and the base memory device, 13.125 ns is appropriate for backwards compatibility purposes. Here it seems the device with the stricter timing requirement works fine while the other device doesn't.
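As a rough cross-check of the two tRP values, the sketch below converts each into whole clock cycles. It assumes that the 500 MHz quoted in this thread is the memory clock (so tCK = 2 ns) and that the value is simply rounded up to the next full cycle; the helper name `cycles_for` is made up for illustration and is not part of MIG.

```python
def cycles_for(t_ps: int, tck_ps: int) -> int:
    """Round a timing parameter (in picoseconds) up to whole clock cycles."""
    return -(-t_ps // tck_ps)  # integer ceiling division


# Assumption: 500 MHz memory clock, i.e. a 2 ns (2000 ps) clock period.
TCK_PS = 2000

TRP_NON_ECC_PS = 13125  # 13.125 ns, reported for the non-ECC part
TRP_ECC_PS = 13750      # 13.75 ns, reported for the ECC part

non_ecc_cycles = cycles_for(TRP_NON_ECC_PS, TCK_PS)
ecc_cycles = cycles_for(TRP_ECC_PS, TCK_PS)

print("non-ECC tRP cycles:", non_ecc_cycles)
print("ECC tRP cycles:", ecc_cycles)
```

Under those assumptions both values round up to 7 cycles, so the tRP difference alone would not change the programmed timing. How MIG rounds internally is not verified by this sketch.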
Have you double checked all the other settings in the IP between the two projects to make sure they're the same?
What happens if you target the ECC SODIMM and update tRP to be like the other device as a custom part?
For my own understanding are you running at 1.35V or 1.5V and what is your interface rate?
Overall it seems like something strange is happening here or something was overlooked in the other project settings since protocol wise this should work fine but there seems to be an issue only when the last data byte is enabled.
07-29-2019 12:15 AM
You can also try a non-ECC controller (MT16KTF1G64HZ – 8GB) based on MT41K512M8, as you did with Rev1, implemented on Rev2. What frequency is the DDR3 interface running at? Have you tried running at a lower frequency?
07-29-2019 04:11 AM
The DDR3 is running at a frequency of 500 MHz.
Thank you for your reply. Please find the answers to your questions below.
1. Have you double checked all the other settings in the IP between the two projects to make sure they're the same? Yes, I have checked.
2. What happens if you target the ECC SODIMM and update tRP to be like the other device as a custom part? The "init_calib_complete" signal from MIG still does not assert.
3. For my own understanding are you running at 1.35V or 1.5V and what is your interface rate? It is 1.5 V.
Yes, this issue does look strange. FYI, screenshots of the DDR3 MIG settings and the constraint file used in the design are attached.
As a debug step we ran a couple of tests and observed unstable behavior: sometimes the init_calib_complete signal asserts, and we can verify packet processing by writing Ethernet packets to DDR3 and reading them back, but when we reprogram the same hardware with the same bit file, init_calib_complete does not assert. Are there any specific hardware design guidelines that have to be followed when using the additional 8 bits of the 72-bit interface (ECC module)?
Details of the tests we conducted are as follows:
Test 1 : Generated MIG with the tool's default settings for (MT18KSF1G72HZ-1G6), with ECC disabled and the entire 72 bits used to store data.
Test 2 : Generated MIG for the memory module (MT16KTF1G64HZ-1G6) so that only 64 bits of the DDR3 are used; the 64 bits mapped for this are ddr3_dq to ddr3_dq.
From these tests, we think the problem arises when we use the additional 8-bit interface (ddr3_dq to ddr3_dq) of the ECC module; otherwise the hardware looks fine.
Please let me know your comments.
07-29-2019 11:26 PM
Is the failure related to certain bitstream files? You can try multiple resets on the DDR3 IP with one failing bit file.
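To make repeated reset attempts comparable between bit files, it can help to log every attempt and tally the outcome. A minimal sketch, where `attempt_calibration` is a hypothetical callable standing in for whatever resets the DDR3 IP and reads back init_calib_complete on your board, and the replayed sequence is made up purely to exercise the code:

```python
from typing import Callable, Dict, Iterable


def tally_calibration(attempt_calibration: Callable[[], bool],
                      attempts: int) -> Dict[str, float]:
    """Run the attempt callable `attempts` times and summarize pass/fail counts."""
    passes = sum(1 for _ in range(attempts) if attempt_calibration())
    rate = passes / attempts if attempts else 0.0
    return {
        "attempts": attempts,
        "passes": passes,
        "fails": attempts - passes,
        "pass_rate": rate,
    }


def make_fake_board(outcomes: Iterable[bool]) -> Callable[[], bool]:
    """Stand-in for real hardware access: replays a fixed sequence of outcomes."""
    it = iter(outcomes)
    return lambda: next(it)


# Made-up sequence, not measured data.
summary = tally_calibration(make_fake_board([True, False, False, True]), 4)
print(summary)
```

On real hardware the fake board would be replaced by the actual reset-and-readback routine, and the same summary collected once per bit file.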
07-30-2019 12:22 AM
No, we see the failure with all the generated bitstream files. The same bitstream may work after a few iterations of reprogramming the hardware. Sometimes it works (init_calib_complete asserts) on the very first attempt, but after reprogramming the hardware with the same bitstream the failure (init_calib_complete does not assert) may return.
Yes, during the failure we tried issuing a soft reset using our custom software, which interacts with the hardware.
We ran some more tests: in MIG we selected the base part MT41K512M8 and used only 8 bits.
Test 1 : The last 8 bits (ddr3_dq to ddr3_dq) are mapped to the MIG as an 8-bit interface. We generated the bitstream, but the result was the same: init_calib_complete does not assert.
Test 2 : The first 8 bits (ddr3_dq to ddr3_dq) are mapped to the MIG as an 8-bit interface. We generated the bitstream, and with this init_calib_complete asserts.
Test 3 : The second-to-last 8 bits (ddr3_dq to ddr3_dq) are mapped to the MIG as an 8-bit interface. We generated the bitstream, and with this init_calib_complete asserts.
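The three single-lane results can be written down as a small table to make the pattern explicit. A minimal sketch; the lane labels are descriptive strings taken from the test descriptions, and the helper name `failing_lanes` is hypothetical:

```python
# Outcome of each 8-bit MIG build as reported in Tests 1-3.
# True means init_calib_complete asserted.
lane_results = {
    "last 8 bits": False,            # Test 1
    "first 8 bits": True,            # Test 2
    "second-to-last 8 bits": True,   # Test 3
}


def failing_lanes(results):
    """Return the labels of the lanes whose calibration did not complete."""
    return [lane for lane, ok in results.items() if not ok]


print(failing_lanes(lane_results))
```

Only the last byte lane is reported as failing, which matches the earlier observation that the trouble starts when the additional 8 bits of the ECC module are used.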
FYI, I have attached the MIG-generated datasheet for Test 1. Please let me know your comments.