
Getting Started with Versal Memory Interfaces

bethf
Xilinx Employee

Introduction

 

This blog entry covers what you should understand before designing with memory interfaces on Versal™ ACAP devices.

It also links to relevant documentation, tutorials, and example designs.

You can find all of our Versal-related blogs here.

 

IP Offerings

 

Versal ACAP offers the hardened Integrated DDR Memory Controller (DDRMC) along with soft memory interface IP options.

Additionally, the Performance AXI Traffic Generator is available to stimulate the memory IP, both in simulation and, after synthesis, in hardware for analysis.

 

The Versal Integrated DDRMC is the preferred solution because it reduces power and resource utilization and eases timing closure.

The DDRMC has programmable network on chip (NoC) interface ports and is designed to handle multiple streams of traffic.

Additionally, it supports Quality of Service (QoS) classes to ensure appropriate prioritization of commands.

 

The NoC is a configurable AXI network used for sharing data between IP endpoints in the programmable logic (PL), the processing system (PS), and other hard blocks. This device-wide infrastructure is a high-speed, integrated data path with dedicated switching.

The Versal soft IP offerings are located within the PL and are similar to the soft memory interface IP offerings in the UltraScale/UltraScale+ device families.

 

Table 1: Versal IP Offerings

| Memory Interfaces | DDRMC or Soft IP | Product Guide | Pin, Bank, Clocking, and Reset Rules | PCB Guidelines | Release Notes and Known Issues | Available IP Design Flows | Fabric Access |
|---|---|---|---|---|---|---|---|
| DDR4 | DDRMC | (PG313) | (PG313) | (UG863) | (Xilinx Answer 75764) | IPI Required | NoC |
| DDR4 | Soft IP | (PG353) | (PG353) | (UG863) | (Xilinx Answer 75763) | IPI or RTL Instantiation | Already in PL |
| LPDDR4/4X | DDRMC | (PG313) | (PG313) | (UG863) | (Xilinx Answer 75764) | IPI Required | NoC |
| RLDRAM3 | Soft IP | (PG354) | (PG354) | (UG863) | Coming soon | IPI or RTL Instantiation | Already in PL |
| QDR-IV | Soft IP | (PG355) | (PG355) | (UG863) | Coming soon | IPI or RTL Instantiation | Already in PL |

| Traffic Generator | Soft IP | Product Guide | Clocking and Reset Rules | PCB Guidelines | Release Notes and Known Issues | Available IP Design Flows | Fabric Access |
|---|---|---|---|---|---|---|---|
| Performance AXI Traffic Generator | Soft IP | (PG381) | (PG381) | NA | (Xilinx Answer 75781) | IP Integrator or RTL Instantiation | Already in PL |

 

Design Flows

There are two main design flows when targeting Versal ACAPs:

  • Vivado® Tools Design Flow to accelerate high-level FPGA design and verification
  • Vitis™ Environment Design Flow to build accelerated applications

 

For NoC and DDRMC designs, Vivado IP Integrator (IPI) is required. For the soft IP listed in Table 1, both RTL instantiation and IPI are supported.

For assistance using IP Integrator, visit the Vivado – Using IP Integrator Design Hub.

 

For design flow best practices and detailed memory interface IP walk-throughs, see the Versal ACAP Design Guide (UG1273, Chapter 4, “Design Flow”) and the IP Product Guides listed in Table 1.

The Example Designs referenced below are also available and provide excellent examples to help get started.

 

All Versal ACAP designs require the CIPS IP as it contains the PMC used to boot the device. For more information, see the Control Interface and Processing System IP Product Guide (PG352).

 

Getting Started with NoC/DDRMC

Designing with NoC and DDRMC is new to Versal ACAP and different from previous Xilinx device families.

To help you get started:

  • Learn about the NoC: Understand the basics of the NoC and how to configure the NoC within your design. Review Chapters 2 and 3 (Overview and NoC Architecture) of (PG313).
    • Utilize the "Introduction to NoC DDRMC Design Flow" tutorials on GitHub which include the following modules:
      • Basic NoC Design
      • Using the Integrated Memory Controller with the NoC
      • Isochronous Class with Streaming Traffic
      • Inter-NoC Interface-Connecting Multiple NoC Instances
      • Synthesis and Implementing the Design
    • Utilize the Versal™ Network on Chip/Multiple DDR Memory Controllers Tutorial. This example connects several different DDR devices simultaneously in one design so that they communicate with the PS through the NoC. It connects one DDR4 device and two interleaved LPDDR4 devices, which requires one NoC instance to configure the DDRMC for the DDR4 device and another NoC instance to configure the two interleaved DDRMCs for the two LPDDR4 devices.
    • Utilize the Efficient Data Movement with Versal Network on Chip tutorial. This tutorial uses a complex design example to demonstrate how the Versal™ Network on Chip (NoC) simplifies the design process for on-chip data movement. For comparison, a similar design is built in a Zynq® UltraScale+™ device. The NoC frees up programmable logic resources that are consumed by SmartConnect in the Zynq UltraScale+ design. Both designs can be run in hardware, and you can measure data movement and power consumption for comparison purposes.

  • Appropriately Configure the NoC IP:
    Configure the IP for the appropriate number of AXI masters, slaves, inter-NoC interfaces, and memory controllers.
    Based on your traffic model (see the Design for Performance section below), enter the QoS for each AXI channel and set the DDR Address Mapping.
    Configuring Memory Interfaces within the DDRMC is different from previous device families. The DDR controllers are implemented using the NoC IP Wizard. The wizard allows users to configure the target memory device options (memory density parameters, JEDEC timing parameters, and the mode register settings) rather than selecting the memory device from a drop-down menu.
    Additionally, the wizard provides the option for future device expansion to ensure a DDRMC pinout is sufficient when considering future memory topology expansions such as additional ranks, slots, or transitioning to 3DS devices.
    Review Chapter 4 "Integrated Memory Controller (DDRMC) Architecture" in (PG313) for additional information.
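
The relationship between the memory density parameters, the address mapping, and page misses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the geometry (16 row, 10 column, 2 bank, and 2 bank-group bits, x8 device), the two field orderings, and all names below are hypothetical and are not taken from the NoC IP Wizard or (PG313). Addresses are DRAM location indexes rather than byte addresses, and the count includes the first activation of each bank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramGeometry:
    """Address-bit widths of one DRAM device (hypothetical example values)."""
    row_bits: int
    col_bits: int
    bank_bits: int
    bank_group_bits: int
    device_width: int  # DQ bits per device, e.g. 8 for an x8 part

    def device_density_gbit(self) -> float:
        """Density = number of addressable locations x bits per location."""
        addr_bits = (self.row_bits + self.col_bits
                     + self.bank_bits + self.bank_group_bits)
        return (2 ** addr_bits) * self.device_width / 2 ** 30


# Two hypothetical field orderings, listed from least to most significant bit.
ROW_BANK_COL = [("col", 10), ("bank", 2), ("bg", 2), ("row", 16)]
BANK_ROW_COL = [("col", 10), ("row", 16), ("bank", 2), ("bg", 2)]


def decode(addr, order):
    """Split a location index into named fields according to `order`."""
    fields = {}
    for name, width in order:
        fields[name] = addr & ((1 << width) - 1)
        addr >>= width
    return fields


def count_row_misses(addresses, order):
    """Count accesses that need a new row opened in their bank."""
    open_rows = {}  # (bank group, bank) -> currently open row
    misses = 0
    for addr in addresses:
        f = decode(addr, order)
        key = (f["bg"], f["bank"])
        if open_rows.get(key) != f["row"]:
            misses += 1
            open_rows[key] = f["row"]
    return misses


def interleave(base_a, base_b, n):
    """Two sequential streams (for example, two masters) issued alternately."""
    out = []
    for i in range(n):
        out += [base_a + i, base_b + i]
    return out


if __name__ == "__main__":
    geom = DramGeometry(row_bits=16, col_bits=10, bank_bits=2,
                        bank_group_bits=2, device_width=8)
    print("density (Gbit):", geom.device_density_gbit())
    for base_b in (1 << 10, 1 << 26):
        traffic = interleave(0, base_b, 256)
        print("second stream base", base_b,
              "row-bank-col misses:", count_row_misses(traffic, ROW_BANK_COL),
              "bank-row-col misses:", count_row_misses(traffic, BANK_ROW_COL))
```

In this toy model, which ordering wins depends on where the second stream sits in the address space: when both streams land in the same bank they evict each other's row on every access, and when they land in different banks each stream opens its row once. That is why the mapping is best chosen from the traffic model rather than in isolation. DRAM timing itself is not modeled here.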

  • Design for Performance:
    When designing with the NoC and DDRMC, pre-planning to design for performance is critical.
    Chapter 7 "NoC Performance Tuning" in (PG313) reviews the key performance measures of bandwidth, latency, and system design trade-offs affecting performance, and how to optimize performance of the NoC and the integrated DDR Memory Controllers.
    Before designing for performance in your system, utilize the “Versal Network on Chip/DDR Memory Controller Performance Tuning” tutorial on GitHub which demonstrates the process of refining a design to achieve performance goals. You will start with a system DDR traffic spec and learn how to model this with the NoC, DDR memory controllers, and AXI traffic generators (TGs).

    When you are ready, design for performance in your system:

    1. Model the traffic flow. Determine the system’s traffic requirements including command and address patterns.

    2. Determine the system’s aggregate (read and write) bandwidth and bandwidth for each master.

    3. Compare the maximum theoretical bandwidth to the actual/achievable bandwidth for the NoC and DDRMC.
      For NoC bandwidth, refer to the Performance Metrics section of (PG313). For LPDDR4/DDR4 bandwidth, consider SDRAM overhead such as bus turnaround time, page misses, and maintenance commands.

    4. Run simulations to ensure that channels are executing traffic as expected, determine bottlenecks, and utilize the levers available to tune for performance.
      • The Performance AXI Traffic Generator is intended for modeling traffic masters in Versal ACAP designs for performance evaluation of network on chip (NoC) based solutions. It is available in two versions: Non-Synthesizable for simulations only and Synthesizable for both simulations and running in hardware.
        Custom traffic patterns can be loaded into the IP through a .csv file. (Xilinx Answer 75782) provides .csv examples.
      • Include the AXI Performance Monitor IPs which will display read/write latency and bandwidth.

    5. Tune for performance and re-simulate:
      • Ensure that you have the right number of NoC NMUs and DDRMCs to meet your requirements.
        Interleaving memories, additional memories, wider data widths, and running the memories faster are options to consider.
      • Ensure that you have the DDR component that meets your traffic needs. For example, a DDR component with more banks and bank groups can keep more pages open at once, which reduces page misses and switching penalties and ultimately results in better performance.
      • Determine DDR bandwidth for single versus dual channel.
      • Maximize efficiency in your DRAM command and address mapping to reduce DRAM penalties such as page misses.
        Consider your address pattern, command pattern, transaction size, and number of threads accessing DDR.

    6. Replace the Performance AXI Traffic Generators with your targeted AXI endpoints and simulate again. Design custom AXI IPs to maximize interface utilization.
      The Vivado Design Suite: AXI Reference Guide (UG1037), especially Chapter 6 "AXI System Optimization", and the AXI Basics blog series on the Forums are great resources.

      A few quick tips:
      • Ensure that you have Outstanding Reads and Outstanding Writes (visible in simulation) queued at all times.
      • Look at how axi_cache[1] (the AXI Modifiable bit, AxCACHE[1]) is set. If it is marked as modifiable, this allows the NoC to change transactions to behave more efficiently.
      • Minimize data width conversions.
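
As a rough companion to steps 2 and 3 and to the outstanding-transactions tip, the arithmetic can be sketched in Python. Everything numeric here is an assumption chosen for illustration: the 3200 MT/s, 64-bit interface, the 0.7 efficiency factor standing in for turnaround, page-miss, and maintenance overhead, the per-master requirements, the 200 ns round-trip latency, and the 256-byte transaction size. None of these figures come from (PG313); substitute values from your own traffic spec and simulation results. The function names are hypothetical as well.

```python
import math


def peak_bandwidth_gbps(data_rate_mtps, bus_width_bits):
    """Theoretical peak in GB/s: transfers per second x bytes per transfer."""
    return data_rate_mtps * 1e6 * (bus_width_bits / 8) / 1e9


def achievable_bandwidth_gbps(peak_gbps, efficiency):
    """Derate the peak by an assumed efficiency factor (0 < efficiency <= 1)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return peak_gbps * efficiency


def total_demand_gbps(masters):
    """Aggregate read plus write bandwidth over all masters."""
    return sum(read + write for read, write in masters.values())


def min_outstanding(bandwidth_gbps, latency_ns, bytes_per_txn):
    """Little's law: bytes in flight = bandwidth x latency.

    GB/s multiplied by ns gives bytes directly, so no unit scaling is needed.
    """
    in_flight_bytes = bandwidth_gbps * latency_ns
    return math.ceil(in_flight_bytes / bytes_per_txn)


# Hypothetical per-master (read GB/s, write GB/s) requirements.
MASTERS = {
    "video_in": (0.0, 4.0),
    "dsp": (6.0, 2.0),
    "cpu": (1.5, 0.5),
}

if __name__ == "__main__":
    peak = peak_bandwidth_gbps(3200, 64)            # 25.6 GB/s
    usable = achievable_bandwidth_gbps(peak, 0.7)   # assumed efficiency
    demand = total_demand_gbps(MASTERS)
    print(f"peak {peak:.2f} GB/s, assumed usable {usable:.2f} GB/s, "
          f"demand {demand:.2f} GB/s")
    print("fits in one interface:", demand <= usable)
    print("outstanding transactions needed:",
          min_outstanding(demand, latency_ns=200, bytes_per_txn=256))
```

If the demand exceeds the derated figure, that points back to the levers in step 5 (interleaving, additional or wider memories, faster data rates). If a master cannot keep the computed number of transactions in flight, its measured bandwidth will fall short even when the memory itself has headroom.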

 

Design Process Hubs

Xilinx documentation is organized around a set of user design processes to help you find relevant content for your design needs.
Visit these Design Process Hubs for complete information related to your design process:

 

Example Designs and Tutorials

 

Additional Resources

 

2 Comments
garethc
Moderator

Great content

yhorie
Xilinx Employee

Very useful article.