Visitor renardo18
26,991 Views
Registered: ‎05-06-2015

PCI express Base Address Register

Hi, I am trying to implement (for the first time) the PCI Express Gen3 IP in a Kintex UltraScale FPGA.

 

I am not sure I clearly understand what BARs are. In the configuration space of the IP, from address 10h to 24h, there are up to 6 Base Address Registers.

For instance, let's say that each BAR is 2 KB wide.

 

Does this mean that the address written into the BAR0 register at offset 10h is the first address of BAR0's 2 KB memory region?

 

How do I access BARs? I understood that I have to access them through the AXI Stream interface of the IP, is that correct? 

 

And finally, where are these BARs actually mapped in the FPGA? Do I need to instantiate a RAM (and if so, where do I connect it?), or will these 2 KB memory areas be instantiated automatically inside the IP, where I can't access them directly?

 

Thank you

Professor
26,987 Views
Registered: ‎08-14-2007

Re: PCI express Base Address Register

The BAR is only for address decoding.  It does not implement the memory, only the address window for accessing memory.  For each BAR, the system will read the size of the required memory window and allocate a physical address range for access to that window.  Inside the FPGA on the user side, you typically get signals indicating which BAR was active (which address window was used for this transaction).  You then need to decode this information to decide what to do with the transaction.

-- Gabor
Adventurer
26,795 Views
Registered: ‎02-22-2016

Re: PCI express Base Address Register

Hi Gabor,

I have the same problem. I want to view a specific memory BAR address from VHDL/Verilog instead of with the PCITree program. I have no idea how to reach that specific data in the IP code. Where are these memory BARs allocated?

How should I go about finding a solution or developing my knowledge in this field? Is there a video, article, example, or something similar?

Thank you in advance

Scholar markcurry
26,676 Views
Registered: ‎09-16-2009

Re: PCI express Base Address Register

Here's an example of how BARs work in PCIE land.  Hopefully with the example some of both posters' questions will be answered.

 

For a Xilinx PCIE Endpoint, the designer sets the BAR size in the GUI, or by setting the requisite parameters in RTL.

As an example, a device requires two memory BARs: BAR0 requires 128MB of address space, and BAR1 requires 64MB of address space.  These configuration values become part of the implemented design (they are NOT run-time configurable).

 

At boot time, the root complex probes the entire PCIE tree.  It eventually reaches our device, and attempts to map all the BARs into its Device Memory Map. Here's the basic procedure:

  It writes all ones (32'hffff_ffff) to each BAR (using a CFG WR request) at the BAR locations in CFG memory (address 10h to 24h).

  It then reads each BAR that it just wrote all ones to.  For our example this read will return (ignore the lower nibble - those bits are special, and we're ignoring them for now):

   BAR0 = 0xF800_0000

   BAR1 = 0xFC00_0000

   (The rest of the BARS will return all-zeros). 

 

So, we wrote all ones, but read back different data.  The lower 27 bits of zeros in BAR0 tell the initialization device that BAR0 requires 27 bits of address space - which equal 128MB (the range we were looking for).  Effectively, the endpoint ignores writes to the BAR for those address bits which will NOT be decoded by the BAR itself.  Those ignored bits are actually decoded from the specific transaction, later.
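
As a rough illustration of the "ignored writes" behaviour described above, here is a minimal Python model of a single 32-bit memory BAR register. The class name `BarRegister` and its methods are hypothetical, made up for this sketch; they are not part of any Xilinx IP or driver API. The low nibble is simply held at zero, as in the example.

```python
class BarRegister:
    """Toy model of one 32-bit memory BAR (hypothetical, for illustration only)."""

    def __init__(self, size_bytes: int) -> None:
        # BAR sizes must be a power of two; 16 bytes is the smallest memory BAR.
        if size_bytes < 16 or size_bytes & (size_bytes - 1):
            raise ValueError("BAR size must be a power of two >= 16 bytes")
        # Only the address bits above the window size are writable.
        self.writable_mask = ~(size_bytes - 1) & 0xFFFF_FFFF
        self.value = 0

    def cfg_write(self, data: int) -> None:
        # Writes to the low (in-window) address bits are silently ignored.
        self.value = data & self.writable_mask

    def cfg_read(self) -> int:
        return self.value


bar0 = BarRegister(128 * 1024 * 1024)
bar0.cfg_write(0xFFFF_FFFF)          # sizing probe from the host
probe = bar0.cfg_read()
bar0.cfg_write(0x2000_0000)          # host assigns the base address
print(hex(probe), hex(bar0.cfg_read()))
```

The same model with a 64MB size gives the BAR1 readback from the example.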

 

Similar for BAR1: The lower 26 bits are zero, so the endpoint's asking for 64 MB of address space. The remaining BARs are all zeros, so are not implemented.
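
Seen from the host side, the size calculation for the values above can be sketched in a few lines of Python. The function name `bar_size_from_readback` is an assumption for this example, and it only handles 32-bit memory BARs (no 64-bit or I/O BARs).

```python
def bar_size_from_readback(readback: int) -> int:
    """Return the window size in bytes requested by a 32-bit memory BAR.

    `readback` is the value read after writing all ones to the BAR.
    Returns 0 when the BAR is not implemented (readback of all zeros).
    """
    mask = readback & 0xFFFF_FFF0      # drop the low nibble (flag bits)
    if mask == 0:
        return 0
    # Invert the writable address bits and add one to get the size.
    return (~mask & 0xFFFF_FFFF) + 1


for name, value in (("BAR0", 0xF800_0000), ("BAR1", 0xFC00_0000), ("BAR2", 0x0)):
    size_mb = bar_size_from_readback(value) // (1024 * 1024)
    print(f"{name}: {size_mb} MB")
```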

 

The initialization device then finds a place in its entire device address space to put BAR0 and BAR1.

Let's say it determines that BAR0 goes to 0x2000_0000, and BAR1 goes to 0x3800_0000.  It sets this by writing again to the BAR space (via CFG writes).  BAR0 is programmed with 0x2000_0000, BAR1 is programmed with 0x3800_0000.

 

Now reads/writes from the host to Device Address 0x2000_0000 through 0x27FF_FFFF will be routed to our device, at BAR0.  Reads/writes from the host to Device Address 0x3800_0000 through 0x3BFF_FFFF will be routed to our device, at BAR1.  Any transactions reaching our endpoint on BAR0 will then use the lower 27 bits of the transaction address for local addressing.
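
To make the address decode concrete, here is a small Python sketch using the example base addresses and sizes from this post. The `BARS` table and the `decode` function are hypothetical names for illustration; in a real design this decode is done by the endpoint hardware, which typically reports the BAR hit alongside the transaction.

```python
from typing import Optional, Tuple

# (name, base address programmed by the host, window size in bytes)
BARS = [
    ("BAR0", 0x2000_0000, 128 * 1024 * 1024),
    ("BAR1", 0x3800_0000, 64 * 1024 * 1024),
]


def decode(address: int) -> Optional[Tuple[str, int]]:
    """Return (BAR name, local offset) for a host address, or None on a miss."""
    for name, base, size in BARS:
        if base <= address < base + size:
            # The base is aligned to the size, so the low bits are the offset.
            return name, address & (size - 1)
    return None


print(decode(0x27FF_FFFF))   # last byte of the BAR0 window
print(decode(0x3800_0010))   # 0x10 bytes into BAR1
print(decode(0x2800_0000))   # just past BAR0: no BAR claims it
```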

 

I've glossed over a bunch (64-bit addressing, IO space), as well as virtual addressing to physical addressing.  But that's the basics.

 

Does this explanation help?

 

Regards,

 

Mark

 

 

Adventurer
26,606 Views
Registered: ‎02-22-2016

Re: PCI express Base Address Register

Hi @markcurry,

Thank you for your descriptive answer. In my application, when the root complex writes something to a specific memory BAR location on an endpoint device (AC701), I thought I could show this value on the AC701 character LCD via the endpoint code. I assumed I could reach that BAR memory location from the AC701. After your answer, I understand that this memory BAR is allocated to the endpoint device in the root complex's BAR memory map. Can I reach this value and show it on the LCD by using root complex code?

Thank you,

herdinc

Xilinx Employee
26,340 Views
Registered: ‎11-25-2015

Re: PCI express Base Address Register

Hi herdinc,

 

During enumeration, the host scans the hierarchy and identifies all the devices. It maps the BARs and identifies capabilities based on the PCI configuration space of each device.

 

Each device can have a maximum of 6 BARs, and each BAR can have its own MSI/MSI-X within the specific BAR range, because all 6 BARs (if needed) will be mapped to independent physical functions and will have no relation to each other.

Each physical function can have many virtual functions, but virtual functions have nothing to do with BAR mapping.

 

Process:

 

Once the link is up and has entered the L0 state, enumeration begins with Cfg Wr and Cfg Rd requests.

After identifying all the connected devices, the host finds out more about each device's functionality from the PCI configuration space of the device.

 

From the first 4 DWs it gets the device ID, vendor ID, and so on. Next is the BAR0 register, which is always at 10h; BAR5 is always at 24h. The host writes all 1s into the BAR0 register and reads back the device's requested memory size in an encoded form. The design implies that all address space sizes are a power of two and are naturally aligned. Depending on the memory requirement of the device, the read returns, for example, BAR0 = 0xF800_0000 (5 ones and 27 zeros), which means 2^27 bytes, i.e. 128MB of space is needed by the device. The host then writes the starting address of the region it has mapped for BAR0 into the BAR0 register in the device's PCI configuration space.

 

Now 128MB of the host's memory address space is allocated to that endpoint, and the device knows where its region starts from the start address held in the BAR0 register. It is similar for all 6 BARs, each having its own starting address. If only BAR0 is used by the device, all the other BAR locations read back as all zeros (32'h0000_0000) when the host reads them. The device only has information about where the BAR0 region starts and how many MB are allocated to it (here "device" means the firmware/software sitting on top of the particular endpoint).

 

Any other device connected to the host that wants to talk to this particular device needs to write to the corresponding BAR address range in the host memory map. The device cannot write anything into its own memory-mapped BAR address range. If the device has all 6 BARs mapped, it can't access any location in any of the 6 mapped BAR ranges.

 

The software sitting on top of the device needs to know which other devices are connected to the host, what their BAR locations are, the MSI location for each BAR, and so on. The software/firmware must be smart enough to generate an address that targets the right device's BAR memory location, so that it talks to the correct device.

 

So during enumeration, after mapping all 6 BAR memory spaces and writing the start addresses into all 6 BAR registers, the host scans the device capabilities, starting from the capability pointer, which is always at 34h. It takes the address held in the capability pointer and jumps to that address. At the new address it checks whether the next-capability pointer is 0; if not, it takes that value (the address of the next capability register), jumps to the new register location, and reads the next-capability pointer there. The process continues until the pointer becomes 0.

 

For example, the device has 10 capability registers if the capability pointer becomes 0 after jumping through and reading 10 registers. The host then decodes all the registers to learn what each capability is for.

Capability ID = 05h means MSI capability register

Capability ID = 11h means MSI-X capability register

Capability ID = 10h means PCI Express capability register

Capability ID = 0001h (in the extended capability list, which starts at 100h) means the Advanced Error Reporting extended capability

The host enumerates and gets to know all the capabilities of the particular device by reading and decoding those capability registers.
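
The pointer-chasing described above can be sketched in Python against a fake 256-byte configuration space. The capability offsets used here (40h, 60h, 70h) are invented for the example, and `walk_capabilities` is a hypothetical helper, not a real driver function. Each capability starts with an ID byte followed by a next-pointer byte.

```python
CAP_NAMES = {0x05: "MSI", 0x10: "PCI Express", 0x11: "MSI-X"}


def walk_capabilities(cfg: bytes) -> list:
    """Follow the capability list starting at the pointer stored at 34h."""
    found = []
    visited = set()
    ptr = cfg[0x34] & 0xFC             # low two bits of the pointer are reserved
    while ptr != 0 and ptr not in visited:
        visited.add(ptr)               # guard against a malformed, looping list
        cap_id = cfg[ptr]              # byte 0: capability ID
        found.append((ptr, cap_id))
        ptr = cfg[ptr + 1] & 0xFC      # byte 1: pointer to the next capability
    return found


# Build a made-up configuration space with three chained capabilities.
cfg = bytearray(256)
cfg[0x34] = 0x40
cfg[0x40:0x42] = bytes([0x05, 0x60])   # MSI, next capability at 60h
cfg[0x60:0x62] = bytes([0x11, 0x70])   # MSI-X, next capability at 70h
cfg[0x70:0x72] = bytes([0x10, 0x00])   # PCI Express, end of list

for offset, cap_id in walk_capabilities(cfg):
    print(f"{offset:#04x}: {CAP_NAMES.get(cap_id, 'unknown')}")
```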

 

This happens for all the devices connected to the host. In this process, the host reads the MPS (Max Payload Size) of all devices connected in the hierarchy. It takes the lowest MPS supported by any device in the hierarchy and programs that as the new MPS for all the devices connected to it. Likewise, slot identification (which slot on the motherboard is connected) and many other identification steps are processed by the host.

 

When a particular device is hot-plugged or removed, all the devices connected to the host are enumerated again.

As a whole, each device function on the bus has a PCI-compatible configuration space 256 bytes long (PCI Express extends it to 4 KB). It is addressed with an 8-bit bus number, a 5-bit device number and a 3-bit function number. This allows up to 256 buses, each with up to 32 devices, each supporting eight functions. A single PCI expansion card can respond as a device and must implement at least function number zero.
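
As a quick sketch of the 8/5/3-bit split, the bus/device/function triple packs into a 16-bit value like this. The helper names `pack_bdf` and `unpack_bdf` are assumptions for illustration only.

```python
def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> tuple:
    """Split a 16-bit value back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


bdf = pack_bdf(3, 0, 1)
print(hex(bdf), unpack_bdf(bdf))
print(256 * 32 * 8)                    # number of addressable functions
```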

 

I have tried to explain BAR mapping and how the host enumerates the PCI config space, taking one device and one BAR mapping as the example. It applies to all 256 x 32 x 8 functions, and each of the 8 functions is mapped to one or many BARs based on its memory and physical function requirements. The sequence of operations in the enumeration hierarchy is another vast topic.

 

Answer to your question:

 

Yes you can.

 

Since the FPGA (device firmware) knows the starting address at which the device BAR is mapped in the host memory map and how many MB are allocated, you can access that particular memory location by writing root complex code that reads the data available there.

 

But you can't identify what the data is, because the device FPGA should not access its own BAR-mapped location. Two things can happen:

  • Someone who wants to talk to the FPGA (endpoint device) will be writing some data into its BAR-mapped host memory.
  • Otherwise, if no other device is writing any data into the BAR memory, it might be all 1s or 0s (the value stored in host system memory at the time of reset).

So if you just want to see RC memory data mapped to the LCD (all 1s, all 0s, or any random data written by another device), you can do it. But you can't decode any meaningful pattern from that data.

 

Thanks, Sethu

 

 

 

Observer bstahlman
13,605 Views
Registered: ‎01-31-2017

Re: PCI express Base Address Register

Mark,

Good explanation, but I have a follow-up question... It doesn't sound as though the PCIe device really needs to know the address written to its BAR(s), and presumably the host will have the address-mapping information stored more conveniently in its own memory. So is the purpose of writing the address(es) to the BAR(s) to allow intermediate PCIe switches/bridges "snooping" the configuration write packets to "learn" the routes to the various endpoints? In other words, are switches responsible for decoding the upper address bits (the ones that were read back as 1's during device configuration) to figure out how to route Memory Write TLPs?

Thanks,

Brett S.

Scholar markcurry
13,601 Views
Registered: ‎09-16-2009

Re: PCI express Base Address Register

Brett,

 

I can only guess at what happens "before" the endpoint (i.e. at PCIE switches, the root complex, etc.).  The PCIE spec does go into some special cases which switches may run into, but I've never looked at it in detail.  I'm not sure switches/bridges would be "snooping", however. You may be putting the cart before the horse: the switch configuration may actually define the legal address ranges that the BAR may be set to.  But again, I'm just guessing here.

 

You're correct in that endpoint shouldn't really care what the "full" BAR address is (i.e. it should only use address bits [26:0] in my example for BAR0).  The full address will show up in PCIE packets - so your endpoint device must be designed such that it correctly ignores those extra bits. Addresses outside your BAR (but still received at your interface) would be some sort of fault condition, I'd assume.

 

Regards,

 

Mark

 

 

 

Observer bstahlman
13,581 Views
Registered: ‎01-31-2017

Re: PCI express Base Address Register

Mark,

Thanks. That makes sense. My assumption regarding "snooping" (perhaps the wrong term) by the switches was based on the following (possibly flawed) understanding. Would appreciate corrections/clarifications from anyone...

 

When the host's PCIe initialization code writes to a device's BAR registers, it addresses the device's Configuration Space using B/D/F. On the other hand, when the host's device drivers subsequently write to the device's memory/registers, they use addresses in the range defined by the BARs, and the B/D/F is not even included in the packet; thus, the switch would need to know how to route packets based on the address alone. Finally, all address ranges must be allocated by the host because these ranges have implications for MMU configuration, which the switches would know nothing about. Moreover, host device drivers must be able to use the addresses to specify target peripherals. It's even possible that in some systems, the PCIe controller abstracts the whole process so completely that writing to a PCIe device looks to the host like an ordinary memory write: e.g., PCIe controller decodes the write address on the system bus and automatically generates the corresponding Memory Write TLP(s) on the PCIe bus... Reads from PCIe devices could work similarly, with PCIe controller decoding the read address, generating the Memory Read TLP, and holding the system bus till the completer TLP arrives with the data read from the peripheral, at which point, the PCIe controller could place the returned data on the bus and allow the memory read cycle to complete... (Not saying it typically works that way, only that it seems possible...)

 

Thanks,

Brett S.

 

Explorer
10,509 Views
Registered: ‎04-11-2016

Re: PCI express Base Address Register

hi @sethus @markcurry

I have about 256 MB (0x0ff00000) of non-prefetchable memory space:

 

0x82000000 0 0x40000000 0x40000000 0 0x0ff00000>; /* non-prefetchable memory */

 

This is set in the dtsi file of the Yocto Linux kernel, as below:

 

pcie: pcie@0x33800000 {
	compatible = "fsl,imx7d-pcie", "snps,dw-pcie";
	reg = <0x33800000 0x4000>, <0x4ff00000 0x80000>;
	reg-names = "dbi", "config";
	#address-cells = <3>;
	#size-cells = <2>;
	device_type = "pci";
	ranges = <0x81000000 0 0 0x4ff80000 0 0x00010000 /* downstream I/O 64KB */
		  0x82000000 0 0x40000000 0x40000000 0 0x0ff00000>; /* non-prefetchable memory */
	num-lanes = <1>;
	interrupts = <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
	interrupt-names = "msi";
	#interrupt-cells = <1>;
	interrupt-map-mask = <0 0 0 0x7>;
	interrupt-map = <0 0 0 1 &intc GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
			<0 0 0 2 &intc GIC_SPI 124 IRQ_TYPE_LEVEL_HIGH>,
			<0 0 0 3 &intc GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
			<0 0 0 4 &intc GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
	clocks = <&clks IMX7D_PCIE_CTRL_ROOT_CLK>,
		 <&clks IMX7D_PLL_ENET_MAIN_100M_CLK>,
		 <&clks IMX7D_PCIE_PHY_ROOT_CLK>;
	clock-names = "pcie", "pcie_bus", "pcie_phy";
	pcie-phy-supply = <&reg_1p0d>;
	status = "disabled";
};

 

When I set a 64 MB BAR size in the 7 Series PCIe Gen2 x1 Xilinx IP core GUI, it doesn't work, but smaller BAR sizes (in the KB range) work.

 

Since I have already set 256 MB in Yocto, why is this 64 MB BAR not working?

 

or do I need some additional editing/mapping in this dtsi file?
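
For reference, the two `ranges` entries above can be decoded with a short Python sketch. It assumes the parent bus uses one address cell (inferred from the seven cells per entry; the parent node is not shown here), and `decode_ranges` is a hypothetical helper. It shows that the memory window is 0x0ff00000 bytes, which works out to 255 MB rather than 256 MB. It does not by itself explain why the 64 MB BAR fails.

```python
SPACE_CODES = {0: "config", 1: "I/O", 2: "32-bit memory", 3: "64-bit memory"}


def decode_ranges(cells, parent_addr_cells=1):
    """Decode PCI 'ranges' cells: 3 child address cells, parent address, 2 size cells."""
    step = 3 + parent_addr_cells + 2
    entries = []
    for i in range(0, len(cells), step):
        phys_hi, phys_mid, phys_lo = cells[i:i + 3]
        parent = 0
        for cell in cells[i + 3:i + 3 + parent_addr_cells]:
            parent = (parent << 32) | cell
        size_hi, size_lo = cells[i + 3 + parent_addr_cells:i + step]
        entries.append({
            "space": SPACE_CODES[(phys_hi >> 24) & 0x3],   # bits 25:24 of phys.hi
            "prefetchable": bool(phys_hi & (1 << 30)),      # bit 30 of phys.hi
            "pci_addr": (phys_mid << 32) | phys_lo,
            "cpu_addr": parent,
            "size": (size_hi << 32) | size_lo,
        })
    return entries


RANGES = [
    0x81000000, 0, 0x00000000, 0x4ff80000, 0, 0x00010000,
    0x82000000, 0, 0x40000000, 0x40000000, 0, 0x0ff00000,
]

for entry in decode_ranges(RANGES):
    print(entry["space"], hex(entry["cpu_addr"]), entry["size"] / 2**20, "MB")
```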
