[Translated from the Mellanox website] Understanding PCIe configuration from a maximum-performance perspective.

Understanding PCIe Configuration for Maximum Performance

Original article:

https://community.mellanox.com/docs/DOC-2496

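Why do we use PCIe?

PCIe is used in any system for communication between different modules.

A network adapter needs to communicate with the CPU and memory.

This means that, in order to process network traffic, the different devices communicating over PCIe must be configured properly.

When a network adapter is connected to a PCIe slot, PCIe auto-negotiates the maximum capabilities supported by both the network adapter and the CPU.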

PCIe Attributes

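Translator's note: the following four attributes are parameters of the PCIe port, not of the HCA card.

1. PCIe Width (analogous to the number of lanes on a highway?)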

PCIe width determines the number of PCIe lanes that can be used in parallel by the device for communication.

The width is marked as xA, where A is the number of lanes (e.g. x8 for 8 lanes). Mellanox adapters support x8 and x16 configurations, depending on their type.

# lspci -s 04:00.0 -vvv | grep Width
             LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
             LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
           

Note the different meanings of the two lines above:

LnkCap shows the device capabilities.

LnkSta shows the current status of the link.
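
The comparison between LnkCap and LnkSta can be automated. The following is a minimal Python sketch, assuming the lspci output has already been captured as text; the helper names parse_link and link_is_degraded are hypothetical and not part of any Mellanox tool.

```python
import re

# Sample lines copied from the lspci output above.
LSPCI_OUTPUT = """\
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""

# Captures the field name, the per-lane speed in GT/s, and the lane count.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(text):
    """Return a dict mapping 'LnkCap'/'LnkSta' to (speed_gt_s, width)."""
    links = {}
    for line in text.splitlines():
        match = LINK_RE.search(line)
        if match:
            links[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return links


def link_is_degraded(links):
    """True if the negotiated link is slower or narrower than the capability."""
    cap_speed, cap_width = links["LnkCap"]
    sta_speed, sta_width = links["LnkSta"]
    return sta_speed < cap_speed or sta_width < cap_width


links = parse_link(LSPCI_OUTPUT)
print(links)
print("degraded:", link_is_degraded(links))
```

For the sample above, capability and status match, so the link is not reported as degraded. A status that is narrower or slower than the capability would suggest that the slot or the negotiation is limiting the adapter.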

2. PCIe Speed (analogous to the capacity of each highway lane?)

The PCIe speed determines how many transfers each lane can carry per second.

The speed is measured in GT/s, which stands for gigatransfers per second (billions of transfers per second).

Together with the PCIe width, the speed determines the maximal PCIe bandwidth (speed * width).

# lspci -s 04:00.0 -vvv | grep Speed
             LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
             LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
           

PCIe speeds are identified as "generations", where 2.5GT/s is referred to as "gen1", 5GT/s as "gen2", and 8GT/s as "gen3".

Most Mellanox products support all generations. 

You can view the PCIe generation by using the command lspci as well:

# lspci -s 04:00.0 -vvv | grep PCIeGen
                        [V0] Vendor specific: PCIeGen3 x8
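
The speed-to-generation naming can be written as a small lookup. This is only an illustrative Python sketch covering the three speeds named in this article; the function name pcie_generation is hypothetical.

```python
# Per-lane speeds named in the text, mapped to their generation labels.
SPEED_TO_GENERATION = {2.5: "gen1", 5.0: "gen2", 8.0: "gen3"}


def pcie_generation(speed_gt_s):
    """Return the generation label for a per-lane speed given in GT/s."""
    try:
        return SPEED_TO_GENERATION[float(speed_gt_s)]
    except KeyError:
        # Speeds outside the article's list are rejected rather than guessed.
        raise ValueError(f"speed {speed_gt_s} GT/s is not covered here") from None


print(pcie_generation(8))
```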
           

3. PCIe Max Payload Size (analogous to the MTU of an Ethernet adapter?)

The PCIe Max Payload Size determines the maximal size of a PCIe packet, or PCIe MTU (similar to the MTU in networking protocols).

This means that larger PCIe transactions are broken into PCIe MTU sized packets. 
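
As a rough illustration of this splitting, the Python sketch below counts how many packets a transfer needs for a given Max Payload Size. It is a simplification that assumes every packet except possibly the last is completely full and ignores address alignment; the function name packets_for_transfer is hypothetical.

```python
def packets_for_transfer(transfer_bytes, max_payload_bytes):
    """Number of PCIe packets needed to carry transfer_bytes of payload.

    Simplifying assumption: all packets except possibly the last are full.
    """
    if transfer_bytes < 0 or max_payload_bytes <= 0:
        raise ValueError("sizes must be non-negative and the payload size positive")
    # Integer ceiling division.
    return -(-transfer_bytes // max_payload_bytes)


# A 4096-byte transfer with a 256-byte Max Payload Size.
print(packets_for_transfer(4096, 256))
# The same transfer with a 512-byte Max Payload Size needs half as many packets.
print(packets_for_transfer(4096, 512))
```

Since each packet carries its own header, a smaller Max Payload Size means more header overhead for the same amount of data.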

This parameter is set only by the system and depends on the chipset architecture (e.g. x86_64, Power8, ARM, etc). 

You can view the PCIe Max Payload Size by using the command lspci (specified under DevCtl).

# lspci -s 04:00.0 -vvv | grep DevCtl: -C 2
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
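
Note that DevCap reports what the device supports (512 bytes here), while DevCtl reports the value actually in use (256 bytes here). The Python sketch below extracts both from captured text. It assumes the layout shown above, where continuation lines belong to the most recent DevCap or DevCtl header; the function name parse_dev_settings is hypothetical.

```python
import re

# Sample lines copied from the lspci output above.
SAMPLE = """\
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
        MaxPayload 256 bytes, MaxReadReq 4096 bytes
"""


def parse_dev_settings(text):
    """Return {(section, field): bytes} for MaxPayload and MaxReadReq."""
    settings = {}
    section = None
    for line in text.splitlines():
        header = re.match(r"\s*(DevCap|DevCtl):", line)
        if header:
            # Later continuation lines are attributed to this section.
            section = header.group(1)
        if section is None:
            continue
        for field, value in re.findall(r"(MaxPayload|MaxReadReq)\s+(\d+)\s+bytes", line):
            settings[(section, field)] = int(value)
    return settings


for key, value in parse_dev_settings(SAMPLE).items():
    print(key, value)
```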
           

4. PCIe Max Read Request

PCIe Max Read Request determines the maximal size of a single PCIe read request.

A PCIe device usually keeps track of the number of pending read requests due to having to prepare buffers for an incoming response.

The size of the PCIe max read request may affect the number of pending requests (when using data fetch larger than the PCIe MTU). 
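
To make the relationship concrete, the Python sketch below estimates how many read requests a data fetch needs and how many response packets come back. It is a simplified model: it assumes each response packet carries up to one Max Payload Size of data, which real systems may not follow exactly. The function name plan_read is hypothetical.

```python
def plan_read(fetch_bytes, max_read_req, max_payload):
    """Return (read_requests, response_packets) for one data fetch.

    Simplifying assumption: each response packet carries up to max_payload bytes.
    """
    if fetch_bytes < 0 or max_read_req <= 0 or max_payload <= 0:
        raise ValueError("fetch size must be non-negative, limits must be positive")
    # Split the fetch into requests no larger than the Max Read Request.
    request_sizes = []
    remaining = fetch_bytes
    while remaining > 0:
        size = min(remaining, max_read_req)
        request_sizes.append(size)
        remaining -= size
    # Each request is answered by ceil(size / max_payload) response packets.
    responses = sum(-(-size // max_payload) for size in request_sizes)
    return len(request_sizes), responses


# A 64 KiB fetch with the values from the lspci output above.
print(plan_read(65536, 4096, 256))
# The same fetch with a smaller Max Read Request needs many more requests.
print(plan_read(65536, 512, 256))
```

In this model a smaller Max Read Request leaves the number of response packets unchanged but multiplies the number of requests the device has to track.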

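Again, use the command lspci to query the Max Read Request value. It is reported as MaxReadReq on the DevCtl lines, as in the output shown above for the Max Payload Size.

Calculating the PCIe Limits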

The maximum possible PCIe bandwidth is calculated by multiplying the PCIe width and speed.

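From that number we subtract ~1Gb/s for error correction protocols and the PCIe headers overhead. The overhead is determined by both the PCIe encoding and the PCIe MTU. Gen1 and gen2 links use 8b/10b encoding, so 2 of every 10 bits (1/5) are encoding overhead, while gen3 links use 128b/130b encoding, so 2 of every 130 bits are overhead. ENCODING below is this overhead fraction: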

Maximum PCIe Bandwidth = SPEED * WIDTH * (1 - ENCODING) - 1Gb/s.

For example, a gen 3 PCIe device with x8 width will be limited to:

Maximum PCIe Bandwidth = 8G * 8 * (1 - 2/130) - 1G = 64G * 0.985 - 1G = ~62Gb/s.

Another example - a gen 2 PCIe device with x16 width will be limited to:

Maximum PCIe Bandwidth = 5G * 16 * (1 - 1/5) - 1G = 80G * 0.8 - 1G = ~63Gb/s.
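
The formula and the two examples above can be checked with a few lines of Python. This is a sketch of the article's estimate only: the 1Gb/s deduction is the article's approximation, and the name max_pcie_bandwidth_gbps is hypothetical.

```python
# Fraction of raw bits lost to line encoding, per link speed in GT/s.
# 8b/10b for gen1 and gen2, 128b/130b for gen3.
ENCODING_OVERHEAD = {2.5: 2 / 10, 5.0: 2 / 10, 8.0: 2 / 130}


def max_pcie_bandwidth_gbps(speed_gt_s, width, protocol_overhead_gbps=1.0):
    """Estimate usable PCIe bandwidth in Gb/s with the article's formula.

    SPEED * WIDTH * (1 - ENCODING) - 1Gb/s, where the 1Gb/s term is the
    article's rough allowance for error correction and packet headers.
    """
    encoding = ENCODING_OVERHEAD[float(speed_gt_s)]
    return speed_gt_s * width * (1 - encoding) - protocol_overhead_gbps


print(round(max_pcie_bandwidth_gbps(8.0, 8), 1))   # gen3 x8
print(round(max_pcie_bandwidth_gbps(5.0, 16), 1))  # gen2 x16
```

The two results come out at about 62Gb/s and 63Gb/s, in line with the worked examples above.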
