[Translated from the official Mellanox website]

Understanding PCIe Configuration for Maximum Performance

Original article:

https://community.mellanox.com/docs/DOC-2496

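Why do we use PCIe?

PCIe is used in any system for communication between different modules.

A network adapter needs to communicate with the CPU and memory.

This means that, in order to process network traffic, the different devices communicating over PCIe should be configured properly.

When a network adapter is connected to a PCIe slot, PCIe auto-negotiates the maximal capabilities supported between the network adapter and the CPU.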

PCIe Attributes

Translator's note: the four attributes below are parameters of the PCIe port, not of the HCA card.

1.PCIe Width ---- similar to the number of lanes on a highway?

PCIe width determines the number of PCIe lanes that can be used in parallel by the device for communication.

The width is marked as xA, where A is the number of lanes (e.g. x8 for 8 lanes). Mellanox adapters support x8 and x16 configurations, depending on their type.

# lspci -s 04:00.0 -vvv | grep Width
             LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
             LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
           

Note that the two lines above have different meanings:

LnkCap reports the device capabilities.

LnkSta reports the current link status.
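The difference between LnkCap and LnkSta can also be checked programmatically. The following is a minimal Python sketch, assuming the lspci output has already been captured as text; the function names parse_link and link_is_degraded are hypothetical and are not part of any Mellanox or pciutils tool.

```python
import re

# Sample text copied from the lspci output above.
SAMPLE = """\
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""

# Matches "LnkCap:" or "LnkSta:" followed by a speed in GT/s and a width in lanes.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(text):
    """Return {'LnkCap': (speed_gts, width), 'LnkSta': (speed_gts, width)}."""
    result = {}
    for line in text.splitlines():
        match = LINK_RE.search(line)
        if match:
            result[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return result


def link_is_degraded(text):
    """True if the negotiated link is slower or narrower than the device capability."""
    link = parse_link(text)
    cap_speed, cap_width = link["LnkCap"]
    sta_speed, sta_width = link["LnkSta"]
    return sta_speed < cap_speed or sta_width < cap_width


print(parse_link(SAMPLE))
print(link_is_degraded(SAMPLE))
```

For the sample above, the capability and the status are identical, so the link is not reported as degraded. A card that supports x8 but sits in a slot wired for x4 would show a smaller width under LnkSta than under LnkCap.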

2.PCIe Speed ---- similar to the throughput of each lane on a highway?

The PCIe speed determines the number of transfers possible per second on each lane.

The speed is measured in GT/s, which stands for gigatransfers per second (billions of transfers per second).

Together with the PCIe width, it determines the maximal PCIe bandwidth (speed * width).

# lspci -s 04:00.0 -vvv | grep Speed
             LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
             LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
           

PCIe speeds are identified as "generations",

where 2.5GT/s is referred to as "gen1", 5GT/s as "gen2", and 8GT/s as "gen3".

Gen1 and gen2 use 8b/10b encoding, while gen3 uses 128b/130b encoding.

Most Mellanox products support all generations. 

You can view the PCIe generation by using the command lspci as well:

# lspci -s 04:00.0 -vvv | grep PCIeGen
                        [V0] Vendor specific: PCIeGen3 x8
           
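The speed-to-generation mapping and the "speed * width" rule can be written down as a small lookup. This is an illustrative Python sketch only; the table name PCIE_GENERATIONS and both functions are hypothetical. The encoding overhead values assume 8b/10b encoding for gen1 and gen2 and 128b/130b encoding for gen3.

```python
# Speed per lane in GT/s -> (generation name, fraction of bits lost to encoding).
PCIE_GENERATIONS = {
    2.5: ("gen1", 2 / 10),   # 8b/10b encoding
    5.0: ("gen2", 2 / 10),   # 8b/10b encoding
    8.0: ("gen3", 2 / 130),  # 128b/130b encoding
}


def generation_for_speed(speed_gts):
    """Return the generation name for a per-lane speed given in GT/s."""
    try:
        return PCIE_GENERATIONS[float(speed_gts)][0]
    except KeyError:
        raise ValueError(f"unknown PCIe speed: {speed_gts} GT/s")


def raw_bandwidth_gbps(speed_gts, width):
    """Raw link bandwidth (speed * width), before any overhead is subtracted."""
    return speed_gts * width


print(generation_for_speed(8))   # gen3
print(raw_bandwidth_gbps(8, 8))  # 64
```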

3.PCIe Max Payload Size ---- similar to the MTU of an Ethernet card?

The PCIe Max Payload Size determines the maximal size of a PCIe packet, or PCIe MTU (similar to networking protocols).

This means that larger PCIe transactions are broken into PCIe MTU sized packets. 

This parameter is set only by the system and depends on the chipset architecture (e.g. x86_64, Power8, ARM, etc). 

You can view the PCIe Max Payload Size by using the command lspci (specified under DevCtl).

# lspci -s 04:00.0 -vvv | grep DevCtl: -C 2
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
           
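In the output above, DevCap shows the payload size the device supports (512 bytes) and DevCtl shows the value currently configured (256 bytes). The Python sketch below extracts both values from captured lspci text and estimates how many packets a larger transaction is broken into. The function names are hypothetical, and the packet count is a simplified model that ignores headers and alignment.

```python
import math
import re

# Sample text copied from the lspci output above.
SAMPLE = """\
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
        MaxPayload 256 bytes, MaxReadReq 4096 bytes
"""


def parse_payload_sizes(text):
    """Return {'DevCap': supported_bytes, 'DevCtl': configured_bytes}."""
    section = None
    sizes = {}
    for line in text.splitlines():
        # A line such as "DevCtl: ..." starts a new section; indented
        # continuation lines belong to the most recent section.
        header = re.match(r"\s*(\w+):", line)
        if header:
            section = header.group(1)
        match = re.search(r"MaxPayload (\d+) bytes", line)
        if match and section in ("DevCap", "DevCtl"):
            sizes[section] = int(match.group(1))
    return sizes


def packets_needed(transfer_bytes, max_payload):
    """Number of payload-sized packets needed to carry transfer_bytes."""
    return math.ceil(transfer_bytes / max_payload)


sizes = parse_payload_sizes(SAMPLE)
print(sizes)
print(packets_needed(4096, sizes["DevCtl"]))  # 16
```

With the configured 256-byte payload, a 4096-byte transaction is carried in 16 packets; with the supported 512 bytes it would need only 8.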

4.PCIe Max Read Request

PCIe Max Read Request determines the maximal PCIe read request allowed.

A PCIe device usually keeps track of the number of pending read requests due to having to prepare buffers for an incoming response.

The size of the PCIe max read request may affect the number of pending requests (when fetching data larger than the PCIe MTU).
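To illustrate why this size matters, the Python sketch below counts how many read requests are needed to fetch a buffer for a given Max Read Request value. It is a simplified model under stated assumptions: the function name is hypothetical, and alignment and the way completions are split are ignored.

```python
import math


def read_requests_needed(fetch_bytes, max_read_request):
    """Minimum number of read requests needed to fetch fetch_bytes."""
    return math.ceil(fetch_bytes / max_read_request)


# Compare a small and a large Max Read Request for a 1 MiB fetch.
for max_read_request in (512, 4096):
    count = read_requests_needed(1024 * 1024, max_read_request)
    print(f"MaxReadReq {max_read_request}: {count} requests")
```

In this model, a smaller Max Read Request means more requests for the same amount of data, and therefore more requests that the device may have to track at once.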

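Again, use the lspci command to query the Max Read Request value (reported as MaxReadReq under DevCtl):

# lspci -s 04:00.0 -vvv | grep MaxReadReq
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes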

Calculating PCIe Limitations

The maximum possible PCIe bandwidth is calculated by multiplying the PCIe width and speed.

From that number we subtract ~1Gb/s for error correction protocols and the PCIe headers overhead. The overhead is determined by both the PCIe encoding (see PCIe speed for details) and the PCIe MTU:

Maximum PCIe Bandwidth = SPEED * WIDTH * (1 - ENCODING) - 1Gb/s,

where ENCODING is the fraction of the raw bit rate consumed by the line encoding (2/130 for gen3 and 1/5 for gen2).

For example, a gen 3 PCIe device with x8 width will be limited to:

Maximum PCIe Bandwidth = 8G * 8 * (1 - 2/130) - 1G = 64G * 0.985 - 1G = ~62Gb/s.

Another example - a gen 2 PCIe device with x16 width will be limited to:

Maximum PCIe Bandwidth = 5G * 16 * (1 - 1/5) - 1G = 80G * 0.8 - 1G = ~63Gb/s.
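The two examples above can be reproduced with a few lines of Python. This is a sketch of the formula as stated in this article; the function and parameter names are hypothetical, and the ~1Gb/s protocol overhead is the article's rough estimate rather than an exact figure.

```python
def max_pcie_bandwidth_gbps(speed_gts, width, encoding_overhead,
                            protocol_overhead_gbps=1.0):
    """Maximum PCIe Bandwidth = SPEED * WIDTH * (1 - ENCODING) - 1Gb/s."""
    return speed_gts * width * (1 - encoding_overhead) - protocol_overhead_gbps


# gen3 x8: 8GT/s per lane, 128b/130b encoding (2/130 overhead).
gen3_x8 = max_pcie_bandwidth_gbps(8, 8, 2 / 130)

# gen2 x16: 5GT/s per lane, 8b/10b encoding (1/5 overhead).
gen2_x16 = max_pcie_bandwidth_gbps(5, 16, 1 / 5)

print(f"gen3 x8:  ~{gen3_x8:.0f} Gb/s")   # ~62 Gb/s
print(f"gen2 x16: ~{gen2_x16:.0f} Gb/s")  # ~63 Gb/s
```

Note that a gen2 x16 link and a gen3 x8 link end up with nearly the same usable bandwidth, because the higher gen3 speed and its more efficient encoding roughly offset the halved lane count.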
