
Memory Models and Synchronization Primitives - 3: The x86 Memory Model

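x86-TSO (Total Store Ordering) is a memory model proposed in the paper x86-TSO: A Rigorous and Usable Programmer’s Model for x86 Multiprocessors. The authors distilled it from their reading of the Intel/AMD specifications, so it is a model that summarizes the documented behavior rather than one published by the vendors.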

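Intel and AMD did not originally design their architectures around this model, but it currently describes the memory model of the x86 architecture quite well, which makes it an important tool for understanding the x86 memory model.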

The x86-TSO model has a store buffer but no invalidate queue. The model can be summarized in the following four points:

  • The store buffers are FIFO and a reading thread must read its most recent buffered write, if there is one, to that address; otherwise reads are satisfied from shared memory.
  • An MFENCE instruction flushes the store buffer of that thread.
  • To execute a LOCK’d instruction, a thread must first obtain the global lock. At the end of the instruction, it flushes its store buffer and relinquishes the lock. While the lock is held by one thread, no other thread can read.
  • A buffered write from a thread can propagate to the shared memory at any time except when some other thread holds the lock.
x86-TSO: A Rigorous and Usable Programmer’s Model for x86 Multiprocessors
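To make the four rules more concrete, here is a minimal sketch of the abstract machine as a single-threaded simulation. All names (`TsoMachine`, `propagate_one`, `run_store_buffering`) are made up for illustration, and the global lock for LOCK'd instructions is left out to keep it short. It replays the classic store-buffering case: each thread writes one variable and then reads the other.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <utility>
#include <vector>

// Toy model of the x86-TSO abstract machine (hypothetical names throughout).
struct TsoMachine {
    using Addr = int;
    using Value = int;

    std::map<Addr, Value> memory;                             // shared memory
    std::vector<std::deque<std::pair<Addr, Value>>> buffers;  // one FIFO store buffer per thread

    explicit TsoMachine(std::size_t threads) : buffers(threads) {}

    // A store only enters the issuing thread's store buffer.
    void store(std::size_t tid, Addr a, Value v) {
        buffers[tid].push_back({a, v});
    }

    // A load returns the thread's most recent buffered write to that address,
    // if there is one; otherwise it is satisfied from shared memory (default 0).
    Value load(std::size_t tid, Addr a) const {
        const auto& buf = buffers[tid];
        for (auto it = buf.rbegin(); it != buf.rend(); ++it) {
            if (it->first == a) return it->second;
        }
        auto found = memory.find(a);
        return found == memory.end() ? 0 : found->second;
    }

    // Propagate the oldest buffered write of one thread to shared memory.
    bool propagate_one(std::size_t tid) {
        auto& buf = buffers[tid];
        if (buf.empty()) return false;
        memory[buf.front().first] = buf.front().second;
        buf.pop_front();
        return true;
    }

    // MFENCE: flush the issuing thread's store buffer completely.
    void mfence(std::size_t tid) {
        while (propagate_one(tid)) {}
    }
};

// Store-buffering scenario: thread 0 writes X then reads Y,
// thread 1 writes Y then reads X. Returns {r0, r1}.
std::pair<int, int> run_store_buffering(bool with_mfence) {
    constexpr int X = 0, Y = 1;
    TsoMachine m(2);
    m.store(0, X, 1);
    m.store(1, Y, 1);
    if (with_mfence) {
        m.mfence(0);
        m.mfence(1);
    }
    int r0 = m.load(0, Y);
    int r1 = m.load(1, X);
    return {r0, r1};
}
```

In this schedule, `run_store_buffering(false)` leaves both writes sitting in the store buffers, so both loads see 0; with the MFENCE calls both buffers are drained first and both loads see 1.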

memory order

On x86 the only reordering that can occur is StoreLoad, and only between operations on different memory addresses.

  • Reads are not reordered with other reads.
  • Writes are not reordered with older reads.
  • Writes to memory are not reordered with other writes.
  • Reads may be reordered with older writes to different locations but not with older writes to the same location.
  • Reads may be reordered with older writes to the same location in case of Intra-Processor Forwarding.
Intel Architecture Software Developer Manual, volume 3A, chapter 8, section 8.2 "Memory Ordering"

First, the x86 model has no invalidate queue, so LoadLoad reordering does not occur.

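Second, x86 does have a store buffer, and a store buffer can in general cause StoreStore and StoreLoad reordering. On x86, however, the writes issued by a single CPU are never reordered with one another, even when they target different memory addresses, so StoreStore reordering does not occur.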

Because of the store buffer, StoreLoad reordering does exist on x86, but only between operations on different memory addresses: a store and a later load of the same address are never reordered.
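The StoreLoad case can be probed on real hardware with the store-buffering litmus test. The sketch below is an assumption-laden illustration, not a rigorous test harness: `run_sb_litmus` and `LitmusCounts` are hypothetical names, the relaxed atomics also leave the compiler free to reorder, and how often the "both zero" outcome shows up (if at all) depends on the machine and the scheduler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct LitmusCounts {
    int both_zero = 0;  // r0 == 0 && r1 == 0: needs a load to pass an older store
    int other = 0;      // outcomes explainable by plain interleaving
};

// Runs the store-buffering litmus test `iterations` times with two real threads.
LitmusCounts run_sb_litmus(int iterations) {
    LitmusCounts counts;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        std::atomic<int> ready{0};
        int r0 = -1, r1 = -1;

        std::thread t0([&] {
            ready.fetch_add(1);
            while (ready.load() < 2) {}              // spin until both threads are running
            x.store(1, std::memory_order_relaxed);   // Store x
            r0 = y.load(std::memory_order_relaxed);  // Load y (different address)
        });
        std::thread t1([&] {
            ready.fetch_add(1);
            while (ready.load() < 2) {}
            y.store(1, std::memory_order_relaxed);   // Store y
            r1 = x.load(std::memory_order_relaxed);  // Load x (different address)
        });
        t0.join();
        t1.join();

        if (r0 == 0 && r1 == 0) {
            ++counts.both_zero;
        } else {
            ++counts.other;
        }
    }
    return counts;
}
```

A non-zero `both_zero` count means that, in at least one run, each thread's load completed before the other thread's store became visible, which is exactly the StoreLoad reordering described above.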

CPU barrier

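Since StoreLoad is the only reordering that exists on x86, the following instructions can all serve as CPU barriers on x86 (of the three fences, only MFENCE orders an earlier store against a later load):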

  • LOCK instruction
  • SFENCE/LFENCE/MFENCE

A LOCK'd instruction can be used as a CPU barrier on x86.

The LOCK prefix was originally intended for implementing atomic RMW (Read-Modify-Write) operations, but executing a LOCK'd instruction also flushes the store buffer.

Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory.

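Consequently neither loads nor stores can be reordered across a LOCK'd instruction. In other words, a LOCK'd instruction acts as a full barrier and can be used to eliminate StoreLoad reordering.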

Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.
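As an illustration of using a LOCK'd instruction purely as a barrier, the sketch below issues a locked add of zero to the top of the stack, which changes no data but still carries the LOCK semantics. It assumes GCC or Clang inline assembly and falls back to a standard C++ fence on other targets; `full_barrier_lock`, `flag0`, `flag1` and `try_enter_thread0` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Full barrier built from a LOCK-prefixed instruction on x86 (GCC/Clang syntax),
// with a portable seq_cst fence as the fallback on other targets.
inline void full_barrier_lock() {
#if defined(__GNUC__) && defined(__x86_64__)
    // Adds 0 to the word at the top of the stack: the value is unchanged,
    // but the LOCK prefix makes the store buffer drain first.
    __asm__ __volatile__("lock; addl $0, (%%rsp)" ::: "memory", "cc");
#elif defined(__GNUC__) && defined(__i386__)
    __asm__ __volatile__("lock; addl $0, (%%esp)" ::: "memory", "cc");
#else
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}

std::atomic<int> flag0{0};
std::atomic<int> flag1{0};

// One side of a Dekker-style handshake: announce intent, then check the peer.
// Without the barrier, the load of flag1 could pass the store to flag0.
bool try_enter_thread0() {
    flag0.store(1, std::memory_order_relaxed);           // Store
    full_barrier_lock();                                 // no StoreLoad reordering across this point
    return flag1.load(std::memory_order_relaxed) == 0;   // Load
}
```

In everyday C++ the same effect usually comes for free: on x86, compilers typically implement `std::atomic` read-modify-write operations such as `exchange` or `fetch_add` with locked instructions.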

FENCE

x86 also provides dedicated FENCE instructions that can be used as CPU barriers.

MFENCE acts as a full barrier: neither loads nor stores can be reordered across it.

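Executing MFENCE flushes the store buffer. The CPU waits until every invalidate message belonging to the stores in the store buffer has received its invalidate acknowledge message, and only then executes the memory accesses that follow the MFENCE, which eliminates StoreLoad reordering.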

all memory operations stay above the line
---------------------
       MFENCE
---------------------
all memory operations stay below the line
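In C++ the MFENCE instruction is reachable through the `_mm_mfence()` intrinsic from `<emmintrin.h>`. The following sketch places it between a store and a later load to a different address; `full_barrier_mfence` and `publish_then_observe` are hypothetical helper names, and the non-x86 fallback to a standard fence is an assumption added so the snippet builds elsewhere.

```cpp
#include <atomic>
#include <cassert>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_mfence
#endif

// Full barrier using MFENCE where available, a seq_cst fence otherwise.
inline void full_barrier_mfence() {
#if defined(__SSE2__) || defined(_M_X64)
    std::atomic_signal_fence(std::memory_order_seq_cst);  // keep the compiler from moving accesses
    _mm_mfence();                                         // hardware barrier: drains the store buffer
    std::atomic_signal_fence(std::memory_order_seq_cst);
#else
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}

// Store to `mine`, then load `theirs`; the MFENCE keeps the load below the store.
int publish_then_observe(std::atomic<int>& mine, const std::atomic<int>& theirs) {
    mine.store(1, std::memory_order_relaxed);        // stays above the line
    full_barrier_mfence();
    return theirs.load(std::memory_order_relaxed);   // stays below the line
}
```

If two threads call `publish_then_observe` with the arguments swapped, the fence rules out the outcome where both calls return 0.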

LFENCE and SFENCE are finer-grained barriers.

LFENCE acts as a read barrier: loads on either side of it cannot be reordered across it. In addition, stores that follow an LFENCE cannot be reordered ahead of it.

all LOADs stay above the line
---------------------
       LFENCE
---------------------
all LOADs/STOREs stay below the line           

SFENCE acts as a write barrier: stores on either side of it cannot be reordered across it.

all STOREs stay above the line
---------------------
       SFENCE
---------------------
all STOREs stay below the line           

SFENCE — Serializes all store (write) operations that occurred prior to the SFENCE instruction in the program instruction stream, but does not affect load operations.

LFENCE — Serializes all load (read) operations that occurred prior to the LFENCE instruction in the program instruction stream, but does not affect store operations.

MFENCE — Serializes all store and load operations that occurred prior to the MFENCE instruction in the program instruction stream.

Reads cannot pass earlier LFENCE and MFENCE instructions.

Writes and executions of CLFLUSH and CLFLUSHOPT cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.

LFENCE instructions cannot pass earlier reads.

SFENCE instructions cannot pass earlier writes or executions of CLFLUSH and CLFLUSHOPT.

MFENCE instructions cannot pass earlier reads, writes, or executions of CLFLUSH and CLFLUSHOPT.
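The write-barrier and read-barrier roles of SFENCE and LFENCE can be sketched with a simple message-passing pattern using the `_mm_sfence()` and `_mm_lfence()` intrinsics. This is only an illustration of where the fences sit: for ordinary stores and loads x86 already preserves StoreStore and LoadLoad order, portable C++ would use release/acquire operations instead, and `Mailbox`, `publish`, `try_consume`, `write_barrier` and `read_barrier` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_lfence; also provides _mm_sfence via <xmmintrin.h>
#endif

// Write barrier: SFENCE on x86, a release fence elsewhere.
inline void write_barrier() {
#if defined(__SSE2__) || defined(_M_X64)
    _mm_sfence();
#else
    std::atomic_thread_fence(std::memory_order_release);
#endif
}

// Read barrier: LFENCE on x86, an acquire fence elsewhere.
inline void read_barrier() {
#if defined(__SSE2__) || defined(_M_X64)
    _mm_lfence();
#else
    std::atomic_thread_fence(std::memory_order_acquire);
#endif
}

struct Mailbox {
    std::atomic<int> payload{0};
    std::atomic<int> ready{0};
};

// Producer: the payload store stays above the SFENCE, the flag store below it.
void publish(Mailbox& box, int value) {
    box.payload.store(value, std::memory_order_relaxed);
    write_barrier();
    box.ready.store(1, std::memory_order_relaxed);
}

// Consumer: the flag load stays above the LFENCE, the payload load below it.
bool try_consume(Mailbox& box, int& out) {
    if (box.ready.load(std::memory_order_relaxed) == 0) return false;
    read_barrier();
    out = box.payload.load(std::memory_order_relaxed);
    return true;
}
```

Neither fence helps with the StoreLoad case: a load after `write_barrier()` may still complete before the earlier store leaves the store buffer, which is why that case needs MFENCE or a LOCK'd instruction.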
