SQL Server latches: causes
SQL Server locks, discussed in the article All about locking in SQL Server, are applied to data for the duration of a logical operation to preserve logical transaction consistency. SQL Server latches, however, are a special type of low-level system lock that is held only as long as the physical operation on the memory page lasts, in order to protect memory consistency.
SQL Server latches are an internal mechanism that protects shared memory resources, such as pages and in-memory data structures inside the buffer pool, by coordinating access to those resources and guarding them against corruption. Because latches are not exposed outside the SQL Server Operating System (SQLOS), they can be managed only by SQL Server itself and not by users (unlike locks, which can be influenced through hints such as NOLOCK). Every time SQL Server has to access memory, it places a latch on the page or internal memory structure that cannot otherwise be accessed safely by multiple threads at once. In this way, latches coordinate the execution of multiple physical threads in a SQL Server database.
In the same manner as locks, SQL Server latches can come in various modes:
- Destroy Latch (DT): the most restrictive latch mode, acquired when a latch is destroyed and a buffer is about to be removed from the cache. DT latches block even the KP latch
- Exclusive Latch (EX): acquires exclusive control of a page that is being written. While an EX latch is held on a page, other threads cannot read from or write to that page
- Update Latch (UP): restrictive in a way similar to an exclusive latch, except that it allows read operations to access the page while explicitly blocking any write operation
- Keep Latch (KP): a lightweight latch which ensures that the referenced buffer stays in the buffer pool while another latch is being placed on it
- Shared Latch (SH): acquired on a page when a read request issued against that page is granted
Similarly to locks, the various latch modes can be compatible or incompatible with one another. The table below gives insight into the compatibility between the various SQL Server latch modes
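The compatibility table itself did not survive in this copy of the article. As a stand-in, the sketch below encodes the latch mode compatibility matrix as it is commonly given in Microsoft's latch contention guidance. It is an illustration drawn from that guidance rather than from this article, and the aliases `compat` and `held_mode` are names made up for the example.

```sql
-- Latch mode compatibility: Y = the two modes can coexist, N = they cannot.
-- Assumption: values follow Microsoft's published latch compatibility matrix.
SELECT held_mode, [KP], [SH], [UP], [EX], [DT]
FROM (VALUES
    ('KP', 'Y', 'Y', 'Y', 'Y', 'N'),
    ('SH', 'Y', 'Y', 'Y', 'N', 'N'),
    ('UP', 'Y', 'Y', 'N', 'N', 'N'),
    ('EX', 'Y', 'N', 'N', 'N', 'N'),
    ('DT', 'N', 'N', 'N', 'N', 'N')
) AS compat (held_mode, [KP], [SH], [UP], [EX], [DT]);
```

Read each row as "a latch of this mode is already held" and each column as "a latch of this mode is requested".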
There are many different types of SQL Server latches, but essentially they can be split into three general categories: I/O latches, buffer latches, and non-buffer latches.
I/O latches
I/O latches are acquired when an outstanding I/O operation is executed against pages in the buffer pool, or more precisely, when data has to be read from or written to physical storage. SQL Server uses PAGEIOLATCH_XX wait types to report that a process is waiting for an I/O latch to be released.
So, when a page is requested to be brought from storage into the buffer pool, a PAGEIOLATCH is acquired on that page, and if the storage is not ready to be read, the PAGEIOLATCH wait type count will increase.
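As a minimal sketch of how these waits can be inspected, the query below reads the cumulative PAGEIOLATCH_XX figures from the `sys.dm_os_wait_stats` DMV. The `avg_wait_ms` column is a derived value added for this example, and the numbers are cumulative since the last instance restart or the last time the wait statistics were cleared.

```sql
-- Cumulative I/O latch waits, slowest first.
-- [_] makes LIKE treat the underscore literally instead of as a wildcard.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms * 1.0 / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGEIOLATCH[_]%'
ORDER BY wait_time_ms DESC;
```

Running it requires the VIEW SERVER STATE permission.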
Buffer latches
To understand buffer latches, it is important to first understand the idea behind the memory buffer pool, which is designed to maximize SQL Server performance. The buffer pool is a physical memory range where data read from disk is stored in data pages. Data in SQL Server tables is stored in pages, and each page has a fixed size of 8192 bytes (8 KB). Whenever a data page has to be read or written, it is first brought into the buffer pool. Any further access to that page is then served directly from the buffer pool, which improves SQL Server performance by minimizing disk I/O.
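To make the buffer pool more concrete, here is a small sketch that counts the 8 KB pages currently cached per database using `sys.dm_os_buffer_descriptors`. The column aliases are illustrative, and on servers with a lot of memory this DMV can be slow to scan, so treat the query as an occasional diagnostic rather than something to run in a tight loop.

```sql
-- Pages currently held in the buffer pool, grouped by database.
-- Each row in the DMV describes one cached 8 KB page.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS cached_pages,
       COUNT(*) * 8 / 1024  AS cached_mb   -- integer division, rounded down
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;
```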
This implementation of the memory pool concept is why SQL Server physical memory usage can be high even when there is no SQL Server activity. The loading of data in the buffer pool is based on the First-In First-Out (FIFO) principle.
SQL Server uses the buffer manager to manage the buffer pool, so the buffer manager is in charge of the hash tables, the buffer array that contains the pages, and the pages stored in the buffer. SQLOS accesses data stored in memory exclusively through the buffer manager.
Pages that are modified in the buffer pool by an executed insert, delete or update command are the so-called “dirty” pages, while unmodified pages are called “clean” pages. When a page has to be accessed in memory, SQLOS acquires a buffer latch on that page. Unlike a lock, the latch is not held for the duration of the transaction but only during the critical period of the operation, and it is released as soon as it is no longer needed. SQL Server uses PAGELATCH_XX wait types to report that a process is waiting for a buffer latch to be released.
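A hedged example of observing buffer latch waits as they happen: the query below lists tasks that are waiting on a PAGELATCH_XX wait type right now, using `sys.dm_os_waiting_tasks`. For page latches the `resource_description` column has the form database_id:file_id:page_id, which identifies the contended page.

```sql
-- Tasks currently waiting on a buffer (page) latch, longest wait first.
SELECT session_id,
       wait_type,
       wait_duration_ms,
       resource_description   -- database_id:file_id:page_id
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH[_]%'
ORDER BY wait_duration_ms DESC;
```

Because buffer latches are held only briefly, an empty result is normal on a healthy system; the query is most useful when sampled repeatedly during a slowdown.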
Non-buffer latches
Non-buffer latches protect any in-memory structure other than the pages stored in the buffer pool. SQL Server uses LATCH_XX wait types to report that a process is waiting for a non-buffer latch to be released. Non-buffer latches are not often encountered, and thus they are the least documented, but here are some use cases that can lead to contention on non-buffer latches:
- Excessive parallelism – When a high degree of parallelism is used on servers with 12+ logical processors, most, if not all, queries can qualify for parallel execution plans. In such a situation, non-buffer latches (LATCH_XX) are acquired to synchronize the internal memory structures used by parallel execution plans
- Too many auto-grow/auto-shrink operations – In systems with poor planning of database sizing or storage capacity (bad default database settings), auto-grow operations can be executed frequently. In addition, when auto-shrink is turned on, frequent database shrinking will occur. When growth and shrink operations are executed, SQL Server acquires latches of the FCB, FGCB_ADD_REMOVE and FGCB_ALLOC classes to synchronize access to the file control block and to the information stored in the filegroup
- Very high frequency of DML operations on heap and BLOB data structures – When excessive DML operations are performed on heap and BLOB data, all internal memory structures responsible for allocating and deallocating heap pages must be kept synchronized. In such situations, excessive LATCH_EX waits can be encountered. When this occurs, the ALLOC_CREATE_FREESPACE_CACHE, ALLOC_FREESPACE_CACHE and ALLOC_EXTENT_CACHE latch classes may show up as the prevailing classes in the sys.dm_os_latch_stats DMV
So, when LATCH_XX wait types have excessive values or are among the prevalent wait types, it is a good idea to check which non-buffer latch classes dominate in SQL Server by using the following query
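-- Non-buffer latch classes ranked by their share of total latch wait time
SELECT latch_class,
       wait_time_ms,
       waiting_requests_count,
       100.0 * wait_time_ms / SUM(wait_time_ms) OVER () AS [% of latches]
FROM sys.dm_os_latch_stats
WHERE latch_class NOT IN ('BUFFER')
  AND wait_time_ms > 0
ORDER BY wait_time_ms DESC;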
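For the first two scenarios in the list above (excessive parallelism and frequent auto-grow/auto-shrink), the related settings can be reviewed with a few catalog queries. This is only a sketch of where to look; which values are appropriate depends on the workload and is not something the queries can decide.

```sql
-- Instance-wide parallelism settings currently in effect.
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN ('max degree of parallelism', 'cost threshold for parallelism');

-- Databases that have auto-shrink turned on.
SELECT name AS database_name
FROM sys.databases
WHERE is_auto_shrink_on = 1;

-- Auto-grow increments per file.
-- growth is a number of 8 KB pages unless is_percent_growth = 1.
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_file_name,
       growth,
       is_percent_growth
FROM sys.master_files
ORDER BY database_name, logical_file_name;
```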
SuperLatches
Starting with SQL Server 2005, superlatches (also called sublatches) were introduced to improve SQL Server efficiency in highly concurrent OLTP workloads with a specific usage pattern, namely very high shared, read-only (SH) access to a page while write access is very low or nonexistent. SQL Server uses superlatches only on NUMA systems with 32+ logical processors. This mechanism lets SQL Server deal with latch contention efficiently by dynamically promoting an array of latches to a superlatch, which allows SH mode requests on the superlatch while the sublatches it contains can remain in different modes. When this occurs, the superlatch becomes just a pointer to an array of SQL Server latches.
A superlatch behaves as a single latch with sublatch structures, and there can be one sublatch per partition per logical CPU core. Once a superlatch is created, a CPU worker thread only has to acquire the shared (SH) sublatch assigned to its scheduler. As a result, a shared (SH) superlatch uses fewer resources and gives more efficient access to pages than a non-partitioned shared latch. The reason is that the superlatch does not require any synchronization of global state, as it accesses only the local NUMA memory.
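Superlatch activity is not configurable, but it can be observed. The sketch below reads the superlatch-related counters from the Latches performance object through `sys.dm_os_performance_counters`; it filters by pattern rather than by exact counter name, so it should return whatever superlatch counters the instance exposes.

```sql
-- Superlatch counters from the SQL Server Latches performance object.
-- object_name and counter_name are padded with spaces, hence RTRIM.
SELECT RTRIM(object_name)  AS object_name,
       RTRIM(counter_name) AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Latches%'
  AND counter_name LIKE '%SuperLatch%';
```

Per-second counters are exposed in this DMV as cumulative values, so a rate has to be computed from two samples taken some seconds apart.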
Latch Contention
Latch contention is a frequent scenario on systems with a large number of CPUs. It occurs when multiple threads concurrently try to acquire SQL Server latches that are incompatible with each other on the same in-memory structure. Since latches are controlled by an internal SQL Server mechanism, SQLOS determines on its own when to use them. Because latch behavior is deterministic, factors such as application design or database schema structure can significantly affect SQL Server latches.
On high-throughput systems that are designed for a large number of CPUs and therefore for high concurrency, some active latch contention is expected as a regular occurrence, because memory structures are frequently accessed and protected by latches. It becomes a problem when latch contention and latch wait times grow large enough to decrease CPU utilization, which results in reduced throughput.
Recognizing the signs of latch contention is important, so let’s shed light on some of its symptoms
The expected behavior of SQL Server latches, in relation to transactions per second, is that transactions per second increase along with average latch waits, while the latch waits themselves grow at a slow rate that stays within the margins of the throughput. Such a situation is represented in the image below, and it is the desired system behavior, indicating that logical processors are not conflicting with each other. In such a scenario, adding more logical processors means that more work can be done.
When the transactions/sec value drops as additional logical processors are enabled while, at the same time, average SQL Server latch wait times increase at a greater rate than system throughput, there is a high probability that a latch contention problem exists. The following image represents a typical situation where adding new logical processors helped until a certain point, after which longer latch wait times started to occur. From that point on, adding new logical processors brings no benefit, and eventually transactions/sec starts to decline. This is a typical situation where adding new logical processors actually has a negative rather than a positive effect, because the resulting system spends a lot of time in a waiting state.
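One rough way to track the latch wait side of this picture is to sample average wait time per latch family from `sys.dm_os_wait_stats` at regular intervals and compare the trend with the throughput measured for the workload. The query below is a sketch under that assumption; the `latch_family` labels are made up for the example, and the figures are cumulative, so it is the difference between two samples that matters.

```sql
-- Average wait per latch family (cumulative since the stats were last reset).
SELECT latch_family,
       SUM(waiting_tasks_count) AS waits,
       SUM(wait_time_ms)        AS total_wait_ms,
       SUM(wait_time_ms) * 1.0 / NULLIF(SUM(waiting_tasks_count), 0) AS avg_wait_ms
FROM (
    SELECT CASE
               WHEN wait_type LIKE 'PAGEIOLATCH[_]%' THEN 'I/O latch'
               WHEN wait_type LIKE 'PAGELATCH[_]%'   THEN 'buffer latch'
               ELSE 'non-buffer latch'
           END AS latch_family,
           waiting_tasks_count,
           wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type LIKE 'PAGEIOLATCH[_]%'
       OR wait_type LIKE 'PAGELATCH[_]%'
       OR wait_type LIKE 'LATCH[_]%'
) AS latch_waits
GROUP BY latch_family
ORDER BY total_wait_ms DESC;
```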
Latch contention that can affect OLTP performance is mainly caused by high concurrency resulting from some of the following factors:
- Application design based on high concurrency – when a client application issues a high number of concurrent requests against the database
- SQL Server logical files layout – allocation structures such as the Global Allocation Map (GAM), Shared Global Allocation Map (SGAM), Page Free Space (PFS) and Index Allocation Map (IAM) can contribute to page latch contention when many concurrent threads are in conflict
- Database schema design – read, write and delete data access patterns, index B+tree depth, design of clustered and non-clustered indexes, row size and density per page
- The performance of I/O subsystems – a quite frequent cause, since with low I/O subsystem performance SQL Server must wait for data to be moved into the buffer pool. Excessive PAGEIOLATCH_XX waits are indicative of a slow I/O subsystem
- Large number of logical CPUs assigned to SQL Server – latch contention that degrades SQL Server performance to an unacceptable level is typically seen on systems with more than 16 logical CPUs, and the more logical CPUs are available, the higher the level of contention might be
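Two of the factors above, I/O subsystem performance and the number of logical CPUs, can be checked directly. The sketch below reports the CPU and scheduler counts visible to SQL Server and the average I/O stall per read and write for every database file; the `avg_read_stall_ms` and `avg_write_stall_ms` columns are derived values introduced for this example, and what counts as "slow" is left to the reader's own baseline.

```sql
-- Logical CPUs and schedulers visible to this instance.
SELECT cpu_count, scheduler_count
FROM sys.dm_os_sys_info;

-- Average I/O stall per read and per write, by database file
-- (cumulative since the instance started).
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  * 1.0 / NULLIF(vfs.num_of_reads, 0)  AS avg_read_stall_ms,
       vfs.io_stall_write_ms * 1.0 / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_read_stall_ms DESC;
```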
Translated from: https://www.sqlshack.com/all-about-latches-in-sql-server/