Queued Spin Locks

The classic spin lock has problems when multiple processors wait on the same spin lock. The biggest immediate problem is that although one contender eventually becomes the new owner, which one it will be is a free-for-all. Simply from bad luck in the scramble, a processor can be left spinning unreasonably long. The solution is known as a queued spin lock because ownership is acquired in the same order that the contenders asked for it.

Queued spin locks first appeared in Windows 2000, but only for the kernel’s own use, to improve performance for a few particular spin locks that are used widely throughout the system, especially by the I/O Manager: for instance, the one that’s exposed through the IoAcquireCancelSpinLock function and the one that supports such functions as IoGetLowerDeviceObject. Windows XP saw this solution generalised and documented.

The implementation is necessarily more complex than for classic spin locks. Acquiring a queued spin lock is more than just passing its address to a kernel function which then does something just to the lock. Since the processors that ever wait on a queued spin lock must be kept in order, some context must be managed for each of them. The particular design of that context looks to have been motivated at least in part by another problem with the classic spin lock. While multiple processors spin on the same lock, they each keep reading this same lock. The one address is out on the bus being fought over—over and over—even just to find out that everyone has to keep waiting. Better, then, is that whatever context supports the queuing also has each processor provide its own signal for when its wait is over. That the processors’ spin loops are largely independent of one another may even be a more important real-world benefit than the determinism of queueing.

Context

The context that each contender for a queued spin lock must supply is a KSPIN_LOCK_QUEUE. The address of one, typically but not necessarily embedded in a KLOCK_QUEUE_HANDLE, is passed to the kernel when asking to acquire a queued spin lock. It must then be allowed no other use until the contender has acquired the lock and released it.

Offset (x86) | Offset (x64) | Definition                          | Versions
0x00         | 0x00         | KSPIN_LOCK_QUEUE * volatile Next;   | 5.0 and higher
0x04         | 0x08         | KSPIN_LOCK * volatile Lock;         | 5.0 and higher

The Lock member initially tells which spin lock is wanted. Because the lock must be naturally aligned, the low two bits (x86) or three bits (x64) of the Lock member are available for context. The whole of the Next member is too.
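In C, the structure and the KLOCK_QUEUE_HANDLE that typically embeds it (see above) are declared in WDM.H essentially as follows. The OldIrql member is where the compound functions that also adjust the IRQL save what they will later restore:

    typedef struct _KSPIN_LOCK_QUEUE {
        struct _KSPIN_LOCK_QUEUE * volatile Next;
        PKSPIN_LOCK volatile Lock;
    } KSPIN_LOCK_QUEUE, *PKSPIN_LOCK_QUEUE;

    typedef struct _KLOCK_QUEUE_HANDLE {
        KSPIN_LOCK_QUEUE LockQueue;
        KIRQL OldIrql;
    } KLOCK_QUEUE_HANDLE, *PKLOCK_QUEUE_HANDLE;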

The point to the Next member is, of course, to support a queue of the one owner and the possibly many waiters. At the head of the queue is the KSPIN_LOCK_QUEUE structure for the processor that owns the lock. The tail of the queue is either the head, trivially, or the KSPIN_LOCK_QUEUE for the processor that most recently started waiting on the lock. The owner releases from the head. Waiters are added at the tail. Because an owner will present its own KSPIN_LOCK_QUEUE when releasing the lock, the head does not need to be tracked as context. Instead, the lock is kept pointing to the tail.

Note that this is a significant re-interpretation of the contents of a KSPIN_LOCK when used as a queued spin lock. While a queued spin lock is owned, it holds the address of the KSPIN_LOCK_QUEUE at the tail of the queue. While a queued spin lock is available, it holds NULL.
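For a concrete, if hypothetical, picture, suppose processors A, B and C present the queue entries Qa, Qb and Qc, in that order, for the same lock:

    /* free:                       Lock == NULL
       A owns, nobody waits:       Lock == &Qa                (Qa is head and tail)
       A owns, B waits:            Lock == &Qb,  Qa.Next == &Qb
       A owns, B then C wait:      Lock == &Qc,  Qa.Next == &Qb,  Qb.Next == &Qc */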

Acquisition

Given that the IRQL is already at or above DISPATCH_LEVEL, acquiring a queued spin lock starts with an xchg instruction (and its implied lock) to point the lock to the contender’s KSPIN_LOCK_QUEUE as the new tail while discovering the previous tail. If the lock previously held NULL, it was unowned and the contender now owns the lock.
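A minimal sketch of this first step, written as ordinary C with the interlocked intrinsic standing in for the xchg. The names SketchAcquire and SketchWait are invented for illustration only; the real routines are listed under Exported Functions below.

    #include <wdm.h>

    VOID SketchWait (KSPIN_LOCK_QUEUE *Prev, KSPIN_LOCK_QUEUE *Queue);  /* shown below */

    VOID SketchAcquire (KSPIN_LOCK *Lock, KSPIN_LOCK_QUEUE *Queue)
    {
        KSPIN_LOCK_QUEUE *prev;

        Queue->Next = NULL;
        Queue->Lock = Lock;                             /* which lock is wanted */

        /* one interlocked exchange: the lock now points to this entry as the
           new tail, and prev is whatever the lock pointed to before */
        prev = (KSPIN_LOCK_QUEUE *) InterlockedExchangePointer ((PVOID *) Lock, Queue);

        if (prev != NULL) SketchWait (prev, Queue);     /* owned: must spin (see below) */
    }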

Spinning

Given that the lock was owned, coordination is required such that the owner cannot release the lock (see below) until the contender’s addition to the queue is completed. Though the contender’s KSPIN_LOCK_QUEUE is the new tail as seen from the lock for the purpose of adding waiters, it is not yet as seen from the head for the purpose of eventually becoming the owner. It must yet be set into the previous tail’s Next member. The moment that it is, a release (or quick succession of them) can select the contender as the new owner. It therefore can’t join the queue until it has some means of being signalled that it has reached the head of the queue. This signal is the LOCK_QUEUE_WAIT bit (masked by 0x01) in the Lock member of the contender’s KSPIN_LOCK_QUEUE. It’s set before joining the queue. It gets cleared eventually as the contender’s signal that it is the new owner. Meanwhile, the contender tests this bit over and over for as long as the bit remains set. Queued spin locks have the same elaborations of this spin loop as do classic spin locks.
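Continuing the sketch from above, joining the queue and spinning might look like the following. LOCK_QUEUE_WAIT is defined here only for the illustration, with the 0x01 mask described in the text:

    #define LOCK_QUEUE_WAIT 0x01

    VOID SketchWait (KSPIN_LOCK_QUEUE *Prev, KSPIN_LOCK_QUEUE *Queue)
    {
        /* raise the wait signal before becoming visible to releases */
        Queue->Lock = (KSPIN_LOCK *) ((ULONG_PTR) Queue->Lock | LOCK_QUEUE_WAIT);

        /* complete the join: from this moment a release can pick this entry */
        Prev->Next = Queue;

        /* spin until an owner clears the wait bit; the real loop has the same
           elaborations, e.g., pausing, as for classic spin locks */
        while ((ULONG_PTR) Queue->Lock & LOCK_QUEUE_WAIT)
            YieldProcessor ();
    }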

Release

The essence of releasing a queued spin lock is to clear the LOCK_QUEUE_WAIT bit in whatever KSPIN_LOCK_QUEUE structure is pointed to by the owner’s Next member. The processor that supplied this structure can then exit its spin loop as the lock’s new owner.

A complication exists when the owner’s Next member is already NULL. This may mean there is no waiter and that the lock is returning to being unowned. But it may instead mean that a processor has started trying to acquire the lock but has not yet completed its addition to the queue. The owner resolves this with a lock cmpxchg instruction. If there is no waiter, the lock still holds the address of the owner’s KSPIN_LOCK_QUEUE and the release is completed by atomically clearing the lock to NULL. If instead there is an incompletely established waiter, the lock holds the address of that waiter’s KSPIN_LOCK_QUEUE, not the owner’s, and is left alone. It is then the owner that must wait for the prospective waiter to set the address of its KSPIN_LOCK_QUEUE into the owner’s Next member before the owner can clear the waiter’s LOCK_QUEUE_WAIT bit and cede ownership of the lock. This spin loop by the owner has no analogue for classic spin locks.
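A sketch of the release, in the same illustrative style as the acquisition above (SketchRelease is an invented name, and the interlocked intrinsic stands in for the lock cmpxchg):

    VOID SketchRelease (KSPIN_LOCK *Lock, KSPIN_LOCK_QUEUE *Owner)
    {
        KSPIN_LOCK_QUEUE *next = Owner->Next;

        if (next == NULL) {

            /* perhaps there is no waiter: if the lock still points to the
               owner's entry as the tail, clear it to NULL and be done */
            if (InterlockedCompareExchangePointer ((PVOID *) Lock, NULL, Owner) == Owner)
                return;

            /* a contender has made itself the tail but has not yet linked
               itself into Next: wait for it (no analogue for classic locks) */
            while ((next = Owner->Next) == NULL)
                YieldProcessor ();
        }

        /* clearing the waiter's LOCK_QUEUE_WAIT bit ends its spin and makes
           it the new owner */
        next->Lock = (KSPIN_LOCK *) ((ULONG_PTR) next->Lock & ~(ULONG_PTR) LOCK_QUEUE_WAIT);
    }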

Exported Functions

As for the classic spin lock, the exported functions for queued spin locks were originally split between the kernel and the HAL. The kernel exports only the few functions that work with the lock but have nothing to do with the IRQL. The many compound functions that both adjust the IRQL and manage the lock are all exported from the HAL. This division got reorganised for 64-bit Windows so that all x64 builds have queued spin locks entirely as kernel functionality (coded in C rather than assembly language). For 32-bit Windows, this reorganisation waited until version 6.2 adopted the coding in C. The HAL’s functions then moved to the kernel: they continue to be exported from the HAL, but only as forwards to the kernel.

Function                                   | HAL Versions (x86 Only) | Kernel Versions
KeAcquireInStackQueuedSpinLock             | 5.1 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeAcquireInStackQueuedSpinLockAtDpcLevel   |                         | 5.1 and higher
KeAcquireInStackQueuedSpinLockRaiseToSynch | 5.1 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeAcquireQueuedSpinLock                    | 5.0 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeAcquireQueuedSpinLockRaiseToSynch        | 5.0 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeReleaseInStackQueuedSpinLock             | 5.1 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeReleaseInStackQueuedSpinLockFromDpcLevel |                         | 5.1 and higher
KeReleaseQueuedSpinLock                    | 5.0 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeTryToAcquireQueuedSpinLock               | 5.0 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)
KeTryToAcquireQueuedSpinLockRaiseToSynch   | 5.0 and higher          | 5.2 from Windows Server 2003 SP1, and higher (x64 only); 6.2 and higher (x86)