WMI_SPINLOCK

The WMI_SPINLOCK structure is the event-specific data for one type of event that can be written to an NT Kernel Logger session. The particular type of event whose data is a WMI_SPINLOCK has the Hook ID PERFINFO_LOG_TYPE_SPINLOCK (0x0529).

Availability

For an NT Kernel Logger session to be sent any PERFINFO_LOG_TYPE_SPINLOCK events, the group mask PERF_SPINLOCK (0x20010000) must be enabled. The kernel can write this event in version 6.1 and higher for x64 builds but only in version 6.2 and higher for x86 builds.
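As a hedged illustration of what enabling the group mask means numerically, the following sketch decodes PERF_SPINLOCK under the assumption that the top three bits of a group mask select one of eight 32-bit masks and the remaining 29 bits are flags within that mask. The type GROUPMASK_SKETCH and the helper names are hypothetical stand-ins for this example, not Microsoft's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PERF_SPINLOCK 0x20010000u   /* value given in the text above */

/* Assumption: top 3 bits are an index into an array of 8 masks. */
static unsigned perf_mask_index(uint32_t group_mask)
{
    return (group_mask & 0xE0000000u) >> 29;
}

/* Assumption: the low 29 bits are the flag bits within the selected mask. */
static uint32_t perf_mask_bits(uint32_t group_mask)
{
    return group_mask & 0x1FFFFFFFu;
}

/* Hypothetical stand-in for a session's set of enabled group masks. */
typedef struct {
    uint32_t Masks[8];
} GROUPMASK_SKETCH;

static void enable_group(GROUPMASK_SKETCH *gm, uint32_t group_mask)
{
    gm->Masks[perf_mask_index(group_mask)] |= perf_mask_bits(group_mask);
}

static int is_group_enabled(const GROUPMASK_SKETCH *gm, uint32_t group_mask)
{
    return (gm->Masks[perf_mask_index(group_mask)] & perf_mask_bits(group_mask)) != 0;
}
```

Under these assumptions, PERF_SPINLOCK is the 0x00010000 bit in the mask at index 1.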

Usage

The PERFINFO_LOG_TYPE_SPINLOCK event traces the release of a spin lock and includes information about the lock’s acquisition. For this purpose, spin lock means not just the original spin lock, but also the queued spin lock and the more recent executive spin lock.

Spin locks see such heavy use in the ordinary execution of Windows that tracing the acquisition and release of every spin lock would be wildly impractical, perhaps even in a debug build, but certainly in real-world use. What is supported instead is a statistical sampling. The default is to write a PERFINFO_LOG_TYPE_SPINLOCK event for each lock whose acquisition was contended and, in version 6.3 and higher, for each lock that was held for too long (at least a million CPU cycles), but otherwise only for roughly every thousandth lock whose acquisition was uncontended. Even these defaults can generate very many events. Parameters that govern this sampling can be queried and set in an EVENT_TRACE_SPINLOCK_INFORMATION structure, which is passed to and from the kernel through the SystemPerformanceTraceInformation case of the NtQuerySystemInformation and NtSetSystemInformation functions.
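The default sampling just described can be modelled roughly as follows. This is only a sketch of the stated policy, not the kernel's code: the type SAMPLER_SKETCH, the function should_log_release and the simple counter are all hypothetical, and whether the kernel counts long-held locks towards the uncontended sample is not known here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HOLD_THRESHOLD_CYCLES   1000000ull  /* "at least a million CPU cycles" */
#define UNCONTENDED_SAMPLE_RATE 1000u       /* "roughly every thousandth" */

/* Hypothetical per-processor state for the sketch. */
typedef struct {
    uint32_t uncontended_seen;
} SAMPLER_SKETCH;

/* Decide whether a release would produce an event under the default policy.
   has_hold_threshold models version 6.3 and higher. */
static bool should_log_release(SAMPLER_SKETCH *s, bool contended,
                               uint64_t held_cycles, bool has_hold_threshold)
{
    if (contended) {
        return true;                        /* every contended acquisition */
    }
    if (has_hold_threshold && held_cycles >= HOLD_THRESHOLD_CYCLES) {
        return true;                        /* held for too long */
    }
    if (++s->uncontended_seen >= UNCONTENDED_SAMPLE_RATE) {
        s->uncontended_seen = 0;            /* sample one in a thousand */
        return true;
    }
    return false;
}
```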

Documentation Status

The WMI_SPINLOCK structure is not documented but a C-language definition is published in the NTWMI.H from the Enterprise edition of the Windows Driver Kit (WDK) for Windows 10 version 1511.

Were it not for this relatively recent and possibly unintended disclosure, much would anyway be known from type information in symbol files. Curiously though, type information for this structure has never appeared in any public symbol files for the kernel or for the obvious low-level user-mode DLLs. In the whole of Microsoft’s packages of public symbol files, at least to the original Windows 10, relevant type information is unknown before Windows 8 and appears in symbol files only for AppXDeploymentClient.dll, CertEnroll.dll (before Windows 10) and Windows.Storage.ApplicationData.dll.

Layout

Data for the PERFINFO_LOG_TYPE_SPINLOCK event (as it exists in the trace buffers and thence the ETL file) comprises a trace header and the event data:

Trace Header

In the PERFINFO_TRACE_HEADER, the Size is the total in bytes of the trace header and all the event data. The HookId is PERFINFO_LOG_TYPE_SPINLOCK, which identifies the event. The Marker in this trace header is, at its most basic, 0xC0100002 (32-bit) or 0xC0110002 (64-bit). Additional flags may be set to indicate that extended data items are inserted between the trace header and the event data. Otherwise, the event data follows as the trace header’s Data array.
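A reader of raw trace buffers might recognise the event along the following lines. The sketch assumes a little-endian record in which the Marker is at offset 0x00, the Size at 0x04, the HookId at 0x06 and the Data array at 0x10. These offsets are assumptions for illustration, as is the helper name spinlock_event_data. Records whose Marker has additional flags for extended data items are deliberately rejected rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PERFINFO_LOG_TYPE_SPINLOCK 0x0529u
#define MARKER_BASIC_32            0xC0100002u
#define MARKER_BASIC_64            0xC0110002u
#define HEADER_SIZE_ASSUMED        0x10u   /* assumed offset of Data */

/* Returns a pointer to the event data if rec looks like a basic spin lock
   event, else NULL. On success, *is64 tells which Marker was seen. */
static const uint8_t *spinlock_event_data(const uint8_t *rec, size_t avail,
                                          int *is64)
{
    uint32_t marker;
    uint16_t size, hook;

    if (avail < HEADER_SIZE_ASSUMED) {
        return NULL;
    }
    memcpy(&marker, rec + 0, sizeof marker);   /* assumed offset 0x00 */
    memcpy(&size,   rec + 4, sizeof size);     /* assumed offset 0x04 */
    memcpy(&hook,   rec + 6, sizeof hook);     /* assumed offset 0x06 */

    if (marker != MARKER_BASIC_32 && marker != MARKER_BASIC_64) {
        return NULL;                           /* extended items not handled */
    }
    if (hook != PERFINFO_LOG_TYPE_SPINLOCK) {
        return NULL;
    }
    if (size < HEADER_SIZE_ASSUMED || size > avail) {
        return NULL;                           /* Size covers header and data */
    }
    *is64 = (marker == MARKER_BASIC_64);
    return rec + HEADER_SIZE_ASSUMED;
}
```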

Event Data

The event data is just the one fixed-size structure. This WMI_SPINLOCK is 0x30 or 0x38 bytes in 32-bit and 64-bit Windows, respectively. Names and types in the following are taken from type information in the published symbol files for AppXDeploymentClient.dll in version 6.2 and higher, reconciled with the C-language definition that Microsoft published for the 1511 release of Windows 10.

Offset (x86)  Offset (x64)  Definition                      Versions
0x00          0x00          PVOID SpinLockAddress;          6.1 and higher (x64); 6.2 and higher (x86)
0x04          0x08          PVOID CallerAddress;            6.1 and higher (x64); 6.2 and higher (x86)
0x08          0x10          ULONG64 AcquireTime;            6.1 and higher (x64); 6.2 and higher (x86)
0x10          0x18          ULONG64 ReleaseTime;            6.1 and higher (x64); 6.2 and higher (x86)
0x18          0x20          ULONG WaitTimeInCycles;         6.1 and higher (x64); 6.2 and higher (x86)
0x1C          0x24          ULONG SpinCount;                6.1 and higher (x64); 6.2 and higher (x86)
0x20          0x28          ULONG ThreadId;                 6.1 and higher (x64); 6.2 and higher (x86)
0x24          0x2C          ULONG InterruptCount;           6.1 and higher (x64); 6.2 and higher (x86)
0x28          0x30          UCHAR Irql;                     6.1 and higher (x64); 6.2 and higher (x86)
0x29          0x31          UCHAR AcquireDepth;             6.1 and higher (x64); 6.2 and higher (x86)
0x2A          0x32          union {                         6.1 and higher (x64); 6.2 and higher (x86)
                                struct {
                                    UCHAR AcquireMode : 6;
                                    UCHAR ExecuteDpc : 1;
                                    UCHAR ExecuteIsr : 1;
                                };
                                UCHAR Flags;
                            };
0x2B          0x33          UCHAR Reserved [5];             6.3 and higher
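For anyone parsing these events outside Windows, the tabulated layout can be restated with fixed-width types and checked at compile time. The following is a sketch, not Microsoft's definition: the type names with the _SKETCH suffix and the helper functions are hypothetical, pointers are modelled as integers of the target's width, and the layout is that of version 6.3 and higher. The bit fields are replaced by a plain Flags byte, decoded on the assumption that bit fields are allocated from the least significant bit, as Microsoft's compiler does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {                 /* layout as seen from 64-bit Windows */
    uint64_t SpinLockAddress;    /* 0x00 */
    uint64_t CallerAddress;      /* 0x08 */
    uint64_t AcquireTime;        /* 0x10 */
    uint64_t ReleaseTime;        /* 0x18 */
    uint32_t WaitTimeInCycles;   /* 0x20 */
    uint32_t SpinCount;          /* 0x24 */
    uint32_t ThreadId;           /* 0x28 */
    uint32_t InterruptCount;     /* 0x2C */
    uint8_t  Irql;               /* 0x30 */
    uint8_t  AcquireDepth;       /* 0x31 */
    uint8_t  Flags;              /* 0x32 */
    uint8_t  Reserved[5];        /* 0x33 */
} WMI_SPINLOCK64_SKETCH;

typedef struct {                 /* layout as seen from 32-bit Windows */
    uint32_t SpinLockAddress;    /* 0x00 */
    uint32_t CallerAddress;      /* 0x04 */
    uint64_t AcquireTime;        /* 0x08 */
    uint64_t ReleaseTime;        /* 0x10 */
    uint32_t WaitTimeInCycles;   /* 0x18 */
    uint32_t SpinCount;          /* 0x1C */
    uint32_t ThreadId;           /* 0x20 */
    uint32_t InterruptCount;     /* 0x24 */
    uint8_t  Irql;               /* 0x28 */
    uint8_t  AcquireDepth;       /* 0x29 */
    uint8_t  Flags;              /* 0x2A */
    uint8_t  Reserved[5];        /* 0x2B */
} WMI_SPINLOCK32_SKETCH;

/* Compile-time checks against the sizes and offsets in the table. */
static_assert(sizeof(WMI_SPINLOCK64_SKETCH) == 0x38, "x64 size");
static_assert(sizeof(WMI_SPINLOCK32_SKETCH) == 0x30, "x86 size");
static_assert(offsetof(WMI_SPINLOCK64_SKETCH, Flags) == 0x32, "x64 Flags");
static_assert(offsetof(WMI_SPINLOCK32_SKETCH, Flags) == 0x2A, "x86 Flags");

/* Decode the Flags byte, assuming low-to-high bit field allocation. */
static unsigned acquire_mode(uint8_t flags) { return flags & 0x3Fu; }
static unsigned execute_dpc(uint8_t flags)  { return (flags >> 6) & 1u; }
static unsigned execute_isr(uint8_t flags)  { return (flags >> 7) & 1u; }
```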

Plausibly the WMI_SPINLOCK structure is defined for 32-bit Windows 7 but just isn’t used. Though the 64-bit kernel’s code for spin locks had been in C (or C++) from the start, i.e., for Windows Server 2003 SP1, the corresponding code in the 32-bit kernel is still in assembly language in Windows 7. Its evolution from Windows NT 3.1 had gone as far as adding hypervisor notifications and, for Windows 7, the maintenance of performance counters in the KPRCB, but there it was left. Not until Windows 8 does 32-bit Windows trace events for spin locks.

Note that collection of PERFINFO_LOG_TYPE_SPINLOCK events exposes two kernel-mode addresses to inspection from user mode: the SpinLockAddress and CallerAddress.

The CallerAddress is from somewhere up the call stack when the lock was released. If the lock is used by a driver or other module outside the kernel, then the lock was released by calling either of the exported functions KeReleaseSpinLock and KeReleaseSpinLockFromDpcLevel, and the CallerAddress is the address to which this called function would return. Note that this need not be in the same routine that called an exported function to acquire the lock. For locks that are used internally by the kernel, as many are, widespread inlining has the side-effect that the CallerAddress can be in some sense distant, e.g., in a routine that called another which in turn acquired and released the lock possibly as only a very small part of its work.

The AcquireTime and ReleaseTime are time stamps from the rdtsc instruction and thus count the same cycles as WaitTimeInCycles. Three time stamps are taken: first when the lock is queried for acquisition; then when the lock actually has been acquired, which is the AcquireTime; and finally when the lock is released, which is its ReleaseTime. The WaitTimeInCycles counts from when the lock is queried until it is acquired. If indeed the lock was not immediately acquired, the processor will have waited in a relatively tight spin loop. The SpinCount tells how many additional times this loop tested whether the lock is yet available. The InterruptCount counts interrupts (in the sense of the InterruptCount in the processor’s KPRCB) from when the lock was queried for acquisition until it is released.
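The relationships among these members allow a consumer to derive a few quantities, sketched below. The type SPINLOCK_TIMES_SKETCH and the helper names are hypothetical, and treating a non-zero SpinCount as a sign of contention is an assumption drawn from the description above rather than a documented rule. Note also that WaitTimeInCycles is only 32 bits wide, so the reconstructed query time assumes the wait did not exceed that range.

```c
#include <assert.h>
#include <stdint.h>

/* Only the members needed for the arithmetic, named as in WMI_SPINLOCK. */
typedef struct {
    uint64_t AcquireTime;        /* time stamp when the lock was acquired */
    uint64_t ReleaseTime;        /* time stamp when the lock was released */
    uint32_t WaitTimeInCycles;   /* cycles from query to acquisition */
    uint32_t SpinCount;          /* additional tests in the spin loop */
} SPINLOCK_TIMES_SKETCH;

/* Cycles for which the lock was held. */
static uint64_t held_cycles(const SPINLOCK_TIMES_SKETCH *t)
{
    return t->ReleaseTime - t->AcquireTime;
}

/* Reconstructed first time stamp: when the lock was queried. */
static uint64_t query_time(const SPINLOCK_TIMES_SKETCH *t)
{
    return t->AcquireTime - t->WaitTimeInCycles;
}

/* Assumption: any additional spin means the lock was not free at first. */
static int looks_contended(const SPINLOCK_TIMES_SKETCH *t)
{
    return t->SpinCount != 0;
}
```

For example, an event with AcquireTime 5000, ReleaseTime 1005500 and WaitTimeInCycles 300 describes a lock that was queried at 4700 and held for 1000500 cycles.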

The Irql is recorded while the WMI_SPINLOCK is prepared, i.e., after the lock is released but before any restoration to a pre-acquisition IRQL. Broadly speaking, it is whatever applied while the lock was held. Once upon a time, this would have meant that the Irql is always at or above DISPATCH_LEVEL.

The AcquireDepth tells how many spin locks the processor held at the time of release. This includes the lock that is being released, and therefore must be at least 1. For all known versions that can write the event, nested acquisition of spin locks is tracked to a depth of eight.

The known values for AcquireMode are: