KPROFILE

The KPROFILE (formally _KPROFILE) is the structure in which the kernel keeps information about an active request to examine and act on a profile interrupt.

That KPROFILE is Microsoft’s name is certain—see below—but not from documentation or from any header in a programming kit or even from public symbol files. That is some measure of the structure’s being internal to the kernel. Against this is that the structure is in the formal scheme of kernel objects that start with a type from the KOBJECTS enumeration and a size. Many such objects are documented, if only as being opaque, and have full C-language definitions in headers from as far back as the Device Driver Kit (DDK) for Windows NT 3.1. The main difference is that the documented kernel objects can be caller-supplied but the only creator of a KPROFILE is the kernel itself.

Profiling

Historically, a KPROFILE is created only when user-mode software completes two steps: first, to call the undocumented NtCreateProfile or NtCreateProfileEx function to describe what execution to sample via what sort of profile interrupt, subject to what conditions, with what storage of results; second, to start this profiling by calling NtStartProfile. The kernel creates a KPROFILE, which then carries the conditions for which interrupts to act on and the parameters for what’s to be done as this action. Among the possible conditions is that profiling can be specific to a process or may apply globally. Correspondingly, the KPROFILE goes into either: a per-process list, whose head is the ProfileListHead very near the start of the KPROCESS; else into a global list, whose head is in the kernel’s own data section. When the corresponding profiling is stopped, typically by a call to NtStopProfile, the KPROFILE is removed from its list and is destroyed.

Profile interrupts are arranged with the Hardware Abstraction Layer (HAL), either to recur periodically or when some limit is reached for a processor-specific Performance Monitoring Counter (PMC). Whenever the kernel learns of a profile interrupt’s occurrence, from the HAL via KeProfileInterrupt or KeProfileInterruptWithSource, the global list of profile objects and the list for the current process are both examined and acted on.

Originally, and even still for a profile object that is created as described in the preceding paragraphs, the examination and action are tightly constrained by the inputs to the NtCreateProfile and NtCreateProfileEx functions. The examination matches the circumstances of the interrupt against the conditions that are recorded in the profile object. That the interrupted execution is for the process that was specified at the profile’s creation is known from the object’s presence in the list for the current process at the time of the interrupt. Other conditions are that the interrupt was generated from the profile source that is recorded in the object, that the current processor is among those selected for profiling, and that the interrupted execution lies within the profiled address range.

If all these conditions are met, the action is simply to increment an execution count in a specified buffer according to where the interrupted execution lies within the profiled address range. The set of these execution counts is then a frequency distribution of execution within the profiled region, as sampled by the recurring profile interrupts.
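The examination and action can be modelled in portable C. What follows is only a sketch under stated assumptions, not the kernel’s code: SKETCH_PROFILE and SketchExamineProfile are hypothetical names, the processor selection is reduced to one 64-bit mask, and the bucket arithmetic relies on BucketShift being two less than the base-2 logarithm of the bucket size, as described for the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a basic profile object.
   Member names follow the article; the types are simplified. */
typedef struct SKETCH_PROFILE {
    uintptr_t RangeBase;    /* inclusive start of the profiled region */
    uintptr_t RangeLimit;   /* non-inclusive end of the profiled region */
    uint32_t  BucketShift;  /* log2 (bucket size in bytes) - 2 */
    uint32_t *Buffer;       /* one 32-bit execution count per bucket */
    uint64_t  Affinity;     /* bit n set means processor n is profiled */
    int16_t   Source;       /* profile source that must match */
} SKETCH_PROFILE;

/* Returns 1 if the interrupt meets every condition and is counted, else 0. */
int SketchExamineProfile (
    SKETCH_PROFILE *Profile,
    int16_t Source,
    unsigned Processor,
    uintptr_t Pc)
{
    assert (Processor < 64);

    /* Condition: interrupt comes from the expected profile source. */
    if (Profile -> Source != Source) return 0;

    /* Condition: the current processor is selected for profiling. */
    if ((Profile -> Affinity & ((uint64_t) 1 << Processor)) == 0) return 0;

    /* Condition: the interrupted address is in the profiled region. */
    if (Pc < Profile -> RangeBase || Pc >= Profile -> RangeLimit) return 0;

    /* Action: increment the count for the bucket that holds the address.
       Shifting by BucketShift + 2 divides by the bucket size. */
    size_t bucket = (Pc - Profile -> RangeBase) >> (Profile -> BucketShift + 2);
    Profile -> Buffer [bucket] ++;
    return 1;
}
```

Calling this once per simulated interrupt, for each object in the global list and in the current process’s list, builds the frequency distribution in Buffer.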

All this basic functionality for profiling was in place right from version 3.10 with just the one exception that qualification by profile source and executing processor had to wait for version 3.51. Moreover, this basic profiling has changed remarkably little in the decades since. For present purposes, arguably the main change is simply in the numerical value of the object type at the beginning of every KPROFILE that is created as described above: it is 0x0F up to and including version 3.51 but 0x17 ever after.

Profile Callback Objects

As Windows developed, the kernel allowed more ways and reasons to ask the HAL to generate profile interrupts. It thus acquired more things to do on learning of a profile interrupt’s occurrence. Except for processing the applicable lists of profile objects, all that the kernel originally did with profile interrupts was to count them. Starting with Windows XP, however, the kernel allows that profile interrupts can be arranged not for adding to a histogram in a specified buffer, as above, but for recording each one in an event trace. Such special cases in the handling of profile interrupts had accreted enough by Windows 8 that some unification must have seemed worthwhile. This took the form of introducing a second type of profile object, apparently thought of as a profile callback object.

The object type at the beginning of every KPROFILE that is created specifically as a profile callback object is 0x11 (instead of 0x17). For a profile callback object, the examination is less specific but the action is very general. The only condition to meet is that the interrupt was generated from the expected profile source. The action to be taken is left to an essentially arbitrary callback routine. The Windows 10 kernel supplies three routines for profile callback objects. One is for internal bookkeeping (to do with cache errata support) but two are for behaviour that can be (and typically is) directed from user mode for event tracing.

A built-in profile callback object for a periodically recurring profile interrupt is “started” by enabling PERF_PROFILE (0x20000002) in the group mask for an NT Kernel Logger session. The documented way to do this from user mode is to set EVENT_TRACE_FLAG_PROFILE (0x01000000) in the EnableFlags member of an EVENT_TRACE_PROPERTIES structure that is given to the StartTrace and ControlTrace functions when starting or controlling an NT Kernel Logger session. The event that results on each interrupt has the hook ID PERFINFO_LOG_TYPE_SAMPLED_PROFILE (0x0F2E).

An array of up to four profile callback objects can be dynamically allocated for similar event tracing on receipt of profile interrupts that are generated from processor-specific performance monitoring counters. Little or nothing is documented about the steps required for arranging this. The counters must be specified in advance. The only known way from user mode is through TraceSetInformation with the information class TraceProfileSourceConfigInfo (6). The profiling of these sources, however, is not supported through the EnableFlags, only the group mask. The bit to set is PERF_PMC_PROFILE (0x20000400), again through TraceSetInformation but for the information class TraceSystemTraceEnableFlagsInfo (4). The event that results on each interrupt has the hook ID PERFINFO_LOG_TYPE_PMC_INTERRUPT (0x0F2F).

Slightness of documentation by Microsoft is a recurring theme with profiling, apparently because Microsoft has preferred that programmers who find a use for profiling should rely on the magic of Microsoft’s tools rather than write their own or use those of a third-party diagnostics vendor.

Documentation Status

To say the KPROFILE is undocumented is an understatement. Microsoft has never published a C-language definition in any DDK or Windows Driver Kit (WDK) or any other programming kit, nor even declared the KPROFILE as opaque. The practical equivalent of a C-language definition might be published as type information in public symbols, as for many undocumented structures, but none are yet known to have shown even the name of the KPROFILE let alone of its members.

That said, type information for the KPROFILE does turn out to have been published, surely by oversight, in most versions of a statically linked library named CLFSMGMT.LIB which Microsoft supplies with the Software Development Kit (SDK) as if for user-mode programming. It’s present in this library’s x86 builds starting from Windows Vista and x64 from Windows 8. It stops for the 2004 release of Windows 10.

Variability

Though the KPROFILE is internal, it is almost as stable as many a documented structure, presumably as a side-effect of its very tightly constrained use. After version 3.51 allowed for specification of the profile source and of which processors will have their execution profiled, the only formal change is for Windows 7 to support more than 32 or 64 processors by way of processor groups. That the size then increases for 64-bit Windows 8 is simply from allowing for more processor groups:

Version        Size (x86)  Size (x64)
3.10 to 3.50   0x28
3.51 to 6.0    0x2C        0x58
6.1            0x34        0x78
6.2 to 2004    0x34        0xF8

Layout

The sizes in the preceding table and the offsets and definitions in the tables that follow are from type information in CLFSMGMT.LIB for applicable versions. What’s presented for other versions comes from inspecting the kernel and assuming that continuity of behaviour likely means continuity of names and types.

Offset (x86)  Offset (x64)  Definition                    Versions
0x00          0x00          SHORT Type;                   all
0x02          0x02          SHORT Size;                   all
0x04          0x08          LIST_ENTRY ProfileListEntry;  all
0x0C          0x18          KPROCESS *Process;            all

The 16-bit Type takes its value from the KOBJECTS enumeration. The 16-bit Size is of the kernel object, in bytes.

If profiling is just of execution in one process, the ProfileListEntry links the KPROFILE into the ProfileListHead of the specified Process. Otherwise, profiling is global, the link is then into a list head in the kernel’s own data, and Process is NULL.

Parameters for Basic Profile Object

Before version 6.2, the structure continues directly with parameters that govern the profiling. Starting with version 6.2, these parameters are instead in an anonymous structure that’s the first branch of an anonymous union.

Offset (x86)               Offset (x64)  Definition          Versions      Remarks
0x10                       0x20          PVOID RangeBase;    all
0x14                       0x28          PVOID RangeLimit;   all
0x18                       0x30          ULONG BucketShift;  all
0x1C                       0x38          PVOID Buffer;       all
0x20 (3.10 to 3.50)                      BOOLEAN Started;    3.10 to 3.50  next at 0x2A
0x24 (3.10 to 3.50); 0x20  0x40          ULONG Segment;      all

The RangeBase and RangeLimit are respectively the inclusive start and non-inclusive end addresses of the profiled region. This region is treated as an array of fixed-size buckets. The bucket size in bytes is necessarily a power of two. What the KPROFILE keeps is not the bucket size but a BucketShift, which is two less than the logarithm base 2 of the bucket size in bytes. This optimises the computation of which bucket holds the return address for any given interrupted execution and which 32-bit counter at Buffer gets the corresponding increment.
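Why the shift is two less than the logarithm may be easier to see in code. Shifting an offset right by the full logarithm gives a bucket index, which would then be multiplied by four to reach the bucket’s 32-bit counter. Shifting by two less and clearing the low two bits gets the byte offset of the counter in one step. The following is a sketch of that arithmetic only. The function names are hypothetical, and both the masking form and the four-byte minimum for the bucket size are assumptions, not something taken from Microsoft’s code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BucketShift as described: log2 (bucket size in bytes) - 2.
   Assumes the bucket size is a power of two and at least 4 bytes. */
unsigned SketchBucketShift (uint32_t BucketSize)
{
    assert (BucketSize >= 4 && (BucketSize & (BucketSize - 1)) == 0);

    unsigned exponent = 0;
    while ((BucketSize >> exponent) != 1) exponent ++;
    return exponent - 2;
}

/* Byte offset into Buffer of the 32-bit counter for the given address.
   Assumed form: shift by BucketShift, then round down to a multiple of 4. */
size_t SketchCounterOffset (
    uintptr_t Pc,
    uintptr_t RangeBase,
    unsigned BucketShift)
{
    assert (Pc >= RangeBase);
    return (size_t) ((Pc - RangeBase) >> BucketShift) & ~ (size_t) 3;
}
```

For 16-byte buckets, SketchBucketShift returns 2, and every address from 0x10 to 0x1F bytes into the region maps to the counter at byte offset 4, which is the second counter in the buffer.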

The Segment is a special provision for profiling virtual-8086 execution. It is evidently defined for the x64 structure even though it cannot be acted on.

Parameters for Profile Callback Object

Starting with version 6.2, the second type of profile object is supported through a second anonymous structure within the anonymous union:

Offset (x86)  Offset (x64)  Definition                                Versions
0x10          0x20          VOID (*Callback) (KTRAP_FRAME *, PVOID);  6.2 and higher
0x14          0x28          PVOID Context;                            6.2 and higher

To be clear, the Context is what each invocation of the Callback gets as its second argument.
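How a profile callback object might be acted on can be sketched as follows. This is a user-mode model under assumptions, not kernel code: SKETCH_TRAP_FRAME stands in for KTRAP_FRAME, and the structure, the dispatch function and the sample callback are all hypothetical names. The one condition that is tested is the match of profile source, as described earlier for profile callback objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for KTRAP_FRAME: only an interrupted address is modelled. */
typedef struct SKETCH_TRAP_FRAME {
    uintptr_t Pc;
} SKETCH_TRAP_FRAME;

typedef void (*SKETCH_CALLBACK) (SKETCH_TRAP_FRAME *, void *);

/* Hypothetical, simplified stand-in for a profile callback object. */
typedef struct SKETCH_CALLBACK_PROFILE {
    SKETCH_CALLBACK Callback;   /* what to do on a matching interrupt */
    void *Context;              /* passed back as the second argument */
    int16_t Source;             /* profile source that must match */
} SKETCH_CALLBACK_PROFILE;

/* Returns 1 if the callback was invoked, else 0. */
int SketchDispatch (
    SKETCH_CALLBACK_PROFILE *Profile,
    int16_t Source,
    SKETCH_TRAP_FRAME *Frame)
{
    assert (Profile -> Callback != NULL);

    if (Profile -> Source != Source) return 0;
    Profile -> Callback (Frame, Profile -> Context);
    return 1;
}

/* Sample callback: treats its Context as a count of interrupts. */
void SketchCountInterrupts (SKETCH_TRAP_FRAME *Frame, void *Context)
{
    (void) Frame;
    ++ * (unsigned *) Context;
}
```

The point of the design is that the profile object knows nothing of what the Context means. Only the callback interprets it.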

Processor Parameters

Version 3.51 brought more control of which processors’ execution is profiled and to what source of interrupt. Corresponding parameters were appended to the structure and have stayed there:

Offset (x86)              Offset (x64)                         Definition              Versions         Remarks
0x24                      0x48                                 KAFFINITY Affinity;     3.51 to 6.0
0x24                      0x48                                 KAFFINITY_EX Affinity;  6.1 and higher
0x28 (3.51 to 6.0); 0x30  0x50 (5.2 to 6.0); 0x70 (6.1); 0xF0  SHORT Source;           3.51 and higher
0x2A (3.51 to 6.0); 0x32  0x52 (5.2 to 6.0); 0x72 (6.1); 0xF2  BOOLEAN Started;        3.51 and higher  previously at 0x20

The profile Source takes its value from the KPROFILE_SOURCE enumeration but narrowed to only 16 bits. The one-byte boolean indicator of whether profiling has Started predates version 3.51, but was moved to the alignment space that version 3.51 left after the narrowed profile source.
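As a cross-check of the tables, the x64 layout for version 6.2 and higher can be modelled with fixed-width types and the offsets verified at compile time. This is a sketch, not a header from Microsoft. The SKETCH names are hypothetical, pointers are modelled as 64-bit integers so that the result does not depend on the host’s pointer size, and the members of the KAFFINITY_EX stand-in are an assumption whose total of 0xA8 bytes is inferred from the offsets above (0xF0 less 0x48).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit LIST_ENTRY modelled as two 64-bit links. */
typedef struct {
    uint64_t Flink;
    uint64_t Blink;
} SKETCH_LIST_ENTRY64;

/* Assumed shape of the 64-bit KAFFINITY_EX in version 6.2 and higher. */
typedef struct {
    uint16_t Count;
    uint16_t Size;
    uint32_t Reserved;
    uint64_t Bitmap [20];
} SKETCH_KAFFINITY_EX64;

/* Model of the x64 KPROFILE for version 6.2 and higher.
   Assumes the host aligns uint64_t to 8 bytes within structures. */
typedef struct {
    int16_t Type;
    int16_t Size;
    SKETCH_LIST_ENTRY64 ProfileListEntry;
    uint64_t Process;                   /* KPROCESS * */
    union {
        struct {                        /* basic profile object */
            uint64_t RangeBase;         /* PVOID */
            uint64_t RangeLimit;        /* PVOID */
            uint32_t BucketShift;
            uint64_t Buffer;            /* PVOID */
            uint32_t Segment;
        };
        struct {                        /* profile callback object */
            uint64_t Callback;          /* function pointer */
            uint64_t Context;           /* PVOID */
        };
    };
    SKETCH_KAFFINITY_EX64 Affinity;
    int16_t Source;
    uint8_t Started;                    /* BOOLEAN */
} SKETCH_KPROFILE64;

static_assert (offsetof (SKETCH_KPROFILE64, ProfileListEntry) == 0x08, "ProfileListEntry");
static_assert (offsetof (SKETCH_KPROFILE64, Process) == 0x18, "Process");
static_assert (offsetof (SKETCH_KPROFILE64, RangeBase) == 0x20, "RangeBase");
static_assert (offsetof (SKETCH_KPROFILE64, Buffer) == 0x38, "Buffer");
static_assert (offsetof (SKETCH_KPROFILE64, Segment) == 0x40, "Segment");
static_assert (offsetof (SKETCH_KPROFILE64, Callback) == 0x20, "Callback");
static_assert (offsetof (SKETCH_KPROFILE64, Context) == 0x28, "Context");
static_assert (offsetof (SKETCH_KPROFILE64, Affinity) == 0x48, "Affinity");
static_assert (offsetof (SKETCH_KPROFILE64, Source) == 0xF0, "Source");
static_assert (offsetof (SKETCH_KPROFILE64, Started) == 0xF2, "Started");
static_assert (sizeof (SKETCH_KPROFILE64) == 0xF8, "size");
```

That the model compiles means only that the tabulated offsets and size are mutually consistent under ordinary C alignment rules. It is no substitute for the type information.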

The Profiled Region

Special mention must be made of what the profile object records of the profiled region’s end address. As input to NtCreateProfile and NtCreateProfileEx, the profiled region is described by its address and size. Adding the two produces a non-inclusive end address, which is what’s described in the layout above. The intention seems plain that an interrupted instruction lies in the profiled region if its address is greater than or equal to the start address and less than the non-inclusive end address.

You may be wondering why an article that can exist only for advanced programmers troubles over so simple a point. And then you might infer that there must be at least an ambiguity, if not an outright defect, in the implementation. And so there is, but in the reverse direction from usual. It’s not that an early implementation was faulty and was eventually found to need fixing. It is instead that the fault came from carelessness when implementing new functionality.

For most of the history of Windows there’s not even ambiguity. Up to and including Windows 7, the end address that’s saved in the profile object as RangeLimit is the sum of the ProfileBase and ProfileSize arguments that were given for the profile’s creation, and when the profile object is examined on receipt of a profile interrupt this end address is interpreted as non-inclusive. Had the code been left like that, then the layout above would say “non-inclusive end address of profiled area” and this multi-paragraph digression would not exist. I certainly don’t want to treat my readers as if basic knowledge of their craft would better be spelt out in laboured detail.

Unfortunately, when the introduction of profile callback objects for Windows 8 brought a reworking of the code for KeProfileInterruptWithSource, the reworking introduced a simple error of arithmetic. The end address that’s saved in a profile object is still the sum of address and size, but when the profile object is examined at interrupt time this non-inclusive end address is instead interpreted as inclusive. A consequence is that if correctly formed calls create and start a profile whose buffer for the execution counts just happens to end at a page boundary, then chance execution at exactly the non-inclusive end of the profiled area can crash Windows! Of course, the chance can be engineered, with the effect that even a low-integrity user-mode program can bring down all of Windows. For details and a demonstration, see Bug Check From User Mode By Profiling. I told Microsoft of this in early 2017 and again in mid-2018, and they fixed it for the 1809 release of Windows 10.
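The effect of the misinterpretation is easily modelled. In the following sketch, all the function names are hypothetical, the comparison forms are inferred from the description above, and both the counter arithmetic and the sizing of the buffer are assumptions made for illustration. What it shows is that an address exactly at the non-inclusive end passes the faulty test and selects a counter whose offset equals the size of the buffer, which is to say the first four bytes beyond it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Intended test: RangeLimit is a non-inclusive end address. */
int SketchInRegionNonInclusive (
    uintptr_t Pc,
    uintptr_t RangeBase,
    uintptr_t RangeLimit)
{
    return Pc >= RangeBase && Pc < RangeLimit;
}

/* Faulty test: the same RangeLimit treated as if inclusive. */
int SketchInRegionInclusive (
    uintptr_t Pc,
    uintptr_t RangeBase,
    uintptr_t RangeLimit)
{
    return Pc >= RangeBase && Pc <= RangeLimit;
}

/* Assumed form of the counter selection: byte offset into the buffer. */
size_t SketchOffsetOfCounter (
    uintptr_t Pc,
    uintptr_t RangeBase,
    unsigned BucketShift)
{
    assert (Pc >= RangeBase);
    return (size_t) ((Pc - RangeBase) >> BucketShift) & ~ (size_t) 3;
}

/* Bytes for one 32-bit counter per bucket, assuming that ProfileSize
   is a whole multiple of the bucket size. */
size_t SketchBufferBytes (size_t ProfileSize, unsigned BucketShift)
{
    return (ProfileSize >> (BucketShift + 2)) * sizeof (uint32_t);
}
```

Take a region of 0x40 bytes at 0x1000 with 16-byte buckets, so that BucketShift is 2 and the buffer is 16 bytes. For the address 0x1040, the non-inclusive test rejects the interrupt but the inclusive test accepts it, and the counter offset is then 16. If the buffer ends at a page boundary and the next page is invalid, the increment faults.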