Geoff Chappell, Software Analyst
The HvlQueryNumaDistance function queries the hypervisor for the distance between two NUMA nodes.
NTSTATUS
HvlQueryNumaDistance (
    USHORT CpuNumaNode,
    USHORT MemoryNumaNode,
    ULONG64 *Distance);
The CpuNumaNode and MemoryNumaNode arguments specify the nodes to evaluate.
The Distance argument provides the address of a variable that is to receive the distance.
The function returns STATUS_SUCCESS if successful, else a negative error code.
If the function fails, it sets the distance to -1, which for the unsigned 64-bit variable means all bits set. This, not the return value, is what the kernel tests for the success or failure of its one internal use of this function. No other use of the function is known.
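A caller that follows the kernel's reported habit of testing the distance rather than the returned status might look something like the sketch below. It is a user-mode illustration only: MockHvlQueryNumaDistance is a hypothetical stand-in for the real function, the node limit and the distances 1024 and 2048 are invented, and NTSTATUS is redefined locally so that the code compiles outside the kernel.

```c
#include <stdint.h>

typedef int32_t NTSTATUS;                       /* local stand-in for the kernel type */
#define STATUS_SUCCESS       ((NTSTATUS)0)
#define STATUS_UNSUCCESSFUL  ((NTSTATUS)0xC0000001)
#define NO_DISTANCE          UINT64_MAX         /* -1 as an unsigned 64-bit value */

/* Hypothetical mock of HvlQueryNumaDistance: pretends that only nodes 0 and 1
   exist and returns invented distances. On failure it stores -1, as the
   real function is described as doing. */
NTSTATUS MockHvlQueryNumaDistance(uint16_t CpuNumaNode,
                                  uint16_t MemoryNumaNode,
                                  uint64_t *Distance)
{
    if (CpuNumaNode > 1 || MemoryNumaNode > 1) {
        *Distance = NO_DISTANCE;
        return STATUS_UNSUCCESSFUL;
    }
    *Distance = (CpuNumaNode == MemoryNumaNode) ? 1024 : 2048;
    return STATUS_SUCCESS;
}

/* Returns 1 and stores the distance in *out if the query produced one,
   else returns 0 and leaves *out untouched. The test is on the distance,
   not on the NTSTATUS, which is deliberately ignored. */
int TryHypervisorDistance(uint16_t cpu, uint16_t mem, uint64_t *out)
{
    uint64_t distance;

    (void)MockHvlQueryNumaDistance(cpu, mem, &distance);
    if (distance == NO_DISTANCE) {
        return 0;
    }
    *out = distance;
    return 1;
}
```

A caller written this way depends on the function storing -1 on every failure path, which is the behaviour described above.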
The HvlQueryNumaDistance function is exported by name from the kernel in version 6.3 and higher. It exists in version 6.2 but only as an internal routine, not as an exported function.
The HvlQueryNumaDistance function is not documented. It is, however, declared in the NTOSP.H from the Windows Driver Kit (WDK) for Windows 10.
The query is made of the hypervisor via hypercall code 0x0078, which Microsoft’s Hypervisor Top-Level Functional Specification documents as HvCallQueryNumaDistance. This documentation describes the distance as “the number of CPU cycles for 1024 accesses” from the CPU node to the memory node, else as -1 if “the calculation is not possible.”
The hypercall requires proximity domain IDs. The function obtains these from the given node numbers by looking in the kernel’s array of KNODE structures.
The function does not check that the hypervisor supports the query, i.e., that NumaDistanceQueryAvailable is set in the HV_HYPERVISOR_FEATURES that are the output of cpuid leaf 0x40000003. The kernel’s one internal caller makes this check itself before calling the function. The point is that the kernel would otherwise compute the distance between nodes by itself, but on noticing that it can ask the hypervisor, it does so in preference.
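The check that a caller must make for itself can be sketched as a test of one bit in the cpuid output. The sketch assumes that NumaDistanceQueryAvailable is bit 7 of the EDX register returned for leaf 0x40000003, which is where the Hypervisor Top-Level Functional Specification appears to place it. Both macro names are made up for this illustration, and the EDX value is passed in as a parameter so that the code does not depend on any particular compiler's cpuid intrinsic.

```c
#include <stdint.h>

/* cpuid leaf whose output is described by HV_HYPERVISOR_FEATURES */
#define HV_CPUID_LEAF_FEATURES          0x40000003u

/* Assumed position of NumaDistanceQueryAvailable: bit 7 of EDX.
   Verify against the specification before relying on it. */
#define HV_EDX_NUMA_DISTANCE_AVAILABLE  (1u << 7)

/* Takes the EDX value obtained from cpuid leaf HV_CPUID_LEAF_FEATURES and
   returns 1 if the hypervisor advertises the NUMA distance query, else 0. */
int NumaDistanceQueryAvailable(uint32_t edx)
{
    return (edx & HV_EDX_NUMA_DISTANCE_AVAILABLE) != 0;
}
```

Only when such a test succeeds would a caller go on to HvlQueryNumaDistance. Otherwise it is left to measure the distance for itself.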
Incidentally, the kernel establishes its table of distances between nodes during phase 1 of the kernel’s initialisation. Without hypervisor support, the kernel computes the distances experimentally by: switching to a processor in the CPU node; obtaining a page of memory from the memory node, locked and mapped into system address space; and then, while at DISPATCH_LEVEL, timing the reading of the whole page as successive dwords or qwords (for 32-bit and 64-bit Windows, respectively).
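The timed read at the heart of that experiment can be approximated in user mode as below. This is not the kernel's code. It only shows the shape of the loop, assuming a 4096-byte page and 64-bit reads. The function name is invented, the standard clock() is far too coarse for a real measurement, and nothing here pins the thread to a processor, chooses the memory's node or raises the IRQL.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Reads the whole page as successive 64-bit words and returns the elapsed
   clock() ticks. The number of reads performed is stored in *reads.
   The page must be aligned for 64-bit access. */
clock_t TimePageRead(const void *page, size_t *reads)
{
    const volatile uint64_t *p = (const volatile uint64_t *)page;
    size_t count = PAGE_SIZE_BYTES / sizeof(uint64_t);
    uint64_t sink = 0;
    clock_t start, end;
    size_t i;

    start = clock();
    for (i = 0; i < count; i++) {
        sink += p[i];           /* volatile keeps every read in the loop */
    }
    end = clock();

    (void)sink;
    *reads = count;
    return end - start;
}
```

For a 4096-byte page the loop makes 512 reads in the 64-bit case. The 32-bit equivalent would read 1024 dwords, which happens to match the 1024 accesses in the hypervisor's definition of distance.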