RTL_BALANCED_NODE

The RTL_BALANCED_NODE structure is designed to be nested within another structure to allow that this other structure can be the node of an AVL or Red-Black tree.

Availability

The RTL_BALANCED_NODE structure looks to have been introduced for version 6.2 in both the kernel and NTDLL.DLL.

Documentation Status

The RTL_BALANCED_NODE structure is not documented. It is conspicuous for being defined in NTDEF.H even though it is not referenced from any function declaration or structure definition in any header from either the Windows Driver Kit (WDK) or Software Development Kit (SDK).

Contrast with the LIST_ENTRY, which similarly provides for structures as nodes in a double-linked list. It too is defined in NTDEF.H. It, though, is a staple of kernel-mode programming. Macros and inline routines for using the structure, e.g., InsertTailList and RemoveEntryList, have long been defined in such standard headers as NTDDK.H and WDM.H, and have long been used extensively. It would be a safe proposition that few real-world kernel-mode drivers do not use the LIST_ENTRY and the various supporting macros and inline routines. Support for the LIST_ENTRY in user-mode programming is thinner, being left to a smattering of headers for specialised purposes. For the RTL_BALANCED_NODE, no use at all is yet supported for either kernel-mode or user-mode programming: though Microsoft has published a C-language definition of the structure, it might as well not have.

Private symbol files that Microsoft has distributed in packages of public symbols show that the RTL_BALANCED_NODE is also known, through typedef, as RTL_AVL_NODE and RTL_RB_NODE, presumably for neatness when the node is intended specifically for an AVL or RB tree.

Layout

The RTL_BALANCED_NODE is 0x0C and 0x18 bytes, respectively, in 32-bit and 64-bit Windows.

Offset (x86) Offset (x64) Definition Versions
0x00 0x00
union {
    RTL_BALANCED_NODE *Children [2];
    struct {
        RTL_BALANCED_NODE *Left;
        RTL_BALANCED_NODE *Right;
    };
};
6.2 and higher
0x08 0x10
union {
    UCHAR Red : 1;
    UCHAR Balance : 2;
    ULONG_PTR ParentValue;
};
6.2 and higher

The point to the RTL_BALANCED_NODE is that it models a node in a tree. The nodes are ordered. The tree provides for finding nodes with the efficiency of a binary search. Each node can be connected to at most three other nodes: a left child, a right child and a parent. Each node but one, which is distinguished as the tree’s root, has at least a parent. A node’s left and right nodes are collectively the node’s children. They and their children, recursively, are collectively the node’s descendants. The tree is ordered such that for every node in the tree, all descendents of a left or right child are also to the left or right, respectively, of the node.

The Red bit applies if the node is in a Red Black tree, but Balance in an AVL tree. Either way, nodes are assumed to have (at least) 4-byte alignment. NTDEF.H defines RTL_BALANCED_NODE_RESERVED_PARENT_MASK as 3 and uses it for a macro, RTL_BALANCED_NODE_GET_PARENT_POINTER, to extract the address of the parent node from ParentValue.

Exported Support

The kernel and NTDLL each export several functions for working with nodes in either sort of tree. For an AVL tree:

For a Red Black tree:

Inline Support

Microsoft’s programmers, of course, have inline routines for the RTL_BALANCED_NODE much as for the LIST_ENTRY. This is plain in the binary code from the recurrence of moderately lengthy sequences with stitching that suggests compilation of a call rather than of repeated source code. Some details of this support that Microsoft keeps to itself also show in public symbol files, and so Microsoft’s names and even types are known for some of the inline routines that work with the RTL_BALANCED_NODE:

Indeed, this list is just for the original release of Windows 10. Later releases use these and more, with the strong suggestion that this Run-Time Library support for balanced trees has not only been in active use at Microsoft for nearly a decade but is being developed further or is finding more extensive use (or both).

Most of these inline routines take among their arguments the address of a callback routine. The types are known from public symbol files. Microsoft’s typedef names are known from private symbol files that Microsoft has distributed in packages of public symbols. There are three:

typedef LONG RTL_TREE_COMPARE_FUNCTION (PVOID, RTL_BALANCED_NODE *);
typedef VOID RTL_TREE_DELETE_FUNCTION (RTL_BALANCED_NODE *, PVOID);
typedef LONG RTL_TREE_WALK_FUNCTION (RTL_BALANCED_NODE *, PVOID);

For each, there is also the usual pointer type that has the P prefix.

Insertion

Not that the exported functions RtlAvlInsertNodeEx and RtlRbInsertNodeEx are documented, but programmers who would use them for inserting a node into a tree must either know Microsoft’s inline routines too or write their own. The exported functions know the tree is ordered, of course, and they know to preserve the ordering, but they know nothing of what governs this ordering. The caller orders the nodes by choosing where to insert them. By inserting to the right of an existing node, the caller expresses that the existing node had no right child and will be the right-most node to the left of the new node. If this node had a right child, the caller would instead insert to the left of the existing node that is the left-most node to the right of the new node. This scanning, aided by a callback routine for comparing nodes, is easily coded compactly enough for inlining. The colouring and rebalancing that may have to be done to the tree when inserting a node is not. Inserting a node is thus done in two steps:

  1. an inline routine, typically RtlTreeFindInsertLocation, scans the tree for the existing node to insert the new node to the left or right of;
  2. an exported function, meaning either RtlAvlInsertNodeEx or RtlRbInsertNodeEx, depending on the type of tree, does the insertion.

As noted above, symbol files tell the names and types of some inline routines. For each, the name’s suggestion of functionality and the type’s detailing of parameters and return value can then be matched against the binary to find with high confidence where the routine has been inlined. For instance, RtlTreeFindInsertLocation is coded very much like

FORCEINLINE
PRTL_BALANCED_NODE 
RtlTreeFindInsertLocation (
    PRTL_BALANCED_NODE Root,
    PVOID Context,
    PRTL_TREE_COMPARE_FUNCTION Callback,
    BOOLEAN *Right)
{
    PRTL_BALANCED_NODE node, next;

    *Right = FALSE;
    for (node = Root; node != NULL; node = next) {
        LONG cmp = Callback (Context, node);
        if (cmp < 0) {
            next = node -> Left;
            if (next == NULL) {
                *Right = FALSE;
                break;
            }
        }
        else {
            next = node -> Right;
            if (next == NULL) {
                *Right = TRUE;
                break;
            }
        }
    }
    return node;
}

The preceding fragment is reproduced to demonstrate a point of critical analysis. It is sometimes said that the names of inlined routines do not survive in public symbol files and their appearance in other code, at least in quantity, suggests access to source code. Here, however, is an example of an inlined routine being recoverable, with Microsoft’s name for the routine and types for the arguments and return value, without needing access to source code.