NTDEF.H

The public symbol file NTKRPAMP.PDB for the original release of Windows 10 tells that the kernel is built with the NTDEF.H header at

d:\th.public.fre\shared\inc\minwin

and draws from it the type definitions that are tabulated below.

Nowadays, NTDEF.H is among the headers in the Software Development Kit (SDK). It is there in the “shared” subdirectory with many other headers that are intended for use in both kernel-mode and user-mode programming. The SDK is intended to be installed before the Windows Driver Kit (WDK), typically as a side-effect of installing Visual Studio first. Before Windows 8, kernel-mode programming kits were self-standing, and so NTDEF.H was then supplied in the WDK and, earlier still, in the Device Driver Kit (DDK). Indeed, NTDEF.H is ancient, being one of relatively few headers in the DDK for Windows NT 3.1.

NTDEF.H defines many of the basic types for all of Windows programming. It is directly included by each of the standard headers for kernel-mode programming, here meaning WDM.H, NTDDK.H and NTIFS.H. Much of its content is duplicated in other headers, both for kernel-mode and user-mode programming. The line numbers on the left are known from the symbol file. All agree with the NTDEF.H from the SDK for the original release of Windows 10. The line numbers on the right are from headers that are readily available in the WDK or SDK and which are thought to acquire their definitions of these types as duplicates from NTDEF.H.

| Line Number | Type | MINIPORT.H | MINITAPE.H | WINNT.H | WUDFWDM.H |
|---|---|---|---|---|---|
| 629 | struct _QUAD | 582 | 588 | | |
| 709 | struct _PROCESSOR_NUMBER | 662 | 668 | 567 | |
| 720 | struct _GROUP_AFFINITY | 673 | 679 | 578 | |
| 1081 | union _LARGE_INTEGER | 862 | 868 | 791 | |
| 1086 | unnamed struct for u in _LARGE_INTEGER | 867 | 873 | 796 | |
| 1099 | union _ULARGE_INTEGER | 880 | 886 | 809 | |
| 1104 | unnamed struct for u in _ULARGE_INTEGER | 885 | 891 | 814 | |
| 1128 | struct _LUID | | | 838 | |
| 1386 | enum _EVENT_TYPE | | | | 137 |
| 1427 | struct _STRING | | | | 156 |
| 1467 | struct _UNICODE_STRING | | | | 177 |
| 1528 | struct _LIST_ENTRY | | | 1083 | |
| 1538 | struct _SINGLE_LIST_ENTRY | | | 1093 | |
| 1552 | struct _RTL_BALANCED_NODE | | | | |
| 1583 | struct LIST_ENTRY32 | | | 1104 | |
| 1589 | struct LIST_ENTRY64 | | | 1110 | |
| 1625 | struct _STRING32 | | | | |
| 1639 | struct _STRING64 | | | | |
| 1697 | struct _OBJECT_ATTRIBUTES | | | | 222 |
| 2085 | enum _NT_PRODUCT_TYPE | | | | |

Whether the kernel’s source code includes NTDEF.H directly or through some other header is not known, though the latter seems more likely. It is known, however, that this header is not WDM.H, NTDDK.H or NTIFS.H.

Type Information Disclosure

However NTDEF.H gets included by the kernel’s source code, it’s included early. This is true also for nearly all kernel-mode source code since a standard header is typically the first inclusion and it in turn includes NTDEF.H very early. What type information is present in Microsoft’s public symbol files for kernel-mode executables therefore begins identically in almost all of them and in very much the same way in the handful of exceptions. If you’re a reverse engineer (or even if you’re a Microsoft programmer who is concerned about what’s revealed by public symbol files or has an opinion about what can only be learnt from source code), you might do well to familiarise yourself with NTDEF.H as the readiest example of type information getting into symbol files from headers.

In the public symbol files for the kernel, and in most others that have type information, NTDEF.H is the first header to be named in the PDB stream (4) that tells which user-defined types came from which headers, but it is only the second to contribute type information (to stream 2). How this happens is that before NTDEF.H defines any class, structure, union or enumeration, it includes BASETSD.H, which in turn defines several inline routines that use built-in types with just enough elaboration to need their own entries in the type information. Use of void const * and void const * __ptr64 by PtrToPtr64 and Ptr64ToPtr in BASETSD.H accounts for the first three type-information entries in public symbol files for the 32-bit kernel. These routines are macros when building for 64-bit Windows. Instead, the first seven entries in public symbol files for the 64-bit kernel record that the inline routines HandleToULong, ULongToHandle, LongToHandle, IntToPtr, UIntToPtr and Ptr32ToPtr use void const *, unsigned long const, long const, int const, unsigned int const and void const * __ptr32. Why should you trouble about simple compounds of built-in types? Not for themselves, of course, but to learn from the start that just using a type in the body of an inline routine is enough to get type information created, regardless of whether the inline routine is ever referenced anywhere else in any code.
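The effect can be pictured with a minimal sketch in portable C. The two routines are hypothetical stand-ins, loosely modelled on the sort of helper that BASETSD.H defines. Their names and bodies are assumptions for illustration, not Microsoft’s definitions. What matters is only that the parameter types are simple compounds of built-in types (a pointer to const void, a const unsigned long) and that the routines are defined in what would be a header, whether or not anything calls them.

```c
#include <stdint.h>

/* Hypothetical stand-in for a header's inline helper. Its parameter type,
   a pointer to const void, is a compound of a built-in type. */
static inline unsigned long handle_to_ulong_sketch(const void *h)
{
    return (unsigned long)(uintptr_t)h;
}

/* Hypothetical counterpart. The const-qualified parameter is another
   simple compound of a built-in type. */
static inline void *ulong_to_handle_sketch(const unsigned long ul)
{
    return (void *)(uintptr_t)ul;
}
```

As described above, it is the compilation of such definitions, not any call to them, that suffices for the compound types to be recorded in the type information.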

Next come the first entries from NTDEF.H itself. These demonstrate the same point but now for user-defined types, here specifically structures. Type information is created for LIST_ENTRY64 and LIST_ENTRY32 because of their use by the inline routine ListEntry32To64. The only way the kernel is unusual in this respect is that its symbol file would eventually pick up these structures from later use, e.g., from the TlsLinks member of the _TEB64 structure. Almost every kernel-mode executable whose public symbol file has any type information at all has it for LIST_ENTRY32 and LIST_ENTRY64 just for including NTDEF.H. See that it doesn’t matter that the executable’s own code makes no use of the structures, just that NTDEF.H uses them in inline routines even if these routines never are inlined into any of the executable’s code. For these particular structures, the immediate result is nothing but a small waste of space in the symbol files. For other structures, it’s a possibly unwanted disclosure.
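For illustration only, here is a sketch of the shape of such a header fragment: two structures and an inline routine that takes pointers to both. The names are hypothetical, the members are written with standard C types, and the conversion simply zero-extends each link. None of this is offered as the text of NTDEF.H. How the real ListEntry32To64 widens the links is not something the sketch tries to reproduce.

```c
#include <stdint.h>

/* Hypothetical 32-bit and 64-bit forms of a doubly linked list entry. */
struct list_entry32_sketch {
    uint32_t flink;
    uint32_t blink;
};

struct list_entry64_sketch {
    uint64_t flink;
    uint64_t blink;
};

/* Hypothetical inline conversion. That it takes pointers to both structures
   is the use that would pull both into the type information, even if no
   other code ever refers to either structure. */
static inline void list_entry_32_to_64_sketch(
    const struct list_entry32_sketch *l32,
    struct list_entry64_sketch *l64)
{
    l64->flink = (uint64_t)l32->flink;   /* zero-extension: an assumption */
    l64->blink = (uint64_t)l32->blink;
}
```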

Unnoticed use of a type by otherwise unused inline routines in headers seems all too plausible as the main mechanism by which a structure that Microsoft’s programmers intend as their internal plaything nonetheless has its name and the names, types and offsets of its members end up in public symbol files and then as common knowledge. Microsoft’s programmers have even opined in public that inline routines in headers that Microsoft doesn’t publish are secrets whose knowledge outside Microsoft is explained only by leaked source code. The reality is that if a routine is declared in a header and then is used anywhere else, even just in the same or another header, then if the public symbol file has type information, the routine’s type is disclosed. So too is its name, if building with the compiler from Visual Studio 2012 or later. This is very much the sort of disclosure that might be missed by programmers, even Microsoft’s, but also by reverse engineers (since their work seems to depend ever more on what their tools tell them and less on actually knowing their craft).

This disclosure of inline routines can be seen at work just from NTDEF.H as the most ready example. Indeed, it is shown by the very next entries in the type information in the public symbol files for the kernel. It comes about because NTDEF.H includes GUIDDEF.H which in turn includes STRING.H from the kernel-mode implementation of the C Run Time (CRT). This STRING.H defines strnlen as an inline routine. Its use of char const * creates type information, as pointed out above. What’s new in our quick survey of simple examples is that strnlen is called from another inline routine, strnlen_s, not much further into STRING.H. That the one routine is called from another counts as use of a type. Specifically, it creates type information for a pointer to a function with the prototype of the referenced routine. Moreover, the referenced routine gets named in the PDB stream that tells which headers supplied which types. See that it doesn’t matter whether either routine is used anywhere else in the kernel’s source code. Mere inclusion of STRING.H is enough to leave a trace of the inline routine.
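The arrangement is simple enough to sketch. The two routines below are hypothetical and are written in portable C with invented names. They are not the kernel-mode CRT’s code. They reproduce only the relationship that matters: one inline routine in a header calls another.

```c
#include <stddef.h>

/* Hypothetical analogue of strnlen: counts characters up to a limit. */
static inline size_t sketch_strnlen(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical analogue of strnlen_s. Its call to sketch_strnlen is the
   reference from one inline routine to another that the text describes. */
static inline size_t sketch_strnlen_s(const char *s, size_t max)
{
    return s == NULL ? 0 : sketch_strnlen(s, max);
}
```

By the reasoning above, the call in the second routine is what counts as use of the first routine’s type, with no need for any caller of either routine elsewhere.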

Extraction

NTDEF.H is also a ready illustration of how some, if not many, headers in the WDK and SDK are created from some sort of script or master header that extracts from yet more headers. This applies especially to some of the most prominent headers: WDM.H, NTDDK.H and NTIFS.H for kernel-mode programming and WINNT.H for user-mode programming. As noted above, those three standard headers for kernel-mode programming include NTDEF.H directly. Some specialised drivers, known as minidrivers, interact with the kernel through a port driver. Ideally, they have no direct interaction with the kernel and so their source code does not need any of the big three standard headers. What they instead include from the WDK is either MINIPORT.H or MINITAPE.H. Rather than include NTDEF.H, these headers duplicate much of it. This duplication can also be seen in the WUDFWDM.H that is a standard inclusion in source code for user-mode drivers and in WINNT.H which almost all user-mode Windows source code includes indirectly through WINDOWS.H.

Each of these headers has one contiguous region in which each line is a duplicate or slight edit of a corresponding line in NTDEF.H. For the headers from the WDK and SDK for the original release of Windows 10, these regions are:

Moreover, the correspondence is well-ordered: for each line in succession in these headers, the corresponding line is further into NTDEF.H. These corresponding lines in NTDEF.H make disjoint regions. Since NTDEF.H is not very large, the full map is perhaps instructive without being too tedious:

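| NTDEF.H     | MINIPORT.H | MINITAPE.H | WINNT.H   | WINNT.RH | WUDFWDM.H |
|-------------|------------|------------|-----------|----------|-----------|
| 24-25       |            |            | 34-35     |          |           |
| 35-317      | 37-319     | 43-325     |           |          |           |
|   58        |            |            | 36        |          |           |
|   62-201    |            |            | 37-176    |          |           |
|   209-317   |            |            | 177-285   |          |           |
| 329-418     | 320-409    | 326-415    |           |          |           |
|   329-404   |            |            | 286-361   |          |           |
|   407-417   |            |            |           |          | 38-48     |
| 422-453     |            |            | 362-393   |          |           |
| 457-868     | 410-821    | 416-827    |           |          |           |
|   457-619   |            |            | 394-556   |          |           |
|     571     |            |            |           | 16       |           |
|     595     |            |            |           | 17       |           |
|     597     |            |            |           | 18       |           |
|     614     |            |            |           | 19       |           |
|     616-617 |            |            |           | 20-21    |           |
|   641-642   |            |            | 557-558   |          |           |
|   701-842   |            |            | 559-700   |          |           |
|   860       |            |            | 701       |          |           |
| 875-877     |            |            | 702-704   |          |           |
| 880-894     |            |            | 705-719   |          |           |
| 907-981     |            |            |           |          | 49-123    |
| 986-990     |            |            | 720-724   |          |           |
| 1011-1036   |            |            | 725-750   |          |           |
| 1041-1136   |            |            | 751-846   |          |           |
|   1041-1119 | 822-900    | 828-906    |           |          |           |
| 1140-1150   | 901-911    |            |           |          |           |
| 1154-1378   |            |            | 847-1071  |          |           |
| 1382-1390   |            |            |           |          | 133-141   |
| 1413-1438   |            |            |           |          | 142-167   |
| 1455        |            |            | 1072      |          |           |
| 1458-1477   |            |            |           |          | 168-187   |
| 1480        |            |            | 1073      |          |           |
| 1484-1485   |            |            | 1074-1075 |          |           |
| 1488-1505   |            |            |           |          | 188-205   |
| 1512-1519   | 912-919    | 907-914    |           |          |           |
|   1517-1518 |            |            | 1076-1077 |          |           |
| 1523-1542   |            |            | 1078-1097 |          |           |
| 1577-1594   |            |            | 1098-1115 |          |           |
| 1653-1667   |            |            |           |          | 206-220   |
| 1696-1729   |            |            |           |          | 221-254   |
| 1743-1761   | 920-938    | 915-933    |           |          |           |
| 1765-1776   |            |            | 1116-1127 |          |           |
| 1780-1788   |            |            | 1128-1136 |          |           |
| 1796-1803   |            |            |           |          | 255-262   |
| 1806-1981   |            |            | 1137-1312 |          |           |
|   1806-1873 | 939-1006   |            |           |          |           |
|   1880-1884 | 1007-1011  |            |           |          |           |
|   1891-2041 | 1012-1162  |            |           |          |           |
| 2021-2063   |            |            | 1313-1353 |          |           |
| 2066-2078   | 1163-1175  |            |           |          |           |
|   2067-2077 |            |            |           |          | 263-273   |
| 2120-2139   |            |            | 1354-1373 |          |           |
| 2143-2282   |            |            | 1374-1513 |          |           |
| 2286        | 1176       | 934        | 1514      |          |           |
| 2290-3047   |            |            | 1515-2272 |          |           |
|   2290-2755 |            |            |           | 22-487   |           |
|   2883-3065 | 1177-1359  | 935-1117   |           |          |           |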
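Because each range of NTDEF.H maps to an equally long range of the output header, finding where a given line of NTDEF.H reappears is simple arithmetic. The following sketch does it for MINIPORT.H. The ranges are transcribed from the MINIPORT.H column of the map. The structure and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* One row of the map: lines first..last of NTDEF.H reappear in the output
   header as a run of the same length that starts at line out_first. */
struct line_range {
    unsigned first;
    unsigned last;
    unsigned out_first;
};

/* The MINIPORT.H column of the map, transcribed by hand. */
static const struct line_range miniport_map[] = {
    {   35,  317,   37 },
    {  329,  418,  320 },
    {  457,  868,  410 },
    { 1041, 1119,  822 },
    { 1140, 1150,  901 },
    { 1512, 1519,  912 },
    { 1743, 1761,  920 },
    { 1806, 1873,  939 },
    { 1880, 1884, 1007 },
    { 1891, 2041, 1012 },
    { 2066, 2078, 1163 },
    { 2286, 2286, 1176 },
    { 2883, 3065, 1177 },
};

#define MINIPORT_MAP_COUNT (sizeof miniport_map / sizeof miniport_map[0])

/* Returns the line of the output header that corresponds to the given line
   of NTDEF.H, or 0 if no range in the map covers that line. */
static unsigned map_line(const struct line_range *map, size_t count,
                         unsigned line)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (line >= map[i].first && line <= map[i].last) {
            return map[i].out_first + (line - map[i].first);
        }
    }
    return 0;
}
```

For example, map_line(miniport_map, MINIPORT_MAP_COUNT, 709) gives 662, which is the MINIPORT.H line that the first table shows for _PROCESSOR_NUMBER. For line 1128, where _LUID is defined, it gives 0, in agreement with that table’s having no MINIPORT.H entry for _LUID.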

The map is consistent with a process of extraction. At some point in preparing each output header, the extraction selects NTDEF.H as input, parses successive lines in NTDEF.H, and extracts some to the output. Be aware, though, that this is only the simplest process that is consistent with the files as observed. The input could instead be another header that is also the input for constructing NTDEF.H. Either way, the choosing of which lines to extract is fully accounted for by directions within NTDEF.H.

Output Selection

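These directions for selecting which lines are in both NTDEF.H and another header are keywords in single-line comments. What can be seen in NTDEF.H are keywords to begin a range of lines for extraction, to end such a range, and to select just the one line that has the comment.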

It is observed that the keywords for these three cases are begin_key, end_key and key, where the placeholder key differs for each output header:

| Begin Range | End Range | Same Line | Versions | Output Header |
|---|---|---|---|---|
| begin_ntminiport | end_ntminiport | | all | MINIPORT.H |
| begin_ntminitape | end_ntminitape | | 4.0 and higher | MINITAPE.H |
| begin_ntndis | end_ntndis | ntndis | 3.51 to 5.2 | NDIS.H |
| begin_ntndis | end_ntndis | ntndis | 6.0 and higher | |
| begin_ntoshvp | end_ntoshvp | | 6.2 and higher | |
| begin_r_winnt | end_r_winnt | r_winnt | 4.0 and higher | WINNT.RH |
| | | windbgkd | 3.10 to 4.0 | WINDBGKD.H |
| begin_windbgkd | end_windbgkd | | 5.0 and higher | WINDBGKD.H |
| begin_winnt | end_winnt | winnt | all | WINNT.H |
| begin_wudfwdm | end_wudfwdm | | 6.2 and higher | WUDFWDM.H |
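What such selection might look like in code is sketched below. Microsoft’s tool is not known, so everything here is assumption: that keywords are whitespace-separated words after the // of a single-line comment, that one comment can carry keywords for several output headers, and that a line whose comment begins or ends the range for the wanted key is not itself copied. The function names and the sample lines are invented for the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the single-line comment on the line, if any, has the given
   word as a whitespace-delimited token, else 0. */
static int comment_has_token(const char *line, const char *word)
{
    const char *p = strstr(line, "//");
    size_t len = strlen(word);
    if (p == NULL) {
        return 0;
    }
    p += 2;
    while (*p != '\0') {
        size_t n;
        p += strspn(p, " \t\r\n");      /* skip to the next token */
        n = strcspn(p, " \t\r\n");      /* measure it */
        if (n == len && strncmp(p, word, len) == 0) {
            return 1;
        }
        p += n;
    }
    return 0;
}

/* Stores in out the indices of the lines selected for the given key and
   returns how many were stored. out must have room for count entries. */
static size_t extract_lines(const char *const *lines, size_t count,
                            const char *key, size_t *out)
{
    char begin[64], end[64];
    size_t i, n = 0;
    int in_range = 0;

    snprintf(begin, sizeof begin, "begin_%s", key);
    snprintf(end, sizeof end, "end_%s", key);

    for (i = 0; i < count; i++) {
        if (comment_has_token(lines[i], begin)) {
            in_range = 1;               /* range starts on the next line */
        } else if (comment_has_token(lines[i], end)) {
            in_range = 0;               /* range ended on the previous line */
        } else if (in_range || comment_has_token(lines[i], key)) {
            out[n++] = i;               /* in a range, or a same-line match */
        }
    }
    return n;
}

/* Invented input, not quoted from NTDEF.H. */
static const char *const sample_lines[] = {
    "// begin_winnt begin_ntminiport",      /* 0 */
    "typedef void *PVOID;",                 /* 1 */
    "// end_winnt",                         /* 2 */
    "typedef char CCHAR;",                  /* 3 */
    "// end_ntminiport",                    /* 4 */
    "typedef short CSHORT;    // winnt",    /* 5 */
    "typedef long LONG;",                   /* 6 */
};
```

With this input, the key winnt selects lines 1 and 5, and the key ntminiport selects lines 1 to 3. The latter include the end_winnt comment, which is at least in keeping with the observation that comments for other output headers can survive into an extracted header.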

Whether Microsoft still has a header named WINDBGKD.H is not known. None is supplied with any WDK or SDK nowadays, but a header with this name was supplied among the directories of sample code up to and including the DDK for Windows NT 4.0 and in the ordinary INC directory in the DDK for Windows 2000. Its contents in these versions are consistent with extraction directed by windbgkd comments. It is not impossible that the windbgkd comments remain in NTDEF.H even though no WINDBGKD.H is ever created or that they now govern extraction to some unpublished header that superseded WINDBGKD.H.

Comments for extraction to NDIS.H must be vestigial. They evidently were active for the NDIS.H that Microsoft supplied first among the NETWORK samples in the DDK for Windows NT 3.51 and later among the general headers in the DDK for Windows 2000. Before Windows Vista, NDIS.H was not just a standard inclusion for network drivers: it also aimed to limit these drivers to interacting with NDIS.SYS, and thus only indirectly with the kernel, much as if network drivers are miniport drivers and NDIS.SYS is the corresponding port driver. Much like MINIPORT.H still, NDIS.H in these early versions has its own knowledge of the kernel, substantially less than defined in WDM.H. This reduction shows in NDIS.H as its own section of INTERNAL DEFINITIONS. The first thousand lines or so are duplicated from NTDEF.H, consistently with extraction according to ntndis comments. For Windows Vista, however, this section of NDIS.H was reworked to include NTDDK.H.

No header from any WDK or SDK is known to have lines in common with NTDEF.H such as selected by ntoshvp comments. An obvious guess is that Microsoft has a header named NTOSHVP.H and even that it’s something like NTOSP.H but for Hyper-V components. The guess may even be sound. Headers other than NTDEF.H have comments that are similarly suggestive of an NTHAL.H and the public symbol files for the HAL do indeed confirm that an NTHAL.H is compiled when building the HAL. No such sign, however, is known of an NTOSHVP.H. If it exists, Microsoft is keeping it very private.

Translation

Lines in MINIPORT.H and MINITAPE.H that have corresponding lines in NTDEF.H are exact duplicates from NTDEF.H, but some lines in WINNT.H and WUDFWDM.H differ very slightly from their corresponding lines in NTDEF.H. What editing, if any, is done of each line in the output is apparently specified as part of the process, not from directions in the input. No evidence is known for the mechanism, only for what it must be capable of.

One option for editing the output concerns the comments that direct which lines to extract. Inasmuch as this extraction is a detail of construction, an ideal might be that these comments stay in master headers, which Microsoft keeps private, and are eliminated from headers that Microsoft publishes. Instead, some such comments are in plain sight all the way back to the DDK for Windows NT 3.1. Just as plain is that elimination is provided for but also that it is applied imperfectly. Exactly how it works is unclear. At one extreme, all lines that MINIPORT.H and MINITAPE.H have in common with NTDEF.H have the comments intact. At the other extreme, WUDFWDM.H has none. For instance, where NTDEF.H has begin_wudfwdm and begin_ntoshvp comments on successive lines, the second contributes to WUDFWDM.H only as an empty line: the comment is stripped. Contrast with WINNT.H, which does not have this filtering: where NTDEF.H has a begin_winnt and begin_ntoshvp on successive lines, WINNT.H has the whole begin_ntoshvp line.

WINNT.H demonstrates a much more significant translation. All Microsoft’s literature for kernel-mode programming uses UCHAR, USHORT and ULONG for unsigned integral types. The headers do not even define the BYTE, WORD and DWORD that were long preferred in user-mode programming (even before any Windows NT existed). Since NTDEF.H is written for kernel-mode programming, it uses UCHAR, etc., and never the others. For all the lines that WINNT.H has in common with NTDEF.H, every UCHAR, etc., in NTDEF.H is instead a BYTE, etc., in WINNT.H. Again, how the translation is specified is unclear. It even translates TUCHAR and MAXUCHAR to TBYTE and MAXBYTE.
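That TUCHAR and MAXUCHAR are translated too suggests that the renaming may act on parts of identifiers, not just on whole ones. The simplest rule that would account for this is plain substring substitution, sketched below. This is a guess at behaviour, not a description of Microsoft’s tool. The rule table, the function name and the input lines used to exercise it are all hypothetical, and the sketch makes no claim to reproduce WINNT.H in general.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: replace every occurrence of from with to. */
struct rename_rule {
    const char *from;
    const char *to;
};

static const struct rename_rule rules[] = {
    { "UCHAR",  "BYTE"  },
    { "USHORT", "WORD"  },
    { "ULONG",  "DWORD" },
};

/* Copies src to dst, applying the rules wherever they match, even inside a
   longer identifier. Returns 0 on success or -1 if dst, whose capacity is
   cap characters including the terminator, is too small. */
static int translate_line(const char *src, char *dst, size_t cap)
{
    size_t used = 0;

    if (cap == 0) {
        return -1;
    }
    while (*src != '\0') {
        const char *piece = src;        /* default: copy one character */
        size_t piece_len = 1, advance = 1, r;

        for (r = 0; r < sizeof rules / sizeof rules[0]; r++) {
            size_t from_len = strlen(rules[r].from);
            if (strncmp(src, rules[r].from, from_len) == 0) {
                piece = rules[r].to;
                piece_len = strlen(rules[r].to);
                advance = from_len;
                break;
            }
        }
        if (used + piece_len >= cap) {
            return -1;                  /* no room for piece and terminator */
        }
        memcpy(dst + used, piece, piece_len);
        used += piece_len;
        src += advance;
    }
    dst[used] = '\0';
    return 0;
}
```

Given an illustrative line such as typedef UCHAR TUCHAR; the sketch produces typedef BYTE TBYTE; which is the kind of difference that is observed between NTDEF.H and WINNT.H. A rule this crude would also rewrite any longer identifier that merely contains ULONG, which is one reason to treat it as no more than a sketch.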