CURRENT WORK ITEM - PREVIEW ONLY

EAX From CPUID Leaf 1

Executing the cpuid instruction with 1 in eax loads a processor identification signature into eax. This is a broad description of the processor in terms of its family, model and stepping. It developed from what the 80386 processor loads into the dx register as an initial state and is still what later processors have as their initial edx. If exposing it without the contortions of somehow regaining control after resetting the processor was a primary motivation for the cpuid instruction, then this processor identification signature is the identifier that gives cpuid its name.
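As a minimal sketch of reading the signature from user mode, the following assumes a GCC or Clang build for x86 or x64 and uses the compiler's `<cpuid.h>` helper `__get_cpuid`. The function name `read_signature` is hypothetical. On any other compiler or architecture the sketch simply reports failure.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#endif

/* Hypothetical helper: stores what cpuid leaf 1 returns in eax.
   Returns 1 on success, 0 if this build cannot execute cpuid leaf 1. */
int read_signature(uint32_t *signature)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    *signature = eax;
    return 1;
#else
    (void)signature;    /* no cpuid available in this build */
    return 0;
#endif
}
```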

Family, Model and Stepping

As far as interests the Windows kernel, the processor identification signature comprises a family, model and stepping in decreasing order of significance. For reasons that may have as much to do with Microsoft and the early history of Windows as with any manufacturer of processors, the physical layout of the signature is not as clear as the logical:

Mask                                            Interpretation    Versions
0x0000000F                                      stepping          all
0x000000F0                                      model             all
0x00000700 (3.10 to very late 4.0); 0x00000F00  family            all
0x000F0000                                      extended model    5.1 and higher
0x0FF00000                                      extended family   5.1 and higher

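All Windows versions since 5.1 recognise that the family and model can each be expanded from 4 bits to 8 by combining with the extended model and extended family. Expansion is indicated for both the family and model when the 4-bit family is full, i.e., contains 15. The family then becomes the sum of the 4-bit family and the extended family, and the model takes the extended model as its high 4 bits above the 4-bit model.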

Starting with version 5.1 from Windows XP SP2 and version 5.2 from Windows Server 2003 SP1, expansion is also indicated for the model only, i.e., not the family, if the 4-bit family is 6—but only for processors from Intel and, in version 6.2 and higher, from Centaur.
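The decoding described above can be sketched in C as follows. This is an illustration, not the kernel's code. The type `cpu_signature`, the function `decode_signature` and its flag `extend_model_for_family_6` are all hypothetical names. The flag stands for the caller's decision, which depends on vendor and Windows version, on whether the model-only expansion applies to family 6.

```c
#include <stdint.h>

typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} cpu_signature;

/* Hypothetical decoder for eax from cpuid leaf 1, using the masks tabulated
   above. extend_model_for_family_6 is the caller's policy for family 6. */
cpu_signature decode_signature(uint32_t eax, int extend_model_for_family_6)
{
    cpu_signature s;
    unsigned ext_model  = (eax >> 16) & 0x0F;   /* mask 0x000F0000 */
    unsigned ext_family = (eax >> 20) & 0xFF;   /* mask 0x0FF00000 */

    s.stepping = eax & 0x0F;                    /* mask 0x0000000F */
    s.model    = (eax >> 4) & 0x0F;             /* mask 0x000000F0 */
    s.family   = (eax >> 8) & 0x0F;             /* mask 0x00000F00 */

    if (s.family == 15) {
        /* Full 4-bit family: the family extends by addition,
           the model by taking the extended model as high bits. */
        s.family += ext_family;
        s.model  |= ext_model << 4;
    } else if (s.family == 6 && extend_model_for_family_6) {
        /* Model-only expansion; the family is left alone. */
        s.model  |= ext_model << 4;
    }
    return s;
}
```

For an illustrative signature of 0x000906EA, the sketch gives family 6, model 0x9E and stepping 10 when the flag is set, but model 0x0E when it is clear.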

A Revision History in Intel® Processor Identification and the CPUID Instruction (Application Note 485, apparently no longer available online from Intel) dates Intel’s documentation of the extended model and family fields to November 2000. If this is roughly when they were introduced to the processor, then Windows XP will indeed have been the first major release that could support them. Recognition of them is not back-fitted into the chronologically later service packs of Windows 2000.

There was perhaps no great hurry. Family and model numbers for Intel’s processors at the time were still some way from overflowing four bits each. Yet Intel will have known well before late 2000 that having only 4 bits for the family presented more than the general problem of how the family might ever increase beyond 15. See that late in 1999, Windows NT 4.0 SP6—which is specifically what’s meant by “very late 4.0” in the table above—corrected the kernel’s interpretation of the family as having only 3 bits.

That Windows ever had the family as just 3 bits instead of 4 surely caused a lot more trouble than first appears. It seems at least plausible as the reason that Intel’s processors for Windows don’t have families between 8 and 14 (with the Xeon Phi as perhaps the lone exception). It can’t be much of a concern now, but it would have been in 1999: a new processor with any such family would have looked to recently contemporary Windows versions like an old processor—and even like an impossibly old processor. Intel’s very particular design for forming the 8-bit family by a numerical addition, rather than extending with high bits elsewhere as for the model, ensures for all practical effect that new processors that use the extended family have family 15 or higher in such a way that they will look to those early Windows versions as if they are family 7, presumably advancing on any processors these early versions expected.
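The effect of the narrower mask is easily demonstrated. In this sketch, which uses hypothetical function names, the same signature is read once with the 3-bit mask of the early versions and once with the 4-bit mask.

```c
#include <stdint.h>

/* Family as extracted by versions 3.10 to very late 4.0 (mask 0x00000700). */
unsigned family_3bit(uint32_t eax)
{
    return (eax >> 8) & 0x7;
}

/* Family as extracted by later versions (mask 0x00000F00). */
unsigned family_4bit(uint32_t eax)
{
    return (eax >> 8) & 0xF;
}
```

A signature whose 4-bit family is 8 reads as family 0 under the 3-bit mask, one whose family is 11 reads as 3, i.e., as an 80386, but one whose family is 15 reads as 7.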

Family

There’s no rule that a higher family number implies a more advanced processor, but it had been roughly true as the 80386 (family 3) gave way to the 80486 (family 4) and then to the Pentium (family 5), Pentium Pro, Pentium II and Pentium III (all family 6) and Pentium 4 (family 15). Then the progression becomes less clear. Some would even say muddled as the brand names Xeon and Celeron have models in both families 6 and 15. With multiple families in production concurrently, especially with no sense that one supersedes another, it’s inevitable that advanced features from some models of one family turn up also on some models of another family.

Much of the point to the cpuid instruction’s extensibility is that additional leaves can tell with some precision which features are present. Much as software that depends on an operating system would better test for this or that capability by querying the operating system for the existence of relevant components (such as exported functions) rather than assume availability in this or that version, operating-system software (and less usually other software, too) should not infer the presence of this or that processor feature from supposed correlations with the family, model and stepping but should query the processor through cpuid.

Querying the processor for precise feature support is not always possible, of course. This applies especially to early Windows versions which realistically faced being run on processors that pre-date the Pentium. For instance, version 3.10 knows of no assurance that the WP and AM bits (16 and 18) exist in cr0 except that the processor family is greater than 3, or that the cr4 register exists at all except that the family is at least 5. Is such dependence on the family number avoidable? That the test for bits in cr0 was retained up to and including version 6.2 is more plausibly because the code wasn’t pruned away when version 4.0 dropped support for the 80386 than because the test is unavoidable. Inferring existence of cr4 just from the family was done away with in version 4.0, but the replacement is not obviously cleaner: cr4 is inferred to exist if any of the VME, PSE or PGE bits are set in the feature flags that cpuid leaf 1 returns in edx.
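These inferences reduce to very small predicates. The sketch below restates them in C under the assumption that the family and the edx feature flags have already been obtained. The function and macro names are hypothetical. The bit positions for VME, PSE and PGE (1, 3 and 13) are as Intel documents them for the feature flags.

```c
#include <stdint.h>

#define CPUID_EDX_VME (1u << 1)     /* virtual-8086 mode extensions */
#define CPUID_EDX_PSE (1u << 3)     /* page size extensions */
#define CPUID_EDX_PGE (1u << 13)    /* page global enable */

/* Version 3.10: WP and AM are assumed in cr0 if the family exceeds 3. */
int cr0_has_wp_and_am(unsigned family)
{
    return family > 3;
}

/* Version 3.10: cr4 is assumed to exist if the family is at least 5. */
int cr4_exists_by_family(unsigned family)
{
    return family >= 5;
}

/* Version 4.0 and higher: cr4 is inferred from any of three feature flags. */
int cr4_exists_by_features(uint32_t edx)
{
    return (edx & (CPUID_EDX_VME | CPUID_EDX_PSE | CPUID_EDX_PGE)) != 0;
}
```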

Architectural Differences

As modern Windows sheds ever more of its support for old processors and as Intel’s families 6 and 15 live ever longer, this sort of reliance on the family number to learn of architectural detail is tending to go away. Two examples that remain in Windows 10 stand out, if only for their age. The one that has stayed in Windows the longest concerns the update signature that may be readable from the Model Specific Register (MSR) that Intel names IA32_BIOS_SIGN_ID (0x8B). When Windows first cares, in version 4.0, to have the update signature, it’s only for Intel processors in family 6. Not until version 5.1 does it allow processors from higher families, still only from Intel. When 32-bit version 6.2 extends this to AMD, it’s only for AMD’s family 15 and higher. There plausibly is no better-defined way to know whether this MSR exists than to infer from the family. For 64-bit Windows, the update signature is still Intel-specific before version 6.2, but family never matters, presumably now because the update signature and its MSR are architectural to all processors that have the 64-bit instruction set.
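For 32-bit Windows, the conditions just described might be summarised as the following predicate. It is only a restatement of the prose, not a transcription of kernel code. The `enum vendor`, the `WINVER` macro for encoding a version number and the function name are all hypothetical.

```c
/* Hypothetical encoding of a Windows version as major.minor. */
#define WINVER(major, minor) (((unsigned)(major) << 8) | (unsigned)(minor))

enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Does the 32-bit kernel of the given version seek the update signature? */
int wants_update_signature_x86(unsigned version, enum vendor v, unsigned family)
{
    if (version < WINVER(4, 0))
        return 0;                               /* no interest before 4.0 */

    if (v == VENDOR_INTEL) {
        if (version >= WINVER(5, 1))
            return family >= 6;                 /* higher families allowed */
        return family == 6;                     /* 4.0 up to 5.1 */
    }

    if (v == VENDOR_AMD)
        return version >= WINVER(6, 2) && family >= 15;

    return 0;
}
```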

Errata

The other stand-out example of Windows 10 varying behaviour for a processor family dates from Windows NT 4.0 SP4 and hints at the main reason that operating systems will forever have an occasional need to infer from the family, model and stepping. Processors, like operating systems, have bugs. The first that the Windows kernel is known to work around, as long ago as Windows NT 3.50 SP1, showed as defective floating-point arithmetic by some early Pentium processors. To know which processors is unnecessary for detecting this defect: if it’s present, known inputs to the fdiv instruction produce a result that is easily seen to be incorrect. Code to check this was retained up to and including version 6.0.
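Detection by experiment can be sketched as below. The operands are the widely published test values for this defect and are not claimed to be the ones the kernel uses. The function name is hypothetical. Note also that a modern compiler may well perform the division with SSE instructions rather than fdiv, so the sketch shows the shape of the test rather than a faithful probe of the x87 unit.

```c
/* Hypothetical check: divide known operands, multiply back and see how far
   the result strays. A correct divider leaves a residue of at most rounding
   error; the defective Pentium is widely reported to leave 256. */
int fdiv_looks_defective(void)
{
    /* volatile keeps the compiler from folding the arithmetic away */
    volatile double x = 4195835.0;
    volatile double y = 3145727.0;
    double residue = x - (x / y) * y;

    return residue > 1.0 || residue < -1.0;
}
```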

The dubious advantage of being discoverable by experiment is not available for the second processor defect that the Windows kernel is known to work around, starting with Windows NT 4.0 SP4. For this defect, execution of a lock cmpxchg8b instruction that is encoded as if to take its explicit operand from a register instead of memory hangs the processor. The encoding is invalid and should cause an Invalid Opcode exception but never does, the defect being that the lock induces the processor to wait for a write that will never come. In Intel® Pentium® Processor Invalid Instruction Erratum Overview, Intel presents two workarounds which “should only be implemented on Intel processors that return Family=5 via the CPUID instruction.” A detailed account of whether the defect is fixed in any particular model and stepping surely will have existed, and Microsoft surely would have had access. Indeed, the Pentium® Processor Specification Update (order number 242480-041, dated January 1999 and apparently long gone from Intel’s website) lists the bug as fixed for model 8 stepping 2. But perhaps because the workaround has next to no cost for normal execution Windows activates it for any Intel processor whose family is 5. The code for this is still present at least as recently as the 1803 release of Windows 10.
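The condition for activating the workaround, as described above, is coarse enough to fit in one line. This sketch uses a hypothetical function name and assumes the caller has already established the vendor and the family.

```c
/* Hypothetical predicate: the workaround applies to any Intel processor
   whose family is 5, whatever its model and stepping. An example of the
   invalid encoding is the byte sequence F0 0F C7 C8, i.e., a lock prefix
   on cmpxchg8b with a register operand. */
int needs_lock_cmpxchg8b_workaround(int vendor_is_intel, unsigned family)
{
    return vendor_is_intel && family == 5;
}
```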

Model and Stepping

More usual now is that if Windows must resort to the processor identification signature, then the need is precise enough to depend not just on the family but on the model and even the stepping. Within a family, models may vary significantly. It’s easily imagined that models released roughly concurrently may have very different feature sets, but as the model number increases over time there plausibly is a tendency that higher model numbers mean greater functionality. Within a model, steppings are something like bug fixes, which should mean that a higher stepping is reliably an advance.

The kernel often takes the model and stepping together as working roughly like a major and minor version number. Perhaps the most widely known example affects software that wants to know if the processor has the sysenter and sysexit instructions. The ideal is to query directly for the feature, as represented by the SEP bit (11) in the feature flags that cpuid leaf 1 returns in edx, but Intel documents that in family 6 before model 3 stepping 3 the feature flag may be set without the feature being usable. And so from version 5.1 until at least the 1803 release of version 10.0, the 32-bit kernel ignores SEP if either: the family is less than 6; or it equals 6 but the model and stepping as a pair are less than 3.3.
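Treating the model and stepping as a pair amounts to comparing a 16-bit number that has the model as its high byte and the stepping as its low byte. The following sketch of the SEP test uses hypothetical names and assumes that the family, model and stepping have already been decoded.

```c
#include <stdint.h>

#define CPUID_EDX_SEP (1u << 11)    /* sysenter and sysexit */

/* Hypothetical predicate: is the SEP feature flag to be believed? */
int sep_is_usable(uint32_t edx, unsigned family, unsigned model,
                  unsigned stepping)
{
    unsigned pair = (model << 8) | stepping;    /* model.stepping */

    if (!(edx & CPUID_EDX_SEP))
        return 0;                   /* feature flag is clear */
    if (family < 6)
        return 0;                   /* ignored for early families */
    if (family == 6 && pair < 0x0303)
        return 0;                   /* before model 3 stepping 3 */
    return 1;
}
```

With this ordering, model 2 stepping 15 counts as earlier than model 3 stepping 3, and so does model 3 stepping 2.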

WORK IN PROGRESS

Persistence

All versions of the Windows kernel keep each processor’s family, model and stepping in the processor’s control block (KPRCB):

Offset (x86)              Offset (x64)                Definition                     Versions
0x18 (3.10 to 6.0); 0x14  0x05F0 (5.2 to 1607); 0x40  CHAR CpuType;                  all
0x19 (3.10 to 6.0); 0x15  0x05F1 (5.2 to 1607); 0x41  CHAR CpuID;                    all
0x1A (3.10 to 6.0); 0x16  0x05F2 (5.2 to 1607); 0x42  USHORT CpuStep;                3.10 to 5.2
                                                      union {                        6.0 and higher
                                                          USHORT CpuStep;
                                                          struct {
                                                              UCHAR CpuStepping;
                                                              UCHAR CpuModel;
                                                          };
                                                      };

See that their offsets into the structure have been strikingly stable. They are shared not just with the HAL and with other kernel-mode software, but also with code that is written in assembly language.

For reasons that may date back to before Intel talked of its processors as belonging to families, the member that holds the family is named CpuType. In 32-bit Windows up to and including version 6.2, the kernel has code that can confect CpuType as 3, 4 or even 5 for processors that do not have a usable cpuid instruction. That this has happened, with the model and stepping also confected, is then shown by CpuID being zero. Except for sufficiently early versions, this state can exist only briefly before the processor’s inadequacy stops Windows. Ordinarily, CpuID is 1 to record that CpuType, CpuModel and CpuStepping all come from what cpuid leaf 1 returned in eax as the kernel initialised its use of the processor.
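To see how the three members sit together, here is a stand-alone model of just this fragment of the KPRCB in the form it has for version 6.0 and higher. The structure name `cpu_ident` and the function `record_signature` are hypothetical, and nothing here reproduces the real structure's other members or offsets. On a little-endian machine, which is all that Windows runs on for x86 and x64, CpuStep has the model as its high byte and the stepping as its low byte.

```c
typedef struct {
    char CpuType;                       /* CHAR: the family */
    char CpuID;                         /* CHAR: 1 if from cpuid, else 0 */
    union {
        unsigned short CpuStep;         /* USHORT: model and stepping */
        struct {
            unsigned char CpuStepping;  /* UCHAR: low byte of CpuStep */
            unsigned char CpuModel;     /* UCHAR: high byte of CpuStep */
        };
    };
} cpu_ident;

/* Hypothetical helper: records a decoded signature as obtained from cpuid. */
void record_signature(cpu_ident *id, unsigned family, unsigned model,
                      unsigned stepping)
{
    id->CpuType     = (char)family;
    id->CpuID       = 1;
    id->CpuModel    = (unsigned char)model;
    id->CpuStepping = (unsigned char)stepping;
}
```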

Registry

All versions of the Windows kernel save each processor’s family, model and stepping in the registry:

Key: HKEY_LOCAL_MACHINE\Hardware\Description\System\CentralProcessor\index
Value: Identifier
Type: REG_SZ

The string data has decimal representations of the family, model and stepping as read from cpuid leaf 1 and kept in the KPRCB. The enclosing text starts with what seems intended as naming the instruction set and varies because Microsoft is perhaps no more able or willing to settle on one name than is everyone else:

It is presently not understood how this string data is formed for HygonGenuine processors in Windows 10 Version 1803. The 64-bit kernel’s code for writing this registry value looks like it should stop with the bug check UNSUPPORTED_PROCESSOR (0x5D) unless the vendor is one of the three listed above.
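Software that wants the numbers back from the registry must parse them out of the string. As a sketch, and assuming that the string data has the family, model and stepping in the shape "Family f Model m Stepping s" after whatever text names the instruction set, the following hypothetical function extracts them. The string in the usage note is an illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parser for the Identifier string data. Returns 1 if all
   three decimal fields are found, else 0. */
int parse_identifier(const char *text, unsigned *family, unsigned *model,
                     unsigned *stepping)
{
    /* Skip the enclosing text that names the instruction set. */
    const char *p = strstr(text, "Family ");

    if (p == NULL)
        return 0;
    return sscanf(p, "Family %u Model %u Stepping %u",
                  family, model, stepping) == 3;
}
```

Given "x86 Family 6 Model 3 Stepping 3", the sketch returns 1 and produces 6, 3 and 3.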

Archaeology

How can it be that the Windows kernel ever thought the family is just 3 bits? To answer just that Microsoft’s programmers were somehow inept, as Jeff Atwood seems to in Nasty Software Hacks and Intel’s CPUID, would fall well short of satisfactory. Microsoft certainly wasn’t on the ball not to have recognised until 1999 that its interpretation had for many years not matched Intel’s documentation, but nobody coding this for the Windows kernel in 1993 (or earlier) will have been the slightest bit incompetent at working with bits in bytes and neither will they have gratuitously lopped off a bit. No, that they coded for 3 bits instead of the 4 that Intel documented in the formally published Pentium™ Processor User’s Manual from 1993 has a much more plausible reason: I think they coded from pre-release descriptions.

In pre-release builds of Windows NT 3.1 such as can be found easily on the Internet now that they are treated by hobbyists as abandon-ware, it is apparent that Microsoft was working to a volatile specification of the processor identification signature to expect from cpuid—assuming that what they had was formal enough to count as a specification. Five pre-release builds of NTKRNLMP.EXE from Windows NT 3.1 have yet been found for study:

The oldest does none of its own processor identification at all. It instead trusts what the loader passes in the LOADER_PARAMETER_BLOCK, not that the loader in this build knows of cpuid. What the build from October 1992 knows of cpuid is that once the instruction’s existence is confirmed by changing the ID bit in the eflags, execution with no input prepared for eax loads eax with the processor identification signature. This significant difference from the familiar behaviour of cpuid is not directly relevant, but is background to a picture of Microsoft’s programmers (and Intel’s too) dealing with changing specifications. What is directly relevant is that this build of the kernel extracts the family as 4 bits. The first two pre-release builds from 1993 know that the instruction has developed further into taking a function number in eax as input and that the processor identification signature is in eax after executing with 1 as the input. Again, the family is extracted as 4 bits.

The narrowing of the family to 3 bits was done very late. The last of the pre-release builds listed above has it, barely seven weeks before the first known public release (3.10.5098.1) was built on 24th July 1993. It’s scarcely credible that Microsoft made this late change without some sort of guidance from Intel, except perhaps in reaction to confusion from Intel.

WORK IN PROGRESS