System Monitor Rounds Down to Thousands

Windows 95 introduces a scheme for presenting statistics on system performance. The essential component in this scheme is a VxD named PERF.VXD, which is supplied by Microsoft in the standard Windows package. PERF acts as a performance statistics server. A VxD may register with the server as a performance statistics provider. A performance statistics client is an application that retrieves statistics from PERF more or less regularly for presentation to the user. The particular client that Microsoft supplies in the standard Windows package is called the System Monitor.

A statistic is any 32-bit performance measure that a VxD cares to provide. A statistic may be specified as requiring differentiation by the client, meaning that instead of reporting the statistic as provided by the VxD, the client is to compute and present the average rate per second at which the provided statistic has changed since the last sampling. In what follows, the term counter is used for the statistic as provided by the VxD and the term rate (of counted events per second) for the differentiated statistic presented to the user by the client.
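As an illustration of what differentiation means here, the following Python sketch computes a rate from two samples of a counter. It is not taken from any client: the function name and the floating-point result are choices made for the example, and the treatment of a counter that wraps around at 32 bits is an assumption.

```python
def differentiate(previous, current, elapsed_seconds):
    """Average rate of change of a 32-bit counter between two samples."""
    # Assumption: the counter is an unsigned 32-bit value that may wrap,
    # so the increase is taken modulo 2**32.
    delta = (current - previous) % 2**32
    return delta / elapsed_seconds


# A counter that rises from 1000 to 4000 over 2 seconds
# has changed at an average rate of 1500 events per second.
print(differentiate(1000, 4000, 2.0))
```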

Problem

When differentiated statistics are presented by the particular performance statistics client known as System Monitor, the rate may be rounded down to whole thousands.

Cause

Inspection reveals a coding error in the SYSMON.EXE program. Specifically, if the increase in the counter between samples is 65535 or more, then to get the rate, SYSMON first divides by the elapsed time in milliseconds and then multiplies by 1000. Because the division is an integer division, its remainder is discarded before the multiplication, so the result is always a whole multiple of 1000. The intention appears to be the avoidance of overflow in 32-bit registers, but the consequence is a rounding error instead.

Observation of the effect is more likely and more significant when SYSMON is configured to sample at longer intervals. For instance, when sampling once per second, rounding down to whole thousands occurs only if the rate is at least 65535 events per second. When sampling every 10 seconds, however, rounding down occurs if 65535 or more events were counted over the 10 seconds between samples, with the consequence that an average rate of roughly 6554 events per second over the 10 seconds is presented to the user as just 6000 events per second.
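The arithmetic can be modelled in a few lines. The Python sketch below is not SYSMON's code: the function names are invented for the illustration, Python's integer division stands in for the div instruction, and the exact rate is taken to be the product divided by the elapsed time, rounded down.

```python
THRESHOLD = 0xFFFF  # the value against which SYSMON compares the increase


def sysmon_rate(delta, elapsed_ms):
    """Model of the faulty calculation: events per second, as SYSMON shows it."""
    if delta < THRESHOLD:
        # Small increase: multiply first, then divide.
        return (delta * 1000) // elapsed_ms
    # Large increase: divide first (losing the remainder), then multiply.
    return (delta // elapsed_ms) * 1000


def exact_rate(delta, elapsed_ms):
    """Events per second, rounded down, with no intermediate loss."""
    return (delta * 1000) // elapsed_ms


# 65535 events counted over 10 seconds (10000 milliseconds).
print(sysmon_rate(65535, 10000))  # 6000
print(exact_rate(65535, 10000))   # 6553
```

One event fewer over the same interval takes the other branch, and the model then reports 6553.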

Applicable Versions

The coding error is observed in SYSMON.EXE versions from Windows 95 and Windows 98. File sizes, dates and times for the versions inspected are:

Version      Size (bytes)   Date and Time              Package
4.00.950     65,024         09:50, 11th July 1995      Windows 95 upgrade
                            11:11, 24th August 1996    Windows 95 OSR2
4.10.1998    81,920         19:01, 11th May 1998       Windows 98

Fix

The problem can be corrected by patching better arithmetic into the SYSMON.EXE file. The three patch sites, given as offsets in bytes from the start of the file, vary with the version:

Version                   File Offsets
4.00.950 (Windows 95)     262Fh
                          263Fh
                          2645h
4.10.1998 (Windows 98)    30B9h
                          30C9h
                          30CFh

At the first site, the expected byte is 72. It is to be changed to EB.

At the second site, the expected bytes are 69 C0. They are to be changed to C7 C2.

The third site is more complicated. The expected bytes are 2B D2 for the Windows 95 version and 33 D2 for the Windows 98 version. They are to be changed to F7 E2.

If you are even slightly uncertain how to patch a file, do not try it.
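For readers who prefer to script the change, the following Python sketch applies the three patches to a copy of the file held in memory. It is an illustration only: the function and table names are invented here, the version strings are merely keys chosen for the example, and nothing in it detects the version of a file. It refuses to change anything unless every expected byte is found at its offset.

```python
# (offset, expected bytes, replacement bytes) for each version, as listed above.
PATCHES = {
    "4.00.950": [
        (0x262F, bytes.fromhex("72"), bytes.fromhex("EB")),
        (0x263F, bytes.fromhex("69C0"), bytes.fromhex("C7C2")),
        (0x2645, bytes.fromhex("2BD2"), bytes.fromhex("F7E2")),
    ],
    "4.10.1998": [
        (0x30B9, bytes.fromhex("72"), bytes.fromhex("EB")),
        (0x30C9, bytes.fromhex("69C0"), bytes.fromhex("C7C2")),
        (0x30CF, bytes.fromhex("33D2"), bytes.fromhex("F7E2")),
    ],
}


def patch_image(image, version):
    """Return a patched copy of the file contents, or raise ValueError."""
    sites = PATCHES[version]
    # Check every site before changing any, so a mismatch leaves nothing altered.
    for offset, expected, _ in sites:
        found = image[offset:offset + len(expected)]
        if found != expected:
            raise ValueError(
                f"unexpected bytes {found.hex()} at offset {offset:X}h"
            )
    patched = bytearray(image)
    for offset, _, replacement in sites:
        patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)
```

A caller would read the whole of a backup copy of SYSMON.EXE in binary mode, pass the contents to patch_image, and write the returned bytes to a new file, leaving the original untouched.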

Patch Details

The following listing presents on the left some instructions from near the patch sites and on the right the same code after the patch is applied, with the bypassed instructions left blank. Differences in version are accommodated by use of some symbols: zero stands for the sub instruction in the Windows 95 version and for the xor instruction in the Windows 98 version; time and result are both ebx for Windows 95 but are ecx and esi respectively for Windows 98. To be sure they are not missed, the three patched instructions are marked with comments:

        Original                                    Patched

        cmp     eax,0000FFFFh                       cmp     eax,0000FFFFh
        jb      @f                                  jmp     @f              ; patch 1

        zero    edx,edx
        div     time
        mov     result,eax
        imul    result,result,1000
        jmp     done

@@:                                         @@:
        imul    eax,eax,1000                        mov     edx,1000        ; patch 2
        zero    edx,edx                             mul     edx             ; patch 3
        div     time                                div     time
        mov     result,eax                          mov     result,eax
        jmp     done                                jmp     done

The effect of the patch is therefore first to render redundant the set of instructions that would divide first then multiply, and second, to change from using the imul instruction to mul. The imul instruction, in the form shown above, multiplies a 32-bit variable by a 32-bit constant and stores the result in a 32-bit register. If the result is too large for the 32-bit register, then the overflow is lost. The mul instruction multiplies a 32-bit variable by the contents of the 32-bit register eax and stores the result in a 64-bit combination of edx and eax. There can be no overflow to lose.
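The difference between the two multiplications can be seen in a small model. The Python sketch below is an illustration under stated assumptions, not a translation of SYSMON: the function names are invented, a mask to 32 bits stands in for the overflow lost by imul, and the case in which the quotient is too large for a 32-bit register is not modelled.

```python
MASK32 = 0xFFFFFFFF


def mul_then_div_32(delta, elapsed_ms):
    """Multiply into a 32-bit register (as imul does), then divide."""
    product = (delta * 1000) & MASK32  # bits above the low 32 are lost
    return product // elapsed_ms


def mul_then_div_64(delta, elapsed_ms):
    """Multiply into the 64-bit pair edx:eax (as mul does), then divide."""
    product = delta * 1000  # the full product is kept
    return product // elapsed_ms


# 5,000,000 events counted over 10 seconds (10000 milliseconds).
print(mul_then_div_32(5_000_000, 10_000))  # 70503
print(mul_then_div_64(5_000_000, 10_000))  # 500000
```

For increases small enough that the product fits in 32 bits, the two functions agree, which is why the original code confines the multiply-first calculation to increases below 65535.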