EXTRACT.EXE Misses Files in Cabinet

Background

With Windows 95, if not earlier, Microsoft introduced a scheme by which a set of files may be compressed into one or more cabinet files (with the .CAB extension) for easy distribution. Windows 95 is itself distributed as a series of these cabinet files. Windows 95 also supplies a program with which users may operate on cabinet files, especially to get a directory of the cabinet’s contents or to extract files of particular interest. This utility program is called EXTRACT.EXE. That this program is a standard component of a Windows 95 installation makes it very convenient to distribute files in cabinets.

EXTRACT.EXE is a DOS program with a command line syntax. Of special interest to this bug note is that the syntax allows users to follow the name of the cabinet with one or more templates for the names of files that the program is to search for in the cabinet. These templates may simply be filenames or they may include wildcard characters. If the user supplies no template, the program uses *.* by default.

Problem

Some versions of the EXTRACT.EXE program do not give *.* the conventional meaning of matching all filenames. Instead, the program interprets *.* as matching files whose names contain exactly one period. Whenever the EXTRACT program uses *.* as the template for matching files, it will miss files whose names have no extension (and files whose names have more than one period).

Applicable Versions

This note applies to versions 1.00.0520 and 1.00.0530. The latter is the more common, being distributed with Windows 95 (both the original release and OSR2) and with many Microsoft applications. Dates and times for the file vary with the package. The file size also varies, depending on whether the executable is compressed. Version 1.00.0520 was distributed with Microsoft Office for Windows 95. To determine the version of a given EXTRACT.EXE file, execute it either with no command-line arguments or with the /? switch.

The version 1.00.0603 distributed with Internet Explorer 4.0 does not exhibit the problem described here.

Workaround

There is no template that the faulty versions of EXTRACT.EXE will interpret as matching all possible filenames. However, all filenames that are valid under traditional DOS rules (which allow at most one period in a name) will be matched by running the EXTRACT program twice: once in the usual way, either with *.* or with no template, and once with the template * in case the cabinet contains files that have no extension.

Example

If a user is given a cabinet file, say ARCHIVE.CAB, then the command extract archive.cab *.* (or its equivalent extract /e archive.cab) cannot be relied on to extract all the contents of the cabinet. However, the command extract archive.cab *.* can be followed by extract archive.cab * to pick up any files whose names happen not to contain a period.
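The reasoning behind the two-pass workaround can be checked with a small model. The following Python sketch does not run EXTRACT.EXE; it only models the behaviour this note describes, by counting periods. The function names and the sample cabinet contents are hypothetical.

```python
def matched_by_star_dot_star(name: str) -> bool:
    # Behaviour described in this note: *.* matches names with exactly one period.
    return name.count(".") == 1


def matched_by_star(name: str) -> bool:
    # The template * contains no period, so it cannot match a name that has one.
    return "." not in name


def missed_by_two_passes(names: list[str]) -> list[str]:
    # Names that neither "extract archive.cab *.*" nor "extract archive.cab *"
    # would pick up under the modelled behaviour.
    return [
        name
        for name in names
        if not (matched_by_star_dot_star(name) or matched_by_star(name))
    ]


# Hypothetical cabinet contents, for illustration only.
sample_names = ["SETUP.EXE", "README", "A.B.C"]
missed = missed_by_two_passes(sample_names)
```

Under this model, only a name with more than one period escapes both passes. Such a name is not valid under traditional DOS rules, which is why the workaround is sufficient for traditional DOS filenames but not for all possible filenames.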

Cause

The cause is confirmed to lie in the EXTRACT.EXE code, specifically in the routine that matches a given filename against a given template. The routine defines, for each character in the template, which characters in the filename it may match.

In particular, a period in the template matches only a period in the filename and, conversely, a period in the filename can be matched only by a period in the template. The template *.* is therefore interpreted as matching filenames that contain exactly one period.
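To illustrate how such a rule produces the observed behaviour, here is a minimal Python sketch of a wildcard matcher in which a period can be matched only by a period. It is not the EXTRACT.EXE code. The function name is hypothetical, and the treatment of * (zero or more non-period characters), of ? (exactly one non-period character), and the case-sensitive comparison are assumptions made for the illustration; the note confirms only the rule about periods.

```python
def faulty_match(template: str, name: str) -> bool:
    """Match name against template, letting only '.' match '.'."""
    if not template:
        # An exhausted template matches only an exhausted name.
        return not name
    head, rest = template[0], template[1:]
    if head == "*":
        # Assumption: '*' absorbs zero or more characters, never a period.
        i = 0
        while True:
            if faulty_match(rest, name[i:]):
                return True
            if i >= len(name) or name[i] == ".":
                return False
            i += 1
    if not name:
        return False
    if head == "?":
        # Assumption: '?' takes exactly one character, but not a period.
        return name[0] != "." and faulty_match(rest, name[1:])
    # Any other template character, including '.', must match itself.
    return head == name[0] and faulty_match(rest, name[1:])


# Illustrative filenames only.
one_period = faulty_match("*.*", "SETUP.EXE")
no_period = faulty_match("*.*", "README")
two_periods = faulty_match("*.*", "A.B.C")
```

With these rules, *.* succeeds for SETUP.EXE but fails for README and for A.B.C, while the template * succeeds for README only. That is the pattern of misses described under Problem.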

Fix

In version 1.00.0603, the routine that matches a given filename against a given template starts with new code—almost certainly just one line of C—that returns a successful match, whatever the filename, if given *.* as the template.
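The shape of that fix can be sketched as a wrapper around the older matching routine. This Python sketch is an illustration only, not a translation of Microsoft's code. All names are hypothetical, and the stand-in for the older routine models just the behaviour this note describes for *.* and *, with exact comparison assumed for any other template.

```python
from typing import Callable

Matcher = Callable[[str, str], bool]


def with_star_dot_star_shortcut(legacy_match: Matcher) -> Matcher:
    """Wrap a matcher so that the template *.* accepts every filename."""

    def match(template: str, name: str) -> bool:
        # The new first step: *.* succeeds whatever the filename.
        if template == "*.*":
            return True
        # Every other template is handled exactly as before.
        return legacy_match(template, name)

    return match


def legacy_stand_in(template: str, name: str) -> bool:
    # Hypothetical stand-in for the older routine, modelled by counting periods.
    if template == "*.*":
        return name.count(".") == 1
    if template == "*":
        return "." not in name
    return template == name  # assumption: literal templates compare exactly


fixed_match = with_star_dot_star_shortcut(legacy_stand_in)
```

In this sketch, fixed_match("*.*", "README") succeeds although legacy_stand_in("*.*", "README") does not, and templates other than *.* behave as they did before.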

The Windows problem described in this note has therefore been fixed by Microsoft, the solution being to upgrade Windows by installing Internet Explorer 4.0. Careful inspection of the Microsoft Knowledge Base may one day show that the problem is not only fixed but documented (as one would think it would be, at least while faulty versions of EXTRACT.EXE remain in circulation).