cabextract is Free Software for extracting Microsoft cabinet files, also called .CAB files. cabextract is distributed under the GPL license. It is based on the portable LGPL libmspack library. cabextract supports all special features and all compression formats of Microsoft cabinet files.

Microsoft cabinet files are used by Microsoft and others to distribute all kinds of data and software: core Web fonts, Longhorn videos, operating system updates and video codecs, to give some examples. Microsoft cabinets are also used as the installation format for Windows CE software. Some people would pay $14.99 to extract this format; you can have it for free.

Download

The latest version of cabextract is version 1.4, released 11 May 2011. It is a minor release which fixes bugs in cabextract. See the changes section for more information.

In most cases, cabextract has already been packaged for your operating system. You should consult your OS documentation on how to obtain and install packages. However, you may have to download the source code and compile it yourself to get the latest version.

Platform                      Download
All: cabextract source code   cabextract-1.4.tar.gz
Linux                         cabextract-1.4-1.i386.rpm
                              cabextract-1.4-1.src.rpm
                              cabextract on rpmfind.net
                              Debian packages
                              Ubuntu packages
                              Gentoo package
                              Slackware packages
                              T2 SDE package
BSD                           FreeBSD package
                              NetBSD package
                              OpenBSD packages *
Mac OS X                      Standalone disk image
                              Fink package
                              Macports package
Solaris                       Solaris SPARC package
                              Solaris x86 package
Microsoft Windows™ / MS-DOS   Cygwin x86 package
                              Cygwin x86_64 package
                              DOS / FreeDOS package
Others                        Amiga m68k / GeekGadgets
                              Amiga PPC / OS4
                              BeOS M$CAB package *
                              NeXTStep / Openstep package *

Entries marked with an asterisk (*) may not have the latest version (1.4). Please let me know of any additions or changes to the list above.

Old distributions of cabextract are still available: 1.3 [i386 RPM, src RPM, Solaris SPARC pkg, Solaris x86 pkg], 1.2 [i386 RPM, src RPM, Solaris 10 x86 pkg], 1.1 [i386 RPM, src RPM, Solaris 10 x86 pkg], 1.0 [RPMs: i386, src], 0.6, 0.5, 0.4, 0.3, 0.2, 0.1.

To install an existing cabextract package, consult your operating system's documentation. To install the RPM, use the command rpm -i cabextract-1.4-1.i386.rpm. To install from the source code tarball:

$ gzip -cd < cabextract-1.4.tar.gz | tar xf -
$ cd cabextract-1.4
$ ./configure
$ make
# make install

More detailed instructions are included in the INSTALL file found in the cabextract-1.4 directory.

Bundled extras

The cabextract source tarball contains extra documentation and software that is not installed by default. If you use a distribution-specific package, these extras may or may not be installed.

  • doc/ja/cabextract.1: the Japanese manual page for cabextract 1.1.
  • doc/magic: Some magic entries for the file command, describing Microsoft cabinet files, InstallShield cabinet files and Windows CE install cabinet header files.
  • doc/wince_cab_format.html: a technical specification of the Windows CE install cabinet file format. This is also available online.
  • src/cabinfo: A program for dumping the raw data fields of Microsoft cabinet files.
  • src/cabsplit: A perl script for splitting a cabinet file into one file per folder.
  • src/wince_info: A perl script for dumping the raw data fields of a Windows CE install cabinet header file.
  • src/wince_rename: A perl script for renaming the files of an extracted Windows CE install cabinet to their true installed names, and extracting the registry entries made by the cabinet.

Using cabextract

Enter man cabextract to read the cabextract manual page. Also, running the cabextract command with the --help option gives a brief summary of usage.

In regular usage, just enter cabextract and the name of the cabinet or executable file you want to extract. cabextract will extract all files in all cabinets to the current directory, preserving any internal directory structure, file permissions and file dates. To list files rather than extract them, use the --list option. To test the archive integrity (doing the work of extracting the files, but not saving the results anywhere), use the --test option. This also prints the MD5 checksum of each file in the archive.

cabextract automatically searches files for embedded cabinets, and extracts all of them. If any multi-part cabinets are present, cabextract automatically searches for those parts and links them in. To suppress this behaviour, use the --single option.

cabextract can repair some kinds of corrupt cabinet files. Perhaps a better word for this is "salvage", as the corrupted data is lost forever. Using the --fix option, lost data will be replaced with zeroes, and cabextract will attempt to continue to later data blocks, which are hopefully not corrupt.

You can make cabextract extract files into a specific directory with the --directory option, and you can force extracted filenames to lowercase with the --lowercase option. You can control which files are extracted using the --filter option. For example, cabextract --filter '*.wav' music.cab will extract only '.wav' files from music.cab.
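
The options described in this section can be combined in a single command. The following is only a minimal sketch: the function name extract_wavs and the file names in the usage line are hypothetical, and it assumes that cabextract exits with a non-zero status when --test finds a problem. It tests a cabinet first, then extracts only its '.wav' files, with lowercased names, into a chosen directory.

```shell
# extract_wavs CABINET DESTDIR
# Hypothetical wrapper, not part of cabextract. It uses only the
# --test, --directory, --lowercase and --filter options.
extract_wavs() {
    cab=$1
    dest=$2

    # Make sure the destination directory exists.
    mkdir -p "$dest" || return 1

    # Check archive integrity before writing anything.
    # Assumption: a failed test gives a non-zero exit status.
    cabextract --test "$cab" || return 1

    # Extract only the .wav files, lowercased, into DESTDIR.
    cabextract --directory "$dest" --lowercase --filter '*.wav' "$cab"
}
```

For example, extract_wavs music.cab ./sounds would leave the '.wav' files from music.cab in ./sounds, provided the integrity test passes.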

Changes since cabextract 1.3

  • A bug in the LZX decompressor was fixed.
  • cabextract is now more tolerant when processing cabinet sets.
  • cabextract is now compatible with even more compilers, and now supports 64-bit file I/O on platforms where it's completely native, like Mac OS X 10.6 and Fedora x86_64.
  • cabextract will no longer print "library not compiled to support large files" while reading small files.

Frequently Asked Questions

Q: I can't extract this DATA1.CAB file...
A: There are two different "cabinet" file formats in popular use. Some are Microsoft cabinets, which can be unpacked with cabextract. Others are InstallShield cabinets, which can be unpacked with unshield. You can distinguish the two files like so:

  1. InstallShield cabinets are normally called data1.cab and have a matching data1.hdr file.
  2. InstallShield cabinets begin with the magic ID "ISc(". Microsoft cabinets begin with the magic ID "MSCF".
  3. Unpacking an InstallShield cabinet with cabextract 0.6 gives the error message "not a Microsoft cabinet file". cabextract 1.0 or later gives the warning "WARNING; found InstallShield header. This is probably an InstallShield file. Use UNSHIELD from www.synce.org to unpack it." and the error message "no valid cabinets found".
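
The magic ID check in point 2 can be done by hand before choosing a tool. The following is a rough sketch, not something shipped with cabextract: the function name cab_kind is hypothetical, and it only looks at the first four bytes of the file, so it will report "unknown" for a self-extracting executable whose cabinet is embedded further into the file.

```shell
# cab_kind FILE
# Hypothetical helper: prints "microsoft", "installshield" or "unknown"
# depending on the four-byte magic ID at the start of FILE.
cab_kind() {
    # Read the first four bytes; strip NUL bytes so the shell
    # can hold the result in a variable.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | tr -d '\0')

    case "$magic" in
        MSCF)   echo "microsoft" ;;      # unpack with cabextract
        'ISc(') echo "installshield" ;;  # unpack with unshield
        *)      echo "unknown" ;;
    esac
}
```

For example, cab_kind DATA1.CAB prints "installshield" if the file starts with "ISc(", in which case unshield is the tool to use.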

Q: Can I license cabextract for use in my non-GPL software?
A: Yes, you can. Contact me for further details. However, you may prefer to use libmspack, as it has been explicitly designed for reuse.

Q: Where can I get software to create Microsoft cabinet files?
A: There are several options:

  • You can use Microsoft's own CABARC.EXE.
  • You can use Rien Croonenborghs' LCAB.
  • Future releases of libmspack will include a cabinet file creator. It is currently being designed.

Q: Is cabextract a circumvention device, as defined in the DMCA?
A: Perhaps it is, according to Microsoft. The linked article shows Microsoft citing WinRAR and WinZip as circumvention devices, as they allow people to extract this executable cabinet file, which contains a document describing how Microsoft have embraced and extended the Kerberos protocol to prevent interoperability with Unix-based Kerberos servers. The executable gives you a click license to agree to, which includes a Non Disclosure Agreement. Obviously, if you don't run the executable, you will never see the NDA, and will not be bound by it. The irony is that WinRAR and WinZip rely on Microsoft's own CABINET.DLL to do the extraction, so really it is Microsoft's own software acting against them!

Q: Do you hate Microsoft?
A: No. There is nothing wrong with being a big software corporation. What I dislike is Microsoft's illegal abuse of its monopoly to harass its competitors. I would like to see Microsoft pressured into continuous technical innovation in a competitive marketplace, rather than engage in product dumping to financially cripple its competitors, and locking in its users with incompletely documented and constantly changing file formats, then letting software stagnate once it achieves dominance.

Q: Is reverse engineering illegal?
A: Reverse engineering for interoperability is protected under international copyright law. You do not have the explicit right to copy software, which is why you need to agree to a license that gives you those rights, but you do have the explicit right to reverse engineer software for interoperability purposes. Your right to reverse engineer could only be stopped if you signed a contract agreeing not to reverse engineer the software. In the UK, this must be a fair contract, which is a contract that both parties have the opportunity to amend and agree to. A shrinkwrap license or EULA is not a fair contract.

CAB History

In 1977, Abraham Lempel and Jacob Ziv devised and published a paper on their new compression method, LZ77. In 1982, James Storer and Thomas Szymanski released their LZSS variant. In the early 1980s, Microsoft required some form of data compression for their installation media to cut down on the number of disks needed to install MS-DOS and Microsoft Windows, so they took Haruhiko Okumura's implementation of LZSS. Their compressed files had a SZDD signature.

In 1989, Phil Katz put the deflate method in the public domain. Microsoft started using the algorithm to compress their installation media. The signature changed to KWAJ.

In the early 1990s, various people invented new forms of disk formatting for the IBM PC, increasing the amount of space on a disk despite the PC's inflexible floppy disk controller. Once again, Microsoft products were getting bigger, so Microsoft took one of these disk formats and called it DMF, or Windows formatted disks.

For most of the early 1990s, Jonathan Forbes had been writing fast versions of LZH archivers on the Amiga. In 1995, he and Tomi Poutanen devised an LZH adaption known as LZX. Its main benefits beyond deflate were a compact way of encoding large match offsets, and ramping up the size of the LZ sliding window. Furthermore, their Amiga implementation included file merging (known as solid archiving in RAR), where file data was grouped into large blocks, instead of files being individually compressed. This file merging technique also appeared in other new archivers around that time. By coincidence, Microsoft devised a new installation media which used file merging! This time, they were cabinet files or CABs. They included two compression methods - MSZIP (aka deflate) and Quantum, a large-window LZ compressor using arithmetic coding, licensed from its author David Stafford.

In 1997, Jonathan Forbes went to work for Microsoft. Soon enough, cabinet files started supporting a modified form of LZX. But finally, Microsoft published an official specification for cabinet files, MSZIP and LZX. They did not detail Quantum, and their LZX specification contained errors to such an extent that it was not possible to create a working compressor or decompressor from the specification.

In 2000, Stuart Caie embarked on writing a CAB unpacker for Dirk Stöcker's XAD system. He discovered all of the above, including the LZX specification errors, but eventually came up with a working LZX extractor. Being a generous devil, and wanting help with the remaining Quantum extractor, he converted his XAD client into a command-line CAB decompressor. In 2002, Matthew Russotto kindly researched and wrote the Quantum extractor.

In 2003, Stuart Caie launched a new library designed to support all major Microsoft compression formats, called libmspack.

Credits

cabextract is written primarily by Stuart Caie. The Quantum decompressor was researched and implemented by Matthew Russotto. The original adaption of InfoZip's inflate to MS-ZIP was done by Dirk Stoecker, who also provided lots of support, testing, and cabinet files. The fast Huffman table generator is taken from unlzx by Dave Tritscher.

Thanks to Eric Sharkey for Debian packaging and the original manual page. Thanks to Ben Collver for NetBSD packaging, and some useful patches. Thanks to Maxim Sovolev for the FreeBSD packaging. Thanks to Siarzhuk Zharski for BeOS packaging. Thanks to Pawel Chwalowski for the Amiga packaging. Thanks to Stefan Dirsch for using cabextract in SuSE. Thanks to Apostolos Syropoulos for the Solaris packaging. Thanks to Rudá Moura for the Mac OS X disk image. Thanks to Robert Riebisch for the DOS version. Thanks to Katsumi Saito for the Japanese manual page. Thanks to Soos Peter for the RPM spec file. Thanks to Jae Jung and Igor Glucksmann for LZX decompressor fixes. Thanks to Larry Frieson for an important Quantum decompressor fix. Thanks to Markus Nullmeier for native IRIX compiler support. Thanks to Jonathan Forbes for creating LZX and other Amiga compression tools. Finally, thanks to the many other people who have sent in email, suggestions and code.