
PKZIP 2.0 compatible archive handler

Description

Meet a couple of pure-Qt/C++ classes capable of handling PKZIP 2.0 compatible zip archives.

This is not a "port" of some other existing implementation: everything has been written from scratch (although some code was inspired by existing public domain projects) and it is all pure C++/Qt. Please note that this is not a complete stand-alone library, just a small set of classes. You will have to add them to your project and modify them to fit your needs.

It supports basic features such as file extraction and compression (with optional password encryption) and archive comments. There are methods to extract single files or the whole archive, and methods to compress the contents of a whole directory. The public API has only a few methods because that is all I was looking for, but adding other features should be quite trivial: it really shouldn't take more than a few lines of code.

The classes are a good fit if you only need the PKZIP format for loading and saving your application's data. Just remember that you will need to customize a few parts of the code, for example to add your own password retrieval method.

zlib is used for actual compression and decompression.

Please refer to the example application's main.cpp file or to the class comments in the source files for details and more usage examples.

The distributed example application does not link against the zlib library but it uses the QtCore exported zlib functions (please see the technical details section).

Changelog

  • 2012-09-10 Use data type defined in zlib/zconf.h for CRC table pointer; silently attempt to create missing output directories (thanks to Paul Tarr for the bug reports).
  • 2012-04-04 Fixed encryption routine on certain platforms (e.g. GNU Linux/GCC x64) (thanks to anunakin for the bug report).
  • 2012-01-31 Regression: fixed setting of compression level flag in local zip entry (thanks again to Anton Rekshinsky for the bug report).
  • 2012-01-29 Fixed corrupted entries when using Zip::Store compression level (thanks to Anton Rekshinsky for the bug report); added UnZip::verifyArchive(); minor improvements.
  • 2012-01-08 Implemented Zip::IgnoreRoot flag; added addFile() and addFiles() convenience methods; added Zip::CheckForDuplicates and Zip::SkipBadFiles flags; added (very) minor improvements.
  • 2011-06-25 Bug entry #1 - checking of "version needed to extract" flag (thanks to Dominik for reporting).
  • 2011-06-25 Fix compile errors on GCC/Linux.
  • 2011-03-27 Major changes:
    • added support for namespace
    • added support for shared library builds
    • added basic support for time zones
    • no longer take ownership of QIODevices (MAJOR API CHANGE!)
    • unzip will update last modified time on win32 and posix compliant OS
    • code clean up
  • 2010-07-08 Bug fix: extractFile() was permanently updating the compressed file size in the zip entry in case of an encrypted entry. Thanks to Serge Kolokolkin for finding and reporting the exact bug.
  • 2008-09-07 Bug fix: rare failure to locate EOCD when archive had a comment.
  • 2007-02-01 New IgnorePaths compression option and two more "addDirectoryContents()" convenience methods to use this option.
  • 2007-01-28 Major changes:
    • bug fix: there was a big problem with directory names
    • API changes: the Zip::addDirectory() method is now easier to use; the password can now be set using a setPassword() method and a new flag allows preserving absolute paths
    • added an "encrypted" flag to the UnZip::ZipEntry struct
    • removed QObject inheritance. Internationalization is now achieved through QCoreApplication::translate()
  • 2006-11-24 Bug fix: under certain circumstances, an additional directory was being created in the root directory of the zip file.
  • 2006-10-23 Minor API changes; QIODevice support added; better binary compatibility; "long long" issue with older compilers solved.
  • 2006-06-09 Minor API changes.
  • 2005-10-03 First public release.

Requirements

  • Qt 4.0.x (QtCore module)
  • zlib library

Features

  • Pure C++/Qt based, clean and object-oriented implementation.
  • Retrieve archive contents information before extracting any file.
  • Fast (but less robust with corrupt archives) parsing of the ZIP file format.
  • Traditional PKWARE password encryption (strong encryption as introduced by PKZip versions 5.0 and later is NOT available).
  • Support for archive comments.
  • Optional namespace and shared lib support (see below).
  • Time zone support (see below).

Missing features and restrictions

  • Needs to be modified to fit into an existing project (e.g. you might need to add your own password handling routine).
  • Weak support for corrupt archives (although some files may still be extracted even if the archive is corrupted).
  • No support for filesystem specific features like unix symbolic links.
  • No support for spanned archives.
  • No support for strong encryption or features introduced after PKZIP version 2.0 (see the PKWARE specs for details).

Namespace support

If you need the classes to be inside a namespace, simply define OSDAB_NAMESPACE in the project file. The classes will then be in the Osdab::Zip namespace.

Building as a shared library (.DLL / .so / .dylib)

There are two ways to achieve this.

The easiest and cleanest way is to edit the zipglobal.h file and remove the "#ifndef OSDAB_ZIP_LIB" block.

The other way is to #define OSDAB_ZIP_LIB both in the project you create for the shared zip library and in the projects linking against it. The "Example.SharedLib" directory contains a sample build of this kind.

Time zones

Time zone support is implemented only on Windows and Unix compatible systems.

It can be disabled by defining OSDAB_ZIP_NO_UTC in the project file.

The currentUtcOffset() method in zipglobal.cpp is where the current UTC offset is calculated. Qt (as of 4.7.x) has no proper time zone support, so the code relies on gmtime and localtime.
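To give an idea of how a UTC offset can be derived from gmtime and localtime, here is a minimal sketch in plain C++ (no Qt). It is not the code from zipglobal.cpp: the function name utcOffsetSeconds is made up for this example, and the approach shown (comparing the two broken-down times field by field) is just one possible way to do it.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper, NOT the library's currentUtcOffset():
// returns the local time zone's offset from UTC, in seconds, at 'when'.
int utcOffsetSeconds(std::time_t when)
{
    // gmtime() and localtime() may share one static buffer, so copy each result.
    std::tm* p = std::gmtime(&when);
    assert(p != nullptr);
    const std::tm utc = *p;

    p = std::localtime(&when);
    assert(p != nullptr);
    const std::tm local = *p;

    // The two calendar dates differ by at most one day.
    int dayDiff = local.tm_yday - utc.tm_yday;
    if (local.tm_year != utc.tm_year)
        dayDiff = (local.tm_year > utc.tm_year) ? 1 : -1; // crossed a year boundary

    return dayDiff * 86400
         + (local.tm_hour - utc.tm_hour) * 3600
         + (local.tm_min - utc.tm_min) * 60
         + (local.tm_sec - utc.tm_sec);
}
```

Calling utcOffsetSeconds(std::time(nullptr)) gives the offset that applies right now, including any daylight saving time in effect. Note that gmtime and localtime are not guaranteed to be thread-safe.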


Usage example

Always refer to the example application's main.cpp file or to the class comments in the source files for details and more (and updated) usage examples.

Using the UnZip class:

#include <QList>
#include <QStringList>
#include "unzip.h"

// ...

UnZip uz;

// Open the archive and read the compressed entries
UnZip::ErrorCode ec = uz.openArchive("myArchive.zip");
if (ec != UnZip::Ok) {
    // ERROR HANDLING
}

// Any comment?
QString comment = uz.archiveComment();

// Need to list the files in this archive?
QStringList list = uz.getFileList();
if (uz.contains("directory/file.ext")) {
    // DO SOMETHING
}

/* Need more details? Each ZipEntry contains info like file comment, (un)compressed size, CRC32, last modified date/time, compression method, file type and encryption status (coming soon) */
QList<UnZip::ZipEntry> entries = uz.entryList();

ec = uz.extractAll(myDestinationDirectory);
if (ec != UnZip::Ok) {
    // ERROR HANDLING
}

// Close the zip file and free used resources
ec = uz.closeArchive();
if (ec != UnZip::Ok) {
    // ERROR HANDLING
}
// ...

Using the Zip class:

#include <QStringList>
#include "zip.h"

// ...

Zip uz;

// Do some init and prepare for adding entries
Zip::ErrorCode ec = uz.createArchive("myArchive.zip");
if (ec != Zip::Ok) {
    // ERROR HANDLING
}

// Basic usage
ec = uz.addDirectory("myDirectory");
// Create a custom root directory in the zip file
ec = uz.addDirectory("myDirectory", "myRoot");
// Don't create the root directory (new!)
ec = uz.addDirectory("myDirectory", QString(), Zip::IgnoreRoot);
// Replace the root directory (new!)
ec = uz.addDirectory("myDirectory", "myRoot/data/", Zip::IgnoreRoot);
// Add specific files (new!)
ec = uz.addFiles(QStringList() << "myDir/myFile.txt" << "myOtherDir");
// Preserve absolute paths
ec = uz.addDirectory("myDirectory", Zip::AbsolutePaths);
// ...with encryption :)
uz.setPassword("helloWorld!");
ec = uz.addDirectory("myDirectory");

// Any comments?
uz.setArchiveComment("Hello comment!");

// Write the zip file epilogue and free used resources
ec = uz.closeArchive();
if (ec != Zip::Ok) {
    // ERROR HANDLING
}
// ...

A last tip: zlib's inflateInit() and deflateInit() functions check whether the library's major version is the same as the version in the header file you include. You can run the same test when your application starts, so that a mismatch is reported before the user starts working! Here is some trivial example code:

#include <QtGlobal>
#include <zlib.h>

// ZLIB_VERSION is #defined in zlib.h
const char* my_version = ZLIB_VERSION;
const char* ret_version = zlibVersion();
if (my_version[0] != ret_version[0]) {
    // qFatal() takes a printf-style format string, not a QString
    qFatal("Wrong zlib version! Required: %s, found: %s", my_version, ret_version);
}

Technical details

zlib linking

Linking against the zlib library is not strictly required, as it is statically linked into the Qt libraries. Qt's QtCore library exports all the zlib functions, so you only need to make your INCLUDEPATH point to the directory containing the zlib.h and zconf.h header files. If you want to use a specific version of the zlib library, however, you will need to link against it as usual.

The Bitter Bitty flag

The first versions of the UnZip class used to parse the local header records first to retrieve the archive contents. However, this introduced some problems with entries that have bit 3 of the general purpose bit flag (mask 0x0008) set. The original purpose of this flag is to allow the CRC32, compressed size and uncompressed size to be written after the compressed data, in a so-called "data descriptor" record. This was necessary because some devices (maybe some ice-age tape drive) might not be able to seek back and set the three fields in the local header record. Locating the data descriptor is however a big issue, since we need to skip the compressed data without knowing its length. And btw.: the data descriptor has an OPTIONAL signature.
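For reference, testing this flag on a raw local header record could look roughly like the sketch below. It is plain C++ and not taken from the UnZip class; the function names are made up for this example. The field offsets (flag field at byte 6 of a 30-byte fixed header, signature "PK\3\4") follow the PKWARE specs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a little-endian 16-bit value.
static std::uint16_t readLe16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Hypothetical helper, not part of the UnZip class: returns true if the
// buffer starts with a local file header whose general purpose flag has
// bit 3 set, i.e. CRC32 and sizes are stored in a trailing data descriptor.
bool hasDataDescriptor(const unsigned char* header, std::size_t size)
{
    assert(header != nullptr);

    const std::size_t fixedHeaderSize = 30; // fixed part of a local file header
    if (size < fixedHeaderSize)
        return false;

    // Local file header signature: 'P' 'K' 0x03 0x04
    if (header[0] != 0x50 || header[1] != 0x4b ||
        header[2] != 0x03 || header[3] != 0x04)
        return false;

    // The general purpose bit flag is the 16-bit field at offset 6.
    return (readLe16(header + 6) & 0x0008) != 0;
}
```

When this returns true, the CRC32 and size fields of the local header are zero and cannot be used to skip over the compressed data, which is exactly the problem described above.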

One day one of you guys out there realized that the class failed to parse OpenOffice archives (plain PKZIP 2.0 zip files!). After a short caffeine-powered debug session, I realized that the problem was related to my beloved 3rd general purpose bit flag (friends call him Bitty, but his full name is Bitter Bitty - ok, I will quit it with these poor geek jokes). OpenOffice sets the Bitty flag for empty files (or under some other circumstance.. I can't remember now) but the routine I wrote to locate the data descriptor was slow and buggy (very slow; and very buggy).

This was a good reason for me to rewrite the whole archive parsing routines from scratch. The latest versions start by parsing the central directory records (at the end of the zip file) instead of the local header entries. This allows me to always retrieve the CRC32, the compressed and uncompressed size, and the offset of the local header entry for each file. I suppose most zip routines use this second approach (at least some open source ones do) as it is considerably faster. However, this means that incomplete zip files (files with a corrupted or missing end) won't be parsed, because I didn't add any routines to handle these files by attempting to parse the local headers.

Btw. the local headers are still read to do some basic redundancy checks and detect if the values in the central directory differ from the ones in the local header.
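Parsing from the central directory starts with locating the "end of central directory" (EOCD) record, which sits at the very end of the file, followed only by the optional archive comment. The sketch below shows the general idea on an in-memory buffer. It is not the UnZip class code: findEocd is a name made up for this example, and the comment length check is just one simple way to reject a signature that happens to appear inside a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the UnZip class: scans backwards for the
// EOCD record and returns its offset, or -1 if none is found.
long findEocd(const std::vector<unsigned char>& data)
{
    const std::size_t eocdSize = 22;      // fixed part of the EOCD record
    const std::size_t maxComment = 65535; // comment length is a 16-bit field
    if (data.size() < eocdSize)
        return -1;

    const std::size_t last = data.size() - eocdSize; // EOCD without comment
    const std::size_t first = (last > maxComment) ? last - maxComment : 0;
    assert(first <= last);

    for (std::size_t pos = last + 1; pos-- > first; ) {
        // EOCD signature: 'P' 'K' 0x05 0x06
        if (data[pos] != 0x50 || data[pos + 1] != 0x4b ||
            data[pos + 2] != 0x05 || data[pos + 3] != 0x06)
            continue;

        // The 16-bit comment length at offset 20 must account for every
        // byte that follows the record.
        const std::size_t commentLen = data[pos + 20] | (data[pos + 21] << 8);
        if (pos + eocdSize + commentLen == data.size())
            return static_cast<long>(pos);
    }
    return -1;
}
```

Once the record is found, the 32-bit fields at offsets 12 and 16 give the size and the starting offset of the central directory. A file whose end is missing has no EOCD record, which is why such archives cannot be parsed this way.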

Encrypted archives

About encrypted archives: as of PKZIP version 2.0, only the last byte of the decrypted encryption header is used for testing the password. Earlier versions used the last two bytes, and it seems that many zip implementations still use two bytes (at least when writing the encryption header). The Zip class still writes the last two bytes for testing the password at extraction time (using part of the file modification time, as we don't know the CRC yet).
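The password test can be illustrated with a small, self-contained sketch of the traditional PKWARE scheme as described in the PKWARE specs. This is plain C++ and not the code used by the Zip and UnZip classes: ZipKeys, encryptHeader and passwordLooksValid are names made up for this example, and the CRC-32 step is computed bit by bit so that the sketch does not need zlib.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One CRC-32 step (polynomial 0xEDB88320), computed without a lookup table.
static std::uint32_t crc32Step(std::uint32_t crc, std::uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; ++i)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

// Hypothetical type: the three 32-bit keys of the traditional PKWARE cipher.
struct ZipKeys {
    std::uint32_t k0 = 0x12345678u, k1 = 0x23456789u, k2 = 0x34567890u;

    explicit ZipKeys(const std::string& password)
    {
        for (unsigned char c : password)
            update(c);
    }
    // Mixes one PLAINTEXT byte into the keys.
    void update(std::uint8_t plain)
    {
        k0 = crc32Step(k0, plain);
        k1 = (k1 + (k0 & 0xffu)) * 134775813u + 1u;
        k2 = crc32Step(k2, static_cast<std::uint8_t>(k1 >> 24));
    }
    // Next byte of the key stream.
    std::uint8_t streamByte() const
    {
        const std::uint32_t t = (k2 & 0xffffu) | 2u;
        return static_cast<std::uint8_t>((t * (t ^ 1u)) >> 8);
    }
};

// Encrypts the 12-byte encryption header in place.
void encryptHeader(ZipKeys& keys, std::uint8_t header[12])
{
    assert(header != nullptr);
    for (int i = 0; i < 12; ++i) {
        const std::uint8_t plain = header[i];
        header[i] = static_cast<std::uint8_t>(plain ^ keys.streamByte());
        keys.update(plain);
    }
}

// Decrypts the header and compares only its LAST byte with the expected
// check byte (PKZIP 2.0 behaviour).
bool passwordLooksValid(ZipKeys& keys, const std::uint8_t encrypted[12],
                        std::uint8_t expectedCheckByte)
{
    assert(encrypted != nullptr);
    std::uint8_t plain = 0;
    for (int i = 0; i < 12; ++i) {
        plain = static_cast<std::uint8_t>(encrypted[i] ^ keys.streamByte());
        keys.update(plain);
    }
    return plain == expectedCheckByte;
}
```

In a real archive the first bytes of the header are random and the expected check byte is the high byte of the CRC32 (or of the modification time when the CRC is not known yet). Since a single byte is compared, a wrong password still passes this test about once in 256 tries, so the CRC of the extracted data remains the real verification.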
