Archive for category Technology

Metazip 0.8 released

Metazip

Metazip is an Apache 2.x module that allows you to serve dynamically-constructed, uncompressed zip files. Thus, your users can choose to download individual files or entire collections, but you don’t have to reserve disk space for both.

For example, let’s say we have a directory with some .mp3 files. This directory also has a .zip file containing the very same .mp3 files. Note that it must be an uncompressed .zip file, so that it contains exact copies of the .mp3 files.
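To make the "exact copies" point concrete, here is a minimal sketch using Python's standard zipfile module. The track names and contents are made-up placeholders, not real .mp3 data. It builds an uncompressed (stored) archive in memory and checks that each file's bytes appear verbatim inside it.

```python
import io
import zipfile

# Hypothetical stand-ins for the .mp3 files in the directory.
tracks = {
    "track01.mp3": b"ID3" + bytes(range(256)) * 4,
    "track02.mp3": b"ID3" + bytes(reversed(range(256))) * 4,
}

# ZIP_STORED writes each member without compression.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    for name, data in tracks.items():
        archive.writestr(name, data)

archive_bytes = buffer.getvalue()

# Because nothing is compressed, every track appears byte-for-byte in the archive.
for name, data in tracks.items():
    assert data in archive_bytes, name

# Whatever is left over is zip metadata (headers and the central directory).
overhead = len(archive_bytes) - sum(len(data) for data in tracks.values())
print(f"archive: {len(archive_bytes)} bytes, zip metadata: {overhead} bytes")
```

The metadata is tiny compared to the audio itself, which is why keeping both the archive and the individual files on disk is almost pure duplication.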

Now, by running the Metazip command-line tool, we can “compress” this .zip file by removing the duplicate copies of the .mp3 files. The resulting .zip.mz file has all of the original .zip file’s metadata (e.g. the header at the beginning of every .zip file), but the .mp3 data has been replaced with a simple filename.

We can reconstruct the original .zip file by combining the .zip.mz file with the .mp3 files. The mod_metazip Apache module does exactly that, and it does this without adding any significant load on the server! Internally, Apache uses a data structure called a “bucket brigade,” which allows it to efficiently combine different pieces of data into a single logical stream.
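
The split-and-recombine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment list below is a hypothetical in-memory stand-in for the .zip.mz file (the real on-disk format is not shown here), and `split_stored_zip` and `rebuild` are names invented for this sketch, not part of Metazip.

```python
import io
import struct
import zipfile


def split_stored_zip(archive_bytes):
    """Split an uncompressed zip into metadata chunks and file references."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        members = sorted(archive.infolist(), key=lambda info: info.header_offset)
    segments, cursor = [], 0
    for info in members:
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError(f"{info.filename} is compressed; only stored members work")
        # A local file header is 30 fixed bytes followed by the name and extra field;
        # the two lengths sit at byte offsets 26 and 28 of the header.
        name_len, extra_len = struct.unpack_from("<HH", archive_bytes, info.header_offset + 26)
        data_start = info.header_offset + 30 + name_len + extra_len
        segments.append(("meta", archive_bytes[cursor:data_start]))
        segments.append(("file", info.filename))  # the data becomes just a filename
        cursor = data_start + info.file_size
    segments.append(("meta", archive_bytes[cursor:]))  # central directory and trailer
    return segments


def rebuild(segments, read_file):
    """Recombine metadata chunks with file contents into the original archive."""
    parts = []
    for kind, value in segments:
        parts.append(value if kind == "meta" else read_file(value))
    return b"".join(parts)


# Demo with placeholder data standing in for the .mp3 files.
tracks = {"a.mp3": b"A" * 1000, "b.mp3": b"B" * 2000}
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
    for name, data in tracks.items():
        archive.writestr(name, data)
original = buffer.getvalue()

segments = split_stored_zip(original)
rebuilt = rebuild(segments, tracks.__getitem__)
assert rebuilt == original
```

Here `rebuild` joins everything in memory for simplicity. A server would instead send each piece to the client in order, which is the job the bucket brigade does inside Apache.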

Features:

  • Extremely efficient, by hooking into Apache’s filter system.
  • Works equally well for .zip, .tar, and .rar archive files.
  • Supports resumption of aborted downloads.
  • Lightweight, self-contained C implementation.
  • Free, open source license (Apache).
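
Resumption falls out naturally once the archive is treated as a sequence of pieces with known lengths. The sketch below shows one way a byte range could be mapped onto those pieces. It is an assumption for illustration, not necessarily how mod_metazip does it, and the segment layout, `segment_length`, and `read_range` are hypothetical names.

```python
def segment_length(segment):
    """Length in bytes of a ("meta", bytes) or ("file", (name, size)) segment."""
    kind, value = segment
    return len(value) if kind == "meta" else value[1]


def read_range(segments, start, stop, read_file_slice):
    """Return bytes [start, stop) of the logical stream, reading only what overlaps."""
    out = []
    offset = 0
    for segment in segments:
        length = segment_length(segment)
        lo = max(start, offset)
        hi = min(stop, offset + length)
        if lo < hi:  # this segment overlaps the requested range
            kind, value = segment
            if kind == "meta":
                out.append(value[lo - offset:hi - offset])
            else:
                out.append(read_file_slice(value[0], lo - offset, hi - offset))
        offset += length
    return b"".join(out)


# Placeholder files and metadata chunks; real headers would be binary zip structures.
files = {"a.mp3": b"A" * 10, "b.mp3": b"B" * 20}
segments = [
    ("meta", b"<hdr-a>"),
    ("file", ("a.mp3", 10)),
    ("meta", b"<hdr-b>"),
    ("file", ("b.mp3", 20)),
    ("meta", b"<index>"),
]


def read_file_slice(name, lo, hi):
    return files[name][lo:hi]


total = sum(segment_length(segment) for segment in segments)
full = read_range(segments, 0, total, read_file_slice)

# A client that already has the first 30 bytes asks only for the rest,
# as an HTTP "Range: bytes=30-" request would.
tail = read_range(segments, 30, total, read_file_slice)
assert full[:30] + tail == full
```

Since the total length is known up front, the server can also report an accurate Content-Length, which clients generally need before they will attempt to resume.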

Restrictions:

  • Works for uncompressed archives only.
  • Requires command line utility to construct .zip.mz file.
  • Currently only supported on Linux with GNU bash, gcc, and make.

Building and Installing Metazip

Building Metazip should be as simple as typing “make all”. This checks prerequisites, compiles all source files, links the executables, and builds the mod_metazip.so shared library.

Optionally, you can test Metazip before installing it: just run “make test”. This starts Apache in debug mode, so you can point your browser to http://localhost:8000/ and test Metazip interactively.

If all goes well, you can install Metazip using “make install”. However, this may require root access, depending on where Apache is installed. Type “make -n install” to see what commands are run during installation.

Apache Configuration

Metazip is easy to configure. Just add a few lines to your httpd.conf:

# Load the metazip module.
LoadModule metazip_module modules/mod_metazip.so

# Associate .mz files with Metazip.
AddType application/x-metazip .mz

# Enable the filter for Metazip files.
FilterDeclare METAZIP
FilterProvider METAZIP METAZIP resp=Content-Type application/x-metazip
FilterChain METAZIP

# Set the "X-Metazip: enabled" header.
RequestHeader append X-Metazip enabled

# Set the relative directory for .mz files.
<Directory /path/to>
MetazipDirectory /path/to/dir
</Directory>

Note: When a user requests a .mz file, Metazip sets the Content-Disposition header with the archive filename (without the .mz extension). Therefore, your users will always be prompted with the “correct” filename.

Optionally, you can use mod_rewrite to “hide” the .mz extension. If a user requests a non-existent archive (say, example.zip), then mod_rewrite can replace this with a request for the corresponding .mz file (example.zip.mz). This allows you to integrate Metazip without having to change existing links to archives.

To use mod_rewrite with Metazip, just add the following to the .htaccess file in your document root (or within a <Directory> section in httpd.conf):

# If the requested archive doesn't exist,
# try the corresponding .mz file.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME}    !-f
RewriteCond %{REQUEST_FILENAME}.mz  -f
RewriteRule \.(zip|tar|rar)$ %{REQUEST_URI}.mz
</IfModule>

Note: You may need to dynamically load mod_rewrite, using the LoadModule directive in httpd.conf:

LoadModule rewrite_module modules/mod_rewrite.so
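
One detail of the extension pattern in the RewriteRule above is worth checking by hand: in a regular expression, `|` binds more loosely than `$`, so the end-of-string anchor only covers all three extensions when they are grouped. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption, though alternation and anchoring behave the same way for these simple patterns), and the sample paths are made up.

```python
import re

# Without a group, "$" applies only to the last alternative (.rar).
ungrouped = re.compile(r"\.zip|\.tar|\.rar$")
# With a group, all three extensions must appear at the end of the path.
grouped = re.compile(r"\.(zip|tar|rar)$")

paths = ["show.zip", "show.tar", "show.rar", "show.zip.md5", "notes.tar.txt"]
for path in paths:
    print(path, bool(ungrouped.search(path)), bool(grouped.search(path)))

# The ungrouped pattern also matches ".zip" in the middle of a name.
assert ungrouped.search("show.zip.md5") is not None
assert grouped.search("show.zip.md5") is None
```

In practice the RewriteCond lines already limit the rule to requests that have a matching .mz file, so the grouping mostly guards against surprising matches on names like show.zip.md5.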

Requirements

Metazip depends on several GNU extensions, so GNU bash, gcc, and make are required. Of course, it is an Apache 2.x module, so the Apache development environment is required. On Red Hat systems, the required package is “httpd-devel”. On Debian systems, it is “apache2-dev”.


metazip 0.1 Released

So I finally got around to releasing a project I’ve had sitting around for more than a year. It’s metazip (or mod_metazip):

metazip is an Apache 2.0 module that allows you to serve dynamically-constructed, uncompressed zip files. Thus, your users can choose to download individual files or entire collections, but you don’t have to reserve disk space for both.

I originally wrote metazip to solve a problem at Archive.org. They have an amazing collection of freely-downloadable music, most of it being complete live shows of bands that allow free distribution. It’s really the logical, modern extension of the good ol’ days of Grateful Dead tape-trading. And, indeed, Archive.org has thousands of Dead shows free for the listening. Check it out!

Anyway, Archive.org allows you to download an entire show as a single .zip file. You can also select individual tracks. They employ a Perl script that dynamically creates these zip files on the fly. This seems logical, since otherwise they would consume double the disk space (and their archive is immense)!

Unfortunately, the script has a rather fatal flaw: it doesn’t support resumption of downloads, a topic of much consternation in their forums. (Also see my post on the subject, which includes a quirky workaround I concocted.) This can be mighty frustrating if your Internet connection craps out 95% of the way through a 1GB download! Most browsers and other HTTP clients support resumption, which would allow you to download that last 5% without starting over.

For more information about metazip, how it works, and how to set it up, see the metazip web site. It’s an open-source project, so feel free to contribute to its development at the SourceForge project page.


Don’t go there: Keeping the Unix ‘find’ command out of your CVS and Subversion directories.

You’ve avoided learning the ins-and-outs of the Unix find command because it doesn’t play nice within your Subversion and CVS working directories? Well then, I’ve got just the solution!

Don’t want to read my ridiculous blathering? No problem! Just download the free, open source code that “fixes” find.
