Want to hold gzip'd files on the webserver and hand out compressed data to the browsers that can cope and uncompress it for browsers which cannot?
Mod_gunzip is Helge Oldach's Apache module which provides this functionality for static data.
Download the mod_gunzip.tar.gz file from Helge Oldach's http://www.oldach.net/ site and unpack it with:
tar -zxvf mod_gunzip.tar.gz
Tested with Apache 1.3.12, probably from a Red Hat 7.0 install.
An earlier mod_gunzip-1-2 version also exists. This works but it will always decompress the data; it will not send the compressed data to browsers which say they can handle it. Version 2 of mod_gunzip gives Apache the ability to return the uncompressed data only when needed.
Compile the module with the apxs utility, which is part of the Apache Development Environment (in the way that Red Hat packages things, anyway):
apxs -c -lz mod_gunzip.c
Place the resulting mod_gunzip.so module in the /usr/httpd/modules directory (or the /usr/lib/apache/ directory, depending on your version or distribution). This .so file is read into Apache through dynamic loading/linking; the file is read from disc and made part of the running Apache system.
Alternatively, let apxs install the module and edit the httpd.conf file to reference it:
apxs -i -a -c -lz mod_gunzip.c
Decide how you want to handle your compressed files. There is a choice of:
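Compress the files and rename each .gz file back to html, option A, for example by:
gzip -9 index.html && mv index.html.gz index.html
Or rename each compressed file to a distinct extension such as .htmz, option B, for example by:
gzip -9 index.html && mv index.html.gz index.htmz
The disadvantage of .htmz files is that you need to change your links.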
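For batches of pages, the two gzip and mv commands above can be scripted. The following is a minimal Python sketch, not part of mod_gunzip; the function name compress_page is hypothetical. Passing the same path as source and target corresponds to option A, and passing a .htmz target corresponds to option B.

```python
import gzip
from pathlib import Path


def compress_page(source, target):
    """Compress source at level 9 into target, like 'gzip -9 f && mv f.gz target'.

    The uncompressed original does not survive, just as with gzip itself.
    """
    source, target = Path(source), Path(target)
    data = source.read_bytes()
    # Write to a temporary name first so a failure never leaves a half-written page.
    tmp = target.with_name(target.name + ".tmp")
    with gzip.open(tmp, "wb", compresslevel=9) as out:
        out.write(data)
    tmp.replace(target)
    if source != target:
        source.unlink()


# Example (option B): compress_page("index.html", "index.htmz")
```

Keep an uncompressed copy of the site elsewhere if you edit the pages by hand, since the script replaces the originals.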
The next step is to edit the Apache configuration files. Add the lines
LoadModule gunzip_module /usr/httpd/modules/mod_gunzip.so
AddModule mod_gunzip.c
in amongst the similar looking lines of the httpd.conf file.
The LoadModule and AddModule directives need to be included in the server configuration file; loading a module is thus the responsibility of the server admin.
Once loaded, however, the behaviour of the mod_gunzip module can be controlled on a per-directory basis in the .htaccess files.
No, you have not copied a mod_gunzip.c file anywhere. You may have notes in the httpd.conf file saying you need to add both these lines. Yes, these are correct: the module does not work without the AddModule line.
LoadModule brings the module code 'into' Apache through the dynamic loading process and AddModule tells Apache that it can use it.
If you are compressing the html files and renaming them back to .html you need to add a line:
AddHandler send-gunzipped .html
to your config or .htaccess file.
Handlers are briefly described in the Apache What is a handler? documentation.
AddHandler tells Apache that there is a module wanting to have a look at files with a .html extension. The mod_gunzip code picks up the reference because it is looking for 'send-gunzipped'.
If the file is not compressed then mod_gunzip will not do anything with the file; it just 'passes' and lets Apache or another module hand back the html to the browser.
There is a slight overhead in serving pages if using mod_gunzip and the data is not compressed. There is also the distinctly human difficulty of remembering whether your source file is compressed or not.
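One way to take the guesswork out of which source files are compressed is to look at the first two bytes of each file, since every gzip stream starts with the bytes 1f 8b. The sketch below is a helper for the site maintainer only; it makes no claim about how mod_gunzip itself detects compressed files, and the names is_gzipped and list_pages are hypothetical.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # leading bytes of any gzip stream (RFC 1952)


def is_gzipped(path):
    """Return True if the file at path starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def list_pages(root, pattern="*.html"):
    """Return (path, is_gzipped) pairs for every matching file below root."""
    return [(p, is_gzipped(p)) for p in sorted(Path(root).rglob(pattern))]


# Example: for page, packed in list_pages("/var/www/html"): print(packed, page)
# (the document root above is only an illustration)
```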
It is also possible to hold compressed data in .htmz files (using this extension as an example).
The configuration files would then need to include two extra lines instead of the one:
AddType text/html .htmz
AddHandler send-gunzipped .htmz
where the first is saying that the MIME type of .htmz files is html text. Without this line the browser is likely to show the raw HTML.
It is then useful to edit the DirectoryIndex line to include index.htmz:
DirectoryIndex index.htmz index.html index.htm....
Without this Apache will not serve the compressed index file if it is given a URL without the 'file' component, that is, a URL that names only the directory.
This directive has to be set in the Apache configuration file (not the .htaccess file).
A telnet session is fine for testing. Connect to the webserver with:
telnet localhost 80
and type the HTTP commands:
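GET /index.html HTTP/1.1
Host: localhost
followed by a blank line (HTTP/1.1 requires the Host line). This should give you uncompressed text whether or not the original is compressed (this is assuming a setup as per option A).
GET /index.html HTTP/1.1
Host: localhost
Accept-Encoding: gzip
should give you the compressed data (assuming index.html is a compressed file) with a header line of
Content-Encoding: gzip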
The entry in the Apache access log will show the size of the data sent down the wire: the size of the gzip'd data if the browser accepts it, or the size of the uncompressed data if mod_gunzip has to uncompress it.
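The telnet test can also be scripted so that it is easy to repeat after a configuration change. This is a rough sketch using a plain socket to mirror what is typed into telnet; the function names build_request and fetch_headers are hypothetical, and the Connection: close line is an addition so that the server hangs up after one response.

```python
import socket


def build_request(path, host, accept_gzip):
    """Return the raw bytes of a GET request, optionally advertising gzip support."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    if accept_gzip:
        lines.append("Accept-Encoding: gzip")
    # A blank line ends the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_headers(host, path, accept_gzip, port=80):
    """Send the request and return the response status and header lines."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(path, host, accept_gzip))
        raw = b""
        while b"\r\n\r\n" not in raw:
            chunk = sock.recv(4096)
            if not chunk:
                break
            raw += chunk
    head = raw.split(b"\r\n\r\n", 1)[0]
    return head.decode("iso-8859-1").split("\r\n")


# Example, against a running server:
#   print(fetch_headers("localhost", "/index.html", accept_gzip=False))
#   print(fetch_headers("localhost", "/index.html", accept_gzip=True))
```

With mod_gunzip set up as in option A, only the second call would be expected to show a Content-Encoding: gzip line among the headers.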
Is it better to hold files compressed and uncompress them when needed, or the other way round? That is, is it better to set up mod_gunzip or mod_gzip?
Most likely the first option, mod_gunzip. Most browsers can handle gzip compressed data now, and the overhead for the browsers and robots which cannot is fairly low. It is easier on the server to uncompress data when required than to compress it, since gunzip runs faster than gzip, and you are able to use higher levels of compression (gzip -9) as you are compressing offline.
The two modules have different strengths. Mod_gzip compresses dynamically generated content on the fly (which includes Server Side Includes and CGI output) and, since version 1.3.19a, can also return compressed copies of pages on disc.
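The claim that uncompressing is cheaper for the server than compressing can be checked on your own hardware. The following is a small timing sketch using Python's gzip module rather than the gzip and gunzip programs, so the absolute numbers are only indicative; the sample data and the helper name time_call are made up for the illustration.

```python
import gzip
import time


def time_call(func, data, repeats=5):
    """Return (average seconds per call, last result) for func(data)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = func(data)
    return (time.perf_counter() - start) / repeats, result


# Made-up sample: roughly 1 MB of repetitive markup.
sample = b"<p>Some repetitive sample markup for timing.</p>\n" * 20000

compress_secs, packed = time_call(lambda d: gzip.compress(d, compresslevel=9), sample)
expand_secs, unpacked = time_call(gzip.decompress, packed)

print(f"original {len(sample)} bytes, compressed {len(packed)} bytes")
print(f"compress (level 9): {compress_secs:.4f} s per call")
print(f"decompress:         {expand_secs:.4f} s per call")
```

Running it against a few of your real pages, read from disc instead of the synthetic sample, gives a fairer picture.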
If looking to mod_gunzip to reduce the storage requirements on the webhost then compare the disc space allocated (with du) rather than the size of the file in bytes. File systems allocate space in terms of blocks or segments of several Kbytes, so compressing small files may not reduce the quota.
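The difference between the byte size and the allocated space can be seen from a script as well as with du. This sketch assumes a Unix-like system where st_blocks is available and counted in 512-byte units (as on Linux); the function name apparent_and_allocated is hypothetical.

```python
import os


def apparent_and_allocated(path):
    """Return (size in bytes, bytes allocated on disc) for one file.

    Assumes st_blocks is reported in 512-byte units, which holds on Linux
    but is not guaranteed on every platform.
    """
    info = os.stat(path)
    return info.st_size, info.st_blocks * 512


# Example: compare a page before and after compression.
#   print(apparent_and_allocated("index.html"))
```

If both the original and the compressed page fit in a single block, the second number is the same for both and the quota does not change.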
This howto concentrates specifically on html files; the AddHandler commands only tell Apache to treat .html or .htmz files as compressed.
This lightly sidesteps some of the traps presented by some browsers which have trouble with other kinds of compressed content.
Mod_gunzip recognises the header lines 'Accept-Encoding: gzip' and 'Accept-Encoding: x-gzip' on incoming requests and responds with a 'Content-Encoding: gzip' if it passes back gzip compressed data.
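The decision described above can be sketched in a few lines. This is an illustration of the behaviour as described in this howto, not the module's actual C code; the function names are hypothetical, and q-values in the Accept-Encoding header (such as gzip;q=0) are deliberately ignored to keep the sketch short.

```python
import gzip


def accepts_gzip(accept_encoding):
    """True if the Accept-Encoding value lists gzip or x-gzip (q-values ignored)."""
    if not accept_encoding:
        return False
    tokens = [item.split(";")[0].strip().lower() for item in accept_encoding.split(",")]
    return "gzip" in tokens or "x-gzip" in tokens


def respond(accept_encoding, stored_gzip_bytes):
    """Return (extra response headers, body) for a page held compressed on disc."""
    if accepts_gzip(accept_encoding):
        # The client can cope: pass the stored bytes through untouched.
        return {"Content-Encoding": "gzip"}, stored_gzip_bytes
    # The client cannot cope: uncompress on the server before sending.
    return {}, gzip.decompress(stored_gzip_bytes)


# Example: headers, body = respond("gzip, deflate", open("index.html", "rb").read())
```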
The practice amongst older browsers was to send 'x-gzip' rather than 'gzip'. See the HTTP 1.1 spec, RFC 2616:
In the meantime the term "gzip" seems to have been adopted as a
description of the compression algorithm as much as the name of
the program and Apache itself treats
Why use .htmz rather than .html.gz?
This is a matter of personal preference but concatenated extensions add additional complication to the Apache configuration.
With a .htmz extension it is practical to set the MIME type to be returned with
AddType text/html .htmz
Whereas a file with an appended .gz extension is considered deliberately compressed and sent with the MIME type application/x-gzip. The browser then prompts you to save to disc. With Apache 1.3.12, at least, adding an
AddType text/html .html.gz
line does not solve the problem (presumably because the AddEncoding config is still saying .gz files are gzipped, but this should not really affect the MIME type?).
Do any browsers make use of the TE: gzip request and Transfer-Encoding? This is mentioned in the Mozilla Performance HTTP Compression page but does not seem to be used by Mozilla. Opera sends both the TE and Accept-Encoding headers.
The difference seems to be that using Transfer-Encoding is completely unambiguous: the compression done is simply for the transport and the data would be presented uncompressed in the browser. Accept-Encoding does not carry a clear implication that the browser will uncompress the data or offer to save it compressed.
Mod_gunzip makes an assumption that the browser will uncompress and display any content flagged with a Content-Encoding: gzip response.
According to a description of the differences between HTTP 1.0 and 1.1 there was not a formal specification of HTTP 1.0, RFC 1945 being a description of 'common usage' at the time rather than a formal standard. See the comparisons of: