Get Apache to act as a gzip proxy

Let’s say you want to repeatedly get a large amount of data (text, or something not in an already compressed format) from a RESTful webservice. But the webserver doesn’t compress the data when transferred over HTTP, and you have a slow connection on your end machine (such as your development machine). And so it takes your end client a while to load the data on every iteration, thus slowing down whatever you’re doing.

In this scenario you can use a server in the middle which has a fast connection, to act as a proxy and gzip the data for you. This server could be hosted in the cloud somewhere.

First you’ll need to make sure that Apache is installed on this server, and the firewall allows access to port 80, or 443 if you’re going to be using HTTPS. You’ll need to make sure that the following Apache modules are installed: mod_proxy and mod_deflate.

Let’s say the data from the webservice you’re trying to get to is located at the URL http://www.webservice.com/data/json, and returns a giant block of json data. The configuration for this would be simple. Edit /etc/httpd/conf/httpd.conf (or wherever httpd.conf is according to your system) and add the following:

<IfModule mod_proxy.c>
    ProxyRequests On

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass /compressed http://www.webservice.com/data/json
    ProxyPassReverse /compressed http://www.webservice.com/data/json

    <Location /data>
        Order allow,deny
        Allow from all
        AddOutputFilterByType DEFLATE application/json
    </Location>
</IfModule>

And voila! Now when you visit the /compressed path on your middle-man Apache server, it will proxy and compress the json data from the upstream server before it ships it to you. So if your servers’s IP is 1.2.3.4, you’d use the URL http://1.2.3.4/compressed, and that will proxy to http://www.webservice.com/data/json and return the data compressed to your end client with the slow Internet connection, which will load much faster. Let’s say it’s 1mb of regular json text data, which should easily compress to 200kb or so, which will load 5x faster!

Note: you’ll of course need to make sure that the HTTP client you’re using supports gzip compression. If you’re doing this programatically, whatever HTTP API you’re using may allow you to do this. Or you’ll need to manually add the “Accept-Encoding: gzip” header, so the middle-man server knows to compress the data, and whatever content you get back, you’ll need to first decompress manually.