Streamed File Zipping and Downloading in PHP

Originally published on http://engineering.weebly.com/

Did you notice that on Dropbox.com, you can select a folder and start downloading it while it’s being zipped? This on-the-fly zipping feature comes in handy for both the user and the server: as a user you don’t have to wait until the files are zipped on the back end before the download starts, and the server doesn’t have to create a temporary zip file and delete it afterwards.

In this case the browser has no idea when the streamed zip file ends or how big the whole zip file will be, so what the user sees is a download in progress with no total size or remaining-time estimate.

In the Network tab, this is what such a request looks like:

Request Headers (excerpt):
accept:text/plain, */*; q=0.01
accept-encoding:gzip, deflate
content-length:205
content-type:application/x-www-form-urlencoded; charset=UTF-8

Response Headers (excerpt):
content-disposition:attachment; filename="batch_download.zip"
content-type:application/x-zip
content-transfer-encoding:binary
date:Fri, 16 Dec 2016 20:01:47 GMT
server:nginx
Status:200

Note that the content-disposition header is set to attachment. This tells the browser to treat the response as a file download rather than attempting to display it as a web page. For details on how this header works, please refer to the MDN article on Content-Disposition.

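To implement this feature with PHP, we handle the zipping and streaming work on the back end using ZipStream. Assuming your files are on the server and can be loaded with file_get_contents given a path, the flow is to create a ZipStream instance named after the download (batch_download.zip in the headers above), add each file by its name and path (in the ZipStream-PHP library this is addFileFromPath($file_name, $file_path)), and then call $zip->finish().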

Under the hood, ZipStream takes care of setting the response headers (content-disposition, content-type, etc.) before it sends the first bytes, which makes sure that the browser treats the response as a file download. It uses fwrite to send the data stream, and calling $zip->finish() writes the closing records of the archive and ends the stream.

If your files are hosted on cloud services such as AWS/S3, you’ll need to use the appropriate method to load the file’s content, and call ZipStream’s lower-level method $zip->addFile($file_name, $file_content). See the following example with AWS/S3:

// $client is assumed to be an already-configured Aws\S3\S3Client
$file_name = 'my_awesome_picture.jpg';
$file = $client->getObject([
    'Bucket' => 'my_pictures_bucket',
    'Key'    => $file_name,
]);
// Read the whole object body into a string
$file_content = (string) $file['Body'];
// Unlike the path-based call, the second parameter here is the file content
$zip->addFile($file_name, $file_content);

On the front end, all we need to do is send a request whose payload identifies the files and/or folders the user selected; the exact shape of that payload depends on your app. It’s also worth mentioning that this request cannot be made via Ajax for security reasons: all Ajax can do here is receive the zipped file as a binary string, and it cannot initiate a file download. We have to resort to a lower-level HTML <form> instead. Insert an auxiliary, hidden <form> anywhere in your DOM, with the id downloader-form that the script below looks up and the URL of your download endpoint as its action.
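The auxiliary form can be as small as an empty, hidden element that posts to the download endpoint. As a minimal sketch, the same form could also be created from JavaScript. It assumes the id downloader-form used by the script later in this post and a hypothetical /batch-download endpoint (the path that appears in the console message further down); createDownloaderForm is an illustrative helper, not part of any library.

```javascript
// Illustrative helper (not from the original post): builds the hidden
// auxiliary form. `doc` is the page's `document`; `action` is the
// hypothetical URL of the zip-streaming endpoint.
function createDownloaderForm(doc, action) {
  var form = doc.createElement('form');
  form.id = 'downloader-form';   // id the later script looks up
  form.method = 'POST';          // POST keeps the file list out of the URL
  form.action = action;
  form.style.display = 'none';   // the form is never shown to the user
  doc.body.appendChild(form);
  return form;
}

// In a browser: createDownloaderForm(document, '/batch-download');
```

Passing the document in as a parameter is only there to keep the helper easy to exercise outside a browser; writing the equivalent <form> tag directly in your markup works just as well.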

Note that although this request fetches a zipped file from the back end, we use the POST method here simply because the request payload can get pretty long if the user selects a lot of files/folders. With GET, the payload would have to travel in the URL, and the request may end up with an HTTP 414 (URI Too Long) error.
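To get a feel for how quickly a GET URL grows, here is a rough sketch that builds the query string a GET request would need for a given number of UUIDs. The parameter name file_ids[] and the /batch-download path are assumptions for illustration, and the length at which a server answers with 414 depends on its configuration, so no specific limit is assumed here.

```javascript
// Builds the URL a GET request would need to carry the selection.
function buildGetUrl(path, fileIds) {
  var query = fileIds.map(function (id) {
    return encodeURIComponent('file_ids[]') + '=' + encodeURIComponent(id);
  }).join('&');
  return path + '?' + query;
}

// 500 copies of a placeholder UUID (36 characters each), for illustration only
var ids = [];
for (var i = 0; i < 500; i++) {
  ids.push('df491ae4-9d00-4674-b565-e4e5943f55b4');
}

var url = buildGetUrl('/batch-download', ids);
console.log(url.length); // 26015
```

With POST, the same identifiers travel in the request body instead, so the URL stays short no matter how many files are selected.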

The next thing is to load data into this form (by inserting <input>s) and submit it via JavaScript:

var $form = $('#downloader-form');
// Append the CSRF token for the POST request
$form.append($('<input>').attr({
    type: 'hidden',
    name: '_token',
    value: '<your_CSRF_token>'
}));
// User-selected files, assuming files are identified by UUIDs
$form.append($('<input>').attr({
    type: 'hidden',
    name: 'file_ids',
    value: ['df491ae4-9d00-4674-b565-e4e5943f55b4', …]
}));
$form.submit();
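One detail in the snippet above: jQuery’s attr() converts the array to a string, so the server receives file_ids as a single comma-separated value. If you would rather have PHP parse the IDs into an array, one option is to emit one input per ID under the name file_ids[]. The sketch below only builds the name/value pairs; buildDownloadFields is a hypothetical helper, and the commented jQuery lines assume the same #downloader-form as above.

```javascript
// Hypothetical helper: turns a CSRF token and a list of file IDs into
// the hidden fields to append to the form. Naming the field "file_ids[]"
// makes PHP collect the values into an array ($_POST['file_ids']).
function buildDownloadFields(csrfToken, fileIds) {
  var fields = [{ name: '_token', value: csrfToken }];
  fileIds.forEach(function (id) {
    fields.push({ name: 'file_ids[]', value: id });
  });
  return fields;
}

// Browser usage with jQuery (assumes the #downloader-form from above):
// var $form = $('#downloader-form').empty(); // drop inputs left by earlier clicks
// buildDownloadFields(token, ids).forEach(function (field) {
//   $form.append($('<input>').attr({
//     type: 'hidden', name: field.name, value: field.value
//   }));
// });
// $form.submit();
```

Emptying the form before refilling it also keeps inputs from piling up when the user triggers several downloads without reloading the page.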

A couple of follow-up issues are worth mentioning:

(1) As the form submission completes, the browser interprets the response of your form as a document, but one of the response’s headers, content-type, indicates that the MIME type is application/x-zip instead of a valid document MIME type such as text/html. So you’ll probably see this log in your console:

Resource interpreted as Document but transferred with MIME type application/zip: "http://yoursite.com/batch-download".

This warning is understandable, since <form> was intended in the first place to submit a form and refresh the page to display the server response as a document. There doesn’t seem to be a workaround for it.

(2) You may be tempted to listen for a completion event on the form submission so that you can, for instance, re-enable your Download button. However, this cannot be achieved. To summarize the discussions of this issue: when you post a form, the form inputs are sent to the server and your page is supposed to be replaced by the response, so the browser gives the submitting page no event for when the response finishes.

Ye Tian

Full Stack Software Engineer at Weebly