Sending a big file with minimal memory in Golang

If you are also looking for how to handle large files in multipart POST on server side, please see this post.

A hungry gopher

The most common way of uploading file(s) over HTTP is splitting them into multiple parts (multipart/form-data). This structure helps greatly, as we can also attach fields alongside and send them all in one single request.

A typical multipart request (example from Mozilla):

POST /foo HTTP/1.1
Content-Length: 68137
Content-Type: multipart/form-data; boundary=---------------------------974767299852498929531610575

-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="description"

some text
-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="myFile"; filename="foo.txt"
Content-Type: text/plain

(content of the uploaded file foo.txt)
-----------------------------974767299852498929531610575--

We will start with this simple implementation; the standard package mime/multipart has our back:

buf := new(bytes.Buffer)
writer := multipart.NewWriter(buf)
part, err := writer.CreateFormFile("myFile", "foo.txt")
if err != nil {
    return err
}
file, err := os.Open(name)
if err != nil {
    return err
}
defer file.Close()
if _, err = io.Copy(part, file); err != nil {
    return err
}
// Close the multipart writer before posting, so the terminating
// boundary is written into buf.
if err = writer.Close(); err != nil {
    return err
}
http.Post(url, writer.FormDataContentType(), buf)

multipart.Writer automatically encloses the parts (files, fields) in boundary markup before sending them ❤️ We don't need to get our hands dirty.


The above works great until you run a benchmark and see that memory allocation grows linearly with the file size. So what went wrong? It turns out that buf ends up buffering the whole file content: it sequentially reads a modest 32kB from the file at a time, but it won't stop until it reaches EOF, so to hold the file content buf needs to grow to at least the file size, plus some additional boundary markup.

HTTP/1.1 has a way of transferring data in chunks of unbounded total size, without specifying a Content-Length for the request body (chunked transfer encoding). This is an important feature we can utilize.

So buf causes the problem, but how can we stream the file content into the request without it? io.Pipe was born for tasks like this; as the name suggests, it pipes a writer to a reader:

r, w := io.Pipe()
m := multipart.NewWriter(w)
go func() {
    defer w.Close()
    defer m.Close()
    part, err := m.CreateFormFile("myFile", "foo.txt")
    if err != nil {
        // Propagate the error to the reading side, so http.Post
        // fails instead of sending a truncated body.
        w.CloseWithError(err)
        return
    }
    file, err := os.Open(name)
    if err != nil {
        w.CloseWithError(err)
        return
    }
    defer file.Close()
    if _, err = io.Copy(part, file); err != nil {
        w.CloseWithError(err)
        return
    }
}()
http.Post(url, m.FormDataContentType(), r)

If you dump the request above, the header reads:

POST / HTTP/1.1
...
Transfer-Encoding: chunked
Accept-Encoding: gzip
Content-Type: multipart/form-data; boundary=....
User-Agent: Go-http-client/1.1

Yep, net/http has handled the Transfer-Encoding and removed Content-Length without us manually doing anything. Wanna see the difference in memory allocation?

Benchmark sending a 16MB file:
33471060 B/op
   84767 B/op

Guess which one is the second approach? 😉