File MIME types in Django

Allowing people to upload files to your server can be fraught with peril: we don’t want people to upload malicious files or binaries. In some cases we want to limit uploads to very specific file types, PDF or DOCX for example.

Python has a mimetypes module that can be used to guess the MIME type of a file. Let’s say we upload a new JPEG file:

import mimetypes
file_mime = mimetypes.guess_type('a_file.jpg')
>> ('image/jpeg', None)

This is great! Python has guessed that the file is a JPEG. But what if we upload a file that is actually a PDF with a .jpg extension:

import mimetypes
file_mime = mimetypes.guess_type('sneaky_pdf_file.jpg')
>> ('image/jpeg', None)

Yup. mimetypes only uses the filename to guess the type, which is okay, but not good enough. python-magic to the rescue. python-magic is a wrapper for libmagic, which you may need to install separately; python-magic itself can be installed with pip:

pip3 install python-magic
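
As an aside, the reason libmagic does better than mimetypes is that it looks at the file's contents rather than its name. The following is only a rough, standard-library sketch of that idea, not how libmagic is implemented: the SIGNATURES table and the sniff_signature helper are hypothetical names made up for illustration, and only two well-known leading byte sequences are checked.

```python
import mimetypes

# Hypothetical, deliberately tiny signature table: leading bytes -> MIME type.
# libmagic's real database covers far more formats and more subtle rules.
SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\xff\xd8\xff": "image/jpeg",
}


def sniff_signature(data):
    """Return a MIME type based on the leading bytes, or None if unknown."""
    for prefix, mime in SIGNATURES.items():
        if data.startswith(prefix):
            return mime
    return None


# A PDF's bytes, pretending they were uploaded as 'sneaky_pdf_file.jpg'.
pdf_bytes = b"%PDF-1.7\n%fake body for illustration\n"

name_guess = mimetypes.guess_type("sneaky_pdf_file.jpg")[0]  # looks at the name only
content_guess = sniff_signature(pdf_bytes)                   # looks at the bytes only

print(name_guess, content_guess)
```

The name-based guess and the content-based guess disagree here, which is exactly the mismatch we want to catch. In practice python-magic is the better tool, since it draws on libmagic's much larger signature database.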

With python-magic installed, we can check a file’s MIME type either from a path on disk or from an in-memory file:

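import magic


def file_path_mime(file_path):
    # Inspect the file's contents on disk, not its name.
    mime = magic.from_file(file_path, mime=True)
    return mime


def check_in_memory_mime(in_memory_file):
    # Inspect the contents of an already-open file object.
    mime = magic.from_buffer(in_memory_file.read(), mime=True)
    in_memory_file.seek(0)  # rewind so the file can be read again later
    return mime


file_path_mime('/path/to/sneaky_pdf_file.jpg')
>> 'application/pdf'
file_path_mime('path/to/a_file.jpg')
>> 'image/jpeg'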

The check_in_memory_mime function works for Django file uploads pulled directly from request.FILES on POST.
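
To tie this back to limiting uploads to specific types, here is one way the check might be turned into a yes/no decision. This is a sketch under assumptions rather than a definitive implementation: ALLOWED_MIMES, sniff_with_magic and upload_is_allowed are hypothetical names, the allow-list contains only PDF for brevity, and the sniffing function is passed in as a parameter so the logic can be tried out without libmagic installed. Reading only the first 2048 bytes avoids pulling a large upload into memory; some formats may need more data to be identified reliably.

```python
import io

# Hypothetical allow-list; extend it with whatever types your application accepts.
ALLOWED_MIMES = {"application/pdf"}


def sniff_with_magic(data):
    # Imported lazily so the rest of the sketch runs without python-magic.
    import magic
    return magic.from_buffer(data, mime=True)


def upload_is_allowed(uploaded_file, allowed=ALLOWED_MIMES, sniff=sniff_with_magic):
    """Return True if the file's sniffed MIME type is in the allow-list."""
    head = uploaded_file.read(2048)  # only the leading bytes are inspected
    uploaded_file.seek(0)            # rewind so the file can still be saved
    return sniff(head) in allowed


# Demonstration with an in-memory file and a stand-in sniffer.
# In a Django view, the file would come from request.FILES instead,
# and the default sniff_with_magic would be used.
fake_upload = io.BytesIO(b"%PDF-1.7\n%fake body for illustration\n")
accepted = upload_is_allowed(fake_upload, sniff=lambda data: "application/pdf")
rejected = upload_is_allowed(fake_upload, sniff=lambda data: "image/jpeg")
print(accepted, rejected)
```

Because the function rewinds the file after reading, it can be called before the upload is saved without affecting what gets written.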

This post was inspired by Martin Eve, who uploaded some arbitrary files to Janeway during testing of our new Preprints module.