How to Detect File Type Using JavaScript?

Bytefer
Programming of EarthOnline
Jul 24, 2022


File upload is a very common feature in day-to-day development. In some cases we want to restrict which file types can be uploaded, for example allowing only images in PNG format. The first solution that comes to mind is the accept attribute of the input element:

<input type="file" id="inputFile" accept="image/png" />

Although this solution covers most scenarios, a user who renames a JPEG image so that it has a .png suffix can easily get around the restriction. So how should this problem be solved? We can identify the real file type by reading the file's binary data. Before looking at the implementation, let's go over some background knowledge.

How to view the binary data of a picture?

To view the binary data of an image, you can use a hex editor such as WinHex on Windows or Synalyze It! Pro on macOS. Here, however, we use the Binary Viewer extension for Visual Studio Code to view the binary data of my avatar image.
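If no hex editor is at hand, the leading bytes of a file can also be printed with a few lines of JavaScript. The following is only a minimal sketch: the helper name toHex is hypothetical, and the sample bytes are the well-known 8-byte PNG signature plus two arbitrary bytes, not data read from a real file.

```javascript
// Hypothetical helper: format bytes as an uppercase, space-separated hex string.
function toHex(bytes) {
  return Array.from(bytes, (byte) =>
    byte.toString(16).padStart(2, "0").toUpperCase()
  ).join(" ");
}

// Sample data (an assumption for illustration): the PNG signature plus two extra bytes.
const sample = new Uint8Array([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00,
]);

// Print only the first 8 bytes, the part a hex editor shows at the top.
console.log(toHex(sample.subarray(0, 8))); // 89 50 4E 47 0D 0A 1A 0A
```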

How to distinguish the types of pictures?

Programs that inspect a file's content do not rely on the suffix to tell picture types apart; they rely on the file's "magic number". For many file types the first few bytes are fixed, so the type of the file can be determined from those bytes.

The magic numbers corresponding to common image types are shown in the following figure:

[Figure: magic numbers of common image types]
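For reference, a few widely documented signatures can be written down as plain data. This is a sketch, not an exhaustive list: the names MAGIC_NUMBERS and lookupImageType are hypothetical, and the GIF and BMP entries are added here for illustration rather than taken from this article.

```javascript
// Hypothetical lookup table: leading bytes ("magic numbers") of common image formats.
const MAGIC_NUMBERS = {
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  "image/jpeg": [0xff, 0xd8, 0xff],
  "image/gif": [0x47, 0x49, 0x46, 0x38], // "GIF8", shared by GIF87a and GIF89a
  "image/bmp": [0x42, 0x4d], // "BM"
};

// Return the first MIME type whose signature matches the start of `bytes`,
// or null when none of the listed signatures match.
function lookupImageType(bytes) {
  for (const [mime, signature] of Object.entries(MAGIC_NUMBERS)) {
    if (signature.every((value, index) => bytes[index] === value)) {
      return mime;
    }
  }
  return null;
}

console.log(lookupImageType([0x47, 0x49, 0x46, 0x38, 0x39, 0x61])); // "image/gif"
console.log(lookupImageType([0x00, 0x01])); // null
```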

Next, let's use the Binary Viewer extension to verify the image type of my avatar.

As can be seen from the above figure, the first 8 bytes of a PNG image are 0x89 50 4E 47 0D 0A 1A 0A. If you rename bytefer-avatar.png to bytefer-avatar.jpeg and open it in the editor again, you will find that the first 8 bytes of the file are unchanged. But if you read the file through an input[type="file"] element, the following File object is logged:

File
lastModified: 1658647747405
lastModifiedDate: Sun Jul 24 2022 15:29:07
name: "bytefer-avatar.jpeg"
size: 47318
type: "image/jpeg"
webkitRelativePath: ""
[[Prototype]]: File

As the name and type fields show, neither the file extension nor the MIME type reported by the browser reveals the real file type; the browser typically infers the type from the extension. Next, we will see how to determine the real image type at upload time by reading the binary data of the image.

How to detect the type of a picture?

1. Define the readBuffer function

After getting the File object, we can read its contents through the FileReader API. Because we only need the first few bytes rather than the whole file, we wrap the logic in a readBuffer function that reads a given byte range of the file.

// Read bytes [start, end) of the file and resolve with an ArrayBuffer.
function readBuffer(file, start = 0, end = 2) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => {
      resolve(reader.result);
    };
    reader.onerror = reject;
    // slice() avoids loading the whole file into memory.
    reader.readAsArrayBuffer(file.slice(start, end));
  });
}

For PNG images, the first 8 bytes of the file are 0x89 50 4E 47 0D 0A 1A 0A. Therefore, to detect whether the selected file is a PNG image, we only need to read the first 8 bytes and compare them with this signature byte by byte.
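In current browsers, and in Node.js 18 or later, Blob also exposes a promise-based arrayBuffer() method, so the same range read can be sketched without FileReader. The function name readRange is hypothetical, and the Blob built from literal bytes merely stands in for a user-selected File.

```javascript
// Sketch: read bytes [start, end) of a Blob or File without FileReader.
// Assumes an environment where Blob.prototype.arrayBuffer() is available.
async function readRange(blob, start = 0, end = 8) {
  const buffer = await blob.slice(start, end).arrayBuffer();
  return new Uint8Array(buffer);
}

// Stand-in for a user-selected file: the PNG signature plus three payload bytes.
const fakeFile = new Blob([
  new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 1, 2, 3]),
]);

readRange(fakeFile, 0, 8).then((bytes) => {
  console.log(bytes.length); // 8
  console.log(bytes[0].toString(16)); // "89"
});
```

Returning a Uint8Array directly saves the caller the extra conversion step that an ArrayBuffer result would require.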

2. Define the check function

To compare bytes one by one and keep the code reusable, let's go ahead and define a check function:

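// Build a matcher for the given signature bytes.
function check(headers) {
  // Defaulting offset in the destructuring keeps it at 0 even when
  // an options object without an offset property is passed in.
  return (buffers, { offset = 0 } = {}) =>
    headers.every((header, index) => header === buffers[offset + index]);
}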
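The offset option is useful for formats whose identifying bytes do not sit at the very start of the file. As an illustration that is not taken from this article, a WebP file starts with the ASCII text RIFF, followed by a 4-byte size field, followed by WEBP at offset 8. The names isRIFF, isWEBPTag, and isWebP below are hypothetical, and the header bytes are made up for the example.

```javascript
// Same idea as the check function above, repeated so the sketch runs on its own.
function check(headers) {
  return (buffers, { offset = 0 } = {}) =>
    headers.every((header, index) => header === buffers[offset + index]);
}

const isRIFF = check([0x52, 0x49, 0x46, 0x46]); // "RIFF" at offset 0
const isWEBPTag = check([0x57, 0x45, 0x42, 0x50]); // "WEBP", expected at offset 8

// Hypothetical predicate combining both checks.
function isWebP(bytes) {
  return isRIFF(bytes) && isWEBPTag(bytes, { offset: 8 });
}

// First 12 bytes of a made-up WebP header; bytes 4..7 are an arbitrary size field.
const header = new Uint8Array([
  0x52, 0x49, 0x46, 0x46, 0x24, 0x00, 0x00, 0x00, 0x57, 0x45, 0x42, 0x50,
]);
console.log(isWebP(header)); // true
console.log(isWebP(header.subarray(0, 8))); // false, the tag at offset 8 is missing
```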

3. Detect the PNG image type

Based on the previously defined readBuffer and check functions, we can now detect PNG images:

HTML code

<div>
  Choose File:
  <input type="file" id="inputFile" accept="image/*"
    onchange="handleChange(event)" />
  <p id="realFileType"></p>
</div>

JavaScript code

// Matcher for the 8-byte PNG signature.
const isPNG = check([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const realFileElement = document.querySelector("#realFileType");

async function handleChange(event) {
  const file = event.target.files[0];
  // The user may cancel the file dialog, in which case no file is selected.
  if (!file) return;
  // Read only the first 8 bytes and compare them with the PNG signature.
  const buffers = await readBuffer(file, 0, 8);
  const uint8Array = new Uint8Array(buffers);
  realFileElement.innerText = `The type of ${file.name} is:${
    isPNG(uint8Array) ? "image/png" : file.type
  }`;
}

After the above example runs successfully, the detection result looks like the following figure:

[Figure: file type detection result]

To detect JPEG files as well, define an isJPEG function in the same way:

const isJPEG = check([0xff, 0xd8, 0xff]);
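With more than one matcher available, the individual checks can be combined into a small allow-list gate for uploads. The sketch below is one possible arrangement, not the only one: the names detectors, detectType, and isAllowed are hypothetical, and the sample bytes are an assumed JPEG header rather than a real file.

```javascript
// Same idea as the article's check function, repeated so the sketch runs on its own.
function check(headers) {
  return (buffers, { offset = 0 } = {}) =>
    headers.every((header, index) => header === buffers[offset + index]);
}

// Ordered list of [MIME type, matcher] pairs; extend it as needed.
const detectors = [
  ["image/png", check([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])],
  ["image/jpeg", check([0xff, 0xd8, 0xff])],
];

// Hypothetical helper: MIME type detected from the bytes, or null if unknown.
function detectType(bytes) {
  const match = detectors.find(([, matches]) => matches(bytes));
  return match ? match[0] : null;
}

// Hypothetical helper: accept a file only if its content matches the allow-list.
function isAllowed(bytes, allowedTypes) {
  const detected = detectType(bytes);
  return detected !== null && allowedTypes.includes(detected);
}

// A JPEG renamed to .png still starts with the JPEG signature.
const renamedJpeg = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10]);
console.log(detectType(renamedJpeg)); // "image/jpeg"
console.log(isAllowed(renamedJpeg, ["image/png"])); // false
```

Because the decision is based on the bytes rather than on file.name or file.type, renaming the file does not change the outcome.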

However, what if you want to detect other types of files, such as PDF files? Here we first use the Binary Viewer extension to view the binary content of the PDF file:

As the hex view shows, the first 4 bytes of the PDF file are 0x25 50 44 46, which correspond to the string %PDF. To make such signatures easier to read in code, we can define a stringToBytes function:

// Convert an ASCII string such as "%PDF" to an array of byte values.
function stringToBytes(string) {
  return [...string].map((character) => character.charCodeAt(0));
}

Based on the stringToBytes function, we can easily define an isPDF function as follows:

const isPDF = check(stringToBytes("%PDF"));
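For ASCII signatures such as %PDF, the built-in TextEncoder yields the same byte values as stringToBytes. The following sketch shows that alternative under the assumption that the signature is plain ASCII; the names pdfSignature and isPDFHeader are hypothetical.

```javascript
// TextEncoder encodes a string as UTF-8; for ASCII text each character is one byte.
const pdfSignature = Array.from(new TextEncoder().encode("%PDF"));
console.log(pdfSignature); // [ 37, 80, 68, 70 ], i.e. 0x25 0x50 0x44 0x46

// Hypothetical predicate: true when `bytes` starts with the %PDF signature.
function isPDFHeader(bytes) {
  return pdfSignature.every((value, index) => bytes[index] === value);
}

// "%PDF-1.7" is an assumed example of how a PDF file may begin.
console.log(isPDFHeader(new TextEncoder().encode("%PDF-1.7"))); // true
console.log(isPDFHeader(new TextEncoder().encode("PNG"))); // false
```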

With the isPDF function, you can detect PDF files. In real projects, however, you will encounter many more file types. In that case you can rely on a well-maintained third-party library for file detection, such as the file-type library.

OK, that's it for how to detect file types using JavaScript. For security reasons, it is recommended that you restrict the accepted file types whenever you implement file upload. For stricter scenarios, consider verifying the file type with the method described in this article.
