Powering the Internet with Base64
Diving into the mechanics of base64 encoding
--
Base64 is ubiquitous on the Internet right now. Sometimes it seems like every request, every URL and every file is being encoded in base64 format! As a programmer, base64 certainly feels like a daily fact of life.
Every time I use base64, I can’t help but wonder how it works. What is the dark magic hiding underneath the hood when you type base64_encode()? Today, we’ll dive into the specifics of base64 and find out.
What is base64 encoding?
Base64 is a binary-to-text encoding scheme: it represents arbitrary binary data using only a small set of ASCII characters. It is designed as a way to transfer binary data reliably across channels that have limited support for different content types.
A base64 encoded string looks like this:
V2hhdCBoYXBwZW5zIHdoZW4geW91IGJhc2U2NCgpPw==
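To see what that string actually contains, here is a minimal sketch using Python's standard `base64` module (the choice of Python is mine; the article itself does not name a language). It decodes the example and then encodes the result again to confirm the round trip.

```python
import base64

# The example string from above.
encoded = "V2hhdCBoYXBwZW5zIHdoZW4geW91IGJhc2U2NCgpPw=="

# b64decode returns bytes; this payload happens to be plain ASCII text.
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)  # What happens when you base64()?

# Encoding the decoded text again should reproduce the original string.
round_trip = base64.b64encode(decoded.encode("ascii")).decode("ascii")
print(round_trip == encoded)  # True
```

The two trailing = characters are padding: the decoded text is 31 bytes long, which is not a multiple of 3.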
Base64 output uses only 64 characters, all of which are present in most character sets. They are:
- Upper case alphabet characters A-Z.
- Lower case alphabet characters a-z.
- Number characters 0–9.
- And finally, characters + and /.
- The = character is used for padding.
These characters are available in most character sets and are rarely used as control characters in Internet protocols. So when you encode content with base64, you can be fairly confident that your data is going to arrive uncorrupted.
By contrast, when you transfer your data in its original, “bits and bytes” state, it might get screwed up because a protocol misinterprets some of the bytes as special characters.
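The character set listed above can be checked with a short sketch. It builds the 64-character alphabet by hand, encodes every possible byte value (including the control characters that tend to cause trouble in transit), and confirms that the output stays inside the alphabet plus the padding character. The names `ALPHABET`, `PADDING` and `raw` are my own, chosen for illustration.

```python
import base64
import string

# The 64 characters listed above: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
PADDING = "="

# Every possible byte value, including NUL, newlines and other control bytes.
raw = bytes(range(256))

encoded = base64.b64encode(raw).decode("ascii")

# The encoded text never leaves the alphabet (plus padding).
uses_only_alphabet = set(encoded) <= set(ALPHABET + PADDING)
print(len(ALPHABET), uses_only_alphabet)  # 64 True

# Decoding gives back exactly the bytes we started with.
restored = base64.b64decode(encoded)
print(restored == raw)  # True
```

The cost of this safety is size: every 3 bytes of input become 4 characters of output, so the 256 input bytes here turn into 344 characters.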
What is it used for?
The original use case for base64 was simply as a safe way to transmit data across machines. Over time, base64 has been integrated into the implementation of certain core Internet technologies such as encryption and file embedding.
Data transmission: Base64 can simply be used as a way to transfer and store data without the risk of data corruption. It is often used to transmit JSON data and cookie information for a user.
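As a rough illustration of the JSON case, the sketch below serializes a small dictionary, base64-encodes it into a single ASCII token, and recovers the original on the other side. The `payload` contents and the variable names are hypothetical; real applications will have their own formats, and base64 by itself does not hide or protect the data.

```python
import base64
import json

# Hypothetical user data that needs to travel as plain text, e.g. in a cookie.
payload = {"user_id": 42, "theme": "dark"}

# Serialize to JSON bytes, then wrap those bytes in base64.
json_bytes = json.dumps(payload).encode("utf-8")
token = base64.b64encode(json_bytes).decode("ascii")

# The receiving side reverses both steps.
restored = json.loads(base64.b64decode(token).decode("utf-8"))
print(restored == payload)  # True
```

Anyone who receives the token can decode it the same way, so this is a transport convenience rather than a security measure.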