Emojis and Unicode 💩

How and why you can include emojis in just about anything.

When I was working on my personal website, I learnt that I could use emojis in my text like so: <p>💻 Projects</p>. I was floored, because I thought emojis were images or icons. I’m a nerd 🤓, so I couldn’t just accept this fact and move on with my life. I wanted to know why emojis are special, so I decided to do some research 👩‍💻.

So why does this work? 🤔

Since 2010, emojis have been included in Unicode. With every new version, the Unicode Consortium (fancy word for committee) continues to add more emojis 🙀 😱.

But what is Unicode? 😕

Computers fundamentally deal with binary numbers: a series of 1️⃣’s and 0️⃣’s, or in other words a combination of bits. They don’t deal with text or characters the same way humans do.

To use bits to represent anything more than just bits, we need rules. These rules are also known as encoding schemes, or encodings for short. We use them to convert a sequence of bits into something like letters, numbers and pictures, and vice versa. There are many different ways to encode a character, and they differ in efficiency and compatibility ☠.
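To make the compatibility point concrete, here is a small Python sketch (my own illustration, not something from the original research) that reads the same four bytes under two different sets of rules. The byte values are the UTF-8 form of the 💻 emoji; Latin-1 is just one example of an older encoding picked for contrast.

```python
# The same four bytes, read under two different encodings.
raw = bytes([0xF0, 0x9F, 0x92, 0xBB])

as_utf8 = raw.decode("utf-8")      # UTF-8 sees one character (the laptop emoji)
as_latin1 = raw.decode("latin-1")  # Latin-1 sees four unrelated characters

# Print lengths and code points rather than the characters themselves,
# so the output is safe on any terminal.
print(len(as_utf8), [hex(ord(c)) for c in as_utf8])
print(len(as_latin1), [hex(ord(c)) for c in as_latin1])
```

When a program guesses the wrong rules, text comes out as gibberish, which is why agreeing on one scheme matters.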

That’s where Unicode comes in. Prior to it, there were many other encoding systems, and they did not agree with each other. Unicode is a standardized character set: a universal way of giving every character a number that all computers can agree on. This set covers most characters from the world’s written languages (Chinese, English, Russian, etc.). Since emojis have been included in Unicode, computers can interpret emojis the same way they do regular characters 🎉🎉🎉.

But Unicode is not an encoding. It is simply a table that maps values to characters. It’s a fancy way of saying: “65 is A, 66 is B, 67 is C, and 9,729 stands for ☁”.
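You can look things up in that table yourself. The sketch below uses Python’s built-in ord() and chr(); the describe helper is a name I made up for illustration. The numbers in the table are called code points, and they are usually written as U+ followed by hexadecimal.

```python
def describe(ch: str) -> str:
    """Return a character's code point in decimal and in U+XXXX notation."""
    code_point = ord(ch)  # character -> number
    return f"{code_point} U+{code_point:04X}"

for ch in ["A", "B", "C", "☁", "💻"]:
    print(describe(ch))

# chr() goes the other way: number -> character.
cloud = chr(9729)
```

An emoji like 💻 gets a number from the same table as the letter A; it just lives much further down.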

So, Unicode needs an encoding to translate characters to bits. The encodings it uses are called UTF, or Unicode Transformation Format. There are three encoding forms: UTF-8, UTF-16, and UTF-32, and the most common one on the web is UTF-8. For details on the differences check out this post.
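One visible difference between the three forms is how many bytes a character takes up. Here is a rough Python sketch to compare them; encoded_sizes is a hypothetical helper name, and I use the "-le" codec variants only so that Python does not prepend a byte-order mark to the result.

```python
def encoded_sizes(ch: str) -> dict:
    """Number of bytes one character occupies in each Unicode encoding form."""
    # The "-le" variants fix the byte order, so no byte-order mark is added.
    forms = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(ch.encode(name)) for name in forms}

for ch in ["A", "☁", "💻"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-8 uses one byte for A, three for ☁ and four for 💻, while UTF-32 always uses four. That compactness for plain English text is part of why UTF-8 suits the web.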

In short: Human Characters ➡️ Unicode Character Sets ➡️ Encoded Character Sets using UTF ➡️ Bits that computers can understand (and vice versa 🔄).
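The whole trip can be sketched in a few lines of Python. This is only an illustration that assumes UTF-8 as the encoding; to_bits and from_bits are names I invented for it.

```python
def to_bits(text: str) -> str:
    """Characters -> UTF-8 bytes -> a string of 0s and 1s (one group per byte)."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))

def from_bits(bits: str) -> str:
    """The reverse trip: bit groups -> bytes -> characters."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")

cloud_bits = to_bits("☁")
print(cloud_bits)  # 11100010 10011000 10000001

# Going back from bits recovers the original text, emoji included.
round_trip = from_bits(to_bits("💻 Projects"))
```

The same two functions work for letters and emojis alike, because by the time the text is bits the computer no longer cares which is which.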

TL;DR: Emojis are interpreted the same way characters are by the computer and that’s 🔥. This means that you can use all your favorite emojis anywhere 🙌 🙌 🙌

Clapping shows how much you appreciated Audrey Setiadarma’s story.