Dev 101: text versus bytes

Tom Deneire
Published in Analytics Vidhya
7 min read · Mar 6, 2021


In the Dev 101 series I cover some basic concepts of computer programming for a broad audience. I guess it’s the explanation I was looking for myself, when I first started out as a programmer…

Photo by Hope House Press — Leather Diary Studio on Unsplash

TL;DR: There is no such thing as text, only collections of bytes which can be displayed as characters based on an encoding.

Ones and zeros

A computer is an electronic device, which really only “understands” on and off. Think of how the light goes on and off when you flip the switch. In a way, a computer is basically a giant collection of light switches.

This is why a computer’s processor can only operate on 0 and 1, or bits, which can be combined to represent binary numbers, e.g. binary 100 = decimal 4. It is these binary numbers that the processor uses as both data and instructions (a.k.a. “machine code”).
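To make the binary example concrete, here is a small Python sketch (Python is my choice for illustration, not something the processor itself cares about). It reads the digits 100 as a base-2 number and also adds up the powers of two by hand. The variable names are made up for this example.

```python
# Read the digits "100" as a base-2 (binary) number.
value = int("100", 2)

# The same calculation by hand: every position is a power of two.
# 1 * 4  +  0 * 2  +  0 * 1
manual = 1 * 2**2 + 0 * 2**1 + 0 * 2**0

# And back again: write the decimal number 4 in binary digits.
binary_text = format(4, "b")

print(value, manual, binary_text)  # 4 4 100
```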

It makes sense to group bits into units; otherwise, we would just end up with one long string of ones and zeros and no way to chop it up into meaningful parts. Such a unit is called a byte. Historically the size of a byte was not strictly defined, but modern computer architectures work with an 8-bit byte, that is, a group of eight binary digits.
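As a rough illustration of that "chopping up", the sketch below takes an arbitrary string of sixteen ones and zeros (a sample I picked for this example) and cuts it into groups of eight. It assumes the modern 8-bit byte; `BYTE_SIZE` is just a name used here.

```python
# One long run of bits, with nothing to tell us where a unit starts or ends.
bits = "0100100001101001"

# Assumption: we are on a modern architecture with 8-bit bytes.
BYTE_SIZE = 8

# Cut the run into consecutive groups of eight bits.
chunks = [bits[i:i + BYTE_SIZE] for i in range(0, len(bits), BYTE_SIZE)]

# Each group can now be read as a binary number between 0 and 255.
values = [int(chunk, 2) for chunk in chunks]

print(chunks)  # ['01001000', '01101001']
print(values)  # [72, 105]
```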

Bytes

This binary nature of computers means that on a fundamental level all data is just a collection of bytes…
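A minimal Python sketch of that idea, tying back to the TL;DR: the same bytes can be read as numbers or as characters, and which characters you get depends on the encoding you pick. The sample values and variable names are my own.

```python
# Two bytes, given here as their numeric values.
data = bytes([72, 105])

# Viewed as plain numbers.
as_numbers = list(data)                  # [72, 105]

# Viewed as one big-endian integer: 72 * 256 + 105.
as_integer = int.from_bytes(data, "big")  # 18537

# Viewed as characters, using the ASCII encoding.
as_text = data.decode("ascii")           # 'Hi'

# Going the other way: one character becomes two bytes in UTF-8 ...
encoded = "é".encode("utf-8")            # b'\xc3\xa9'

# ... and those same two bytes show up as different characters
# when they are decoded with another encoding (Latin-1).
misread = encoded.decode("latin-1")      # 'Ã©'

print(as_numbers, as_integer, as_text, encoded, misread)
```

Nothing about the bytes themselves changes between these views; only the interpretation does.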
