Dev 101: text versus bytes
In the Dev 101 series I cover some basic concepts of computer programming for a broad audience. I guess it’s the explanation I was looking for myself, when I first started out as a programmer…
TL;DR: There is no such thing as text, only collections of bytes which can be displayed as characters based on an encoding.
Ones and zeros
A computer is an electronic device that really only “understands” on and off. Think of how the light goes on and off when you flip the switch. In a way, a computer is basically a giant collection of light switches.
This is why a computer’s processor can only operate on 0 and 1, or bits, which can be combined to represent binary numbers, e.g. 100 = 4. It is these binary numbers that the processor uses as both data and instructions (a.k.a. “machine code”).
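To make the binary notation a bit more concrete, here is a minimal Python sketch of the 100 = 4 example. The variable names are just for illustration; the only tools used are Python's built-in int() and format().

```python
# Read the digits "100" as a base-2 (binary) number.
value = int("100", 2)

# Convert the number back into its binary digits.
binary = format(value, "b")

print(value)   # 4
print(binary)  # 100
```

Each position in a binary number is worth twice the one to its right, so 100 means one four, zero twos and zero ones.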
It makes sense to group bits into units; otherwise, we would just end up with one long string of ones and zeros and no way to chop it up into meaningful parts. A group of eight binary digits is called a byte, although historically the size of a byte was not strictly defined. In general, though, modern computer architectures work with an 8-bit byte.
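As a rough illustration of that grouping, the sketch below chops a string of ones and zeros into 8-bit chunks and reads each chunk as a number. The bit string is an arbitrary example picked for this sketch, not something a real processor would hand you in this form.

```python
# An arbitrary run of 16 bits (example data only).
bits = "0100100001101001"

# Split the string into consecutive groups of eight bits.
groups = [bits[i:i + 8] for i in range(0, len(bits), 8)]

# Interpret each 8-bit group as a number.
values = [int(group, 2) for group in groups]

print(groups)  # ['01001000', '01101001']
print(values)  # [72, 105]
```

With eight bits per group, every byte holds a value between 0 and 255.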
Bytes
This binary nature of computers means that on a fundamental level all data is just a collection of bytes…