Bits, Bytes and IPs
All About Internet Protocol addresses
As developers we interact with IP addresses daily, perhaps without a solid understanding of what these numbers represent. Internet Protocol addresses are numeric identifiers for computers or devices on a network. In order to get in touch with a person, you need to know a street address or a phone number for them. IP addresses are the computing equivalent: using that address you can always reach the same computer. There are plenty of websites that allow you to look up your own IP address, in case you are curious.
IP Addresses and Binary
IPv4 addresses all have the same structure of four numbers separated by periods. Each of those four numbers (called octets) is in the range of 0–255 (256 total possibilities). 256 may seem like an arbitrary limit, but it will make more sense if we take a detour into the binary numeral system.
In our base-10 numeric system, a digit in any column of a number (hundreds, tens, ones, etc) can have a value between zero and nine. When a value in any column exceeds nine, we ‘carry the one’ and increase the value in the column to its immediate left. The number after 29 is 30 because we increase the tens column by one because the ones column has the maximum value of nine.
Many values in computing are based on a binary or base-2 numeral system because that corresponds to the two states of an electrical signal, on or off. In base-2 the only value any single digit can have is 0 or 1. Base-10 columns each increase by a factor of ten: ones, tens, hundreds, etc. Base-2 columns increase by a factor of two: ones, twos, fours, eights. To represent the number two, you need two digits because it exceeds the value a single digit can hold. To make two in binary, carry a one into the next column and the result is 10. This may look like the base-10 number ten (which you can think of as 1*10 + 0*1), but in binary 10 is 1*2 + 0*1. Three would be 11 or 1*2 + 1*1.
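To make the column arithmetic concrete, here is a minimal Python sketch that adds up each digit times the value of its column. The function name positional_value is made up for this illustration; Python's built-in int("10", 2) does the same job.

```python
def positional_value(digits: str, base: int) -> int:
    """Add up each digit multiplied by the value of its column."""
    total = 0
    # Walk the digits right to left: columns are base**0, base**1, base**2, ...
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


print(positional_value("10", 10))  # 1*10 + 0*1
print(positional_value("10", 2))   # 1*2 + 0*1
print(positional_value("11", 2))   # 1*2 + 1*1
```

The same digits "10" mean ten in base-10 and two in base-2; only the column values change.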
Let’s say we want to calculate the number 23 in binary. Creating a number in binary can be accomplished by figuring out the largest powers of two that can be added to equal it. Then put a 1 in the column of every power you add and a 0 in all the remaining columns. The below table shows some powers of two (our binary digit columns), evaluates whether adding each power to the current total exceeds our goal of 23 and decides whether to include or exclude each digit from our binary number as a result. The new total is then used by the next row for evaluation.
Power of two | Current total + power | Exceeds 23? | Binary digit | New total
32 | 0 + 32 = 32 | Yes | 0 (exclude) | 0
16 | 0 + 16 = 16 | No | 1 (include) | 16
8 | 16 + 8 = 24 | Yes | 0 (exclude) | 16
4 | 16 + 4 = 20 | No | 1 (include) | 20
2 | 20 + 2 = 22 | No | 1 (include) | 22
1 | 22 + 1 = 23 | No | 1 (include) | 23
We looked at 6 powers of two, so our number will be 6 columns. The columns go from left to right in descending order of value (just like in base-10). For reference our columns represent the values 32, 16, 8, 4, 2 and 1 in order. Every included value gets a 1 in the column, excluded values get 0. Our result is 010111, but we can leave off the leading zero. 23 is represented as 10111 in binary. If you want some further practice understanding binary, I’ve included a few links at the end of this article.
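The procedure described above can be sketched in a few lines of Python. This is only an illustration: the name to_binary and the fixed list of six powers are assumptions made for this example, so it only handles numbers up to 63.

```python
def to_binary(target: int, powers=(32, 16, 8, 4, 2, 1)) -> str:
    """Build a binary string by including each power of two that still fits."""
    total = 0
    digits = []
    for power in powers:  # largest power first, like the columns left to right
        if total + power <= target:
            total += power      # include this power: the column gets a 1
            digits.append("1")
        else:
            digits.append("0")  # adding it would exceed the target: column gets a 0
    return "".join(digits)


bits = to_binary(23)
print(bits)              # six columns, including the leading zero
print(bits.lstrip("0"))  # the same number with the leading zero dropped
```

In everyday code you would reach for the built-in format(23, "b") instead, which returns the same digits without the leading zero.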
So our octets have 256 possible values because they contain eight binary digits. Each binary digit can represent 2 values, leading to 2⁸ or 256 possible values per octet. A grouping of 8 bits of information is also referred to as a byte. A full IPv4 address is 32 bits or 4 bytes. So if an IP address is
24.146.96.10 in what we refer to as dot-decimal notation, in binary it would be
00011000.10010010.01100000.00001010. While dot-decimal notation is common for IPv4, addresses can also be written in hexadecimal or octal, in addition to binary. IPv4 has 2³² or over 4 billion possible IP addresses, but that isn’t enough!
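As a quick check, the binary string above decodes to the octets 24, 146, 96 and 10. The short Python sketch below converts in both directions; the helper names to_dotted_binary and to_dotted_decimal are invented for this example.

```python
def to_dotted_binary(address: str) -> str:
    """Write each decimal octet as eight binary digits."""
    return ".".join(format(int(octet), "08b") for octet in address.split("."))


def to_dotted_decimal(binary_address: str) -> str:
    """Read each group of eight binary digits back as a decimal octet."""
    return ".".join(str(int(octet, 2)) for octet in binary_address.split("."))


print(to_dotted_binary("24.146.96.10"))
print(to_dotted_decimal("00011000.10010010.01100000.00001010"))
print(2 ** 8)   # possible values per octet
print(2 ** 32)  # possible IPv4 addresses
```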
The v6 protocol was created because of the exhaustion of available v4 addresses. When IPv4 was established it didn’t take into account the large number of internet-capable devices modern users would operate. IPv6 was not designed to be interoperable with IPv4; it is a completely separate system and exchanging data between these protocols requires special gateways.
IPv6 addresses contain more data, meaning there are more available addresses. Each address is 128 bits or 16 bytes. The IPv6 format is 8 groups of four hexadecimal digits, separated by colons. Each group represents 16 bits, or two octets. With this structure there are 2¹²⁸ or approximately 3.4×10³⁸ available addresses.
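If you want to see that structure for yourself, Python's standard ipaddress module can split an address into its groups. This is just a sketch; the sample address is the one used in the shortening example in this article.

```python
import ipaddress

address = ipaddress.IPv6Address("2001:0db8:0000:0042:0000:0000:0370:7334")

# .exploded always returns the full form: 8 groups of 4 hex digits
groups = address.exploded.split(":")
print(len(groups))

# Each hex group holds 16 bits
print(format(int(groups[0], 16), "016b"))

# Size of the IPv6 address space, in scientific notation
print(f"{2 ** 128:.1e}")
```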
These addresses have so many characters that there are ways to shorten them. Leading zeros can be omitted, but each group must have at least one digit. The address
2001:0db8:0000:0042:0000:0000:0370:7334 can be shortened to
2001:db8:0:42:0:0:370:7334 just by omitting the leading zeros. One or more consecutive groups that are just zeros can be replaced with an empty group (written as a double colon, ::); however, using it more than once would make the notation ambiguous, so it is only allowed once in any address. In our example, if I replaced both group 3 and groups 5–6 with an empty group, you wouldn’t know which of those sections had two consecutive
0000 groups. Instead we collapse only one run of zero groups, conventionally the longest, which here gives 2001:db8:0:42::370:7334.
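Both shortening rules can be tried out in Python. The sketch below strips leading zeros by hand, then lets the standard ipaddress module produce the fully compressed form; the variable names are only illustrative.

```python
import ipaddress

full = "2001:0db8:0000:0042:0000:0000:0370:7334"

# Rule 1: drop leading zeros, but keep at least one digit per group
no_leading_zeros = ":".join(group.lstrip("0") or "0" for group in full.split(":"))
print(no_leading_zeros)

# Rule 2: the standard library also collapses one run of zero groups into "::"
address = ipaddress.IPv6Address(full)
print(address.compressed)

# The shortened forms still describe the same 128-bit address
print(ipaddress.IPv6Address(no_leading_zeros) == address)
```

The ipaddress module collapses the longest run of zero groups, and it rejects any address string that contains more than one double colon.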
Moving the internet to IPv6 has been in the works since 1998. Companies are slowly moving to the new protocol, but at the time of writing IPv6 is still under 30% adoption worldwide.
Networks and CIDR Notation
IP addresses contain two distinct parts. The first section of the address identifies the network (group of computers) the address belongs to and the second identifies the distinct host. (Think of this as your street name and your unique house number on that street.) There are different classes of IPv4 addresses based on the digits they contain. Each of these classes divides the address between network and host differently.
To differentiate between these classes, we need to consider their binary notation. The addresses are grouped by how many leading 1s their first binary octet has before its first 0. Class A addresses start with a leading 0 in binary. Their first octet can be from 00000000 to 01111111 in binary (between 0 and 127). Class B addresses begin with 10 (first octets between 128 and 191). Class C addresses begin with 110 (first octets between 192 and 223). Class D addresses begin with 1110 (first octets between 224 and 239) and Class E addresses begin with 1111 (first octets between 240 and 255).
Class E addresses are mostly unused, but they remain reserved for future use. Class D addresses are used for multicasting. Multicasting is often used for streaming to large audiences because it optimizes the data packets sent so that the central server receives fewer requests. Classes A-C make up the majority of use.
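One way to see how the classes fall out of the leading bits is a small Python sketch. It assumes the standard classful prefixes (0, 10, 110, 1110, 1111), and the function name address_class is made up for this illustration.

```python
def address_class(address: str) -> str:
    """Return the classful category (A-E) based on the first octet's leading bits."""
    first_octet = int(address.split(".")[0])
    bits = format(first_octet, "08b")
    if bits.startswith("0"):
        return "A"  # 0xxxxxxx: first octet 0-127
    if bits.startswith("10"):
        return "B"  # 10xxxxxx: first octet 128-191
    if bits.startswith("110"):
        return "C"  # 110xxxxx: first octet 192-223
    if bits.startswith("1110"):
        return "D"  # 1110xxxx: first octet 224-239
    return "E"      # 1111xxxx: first octet 240-255


for sample in ("24.146.96.10", "172.16.0.1", "192.168.102.97", "224.0.0.1", "250.1.2.3"):
    print(sample, address_class(sample))
```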
In our earlier example,
24.146.96.10, the first octet (24) makes this a Class A address, so 8 bits represent the network and 24 bits provide host information. Networks can be divided into smaller sections called subnets which allow hosts to be isolated in specific groups. Each network has only one subnet by default, which contains all of the host addresses within it. The subnet mask is a method for dividing IP addresses: it specifies how much of the IP address is used to identify the network versus host and tells us what IP addresses are available in the network. For Class A addresses, the network section is the leading 8 bits (because the last 24 bits are reserved for the host). If we think of that in binary, we can represent it as 11111111.00000000.00000000.00000000 or 255.0.0.0. All the 0 bits for this netmask represent variable parts of the address that can change based on the host and the 1 bits represent the network information.
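Here is a rough Python sketch of how a netmask splits an address into its network and host parts, using the Class A mask of 8 bits and the address 24.146.96.10. The helper names (to_int, to_dotted, netmask) are invented for this example.

```python
def to_int(address: str) -> int:
    """Pack the four octets into a single 32-bit integer."""
    value = 0
    for octet in address.split("."):
        value = (value << 8) | int(octet)
    return value


def to_dotted(value: int) -> str:
    """Unpack a 32-bit integer back into dot-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def netmask(prefix_length: int) -> int:
    """A mask with prefix_length leading 1 bits followed by 0 bits."""
    return (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF


mask = netmask(8)
address = to_int("24.146.96.10")

print(to_dotted(mask))                          # the netmask itself
print(to_dotted(address & mask))                # network part (bits under the 1s)
print(to_dotted(address & ~mask & 0xFFFFFFFF))  # host part (bits under the 0s)
```

A bitwise AND with the mask keeps only the network bits; an AND with the inverted mask keeps only the host bits.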
CIDR or Classless Inter-Domain Routing was developed as an alternative to traditional class-based subnetting (which was contributing to IPv4 address depletion). CIDR gives us more control over how we group IP addresses into networks. Instead of having network identifiers only in 8-bit chunks, CIDR allows for netmasks of variable length. Now a portion of the same octet can represent the host while the remaining portion represents the network. This means more combinations for how IP addresses can be grouped together in networks and assigned to organizations. It also resulted in supernets or larger aggregations of networks which simplify routing.
With CIDR you notate how many leading bits of the address are devoted to networking information with a trailing slash and number. Take two addresses whose third octets are 100 and 102: an address of the form 192.168.100.x and
192.168.102.97 would have Class C netmasks of 24 bits and would thus be in different networks because they have different third octets. However CIDR allows us to apply a subnet mask of 22 bits, notated by
192.168.102.97/22. In order to decide if these two addresses are in the same network, we need to convert them to binary to see if they share the same leading 22 bits. Since they have the same numbers for their first two octets, we know they share the first 16 bits. Now we must examine the third octets. In binary they are 01100100 and 01100110 respectively, which share the first 6 binary digits. Those 6 bits, combined with the previous 16 shared bits, make for 22 common binary digits. These addresses share their leading 22 bits, so under a 22-bit netmask they are on the same network with a subnet mask of
255.255.252.0. (Remember we calculate the subnet mask by setting every binary digit that identifies the network to 1 and every host digit to 0, here 11111111.11111111.11111100.00000000.)
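The same comparison can be checked with Python's standard ipaddress module. This is a sketch under one assumption: the address 192.168.100.14 is a hypothetical stand-in for any host whose third octet is 100.

```python
import ipaddress

# strict=False lets us pass a host address and get back its enclosing network
network = ipaddress.ip_network("192.168.102.97/22", strict=False)
print(network)          # the /22 network that contains the address
print(network.netmask)  # the 22-bit mask in dot-decimal notation

# Third octets 100 and 102 share their first six binary digits
print(format(100, "08b")[:6] == format(102, "08b")[:6])

# Hypothetical second host (assumption for illustration): third octet 100
other = ipaddress.ip_address("192.168.100.14")
print(other in network)

# With 24-bit Class C masks the two hosts land in different networks
print(other in ipaddress.ip_network("192.168.102.97/24", strict=False))
```

An address with a third octet of 104 or higher would fall outside this /22 network, because its 22nd bit differs.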
There is a lot of information packed into the binary digits of an IP address and now we know a bit more about what each of these digits represents.
For more resources on learning binary, check out these articles:
The reason computers use the base-2 system is because it makes it a lot easier to implement them with current… (computer.howstuffworks.com)
Introduces the concepts behind different number bases, and shows how to convert between decimal (base ten) and binary… (www.purplemath.com)