Networking for the Software Engineer: Connectivity
As the token operator among developers, the one with a little bit of networking experience, I get a lot of requests from developers that sound a bit like this:
Hi! You know a little bit about networking, right? What does _____ mean? Can you read through this and help me understand?
Networking is an intriguing field with fantastic acronyms that rival our modern texting lexicon, and it works like magic. However, for the software engineer, it’s not always clear why the network matters or how it affects an application. We take it for granted that when our application sends a GET request to another application, we just receive a response. But under the hood, how does that work, and why does it matter? What simple terms should we know to communicate about our application networking?
This is not an overview of the OSI reference model or networking layers. It is not intended to train anyone for a network certification, help configure a switch, or outline every detail about software-defined or container networking. It will not help answer the omnipresent interview question, “What happens when you type a URL into the browser and press enter?” (see link for the answer).
What I will try to cover are:
- Networking concepts we run into when we deploy an application
- How that affects our application
- The physical, software-defined, and container technology equivalents.
When I started this story, I realized that each domain of networking needs to be split into its own story. What I consider the three domains of networking are:
- Connectivity: How we connect applications
- Network Policy: How we stop applications from communicating
- Service Discovery: How we allow applications to be easily resolved
In this story, we will learn about how we connect to our application when it is running locally and in the datacenter. We will discuss some of the basic terms for how we describe network connectivity.
My Application, Locally
In order for anything to communicate with our application, it has to reside on a device that has network connectivity via a hardware interface. The hardware for this interface is called a Network Interface Card (NIC). It has a unique address called a MAC address, which looks like
xx:xx:xx:xx:xx:xx. This is like a national identification number (in the United States, a social security number). It must also have an Internet Protocol (IP) address. This is a logical address that often looks like
xx.xx.xx.xx (in IPv4 format). A simple equivalent would be a legal name or nicknames that we might like to be called.
We can have multiple names but generally only have one national identification number. Similarly, one MAC address can have multiple IP addresses. For more information, check out this article.

Many describe the network using the Open Systems Interconnection (OSI) model, a model for the layers of networking. When we make a request to an application, our request gets peeled like an onion: the top Application layer (Layer 7) of a request gets peeled down to the Physical layer (Layer 1) and transmitted through some kind of medium, whether it be a wire or fiber.
The TCP/IP model, a different representation of the layers, has gained a bit more favor recently. In the end, it helps to know that when we access our application, we tend to interact with networking logic above the Network layer (Layer 3).
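We can poke at both of these addresses from code. Here is a minimal sketch using only Python's standard library; note that `uuid.getnode()` is a best-effort hardware-address lookup and may return a random value on machines where it cannot find a real MAC:

```python
import re
import socket
import uuid

# Resolve the loopback hostname to an IPv4 address (the Layer 3 identity).
ip = socket.gethostbyname("localhost")
print(ip)  # typically 127.0.0.1

# uuid.getnode() returns the hardware (MAC) address as a 48-bit integer
# when it can find one; format it into the familiar colon notation.
mac_int = uuid.getnode()
mac = ":".join(f"{(mac_int >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))
print(mac)  # six colon-separated octets, e.g. 3c:22:fb:aa:bb:cc

# Both follow the shapes described above.
assert re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", ip)
assert re.fullmatch(r"[0-9a-f]{2}(:[0-9a-f]{2}){5}", mac)
```

Nothing here touches the wire; it only reads the identities our machine would present to the network.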
My Application, in the Datacenter or Cloud
Back in the early days of networking, plugging in a computer was legitimately enough to give us connectivity… to our friend’s computer. However, we wanted to connect all of our friends’ computers together to play games. This is a local area network (LAN). Once we started playing the games, though, we found out we could not go outside of this network. This is basically how a datacenter network is configured for an application: a bunch of servers are connected together to form the LAN.
It’s pretty awesome that we can communicate within our friend group, but what if we had another group of friends in the next town over? We want to connect our two LANs together so we can all enjoy the benefits of communicating with each other. This is called a wide area network (WAN). WANs connect hospital branches together as if they were on the same network, or a company’s many offices to the “company” network. In the case of the datacenter, we might have other datacenters that are part of the company’s WAN. The public Internet is effectively a large WAN that everyone can access and join. The diagram shows what is in the scope of a LAN versus a WAN.
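One concrete way the LAN/WAN split shows up in practice: addresses inside a LAN usually come from the private ranges reserved by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), which are not routable on the public Internet. Python's standard `ipaddress` module can tell the two apart; a small sketch (the specific addresses below are just illustrative):

```python
import ipaddress

def where_does_it_live(addr: str) -> str:
    """Classify an IPv4 address as loopback, LAN-private, or public."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback (this host)"
    if ip.is_private:
        return "private (LAN / RFC 1918)"
    return "public (routable on the Internet/WAN)"

print(where_does_it_live("192.168.1.20"))  # private (LAN / RFC 1918)
print(where_does_it_live("127.0.0.1"))     # loopback (this host)
print(where_does_it_live("8.8.8.8"))       # public (routable on the Internet/WAN)
```

This is why two different offices can both use 192.168.1.20 internally without conflict: private addresses only have to be unique within their own LAN.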
When our application resides in the datacenter, our friendly network engineers have to do quite a bit of configuration to make it externally accessible from anyone’s laptop or internally accessible from another application — all as securely as possible.
Network Connectivity Technologies
There are many devices and technologies in the datacenter that contribute to connectivity. Some of these we may never see unless taking a datacenter tour, while others we tend to learn for troubleshooting.
In the physical datacenter, these are usually switches or routers. They are the main powerhouses in the datacenter that provide connectivity and route traffic. Switches tend to be associated with the Data Link layer (Layer 2), while routers control Network layer (Layer 3) traffic. There are a variety of protocols that we use to tell switches and routers how and where to send traffic. For more information, check them out here.
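To make “route traffic” concrete: a router keeps a table of destination prefixes and forwards each packet along the route with the longest (most specific) matching prefix. A toy version in Python using the `ipaddress` module; the table entries and next-hop names here are made up for illustration:

```python
import ipaddress

# A hypothetical routing table: destination prefix -> next hop.
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "core-router",
    ipaddress.ip_network("10.1.2.0/24"): "rack-switch",
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",  # catch-all route
}

def next_hop(addr: str) -> str:
    """Pick the route with the longest prefix that contains addr."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in ROUTES if ip in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]

print(next_hop("10.1.2.7"))       # rack-switch (the /24 beats the /8)
print(next_hop("10.9.9.9"))       # core-router
print(next_hop("93.184.216.34"))  # default-gateway
```

Real routers do this lookup in hardware for millions of packets per second, but the longest-prefix-match rule is the same.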
Software-defined networking (SDN) effectively mimics a network by virtually defining rules on how traffic is to be routed. For a deeper dive, check this out. Most public clouds (Amazon Web Services, Google Cloud Platform, or Microsoft Azure) use software-defined networking to provide users a network on-demand. In the public cloud, a software-defined network is called a Virtual Network (in Azure) or a Virtual Private Cloud (in AWS and Google Cloud). A software-defined network is a logically isolated virtual network that has connectivity between each of the instances attached to it.
Container runtimes, like Docker, use Linux bridges by default to provide connectivity between containers. A Linux bridge acts as a virtual switch on the Linux host machine via a kernel module. Check out this article for more detailed information. An application in a container on the Linux bridge cannot be directly accessed without some additional configuration.
We can also use other container networking technologies to provide network connectivity for containers across different hosts. These act as “plugins” to the container runtime. A few examples are Weave, flannel, and Calico. Check out my previous rundown on container networking for more detailed descriptions.
All of these technologies will overlap with certain layers of the OSI model mentioned above.
SDN and container networking have blurred the lines on what layer they control, from Layer 2 to Layer 7. With more networking control moving to the Application layer (Layer 7), we’re bound to see some new technologies coming out to make it easier to manage.
My Application, Checking Connectivity
If we’ve got an application that uses Hypertext Transfer Protocol (HTTP), it’s pretty straightforward to check for connectivity. Most applications use HTTP to request and respond to information. Basically, any time we start our application on a port, we can use
curl to send a request to the endpoint and get a response back via
127.0.0.1. The IP address
127.0.0.1 redirects the request back into its host; it is called the loopback address, or localhost.
$ curl 127.0.0.1:8080
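The same loopback check can be scripted. A minimal sketch that starts a throwaway HTTP server on 127.0.0.1 and requests it, using only Python's standard library (binding to port 0 asks the operating system for any free port, so the example does not assume 8080 is available):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond to any GET with a small plain-text body.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from loopback\n")

    def log_message(self, *args):  # keep the example quiet
        pass

# Bind to the loopback address; port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), Hello)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Equivalent of: curl 127.0.0.1:<port>
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/").read()
print(body.decode())
server.shutdown()
```

The request never leaves the machine; the loopback interface hands it straight back to the local server.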
Simple, right? We might also hear network engineers or others say they are going to “ping” the endpoint.
ping is a tool we can use to send an Internet Control Message Protocol (ICMP) message to determine if an endpoint is up. It is not the same protocol as HTTP. Instead, it is much lower-level at Layer 3. For more about the networking protocols and which network layer they are associated with, check out this list.
ping is not associated with specific ports, so when we issue it, we’re only checking for connectivity to the host. Keep in mind that some public cloud deployments block ICMP, so we might get no reply from
ping even when the host is up. The example below shows that I am sending 3 ICMP messages to my localhost (127.0.0.1). I get a response!
$ ping -c 3 localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.027 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.155 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.067 ms
--- localhost ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.027/0.083/0.155/0.053 ms
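Since ping only tells us the host is reachable, not that our application’s port is open, a port-level check needs an actual TCP connection attempt. A sketch using Python's socket module; the listener here exists only to give us a known-good port to test against:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Open a listening socket so we have a known-good port to check.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
listener.listen()
open_port = listener.getsockname()[1]

print(port_open("127.0.0.1", open_port))  # True: something is listening
listener.close()
print(port_open("127.0.0.1", open_port))  # False: nothing listening now
```

This is roughly what tools like `nc -z` do, and it works even where ICMP is blocked, because it uses the same TCP path the application itself would.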
Network Connectivity Cheatsheet
At one point, I was asked to read a networking debrief and summarize how it would affect an application. It did not change much from an application perspective but the terms were rather networking-specific and difficult to understand. Coming out of that, I put together a simple cheatsheet for the next time someone hears certain networking terms and needs an understanding of how it affects their application.
- If I want to connect to my application sitting in my company’s datacenter, I have to be on my company’s network in order to access it. That could be a local area network (LAN) or wide area network (WAN), depending on how the routing is set up.
- If I want to connect to my application hosted on a nice Software-as-a-Service or third party API, I have to transmit through the public internet (a really big WAN) to access it.
- I can use physical devices, software-defined networking, or container networking to generate connectivity to my application, whether it be from a user laptop or another application.
Once we have set up connectivity, though, how do we prevent our application from connecting to something it should not? How do we prevent something from connecting to our application, injecting some bad information, and wreaking havoc? The answer? Network policy.