Moving Bits: Networking Fundamentals, Part 3

This is the third part of the Networking Fundamentals series, in which I aspire to simplify networking by using a consistent mental model based on postal systems, while tying abstractions to concrete examples of networking relevant to software engineering. In the previous two parts, I described the analogy of a postal system and the three relevant layer abstractions I use in this fundamentals series, and expanded on that analogy to discuss addressing and provide a very simple HTTP request/response example. If you haven’t reviewed those parts, please do, as they provide valuable context for this and future posts.

Private Addresses and Network Address Translation

In this post, we will focus on the ideas of private (also called unroutable) addresses and Network Address Translation (often simply called NAT). In most networking courses, these concepts are discussed much later in the material, but as they are critical to modern engineering, I am going to expand our analogy to support them now, rather than waiting. Additionally, I have found these areas to be particularly troublesome, with lack of understanding leading to wasted work and system failures that could be avoided.

The first thing to recall from Part 2 is that the most common addressing used on the Internet is IPv4, in which every address is 4 bytes (32 bits) long. That allows about 4.3 billion distinct addresses (2^32 = 4,294,967,296), and some of those are reserved for various purposes. Even if none were reserved, a 32-bit value is not enough to give a unique address to every device that might want to communicate on the Internet. Given our previous discussion, this is a problem: if you don’t have a unique address, how can someone communicate with you? Even if there is some way for you to send your letter to them (i.e., contact the server you want to request information from), how can the server send its response if you don’t have a unique address? And what happens when we really do run out of unique addresses?
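To make the arithmetic concrete, here is a minimal Python sketch using the standard-library ipaddress module. The address 203.0.113.5 is only an illustrative value taken from a range set aside for documentation; it does not come from this series.

```python
import ipaddress

# Illustrative address (assumption: picked from a documentation-only range).
address = ipaddress.IPv4Address("203.0.113.5")

# An IPv4 address is just 4 bytes, which is the same as one 32-bit number.
as_bytes = address.packed
as_int = int(address)

# Counting every possible 4-byte value gives the size of the address space.
total_addresses = 2 ** 32

print(list(as_bytes))          # the four individual byte values
print(as_int)                  # the same address as a single integer
print(f"{total_addresses:,}")  # how many addresses 32 bits can express
```

The dotted form we usually see is simply those four bytes written in decimal, one after the other.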

The above questions are real challenges in the context of global networking. The solutions that have been leveraged so far rely on a piece of lucky forethought from the early inventors of the Internet. Back in the early days of the Internet, when there were many networks that wanted to interconnect and a universal addressing scheme was required, the inventors of that scheme foresaw that some people would want to build networks of computers that would talk to each other using the standard protocols, but would not talk to anyone else. To support this, they created blocks of IPv4 addresses that could be used exclusively for this purpose, and which could not be used to carry information between networks. Today these blocks are documented in RFC 1918: addresses beginning with 10, addresses from 172.16 through 172.31, and addresses beginning with 192.168.

When data moves between networks (e.g., when it leaves your home network and enters your ISP’s network), it is similar to a letter moving from your local post office to the wider network of post offices, or from one national post to another. If you are using one of the private addresses, those types of transfers are simply impossible, because the address means nothing outside your own network.

As a conceptual example, let us assume there is a closed neighborhood that is built by a single builder. When this neighborhood was built, each house was given a name — “Blue House,” “Red House,” “House of Steve,” etc. Also, when this neighborhood was built, there was a local post system set up, so that someone living at “Blue House” could write a letter to “Red House,” with no more address information than that, and the neighborhood post could deliver it.

Obviously, in this case, “Blue House” is not a useful address outside of this neighborhood. In fact, another neighborhood down the street, built by the same builder, could have exactly the same house names, and as long as the residents are only sharing information within their neighborhood, the fact that two (or more) “Blue Houses” exist is not important.

In Internet terms, these house names are private addresses, or unroutable addresses. The term “unroutable” comes from the fact that data cannot be “routed” (that is, passed between networks) using these addresses, because they are meaningless outside the specific network in which they exist. They are hugely useful: a group of computers can be set up to communicate with each other, and even if that group is somehow connected directly to the wider Internet, its addresses won’t collide with anyone else’s. By contrast, if we were to create our private network (our neighborhood) and borrow addresses already in use by someone else (say, street names and house numbers identical to those of the neighborhood down the block), we could cause real problems once we were connected to the larger world. After all, how would you send a letter to your friend if their address was not unique?
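As a rough illustration, the sketch below checks whether an address falls inside one of the three IPv4 blocks that RFC 1918 reserves for private use. The helper name is_rfc1918 and the sample addresses are my own, chosen only for illustration; the code relies on Python’s standard ipaddress module.

```python
import ipaddress

# The three private IPv4 blocks listed in RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address is inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


# Hypothetical sample addresses: two "house names" and two public addresses.
for sample in ("192.168.1.10", "10.20.30.40", "172.32.0.1", "8.8.8.8"):
    print(sample, is_rfc1918(sample))
```

I define the blocks explicitly rather than using the module’s built-in is_private attribute, because that attribute also covers other special ranges (loopback, for example) that are outside this discussion.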

Okay, you think, so we have these private addresses. But that doesn’t solve the not-enough-addresses problem when we do, in fact, want to communicate with the rest of the Internet. This is a good observation — private addresses do not, by themselves, solve this problem. However, with the use of private addresses, a technology was developed that does solve the problem: Network Address Translation, or NAT.

To understand NAT, we will go back to our private neighborhood. Let us imagine a letter sent from “The Blue House” in “FirstHood” neighborhood to Joan Smith (from Part 2). The address on the envelope would be for Joan Smith, who has a unique address, so that’s well and good. The return address, though, just says “The Blue House,” and we already know that isn’t enough. So, the local post for FirstHood takes the envelope and, before sending it out of the neighborhood, they modify the return address so it says “FirstHood.” They then make a note in their book that The Blue House sent a letter to Joan Smith and is probably expecting a response.

Joan then receives the letter, constructs the reply, and sends it to FirstHood. The FirstHood post person sees this, checks their book, and finds that The Blue House is expecting a reply from Joan Smith. They then forward this response to The Blue House. Given this model, The Blue House in SecondHood could also send a letter to Joan Smith, and everything will work, even though the house addresses are the same. What the post in each neighborhood is doing is Address Translation.

In the networking world, Network Address Translation works essentially like the neighborhood post in the analogy above. Your machine, which has a private address, sends data to the outside Internet through a NAT Gateway. The Gateway rewrites the return address to its own public address, which is unique on the Internet but shared by every machine behind the Gateway, and makes an internal note so that it can track the data flow and deliver the returned information to the right internal host.
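The neighborhood post’s book can be sketched in a few lines of Python. This is a toy model of the analogy, not how any particular gateway is implemented: the Letter and NatGateway names, their fields, and every address below are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Letter:
    return_address: str  # where a reply should be sent
    destination: str     # who the letter is for
    body: str


@dataclass
class NatGateway:
    public_address: str
    # The gateway's "book": remote address -> private host expecting a reply.
    book: Dict[str, str] = field(default_factory=dict)

    def send_out(self, letter: Letter) -> Letter:
        """Note who is writing to whom, then rewrite the return address."""
        self.book[letter.destination] = letter.return_address
        return Letter(self.public_address, letter.destination, letter.body)

    def receive(self, letter: Letter) -> Optional[Letter]:
        """Forward a reply to the private host that is expecting it."""
        private_host = self.book.get(letter.return_address)
        if private_host is None:
            return None  # nobody inside asked for this, so it is not delivered
        return Letter(letter.return_address, private_host, letter.body)


BLUE_HOUSE = "192.168.1.10"  # the same private address in both neighborhoods
JOAN = "198.51.100.7"        # hypothetical public address for Joan Smith

first_hood = NatGateway("203.0.113.1")
second_hood = NatGateway("203.0.113.2")

# The same private address leaves each neighborhood under a different public one.
out_first = first_hood.send_out(Letter(BLUE_HOUSE, JOAN, "Hello from FirstHood"))
out_second = second_hood.send_out(Letter(BLUE_HOUSE, JOAN, "Hello from SecondHood"))

# Joan replies to whatever return address she saw on the envelope.
reply = Letter(JOAN, out_first.return_address, "Hello back")
delivered = first_hood.receive(reply)
print(delivered)
```

One deliberate simplification: because the book is keyed only by the remote address, two houses in the same neighborhood writing to Joan at the same time would overwrite each other’s entry. Real gateways record more detail about each conversation (port numbers, which come up in the next part) to keep such flows apart.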

There are a few side effects of this model. First, all traffic bound for the broader Internet has to pass through this one Gateway, which can introduce performance penalties. Second, since only the Gateway has a valid “public” address, someone in the broader network who wishes to contact a machine in the “private neighborhood” has no good way to do so: the Gateway has no note in its book telling it which internal machine an unrequested letter is for. In practice, most NAT systems provide some mechanism for limited exposure, but that is beyond the basics presented here.

In the next part of this series, we will tackle ports, SSL/TLS, and connection reuse, all while extending the same postal analogy.

Justin, of The Aspiring Principal