Programming for Multiple Networks

Learn about applications listening on a socket, multihomed servers, and physical hosts.

Listening on a socket

This multitude of interfaces affects the application software. By default, an application that listens on a socket will listen for connection attempts on any interface. Language libraries always have an “easy” version of listening on a socket. The easy version just opens a socket on every interface on the host. Bad news! Instead, we have to do it the hard way and specify which IP address we are opening the socket for:

ln, err := net.Listen("tcp", ":8080") // Bad approach
ln, err := net.Listen("tcp", "spock.example.com:8080") // Good approach

To determine which interfaces to bind to, the application must be told its own name or IP addresses. This is where multihomed servers differ most from development boxes. In development, the server can always call its language-specific version of getLocalHost(), but on a multihomed machine that call simply returns the IP address associated with the server’s internal hostname. That could be any of the interfaces, depending on local naming conventions. Server applications that need to listen on sockets must therefore add configurable properties to define which interfaces the server should bind to.
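
For example, here is a minimal sketch in Go of treating the bind address as a configurable property. The LISTEN_ADDR environment variable name and the decision to refuse to start without it are illustrative assumptions, not requirements:

package main

import (
	"log"
	"net"
	"net/http"
	"os"
)

func main() {
	// Hypothetical configuration property: the operator supplies the exact
	// interface and port to bind, for example "spock.example.com:8080".
	addr := os.Getenv("LISTEN_ADDR")
	if addr == "" {
		log.Fatal("LISTEN_ADDR is not set; refusing to listen on every interface")
	}

	ln, err := net.Listen("tcp", addr)
	if err != nil {
		log.Fatalf("cannot listen on %s: %v", addr, err)
	}

	// Serve requests only on the configured interface, not on 0.0.0.0.
	log.Fatal(http.Serve(ln, http.NotFoundHandler()))
}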

Outbound connections

Under exceedingly rare conditions, an application also has to specify which interface it wants traffic to leave from when connecting to a target IP address. For production systems, I would regard this as a configuration error in the host: it means multiple routes reach the same destination, hooked to different NICs. The exception is when two NICs connected to two switches are bonded into a single interface. Suppose “en0” and “en1” are connected to different switches, but also bonded as “bond0.” Without any additional guidance, an application opening an outbound connection won’t know which interface to use. The solution is to ensure that the routing table has a default gateway using “bond0.”
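
If an application really does need to pick the outbound interface itself, Go’s net.Dialer accepts an explicit local address. This is a sketch only; the source and target addresses below are made up for illustration:

package main

import (
	"log"
	"net"
	"time"
)

func main() {
	// Hypothetical source address: the IP bound to the interface that outbound
	// traffic should leave from. Port 0 lets the OS choose an ephemeral port.
	laddr, err := net.ResolveTCPAddr("tcp", "192.0.2.10:0")
	if err != nil {
		log.Fatal(err)
	}

	d := net.Dialer{
		LocalAddr: laddr,
		Timeout:   5 * time.Second,
	}

	// The connection originates from 192.0.2.10 instead of whatever interface
	// the routing table would otherwise pick.
	conn, err := d.Dial("tcp", "target.example.com:443")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	log.Printf("connected from %s to %s", conn.LocalAddr(), conn.RemoteAddr())
}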

With that under our belts, we now have enough networking knowledge to talk about the hosts and the layers of virtualization on them.

Physical hosts, virtual machines, and containers

At some level, all machines are the same. Eventually, all our software runs on some piece of precisely patterned silicon. All our data winds up on glass platters of spinning rust or encoded in minute charges on NAND gates. That’s where the similarity ends. A bewildering array of deployment options forces us to think about the machines’ identities and lifespans. These aren’t just packaging issues, either. A design that works nicely in a physical data center environment may cost too much or fail utterly in a containerized cloud environment. In this section, we’ll look at these deployment options and how they affect software architecture and design in each kind of environment.

Physical hosts

The CPU is one place where the data center and the development boxes have converged. Pretty much everything these days runs on a multicore Intel or AMD x86 processor in 64-bit mode. Clock speeds are roughly the same, too. If anything, development machines tend to be a bit beefier than the average pizza box in the data center. That’s because the story in the data center is all about expendable hardware.

This is a huge shift from just ten years ago. Before the complete victory of commodity pricing and web scale, data center hardware was built for high reliability of the individual box. Our philosophy now is to load-balance services across enough hosts that the loss of a single host is not catastrophic. In that environment, you want each host to be as cheap as possible.

There are two exceptions to this rule. Some workloads require large amounts of RAM in the box. Think graph processing rather than ordinary HTTP request/response applications. The other specialized workload is GPU computing. Some algorithms are embarrassingly parallel, so it makes sense to run them across thousands of vector-processing cores.

Data center storage still comes in a bewildering variety of forms and sizes. Most of the useful storage won’t be directly on the individual hosts. In fact, our development machine probably has more storage than one of our data center hosts will have. The typical data center host has enough storage to hold a bunch of virtual machine images and offer some fast local persistent space. Most of the bulk space will be available either as SAN or NAS. Don’t be fooled by the similarity in those acronyms. Bloody trench wars have been fought between the two camps. (It’s easier to make trenches in a data center than we might think. Just pop up a few raised floor panels.) To an application running on the host, though, both of them just look like another mount point or drive letter. Our application doesn’t need to care too much about what protocol the storage speaks. Just measure the throughput to see what we’re dealing with. Bonnie 64 will give us a reasonable view with a minimum of fuss.

All in all, the picture is much simpler today than it once was. Design for production hardware for most applications just means building to scale horizontally. Look out for those specialized workloads and shift them to their own boxes. For the most part, however, our applications won’t be running directly on the hardware. The virtualization wave of the early 2000s left no box behind.