Introduction to Azure Load Balancers
Explore how Azure Load Balancers improve service reliability by routing traffic to healthy backend instances. Learn core components, types, and configurations to build load-balanced web services in Azure.
Services fail: web servers go down, databases go offline, and network glitches happen. It’s inevitable. To keep services available to end users despite these failures, you must add a resource that intelligently “routes” traffic to redundant services. One way to add high availability to services in Azure is to use Azure Load Balancer.
Azure Load Balancer is an Azure resource that sits in front of various internal and public-facing services that accept inbound connections. Rather than having a web service or database accept inbound connections directly, Azure Load Balancer is the front line that accepts connections for them.
What does offloading incoming connections to Azure Load Balancer accomplish? Not a whole lot if that were the end of it. To see the true benefit of load balancing, admins bring up multiple instances of the service, each serving the same content. The load balancer can then be configured to detect problematic instances and send traffic only to the instances that are operational.
In this chapter, we’re going to dive into Azure Load Balancers. We’ll speak to how they work, what their purpose is, and how to build a load-balanced web service in Azure.
The basics of Azure Load Balancer
Like other load balancers, Azure Load Balancer’s primary purpose is to sit in front of a group of identical services and route traffic accordingly. It does this using six primary components:
- Frontend IP Configurations
- Load Balancing Rules
- Inbound NAT Rules
- Outbound Rules
- Backend Pools
- Health Probes
Azure Load Balancer comes in two different SKUs: Standard and Basic. Not all components discussed in this chapter will be available with the Basic SKU.
Let’s say you have a website running in Azure on a virtual machine. As of now, your website is accepting connections directly from the Internet.
One day, the VM goes down and no one can access the website anymore. To prevent this in the future, you need to “replicate” that website to another VM. You then need to introduce an Azure Load Balancer sitting in front of the website to route traffic only to the website instances that are online.
Creating duplicate copies of a web service
The first step is to get an exact replica of your website on more VM instances. There are multiple ways to do this in Azure; the most common ways are to build a VM scale set or an availability set. In this chapter, you’ll learn all about building a VM availability set, but we’ll leave VM scale sets for another day.
Here’s a quick point to mention about VM scale sets and availability sets. An availability set is commonly deployed to ensure a failure of one or more VMs does not affect service. You are not required to have identical VMs. A scale set is similar but is typically deployed to quickly provision and deprovision VMs based on load. All VMs within a scale set must be identical.
In Azure Load Balancer terms, this set of VMs created with identical copies of your website is called a backend pool. The load balancer knows about each VM instance in the backend pool and can then route traffic accordingly based on health probes.
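The backend pool concept can be sketched with a few lines of Python. This is purely illustrative (the VM names and the `backend_pool` structure are assumptions, not Azure SDK objects), but it shows the core idea: the load balancer tracks every instance in the pool and only considers the ones reported healthy.

```python
# Illustrative sketch (not the Azure SDK): a backend pool is a set of
# instances, and the load balancer routes only to the healthy ones.

backend_pool = {
    "vm-web-1": "healthy",
    "vm-web-2": "unhealthy",  # e.g. this VM failed its health probe
    "vm-web-3": "healthy",
}

# Only healthy instances are eligible to receive new traffic.
eligible = [vm for vm, status in backend_pool.items() if status == "healthy"]
print(eligible)  # ['vm-web-1', 'vm-web-3']
```

How an instance comes to be marked healthy or unhealthy is the job of health probes, covered next.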
Deciding what instance to route traffic to
A load balancer’s primary goal is to route traffic to the most appropriate service instance (a VM in this case). How does the load balancer know which instance is “most appropriate”? With health probes.
Azure Load Balancer uses health probes to continually assess the health of each instance in the backend pool. If a probe determines an instance is “unhealthy” in Azure Load Balancer terms, the load balancer stops routing new traffic to that instance.
In the Azure Portal, the load balancer’s menu includes a page to create a new health probe. The load balancer uses this probe to “ping” a network port (Protocol and Port) at a regular interval (Interval). If the probe fails a certain number of times in a row (Unhealthy threshold), the instance is deemed unhealthy, which prevents future traffic from being routed to it.
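The unhealthy-threshold behavior can be sketched as a small simulation. This is illustrative only (the `HealthProbe` class and its attribute names are assumptions, not part of any Azure API), but it captures the rule: an instance is marked unhealthy once it fails the configured number of consecutive checks, and recovers on the next success.

```python
# Illustrative sketch of health-probe behavior; not the Azure SDK.
# An instance is marked unhealthy after `unhealthy_threshold`
# consecutive failed checks, and healthy again on the next success.

class HealthProbe:
    def __init__(self, unhealthy_threshold: int = 2):
        self.unhealthy_threshold = unhealthy_threshold
        self.consecutive_failures = 0
        self.healthy = True

    def record_check(self, succeeded: bool) -> None:
        if succeeded:
            self.consecutive_failures = 0
            self.healthy = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.unhealthy_threshold:
                self.healthy = False

probe = HealthProbe(unhealthy_threshold=2)
probe.record_check(True)   # instance responding: stays healthy
probe.record_check(False)  # one failure: still healthy
probe.record_check(False)  # second consecutive failure: now unhealthy
print(probe.healthy)       # False
```

Note that a single failed probe does not remove an instance from rotation; only sustained failure does, which avoids flapping on transient network glitches.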
Accepting requests from clients
You now have multiple instances of your website running on different VMs, and Azure Load Balancer knows how to determine whether each instance is healthy. Next, the Azure Load Balancer must be able to accept connections from clients going to your website. A client’s first point of contact with an Azure Load Balancer is the Frontend IP configuration.
The Frontend IP configuration is the entry point for clients. It is the component of the load balancer where you define the “incoming IP address.” In this chapter’s example, the connections are coming from the Internet. In that case, the Azure Load Balancer will be public, meaning that it has a public IP address associated with it.
Azure Load Balancers have two types: public and internal. Public load balancers proxy traffic to and from the Internet. Internal/private load balancers proxy internal traffic inside of a virtual network.
When you assign a public IP address to an Azure Load Balancer, the IP is then informally called a frontend IP address.
Distributing traffic to the backend pool
When a request destined for your website comes into the load balancer, the traffic needs to be directed to a healthy backend pool instance. Azure Load Balancer makes this happen by using load-balancing rules.
A load-balancing rule “maps” an incoming connection request to a particular backend pool instance. If the health probe reports the instance as healthy, the load balancer then, by default, creates a 5-tuple hash. The method of mapping an incoming connection to a backend pool instance is called a distribution mode.

When using the default distribution mode, Azure Load Balancer combines five elements of the connection into a hash that identifies a single flow. This flow carries all of the attributes of the original connection request as well as information about the destination backend instance.
When using the 5-tuple hash distribution mode, the hash will contain:
- Source IP address
- Source port
- Destination IP address
- Destination port
- IP protocol number
Once the hash has been calculated and the flow created, the load balancer then opens up the connection and allows the original client to connect to the backend pool instance.
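The 5-tuple distribution mode can be sketched as follows. This is a simplified illustration, not Azure’s actual hashing algorithm (which is internal to the platform), but it demonstrates the key property: the same five connection elements always hash to the same backend instance, so all packets in a flow reach the same VM.

```python
import hashlib

# Simplified sketch of 5-tuple hash distribution. Azure's real hash
# function is internal, but the idea is the same: hash the five
# connection elements and use the result to pick a backend instance.

def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    five_tuple = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{protocol}"
    digest = hashlib.sha256(five_tuple.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]

backends = ["vm-web-1", "vm-web-2", "vm-web-3"]

# The same 5-tuple maps to the same backend for the lifetime of the flow.
a = pick_backend("203.0.113.7", 49152, "20.0.0.10", 80, "TCP", backends)
b = pick_backend("203.0.113.7", 49152, "20.0.0.10", 80, "TCP", backends)
print(a == b)  # True: every packet in this flow reaches the same VM
```

A different source port (for example, a new browser connection from the same client) produces a different 5-tuple and may land on a different instance, which is how traffic spreads across the backend pool.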