Achieve Scalability with Load Balancing

Load balancing

We should always load balance a production application across at least two servers, so that the application stays online when one server restarts. At its simplest, load balancing is the act of making sure that all servers receive roughly the same number of requests over a given time. A well-balanced application is less likely to develop hot nodes, which are servers whose resource usage is much higher than that of the other nodes. We can also add new servers to a well-balanced application to reduce the load on every other server in the cluster.

We’ll discuss the basics of load balancing before looking at how WebSockets can make achieving a well-balanced system more complex than a traditional HTTP-powered application.

The basics of load balancing

A load balancer is specialized software that acts as a proxy between clients and the servers that respond to their requests. Requests are distributed to back-end servers using round-robin, least connections, or other criteria that we define. Load balancers provide many benefits, such as the ability to add or remove back-end servers quickly, create a fair distribution of work, and increase redundancy.
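To make the two named strategies concrete, here is a minimal Python sketch of how a balancer might choose a back-end server. The class names, method names, and server labels are hypothetical and exist only for illustration; a real load balancer also handles health checks, timeouts, and concurrency, which this sketch leaves out.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order (illustrative only)."""

    def __init__(self, servers):
        self._rotation = cycle(list(servers))

    def pick(self):
        # Each call returns the next server in the rotation.
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Picks the server with the fewest open connections (illustrative only)."""

    def __init__(self, servers):
        self.open_connections = {server: 0 for server in servers}

    def pick(self):
        # Ties go to whichever server was listed first.
        server = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[server] += 1
        return server

    def release(self, server):
        # Call when a connection to the server closes.
        self.open_connections[server] -= 1


round_robin = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([round_robin.pick() for _ in range(4)])

least_conn = LeastConnectionsBalancer(["app-1", "app-2"])
least_conn.pick()             # app-1 now has one open connection
least_conn.pick()             # app-2 now has one open connection
least_conn.release("app-2")   # app-2 drops back to zero
print(least_conn.pick())      # app-2 has the fewest connections
```

Round-robin ignores how long each request lasts, so it works best when requests are short and similar in cost. Least connections tracks open connections, which tends to matter more when connections stay open for a long time.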

Here’s an example of a load that is not correctly balanced. The top application server has received many more requests than the other servers in the application.
