Introduction
What is a load balancer?
When I first learned about the topic, I thought a load balancer was an "exact thing," for example, an nginx config that balances load between servers. But as I learned more, I realized that a load balancer is a concept.
It's not a specific technology. It can be applied on multiple levels:
Your operating system's CPU scheduler is a load balancer. You probably have 8 or 16 cores, but if you run htop you'll see hundreds of tasks. The most important job of the scheduler is to balance this load across the cores. It works on the hardware level.
PHP-FPM is a load balancer. It has one master process and many worker processes. The master process accepts requests and forwards them to the worker processes. So it's a load balancer that works on the process level.
nginx is a load balancer. I mean, the internal workings of nginx. It has a master process and many worker processes. It works the same way as FPM but it handles all kinds of HTTP requests. It balances requests across worker processes. It also works on the process level.
docker-compose is a load balancer. Yes, you can have multiple replicas of the same container with docker-compose. For example, you can have multiple replicas of your API container listening on the same port and then balance the load with a simple nginx load balancer. It also works on the process level. Or I should say "container-level."
Your Laravel queue is a load balancer. Let's say you have 5 workers and 100 jobs. The jobs are distributed across the workers in order.
nginx is a load balancer. I mean, the user-facing part of nginx. You can implement a load balancer that distributes the traffic across multiple servers. You can run your load balancer on a dedicated server. It works on the server level.
Cloud providers offer managed load balancers. They are dedicated servers and usually run nginx or Traefik (which is a fancy nginx). It works on the server level.
And of course, Docker Swarm and Kubernetes have their own load balancer implementations (ingress). They work on the server/process level.
Your own algorithm can be a load balancer as well. For example, imagine you're working on an inventory management system. When a customer orders a product, you need to choose a warehouse. You might implement a FIFO or LIFO algorithm, or select the warehouse closest to the customer, and of course you need to check the quantities. Either way, you're balancing the "load" (orders) across warehouses, as in the sketch below. It works on the "business logic level."
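Here's a minimal sketch of what such a "business logic level" load balancer could look like in PHP. The Warehouse class and the distance logic are made up for illustration, they're not part of any real system:

<?php

// Hypothetical warehouse record, just for this sketch.
class Warehouse
{
    public function __construct(
        public string $name,
        public int $quantity,
        public float $lat,
        public float $lon,
    ) {}
}

// Naive "as the crow flies" distance, good enough for the example.
function distance(Warehouse $w, float $lat, float $lon): float
{
    return sqrt(($w->lat - $lat) ** 2 + ($w->lon - $lon) ** 2);
}

// Pick the closest warehouse that still has enough stock.
function selectWarehouse(array $warehouses, int $orderedQty, float $lat, float $lon): ?Warehouse
{
    $candidates = array_values(array_filter(
        $warehouses,
        fn (Warehouse $w) => $w->quantity >= $orderedQty,
    ));

    usort(
        $candidates,
        fn (Warehouse $a, Warehouse $b) => distance($a, $lat, $lon) <=> distance($b, $lat, $lon),
    );

    return $candidates[0] ?? null;
}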
So load balancing is not an exact thing or technology but a concept that can be applied in various situations. In this chapter, we're going to explore three different load balancers:
Plain nginx balancing the load across VPSs (virtual private servers)
Managed load balancers balancing the load across VPSs
docker-compose balancing the load across containers
nginx as a load balancer
This is the bare minimum nginx configuration for a load balancer:
user www-data;

events {}

http {
    upstream backend {
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
        }
    }
}
The proxy_pass directive forwards requests to a given destination. Here, every request that comes in to / is forwarded to backend, which is an upstream. An upstream defines a group of servers that will handle incoming requests.
In this example, the backend upstream defines a group of three servers: backend1.example.com, backend2.example.com, and backend3.example.com. Whenever you refer to http://backend you refer to the upstream, which represents the three backend servers. nginx balances the load across them using the round-robin algorithm: each backend server gets an equal share of the requests, in circular order.
As you can see, it goes through the servers in a "circular order:"
R1 gets handled by Backend1
R2 gets handled by Backend2
R3 gets handled by Backend3
And then it starts over
R4 gets handled by Backend1
R5 gets handled by Backend2
R6 gets handled by Backend3
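Round-robin is only the default. nginx supports other strategies on the upstream as well, for example weights or "least connections." A sketch (the weight value here is made up):

upstream backend {
    # send each request to the server with the fewest active connections
    least_conn;

    # weight=2: backend1 gets roughly twice as many requests as the others
    server backend1.example.com weight=2;
    server backend2.example.com;
    server backend3.example.com;
}

There's also ip_hash if you need sticky sessions based on the client's IP address. For this chapter, plain round-robin is enough.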
To test it, I created three DigitalOcean servers: one load balancer and two application servers. I didn't buy a domain for this, so I'm using IP addresses:
upstream backend {
    server 167.99.133.163;
    server 64.227.127.32;
}
On the servers, I'll run a very simple PHP script:
<?php

echo json_encode([
    'data' => [
        'hostname' => gethostname(),
    ],
]);
It returns the hostname of the server, such as node1 or node2.
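On node1, for example, the response should look something like this:

{"data":{"hostname":"node1"}}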
To serve this PHP script we need a minimal nginx config on the servers:
server {
    listen 80;

    index index.php;

    location / {
        root /var/www/html;

        include /etc/nginx/fastcgi.conf;
        fastcgi_pass unix:/run/php/php8.0-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param PATH_INFO $fastcgi_path_info;
    }
}
Every request is served by index.php, which always returns the hostname. If I now hit node1 it always returns node1, and node2 responds with node2:
But if I hit the load balancer's IP address you can see it starts to distribute the traffic between the two nodes:
This is round-robin in action. Of course, it doesn't matter if you have only a single PHP file or a whole Laravel application. Load balancing is the same. You have a dedicated server with an nginx config that distributes the incoming traffic across your nodes.
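If you want to verify the round-robin behavior without clicking around in a browser, a quick-and-dirty PHP script like this works too (the load balancer address is a placeholder, use your own IP):

<?php

// Hit the load balancer a few times and count which node served each request.
$loadBalancer = 'http://<load-balancer-ip>/'; // placeholder, replace with your own

$counts = [];

for ($i = 0; $i < 10; $i++) {
    $response = file_get_contents($loadBalancer);

    $data = json_decode($response, true);
    $hostname = $data['data']['hostname'] ?? 'unknown';

    $counts[$hostname] = ($counts[$hostname] ?? 0) + 1;
}

// With two nodes and round-robin you should see a roughly 5-5 split.
print_r($counts);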
Managed load balancers
Every cloud provider offers managed load balancers. The main benefits are:
You don't have to maintain your load balancer configuration. By maintaining, I mean adding new servers to the upstream, changing domain names, and so on, and of course running the actual server, upgrading it, and restarting nginx when the configuration changes.
Better/easier monitoring. Managed load balancers offer load balancer-specific metrics/graphs by default.
Built-in health checks. (A self-managed nginx approximation is sketched after this list.)
Cheap. On DigitalOcean it costs $12/month to have a managed load balancer that handles 10000 concurrent connections. Of course, you can achieve the same with a $6 droplet and a custom nginx configuration, but $12 is still pretty cheap, and in this case, 10000 is guaranteed.
Scaling, high availability. You can have multiple replicas of your load balancer. So if you know you need ~100000 concurrent connections, you can just increase the number of nodes to 10. For $120/month you can now handle 100000 concurrent connections.
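For comparison, the closest you get with plain open-source nginx is passive health checks: marking a server as unavailable after a few failed requests. A sketch using the droplets from this example (the thresholds are arbitrary):

upstream backend {
    # after 3 failed requests, take the server out of rotation for 30 seconds
    server 167.99.133.163 max_fails=3 fail_timeout=30s;
    server 64.227.127.32 max_fails=3 fail_timeout=30s;
}

Managed load balancers actively probe your servers on a schedule instead, and you configure it on a dashboard rather than in a config file.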
I'm going to use DigitalOcean again. You can find load balancers under the Networking nav item. This is what the scaling configuration looks like:
And this is the "upstream" configuration:
You can select particular droplets or tags as your upstream servers. In this example, I use the tag lb-example. As you can see, there are two droplets tagged as lb-example. Tags are great because you can just add or remove them on other servers and your load balancer will automatically apply the changes.
Forwarding rules are quite straightforward. In this case, the load balancer forwards incoming requests from port 80 to port 80 on the target droplets. Of course, you need to use port 443 if you set up a certificate (which I didn't do for this example).
And that's it! With just 3 minutes of setup, we have a load balancer that handles 20000 concurrent connections.
This is the dashboard:
It figured out the exact droplets based on the tag which is a pretty good feature. You don't have to manually update any nginx configuration when you add a new server. You just tag your server which is pretty easily scriptable.
And of course, we have graphs as well:
HTTP requests per second and connections are probably the most important ones.
Load balancing with docker-compose
docker-compose can also be used as a load balancer. If you check out the documentation, there are some options under the deploy key that can be used with docker-compose even when you're not in Swarm mode. replicas is one of them:
version: '3.8'

services:
  app:
    build:
      dockerfile: Dockerfile
      context: .
    deploy:
      replicas: 2
    ports:
      - "8000"
If you run docker-compose up this is what happens:
The app service runs two containers: lb-app-1 and lb-app-2. We specified two replicas, so docker-compose started two containers, just like Swarm or Kubernetes.
Both containers bind to a random port on the host machine. In my case, they are 64919 and 64965. When you define ports in a compose file you have a number of options. The most popular is HOST:CONTAINER, which is what we used earlier in the book. But you can omit the HOST port and specify only the container port with "8000". This means the container listens on port 8000 and docker-compose binds it to a random (high) port on the host machine. This is exactly what we need if we want to run a service in multiple replicas.
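Side by side, the two syntaxes look like this for the same service (the 8080 host port is made up for the example):

# HOST:CONTAINER - a fixed mapping; with two replicas the second container
# could not start because host port 8080 would already be taken
ports:
  - "8080:8000"

# container port only - Docker picks a random high host port for each replica
ports:
  - "8000"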
As you might expect, now we access the application on two different ports:
Now everything is ready to create a load balancer service that distributes the traffic between the two app containers. In this case, the application itself is a simple nodejs server instead of PHP; I'll explain why at the end of the chapter, but from the load balancer's point of view there's no real difference.
But in this case, we don't need an upstream because we don't have multiple servers. It's just a simple proxy_pass to the app service:
events {}

http {
    server {
        listen 3000;

        location / {
            proxy_pass http://app:8000;
        }
    }
}
So nginx forwards the requests to app:8000. Since we wrote replicas: 2 in the compose file, docker-compose distributes the requests between the two replicas. We need another service for the load balancer as well:
load-balancer:
  image: nginx:1.25-alpine
  restart: unless-stopped
  volumes:
    - ./load-balancer.conf:/etc/nginx/nginx.conf:ro
  ports:
    - "3000:3000"
  depends_on:
    - app
This is what the whole process looks like:
docker-compose acts like an internal load balancer for the containers. From nginx's point of view, there's only a regular hostname with a port that refers to the compose service.
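One caveat worth knowing: nginx resolves the app hostname when it starts, so if you later scale the service or the containers get new IP addresses, you'd have to restart the load balancer. A common workaround (a sketch, not part of the example above; only the http block is shown) is to use Docker's embedded DNS server together with a variable, which makes nginx re-resolve the name at request time:

http {
    server {
        listen 3000;

        # 127.0.0.11 is Docker's embedded DNS server
        resolver 127.0.0.11 valid=10s;

        location / {
            # using a variable forces nginx to resolve "app" per request (honoring the TTL above)
            set $app_upstream http://app:8000;
            proxy_pass $app_upstream;
        }
    }
}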
If you now try to hit the load balancer you can see the two containers serving the requests in a round-robin fashion:
This is the complete docker-compose file:
version: '3.8'

services:
  load-balancer:
    image: nginx:1.25-alpine
    restart: unless-stopped
    volumes:
      - ./load-balancer.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "3000:3000"
    depends_on:
      - app

  app:
    build:
      dockerfile: Dockerfile
      context: .
    deploy:
      replicas: 2
    ports:
      - "8000"
    volumes:
      - ./index.js:/usr/src/app/index.js
That’s it for today. These are the basics every software engineer should know about load balancers, in my opinion. In a future post, I’ll build one.