Building a load balancer in Go part 2

Learning Golang by building an HTTP load balancer

Martin Joo

Oct 03, 2024

Introduction

This is the continuation of part 1 where we built a basic, working HTTP load balancer.

There are two important features every load balancer should implement:

Health checks
Connection pools

Health checks

When the load balancer forwards a request to a backend server it needs to know if it’s available or not. If the server is down the load balancer should skip it and send the request to another server.

This is what the workflow looks like:

The load balancer sends a /healthcheck request to each server every X seconds
If the response is 200 the given server is considered healthy and it can accept requests
If the response is not 200 or there’s no response the server is marked as unhealthy and it should be skipped
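On the backend side, the workflow above only needs an endpoint that answers 200 while the server can take traffic. Here is a minimal sketch, assuming the backends are also written in Go. Only the /healthcheck path comes from this post; newBackendMux and probe are made-up names, and probe exists only to exercise the handler in memory without opening a port:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newBackendMux returns a router for a hypothetical backend server.
// Only the /healthcheck path comes from the article; the rest is illustrative.
func newBackendMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthcheck", func(w http.ResponseWriter, r *http.Request) {
		// Respond with 200 to signal that the server can accept requests.
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe sends an in-memory GET request to the handler and returns the status code.
func probe(h http.Handler, path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newBackendMux()
	fmt.Println(probe(mux, "/healthcheck")) // prints 200
	fmt.Println(probe(mux, "/missing"))     // prints 404
}
```

A real backend would probably also check its own dependencies (database, cache) before answering 200, but that depends on the application.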

In order to do that, we need to extend the existing Server struct (which only contains a URL at the moment):

type Server struct {
    url     string
    healthy bool
}

The new healthy field records whether the server is currently considered healthy.

The LoadBalancer struct has a method called HealthCheck that sends the /healthcheck requests to the backend servers:

func (lb *LoadBalancer) HealthCheck() {
    for _, server := range lb.servers {
        res, err := http.Get(server.url + "/healthcheck")
        if err == nil {
            // Close the body so the underlying connection is released.
            res.Body.Close()
        }

        if err != nil || res.StatusCode != http.StatusOK {
            server.healthy = false
            fmt.Printf("Server [%s] is down\n", server.url)
        } else {
            server.healthy = true
            fmt.Printf("Server [%s] is up\n", server.url)
        }
    }
}

It’s very straightforward.

This is Golang’s foreach loop:

for idx, value := range lb.servers

It uses the range keyword. If you only need the value and not the index, you can put _ (the blank identifier) in place of the index, and the compiler won’t complain about an unused variable:

for _, value := range lb.servers

This is true for other cases as well. For example, if you want to ignore a potential error you can do this:

// GetValue has this signature: func GetValue(key string) (int, error)
value, _ := GetValue("mykey")

In this case, the error is simply ignored because we used the _ symbol as the variable name.
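Ignoring an error has a cost, which a small runnable sketch makes visible. The GetValue body below is invented for illustration (a lookup in a hard-coded map); only its signature matches the snippet above:

```go
package main

import (
	"errors"
	"fmt"
)

// GetValue is a hypothetical lookup that returns a value and an error.
func GetValue(key string) (int, error) {
	values := map[string]int{"mykey": 42}
	v, ok := values[key]
	if !ok {
		return 0, errors.New("key not found: " + key)
	}
	return v, nil
}

func main() {
	// Discarding the error: a missing key silently yields the zero value.
	value, _ := GetValue("missing")
	fmt.Println(value) // prints 0

	// Handling the error is usually the safer choice.
	value, err := GetValue("mykey")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(value) // prints 42
}
```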

Back to the HealthCheck function. This is the important part:

res, err := http.Get(server.url + "/healthcheck")

if err != nil || res.StatusCode != http.StatusOK {
    server.healthy = false
    fmt.Printf("Server [%s] is down\n", server.url)
}

If there’s an error or the status code is not 200 it marks the server as unhealthy.

In case you forgot, this is the LoadBalancer struct:

type LoadBalancer struct {
    servers []*Server
    idx     int
}

servers is a slice of pointers to Server. This means you can modify the fields like this:

for _, server := range lb.servers {
    server.healthy = false
}

A pointer holds a memory address. The loop variable server is a copy of the pointer, but it still points to the same Server, so setting server.healthy overwrites the value at that memory address. You are not working on a copy of the Server. You are working on a reference to it.
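To see why the pointers matter, compare the same loop over a slice of values and over a slice of pointers. This is a standalone sketch; the markDownByValue and markDownByPointer functions are made up for the comparison:

```go
package main

import "fmt"

type Server struct {
	url     string
	healthy bool
}

// markDownByValue ranges over a slice of values: each iteration gets a copy.
func markDownByValue(servers []Server) {
	for _, server := range servers {
		server.healthy = false // modifies the copy only
	}
}

// markDownByPointer ranges over a slice of pointers: each iteration gets a
// copy of the pointer, which still points at the original Server.
func markDownByPointer(servers []*Server) {
	for _, server := range servers {
		server.healthy = false // modifies the original
	}
}

func main() {
	values := []Server{{url: "http://127.0.0.1:8000", healthy: true}}
	markDownByValue(values)
	fmt.Println(values[0].healthy) // prints true

	pointers := []*Server{{url: "http://127.0.0.1:8001", healthy: true}}
	markDownByPointer(pointers)
	fmt.Println(pointers[0].healthy) // prints false
}
```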

The next step is to run the HealthCheck() function every 10 seconds. The following example will look weird:

func (lb *LoadBalancer) RunHealthCheck() {
    ticker := time.NewTicker(10 * time.Second)

    go func() {
        for {
            <-ticker.C
            lb.HealthCheck()
        }
    }()
}

This starts a ticker, and every 10 seconds it calls the HealthCheck function.

Before understanding the Go version, let’s discuss this:

setInterval(() => console.log("hello"), 500);

You have probably used setInterval in JavaScript. Can you describe how it works?

setInterval is pushed onto the call stack
It registers the callback function and asks the runtime to start a timer (a Web API in the browser, native code in Node.js)
It completes immediately and is popped off the stack
The rest of the script keeps running because the timer does not block it
After 500ms, when the timer fires, the callback (console.log) is pushed onto the callback queue
Meanwhile, the single-threaded event loop constantly checks whether the call stack is empty
If the call stack is empty, it takes the first item from the callback queue and runs it, which is the console.log() call
The timer fires again every 500ms and the same steps repeat

If any of this is new to you and you want to learn more about how JavaScript (Node.js) works, check out this classic. It’s an awesome video.

There’s an important concept here: timers work in an async manner. Otherwise, they would block the main execution thread and the program would freeze. The event loop keeps executing other work while the timer is waiting.

I was probably using Javascript for 2-3 years before I knew anything about this event loop and callback queue thing. In other words: I was a user of the language but I had no idea how it worked or what I was doing. It’s only one line of code but so many things are happening under the hood:

setInterval(() => console.log("hello"), 500);

Golang does not allow you to write concurrent code without understanding it. There’s no way you can come up with all that funky stuff without understanding it. I mean, there’s a go keyword, an infinite loop, a left arrow:

func (lb *LoadBalancer) RunHealthCheck() {
    ticker := time.NewTicker(10 * time.Second)

    go func() {
        for {
            <-ticker.C
            lb.HealthCheck()
        }
    }()
}

I won’t go into concurrency details in this post, because there will be a dedicated post about it.

In a nutshell, this is how it works.

go func(){}() runs an anonymous function as a goroutine. Goroutines are lightweight threads managed by the Go runtime. You can think of them as classic threads, but that is not entirely accurate: the Go scheduler multiplexes many goroutines onto a small number of OS threads.

As we know from the JavaScript example, the waiting has to happen somewhere other than the main flow of execution (here, in a separate goroutine), otherwise the ticker would block it.

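Goroutines usually communicate via channels instead of sharing variables. This is a convention, not a restriction: goroutines can access the same memory (the health check goroutine writes to the same Server values that the request handlers read), but channels make the hand-off explicit. A function started with the go keyword cannot return a value to its caller, so channels are also how results get back. Channels are special, typed variables. A goroutine can send a value into a channel using the syntax: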

channel <- "hello"

Other goroutines can read values from the channel with:

value := <-channel

It’s a communication channel, kind of like a message queue between components.
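Here is a small runnable sketch of both operations. The produce and collect functions are made-up names. One goroutine sends strings into an unbuffered channel and the main goroutine receives them until the channel is closed:

```go
package main

import "fmt"

// produce sends each message on the channel and then closes it.
func produce(messages []string, out chan<- string) {
	for _, m := range messages {
		out <- m // send: blocks until a receiver is ready (unbuffered channel)
	}
	close(out)
}

// collect receives from the channel until it is closed.
func collect(in <-chan string) []string {
	var received []string
	for m := range in {
		received = append(received, m)
	}
	return received
}

func main() {
	channel := make(chan string)
	go produce([]string{"hello", "world"}, channel)
	fmt.Println(collect(channel)) // prints [hello world]
}
```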

The NewTicker() function returns a Ticker struct:

ticker := time.NewTicker(10 * time.Second)

The ticker sends the current time on a channel every 10 seconds. The channel is available as ticker.C.

The function then starts a goroutine as an anonymous function:

go func() {
    for {
        <-ticker.C
        lb.HealthCheck()
    }
}()

The goroutine starts an infinite loop. This is necessary since the ticker never ends. We want to read from its channel until the program exits.

The <-ticker.C expression reads the next value from the channel. It is crucial to understand that reading from a channel is a blocking operation. So the goroutine will pause and wait until a new value is sent to the channel by the ticker. After that, it resumes, and lb.HealthCheck() is executed.



And now, in the main function, the health check process can be started:

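package main

import (
    "net/http"

    // Hypothetical import path: replace it with the path of your own loadbalancer package.
    "example.com/loadbalancer"
)

func main() {
    // Building Server values from another package requires an exported field (URL).
    servers := []*loadbalancer.Server{
        &loadbalancer.Server{
            URL: "http://127.0.0.1:8000",
        },
        &loadbalancer.Server{
            URL: "http://127.0.0.1:8001",
        },
    }

    // Assumes a NewLoadBalancer constructor that takes the list of servers.
    lb := loadbalancer.NewLoadBalancer(servers)

    // Servers start as unhealthy (the zero value of bool), so check once
    // right away instead of waiting 10 seconds for the first tick.
    lb.HealthCheck()
    lb.RunHealthCheck()

    err := http.ListenAndServe(":8080", lb)
    if err != nil {
        panic(err)
    }
}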

Of course, we still have to solve a problem. The round-robin balancing algorithm must respect the status of a server and skip it if it’s unhealthy.

This is what the function looks like now:

func (lb *LoadBalancer) NextServer() *Server {
    server := lb.servers[lb.idx]
    lb.idx = (lb.idx + 1) % len(lb.servers)

    return server
}

It’s a simple round-robin algorithm that always returns the next server and it goes back to the first if the index overflows.

Since it can be quite tricky to implement a round-robin while having unhealthy servers, I made a simplified version:

If there’s at least one unhealthy server go back to the first one and start over. Skip the unhealthy ones.
Otherwise, just do the same old round-robin

It’s not the most efficient but it’s the simplest for sure:

func (lb *LoadBalancer) NextServer() (*Server, error) {
    if lb.hasUnhealthy() {
        // Find the first healthy server; idx ends up at len(lb.servers) if there is none.
        idx := 0
        for ; idx < len(lb.servers); idx++ {
            if lb.servers[idx].healthy {
                break
            }
        }

        lb.idx = idx
    }

    if lb.idx == len(lb.servers) {
        lb.idx = 0

        return nil, errors.New("no healthy servers")
    }

    server := lb.servers[lb.idx]
    lb.idx = (lb.idx + 1) % len(lb.servers)

    return server, nil
}

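The for loop inside the if statement finds the first healthy server. If there isn’t any, idx reaches len(lb.servers) and the function returns an error. Otherwise, it does the same thing as earlier. Note that as long as at least one server is unhealthy, every request goes to the first healthy server, because the index is reset on each call.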
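The hasUnhealthy helper is not shown in this post. A minimal sketch of what it could look like, assuming it only needs to report whether any server is currently marked as unhealthy:

```go
package main

import "fmt"

type Server struct {
	url     string
	healthy bool
}

type LoadBalancer struct {
	servers []*Server
	idx     int
}

// hasUnhealthy reports whether at least one server is marked as unhealthy.
// This is an assumed implementation; the original helper is not shown.
func (lb *LoadBalancer) hasUnhealthy() bool {
	for _, server := range lb.servers {
		if !server.healthy {
			return true
		}
	}
	return false
}

func main() {
	lb := &LoadBalancer{servers: []*Server{
		{url: "http://127.0.0.1:8000", healthy: true},
		{url: "http://127.0.0.1:8001", healthy: false},
	}}
	fmt.Println(lb.hasUnhealthy()) // prints true
}
```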

It’s a simplified version but it works just fine.
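If you would rather keep rotating among the healthy servers while some are down, one possible variant is to start from the current index and try each server at most once. This is a sketch of an alternative, not the version used in this series. The mutex and the SetHealthy helper are additions of mine: net/http serves requests concurrently and the health check runs in its own goroutine, so idx and the healthy flags can be touched from several goroutines at once.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type Server struct {
	url     string
	healthy bool
}

type LoadBalancer struct {
	mu      sync.Mutex // guards idx and the healthy flags
	servers []*Server
	idx     int
}

// NextServer returns the next healthy server in round-robin order.
// It tries each server at most once, starting from lb.idx.
func (lb *LoadBalancer) NextServer() (*Server, error) {
	lb.mu.Lock()
	defer lb.mu.Unlock()

	for i := 0; i < len(lb.servers); i++ {
		candidate := (lb.idx + i) % len(lb.servers)
		if lb.servers[candidate].healthy {
			lb.idx = (candidate + 1) % len(lb.servers)
			return lb.servers[candidate], nil
		}
	}
	return nil, errors.New("no healthy servers")
}

// SetHealthy updates a server's status under the same lock (hypothetical helper).
func (lb *LoadBalancer) SetHealthy(i int, healthy bool) {
	lb.mu.Lock()
	defer lb.mu.Unlock()
	lb.servers[i].healthy = healthy
}

func main() {
	lb := &LoadBalancer{servers: []*Server{
		{url: "A", healthy: true},
		{url: "B", healthy: false},
		{url: "C", healthy: true},
	}}

	for i := 0; i < 4; i++ {
		server, err := lb.NextServer()
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Print(server.url, " ")
	}
	fmt.Println() // the loop prints: A C A C
}
```

The trade-off is a lock on every request. For a learning project that cost is negligible, and it keeps the data race detector (go run -race) quiet.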

To be continued

That’s it for today. We implemented a basic, but working health check mechanism for the load balancer. In the next post, I’m going to add connection pools.

Computer Science Simplified