Building a key-value store part 3
Designing and building a Redis-like in-memory key-value tree store
Introduction
This is part 3 of a 6-part series. In the last two parts, we’ve created a simple in-memory, key-value store that starts a TCP server and accepts requests from clients. It can store binary search trees in the memory. However, the application doesn’t have any persistence. The moment you stop the server, you lose your data.
In Part 1, I introduced Redis’ persistency layers:
Append-only log (AOL)
Redis database (RDB)
If you want to try out the end product you can download the binaries here. Check out the project’s README. If there’s a problem you can build your own binaries with make build
AOL
Just a quick recap:
Whenever a create, update, or delete (CUD) command is executed, the server needs to write it into a log file
When you restart the server it should read the file entry-by-entry and re-execute the commands
This will basically reconstruct the whole database in memory.
Fortunately, the file structure can be very simple. Similar to a standard log file:
bstadd categories 10\n
bstadd categories 2\n
bstadd categories 14\n
Each line contains one command. This is the simplest file format for both reading and writing.
To efficiently read a file like this we can use bufio
which is Go’s implementation of buffered I/O. Here’s an example:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
file, err := os.Open("example.txt")
if err != nil {
fmt.Println(err)
return
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
fmt.Println(line)
}
}
scanner.Scan()
reads a chunk from the file, stores it in an internal buffer, and then scanner.Text()
returns one line. The default buffer size is 4KB which is quite efficient since (as far as I know) this is the smallest chunk the disk can return in one read.
So scanner
reads 4KB of data (lots of lines), stores it internally, processes it, and then returns it line by line. This is an efficient way to read a whole file because it decreases the number of I/O operations from numberOfLines
to fileSize / bufferSize. fgets
in PHP or BufferedReader
in Java works similarly. They both use internal buffers and chunk reading.
Appending
Writing a single command in this format is quite easy:
func Append(command string, key string, args []int64) error {
file, err := os.OpenFile(
fileName,
os.O_APPEND|os.O_CREATE|os.O_WRONLY,
0644,
)
if err != nil {
return err
}
defer file.Close()
writer := bufio.NewWriter(file)
_, err = writer.WriteString(
fmt.Sprintf("%s;%s;%s\n", command, key, convertArgs(args)),
)
if err != nil {
return err
}
err = writer.Flush()
if err != nil {
return err
}
return nil
}
I added delimiters into the string:
fmt.Sprintf("%s;%s;%s\n", command, key, convertArgs(args)),
When replaying the AOL we need to parse these commands into Command objects. The easiest way is to use delimiter characters. So an actual line looks like this:
bstadd;categories;10
But commands can have many arguments, right? Yes, so we need to convert the integer array into a string that looks like this: “1,2,3,4,5“
This is done in the convertArgs function:
func convertArgs(args []int64) string {
strArgs := make([]string, len(args))
for i, v := range args {
strArgs[i] = fmt.Sprint(v)
}
return strings.Join(strArgs, ",")
}
Since AOL stands for Append-Only Log I used the function Append() instead of Write().
Reading
When starting the server it needs to replay every command from the AOL. To do that, we need to read every single line. We’ve already seen how bufio’s Scanner can be used.
Here’s the Read() function:
func Read() ([]command.Command, error) {
var cmds []command.Command
file, err := os.Open(fileName)
if err != nil {
return []command.Command{}, err
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
parts := strings.Split(line, ";")
argsStr := parts[2]
args := parseArgs(argsStr)
cmd, err := command.Create(
strings.ToUpper(parts[0]),
parts[1],
args,
)
if err != nil {
return []command.Command{}, err
}
cmds = append(cmds, cmd)
}
return cmds, nil
}
When reading a line it needs to split it by the delimiter character:
line := scanner.Text()
parts := strings.Split(line, ";")
argsStr := parts[2]
args := parseArgs(argsStr)
The result is:
parts[0] contains the command. “bstadd” for example
parts[1] contains the key. “product-categories” for example
part[2] contains the argument as a string. “1,2,3“ for example
We need to convert the argument string into an integer array:
func parseArgs(args string) []int64 {
argsFormatted := strings.Split(args, ",")
argsInt := make([]int64, len(argsFormatted))
for i, arg := range argsFormatted {
argInt, _ := strconv.Atoi(arg)
argsInt[i] = int64(argInt)
}
return argsInt
}
It splits the string by the delimiter character
It creates an integer array
It loops through the string array
Converts every argument to an integer and returns the results
The last part of the Read() function creates commands out of the line:
cmd, err := command.Create(
strings.ToUpper(parts[0]),
parts[1],
args,
)
if err != nil {
return []command.Command{}, err
}
cmds = append(cmds, cmd)
Replaying
Keep reading with a 7-day free trial
Subscribe to Computer Science Simplified to keep reading this post and get 7 days of free access to the full post archives.