How to Build a Fault-Tolerant Redis Network With Spring Boot and Docker

Clusters with replication and failover

Published in Better Programming · 12 min read · Jun 10, 2020

Editor’s note: this article has been edited to use a more thoughtful primary/secondary relationship between machines rather than the former, culturally insensitive terminology. In following along, depending on the terminology you use locally, you may need to update what we reference as primary/secondary here to match your systems. Thanks!

In distributed systems, achieving fault tolerance is one of the key criteria for success.

Let’s look at achieving fault tolerance and replication in a Redis network with Redis cluster and sentinels.

In this tutorial I’ll cover:

Introduction to Redis
Creating a Spring Boot application with Redis cache using Docker
Different ways to build a fault-tolerant Redis network
Sharding with a Redis cluster
Replication with a Redis cluster
Redis primary-secondary network with sentinel (no sharding)

Introduction to Redis

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.
Redis has built-in replication, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
In order to achieve its outstanding performance, Redis works in-memory.

— Introduction to Redis

Some key Redis features to note before we dig into the coding:

Maximum memory: By default, Redis has no memory limit on 64-bit systems and an implicit limit of 3 GB on 32-bit systems. More memory can hold more data and raise the hit ratio, one of the most important cache metrics, but beyond a certain size the hit ratio levels off.
Eviction algorithms: When the cache size reaches the memory limit, old data is removed to make space for new. Redis offers Least Recently Used (LRU) and Least Frequently Used (LFU) eviction algorithms.
Durability: For various reasons, you may want to persist your cache. After startup the cache is initially empty, so it is useful to fill it from snapshot data when recovering from an outage. Redis supports different ways to achieve persistence: RDB takes point-in-time snapshots after a specific interval of time or number of writes, and AOF creates a persistence log with every write operation.
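To make the eviction idea concrete, here is a minimal Java sketch of least-recently-used eviction. It is only an illustration of the policy: Redis itself uses an approximated LRU over sampled keys, and the class name, capacity, and keys below are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch {

    /** A tiny LRU cache: an access-ordered LinkedHashMap that drops its eldest entry when full. */
    public static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public LruCache(int maxEntries) {
            super(16, 0.75f, true); // true = iterate in access order, least recently used first
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after every put; returning true evicts the least recently used entry.
            return size() > maxEntries;
        }
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2); // hypothetical limit of two entries
        cache.put("todo:1", "buy milk");
        cache.put("todo:2", "write tests");
        cache.get("todo:1");                 // touching todo:1 makes todo:2 the eviction candidate
        cache.put("todo:3", "deploy");       // cache is full, so todo:2 is evicted
        System.out.println(cache.keySet()); // prints [todo:1, todo:3]
    }
}
```

In Redis you would not write this yourself; the equivalent behavior is selected through the maxmemory and maxmemory-policy settings.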

Develop Spring Boot + Redis + Docker App

At the end of this article, I’ll make several updates to the GitHub repo for this demo. Take a look at the releases if you would like to see the code at various points in this article.

GitHub repository: avinash10584/spring-boot-redis-cluster (github.com)

Let’s build our Redis app!

Step 1: Create a Simple Spring Boot TODO List App

First, let's create a simple TODO list app. We won’t be creating it for multiple users, just to keep it simple.

For now, we simply store one simple TODO list in our app as a cache. We’re not using a database yet.

Step 2: Download and Start Redis Docker Image

Download the official Redis image from Docker hub:

docker pull redis

After this command, the new image should be present in your local repository (run docker images to check).

The project in GitHub is configured to use both standalone and cluster mode.

First, let’s use Redis in standalone mode. We start the Redis image we pulled from Docker Hub:

docker run --rm -p 4025:6379 -d --name redis-1 redis redis-server

Step 3: Integrating Redis to the Spring Boot Application

We’ll be using the Spring Boot cache support to talk to Redis. Spring comes with several annotations you can add to help it work with a Redis cache.

Add @EnableCaching to your application configuration class to enable these annotations.

Now, let’s build our Docker image to make sure the Spring Boot app can talk to Redis. The Dockerfile for the project is located in the GitHub repo. I’m focusing on Redis, so I will skip the details of the Dockerfile for the Spring Boot app.

If you want to learn more about the Dockerfile used for the project and understand how to reduce Docker build time with caching, take a look at this article:

5 Essential Docker tips for your Spring Boot images 🐳

Keep up with the latest and best practices to build spring boot docker images.

medium.com

DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .

Once our image is built, the next step is to run the image:

docker run --rm -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster

If you run the application and visit http://localhost:4024/app/ you’ll get the error ConnectTimeoutException: connection timed out.

This is because our two Docker containers need to talk to each other, but by default they sit on their own network and are isolated from each other.

We have two options at this point:

Use Docker network

Or,

Create a docker-compose that builds a default network

For this demo, I’m creating a Docker network, but let me know in the comments if you would like a docker-compose file added to the project.

docker network create spring-redis-network

Now let’s connect our Redis container to this network (the Spring Boot container will join it when we start it with --net below):

docker network connect spring-redis-network redis-1

We can now look up the IP address of our Redis instance so we can put it in application.yml. You can avoid this lookup if you use docker-compose, as it can bind services without specifying IP addresses. To look it up:

docker inspect spring-redis-network

In my case the IP address was 172.18.0.2:

"Name": "redis-1",
"EndpointID": "88b100f3569bb4ed68ac8cbf84f4b5a20493e11c5e7336a052bbbd25bb5f4205",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""

We can now update this in application.yml:

redis: 
   host: 172.18.0.2 
   port: 6379

Note: we use port 6379 because we are inside the Docker network; 4025 is only the port published to the host.

Now let’s build our image for the app one more time and connect to the network we created:

DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .

We can connect to a network by passing --net when running our Docker image:

docker run --rm --net spring-redis-network -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster

You should see the todo list items at http://localhost:4024/app/

We can verify that the cache entries were created in our Redis container:

docker exec -it redis-1 redis-cli --scan

Step 4: Modify Our App to Use Spring Cache Annotations

There are five basic annotations that you would normally use with Spring Cache:

@CachePut: Updates the cache with the method’s result.
@Cacheable: Returns a cached response for a method.
@CacheEvict: Removes a cache entry that is no longer needed; think delete for an entity.
@Caching: Java doesn’t allow you to use the same annotation type twice on a method or class unless it is declared repeatable. So, if you want to apply @CacheEvict to two different caches on the same method, @Caching can be used to aggregate the cache annotations.
@CacheConfig: Shares common cache settings, such as cache names, at the class level.
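If the annotations feel abstract, the following plain-Java sketch shows roughly what the first three do, without any Spring dependency. It is my own simplified model, not Spring’s implementation and not the controller from the repo: one map stands in for Redis, another for the real data source, and every class and method name is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheSemanticsSketch {
    private final Map<Long, String> cache = new ConcurrentHashMap<>(); // stands in for Redis
    private final Map<Long, String> store = new ConcurrentHashMap<>(); // stands in for the data source
    public final AtomicInteger storeReads = new AtomicInteger();       // counts cache misses

    // Roughly @Cacheable: return the cached value and only hit the source on a miss.
    public String find(Long id) {
        return cache.computeIfAbsent(id, key -> {
            storeReads.incrementAndGet();
            return store.get(key); // a null result is simply not cached
        });
    }

    // Roughly @CachePut: always run the method, then overwrite the cache entry with its result.
    public String save(Long id, String item) {
        store.put(id, item);
        cache.put(id, item);
        return item;
    }

    // Roughly @CacheEvict: drop the entry so the next read goes back to the source.
    public void delete(Long id) {
        store.remove(id);
        cache.remove(id);
    }

    public static void main(String[] args) {
        CacheSemanticsSketch todos = new CacheSemanticsSketch();
        todos.save(1L, "buy milk");
        todos.find(1L);
        todos.find(1L);
        System.out.println("store reads: " + todos.storeReads.get()); // prints store reads: 0
        todos.delete(1L);
        System.out.println(todos.find(1L));                           // prints null
    }
}
```

With Spring, the annotations generate this kind of bookkeeping around your methods, and the cache map is replaced by the configured Redis connection.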

I have added these to our ToDoListController and it looks like this:

Let’s run our app again and check the Redis stats:

DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .
docker run --rm --net spring-redis-network -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster
docker exec -it redis-1 redis-cli info stats

If you want to check the code up to this point, take a look at the tag.

Sharding with Redis Clusters

We’ve built a basic spring boot application with Redis cache.

But what if we’re dealing with large data that can’t be contained in one node? Redis supports clusters to shard your data across multiple nodes.

The entire keyspace in Redis Clusters is divided into 16384 slots (called hash slots) and these slots are assigned to multiple Redis nodes. A given key is mapped to one of these slots and the hash slot for a key is computed:

HASH_SLOT = CRC16(key) mod 16384

In most cases, you don’t need to know these internals, as Redis and its clients take care of reading and writing data on the right node.
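For the curious, the slot computation can be sketched in a few lines of Java. This assumes the CRC16 variant described in the Redis Cluster specification (CCITT/XMODEM, polynomial 0x1021, initial value 0) and also handles hash tags, where only the text between the first { and the following } is hashed. The class and method names are my own.

```java
import java.nio.charset.StandardCharsets;

public class HashSlotSketch {
    public static final int SLOT_COUNT = 16384;

    // Bitwise CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000.
    public static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // HASH_SLOT = CRC16(key) mod 16384, hashing only the hash tag when one is present.
    public static int hashSlot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) { // ignore an empty tag such as "{}"
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(hashSlot("foo"));       // prints 12182
        System.out.println(hashSlot("{foo}.bar")); // prints 12182, same tag so same slot
    }
}
```

You can compare the result for any key against a running node with redis-cli cluster keyslot <key>.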

Let’s stop our Redis image and build a cluster:

docker stop redis-1 
docker stop spring-boot-redis

Let’s spin up two more Redis nodes to build a cluster. Redis requires a minimum of three primary nodes for a cluster to work.

The Redis image has cluster support disabled by default, so we need to add a config file and pass it to each Redis container. I’ve added these files in the project root under /redis-conf.

Let’s start the shards:

docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4025:6379 -d --name redis-1 redis redis-server /redis_config/node1.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4026:6379 -d --name redis-2 redis redis-server /redis_config/node2.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4027:6379 -d --name redis-3 redis redis-server /redis_config/node3.conf

The Redis images have started in cluster mode but we still need to create a cluster to bind them together. We can do a primary-secondary configuration but for now, we just need data sharding and to create clusters without failover.

We will look at failover in the next section.

Run the following to create a cluster:

docker exec -it redis-1 redis-cli --cluster create 172.18.0.2:6379 172.18.0.3:6379 172.18.0.4:6379

You should see something like this in your output:
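The exact output varies with your setup, but the create command reports which slot range each node receives. Here is a minimal Java sketch of an even split across primaries. It reproduces the ranges redis-cli typically reports for three nodes (0-5460, 5461-10922, 10923-16383), but it is my own illustration rather than redis-cli’s code, and the class and method names are made up.

```java
public class SlotRangeSketch {
    public static final int SLOT_COUNT = 16384;

    /** Returns one {first, last} pair per primary, covering slots 0..16383 without gaps. */
    public static int[][] splitSlots(int primaries) {
        int[][] ranges = new int[primaries][2];
        double slotsPerNode = (double) SLOT_COUNT / primaries;
        double cursor = 0.0;
        int first = 0;
        for (int i = 0; i < primaries; i++) {
            int last = (int) Math.round(cursor + slotsPerNode - 1);
            if (i == primaries - 1 || last > SLOT_COUNT - 1) {
                last = SLOT_COUNT - 1; // the final primary takes whatever remains
            }
            ranges[i][0] = first;
            ranges[i][1] = last;
            first = last + 1;
            cursor += slotsPerNode;
        }
        return ranges;
    }

    public static void main(String[] args) {
        int[][] ranges = splitSlots(3);
        for (int i = 0; i < ranges.length; i++) {
            System.out.println("Primary[" + i + "] -> slots " + ranges[i][0] + " - " + ranges[i][1]);
        }
    }
}
```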

We can inspect our Docker network to get the new IP addresses for the Redis nodes (if yours differ from mine, look them up before running the create command above):

docker inspect spring-redis-network

Our application-cluster.yml looks like this:

Let’s stop our Docker spring boot app and relaunch it with a cluster configuration:

docker stop spring-boot-redis
DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .
docker run --rm --net spring-redis-network -e "SPRING_PROFILES_ACTIVE=cluster" -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster

You can refresh the app in your browser to verify the application is working.

We can also verify the cluster configuration in any node:

docker exec -it redis-2 redis-cli cluster nodes

Sharding allows a large dataset to be distributed across multiple nodes, and hashing lets a client work out which node holds a key without searching for it.

Our cluster is still missing two key safeguards of a distributed system: failover handling and replication.

Replication with Redis Cluster

Redis cluster allows us to achieve failover handling and replication.

We’re going to set up our nodes in a primary-secondary configuration where each of our three primary nodes has one replica.

This way, if we lose a primary, the cluster can still promote its replica to take over. In this setup, writes have to go through the primaries, as replicas are read-only.

The upside to this is that if a primary disappears, its state has already been replicated to its secondary, so once that secondary is elected as primary it can begin to accept writes immediately. Replication is asynchronous, so the last few writes before a failure may be lost.

Do we need sentinels?

Sentinels are separate Redis processes that run alongside the Redis nodes, monitor them, and promote a secondary to primary when a failover is needed.

You don’t need Sentinel when using Redis Cluster.

Redis Cluster performs automatic failover if a problem occurs in any primary instance.

First, we should convert our cluster to primary-secondary. Let’s add three more nodes that will work as replicas to our three primary nodes.

First, we stop our nodes and then start them again:

docker stop redis-1
docker stop redis-2
docker stop redis-3

# Start redis nodes
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4025:6379 -d --name redis-1 redis redis-server /redis_config/node1.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4026:6379 -d --name redis-2 redis redis-server /redis_config/node2.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 4027:6379 -d --name redis-3 redis redis-server /redis_config/node3.conf

# Start replicas
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 5025:6379 -d --name redis-1-replica redis redis-server /redis_config/node1-replica.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 5026:6379 -d --name redis-2-replica redis redis-server /redis_config/node2-replica.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf:/redis_config -p 5027:6379 -d --name redis-3-replica redis redis-server /redis_config/node3-replica.conf

Let’s inspect our Docker network as we need IP addresses to create the cluster:

docker inspect spring-redis-network
docker exec -it redis-1 redis-cli --cluster create 172.18.0.2:6379 172.18.0.3:6379 172.18.0.4:6379 172.18.0.6:6379 172.18.0.7:6379 172.18.0.8:6379 --cluster-replicas 1
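The --cluster-replicas 1 flag is what turns six nodes into three primaries with one replica each. As a rough sketch of that arithmetic in Java (my own illustration with made-up names, not redis-cli’s code):

```java
public class ClusterLayoutSketch {

    /** Number of primaries when every primary gets the requested number of replicas. */
    public static int primaryCount(int totalNodes, int replicasPerPrimary) {
        int primaries = totalNodes / (replicasPerPrimary + 1);
        if (primaries < 3) {
            // Redis Cluster refuses to form with fewer than three primaries.
            throw new IllegalArgumentException("Redis Cluster needs at least 3 primary nodes");
        }
        return primaries;
    }

    public static void main(String[] args) {
        System.out.println(primaryCount(6, 1)); // prints 3: three primaries, one replica each
        System.out.println(primaryCount(3, 0)); // prints 3: the sharding-only setup from earlier
    }
}
```

Four nodes with one replica each would leave only two primaries, which is why the replicated setup needs six containers.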

If you see the following error then stop your running nodes as they have in-memory data and they need to be empty when creating a cluster:

[ERR] Node 172.18.0.3:6379 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.

If all goes well you should see following output:

We can verify our cluster with:

docker exec -it redis-2 redis-cli cluster nodes

Our final application-cluster.yml looks like this:

Let’s run our app again and check the Redis stats:

DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .
docker run --rm --net spring-redis-network -e "SPRING_PROFILES_ACTIVE=cluster" -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster
docker exec -it redis-1 redis-cli info stats

We should now test our failover. Let’s stop one of our servers with docker stop redis-2.

If you run docker exec -it redis-1 redis-cli cluster nodes you’ll see that a secondary from earlier has now been promoted to primary.

In case of a primary failure, Redis automatically promotes a secondary replica to primary.

Redis Primary-Secondary (Replica) Network with Sentinel without Sharding

In scenarios where you don’t need sharding and one node is enough for your in-memory needs, you can avoid creating clusters and build primary-secondary replicas by specifying the replicaof directive in each replica’s Redis configuration.

If we’re not using Redis clusters then we need sentinels to achieve failovers.

Sentinels are a different concept from Redis Cluster. If the primary dies, the sentinels talk to each other to decide on a new primary.

Since the sentinel configuration is very different from the cluster one, I have put this config in /redis-conf-sentinel.

Let’s add our sentinel servers so we have automatic failover.

Sentinels only need to be pointed at the primary node to decide on failover. The last argument (2) is the quorum, the number of sentinels that must agree that the primary is down:

sentinel monitor redis-cluster 172.18.0.2 6379 2

We can use the config below to decide how long a node must be unreachable before a sentinel considers it down:

sentinel down-after-milliseconds redis-cluster 5000

We can add the line below to set the failover timeout. Among other things, a sentinel waits twice this value before retrying a failover of the same primary:

sentinel failover-timeout redis-cluster 10000
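To see how these settings interact, here is a small Java sketch of the decisions a sentinel makes. It is a simplified model based on my reading of the Sentinel documentation, not Redis code: a node is subjectively down (+sdown) for one sentinel after down-after-milliseconds without a valid reply, objectively down (+odown) once the quorum agrees, and a failover is only authorized by a majority of all sentinels. All names here are hypothetical.

```java
public class SentinelQuorumSketch {

    // +sdown: this sentinel has had no valid reply for longer than down-after-milliseconds.
    public static boolean subjectivelyDown(long msSinceLastValidReply, long downAfterMs) {
        return msSinceLastValidReply > downAfterMs;
    }

    // +odown: at least `quorum` sentinels report the primary as subjectively down.
    public static boolean objectivelyDown(int sentinelsReportingDown, int quorum) {
        return sentinelsReportingDown >= quorum;
    }

    // The failover leader needs votes from a majority of sentinels, and at least the quorum.
    public static boolean failoverAuthorized(int votes, int totalSentinels, int quorum) {
        int majority = totalSentinels / 2 + 1;
        return votes >= Math.max(majority, quorum);
    }

    public static void main(String[] args) {
        // Values from this article: 5000 ms down-after, quorum 2, three sentinels.
        System.out.println(subjectivelyDown(6000, 5000)); // prints true
        System.out.println(objectivelyDown(3, 2));        // prints true
        System.out.println(failoverAuthorized(2, 3, 2));  // prints true
        System.out.println(failoverAuthorized(1, 3, 2));  // prints false
    }
}
```

This is also why three sentinels is a sensible minimum: with only two, losing one sentinel leaves no majority to authorize a failover.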

Let’s stop all our Docker containers:

docker stop $(docker ps -a -q)

# Start primary and replicas
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 4025:6379 -d --name redis-1 redis redis-server /redis_config/node1.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 5025:6379 -d --name redis-1-replica redis redis-server /redis_config/node1-replica-1.conf
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 5026:6379 -d --name redis-2-replica redis redis-server /redis_config/node1-replica-2.conf

# Start sentinels
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 6025:6379 -d --name sentinel-1 redis redis-server /redis_config/sentinel1.conf --sentinel
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 6026:6379 -d --name sentinel-2 redis redis-server /redis_config/sentinel2.conf --sentinel
docker run --rm --net spring-redis-network -v /mnt/c/Development/github/spring-boot-redis-cluster/redis-conf-sentinel:/redis_config -p 6027:6379 -d --name sentinel-3 redis redis-server /redis_config/sentinel3.conf --sentinel

# Check the sentinel logs
docker logs sentinel-2

We need to update our spring boot app to also use the sentinel configuration.

Let’s add a separate application-sentinel.yml and start the application:

docker stop spring-boot-redis
DOCKER_BUILDKIT=1 docker build -t learnings/spring-boot-redis-cluster .
docker run --rm --net spring-redis-network -e "SPRING_PROFILES_ACTIVE=sentinel" -p 4024:4024 --name spring-boot-redis learnings/spring-boot-redis-cluster

We can stop our primary instance to verify that the sentinels are working:

docker stop redis-1

You should see logs like this in your sentinel output (docker logs sentinel-1):

1:X 18 Jun 2020 21:25:06.046 # +sdown primary redis-cluster 172.18.0.2 6379
1:X 18 Jun 2020 21:25:07.679 # +new-epoch 1
1:X 18 Jun 2020 21:25:07.891 # +vote-for-leader bfebc5c7d07121c78633024dcbc89a14bf1e4563 1
1:X 18 Jun 2020 21:25:07.948 # +odown primary redis-cluster 172.18.0.2 6379 #quorum 3/2
1:X 18 Jun 2020 21:25:07.948 # Next failover delay: I will not start a failover before Thu Jun 18 21:25:28 2020
1:X 18 Jun 2020 21:25:08.580 # +config-update-from sentinel bfebc5c7d07121c78633024dcbc89a14bf1e4563 172.18.0.7 6379 @ redis-cluster 172.18.0.2 6379
1:X 18 Jun 2020 21:25:08.580 # +switch-primary redis-cluster 172.18.0.2 6379 172.18.0.4 6379
1:X 18 Jun 2020 21:25:08.581 *

That’s it! We’ve implemented a Spring Boot Redis app and learned how to create different Redis networks.

If you have any questions or feedback please don’t hesitate to leave your thoughts in the comments section.

For issues related to code, please feel free to create an issue directly in the GitHub repository.