Design Considerations for Scaling WebSocket Server Horizontally With a Publish-Subscribe Pattern
Understanding the challenges in scaling WebSocket servers

In my previous article, I wrote about designing and building a WebSocket server in a microservice architecture. Although the implementation works fine for a single instance of a WebSocket server, we will start facing issues when we try to scale up the number of WebSocket server instances (aka horizontal scaling). This article looks into the design considerations for scaling the WebSocket server using a publish-subscribe messaging pattern.
My WebSocket Server Series
- 01: Building WebSocket server in a microservice architecture
- 02: Design considerations for scaling a WebSocket server horizontally with the publish-subscribe pattern
- 03: Implement a scalable WebSocket server with Spring Boot, Redis Pub/Sub, and Redis Streams
- 04: TBA
What is Horizontal Scaling?
First, let’s understand why we need horizontal scaling. As our user base grows, so does the load on the server, and beyond a certain point a single server can no longer serve all users with acceptable performance. Hence, our design must allow us to increase or decrease the number of servers on demand, both to handle peak load and to save resources when demand is low.
Horizontal scaling refers to adding more machines to your infrastructure to cope with the high demand on the server. In our microservice context, scaling horizontally is the same as deploying more instances of the microservice. A load balancer will then be required to distribute the traffic among the multiple microservice instances, as shown below:
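To make the load balancer's role concrete, here is a minimal sketch of round-robin distribution, the simplest strategy a load balancer can use to spread incoming WebSocket connections across instances. The instance addresses and class name are hypothetical, for illustration only:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a round-robin balancer that picks the next
// WebSocket server instance for each incoming connection.
public class RoundRobinBalancer {
    private final List<String> instances;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> instances) {
        this.instances = instances;
    }

    // Returns the next instance in rotation; AtomicInteger keeps
    // the rotation thread-safe under concurrent connection requests.
    public String nextInstance() {
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Example instance addresses (assumptions, not real endpoints).
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("ws-server-1:8080", "ws-server-2:8080", "ws-server-3:8080"));
        for (int i = 0; i < 6; i++) {
            System.out.println("connection " + i + " -> " + lb.nextInstance());
        }
    }
}
```

In practice you would rely on an off-the-shelf load balancer (e.g. NGINX or a cloud provider's) rather than writing your own, but the rotation logic above is essentially what "distributing traffic among instances" means. Note that for WebSockets, each connection is long-lived, so the balancing decision is made once per connection, not per message.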
With this, I hope you better understand why we need horizontal scaling in our infrastructure. Now let’s move on to the design considerations for scaling WebSocket servers in a microservice architecture.