Member-only story
When Scaling Is Not An Option: A Simple Asynchronous Pattern

We live in a world of APIs, and while they are practical you may face a situation where the number of requests increases rapidly and your underlying dependencies can’t keep up.
When this happens you have to define how you want to handle it. The first reaction would be to scale your infrastructure — horizontally or vertically — to increase the capacity you can offer to your clients. But it may not be possible or desirable.
If you find yourself in such a situation one simple pattern that can be applied is to switch your API to instead of carrying on the request execution just acknowledge receiving the request and providing ways to inform back once you actually process the request.
In this article, I will share some use cases where this is a valid pattern and the trade-offs involved.
The Problem Revisited
Let’s imagine that we offer a service — public or not — via an API like the one illustrated in Figure 1.

In our example, we can highlight 3 direct dependencies:
- The compute unit responsible for receiving the…