Concurrency in Node.js

A brief look at ways to do multiple tasks simultaneously without depending on the previous results

Sina Ahamadpour
Better Programming


Photo by GuerrillaBuzz on Unsplash

As someone who worked with PHP for more than ten years, I found it hard at first to understand concurrency in other ecosystems like Go and Node.js.

Still, I’m curious about what’s happening under the hood. All this jibber-jabber about concurrency, time-sharing, async programming, threads, green threads, goroutines … makes me wonder if I know programming or not!

Disclaimers

  • All I’m trying to do is share what I know about writing a performant script in Node.js that handles concurrent tasks. Maybe I left some parts out.
  • There are solutions like spatie/async for writing async code in PHP. So it is possible (but still, I’m not a fan of that!)
  • Your comments are welcome. Please share your thoughts.

What is the Event Loop?

Recently, I saw a really awesome video about how JavaScript’s different runtime environments handle concurrency. In this talk at JSConf EU ("What the heck is the event loop anyway?"), Philip Roberts explains what an event loop is, how JavaScript handles concurrency, and what happens inside the call stack when a piece of code is async. I recommend you check out this video first.
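To get a feel for the ordering the talk describes, here is a minimal sketch (the `order` array is just an illustrative name, not something from the talk). Synchronous code runs first, then promise callbacks (microtasks), and only then timer callbacks, even when the timer delay is 0:

```javascript
// Records the order in which each piece of code actually runs.
const order = [];

order.push('sync 1');

// Timer callback: queued as a task, so it runs only after the current
// script and all pending microtasks have finished.
setTimeout(() => order.push('timeout'), 0);

// Promise reaction: queued as a microtask, so it runs right after the
// current script finishes and before any timer callback.
Promise.resolve().then(() => order.push('microtask'));

order.push('sync 2');

// A later timer, so the log is printed after everything above has run.
setTimeout(() => console.log(order.join(' -> ')), 10);
// Prints: sync 1 -> sync 2 -> microtask -> timeout
```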

Sync vs. Async

It’s important for us to know the difference between these two concepts.

Sync code runs line by line, and each line must finish before the next one starts. (Maybe it’s not the textbook definition, but that’s how I digested it while learning, and how I explain it to others while they are learning.)

// Example of Sync code execution

console.log(1);
console.log(2);
console.log(3);

// Output:
// 1
// 2
// 3

But async code behaves differently:

// Example of Async code execution

console.log(1);
setTimeout(() => console.log(2), 1000);
console.log(3);

// Output:
// 1
// 3
// ... after a while ...
// 2

I just skipped a lot of concepts here, such as callbacks, promises, and async/await. There is plenty of material on the web if you want a better understanding of them.
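As a quick orientation, here is the same short delay written in all three styles. This is only a sketch: the function names (`delayWithCallback`, `delayWithPromise`, `delayWithAwait`) are made up for this example.

```javascript
// 1. Callback style: pass a function to be called when the work is done.
const delayWithCallback = (ms, callback) => {
  setTimeout(() => callback(null, `waited ${ms}ms (callback)`), ms);
};

// 2. Promise style: return an object that represents the future result.
const delayWithPromise = (ms) =>
  new Promise((resolve) => {
    setTimeout(() => resolve(`waited ${ms}ms (promise)`), ms);
  });

// 3. async/await: syntax on top of promises that reads like sync code.
const delayWithAwait = async (ms) => {
  const message = await delayWithPromise(ms);
  return message.replace('promise', 'async/await');
};

delayWithCallback(20, (error, message) => console.log(message));
delayWithPromise(20).then((message) => console.log(message));
delayWithAwait(20).then((message) => console.log(message));
```

All three do the same thing; async/await is usually the easiest to read once several steps depend on each other.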

Benefits of Async Programming

In the real world, things are not in sync. While you are reading this article, there are thousands of things happening around you: your computer processing, the light bulb lighting your room, the air conditioner running, people outside walking down the street, and thousands of other events happening each moment!

I really like Rob Pike’s talk Concurrency is not Parallelism. I cannot find a better explanation for this topic.

When tasks spend most of their time waiting (on the network, the disk, or a timer), performing them in an async manner lets those waits overlap, so we can expect faster results and happier stakeholders.

Demo Time

I’ll try to show you the benefits of concurrency in Node.js. The scenario is really simple: we will get the data of 500 users from a slow function. (For the sake of demonstration, it just simulates a network’s waiting time, like an HTTP GET request.)

// Simulate HTTP request (Including network delay)
const getUser = async (id) => {
  await new Promise(resolve => setTimeout(resolve, 1000));
  return `User ${id}`;
};

It will wait for one second and then return a simple string.

If you want to get 500 users one by one (in series), it will take us eight minutes and 20 seconds (500 requests × 1 second = 500 seconds ≈ 8.33 minutes).

// Execution
(async () => {
  const usersCount = 500;
  const benchmarkIdentifier = `sync`;

  console.time(benchmarkIdentifier);
  for (let i = 1; i <= usersCount; i++) {
    // Each request starts only after the previous one has finished
    const user = await getUser(i);
    console.log(user);
  }
  console.timeEnd(benchmarkIdentifier);
})();

// sync: 8:20.612 (m:ss.mmm)

Now, we will change this code a little bit to accept the number of concurrent requests from the command line. (If we pass 1, there is still only one request in flight at a time.)

// Execution
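(async () => {
  const usersCount = 500;
  // Number of concurrent requests from the command line (at least 1)
  const chunkSize = Math.max(1, parseInt(process.argv[2], 10) || 1);
  const benchmarkIdentifier = `async(${chunkSize})`;

  console.time(benchmarkIdentifier);
  for (let i = 1; i <= usersCount; i += chunkSize) {
    // Start up to chunkSize requests without awaiting each one
    const promises = [];
    for (let j = i; j < i + chunkSize && j <= usersCount; j++) {
      promises.push(getUser(j));
    }
    // Wait for the whole chunk before starting the next one
    const users = await Promise.all(promises);
    console.log(users);
  }
  console.timeEnd(benchmarkIdentifier);
})();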

// async(1): 8:20.711 (m:ss.mmm)

As you see, we used a simple technique to segment the 500 users and process them chunk by chunk. We can use Promise.all() to wait until all promises in a chunk are fulfilled. Note that it rejects as soon as any one of them rejects, without waiting for the rest.
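One detail worth knowing here: Promise.all() rejects as soon as any promise in the chunk rejects, while Promise.allSettled() always waits for every promise and reports each outcome. The sketch below compares the two. `getUserOrFail` is a hypothetical variant of getUser that fails for id 2, and the delay is shortened so the example runs quickly.

```javascript
// Hypothetical variant of getUser that fails for one specific id.
const getUserOrFail = async (id) => {
  await new Promise((resolve) => setTimeout(resolve, 10));
  if (id === 2) throw new Error(`User ${id} not found`);
  return `User ${id}`;
};

const compare = async () => {
  const ids = [1, 2, 3];

  // Promise.all: rejects with the first error; the other results are lost.
  let allOutcome;
  try {
    allOutcome = await Promise.all(ids.map((id) => getUserOrFail(id)));
  } catch (error) {
    allOutcome = `rejected: ${error.message}`;
  }

  // Promise.allSettled: always fulfills, with one entry per promise.
  const settled = await Promise.allSettled(ids.map((id) => getUserOrFail(id)));
  const settledOutcome = settled.map((result) =>
    result.status === 'fulfilled'
      ? result.value
      : `failed: ${result.reason.message}`
  );

  return { allOutcome, settledOutcome };
};

compare().then((outcome) => console.log(outcome));
```

If one failed request should not throw away the rest of the chunk, Promise.allSettled() is probably the better fit. If any failure should abort the run, Promise.all() inside a try/catch is enough.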

Now, the real magic shows when we pass 2 as the number of concurrent requests:

// async(2): 4:10.342 (m:ss.mmm)

We just executed the code in half the time (with x2 speed!)
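The arithmetic behind this is simple: each chunk takes about one second, and chunks run one after another, so the total is roughly ceil(500 / chunkSize) seconds. Here is a small sketch of that estimate. The helper names are made up for illustration, and the numbers are theoretical lower bounds, not measurements.

```javascript
// Rough estimate: each chunk takes about one delay, and chunks run in series.
const estimateSeconds = (usersCount, chunkSize, delaySeconds = 1) =>
  Math.ceil(usersCount / chunkSize) * delaySeconds;

// Format whole seconds as m:ss, similar to the console.time output.
const formatDuration = (totalSeconds) => {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = String(totalSeconds % 60).padStart(2, '0');
  return `${minutes}:${seconds}`;
};

for (const chunkSize of [1, 2, 10, 100, 500]) {
  const estimate = formatDuration(estimateSeconds(500, chunkSize));
  console.log(`async(${chunkSize}) ~ ${estimate}`);
}
// async(1) ~ 8:20
// async(2) ~ 4:10
// async(10) ~ 0:50
// async(100) ~ 0:05
// async(500) ~ 0:01
```

Real runs take slightly longer because of timer and scheduling overhead, as the measured 8:20.711 and 4:10.342 show.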

Here are the benchmarks for the different inputs:

Hey, Sina? That’s it. Now, we can set the total number of concurrent requests to the max number in our scripts. Right?

My answer: No, I’ll explain why!

When and How To Use Concurrency?

It’s a really big question, and to be honest, there is no exact answer. Everything depends on the code you are writing, the server you are running your code on, the network you are sending the requests through, and the resilience you are aiming for. You have to consider the traffic from and to your service.

We can use concurrency when the order of the tasks does not matter and you can do multiple tasks at the same time without depending on the previous results. In accumulative jobs, you cannot do this because each step needs the result of the previous one.
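To make the distinction concrete, here is a minimal sketch. `applyTransaction` is a hypothetical slow step invented for this example. The running balance is accumulative, so it has to run in series, while the second job has no dependencies between steps and can run concurrently.

```javascript
// Hypothetical slow step: adds `amount` to `balance` after a short delay.
const applyTransaction = async (balance, amount) => {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return balance + amount;
};

// Accumulative job: each step needs the previous result, so it runs in series.
const runningBalance = async (start, amounts) => {
  let balance = start;
  for (const amount of amounts) {
    balance = await applyTransaction(balance, amount);
  }
  return balance;
};

// Independent job: no step needs another step's result, so they can overlap.
const doubleAll = async (amounts) =>
  Promise.all(amounts.map((amount) => applyTransaction(amount, amount)));

(async () => {
  console.log(await runningBalance(100, [10, -30, 5])); // 85
  console.log(await doubleAll([10, -30, 5])); // [ 20, -60, 10 ]
})();
```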

Also, you need to set this number to a fair amount. Consider the traffic and the barriers before setting it to a high number. For example, you cannot bombard another server with too many requests because, at some point, a reverse proxy, the CDN server, the server itself, or the application framework layer will identify your requests as a possible DDoS attack, even though your intent is 100% pure good!
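One way to keep that number under control is a small worker pool. Unlike the chunk-by-chunk loop in the demo, it starts the next request as soon as any slot frees up, instead of waiting for the whole chunk. This is a sketch of a different technique, not the code used for the benchmarks. `mapWithLimit` and `trackedGetUser` are names made up for this example, and the delay is shortened so it runs quickly.

```javascript
// Runs `task` for every item, with at most `limit` tasks in flight at once.
const mapWithLimit = async (items, limit, task) => {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each worker keeps pulling the next unprocessed index until none are left.
  const worker = async () => {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await task(items[index]);
    }
  };

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
};

// Usage: a simulated request that also tracks how many run at the same time.
let inFlight = 0;
let maxInFlight = 0;
const trackedGetUser = async (id) => {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 10));
  inFlight--;
  return `User ${id}`;
};

const ids = Array.from({ length: 20 }, (_, i) => i + 1);
mapWithLimit(ids, 5, trackedGetUser).then((users) => {
  console.log(users.length, maxInFlight); // 20 5
});
```

The results keep the same order as the input, and the limit (5 here) is the knob to tune against the traffic and barriers described above.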

Have a nice time!
