Member-only story
An Introduction to Database Sharding
Scaling your database as your app grows

With ever-increasing data volumes that an application has to deal with, it is not possible to keep all the data at a single node, as a single server might not be able to handle such a large throughput. The partitioning of the database into multiple nodes is one technique that is effective in this scenario.
Today, we are going to talk about database sharding (partitioning). Sharding has received a lot of attention in recent years. But, many are unfamiliar with this concept.
In this blog, we will be talking about basic concepts, why is it required, and ways to implement it.
What Is Sharding?
Sharding is partitioning the data from a single data source to multiple partitions in which the structure of each partition is identical to others. Individual partition is also referred to as a shard. Shards can be placed in the same server or different servers.
To understand this better, look at the image below. We have a users table that contains users’ info like name, age, and country. We sharded that into two tables, users_001 and users_002. (On what basis we can a shard a table will be discussed in subsequent paragraphs.)
One important thing to note is that the shards don’t share any data. So, replication needs to work along sharding to prevent data loss in case a shard goes down.

Why Do We Need Sharding?
Let us understand this with an example. Suppose, in image 1, instead of three lanes there was only one lane. What would have happened? Traffic would have been moving slowly, there might be congestion.
Similarly, in the database as well, if we read everything from one table, response time might increase with an increase in load and the database can also go down.
Sharding splits the traffic from a single table to multiple tables just like lanes in the image 1.