4 Practices to Plan a Large-scale Data Migration
What I learned from migrating 25 billion records

Data migration is simply the process of moving data from one location to another.
As businesses grow, data migration is almost unavoidable. You may need larger storage to support your data growth, or you may need to change the data format to meet evolving requirements.
I recently completed a data migration with Elasticsearch: 25 billion documents in total, and the entire process took almost 14 days!
As simple as it sounds, data migration is no easy feat. There are many pitfalls on the way to a smooth migration without system downtime.
In this article, I will share the lessons I learned from completing a large-scale data migration. These are practices you can follow regardless of your system architecture. Let’s get started! 🏃
The Read-Write Flag
There are two problems you need to consider before migrating data in a live, continuously running system.
One, how do you ensure the data in the new location stays up to date? At this point, your APIs are still writing to the old storage, so any update to data that has already been migrated will never reach the new storage.

Two, how do you smoothly migrate your APIs to use the new storage? And if anything goes wrong, you need to be able to roll your APIs back to the old storage. A smooth transition between the old and new storage at any time is the key to a migration without downtime.

To solve both problems, we introduce a read-write flag. A flag is a switch that turns certain functionality on or off at runtime, without any code change.
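To make this concrete, here is a minimal Python sketch of how a read-write flag can route traffic between two stores. Everything in it, the `ReadFlag`/`WriteFlag` names, the in-memory stores, and the flag dictionary, is a hypothetical illustration rather than the exact implementation used in this migration; in a real system the flag values would live in a config service or database so they can be flipped at runtime.

```python
from enum import Enum


class ReadFlag(Enum):
    OLD = "old"    # serve reads from the old storage
    NEW = "new"    # serve reads from the new storage


class WriteFlag(Enum):
    OLD = "old"    # write only to the old storage
    DUAL = "dual"  # write to both stores while the migration runs
    NEW = "new"    # write only to the new storage


# Hypothetical runtime flag store. In production this would be a config
# service or database row, so flipping it requires no deploy.
flags = {"read": ReadFlag.OLD, "write": WriteFlag.DUAL}


class InMemoryStore:
    """Stand-in for a real storage client (e.g. an Elasticsearch index)."""

    def __init__(self):
        self.docs = {}

    def index(self, doc_id, doc):
        self.docs[doc_id] = doc

    def get(self, doc_id):
        return self.docs.get(doc_id)


old_store, new_store = InMemoryStore(), InMemoryStore()


def write_document(doc_id, doc):
    """Route a write according to the current write flag."""
    if flags["write"] in (WriteFlag.OLD, WriteFlag.DUAL):
        old_store.index(doc_id, doc)
    if flags["write"] in (WriteFlag.NEW, WriteFlag.DUAL):
        new_store.index(doc_id, doc)


def read_document(doc_id):
    """Route a read according to the current read flag."""
    store = new_store if flags["read"] is ReadFlag.NEW else old_store
    return store.get(doc_id)


# During migration: dual writes keep both stores updated while reads
# still come from the old storage.
write_document("doc-1", {"title": "hello"})
assert read_document("doc-1") == {"title": "hello"}

# Cutover: flip the read flag. Rolling back is just flipping it again.
flags["read"] = ReadFlag.NEW
assert read_document("doc-1") == {"title": "hello"}
```

Note how the two problems map onto the two flags: dual writes keep the new storage up to date while the backfill runs, and flipping the read flag gives you an instant, reversible cutover.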