Understand schema migration: It’s like Git, but for databases. | by Princewill Onyenanu | April 2022
I know you want to make sure the requirements are clear enough before you start development. Okay your assumptions periodically, and at the end of the day, you finally arrive at your “final” design – or maybe not.
This is real life, the finale is only while it lasts. Things change arbitrarily, feelings change and people even fall in love, how much more do database patterns change?
Well it’s not like you have a problem with changing schemas, after all all you have to do is drop the database associated with the old schema, change your schema and recreate your database. Pretty simple isn’t it? except you were in development, the dummy data you dropped into your local database had no commercial significance. Now that you’re in production, deleting the data here can literally ruin your life (and i mean it)depending on the type of customer you serve and the importance of the data you process.
Schema migrations to your rescue!
What is Schema Migration as I’m Pro?
According to Wikipedia, these are the management of incremental and reversible changes and version control of relational database schemas.
It should be pointed out that the term Schema Migration is often used interchangeably with Database migration. Although many people agree that the two terms should indeed refer to the same thing, there are also conflicting views. It should also be noted that the context also matters in the discussion, because database migration, in business discussions, probably refers to the process of selecting, preparing, extracting and transforming data and moving it permanently from one computer storage system to another.
What is Schema Migration as I’m Noob?
Think of it like Git, but for database schemas. For Git, I track changes to my source code, but for schema migrations, I track changes to my database schemas.
Let’s say I started designing a database table to hold the articles I post on my blog, where I document my travel experiences. When I started designing, I wanted my blog to be pretty simple and straightforward, nothing fancy. So I end up with an example table schema (written in Django) as below.
class Articles(models.Model): id = models.AutoField(“id”, primary_key=True) title = models.TextField(default=””) content = models.TextField(default=””) created = models.DateTimeField(auto_now_add=True) updated = models.DateTimeField(auto_now=True)
This is a simple table to hold my articles, the fields are identifier(primary key), Title of my article, the real content, created which is the date it was created and update, who is the date of its last update.
Now suppose I go ahead and settle for this design and finally launch my blog on the internet and start posting some real content and luckily I start getting a good number of readers . All seems well, until I noticed that my readers were reaching out and strongly asking for more visual content, i.e. they wanted me to at least include images in my articles to give them a better idea of my travel experiences. Now, if I was still in development, not using schema migrations, and somehow became aware of what this reader needs, all I would have to do is redesign my schema as follows:
class Articles(models.Model): id = models.AutoField(“id”, primary_key=True) title = models.TextField(default=””) content = models.TextField(default=””) image_url = models.TextField(default=””) created = models.DateTimeField(auto_now_add=True) updated = models.DateTimeField(auto_now=True)
add a new field to hold my image URLs, then delete my existing database and recreate a new one, using the new schema. As I mentioned earlier in the opening story, this is perfectly fine as I’m still in development and the test/dummy data stored on my database has little to no commercial value. or personal. But as soon as I start producing and start publishing real articles, attracting real readers, and getting those articles shared by readers who like them, I totally lose that luxury.
This is where migration comes in, it keeps track of all the changes you make to your schema in what is called a migration file and in turn uses that file as the single source of truth regarding the state of your database at any time. , when it needs to make changes.
With schema migration, the previous scenario would go as follows:
- I’m starting my travel blog project with my initial outline design
- Run my initial schema migration to track the initial state of the schema and automatically create a corresponding table in my database.
- After receiving recommendations from readers to make my articles more visually appealing by adding images, I’m modifying my schema to include a new image URL field to hold the images.
- I run another migration to capture the new schema state and in turn reflect those changes in my database.
The above process is basically how the migration workflow happens. You make incremental changes to your schema, you notify the migration engine by running a migration, it generates a migration file, which is basically a note of the current state of your database at runtime, as well as new changes you add and finally, use this note to make the necessary changes to your database. This is somewhat similar to the change staging and commit git flow.
A few points to note
- Migration engines generally work best with ORM systems that abstract database operations at a high level.
- Different languages and frameworks provide their own migration engines.
- Migration engines automatically modify your database.
- Migration engines aren’t as smart as humans, sometimes they make mistakes and sometimes you’ll need to double-check the generated migration files to make sure the actions they need to perform are what you really need, and when if not, you are expected to modify them to reflect your needs.
Do you remember more important things about schema migrations and migration engines? Feel free to add yours in the comment section.
Closing
This was a brief, non-technical introduction to the concept of schema migration. It aimed to be as code/language agnostic as possible, aside from the table schema demonstrated in Django. Hope this helps you understand what Damn migrations and why you might (definitely) need them.
Thanks for reading 🙂