We use an "Expand and Contract" model to decouple application deployment from schema changes. First, we apply the schema change in a non-locking, additive way (for example, adding a nullable column). Next, we perform a rolling update of our Kubernetes Pods, deploying new application code written explicitly to handle both the old and the new schema. Finally, once all Pods are stable, we run a contracting migration that removes the columns we no longer need. Schema changes stay safe under load because both outgoing and incoming Pod replicas maintain strict backward compatibility throughout the entire rolling update.
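A minimal sketch of the three phases as plain SQL strings; the table and column names (`orders`, `legacy_status`, `status`) are illustrative, not from the original setup:

```python
# Sketch of the "Expand and Contract" phases. Phase 2 (the rolling update of
# application Pods) happens between these two migrations, not in SQL.

EXPAND = """
-- Phase 1: additive, non-locking change (nullable column, no table rewrite)
ALTER TABLE orders ADD COLUMN status TEXT;
"""

# Phase 2: rolling update -- new code reads/writes both legacy_status and
# status, so old and new replicas stay compatible during the rollout.

CONTRACT = """
-- Phase 3: drop the old column only after every Pod runs the new code
ALTER TABLE orders DROP COLUMN legacy_status;
"""

def migration_plan():
    """Return the SQL phases in the order they must be applied."""
    return [EXPAND.strip(), CONTRACT.strip()]
```

Keeping the two migrations as separate, ordered steps makes it impossible to drop the old column in the same deploy that introduces the new one.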
President & CEO at Performance One Data Solutions (Division of Ross Group Inc)
Answered a month ago
Here's what works for me with schema migrations. I run the new version alongside the old one, writing to both so nothing gets lost during the switch. It handles the production load without breaking a sweat. Since I added scheduled backfills and consistency checks, data problems have basically disappeared. Just monitor the replication lag and run a reconciliation job before you cut over. That's the safest way to do it.
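The dual-write plus reconciliation idea above can be sketched in memory; `old_store` and `new_store` are stand-ins for the old and new schema versions, and the helper names are my own:

```python
# In-memory sketch of dual-writes plus a reconciliation pass before cutover.
old_store: dict = {}
new_store: dict = {}

def dual_write(key, record) -> None:
    """Write to both schema versions so nothing is lost during the switch."""
    old_store[key] = record
    new_store[key] = record

def reconcile() -> list:
    """Return keys present in the old store but missing or stale in the new
    one -- run this (and fix the drift) before cutting over."""
    return [k for k, v in old_store.items() if new_store.get(k) != v]
```

A write that only reached the old side shows up immediately in `reconcile()`, which is the consistency check the answer relies on.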
When I had to handle PostgreSQL schema changes on Kubernetes, I ran old and new fields side by side. This dual-version API approach gave us some breathing room. Each service could upgrade independently, which avoided any big-bang deploy issues with our SaaS apps. The key is watching for sync lags between fields. Catching those early saves a ton of headaches later.
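One way the side-by-side fields might look at the API layer; the field names (`name` as the legacy field, `full_name` as the new one) are assumptions for illustration:

```python
def to_api(row: dict) -> dict:
    """Serve both the legacy and the new field during the transition,
    preferring the new column once it has been backfilled."""
    value = row.get("full_name") or row.get("name")  # new column wins
    return {"name": value, "full_name": value}       # old + new keys

def has_sync_lag(row: dict) -> bool:
    """Flag rows where the two fields have drifted apart -- the 'sync lag'
    worth catching early."""
    return (row.get("full_name") is not None
            and row.get("name") is not None
            and row["full_name"] != row["name"])
```

Because every response carries both keys, each consuming service can switch to the new field on its own schedule.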
The smoothest way I've handled database migrations is using a feature toggle to switch between old and new schemas while a sync job runs in the background. This lets you roll back fast if something breaks, which keeps everything stable when traffic is high. Since we set up a solid rollback plan, we stopped getting those 2 AM alert calls during peak times. My advice is to intentionally break the failover in staging to catch race conditions before they hit your users.
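A minimal sketch of the toggle-driven read path, assuming the background sync job keeps the two stores converging; in a real deployment the flag would come from config or an environment variable:

```python
def read_user(user_id, old_db, new_db, use_new_schema=False):
    """Feature-toggled read path: flipping use_new_schema back to False is
    the fast rollback -- no redeploy, no data movement."""
    source = new_db if use_new_schema else old_db
    return source[user_id]
```

Breaking the failover in staging then amounts to flipping the flag while writes are in flight and watching whether both paths still return consistent rows.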
I use blue-green deployments for the application layer with lazy backfilling in PostgreSQL. This lets old and new code run at the same time. The dual-write logic keeps everything in sync even when a high volume of users are creating records. Since I started this, data mismatches during migrations hardly ever happen anymore. It's a good idea to log any duplicate or failed syncs to a dashboard so you catch subtle bugs before they affect production.
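The lazy-backfill read path can be sketched in memory; `old_table` and `new_table` stand in for the two schema versions, and recording backfilled keys stands in for the dashboard logging mentioned above:

```python
def read_order(key, old_table, new_table, backfilled):
    """Blue-green read path with lazy backfill: prefer the new table; on a
    miss, copy the row over from the old schema and record the backfill so
    it can be surfaced on a dashboard."""
    if key not in new_table:
        new_table[key] = old_table[key]
        backfilled.append(key)
    return new_table[key]
```

Hot rows migrate themselves on first read, so the bulk backfill only has to sweep up whatever traffic never touched.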