To achieve a zero-downtime PostgreSQL schema migration on Kubernetes, combine logical replication with a phased migration strategy. Start by creating a publication on the source database and a subscription on a staging database that carries the updated schema; the subscription streams live changes from source to staging (PostgreSQL creates the underlying replication slot for you). Introduce the new schema version alongside the existing one, so you can make changes carefully without dropping the original structures.
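A minimal sketch of that setup, assuming hypothetical table names (`orders`, `customers`) and an in-cluster service name for the source database:

```sql
-- On the source database: publish changes from the tables being migrated.
CREATE PUBLICATION schema_migration_pub FOR TABLE orders, customers;

-- On the staging database, which already has the updated schema:
-- creating the subscription copies existing rows first, then streams
-- live changes; the replication slot on the source is created automatically.
CREATE SUBSCRIPTION schema_migration_sub
    CONNECTION 'host=source-db.default.svc dbname=app user=replicator'
    PUBLICATION schema_migration_pub;
```

Two prerequisites worth checking before this works: the source must run with `wal_level = logical`, and the subscriber-side tables must already exist with a compatible column layout, since logical replication does not replicate DDL.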
Use an expand-contract pattern so the application can work with both the old and the new schema at the same time. To rename a column, add the new column and leave the old one in place. Deploy code that writes to both columns but reads only from the new one, and use triggers or logical replication to keep the values in sync. Run this dual-write setup for at least 48 hours while verifying that the two columns agree. Once everything matches, drop the old column and deploy code that uses only the new schema. No maintenance window is required, because the application and the database stay compatible at every step, and each step can be rolled back. Kubernetes rolling updates reduce risk further by exposing only a small share of traffic to each change at first, and background consistency checks surface mismatches early. The approach adds complexity, but it avoids downtime entirely.
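The column-rename walkthrough above can be sketched in SQL. The table and column names (`customers`, renaming `customer_name` to `full_name`) are illustrative, and the backfill is shown as a single pass where production would batch it:

```sql
-- Expand: add the new column alongside the old one.
ALTER TABLE customers ADD COLUMN full_name text;

-- Backfill existing rows (batch this in production to limit lock time).
UPDATE customers SET full_name = customer_name WHERE full_name IS NULL;

-- Keep the columns in sync for any writer that still touches only one.
CREATE OR REPLACE FUNCTION sync_customer_name() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- On insert there is no OLD row; fill whichever side is missing.
        NEW.full_name     := COALESCE(NEW.full_name, NEW.customer_name);
        NEW.customer_name := COALESCE(NEW.customer_name, NEW.full_name);
    ELSIF NEW.full_name IS DISTINCT FROM OLD.full_name THEN
        NEW.customer_name := NEW.full_name;   -- new column was changed
    ELSE
        NEW.full_name := NEW.customer_name;   -- old column was changed
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER customers_name_sync
    BEFORE INSERT OR UPDATE ON customers
    FOR EACH ROW EXECUTE FUNCTION sync_customer_name();

-- Contract, only after the verification window has passed cleanly:
-- DROP TRIGGER customers_name_sync ON customers;
-- ALTER TABLE customers DROP COLUMN customer_name;
```

The trigger is what makes the dual-write window safe: old and new application versions can each write their own column, and the row ends up consistent either way.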
One proven tactic is the Expand-Contract pattern combined with PostgreSQL logical replication on Kubernetes, with two copies of the schema running side by side. The "expand" phase consists of deploying the new schema and creating a logical replication subscription that streams live data from the old "publisher" schema into the new one, backfilling existing rows along the way; once the subscription catches up, the new schema stays in sync with live production traffic. What makes this technique safe is the gradual cutover at the application layer. Before flipping the switch, we run verification queries against both schemas to confirm they agree. Then we do a canary release of application pods that point at the new schema, switching a small percentage of traffic over while monitoring for errors. (The "contract" phase proper comes afterward, when the old schema is retired.) This limits the blast radius to a small cohort of users who see the new schema before everything is validated, and we can roll back the application deployment without touching the database, which is still serving 99% of our traffic from the old "publisher" schema.
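The verification step can be as simple as comparing coarse aggregates on both sides and checking replication lag before the canary goes out. These queries assume the hypothetical `orders` table and dual `customer_name`/`full_name` columns from the rename example:

```sql
-- Run on both publisher and subscriber; matching results suggest the
-- subscription has caught up (a coarse check, not a proof of equality).
SELECT count(*) AS row_count, max(updated_at) AS newest_row FROM orders;

-- On the publisher: bytes of WAL the subscriber has not yet confirmed.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;

-- For a dual-written column pair, zero mismatches across the whole
-- verification window is the signal that the contract phase can proceed.
SELECT count(*) AS mismatches
FROM customers
WHERE customer_name IS DISTINCT FROM full_name;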