automate database schema changes through CI/CD and GitOps
Database DevOps: schema migrations in CI/CD pipelines
14 min read
When app deploys and schema changes run on different tracks, production breaks fast. This guide turns migrations into first-class delivery artifacts with Flyway or Liquibase, forward-safe expand-contract rollouts, and GitOps-aware execution order.
When application deploys and schema changes fall out of sync
Application releases and database schema updates are often owned by different teams, tools, and schedules. That mismatch drives production failures when code expects columns that were never migrated, painful recovery after destructive DDL, accidental re-runs of hand-written SQL, changes that bypass review and automated tests, and drift between local, staging, and production. In a microservice estate with dozens of independent schemas, manual migration management does not scale.
Treat schema changes as versioned delivery artifacts
The operating principle is simple: database schema changes belong in the software delivery pipeline, not in ad hoc DBA windows. Each change is an ordered migration file checked into Git, reviewed like application code, and applied by a runner that records state in its history table (for Flyway this is flyway_schema_history). If you use Flyway, follow its native naming convention such as V001__initial_schema.sql and V002__add_users_table.sql. Avoid one-off psql execution in production because it removes ordering guarantees, review traceability, and repeatability.
CI/CD ordering and expand-contract rollouts
Run pending migrations against a test database after unit tests and before deploying the new application version. Validate syntax, constraints, and data safety, then promote the app and run smoke tests on a production-like target. For zero-downtime deploys, use expand-contract (backward-compatible) steps: expand by adding nullable columns or tables while version N still runs; migrate data in the background while old code ignores new fields; contract by dropping deprecated objects only after version N+1 is stable everywhere. Pair schema rollout with feature flags so code referencing new structures stays off until both migration and binary are live.
Rollbacks: tool-specific reality, not one universal pattern
Rollback strategy depends on the migration tool and database risk profile. Liquibase has first-class rollback semantics. Flyway teams often run a forward-only model (especially in Community): instead of trying to reverse every failed migration, they ship a corrective follow-up migration and rely on backup or point-in-time recovery for destructive incidents. If you use Flyway Teams undo migrations, treat them as an additional control, not your only safety net. Feature branches should still carry migration files, and on merge the runner applies only pending versions in the target environment.
-- V004__alter_orders_add_status_column.sql
ALTER TABLE orders ADD COLUMN status VARCHAR(20) DEFAULT 'pending';-- V005__fix_orders_status_default.sql
ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'new_pending';Example: Flyway in GitHub Actions with Argo CD sync waves
The layout below keeps migrations beside service code: db/migrations for versioned SQL, db/flyway.conf for connection settings, and a CI workflow that validates and migrates against ephemeral PostgreSQL before integration tests. In Kubernetes, Argo CD can orchestrate ordering with sync waves, but the migration Job must have both config and SQL files mounted. sync-wave only controls ordering among rendered manifests; it does not magically provide migration files unless you package or mount them explicitly.
flyway.url=jdbc:postgresql://${DB_HOST}:${DB_PORT}/${DB_NAME}
flyway.user=${DB_USER}
flyway.password=${DB_PASSWORD}
flyway.locations=filesystem:./migrations
flyway.baselineOnMigrate=true
flyway.outOfOrder=false
flyway.validateOnMigrate=truename: Database Migrations
on:
push:
paths:
- 'db/**'
- 'src/**'
jobs:
validate-migrations:
runs-on: ubuntu-latest
env:
DB_HOST: localhost
DB_PORT: 5432
DB_NAME: testdb
DB_USER: testuser
DB_PASSWORD: testpassword
services:
postgres:
image: postgres:15
env:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpassword
ports:
- 5432:5432
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v4
- name: Set up Flyway
run: |
curl -L https://repo1.maven.org/maven2/org/flywaydb/flyway-commandline/9.22.3/flyway-commandline-9.22.3.zip -o flyway.zip
unzip flyway.zip
sudo mv flyway-9.22.3 /opt/flyway
sudo ln -s /opt/flyway/flyway /usr/local/bin/flyway
- name: Validate migration files
run: flyway -configFiles=db/flyway.conf validate
- name: Run migrations against test database
run: flyway -configFiles=db/flyway.conf migrate
- name: Run application integration tests
run: ./gradlew integrationTest
- name: Inspect migration history
run: flyway -configFiles=db/flyway.conf infoapiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: orders-db-migrations
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/your-org/orders-service
targetRevision: main
path: db
destination:
server: https://kubernetes.default.svc
namespace: orders-db
syncPolicy:
automated:
prune: false
selfHeal: false
syncOptions:
- ApplyOutOfSyncOnly=true
retry:
limit: 3
backoff:
duration: 5s
factor: 2
maxDuration: 60sapiVersion: batch/v1
kind: Job
metadata:
name: orders-migration-v004
namespace: orders-db
annotations:
argocd.argoproj.io/sync-wave: "-1"
spec:
ttlSecondsAfterFinished: 300
template:
spec:
restartPolicy: OnFailure
containers:
- name: flyway
image: flyway/flyway:9.22.3
args:
- migrate
- -configFiles=/mnt/config/flyway.conf
- -locations=filesystem:/flyway/sql
env:
- name: DB_HOST
valueFrom:
secretKeyRef:
name: orders-db-credentials
key: host
- name: DB_PORT
value: "5432"
- name: DB_NAME
value: orders
- name: DB_USER
valueFrom:
secretKeyRef:
name: orders-db-credentials
key: username
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: orders-db-credentials
key: password
volumeMounts:
- name: flyway-config
mountPath: /mnt/config
- name: flyway-sql
mountPath: /flyway/sql
volumes:
- name: flyway-config
configMap:
name: flyway-config
- name: flyway-sql
configMap:
name: orders-db-migrations-sqlBest practices for additive, reversible schema change
Prefer additive migrations first: add nullable columns, backfill, then enforce NOT NULL. Never DROP columns or tables until every instance stops reading them — use expand-contract and feature flags to separate risky DDL from feature rollout. Test against production-sized datasets; five seconds on a hundred rows can mean hours on tens of millions. Split DDL from DML so schema changes stay fast while backfills run as batched jobs. In PostgreSQL, lock behavior varies by operation and version: some ALTER TABLE statements are metadata-only, while others can still take ACCESS EXCLUSIVE locks. Set lock_timeout before risky DDL, and run heavy operations in low-traffic windows.
-- Avoid immediate NOT NULL on existing rows
ALTER TABLE orders ADD COLUMN priority VARCHAR(10) NOT NULL; -- fails
ALTER TABLE orders ADD COLUMN priority VARCHAR(10);
UPDATE orders SET priority = 'medium' WHERE priority IS NULL;
ALTER TABLE orders ALTER COLUMN priority SET NOT NULL;-- V005: schema change (fast)
ALTER TABLE orders ADD COLUMN shipped_at TIMESTAMP;
-- V006: data migration (can run async in batches)
UPDATE orders SET shipped_at = updated_at WHERE status = 'shipped';SET lock_timeout = '2s';
ALTER TABLE orders ADD COLUMN tracking_number VARCHAR(100);Tooling discipline, environment parity, and confidence at scale
Use dedicated migration tooling (Flyway, Liquibase, goose, or framework-native runners) instead of manual psql. Block application start until migrations succeed with init containers, startup checks, or deployment gates. Keep dev, staging, and production on the same migration runner and configuration model; only credentials and connection endpoints should differ. Send every migration through the same code review bar as application changes: documented intent, recovery strategy, performance impact, and backward compatibility. Monitor migration duration in production; sudden spikes often signal data growth, lock contention, or missing indexes. Schema management is not separate from DevOps: when migrations are versioned, automated, and operationally rehearsed, teams ship database changes with the same confidence as application releases and preserve a reliable audit trail.
Kubernetes ordering for migrations pairs naturally with declarative delivery in GitOps with Argo CD and Flux on Kubernetes.
When schema work blocks releases, diagnose friction alongside the release pipeline bottlenecks framework.
