automate database schema changes through CI/CD and GitOps

Database DevOps: schema migrations in CI/CD pipelines

14 min read

When app deploys and schema changes run on different tracks, production breaks fast. This guide turns migrations into first-class delivery artifacts with Flyway or Liquibase, forward-safe expand-contract rollouts, and GitOps-aware execution order.

When application deploys and schema changes fall out of sync

Application releases and database schema updates are often owned by different teams, tools, and schedules. That mismatch drives production failures when code expects columns that were never migrated, painful recovery after destructive DDL, accidental re-runs of hand-written SQL, changes that bypass review and automated tests, and drift between local, staging, and production. In a microservice estate with dozens of independent schemas, manual migration management does not scale.

Treat schema changes as versioned delivery artifacts

The operating principle is simple: database schema changes belong in the software delivery pipeline, not in ad hoc DBA windows. Each change is an ordered migration file checked into Git, reviewed like application code, and applied by a runner that records state in its history table (for Flyway this is flyway_schema_history). If you use Flyway, follow its native naming convention such as V001__initial_schema.sql and V002__add_users_table.sql. Avoid one-off psql execution in production because it removes ordering guarantees, review traceability, and repeatability.

CI/CD ordering and expand-contract rollouts

Run pending migrations against a test database after unit tests and before deploying the new application version. Validate syntax, constraints, and data safety, then promote the app and run smoke tests on a production-like target. For zero-downtime deploys, use expand-contract (backward-compatible) steps: expand by adding nullable columns or tables while version N still runs; migrate data in the background while old code ignores new fields; contract by dropping deprecated objects only after version N+1 is stable everywhere. Pair schema rollout with feature flags so code referencing new structures stays off until both migration and binary are live.

Rollbacks: tool-specific reality, not one universal pattern

Rollback strategy depends on the migration tool and database risk profile. Liquibase has first-class rollback semantics. Flyway teams often run a forward-only model (especially in Community): instead of trying to reverse every failed migration, they ship a corrective follow-up migration and rely on backup or point-in-time recovery for destructive incidents. If you use Flyway Teams undo migrations, treat them as an additional control, not your only safety net. Feature branches should still carry migration files, and on merge the runner applies only pending versions in the target environment.

SQL · forward migration
-- V004__alter_orders_add_status_column.sql
ALTER TABLE orders ADD COLUMN status VARCHAR(20) DEFAULT 'pending';
SQL · corrective follow-up migration
-- V005__fix_orders_status_default.sql
ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'new_pending';

Example: Flyway in GitHub Actions with Argo CD sync waves

The layout below keeps migrations beside service code: db/migrations for versioned SQL, db/flyway.conf for connection settings, and a CI workflow that validates and migrates against ephemeral PostgreSQL before integration tests. In Kubernetes, Argo CD can orchestrate ordering with sync waves, but the migration Job must have both config and SQL files mounted. sync-wave only controls ordering among rendered manifests; it does not magically provide migration files unless you package or mount them explicitly.

flyway.conf
flyway.url=jdbc:postgresql://${DB_HOST}:${DB_PORT}/${DB_NAME}
flyway.user=${DB_USER}
flyway.password=${DB_PASSWORD}
flyway.locations=filesystem:./migrations
flyway.baselineOnMigrate=true
flyway.outOfOrder=false
flyway.validateOnMigrate=true
GitHub Actions · validate and migrate
name: Database Migrations
on:
  push:
    paths:
      - 'db/**'
      - 'src/**'
jobs:
  validate-migrations:
    runs-on: ubuntu-latest
    env:
      DB_HOST: localhost
      DB_PORT: 5432
      DB_NAME: testdb
      DB_USER: testuser
      DB_PASSWORD: testpassword
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_DB: testdb
          POSTGRES_USER: testuser
          POSTGRES_PASSWORD: testpassword
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - name: Set up Flyway
        run: |
          curl -L https://repo1.maven.org/maven2/org/flywaydb/flyway-commandline/9.22.3/flyway-commandline-9.22.3.zip -o flyway.zip
          unzip flyway.zip
          sudo mv flyway-9.22.3 /opt/flyway
          sudo ln -s /opt/flyway/flyway /usr/local/bin/flyway
      - name: Validate migration files
        run: flyway -configFiles=db/flyway.conf validate
      - name: Run migrations against test database
        run: flyway -configFiles=db/flyway.conf migrate
      - name: Run application integration tests
        run: ./gradlew integrationTest
      - name: Inspect migration history
        run: flyway -configFiles=db/flyway.conf info
Argo CD · Application for db path
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders-db-migrations
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/orders-service
    targetRevision: main
    path: db
  destination:
    server: https://kubernetes.default.svc
    namespace: orders-db
  syncPolicy:
    automated:
      prune: false
      selfHeal: false
    syncOptions:
      - ApplyOutOfSyncOnly=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 60s
Kubernetes Job · Flyway migrate (sync-wave -1)
apiVersion: batch/v1
kind: Job
metadata:
  name: orders-migration-v004
  namespace: orders-db
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
spec:
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: flyway
          image: flyway/flyway:9.22.3
          args:
            - migrate
            - -configFiles=/mnt/config/flyway.conf
            - -locations=filesystem:/flyway/sql
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: orders-db-credentials
                  key: host
            - name: DB_PORT
              value: "5432"
            - name: DB_NAME
              value: orders
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: orders-db-credentials
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: orders-db-credentials
                  key: password
          volumeMounts:
            - name: flyway-config
              mountPath: /mnt/config
            - name: flyway-sql
              mountPath: /flyway/sql
      volumes:
        - name: flyway-config
          configMap:
            name: flyway-config
        - name: flyway-sql
          configMap:
            name: orders-db-migrations-sql

Best practices for additive, reversible schema change

Prefer additive migrations first: add nullable columns, backfill, then enforce NOT NULL. Never DROP columns or tables until every instance stops reading them — use expand-contract and feature flags to separate risky DDL from feature rollout. Test against production-sized datasets; five seconds on a hundred rows can mean hours on tens of millions. Split DDL from DML so schema changes stay fast while backfills run as batched jobs. In PostgreSQL, lock behavior varies by operation and version: some ALTER TABLE statements are metadata-only, while others can still take ACCESS EXCLUSIVE locks. Set lock_timeout before risky DDL, and run heavy operations in low-traffic windows.

SQL · additive column safely
-- Avoid immediate NOT NULL on existing rows
ALTER TABLE orders ADD COLUMN priority VARCHAR(10) NOT NULL;  -- fails

ALTER TABLE orders ADD COLUMN priority VARCHAR(10);
UPDATE orders SET priority = 'medium' WHERE priority IS NULL;
ALTER TABLE orders ALTER COLUMN priority SET NOT NULL;
SQL · separate DDL and DML
-- V005: schema change (fast)
ALTER TABLE orders ADD COLUMN shipped_at TIMESTAMP;

-- V006: data migration (can run async in batches)
UPDATE orders SET shipped_at = updated_at WHERE status = 'shipped';
SQL · lock timeout guard
SET lock_timeout = '2s';
ALTER TABLE orders ADD COLUMN tracking_number VARCHAR(100);

Tooling discipline, environment parity, and confidence at scale

Use dedicated migration tooling (Flyway, Liquibase, goose, or framework-native runners) instead of manual psql. Block application start until migrations succeed with init containers, startup checks, or deployment gates. Keep dev, staging, and production on the same migration runner and configuration model; only credentials and connection endpoints should differ. Send every migration through the same code review bar as application changes: documented intent, recovery strategy, performance impact, and backward compatibility. Monitor migration duration in production; sudden spikes often signal data growth, lock contention, or missing indexes. Schema management is not separate from DevOps: when migrations are versioned, automated, and operationally rehearsed, teams ship database changes with the same confidence as application releases and preserve a reliable audit trail.

Kubernetes ordering for migrations pairs naturally with declarative delivery in GitOps with Argo CD and Flux on Kubernetes.

When schema work blocks releases, diagnose friction alongside the release pipeline bottlenecks framework.