Unify secrets across clouds without sprawl or stale credentials

Managing secrets at scale in multi-cloud and hybrid environments


When AWS, GCP, Azure, and on-prem Vault all hold credentials, sprawl and stale rotations multiply blast radius. A central secrets plane with External Secrets Operator and short-lived dynamic credentials keeps access auditable and contained.

Why multi-cloud estates multiply secret risk

A single EKS cluster might read from AWS Secrets Manager, an on-prem Vault cluster, and GCP Secret Manager at the same time. Without a deliberate architecture, teams fall back to environment variables, long-lived keys in CI variables, and one-off scripts that never rotate. Secret sprawl spreads the same database password across a dozen microservices, three pipelines, and two monitoring tools—each with a different access model and audit trail. Rotating the password in one store leaves stale references elsewhere. Auditors ask who read production credentials last month; answering that across heterogeneous backends is painful. A leaked service account in a multi-tenant cluster can cascade across namespaces, cloud accounts, and partner integrations. The goal is a unified control plane: short-lived credentials, centralized policy, native Kubernetes and CI integration, and evidence that survives compliance reviews.

Reference architecture: Vault hub, cloud bridges, and Kubernetes sync

Treat HashiCorp Vault—or a comparable enterprise store—as the policy and audit anchor. Enable secret engines per concern: KV v2 for application configuration, database for dynamic PostgreSQL users, AWS and GCP engines for cloud IAM credentials, PKI for mTLS certificates. Bridge cloud-native stores where required: External Secrets Operator can sync Vault into Kubernetes Secrets, or read AWS Secrets Manager and GCP Secret Manager directly through ClusterSecretStore objects per provider. Prefer dynamic secrets with TTL over static passwords copied into manifests. Run Vault in HA with integrated storage and cloud KMS auto-unseal. Inject runtime material through External Secrets Operator for sync-and-mount workflows, or the Secrets Store CSI driver when pods should read files without persisting Kubernetes Secret objects. CI systems authenticate with JWT or AppRole, fetch only the paths they need, and never print secret values in logs.

Bash · install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm upgrade --install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace \
  --set installCRDs=true
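
The rule that CI systems never print secret values in logs can be backed by a guard in the tooling itself. The following is a minimal sketch, assuming a Python-based pipeline helper; `RedactingFilter` is a hypothetical name, and only the standard `logging` module is used. It masks values the job already knows to be secret and is not a substitute for keeping secrets out of log calls in the first place.

```python
import logging


class RedactingFilter(logging.Filter):
    """Replace known secret values with a placeholder before a record is emitted."""

    def __init__(self, secrets):
        super().__init__()
        # Drop empty strings: replacing "" would insert the mask between every character.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        message = record.getMessage()  # message with %-style args already applied
        for secret in self._secrets:
            message = message.replace(secret, "***")
        record.msg = message
        record.args = None  # args are now merged into msg
        return True  # keep the record, only redacted
```

Attaching the filter to a handler with `handler.addFilter(RedactingFilter([api_key]))` would apply it to every record that handler emits, whichever logger produced it.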

Vault HA bootstrap and secret engines

Deploy Vault with integrated (Raft) storage and KMS auto-unseal so restarts do not depend on operators entering Shamir key shares by hand. Terminate TLS on the listener, pin api_addr to the service DNS name, and enable audit devices before production traffic. Mount one engine per concern: KV for config, database for dynamic credentials, and cloud engines where applications need federated cloud access. Scope policies by path prefix so each workload reads only its subtree.

HCL · Vault server with Raft and AWS KMS seal
storage "raft" {
  path    = "/vault/data"
  node_id = "vault-node-1"
}

seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/vault/tls/tls.crt"
  tls_key_file  = "/vault/tls/tls.key"
}

api_addr     = "https://vault.internal:8200"
cluster_addr = "https://vault-node-1.internal:8201"

Bash · enable Vault secret engines
vault secrets enable -path=secret kv-v2
vault secrets enable database
vault secrets enable -path=aws/production aws
vault secrets enable -path=gcp/production gcp
vault secrets enable pki

vault audit enable file file_path=/var/log/vault/audit.log
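
Scoping policies by path prefix is mechanical enough to generate. The following is a minimal sketch, assuming the KV v2 mount named `secret` from the commands above and an `<env>/<service>` layout beneath it; `render_policy` and that layout are illustrative assumptions, not a Vault convention. KV v2 serves values under `data/` and key listings under `metadata/`, so the generated policy names both prefixes.

```python
def render_policy(service: str, env: str = "production", mount: str = "secret") -> str:
    """Return a Vault policy (HCL) granting read access to one service's KV v2 subtree."""
    for part in (service, env, mount):
        # Reject wildcards, separators and quotes so a caller cannot widen the scope.
        if not part or any(ch in part for ch in "*+/\" \n"):
            raise ValueError(f"unsafe path segment: {part!r}")
    subtree = f"{env}/{service}"
    return (
        f'path "{mount}/data/{subtree}/*" {{\n'
        f'  capabilities = ["read"]\n'
        f'}}\n'
        f'\n'
        f'path "{mount}/metadata/{subtree}/*" {{\n'
        f'  capabilities = ["list"]\n'
        f'}}\n'
    )
```

The returned text could be piped to `vault policy write service-a -`, which reads the policy from standard input.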

External Secrets Operator: Vault and multi-cloud ClusterSecretStore

A ClusterSecretStore defines how the operator authenticates to Vault or a cloud API. An ExternalSecret maps remote paths to a Kubernetes Secret on a refresh interval. Shorter intervals pick up rotations faster; dynamic database credentials should use TTLs of one hour or less, with the refresh interval set below the TTL so a replacement arrives before the old credential expires. Authenticate to Vault with the Kubernetes auth method, using the operator's service account. For workloads that must stay in a single cloud, add separate ClusterSecretStore resources for AWS Secrets Manager or GCP Secret Manager while Vault still issues dynamic database users shared across regions.

YAML · ClusterSecretStore for Vault KV v2
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: https://vault.internal:8200
      path: secret
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets

YAML · ExternalSecret for database connection fields
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-database-credentials
  namespace: production
spec:
  refreshInterval: 15m
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        DB_HOST: "{{ .host }}"
        DB_USER: "{{ .username }}"
        DB_PASS: "{{ .password }}"
        DB_PORT: "{{ .port }}"
  dataFrom:
    - extract:
        key: secret/data/production/database
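
The relationship between the refresh interval and the credential TTL can be checked before a manifest is applied. The following is a minimal sketch, assuming durations written like `15m` or `1h30m`; `parse_duration`, `refresh_is_safe`, and the default margin of half the TTL are illustrative choices, not External Secrets Operator behavior.

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600}
_TOKEN = re.compile(r"(\d+)([smh])")


def parse_duration(text: str) -> int:
    """Convert a duration such as '15m' or '1h30m' to seconds."""
    tokens = _TOKEN.findall(text)
    # Reject input with leftovers, e.g. '15x' or an empty string.
    if not tokens or "".join(n + u for n, u in tokens) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in tokens)


def refresh_is_safe(refresh_interval: str, ttl_seconds: int, margin: float = 0.5) -> bool:
    """True when the refresh interval is at most `margin` of the credential TTL."""
    return parse_duration(refresh_interval) <= ttl_seconds * margin
```

With the 15-minute interval above and a one-hour TTL, the check passes; a one-hour interval against the same TTL would not.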

Dynamic credentials, Terraform policies, and CI/CD JWT auth

Configure Vault database roles with creation and revocation statements so each pod or job receives a unique user that expires automatically. Terraform can codify mounts, database connections, Kubernetes auth roles, and least-privilege policies for External Secrets and CI pipelines. In GitLab CI, exchange the job JWT for a Vault token; in GitHub Actions, use hashicorp/vault-action with OIDC. In both cases, pass secrets to later steps as environment variables and never echo them. Static KV secrets still rotate on a schedule, with each write recorded as a new version in KV v2 metadata; after rotation, the next ESO refresh propagates the new values to clusters.

Terraform · database dynamic role and Kubernetes auth
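resource "vault_database_secret_backend_connection" "postgresql" {
  backend       = vault_mount.database.path
  name          = "production-postgresql"
  allowed_roles = ["readonly"] # roles permitted to request credentials from this connection

  postgresql {
    connection_url = "postgresql://{{username}}:{{password}}@db.internal:5432/app?sslmode=require"
    # {{username}} and {{password}} are filled from the two fields below.
    # var.db_admin_username and var.db_admin_password are assumed input variables.
    username = var.db_admin_username
    password = var.db_admin_password
  }
}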

resource "vault_database_secret_backend_role" "readonly" {
  backend       = vault_mount.database.path
  name          = "readonly"
  db_name       = vault_database_secret_backend_connection.postgresql.name
  default_ttl   = 3600
  max_ttl       = 86400
  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
  ]
}

resource "vault_kubernetes_auth_backend_role" "external_secrets" {
  role_name                        = "external-secrets"
  bound_service_account_names      = ["external-secrets"]
  bound_service_account_namespaces = ["external-secrets"]
  token_policies                   = [vault_policy.external_secrets.name]
  token_ttl                        = 3600
}
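
On the consuming side, a dynamic credential is a lease with an expiry rather than a fixed password. The following is a minimal sketch of handling the JSON that Vault returns from the `database/creds/readonly` path for the role above; `DynamicCreds`, `parse_dynamic_creds`, and `needs_refresh` are hypothetical helpers, and the password is kept out of `repr()` so it does not land in logs by accident.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DynamicCreds:
    username: str
    password: str = field(repr=False)  # keep the value out of repr() and log lines
    lease_id: str
    expires_at: datetime


def parse_dynamic_creds(body: str, issued_at: datetime) -> DynamicCreds:
    """Build DynamicCreds from a database/creds/<role> response body."""
    payload = json.loads(body)
    return DynamicCreds(
        username=payload["data"]["username"],
        password=payload["data"]["password"],
        lease_id=payload["lease_id"],
        # lease_duration is the TTL in seconds granted for this credential.
        expires_at=issued_at + timedelta(seconds=payload["lease_duration"]),
    )


def needs_refresh(creds: DynamicCreds, now: datetime, margin: timedelta) -> bool:
    """True once `now` is within `margin` of the lease expiry."""
    return now >= creds.expires_at - margin
```

A client would call `needs_refresh` before opening new connections and request a fresh credential once it returns true.
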

YAML · GitHub Actions Vault JWT without logging secrets
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: hashicorp/vault-action@v3
        with:
          url: https://vault.internal:8200
          method: jwt
          role: github-actions
          secrets: |
            secret/data/production/api-key api_key | API_KEY
      - name: Deploy
        env:
          API_KEY: ${{ env.API_KEY }}
        run: ./deploy.sh
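
For CI systems without a ready-made action, the JWT exchange described above is a single HTTP call. The following is a minimal sketch, assuming the JWT auth method is mounted at `jwt` and a role named `github-actions` exists; `build_jwt_login` and `client_token` are hypothetical helpers built on the standard library. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_jwt_login(vault_addr: str, role: str, jwt: str, mount: str = "jwt") -> urllib.request.Request:
    """Build the POST that exchanges a CI job JWT for a Vault token."""
    body = json.dumps({"role": role, "jwt": jwt}).encode()
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def client_token(response_body: bytes) -> str:
    """Pull the short-lived token out of Vault's login response."""
    return json.loads(response_body)["auth"]["client_token"]
```

Passing the request to `urllib.request.urlopen` and the response bytes to `client_token` yields a token that should be handed to later calls in memory and never written to job output.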

Operational practices: scanning, least privilege, audit, and failure drills

Block secrets in Git with detect-secrets or git-secrets in pre-commit hooks and CI. Grant each service and pipeline the narrowest Vault path policy; where discovery is required, grant list on the parent path without read (for KV v2, list applies to the metadata/ prefix). Ship Vault audit logs to a SIEM or Loki and query by request.path to answer auditor questions. Set a max TTL on every credential class and alert when rotation jobs fail. Drill failure scenarios in staging: scale Vault to zero and confirm that pods with already-synced Secrets still start; shorten the ESO refresh interval and verify that new database credentials propagate under traffic. Never store secrets in container images or echo them in pipeline output. The architecture pays for itself the first time a leak is contained, when the blast radius is a one-hour token rather than a permanent key.

HCL · least-privilege Vault policy per service
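# KV v2: secret values are read under the data/ prefix.
path "secret/data/production/service-a/*" {
  capabilities = ["read"]
}

# KV v2: key listing is authorized against the metadata/ prefix, not data/.
path "secret/metadata/production/*" {
  capabilities = ["list"]
}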
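
Alerting on failed rotation needs a staleness signal, and KV v2 version metadata can provide one. The following is a minimal sketch, assuming the JSON shape returned by `vault kv metadata get -format=json` (a `data` object with `current_version` and per-version `created_time`) and UTC timestamps; `current_version_age` and `is_stale` are hypothetical helpers, and the maximum age is whatever the rotation policy dictates.

```python
from datetime import datetime, timedelta, timezone


def current_version_age(metadata: dict, now: datetime) -> timedelta:
    """Age of the current version in a KV v2 metadata response."""
    data = metadata["data"]
    version = data["versions"][str(data["current_version"])]
    # Timestamps carry nanoseconds; keep whole seconds and treat them as UTC (assumption).
    created = datetime.strptime(version["created_time"][:19], "%Y-%m-%dT%H:%M:%S")
    return now - created.replace(tzinfo=timezone.utc)


def is_stale(metadata: dict, max_age: timedelta, now: datetime) -> bool:
    """True when the current version is older than the rotation policy allows."""
    return current_version_age(metadata, now) > max_age
```

A scheduled job could run this over each static secret and raise an alert for any path where `is_stale` returns true.
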

Bash · query Vault audit log for secret access
jq 'select(.type=="response" and (.request.path|startswith("secret/data/production/db")))' \
  /var/log/vault/audit.log \
  | jq '{time, path: .request.path, identity: .auth.display_name}'

YAML · CI secret scan with detect-secrets
- name: Scan for secrets
  run: |
    pip install detect-secrets
    detect-secrets scan --all-files --exclude-files '\.lock$' > report.json
    python -c "import json,sys; r=json.load(open('report.json'))['results']; sys.exit(1 if r else 0)"
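
A failing scan is easier to act on when the job reports where the findings are. The following is a minimal sketch that could replace the one-line check, assuming the `results` mapping in `report.json` goes from file name to a list of findings carrying `type` and `line_number`; `summarize_findings` is a hypothetical helper. It reports locations only and never the matched value.

```python
def summarize_findings(report: dict) -> list:
    """Return 'file:line type' strings for every finding, never the secret itself."""
    findings = []
    for filename, items in sorted(report.get("results", {}).items()):
        for item in items:
            findings.append(f"{filename}:{item['line_number']} {item['type']}")
    return findings
```

A CI step would load `report.json`, print each returned line, and exit non-zero when the list is not empty.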

Pipeline-focused Vault patterns and CSI mounts are covered in our secrets management guide for DevOps and CI/CD.

Runtime secret injection complements cluster hardening from our Kubernetes security hardening guide for production clusters.