Unify secrets across clouds without sprawl or stale credentials
Managing secrets at scale in multi-cloud and hybrid environments
When AWS, GCP, Azure, and on-prem Vault all hold credentials, sprawl and stale rotations multiply blast radius. A central secrets plane with External Secrets Operator and short-lived dynamic credentials keeps access auditable and contained.
Why multi-cloud estates multiply secret risk
A single EKS cluster might read from AWS Secrets Manager, an on-prem Vault cluster, and GCP Secret Manager at the same time. Without a deliberate architecture, teams fall back to environment variables, long-lived keys in CI variables, and one-off scripts that never rotate. Secret sprawl spreads the same database password across a dozen microservices, three pipelines, and two monitoring tools—each with a different access model and audit trail. Rotating the password in one store leaves stale references elsewhere. Auditors ask who read production credentials last month; answering that across heterogeneous backends is painful. A leaked service account in a multi-tenant cluster can cascade across namespaces, cloud accounts, and partner integrations. The goal is a unified control plane: short-lived credentials, centralized policy, native Kubernetes and CI integration, and evidence that survives compliance reviews.
Reference architecture: Vault hub, cloud bridges, and Kubernetes sync
Treat HashiCorp Vault—or a comparable enterprise store—as the policy and audit anchor. Enable secret engines per concern: KV v2 for application configuration, database for dynamic PostgreSQL users, AWS and GCP engines for cloud IAM credentials, PKI for mTLS certificates. Bridge cloud-native stores where required: External Secrets Operator can sync Vault into Kubernetes Secrets, or read AWS Secrets Manager and GCP Secret Manager directly through ClusterSecretStore objects per provider. Prefer dynamic secrets with TTL over static passwords copied into manifests. Run Vault in HA with integrated storage and cloud KMS auto-unseal. Inject runtime material through External Secrets Operator for sync-and-mount workflows, or the Secrets Store CSI driver when pods should read files without persisting Kubernetes Secret objects. CI systems authenticate with JWT or AppRole, fetch only the paths they need, and never print secret values in logs.
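As a guard for the "never print secret values in logs" rule, a pipeline helper can scrub known values before a log record is written. The following is a minimal sketch using Python's standard logging module; the class name RedactSecretsFilter and the "***" placeholder are illustrative assumptions, not part of Vault or any CI system.

```python
import logging


class RedactSecretsFilter(logging.Filter):
    """Replace known secret values with a placeholder before a record is emitted.

    Hypothetical helper for illustration; it only scrubs values it was given.
    """

    def __init__(self, secrets):
        super().__init__()
        # Skip empty strings so str.replace never matches between every character.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        # getMessage() applies %-style args, so secrets passed as args are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "***")
        record.msg = message
        record.args = None  # args are already folded into msg
        return True  # keep the record, now sanitized
```

Attaching the filter to a handler with handler.addFilter(RedactSecretsFilter([api_key])) covers every logger that writes through that handler. It only rewrites the formatted message, so tracebacks and output from subprocesses still need their own safeguards.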
helm repo add external-secrets https://charts.external-secrets.io
helm upgrade --install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace \
  --set installCRDs=true
Vault HA bootstrap and secret engines
Deploy Vault with Raft storage and KMS auto-unseal so restarts do not depend on operators entering Shamir key shares by hand. Terminate TLS on the listener, pin api_addr to the service DNS name, and enable audit devices before production traffic arrives. Mount engines per backend: KV for config, database for dynamic credentials, cloud engines where applications need federated cloud access. Scope policies by path prefix so each workload reads only its subtree.
storage "raft" {
  path    = "/vault/data"
  node_id = "vault-node-1"
}
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal"
}
listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/vault/tls/tls.crt"
  tls_key_file  = "/vault/tls/tls.key"
}
api_addr     = "https://vault.internal:8200"
# Cluster traffic uses the listener's cluster port, which defaults to 8201.
cluster_addr = "https://vault-node-1.internal:8201"

vault secrets enable -path=secret kv-v2
vault secrets enable database
vault secrets enable -path=aws/production aws
vault secrets enable -path=gcp/production gcp
vault secrets enable pki
vault audit enable file file_path=/var/log/vault/audit.log
External Secrets Operator: Vault and multi-cloud ClusterSecretStore
ClusterSecretStore defines how the operator authenticates to Vault or a cloud API. ExternalSecret maps remote paths to a Kubernetes Secret on a refresh interval—shorter intervals pick up rotations faster; dynamic database credentials should use TTLs under one hour with refresh below TTL. Use Kubernetes auth from the operator service account to Vault. For workloads that must stay in a single cloud, add separate ClusterSecretStore resources for AWS Secrets Manager or GCP Secret Manager while Vault still issues dynamic database users shared across regions.
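The "refresh below TTL" rule is easy to check mechanically before a manifest is merged. The sketch below parses a subset of the duration format used by refreshInterval (seconds, minutes, hours) and compares it with a credential TTL. The function names and the 0.5 safety margin are assumptions for illustration; the only requirement stated above is that the refresh interval stays under the TTL.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(value):
    """Convert durations such as '15m', '1h' or '1h30m' into seconds."""
    parts = re.findall(r"(\d+)([smh])", value)
    # Reject anything with leftover characters, e.g. '15x' or '15m '.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def refresh_is_safe(refresh_interval, credential_ttl, margin=0.5):
    """Return True when the refresh interval is at most `margin` of the TTL.

    margin=0.5 (refresh at half the TTL) is an assumed default, not a rule
    from External Secrets Operator or Vault.
    """
    return parse_duration(refresh_interval) <= margin * parse_duration(credential_ttl)
```

With these assumptions, refresh_is_safe("15m", "1h") passes while refresh_is_safe("45m", "1h") fails, which makes the check usable as a lint step over ExternalSecret manifests.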
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: https://vault.internal:8200
      path: secret
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-database-credentials
  namespace: production
spec:
  refreshInterval: 15m
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        DB_HOST: "{{ .host }}"
        DB_USER: "{{ .username }}"
        DB_PASS: "{{ .password }}"
        DB_PORT: "{{ .port }}"
  dataFrom:
    - extract:
        key: secret/data/production/database
Dynamic credentials, Terraform policies, and CI/CD JWT auth
Configure Vault database roles with creation and revocation statements so each pod or job receives a unique user that expires automatically. Terraform can codify mounts, database connections, Kubernetes auth roles, and least-privilege policies for External Secrets and CI pipelines. In GitLab CI, exchange the job JWT for a Vault token; in GitHub Actions, use hashicorp/vault-action with OIDC—export secrets to environment variables and never echo them. Static KV secrets rotate on a schedule with version metadata; after rotation, ESO refresh propagates new values to clusters.
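The JWT exchange itself is a single HTTP call to Vault's JWT auth login endpoint, which takes a role and the signed token and returns a client token under auth.client_token. The sketch below uses only the Python standard library. It assumes the auth method is mounted at "jwt"; the function names and any role name you pass in are hypothetical.

```python
import json
import urllib.request


def build_jwt_login_request(vault_addr, role, jwt, mount="jwt"):
    """Build the POST request for Vault's JWT login endpoint.

    `mount` is the auth mount path; "jwt" is an assumed default.
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login"
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def login(vault_addr, role, jwt, mount="jwt", timeout=10):
    """Exchange a CI job JWT for a Vault token; never log the return value."""
    request = build_jwt_login_request(vault_addr, role, jwt, mount)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["auth"]["client_token"]
```

In a GitLab job the JWT would typically come from an environment variable populated by the job's ID token configuration. Keeping request construction separate from the network call makes the exchange testable without a running Vault.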
resource "vault_database_secret_backend_connection" "postgresql" {
  backend = vault_mount.database.path
  name    = "production-postgresql"
  # Roles must be listed here before they can request credentials from this connection.
  allowed_roles = ["readonly"]
  postgresql {
    connection_url = "postgresql://{{username}}:{{password}}@db.internal:5432/app?sslmode=require"
  }
}
resource "vault_database_secret_backend_role" "readonly" {
  backend     = vault_mount.database.path
  name        = "readonly"
  db_name     = vault_database_secret_backend_connection.postgresql.name
  default_ttl = 3600
  max_ttl     = 86400
  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
  ]
}
resource "vault_kubernetes_auth_backend_role" "external_secrets" {
  role_name                        = "external-secrets"
  bound_service_account_names      = ["external-secrets"]
  bound_service_account_namespaces = ["external-secrets"]
  token_policies                   = [vault_policy.external_secrets.name]
  token_ttl                        = 3600
}

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: hashicorp/vault-action@v3
        with:
          url: https://vault.internal:8200
          method: jwt
          role: github-actions
          secrets: |
            secret/data/production/api-key api_key | API_KEY
      - name: Deploy
        env:
          API_KEY: ${{ env.API_KEY }}
        run: ./deploy.sh
Operational practices: scanning, least privilege, audit, and failure drills
Block secrets in Git with detect-secrets or git-secrets in pre-commit and CI. Grant each service and pipeline the narrowest Vault path policy—list parent paths without read where discovery is required. Ship Vault audit logs to SIEM or Loki; query by request.path to answer auditor questions. Set max TTL on every credential class; alert when rotation jobs fail. Drill disaster scenarios in staging: scale Vault to zero and confirm pods with already-synced Secrets still start; shorten ESO refresh and verify new database creds propagate during traffic. Never store secrets in container images or echo them in pipeline output—the first contained leak pays for the architecture when blast radius is a one-hour token, not a permanent key.
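When audit logs are shipped somewhere jq is not available, the same request.path query can be run in a few lines of Python. This sketch assumes the file audit device's JSON-lines format, with type, time, request.path, and auth.display_name fields; the function name reads_under is hypothetical.

```python
import json


def reads_under(lines, path_prefix):
    """Yield time, path, and identity for audit responses under path_prefix.

    `lines` is any iterable of JSON strings, such as an open audit log file.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        request = entry.get("request") or {}
        path = request.get("path", "")
        if entry.get("type") == "response" and path.startswith(path_prefix):
            auth = entry.get("auth") or {}
            yield {
                "time": entry.get("time"),
                "path": path,
                "identity": auth.get("display_name"),
            }
```

Passing an open file handle, for example reads_under(open("/var/log/vault/audit.log"), "secret/data/production/db"), streams the log line by line, so large files do not need to fit in memory.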
path "secret/data/production/service-a/*" {
  capabilities = ["read"]
}
# KV v2 serves list operations from the metadata/ prefix, not data/.
path "secret/metadata/production/*" {
  capabilities = ["list"]
}

jq 'select(.type=="response" and (.request.path|startswith("secret/data/production/db")))' \
  /var/log/vault/audit.log \
  | jq '{time, path: .request.path, identity: .auth.display_name}'

- name: Scan for secrets
  run: |
    pip install detect-secrets
    detect-secrets scan --all-files --exclude-files '\.lock$' > report.json
    python -c "import json,sys; r=json.load(open('report.json'))['results']; sys.exit(1 if r else 0)"
Pipeline-focused Vault patterns and CSI mounts are covered in our secrets management guide for DevOps and CI/CD.
Runtime secret injection complements cluster hardening from our Kubernetes security hardening guide for production clusters.
