Reduce delivery friction through a standardized internal platform

Building an internal developer platform: from scattered CI/CD scripts to a unified deployment experience


When each team owns a different pipeline style, delivery slows and platform risk grows. This guide shows how to build an Internal Developer Platform with a deployment abstraction layer, service catalog, policy gates, and centralized secrets.

Why fragmented CI/CD slows delivery over time

Most teams start with practical scripts that solve immediate needs: a Jenkins job here, a GitHub Actions workflow there, and a custom shell deployment path maintained by two people. The problem is not any single tool. The problem is that each team evolves its own conventions for environments, secrets, approvals, and rollback steps. Over time this creates hidden deployment variance, knowledge silos, and policy gaps that are expensive during incidents. Developers spend time understanding platform differences instead of shipping product value.

IDP as an abstraction layer, not a full CI/CD replacement

An Internal Developer Platform (IDP) should wrap existing delivery systems, not force a big-bang migration. The platform exposes one stable deploy interface while mapping requests to the right underlying workflow, cluster, and approval chain. Developers express intent; the platform resolves operational details. This reduces cognitive load and lets platform teams enforce shared controls in one place.

Developer-facing deploy command
# Unified command used by developers
idp deploy api-service --version v2.3.1 --env staging

# Platform resolves behind the scenes:
# 1) environment -> cluster + namespace
# 2) service metadata from catalog
# 3) secret references from vault path
# 4) workflow trigger with validated inputs
# 5) status updates to platform dashboard
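The resolution steps above can be sketched in Go. This is a minimal illustration, not the platform's actual implementation: the type names (DeployRequest, Target, WorkflowInputs) and the Resolve function are hypothetical, and the cluster and namespace values are borrowed from the catalog example in this guide.

```go
package main

import (
	"errors"
	"fmt"
)

// Target mirrors one entry under deployment.targets in a catalog file.
// Hypothetical type for this sketch.
type Target struct {
	Name      string
	Cluster   string
	Namespace string
}

// DeployRequest is the intent a developer expresses on the command line.
type DeployRequest struct {
	Service     string
	Version     string
	Environment string
}

// WorkflowInputs is what the platform passes to the underlying workflow.
type WorkflowInputs struct {
	Service, Version, Environment, Cluster, Namespace string
}

// ErrUnknownEnvironment is returned when the catalog has no matching target.
var ErrUnknownEnvironment = errors.New("environment not declared in catalog")

// Resolve validates the request and maps its environment to a cluster and
// namespace (steps 1 and 4 in the list above).
func Resolve(req DeployRequest, targets []Target) (WorkflowInputs, error) {
	if req.Service == "" || req.Version == "" {
		return WorkflowInputs{}, errors.New("service and version are required")
	}
	for _, t := range targets {
		if t.Name == req.Environment {
			return WorkflowInputs{
				Service:     req.Service,
				Version:     req.Version,
				Environment: req.Environment,
				Cluster:     t.Cluster,
				Namespace:   t.Namespace,
			}, nil
		}
	}
	return WorkflowInputs{}, fmt.Errorf("%w: %q", ErrUnknownEnvironment, req.Environment)
}

func main() {
	targets := []Target{
		{Name: "staging", Cluster: "eks-staging-us-east-1", Namespace: "api-gateway-staging"},
		{Name: "production", Cluster: "eks-prod-us-east-1", Namespace: "api-gateway-prod"},
	}
	req := DeployRequest{Service: "api-gateway", Version: "v2.3.1", Environment: "staging"}
	inputs, err := Resolve(req, targets)
	if err != nil {
		fmt.Println("deploy rejected:", err)
		return
	}
	fmt.Printf("%+v\n", inputs)
}
```

Rejecting an undeclared environment at this point keeps a typo in --env from ever reaching a cluster.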

Service catalog: the contract between teams and platform

A useful IDP needs a canonical service definition. The catalog should include ownership, repository, deployment targets, required secrets, runtime constraints, and reliability expectations. This document becomes a contract: developers describe service intent and requirements, while the platform implements consistent execution against that contract. Keep the catalog in Git, review every change, and version it like application code.

Service catalog entry
apiVersion: idp.angri-tech.org/v1
kind: Service
metadata:
  name: api-gateway
  owner: platform-team
spec:
  repository:
    url: https://github.com/angri-tech/api-gateway
    branch: main
  deployment:
    targets:
      - name: staging
        cluster: eks-staging-us-east-1
        namespace: api-gateway-staging
        autoDeploy: true
      - name: production
        cluster: eks-prod-us-east-1
        namespace: api-gateway-prod
        approvalRequired: true
        approvers:
          - platform-team-leads
  secrets:
    - path: secret/api-gateway/database
      required: true
  resources:
    cpu:
      request: 500m
      limit: 2000m
    memory:
      request: 512Mi
      limit: 2Gi
  slo:
    availability: 99.9
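Because the catalog is reviewed like code, a small validator can run in CI on every change. The sketch below assumes the entry has already been parsed into Go structs. The struct layout and the specific rules are illustrative assumptions, not a schema the guide defines.

```go
package main

import (
	"fmt"
	"strings"
)

// Target and Service are hypothetical, simplified mirrors of the catalog entry.
type Target struct {
	Name             string
	Cluster          string
	Namespace        string
	ApprovalRequired bool
	Approvers        []string
}

type Service struct {
	Name    string
	Owner   string
	RepoURL string
	Targets []Target
}

// Validate returns one message per problem. An empty result means the entry
// passes these (assumed) contract rules.
func Validate(s Service) []string {
	var problems []string
	if s.Name == "" {
		problems = append(problems, "metadata.name is required")
	}
	if s.Owner == "" {
		problems = append(problems, "metadata.owner is required")
	}
	if !strings.HasPrefix(s.RepoURL, "https://") {
		problems = append(problems, "spec.repository.url must be an https URL")
	}
	if len(s.Targets) == 0 {
		problems = append(problems, "at least one deployment target is required")
	}
	seen := map[string]bool{}
	for _, t := range s.Targets {
		if seen[t.Name] {
			problems = append(problems, fmt.Sprintf("duplicate target %q", t.Name))
		}
		seen[t.Name] = true
		if t.Cluster == "" || t.Namespace == "" {
			problems = append(problems, fmt.Sprintf("target %q needs cluster and namespace", t.Name))
		}
		if t.ApprovalRequired && len(t.Approvers) == 0 {
			problems = append(problems, fmt.Sprintf("target %q requires approval but lists no approvers", t.Name))
		}
	}
	return problems
}

func main() {
	svc := Service{
		Name:    "api-gateway",
		Owner:   "platform-team",
		RepoURL: "https://github.com/angri-tech/api-gateway",
		Targets: []Target{
			{Name: "staging", Cluster: "eks-staging-us-east-1", Namespace: "api-gateway-staging"},
			{Name: "production", Cluster: "eks-prod-us-east-1", Namespace: "api-gateway-prod", ApprovalRequired: true},
		},
	}
	for _, p := range Validate(svc) {
		fmt.Println("catalog problem:", p)
	}
}
```

In this run the production target is deliberately missing its approvers, so the validator reports a problem instead of letting the gap reach review.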

Policy gates and centralized secrets are non-negotiable

Without enforcement, platform standards become guidelines that teams bypass under pressure. Add a policy gate that validates every deployment request before execution: a recent security scan (for example, within the last 24 hours), declared resource requests and limits, an SLO declaration for production services, and dependency compatibility checks. In parallel, centralize secrets in Vault or a cloud secret manager and inject them at runtime. Developers should reference secret dependencies, not handle raw secret values.

Policy evaluation sketch
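// EvaluateDeployment runs every policy check and collects all violations, so a
// developer sees the full list in one pass. It needs the standard "context"
// package; DeploymentRequest, PolicyResult, PolicyViolation, and
// getLatestScanStatus are platform types and helpers not shown in this sketch.
func EvaluateDeployment(ctx context.Context, req DeploymentRequest) PolicyResult {
    var result PolicyResult

    // Fail closed: a failed lookup is treated the same as a stale scan.
    scanStatus, err := getLatestScanStatus(ctx, req.ServiceName, req.Version)
    if err != nil || scanStatus.AgeHours > 24 {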
        result.Errors = append(result.Errors, PolicyViolation{
            Policy:      "security-scan-required",
            Resource:    req.ServiceName,
            Message:     "No recent security scan found",
            Remediation: "Run: idp security-scan <service>",
        })
    }

    // Resource requests must be declared so the scheduler can place the workload.
    if req.ServiceCatalog.Spec.Resources.CPU.Request == "" {
        result.Errors = append(result.Errors, PolicyViolation{
            Policy:      "resource-requests-required",
            Resource:    req.ServiceName,
            Message:     "No CPU resource requests defined",
            Remediation: "Add resources.cpu.request to service-catalog.yaml",
        })
    }

    if req.Environment == "production" && req.ServiceCatalog.Spec.SLO.Availability == 0 {
        result.Errors = append(result.Errors, PolicyViolation{
            Policy:      "slo-required-production",
            Resource:    req.ServiceName,
            Message:     "Production services must define an SLO",
            Remediation: "Add slo.availability to service-catalog.yaml",
        })
    }

    result.Passed = len(result.Errors) == 0
    return result
}
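The secrets half of this section can be sketched the same way. The example below assumes a hypothetical SecretStore interface that would sit in front of Vault or a cloud secret manager. The in-memory store exists only to make the sketch runnable, and none of these names come from a real client library.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretRef is what a developer declares in the catalog: a path, never a value.
type SecretRef struct {
	Path     string
	Required bool
}

// SecretStore is a hypothetical abstraction over the real secret manager.
type SecretStore interface {
	Read(path string) (string, error)
}

// ErrNotFound signals that the store has no value at the given path.
var ErrNotFound = errors.New("secret not found")

// memoryStore is an in-memory stand-in used only for this sketch.
type memoryStore map[string]string

func (m memoryStore) Read(path string) (string, error) {
	v, ok := m[path]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// ResolveSecrets fetches values at deploy time. A missing optional secret is
// skipped; a missing required secret, or any other store error, aborts.
func ResolveSecrets(store SecretStore, refs []SecretRef) (map[string]string, error) {
	out := make(map[string]string, len(refs))
	for _, ref := range refs {
		v, err := store.Read(ref.Path)
		if errors.Is(err, ErrNotFound) && !ref.Required {
			continue
		}
		if err != nil {
			return nil, fmt.Errorf("resolve %s: %w", ref.Path, err)
		}
		out[ref.Path] = v
	}
	return out, nil
}

func main() {
	store := memoryStore{"secret/api-gateway/database": "placeholder-value"}
	refs := []SecretRef{
		{Path: "secret/api-gateway/database", Required: true},
		{Path: "secret/api-gateway/feature-flags", Required: false},
	}
	values, err := ResolveSecrets(store, refs)
	if err != nil {
		fmt.Println("deployment blocked:", err)
		return
	}
	// Log only the count; secret values should never reach logs.
	fmt.Println("resolved secrets:", len(values))
}
```

Keeping the interface this narrow means the catalog only ever stores paths, and swapping the backing secret manager does not touch service definitions.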

Reference workflow: IDP-triggered deployment in GitHub Actions

The workflow should be thin and deterministic: validate inputs, authenticate to the cluster, fetch runtime secrets from your secret manager, deploy with immutable image tags, and report status back to the platform API. Keep this flow reusable across services so teams do not rewrite deployment logic for each repository.

GitHub Actions deployment workflow
name: IDP Service Deployment
on:
  workflow_dispatch:
    inputs:
      service:
        required: true
      version:
        required: true
      environment:
        required: true
      cluster:
        required: true
      namespace:
        required: true
jobs:
  deploy:
    runs-on: ubuntu-latest
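    steps:
      # The Helm step reads ./charts/<service>, so the repository must be checked out first.
      - name: Check out charts
        uses: actions/checkout@v4

      - name: Validate inputs
        env:
          SERVICE: ${{ inputs.service }}
          VERSION: ${{ inputs.version }}
        run: |
          # Inputs are passed as env vars so they are never interpolated into the script text.
          test -n "$SERVICE"
          test -n "$VERSION"

      - name: Authenticate to Kubernetes
        uses: azure/k8s-set-context@v1
        with:
          kubeconfig: ${{ secrets.KUBECONFIG }}
          context: ${{ inputs.cluster }}

      - name: Fetch secrets from Vault
        uses: hashicorp/vault-action@v2
        with:
          # The kubernetes method expects a service account token on the runner.
          # GitHub-hosted runners do not have one; there, use the jwt method with GitHub OIDC.
          method: kubernetes
          url: https://vault.internal.angri-tech.org
          secrets: |
            secret/data/${{ inputs.service }}/${{ inputs.environment }} DATABASE_URL

      - name: Deploy with Helm
        env:
          SERVICE: ${{ inputs.service }}
          NAMESPACE: ${{ inputs.namespace }}
          VERSION: ${{ inputs.version }}
        run: |
          helm upgrade --install "$SERVICE" "./charts/$SERVICE" \
            --namespace "$NAMESPACE" \
            --set image.tag="$VERSION" \
            --wait --timeout 5m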
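On the platform side, something has to trigger this workflow with validated inputs. The Go sketch below builds that request against GitHub's REST endpoint for workflow dispatch events. The function name, the workflow file name idp-deploy.yml, and the token placeholder are assumptions made for illustration. The request is built but not sent, so the example runs without network access.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// dispatchBody is the JSON payload for a workflow_dispatch event:
// the git ref to run on plus the workflow's declared inputs.
type dispatchBody struct {
	Ref    string            `json:"ref"`
	Inputs map[string]string `json:"inputs"`
}

// NewDispatchRequest builds (but does not send) the dispatch call.
// baseURL is a parameter so tests can point it at a fake server.
func NewDispatchRequest(baseURL, owner, repo, workflowFile, ref, token string, inputs map[string]string) (*http.Request, error) {
	payload, err := json.Marshal(dispatchBody{Ref: ref, Inputs: inputs})
	if err != nil {
		return nil, err
	}
	endpoint := fmt.Sprintf("%s/repos/%s/%s/actions/workflows/%s/dispatches", baseURL, owner, repo, workflowFile)
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	inputs := map[string]string{
		"service":     "api-gateway",
		"version":     "v2.3.1",
		"environment": "staging",
		"cluster":     "eks-staging-us-east-1",
		"namespace":   "api-gateway-staging",
	}
	// "idp-deploy.yml" and the token are placeholders, not values from this guide.
	req, err := NewDispatchRequest("https://api.github.com", "angri-tech", "api-gateway", "idp-deploy.yml", "main", "<token>", inputs)
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

A real caller would send the request with an http.Client, treat any non-2xx response as a failed trigger, and record the result on the platform dashboard.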

Operate the platform as an internal product

Treat platform engineering as product development for internal users. Track adoption and friction with metrics: deployment frequency, lead time, policy failure rate, and time spent on platform toil. Build a paved road that is easier than bypassing the platform. Detect drift between catalog intent and runtime state, but avoid surprise auto-remediation in production without clear ownership. Start with one service or team, prove reduced friction, then scale patterns incrementally. The goal is not a perfect platform in one quarter; the goal is a platform that makes each next deployment safer and faster.
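Drift detection can start as a report rather than a controller. The sketch below compares a desired state derived from the catalog with an observed state from the cluster and only lists the differences. The State fields and the DetectDrift function are hypothetical simplifications, and how the observed state is collected is left out.

```go
package main

import "fmt"

// State is a simplified view of one deployment target. Desired values come
// from the catalog; observed values come from the running cluster.
type State struct {
	ImageTag    string
	CPURequest  string
	MemoryLimit string
}

// Drift describes one field where runtime no longer matches intent.
type Drift struct {
	Field    string
	Desired  string
	Observed string
}

// DetectDrift reports differences and changes nothing, which leaves the
// remediation decision with the owning team.
func DetectDrift(desired, observed State) []Drift {
	var drifts []Drift
	check := func(field, want, got string) {
		if want != got {
			drifts = append(drifts, Drift{Field: field, Desired: want, Observed: got})
		}
	}
	check("imageTag", desired.ImageTag, observed.ImageTag)
	check("cpuRequest", desired.CPURequest, observed.CPURequest)
	check("memoryLimit", desired.MemoryLimit, observed.MemoryLimit)
	return drifts
}

func main() {
	desired := State{ImageTag: "v2.3.1", CPURequest: "500m", MemoryLimit: "2Gi"}
	observed := State{ImageTag: "v2.3.1", CPURequest: "250m", MemoryLimit: "2Gi"}
	for _, d := range DetectDrift(desired, observed) {
		fmt.Printf("drift in %s: catalog=%s runtime=%s\n", d.Field, d.Desired, d.Observed)
	}
}
```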

If your deployment flow is already inconsistent across environments, start by mapping bottlenecks with the release pipeline bottlenecks framework.

Once the platform API is stable, declarative rollout control becomes much easier with GitOps workflows using Argo CD and Flux.