luizmachado.dev


Session 010 — CDK Pipelines: Custom Stages, ShellSteps and self-mutation in action

Estimated duration: 60 minutes
Prerequisites: session-009 — CDK Pipelines: bootstrap cross-account, OIDC connection and pipeline structure


Objective

By the end, you will be able to add a Stage with multiple stacks in sequence and in parallel, insert validation ShellSteps between stages (e.g., smoke tests post-deploy), observe the self-mutation cycle (the pipeline re-executes itself when the pipeline code changes before any application deploy), and debug a deploy that fails during the asset publishing phase.


Context

[FACT] The previous session covered the minimal pipeline structure (Source → Build → UpdatePipeline). This session dives deeper into the blocks that come after: how to organize stages, how to add validations between them, and how to use outputs from deployed stacks in subsequent steps.

[CONSENSUS] The recommended pattern for production pipelines is: deploy to dev → automated smoke tests → manual approval → deploy to prod → automated smoke tests. CDK Pipelines has native support for each of these steps via pre, post, ShellStep and ManualApprovalStep.


Key concepts

1. Sequential and parallel stages — addStage and Wave

By default, stages added with addStage are executed sequentially:

// Sequential: Dev finishes before Staging starts
pipeline.addStage(new AppStage(this, 'Dev',     { env: devEnv }));
pipeline.addStage(new AppStage(this, 'Staging', { env: stagingEnv }));
pipeline.addStage(new AppStage(this, 'Prod',    { env: prodEnv }));

For parallel stages, use Wave:

// Wave: eu-west-1 and ap-southeast-1 deploy at the same time
const multiRegionWave = pipeline.addWave('MultiRegionDeploy');

multiRegionWave.addStage(new AppStage(this, 'ProdEU', {
  env: { account: PROD_ACCOUNT, region: 'eu-west-1' },
}));

multiRegionWave.addStage(new AppStage(this, 'ProdAP', {
  env: { account: PROD_ACCOUNT, region: 'ap-southeast-1' },
}));

// After the wave, a sequential stage waits for both to finish
pipeline.addStage(new MonitoringStage(this, 'GlobalMonitoring', {
  env: { account: TOOLS_ACCOUNT, region: 'us-east-1' },
}));

Execution diagram:

Sequential:                         Wave (parallel):
  Dev  ──► Staging ──► Prod           ┌── ProdEU ──┐
  (one at a time)                     │             ├──► GlobalMonitoring
                                      └── ProdAP ──┘
                                      (simultaneous)
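
The effect on total pipeline time can be estimated with a toy model: stages in a wave cost as much as the slowest one, while sequential stages add up. This is only a sketch. The function name and the durations (in minutes) are hypothetical and are not measured from a real pipeline.

```typescript
// Each inner array is one wave; the stages inside a wave run in parallel.
// A plain addStage call behaves like a wave that contains a single stage.
function estimatePipelineMinutes(waves: number[][]): number {
  return waves
    .map((wave) => (wave.length === 0 ? 0 : Math.max(...wave)))
    .reduce((total, waveMinutes) => total + waveMinutes, 0);
}

// Hypothetical durations: ProdEU = 12, ProdAP = 15, GlobalMonitoring = 5
const sequentialMinutes = estimatePipelineMinutes([[12], [15], [5]]); // 32
const waveMinutes = estimatePipelineMinutes([[12, 15], [5]]);         // 20
```

The wave saves the duration of the faster region, at the cost of both regions being mid-deploy at the same time if something goes wrong.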

Stack ordering within a Stage:

[FACT] Within a Stage with multiple stacks, CDK determines the order automatically based on dependencies between stacks. Independent stacks are deployed in parallel; stacks with dependencies are ordered correctly.

export class AppStage extends cdk.Stage {
  constructor(scope: Construct, id: string, props: cdk.StageProps) {
    super(scope, id, props);

    const network = new NetworkStack(this, 'Network');
    const data    = new DataStack(this, 'Data', { vpc: network.vpc });
    const app     = new AppStack(this, 'App',  { vpc: network.vpc, table: data.table });
    // CDK infers the order from the references: Network → Data → App.
    // App receives data.table, so it waits for Data; without that reference,
    // Data and App would deploy in parallel once Network finishes.
  }
}

To force explicit dependency between stacks in the same Stage:

app.addDependency(data);   // ensures Data finishes before App starts
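
The ordering rule can be illustrated with a small dependency-layering sketch. This is a toy model of the idea, not the CDK implementation: the function name and the dependency map format are assumptions made for illustration.

```typescript
// Keys are stack names; values list the stacks each one depends on.
type DependencyMap = Record<string, string[]>;

// Groups stacks into layers: every stack in a layer can deploy in parallel,
// and a layer only starts after all previous layers have finished.
function deploymentLayers(deps: DependencyMap): string[][] {
  const remaining = new Set<string>(Object.keys(deps));
  const done = new Set<string>();
  const layers: string[][] = [];

  while (remaining.size > 0) {
    // A stack is ready when every dependency has already been deployed.
    const ready = Array.from(remaining)
      .filter((stack) => (deps[stack] ?? []).every((d) => done.has(d)))
      .sort();

    if (ready.length === 0) {
      const stuck = Array.from(remaining).sort().join(', ');
      throw new Error(`Circular or missing dependency among: ${stuck}`);
    }

    for (const stack of ready) {
      remaining.delete(stack);
      done.add(stack);
    }
    layers.push(ready);
  }
  return layers;
}

// Mirrors the AppStage above: Data needs Network; App needs Network and Data.
const appStageLayers = deploymentLayers({
  Network: [],
  Data: ['Network'],
  App: ['Network', 'Data'],
});
// appStageLayers is [['Network'], ['Data'], ['App']]

// Without the App -> Data edge, Data and App land in the same layer.
const independentLayers = deploymentLayers({
  Network: [],
  Data: ['Network'],
  App: ['Network'],
});
// independentLayers is [['Network'], ['App', 'Data']]
```

Calling addDependency is equivalent to adding an edge to this map by hand when no construct reference creates it for you.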

2. ShellStep — validations between stages

ShellStep is the fundamental building block for inserting shell commands at any point in the pipeline. It can be used as pre (before the stage deploy) or post (after the deploy).

import * as pipelines from 'aws-cdk-lib/pipelines';

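// Keep a reference to the stage so its outputs can be used in the post step
const appStageRef = new AppStage(this, 'Dev', { env: devEnv });

// pre runs before the stage deploy; post runs after it
pipeline.addStage(appStageRef, {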
  pre: [
    new pipelines.ShellStep('LintAndTest', {
      commands: [
        'npm ci',
        'npm run lint',
        'npm test',
      ],
    }),
  ],
  post: [
    new pipelines.ShellStep('SmokeTest', {
      commands: [
        // Calls the application URL and checks the health endpoint
        'curl -f $APP_URL/health || exit 1',
        'echo "Smoke test passed"',
      ],
      envFromCfnOutputs: {
        APP_URL: appStageRef.appUrlOutput,  // stack output (see section 4)
      },
    }),
  ],
});

Common use cases for ShellStep:

PRE-DEPLOY:
  ✅ Lint and unit tests before deploying
  ✅ Security validation (checkov, cfn-nag, cfn-guard)
  ✅ Generating documentation or artifacts
  ✅ ManualApprovalStep (a special kind of pre-step)

POST-DEPLOY:
  ✅ Smoke tests (HTTP health checks, ping endpoints)
  ✅ Integration tests against the freshly deployed environment
  ✅ Notifications (Slack, PagerDuty) for a successful deploy
  ✅ CDN cache invalidation
  ✅ Updating a deploy record (JIRA, Backstage)

3. CodeBuildStep — ShellStep with more control

CodeBuildStep is a more powerful version of ShellStep that gives access to CodeBuild configurations: custom image, additional IAM policies, environment variables, and timeout.

import * as cdk from 'aws-cdk-lib';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as iam from 'aws-cdk-lib/aws-iam';

const integrationTest = new pipelines.CodeBuildStep('IntegrationTest', {
  commands: [
    'npm ci',
    'npm run test:integration',
  ],

  // Static environment variables
  env: {
    ENVIRONMENT: 'dev',
    LOG_LEVEL: 'debug',
  },

  // Custom CodeBuild image and compute type
  buildEnvironment: {
    buildImage: codebuild.LinuxBuildImage.STANDARD_7_0,
    computeType: codebuild.ComputeType.MEDIUM,
    privileged: false,
  },

  // Additional IAM policies for the CodeBuild project
  rolePolicyStatements: [
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      actions: ['ssm:GetParameter'],
      resources: [`arn:aws:ssm:${this.region}:${this.account}:parameter/test/*`],
    }),
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      actions: ['secretsmanager:GetSecretValue'],
      resources: [`arn:aws:secretsmanager:${this.region}:${this.account}:secret:test-*`],
    }),
  ],

  // Timeout for the CodeBuild project
  timeout: cdk.Duration.minutes(30),

  // Variables taken from the outputs of deployed stacks (see section 4)
  envFromCfnOutputs: {
    API_ENDPOINT: apiStack.endpointOutput,
  },
});

pipeline.addStage(new AppStage(this, 'Dev', { env: devEnv }), {
  post: [integrationTest],
});

4. Outputs from deployed stacks in ShellSteps

A very common pattern is: deploy the stack → get the endpoint/URL of what was created → use it in the smoke test. CDK Pipelines has native support via envFromCfnOutputs.

Step 1 — expose the output in the Stage:

// lib/app-stage.ts
export class AppStage extends cdk.Stage {
  // Expose the outputs that will be used in steps
  public readonly apiEndpointOutput: CfnOutput;
  public readonly loadBalancerDnsOutput: CfnOutput;

  constructor(scope: Construct, id: string, props: cdk.StageProps) {
    super(scope, id, props);

    const appStack = new AppStack(this, 'App');

    // The outputs are CfnOutputs owned by the stack
    this.apiEndpointOutput = appStack.apiEndpoint;       // CfnOutput defined in AppStack
    this.loadBalancerDnsOutput = appStack.lbDnsName;     // CfnOutput defined in AppStack
  }
}

// lib/app-stack.ts
export class AppStack extends cdk.Stack {
  public readonly apiEndpoint: CfnOutput;
  public readonly lbDnsName: CfnOutput;

  // props is optional so that AppStage can call new AppStack(this, 'App')
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const api = new apigw.RestApi(this, 'Api');

    this.apiEndpoint = new CfnOutput(this, 'ApiEndpoint', {
      value: api.url,
    });

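    // Assumption: the stack creates its own VPC (ec2 is 'aws-cdk-lib/aws-ec2');
    // pass one in through props instead if the VPC is shared
    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });
    const lb = new elbv2.ApplicationLoadBalancer(this, 'LB', { vpc, internetFacing: true });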

    this.lbDnsName = new CfnOutput(this, 'LbDnsName', {
      value: lb.loadBalancerDnsName,
    });
  }
}

Step 2 — use the outputs in the pipeline:

// lib/pipeline-stack.ts
const devStage = new AppStage(this, 'Dev', { env: devEnv });

pipeline.addStage(devStage, {
  post: [
    new pipelines.ShellStep('SmokeTests', {
      commands: [
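        // API_ENDPOINT and LB_DNS are populated automatically
        // with the CfnOutput values after the deploy.
        // api.url already includes "https://" and a trailing slash;
        // the load balancer DNS name is a bare host name.
        'curl -f "${API_ENDPOINT}health"',
        'curl -f "http://$LB_DNS/ping"',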
        'echo "All smoke tests passed"',
      ],
      envFromCfnOutputs: {
        API_ENDPOINT: devStage.apiEndpointOutput,
        LB_DNS:       devStage.loadBalancerDnsOutput,
      },
    }),
  ],
});
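
Output values do not all have the same shape: RestApi.url is a full URL with scheme and trailing slash, while loadBalancerDnsName is only a host name. If the smoke tests are written in TypeScript rather than curl, a small helper can normalize both before appending a path. The helper name and the sample values below are hypothetical.

```typescript
// Builds a URL from a stack output value and a path, tolerating values
// with or without a scheme and with or without a trailing slash.
function healthCheckUrl(outputValue: string, path: string, defaultScheme: string = 'https'): string {
  const withScheme = /^https?:\/\//.test(outputValue)
    ? outputValue
    : `${defaultScheme}://${outputValue}`;
  const base = withScheme.replace(/\/+$/, '');  // drop trailing slashes
  const suffix = path.replace(/^\/+/, '');      // drop leading slashes
  return `${base}/${suffix}`;
}

// Illustrative values only, shaped like the two outputs above
const apiHealth = healthCheckUrl('https://abc123.execute-api.us-east-1.amazonaws.com/prod/', 'health');
// 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/health'

const lbPing = healthCheckUrl('my-lb-123.us-east-1.elb.amazonaws.com', '/ping', 'http');
// 'http://my-lb-123.us-east-1.elb.amazonaws.com/ping'
```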

What CDK does under the hood:

[FACT] envFromCfnOutputs creates an implicit dependency: the ShellStep only starts after the entire stage is deployed, and the CodeBuild project receives the values as environment variables via CodePipeline (which reads the outputs from the deployed CloudFormation stack).
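
Conceptually, the mapping works like a lookup from environment variable names to stack outputs that can only succeed once the stack has been deployed. The sketch below models that idea in plain TypeScript. The types, the function name, and the sample values are assumptions for illustration and do not reflect how CDK or CodePipeline implement it.

```typescript
interface OutputRef {
  stack: string;   // name of the stack that owns the output
  output: string;  // name of the output in that stack
}

// Outputs that exist so far, keyed by stack name and then output name.
type DeployedOutputs = Record<string, Record<string, string>>;

// Resolves every environment variable or fails if an output is not there yet.
function resolveEnvFromOutputs(
  mapping: Record<string, OutputRef>,
  deployedOutputs: DeployedOutputs,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const envName of Object.keys(mapping)) {
    const ref = mapping[envName];
    if (ref === undefined) continue;
    const value = deployedOutputs[ref.stack]?.[ref.output];
    if (value === undefined) {
      throw new Error(`${envName}: output ${ref.stack}.${ref.output} is not available`);
    }
    env[envName] = value;
  }
  return env;
}

// Hypothetical state after the Dev stage has finished deploying
const outputsAfterDev: DeployedOutputs = {
  'Dev-App': { ApiEndpoint: 'https://example.invalid/prod/' },
};

const smokeTestEnv = resolveEnvFromOutputs(
  { API_ENDPOINT: { stack: 'Dev-App', output: 'ApiEndpoint' } },
  outputsAfterDev,
);
// smokeTestEnv is { API_ENDPOINT: 'https://example.invalid/prod/' }
```

The failure branch is the reason the step has to wait: before the deploy there is nothing to read.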


5. Self-mutation in action — observing the cycle

This section documents the self-mutation behavior so you can recognize what is happening when the pipeline restarts.

Scenario: you add a new Stage to the pipeline

State before: the pipeline has Dev and Prod
You add Staging between Dev and Prod and push
Execution #N (with the new code):

  Source:          ✅ Downloads the code with the new Staging stage
  Build:           ✅ cdk synth → cloud assembly with Dev + Staging + Prod
  UpdatePipeline:  ⚠️  Current pipeline: Dev → Prod
                       Cloud assembly:   Dev → Staging → Prod
                       DIFFERENT → applies the change → RESTARTS

Execution #N+1 (same branch, updated pipeline):

  Source:          ✅ Same code
  Build:           ✅ Same cloud assembly
  UpdatePipeline:  ✅ Current pipeline = cloud assembly → no change → PROCEEDS

  Assets:          Asset upload
  Dev:             Deploy to the dev account
  SmokeTests:      Post-deploy tests for dev
  Staging:         Deploy to the staging account  ← new stage working
  Approve:         Manual approval
  Prod:            Deploy to the prod account

How to observe the restart in the Console:

CodePipeline → MeuAppPipeline → Executions
  Execution #N:   Status: Superseded  ← replaced by the restart
  Execution #N+1: Status: Succeeded   ← the one that actually ran everything

[FACT] When the pipeline restarts due to self-mutation, the previous execution gets the status Superseded (not Failed). If you see Superseded, the pipeline worked correctly — it is not an error.
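
The decision made in the UpdatePipeline step can be summarized as a comparison followed by either a restart or a pass-through. The sketch below is a simplified model in which a pipeline is just an ordered list of stage names; the real step deploys the synthesized pipeline stack and lets CloudFormation decide whether anything changed. All names here are hypothetical.

```typescript
type PipelineDefinition = string[]; // ordered stage names

interface UpdateResult {
  action: 'restart' | 'proceed';
  pipeline: PipelineDefinition; // definition in effect after the step
}

// Compares the deployed pipeline with the one found in the cloud assembly.
function updatePipelineStep(
  current: PipelineDefinition,
  synthesized: PipelineDefinition,
): UpdateResult {
  const unchanged =
    current.length === synthesized.length &&
    current.every((stage, i) => stage === synthesized[i]);

  return unchanged
    ? { action: 'proceed', pipeline: current }
    : { action: 'restart', pipeline: synthesized.slice() };
}

const deployedPipeline = ['Dev', 'Prod'];
const fromSynth = ['Dev', 'Staging', 'Prod'];

// Execution #N: the definitions differ, so the pipeline updates and restarts.
const executionN = updatePipelineStep(deployedPipeline, fromSynth);

// Execution #N+1: the pipeline now matches the cloud assembly and moves on.
const executionN1 = updatePipelineStep(executionN.pipeline, fromSynth);
```

Because the second comparison always finds the definitions equal, a single change causes at most one restart.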

Forcing a manual restart:

# Equivalent to the "Release change" button in the console
aws codepipeline start-pipeline-execution \
  --name MeuAppPipeline \
  --profile pipeline

6. Debugging asset publishing failures

The asset publishing phase (Assets stage) is where CDK uploads Lambda code and Docker images. Failures here have specific causes.

Assets stage structure:

Assets
  ├── Publish-Asset-HASH_A (Lambda zip)
  ├── Publish-Asset-HASH_B (another Lambda zip)
  └── Publish-Asset-HASH_C (Docker image)

[FACT] Each asset has its own parallel CodeBuild action. A failure in one does not immediately cancel the others — they may fail in parallel.

Common errors and diagnosis:

Error 1: AccessDenied when uploading to S3

Error: AccessDenied: Access Denied (Service: S3, Status Code: 403)

Cause: The asset publishing CodeBuild project does not have permission to write to
       the bootstrap bucket in the destination account.

Diagnosis:
  1. Check that the destination account was bootstrapped with --trust
  2. Check that the cdk-XXXX-file-publishing-role role exists in the destination account
  3. Check that the role's trust policy includes the pipeline account

Fix:
  cdk bootstrap aws://DEST_ACCOUNT/REGION \
    --profile dest \
    --trust PIPELINE_ACCOUNT \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess

Error 2: Docker not available for image assets

Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock

Cause: The asset publishing CodeBuild project does not have Docker enabled.

Fix in CDK:
  const pipeline = new pipelines.CodePipeline(this, 'Pipeline', {
    synth,
    assetPublishingCodeBuildDefaults: {
      buildEnvironment: {
        privileged: true,  // enables Docker in CodeBuild
      },
    },
  });

⚠️  If the pipeline is already deployed:
  1. Set privileged: true
  2. Push and wait for self-mutation to apply the change
  3. Only then add the Docker asset
  Reason: changing privileged on an existing pipeline requires self-mutation to
          recreate the CodeBuild project before Docker can be used

Error 3: Cannot find module in Lambda after deploy

Error: Runtime.ImportModuleError: Cannot find module 'sharp'

Cause: A module with a native binary was bundled by esbuild for the wrong architecture
       (see session 007, externalModules vs nodeModules).

Diagnosis:
  1. Check whether 'sharp' is in externalModules (wrong) or nodeModules (correct)
  2. Check that the NodejsFunction architecture matches the compiled module

Fix:
  bundling: {
    nodeModules: ['sharp'],   // compiled inside Docker for Amazon Linux
  }

Inspecting logs from a failed asset publishing:

# Find the build ID of the CodeBuild action that failed
aws codepipeline get-pipeline-state \
  --name MeuAppPipeline \
  --profile pipeline \
  --query 'stageStates[?stageName==`Assets`].actionStates[] | [?latestExecution.status==`Failed`].latestExecution.externalExecutionId' \
  --output text

# Get the CodeBuild logs
aws codebuild batch-get-builds \
  --ids BUILD_ID \
  --profile pipeline \
  --query 'builds[0].logs.deepLink'
# Prints a deep link to the full build logs in CloudWatch Logs
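
If you prefer to post-process the JSON instead of writing a --query expression, the same lookup can be done over the output of get-pipeline-state. The sketch below assumes the response has been parsed into an object. The field names follow the get-pipeline-state response, while the function name, the action names, and the sample IDs are made up for illustration.

```typescript
// Only the fields of the get-pipeline-state response that are used here.
interface ActionState {
  actionName?: string;
  latestExecution?: { status?: string; externalExecutionId?: string };
}
interface StageState {
  stageName?: string;
  actionStates?: ActionState[];
}
interface PipelineState {
  stageStates?: StageState[];
}

// Returns the external execution IDs (CodeBuild build IDs for CodeBuild
// actions) of every failed action in the given stage.
function failedBuildIds(state: PipelineState, stageName: string): string[] {
  const ids: string[] = [];
  for (const stage of state.stageStates ?? []) {
    if (stage.stageName !== stageName) continue;
    for (const action of stage.actionStates ?? []) {
      const execution = action.latestExecution;
      if (execution?.status === 'Failed' && execution.externalExecutionId) {
        ids.push(execution.externalExecutionId);
      }
    }
  }
  return ids;
}

// Made-up sample with one succeeded and one failed asset action
const sampleState: PipelineState = {
  stageStates: [
    {
      stageName: 'Assets',
      actionStates: [
        { actionName: 'AssetA', latestExecution: { status: 'Succeeded', externalExecutionId: 'project-a:0000' } },
        { actionName: 'AssetC', latestExecution: { status: 'Failed', externalExecutionId: 'project-c:1111' } },
      ],
    },
    { stageName: 'Dev', actionStates: [] },
  ],
};

const failedIds = failedBuildIds(sampleState, 'Assets');
// failedIds is ['project-c:1111']
```

Each returned ID can be passed to aws codebuild batch-get-builds --ids to reach the logs.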

7. Complete pipeline with all features

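// lib/pipeline-stack.ts
// Assumption: CONNECTION_ARN, DEV_ENV, PROD_ENV and STAGING_ACCOUNT are
// constants defined elsewhere in the project
import * as cdk from 'aws-cdk-lib';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as pipelines from 'aws-cdk-lib/pipelines';
import { Construct } from 'constructs';
import { AppStage } from './app-stage';
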
export class PipelineStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: cdk.StackProps) {
    super(scope, id, props);

    const source = pipelines.CodePipelineSource.connection(
      'minha-org/meu-repo', 'main',
      { connectionArn: CONNECTION_ARN }
    );

    const synth = new pipelines.ShellStep('Synth', {
      input: source,
      commands: ['npm ci', 'npm run build', 'npx cdk synth'],
    });

    const pipeline = new pipelines.CodePipeline(this, 'Pipeline', {
      pipelineName: 'MeuAppPipeline',
      synth,
      selfMutation: true,
      publishAssetsInParallel: true,
      codeBuildDefaults: {
        buildEnvironment: {
          buildImage: codebuild.LinuxBuildImage.STANDARD_7_0,
        },
      },
    });

    // ── Stage Dev ─────────────────────────────────────────────────────
    const devStage = new AppStage(this, 'Dev', { env: DEV_ENV });

    pipeline.addStage(devStage, {
      pre: [
        new pipelines.ShellStep('UnitTests', {
          commands: ['npm ci', 'npm test'],
        }),
      ],
      post: [
        new pipelines.CodeBuildStep('IntegrationTests', {
          commands: [
            'npm ci',
            'npm run test:integration',
          ],
          envFromCfnOutputs: {
            API_URL: devStage.apiEndpointOutput,
          },
          rolePolicyStatements: [
            new iam.PolicyStatement({
              actions: ['execute-api:Invoke'],
              resources: ['*'],
            }),
          ],
        }),
      ],
    });

    // ── Wave: multi-region Staging (parallel) ─────────────────────────
    const stagingWave = pipeline.addWave('Staging', {
      pre: [new pipelines.ManualApprovalStep('ApproveStaging')],
    });

    stagingWave.addStage(new AppStage(this, 'StagingUS', {
      env: { account: STAGING_ACCOUNT, region: 'us-east-1' },
    }));

    stagingWave.addStage(new AppStage(this, 'StagingEU', {
      env: { account: STAGING_ACCOUNT, region: 'eu-west-1' },
    }));

    // ── Stage Prod ────────────────────────────────────────────────────
    const prodStage = new AppStage(this, 'Prod', { env: PROD_ENV });

    pipeline.addStage(prodStage, {
      pre: [
        new pipelines.ManualApprovalStep('ApproveProd'),
        // Automatic gate: blocks if IAM permissions were broadened
        new pipelines.ConfirmPermissionsBroadening('CheckPermissions', {
          stage: prodStage,
        }),
      ],
      post: [
        new pipelines.ShellStep('ProdSmokeTests', {
          commands: ['curl -f $API_URL/health || exit 1'],
          envFromCfnOutputs: {
            API_URL: prodStage.apiEndpointOutput,
          },
        }),
      ],
    });
  }
}

Common pitfalls

1. envFromCfnOutputs with wrong stack output

If you pass a CfnOutput from a stack that was not deployed in this stage, the pipeline fails with an invalid reference error. envFromCfnOutputs only accepts outputs from stacks within the stage that precedes the step. To use an output from another stage, you need to pass it via SSM Parameter Store or Secrets Manager.

2. Wave with stages that have cross-dependencies

Stages within a Wave are deployed in parallel — which assumes they are independent. If Stage A depends on an output from Stage B (both in the same Wave), this creates a circular dependency that CDK detects with an error. Move the dependent stages outside the Wave.
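
A quick way to reason about this before restructuring the pipeline is to list the dependencies that point at another member of the same wave. The sketch below is a toy check, not something CDK exposes: the function name, the stage names, and the dependency map are hypothetical.

```typescript
// Returns [stage, dependency] pairs where both belong to the same wave.
function intraWaveDependencies(
  wave: string[],
  dependsOn: Record<string, string[]>,
): Array<[string, string]> {
  const members = new Set<string>(wave);
  const conflicts: Array<[string, string]> = [];
  for (const stage of wave) {
    for (const dependency of dependsOn[stage] ?? []) {
      if (members.has(dependency)) {
        conflicts.push([stage, dependency]);
      }
    }
  }
  return conflicts;
}

// StageA needs an output from StageB, yet both were placed in one wave.
const waveConflicts = intraWaveDependencies(
  ['StageA', 'StageB'],
  { StageA: ['StageB'], StageB: [] },
);
// waveConflicts is [['StageA', 'StageB']]
```

Any pair in the result names a stage that should move to a later wave or to a plain addStage call after the wave.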

3. Modifying privileged on an existing pipeline without doing self-mutation first

If you enable Docker assets on a project that already has a deployed pipeline, but don't wait for self-mutation to apply privileged: true before adding the Docker asset, the CodeBuild project will still have privileged: false on the execution where the asset appears for the first time. The failure will happen on that execution. The solution: make two commits — first one with only privileged: true, wait for the pipeline to self-mutate; then a second commit with the Docker asset.


Reflection exercise

You have a pipeline with dev → staging → prod. The smoke tests for the dev stage are failing intermittently: 70% of the time they pass, 30% fail with curl: (7) Failed to connect. You suspect the tests start before the application is fully ready after the deploy.

What are the possible strategies to solve this? Consider at least three different approaches (from adjustments in the ShellStep to changes in the health check architecture). For each one, describe the trade-off between implementation complexity, reliability, and impact on total pipeline time. Which one would you implement first and why?


Resources for further reading

Complete Pipelines readme:
- aws-cdk-lib.pipelines module — complete reference for ShellStep, CodeBuildStep, Wave, envFromCfnOutputs, ConfirmPermissionsBroadening and all configuration options for the CodePipeline construct.

CDK Pipelines guide:
- Continuous integration and delivery (CI/CD) using CDK Pipelines — troubleshooting section covers the most common asset publishing and self-mutation errors.

Original blog post:
- CDK Pipelines: Continuous Delivery for AWS CDK Applications — complete walk-through with the reasoning behind the self-mutation design and the Source → Build → UpdatePipeline progression.