Session 007 — CDK: Assets — Lambda bundling, Docker images and local files
Estimated duration: 60 minutes
Prerequisites: session-006 — CDK: Stacks, environments and multi-account patterns
Objective
By the end, you will be able to deploy a Lambda function with bundled dependencies (via aws_lambda_python_alpha or NodejsFunction), publish a Docker image via DockerImageFunction, and understand how assets are staged to S3/ECR by CDK.
Context
[FACT] Assets are local files or Docker images that CDK needs to publish to AWS before deploying stacks. All Lambda code, every container image, and every local file referenced in a CloudFormation template goes through the assets pipeline.
[CONSENSUS] Understanding how assets work is essential for diagnosing two common production problems: slow deploys (assets that haven't changed being rebuilt or re-uploaded) and CI/CD pipeline failures on build agents without Docker (bundling that depends on Docker when the agent doesn't have it).
Key concepts
1. How assets work — the complete cycle
When CDK encounters an asset during synth, it:
- Calculates a source hash of the content (SHA256)
- Copies the asset to cdk.out/asset.<hash>/
- Registers the publishing instructions in the cloud assembly's manifest.json
- During deploy, checks if the asset with that hash already exists at the destination (S3/ECR)
- If it doesn't exist, uploads it; if it does, skips the upload (idempotent by hash)
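The hashing step above can be sketched with nothing but the Node.js standard library. This is an illustration, not CDK's actual fingerprinting code: `fingerprintDirectory` is a hypothetical helper, and the choice to hash relative paths plus file contents in sorted order is an assumption made to keep the result deterministic.

```typescript
import { createHash } from 'crypto';
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper (not a CDK API): SHA-256 over the relative path and
// content of every file under `dir`, visited in sorted order.
export function fingerprintDirectory(dir: string): string {
  const hash = createHash('sha256');
  const walk = (current: string): void => {
    const entries = fs
      .readdirSync(current, { withFileTypes: true })
      .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
    for (const entry of entries) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        hash.update(path.relative(dir, full)); // a rename changes the hash
        hash.update(fs.readFileSync(full)); // so does any content change
      }
    }
  };
  walk(dir);
  return hash.digest('hex');
}

// Usage: the same content gives the same hash; an edit gives a new one.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'asset-demo-'));
fs.writeFileSync(path.join(demoDir, 'index.py'), 'def handler(e, c): return 1');
const before = fingerprintDirectory(demoDir);
console.log(before === fingerprintDirectory(demoDir)); // true
fs.writeFileSync(path.join(demoDir, 'index.py'), 'def handler(e, c): return 2');
console.log(before === fingerprintDirectory(demoDir)); // false
```

The hash doubles as the staging directory name and the upload key, which is what makes the later "skip if it already exists" check possible.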
cdk synth
  └── cdk.out/
       ├── MeuStack.template.json
       ├── asset.a1b2c3d4.../     ← asset content (directory to zip, or Docker build context)
       │    └── index.py
       ├── asset.e5f6g7h8.zip     ← asset that is already zipped
       └── manifest.json          ← upload instructions
cdk deploy
  └── CDK CLI reads manifest.json
       ├── For each asset:
       │    ├── Computes the local hash
       │    ├── Checks whether s3://cdk-XXXX-assets-ACCOUNT-REGION/<hash> exists
       │    └── If it does not exist: uploads it
       └── Hands the S3/ECR location to CloudFormation (written into the template by the default synthesizer; passed as parameters by the legacy synthesizer)
Asset types:
  File assets   → local files/directories → zip → S3 (bootstrap bucket)
  Docker assets → local Dockerfile → docker build → push → ECR (bootstrap repo)
The hash guarantees immutability:
  Same code + same dependency versions = same hash = no re-upload
  A change to any file in the directory = new hash = new upload
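The "skip if it already exists" behaviour can be modelled in a few lines. This is a sketch under stated assumptions: `Store` is a hypothetical in-memory stand-in for the bootstrap bucket, and `publishIfAbsent` is an illustrative name, not something the CDK CLI exposes.

```typescript
// Hypothetical in-memory stand-in for the bootstrap bucket: object key -> body.
type Store = Map<string, string>;

// Upload only when no object with this hash-based key exists yet.
export function publishIfAbsent(
  store: Store,
  hash: string,
  body: string,
): 'uploaded' | 'skipped' {
  const key = `${hash}.zip`;
  if (store.has(key)) {
    return 'skipped'; // same hash means same content, nothing to do
  }
  store.set(key, body);
  return 'uploaded';
}

// Usage: the second deploy of unchanged code does not upload again.
const demoBucket: Store = new Map();
console.log(publishIfAbsent(demoBucket, 'a1b2c3d4', 'zip-bytes')); // uploaded
console.log(publishIfAbsent(demoBucket, 'a1b2c3d4', 'zip-bytes')); // skipped
```

Because the key is derived from the content, running the publish step twice is harmless, which is why repeated deploys of unchanged code are cheap.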
2. File assets — local files to S3
The simplest type of asset: a local file or directory that goes to the bootstrap S3 bucket.
import * as path from 'path';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as s3_assets from 'aws-cdk-lib/aws-s3-assets';

// A local file referenced in the template
const asset = new s3_assets.Asset(this, 'ConfigFile', {
  path: path.join(__dirname, '../config/app-config.json'),
});
// The asset exposes: asset.s3BucketName, asset.s3ObjectKey, asset.httpUrl
console.log(asset.s3ObjectKey); // hash-based key

// Example: pass the config to an EC2 instance via UserData
const instance = new ec2.Instance(this, 'Server', { ... });
asset.grantRead(instance.role); // read permission for the instance role
instance.userData.addS3DownloadCommand({
  bucket: asset.bucket,
  bucketKey: asset.s3ObjectKey,
  localFile: '/etc/app/config.json',
});
Directory as asset (zips automatically):
const dirAsset = new s3_assets.Asset(this, 'AppCode', {
  path: path.join(__dirname, '../src'),
  // exclude: glob patterns for files left out of both the hash and the zip
  exclude: ['**/*.test.ts', 'node_modules/**'],
});
3. Lambda with inline code — the simplest (no asset)
For small functions without external dependencies, the code can be inline:
import * as lambda from 'aws-cdk-lib/aws-lambda';

const fn = new lambda.Function(this, 'SimpleHandler', {
  runtime: lambda.Runtime.PYTHON_3_12,
  handler: 'index.handler',
  code: lambda.Code.fromInline(`
def handler(event, context):
    return {'statusCode': 200, 'body': 'ok'}
`),
});
[FACT] Code.fromInline has a 4 KB limit. For anything larger, use Code.fromAsset or the bundling constructs.
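A small guard can make the inline-versus-asset choice explicit. This is a sketch only: `INLINE_LIMIT` assumes the 4 KB figure above means 4096 characters, and `codeSourceFor` is a hypothetical helper, not a CDK function.

```typescript
// Assumption: the 4 KB inline limit quoted above corresponds to 4096 characters.
const INLINE_LIMIT = 4096;

// Hypothetical helper: decide which Code.* factory a handler source should use.
export function codeSourceFor(source: string): 'inline' | 'asset' {
  const fits = source.length > 0 && source.length <= INLINE_LIMIT;
  return fits ? 'inline' : 'asset';
}

// Usage: a tiny handler stays inline, anything bigger goes through the asset pipeline.
const tinyHandler = "def handler(event, context):\n    return {'statusCode': 200}\n";
console.log(codeSourceFor(tinyHandler)); // inline
console.log(codeSourceFor('x'.repeat(5000))); // asset
```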
Lambda with local file (no bundling):
// Zips the directory and uploads it to S3
const fn = new lambda.Function(this, 'Handler', {
  runtime: lambda.Runtime.PYTHON_3_12,
  handler: 'index.handler',
  code: lambda.Code.fromAsset(path.join(__dirname, '../lambda/handler')),
  // ⚠️ No bundling: dependencies must already be in the directory.
  // You are responsible for running pip install -t ./handler before synth.
});
4. NodejsFunction — bundling TypeScript/JS with esbuild
[FACT] NodejsFunction is the high-level construct for TypeScript or JavaScript Lambdas. It uses esbuild to transpile and bundle the code locally (no Docker, very fast) or inside a Docker container (fallback when esbuild is not available).
import * as path from 'path';
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as lambda_nodejs from 'aws-cdk-lib/aws-lambda-nodejs';

const fn = new lambda_nodejs.NodejsFunction(this, 'ApiHandler', {
  // entry: the entry point. If omitted, CDK looks next to the file that defines
  // the construct for <defining-file-name>.<construct-id>.ts (or .js)
  entry: path.join(__dirname, '../lambda/api-handler.ts'),
  handler: 'handler', // name of the exported function
  runtime: lambda.Runtime.NODEJS_22_X,
  architecture: lambda.Architecture.ARM_64,
  bundling: {
    minify: true, // production: yes; dev: optional
    sourceMap: true, // source map for debugging
    target: 'es2022',
    externalModules: [
      '@aws-sdk/*', // AWS SDK v3 is already included in the Lambda runtime
    ],
    // Dependencies that must NOT be bundled (installed into node_modules in the zip)
    nodeModules: ['sharp'], // modules with native binaries
  },
  environment: {
    TABLE_NAME: table.tableName, // `table` is assumed to be defined elsewhere in the stack
    LOG_LEVEL: 'INFO',
  },
  timeout: cdk.Duration.seconds(30),
  memorySize: 256,
});
What esbuild does:
Before bundling (project directory):
  api-handler.ts
  utils/logger.ts
  utils/validator.ts
  node_modules/
    zod/
    axios/
    @aws-sdk/
After bundling (zip sent to Lambda):
  index.js ← everything in a single file (tree-shaking included)
  # @aws-sdk was excluded (already in the runtime)
  # zod and axios were included (bundled inline)
  # sharp was excluded from the bundle (shipped separately in node_modules)
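The outcome above can be expressed as a small classification function. This is a simplified model, not esbuild's or CDK's resolution logic: `placementOf` is a hypothetical helper, and the wildcard handling only supports a trailing `*`.

```typescript
interface BundlingLists {
  externalModules: string[];
  nodeModules: string[];
}

type Placement = 'external' | 'node_modules' | 'bundled';

// Simplified matching (an assumption of this sketch): only a trailing "*" is a wildcard.
function matchesPattern(pattern: string, name: string): boolean {
  return pattern.endsWith('*')
    ? name.startsWith(pattern.slice(0, -1))
    : pattern === name;
}

// Hypothetical helper: where does an imported package end up in the final zip?
export function placementOf(name: string, lists: BundlingLists): Placement {
  if (lists.nodeModules.includes(name)) {
    return 'node_modules'; // installed next to index.js, not inlined
  }
  if (lists.externalModules.some((p) => matchesPattern(p, name))) {
    return 'external'; // left out entirely, expected from the runtime
  }
  return 'bundled'; // inlined into index.js
}

// Usage with the lists from the NodejsFunction example above.
const demoLists: BundlingLists = { externalModules: ['@aws-sdk/*'], nodeModules: ['sharp'] };
for (const pkg of ['@aws-sdk/client-s3', 'zod', 'axios', 'sharp']) {
  console.log(pkg, '->', placementOf(pkg, demoLists));
}
```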
Local bundling vs Docker:
LOCAL bundling (default when esbuild is available):
  ✅ Much faster (seconds vs minutes)
  ✅ No Docker required
  ⚠️ Modules with native binaries (sharp, bcrypt) may be built for the wrong OS or architecture
     → list them in nodeModules to keep them out of the main bundle, and bundle in Docker so they are installed for the Lambda platform
DOCKER bundling (fallback):
  ✅ Ensures compatibility with the Lambda environment (Amazon Linux 2023 on current runtimes, Amazon Linux 2 on older ones)
  ✅ Correct for modules with native binaries
  ⚠️ Slow: every synth runs the bundling step in a container
  ⚠️ Requires Docker in the build environment
Force Docker:
  bundling: { forceDockerBundling: true }
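The fallback rule described above can be written down as a decision function. This is a sketch of the rule as stated in this section, not CDK's implementation; `BuildEnv` and `bundlingMode` are hypothetical names.

```typescript
interface BuildEnv {
  esbuildInstalled: boolean;
  dockerAvailable: boolean;
  forceDockerBundling?: boolean;
}

// Hypothetical helper: which bundling path would a synth take in this environment?
export function bundlingMode(env: BuildEnv): 'local' | 'docker' {
  if (!env.forceDockerBundling && env.esbuildInstalled) {
    return 'local'; // fast path: esbuild on the build machine
  }
  if (env.dockerAvailable) {
    return 'docker'; // fallback, or explicitly forced
  }
  throw new Error('Bundling needs either a local esbuild or a working Docker');
}

// Usage: a developer laptop vs a CI agent that has neither tool.
console.log(bundlingMode({ esbuildInstalled: true, dockerAvailable: true })); // local
try {
  bundlingMode({ esbuildInstalled: false, dockerAvailable: false });
} catch (err) {
  console.log((err as Error).message);
}
```

Running a check like this at the start of a pipeline turns a confusing mid-synth failure into an early, readable one.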
5. PythonFunction — bundling Python with dependencies
[FACT] PythonFunction is an alpha construct (@aws-cdk/aws-lambda-python-alpha) that manages the bundling of Python functions, installing dependencies in a Docker container compatible with Lambda.
# Install the alpha package separately
npm install @aws-cdk/aws-lambda-python-alpha
# or, for Python CDK apps:
pip install aws-cdk.aws-lambda-python-alpha
import * as path from 'path';
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as lambda_python from '@aws-cdk/aws-lambda-python-alpha';

const fn = new lambda_python.PythonFunction(this, 'DataProcessor', {
  entry: path.join(__dirname, '../lambda/processor'),
  // CDK detects the dependency manager automatically:
  //   requirements.txt → pip
  //   Pipfile          → pipenv
  //   poetry.lock      → poetry
  //   uv.lock          → uv
  runtime: lambda.Runtime.PYTHON_3_12,
  handler: 'handler', // name of the handler function (default: handler)
  index: 'processor.py', // file containing it, if not index.py
  architecture: lambda.Architecture.ARM_64,
  bundling: {
    assetHashType: cdk.AssetHashType.OUTPUT, // hash of the output, not of the source
    // Custom bundling image (if you need system-level dependencies)
    image: cdk.DockerImage.fromBuild(path.join(__dirname, '../docker/build-env')),
  },
  environment: {
    BUCKET_NAME: bucket.bucketName, // `bucket` is assumed to be defined elsewhere in the stack
  },
});
Function directory structure:
lambda/processor/
├── processor.py       ← function code
├── requirements.txt   ← dependencies (or pyproject.toml, Pipfile, etc.)
└── utils/
    └── helpers.py
# requirements.txt:
boto3>=1.26.0
pandas==2.0.3
pydantic>=2.0
[FACT] PythonFunction runs pip install inside a Docker container based on the AWS SAM build image for the chosen runtime (e.g., public.ecr.aws/sam/build-python3.12). This ensures that modules with C extensions (numpy, pandas) are installed for the correct OS and architecture.
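The file-to-tool mapping listed in the PythonFunction example can be sketched as a lookup. The mapping itself comes from this section; the precedence when several files are present is an assumption of this sketch, and `detectManager` is a hypothetical name rather than part of the alpha module's API.

```typescript
type Manager = 'pipenv' | 'poetry' | 'uv' | 'pip' | 'none';

// Hypothetical helper: pick the dependency manager from the files in the entry directory.
// Assumption: lock-file based tools win over a plain requirements.txt.
export function detectManager(files: string[]): Manager {
  const has = (name: string): boolean => files.includes(name);
  if (has('Pipfile')) return 'pipenv';
  if (has('poetry.lock')) return 'poetry';
  if (has('uv.lock')) return 'uv';
  if (has('requirements.txt')) return 'pip';
  return 'none'; // nothing to install, only the source is packaged
}

// Usage with the directory layout shown above.
console.log(detectManager(['processor.py', 'requirements.txt', 'utils'])); // pip
console.log(detectManager(['processor.py'])); // none
```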
Why the alpha status matters:
[FACT] alpha constructs in CDK have APIs that can change between versions without deprecation notice. For production, pin the alpha package version explicitly and monitor the changelog before upgrading. PythonFunction has been in alpha for a long time — it's widely used, but the API is not stable.
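One way to enforce the pinning advice is a small check over the version string declared in package.json. This is a sketch: `isPinned` is a hypothetical helper, and the version numbers below are illustrative only.

```typescript
// Hypothetical helper: true only for an exact version such as "2.150.0-alpha.0",
// false for ranges ("^", "~", ">="), wildcards, and tags like "latest".
export function isPinned(versionSpec: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(versionSpec);
}

// Usage (illustrative version numbers):
console.log(isPinned('2.150.0-alpha.0')); // true
console.log(isPinned('^2.150.0-alpha.0')); // false
```

A check like this could run in CI against the alpha package's entry in package.json, so that an accidental caret range is caught before it pulls in a breaking release.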
6. DockerImageFunction — Lambda with a full Docker image
When you need full control over the runtime (custom binaries, languages not natively supported, multiple system files), use DockerImageFunction:
import * as path from 'path';
import * as cdk from 'aws-cdk-lib';
import * as ecr_assets from 'aws-cdk-lib/aws-ecr-assets';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const fn = new lambda.DockerImageFunction(this, 'CustomRuntime', {
  code: lambda.DockerImageCode.fromImageAsset(
    path.join(__dirname, '../docker/my-function'),
    {
      // Build args for the Dockerfile
      buildArgs: {
        FUNCTION_VERSION: '1.2.3',
      },
      // Platform for Lambda on ARM64
      platform: ecr_assets.Platform.LINUX_ARM64,
      // Custom Dockerfile name (default: Dockerfile)
      file: 'Dockerfile.lambda',
    }
  ),
  architecture: lambda.Architecture.ARM_64,
  timeout: cdk.Duration.minutes(5),
  memorySize: 1024,
});
Dockerfile for Lambda:
# Dockerfile in the docker/my-function/ folder
FROM public.ecr.aws/lambda/python:3.12
# Install system dependencies (the python:3.12 base image is Amazon Linux 2023, which ships dnf instead of yum)
RUN dnf install -y libgomp && dnf clean all
# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the function code
COPY app/ ${LAMBDA_TASK_ROOT}/
# Handler
CMD ["app.handler"]
Difference between DockerImageFunction and PythonFunction:
PythonFunction
  → Uses Docker only for BUNDLING (installing deps)
  → The result is a zip uploaded to S3
  → The runtime is managed by AWS (Lambda managed runtime)
  → Lighter, faster cold start
DockerImageFunction
  → The Docker image IS the runtime
  → The image goes to ECR
  → You control 100% of the environment
  → Slightly slower cold start for large images
  → Use when: custom runtime, system binaries, deployment package over 250 MB unzipped
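The "use when" criteria above can be condensed into a small decision helper. This is only a sketch of the rule of thumb from this section; `choosePackaging` and its input shape are hypothetical.

```typescript
interface FunctionNeeds {
  unzippedSizeMb: number; // code plus dependencies, unzipped
  customRuntime: boolean; // language or runtime not offered as a managed runtime
  systemBinaries: boolean; // needs OS packages beyond the managed runtime
}

// The 250 MB figure is the zip package limit quoted in the comparison above.
const ZIP_LIMIT_MB = 250;

// Hypothetical helper: zip (PythonFunction / NodejsFunction) or image (DockerImageFunction)?
export function choosePackaging(needs: FunctionNeeds): 'zip' | 'image' {
  const tooBig = needs.unzippedSizeMb > ZIP_LIMIT_MB;
  return needs.customRuntime || needs.systemBinaries || tooBig ? 'image' : 'zip';
}

// Usage: a small API handler vs a large ML inference function.
console.log(choosePackaging({ unzippedSizeMb: 40, customRuntime: false, systemBinaries: false })); // zip
console.log(choosePackaging({ unzippedSizeMb: 900, customRuntime: false, systemBinaries: false })); // image
```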
7. How assets are staged — the complete flow
cdk synth
│
├─ NodejsFunction: esbuild runs locally
│    └── cdk.out/asset.HASH_A/index.js   (generated bundle)
│
├─ PythonFunction: bundling runs in a Docker container
│    └── cdk.out/asset.HASH_B/           (code plus installed libs)
│
└─ DockerImageFunction: the build context is staged (the image itself is not built yet)
     └── cdk.out/asset.HASH_C/           (Dockerfile and context files)
cdk deploy
│
├─ asset.HASH_A → zip → s3://cdk-XXXX-assets-ACCOUNT-REGION/HASH_A.zip
│                        ↑ uploaded only if it does not exist yet
│
├─ asset.HASH_B → zip → s3://cdk-XXXX-assets-ACCOUNT-REGION/HASH_B.zip
│
└─ asset.HASH_C → docker build → docker push → ECR (bootstrap repo)
                   ↑ built and pushed only if no image with this tag exists
CloudFormation deploy
  └── The template references (written into the template by the default synthesizer; passed as parameters by the legacy synthesizer):
        AssetBucketName: cdk-XXXX-assets-ACCOUNT-REGION
        AssetObjectKey: HASH_A.zip
        DockerImageUri: ACCOUNT.dkr.ecr.REGION.amazonaws.com/cdk-XXXX-...:HASH_C
Checking the generated assets:
# List all assets in the cloud assembly
cat cdk.out/manifest.json | jq '.artifacts | to_entries[] | select(.value.type == "aws:cloudformation:stack") | .value.metadata'
# Check the size of the assets before upload
du -sh cdk.out/asset.*
# Inspect the contents of a Lambda asset
unzip -l cdk.out/asset.XXXX.zip
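The same inspection can be scripted. The sketch below summarizes an asset manifest of the kind CDK writes next to each template (cdk.out/<StackName>.assets.json). The `files` / `dockerImages` shape used here is an assumption based on what recent CDK versions emit, so check it against your own cdk.out before relying on it; `summarizeAssets` is a hypothetical helper.

```typescript
// Assumed shape of <StackName>.assets.json (only the fields this sketch reads).
interface AssetManifest {
  files?: Record<string, { source: { path?: string; packaging?: string } }>;
  dockerImages?: Record<string, { source: { directory?: string } }>;
}

interface AssetSummary {
  kind: 'file' | 'docker';
  hash: string;
  source: string;
}

// Hypothetical helper: flatten file and Docker assets into one list.
export function summarizeAssets(manifest: AssetManifest): AssetSummary[] {
  const files = Object.entries(manifest.files ?? {}).map(([hash, asset]) => ({
    kind: 'file' as const,
    hash,
    source: asset.source.path ?? '(unknown)',
  }));
  const images = Object.entries(manifest.dockerImages ?? {}).map(([hash, asset]) => ({
    kind: 'docker' as const,
    hash,
    source: asset.source.directory ?? '(unknown)',
  }));
  return [...files, ...images];
}

// Usage with a hand-written sample; in a real project the object would come from
// JSON.parse(fs.readFileSync('cdk.out/<StackName>.assets.json', 'utf8')).
const sampleManifest: AssetManifest = {
  files: { HASH_A: { source: { path: 'asset.HASH_A', packaging: 'zip' } } },
  dockerImages: { HASH_C: { source: { directory: 'asset.HASH_C' } } },
};
console.log(summarizeAssets(sampleManifest));
```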
Practical example
Stack with three types of Lambda using different assets:
import * as cdk from 'aws-cdk-lib';
import * as ecr_assets from 'aws-cdk-lib/aws-ecr-assets';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as lambda_nodejs from 'aws-cdk-lib/aws-lambda-nodejs';
import * as lambda_python from '@aws-cdk/aws-lambda-python-alpha';
import * as path from 'path';

export class LambdaAssetsStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. TypeScript with NodejsFunction (esbuild)
    const apiHandler = new lambda_nodejs.NodejsFunction(this, 'ApiHandler', {
      entry: path.join(__dirname, '../src/api/handler.ts'),
      runtime: lambda.Runtime.NODEJS_22_X,
      architecture: lambda.Architecture.ARM_64,
      bundling: {
        minify: true,
        externalModules: ['@aws-sdk/*'],
      },
    });

    // 2. Python with dependencies (PythonFunction)
    const processor = new lambda_python.PythonFunction(this, 'Processor', {
      entry: path.join(__dirname, '../src/processor'),
      runtime: lambda.Runtime.PYTHON_3_12,
      architecture: lambda.Architecture.ARM_64,
    });

    // 3. Custom container (DockerImageFunction)
    const mlInference = new lambda.DockerImageFunction(this, 'MLInference', {
      code: lambda.DockerImageCode.fromImageAsset(
        path.join(__dirname, '../docker/ml-inference'),
        { platform: ecr_assets.Platform.LINUX_ARM64 }
      ),
      architecture: lambda.Architecture.ARM_64,
      memorySize: 3008,
      timeout: cdk.Duration.minutes(5),
    });

    // Outputs for inspection
    new cdk.CfnOutput(this, 'ApiHandlerArn', { value: apiHandler.functionArn });
    new cdk.CfnOutput(this, 'ProcessorArn', { value: processor.functionArn });
    new cdk.CfnOutput(this, 'MLInferenceArn', { value: mlInference.functionArn });
  }
}
# Before deploying, see what is going to be built
cdk synth 2>&1 | grep -E "Bundling|Building|Asset"
# Deploy and watch the asset uploads
cdk deploy --verbose 2>&1 | grep -E "upload|push|asset"
# Check the final size of each function in Lambda
aws lambda list-functions \
  --query 'Functions[?starts_with(FunctionName, `LambdaAssets`)].{Name:FunctionName,Size:CodeSize}' \
  --output table
Common pitfalls
1. Bundling that works locally but fails in CI
NodejsFunction uses esbuild locally but falls back to Docker when esbuild is not available. If your CI pipeline doesn't have Docker, and also doesn't have esbuild installed, the synth fails. Solution: ensure esbuild is a dev dependency in the project's package.json (npm install --save-dev esbuild), so CI installs it along with the other deps.
2. Asset hash doesn't change when a dependency changes
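If you use AssetHashType.SOURCE (default), the hash is calculated over the input files, not over the bundling output. Editing requirements.txt itself does change the hash, because the file is part of the source directory. The problem appears when the resolved dependencies change while the source files stay the same: an unpinned range such as pydantic>=2.0 picks up a newer release, or the bundling image changes. In that case no new hash is generated and CDK doesn't re-upload. Use AssetHashType.OUTPUT in PythonFunction so the hash reflects the bundling result, or pin exact versions.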
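The difference between the two hash types can be illustrated with plain SHA-256 hashing. This is a toy model, not CDK's fingerprinting: the file contents and resolved versions below are made up, and it assumes a ranged dependency that resolves to a newer release between two builds.

```typescript
import { createHash } from 'crypto';

// Hash a list of strings in order (stand-in for hashing files).
const sha256 = (parts: string[]): string => {
  const hash = createHash('sha256');
  for (const part of parts) {
    hash.update(part);
  }
  return hash.digest('hex');
};

// The source directory is identical for both builds (illustrative contents).
const sourceFiles = ['def handler(event, context): ...', 'pydantic>=2.0'];

// Hypothetical bundling outputs: the range resolved to a newer release on the second build.
const firstBuildOutput = [...sourceFiles, 'site-packages/pydantic-2.5.0'];
const secondBuildOutput = [...sourceFiles, 'site-packages/pydantic-2.6.0'];

// SOURCE-style hash: computed from the inputs only, so it cannot see the change.
console.log(sha256(sourceFiles) === sha256(sourceFiles)); // true
// OUTPUT-style hash: computed from what was actually bundled, so it changes.
console.log(sha256(firstBuildOutput) === sha256(secondBuildOutput)); // false
```

The trade-off is that an output hash requires the bundling step to run before the hash is known, so synth cannot skip bundling based on the hash alone.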
3. nodeModules vs externalModules in NodejsFunction
externalModules: ['sharp'] → sharp is excluded from the bundle and NOT included in the zip
                             (assumes sharp is already available in the runtime, which it is not)
                           → The Lambda fails at runtime with "Cannot find module 'sharp'"
nodeModules: ['sharp']     → sharp is excluded from the esbuild bundle but included in the zip under node_modules/
                           → npm install runs in the bundling environment (local, or the Docker container)
                           → Combine with forceDockerBundling: true to get the correct binary for Amazon Linux
Use externalModules only for the AWS SDK and modules you are certain exist in the runtime. Use nodeModules for modules with native binaries that need to be compiled for Lambda.
Reflection exercise
You are implementing an image processing service in Python that uses the Pillow library (with C extensions) and numpy. The function needs to run on ARM64 to reduce cost.
You have three options: PythonFunction with default bundling, PythonFunction with a custom build image, or DockerImageFunction with your own Dockerfile.
For each option, describe: what happens during cdk synth, what is sent to AWS, the impact on Lambda cold start, and the CI/CD environment requirements. Which would you choose for production and why? Is there any scenario where the answer would change?
Resources for further study
Assets:
- Assets and the AWS CDK — covers both types (file and Docker), how the hash is calculated and how CDK decides when to re-upload.
NodejsFunction:
- aws-cdk-lib.aws_lambda_nodejs module — all bundling options with esbuild: minify, sourceMap, externalModules, nodeModules, define, banner.
PythonFunction:
- @aws-cdk/aws-lambda-python-alpha module — supported dependency managers, bundling options and how to use a custom build image.