Files
gh-zxkane-aws-skills/skills/aws-serverless-eda/references/deployment-best-practices.md
2025-11-30 09:08:38 +08:00

19 KiB

Serverless Deployment Best Practices

Deployment best practices for serverless applications including CI/CD, testing, and deployment strategies.

Table of Contents

Software Release Process

Four Stages of Release

1. Source Phase:

  • Developers commit code changes
  • Code review (peer review)
  • Version control (Git)

2. Build Phase:

  • Compile code
  • Run unit tests
  • Style checking and linting
  • Create deployment packages
  • Build container images

3. Test Phase:

  • Integration tests with other systems
  • Load testing
  • UI testing
  • Security testing (penetration testing)
  • Acceptance testing

4. Production Phase:

  • Deploy to production environment
  • Monitor for errors
  • Validate deployment success
  • Rollback if needed

CI/CD Maturity Levels

Continuous Integration (CI):

  • Automated build on code commit
  • Automated unit testing
  • Manual deployment to test/production

Continuous Delivery (CD):

  • Automated deployment to test environments
  • Manual approval for production
  • Automated testing in non-prod

Continuous Deployment:

  • Fully automated pipeline
  • Automated deployment to production
  • No manual intervention after code commit

Infrastructure as Code

Framework Selection

AWS SAM (Serverless Application Model):

# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: nodejs20.x
      CodeUri: src/
      Events:
        Api:
          Type: Api
          Properties:
            Path: /orders
            Method: post

Benefits:

  • Simple, serverless-focused syntax
  • Built-in best practices
  • SAM CLI for local testing
  • Integrates with CodeDeploy

AWS CDK:

new NodejsFunction(this, 'OrderFunction', {
  entry: 'src/orders/handler.ts',
  environment: {
    TABLE_NAME: ordersTable.tableName,
  },
});

ordersTable.grantReadWriteData(orderFunction);

Benefits:

  • Type-safe, programmatic
  • Reusable constructs
  • Rich AWS service support
  • Better for complex infrastructure

When to use:

  • SAM: Serverless-only applications, simpler projects
  • CDK: Complex infrastructure, multiple services, reusable patterns

Environment Management

Separate environments:

// CDK App
const app = new cdk.App();

new ServerlessStack(app, 'DevStack', {
  env: { account: '111111111111', region: 'us-east-1' },
  environment: 'dev',
  logLevel: 'DEBUG',
});

new ServerlessStack(app, 'ProdStack', {
  env: { account: '222222222222', region: 'us-east-1' },
  environment: 'prod',
  logLevel: 'INFO',
});

SAM with parameters:

Parameters:
  Environment:
    Type: String
    Default: dev
    AllowedValues:
      - dev
      - staging
      - prod

Resources:
  Function:
    Type: AWS::Serverless::Function
    Properties:
      Environment:
        Variables:
          ENVIRONMENT: !Ref Environment
          LOG_LEVEL: !If [IsProd, INFO, DEBUG]

CI/CD Pipeline Design

AWS CodePipeline

Comprehensive pipeline:

import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as codepipeline_actions from 'aws-cdk-lib/aws-codepipeline-actions';

const sourceOutput = new codepipeline.Artifact();
const buildOutput = new codepipeline.Artifact();

const pipeline = new codepipeline.Pipeline(this, 'Pipeline', {
  pipelineName: 'serverless-pipeline',
});

// Source stage
pipeline.addStage({
  stageName: 'Source',
  actions: [
    new codepipeline_actions.CodeStarConnectionsSourceAction({
      actionName: 'GitHub_Source',
      owner: 'myorg',
      repo: 'myrepo',
      branch: 'main',
      output: sourceOutput,
      connectionArn: githubConnection.connectionArn,
    }),
  ],
});

// Build stage
pipeline.addStage({
  stageName: 'Build',
  actions: [
    new codepipeline_actions.CodeBuildAction({
      actionName: 'Build',
      project: buildProject,
      input: sourceOutput,
      outputs: [buildOutput],
    }),
  ],
});

// Test stage
pipeline.addStage({
  stageName: 'Test',
  actions: [
    new codepipeline_actions.CloudFormationCreateUpdateStackAction({
      actionName: 'Deploy_Test',
      templatePath: buildOutput.atPath('packaged.yaml'),
      stackName: 'test-stack',
      adminPermissions: true,
    }),
    new codepipeline_actions.CodeBuildAction({
      actionName: 'Integration_Tests',
      project: testProject,
      input: buildOutput,
      runOrder: 2,
    }),
  ],
});

// Production stage (with manual approval)
pipeline.addStage({
  stageName: 'Production',
  actions: [
    new codepipeline_actions.ManualApprovalAction({
      actionName: 'Approve',
    }),
    new codepipeline_actions.CloudFormationCreateUpdateStackAction({
      actionName: 'Deploy_Prod',
      templatePath: buildOutput.atPath('packaged.yaml'),
      stackName: 'prod-stack',
      adminPermissions: true,
      runOrder: 2,
    }),
  ],
});

GitHub Actions

Serverless deployment workflow:

# .github/workflows/deploy.yml
name: Deploy Serverless Application

on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm test

      - name: Setup SAM CLI
        uses: aws-actions/setup-sam@v2

      - name: Build SAM application
        run: sam build

      - name: Deploy to Dev
        if: github.ref != 'refs/heads/main'
        run: |
          sam deploy \
            --no-confirm-changeset \
            --no-fail-on-empty-changeset \
            --stack-name dev-stack \
            --parameter-overrides Environment=dev

      - name: Run integration tests
        run: npm run test:integration

      - name: Deploy to Prod
        if: github.ref == 'refs/heads/main'
        run: |
          sam deploy \
            --no-confirm-changeset \
            --no-fail-on-empty-changeset \
            --stack-name prod-stack \
            --parameter-overrides Environment=prod

Testing Strategies

Unit Testing

Test business logic independently:

// handler.ts
export const processOrder = (order: Order): ProcessedOrder => {
  // Pure business logic (easily testable)
  validateOrder(order);
  calculateTotal(order);
  return transformOrder(order);
};

export const handler = async (event: any) => {
  const order = parseEvent(event);
  const processed = processOrder(order); // Testable function
  await saveToDatabase(processed);
  return formatResponse(processed);
};

// handler.test.ts
import { processOrder } from './handler';

describe('processOrder', () => {
  it('calculates total correctly', () => {
    const order = {
      items: [
        { price: 10, quantity: 2 },
        { price: 5, quantity: 3 },
      ],
    };

    const result = processOrder(order);

    expect(result.total).toBe(35);
  });

  it('throws on invalid order', () => {
    const invalid = { items: [] };
    expect(() => processOrder(invalid)).toThrow();
  });
});

Integration Testing

Test in actual AWS environment:

// integration.test.ts
import { LambdaClient, InvokeCommand } from '@aws-sdk/client-lambda';
import { DynamoDBClient, GetItemCommand } from '@aws-sdk/client-dynamodb';

describe('Order Processing Integration', () => {
  const lambda = new LambdaClient({});
  const dynamodb = new DynamoDBClient({});

  it('processes order end-to-end', async () => {
    // Invoke Lambda
    const response = await lambda.send(new InvokeCommand({
      FunctionName: process.env.FUNCTION_NAME,
      Payload: JSON.stringify({
        orderId: 'test-123',
        items: [{ productId: 'prod-1', quantity: 2 }],
      }),
    }));

    const result = JSON.parse(Buffer.from(response.Payload!).toString());

    expect(result.statusCode).toBe(200);

    // Verify database write
    const dbResult = await dynamodb.send(new GetItemCommand({
      TableName: process.env.TABLE_NAME,
      Key: { orderId: { S: 'test-123' } },
    }));

    expect(dbResult.Item).toBeDefined();
    expect(dbResult.Item?.status.S).toBe('PROCESSED');
  });
});

Local Testing with SAM

Test locally before deployment:

# Start local API
sam local start-api

# Invoke function locally
sam local invoke OrderFunction -e events/create-order.json

# Generate sample events
sam local generate-event apigateway aws-proxy > event.json

# Debug locally
sam local invoke OrderFunction -d 5858

# Test with Docker
sam local start-api --docker-network my-network

Load Testing

Test under production load:

# Install Artillery
npm install -g artillery

# Create load test
cat > load-test.yml <<EOF
config:
  target: https://api.example.com
  phases:
    - duration: 300 # 5 minutes
      arrivalRate: 50 # 50 requests/second
      rampTo: 200 # Ramp to 200 req/sec
scenarios:
  - flow:
      - post:
          url: /orders
          json:
            orderId: "{{ $randomString() }}"
EOF

# Run load test
artillery run load-test.yml --output report.json

# Generate HTML report
artillery report report.json

Deployment Strategies

All-at-Once Deployment

Simple, fast, risky:

# SAM template
Resources:
  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      DeploymentPreference:
        Type: AllAtOnce # Deploy immediately

Use for:

  • Development environments
  • Non-critical applications
  • Quick hotfixes (with caution)

Blue/Green Deployment

Zero-downtime deployment:

Resources:
  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      AutoPublishAlias: live
      DeploymentPreference:
        Type: Linear10PercentEvery1Minute
        Alarms:
          - !Ref ErrorAlarm
          - !Ref LatencyAlarm

Deployment types:

  • Linear10PercentEvery1Minute: 10% traffic shift every minute
  • Linear10PercentEvery2Minutes: Slower, more conservative
  • Linear10PercentEvery3Minutes: Even slower
  • Linear10PercentEvery10Minutes: Very gradual
  • Canary10Percent5Minutes: 10% for 5 min, then 100%
  • Canary10Percent10Minutes: 10% for 10 min, then 100%
  • Canary10Percent30Minutes: 10% for 30 min, then 100%

Canary Deployment

Test with subset of traffic:

Resources:
  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      AutoPublishAlias: live
      DeploymentPreference:
        Type: Canary10Percent10Minutes
        Alarms:
          - !Ref ErrorAlarm
          - !Ref LatencyAlarm
        Hooks:
          PreTraffic: !Ref PreTrafficHook
          PostTraffic: !Ref PostTrafficHook

  PreTrafficHook:
    Type: AWS::Serverless::Function
    Properties:
      Handler: hooks.pre_traffic
      Runtime: python3.12
      # Runs before traffic shift
      # Validates new version

  PostTrafficHook:
    Type: AWS::Serverless::Function
    Properties:
      Handler: hooks.post_traffic
      Runtime: python3.12
      # Runs after traffic shift
      # Validates deployment success

CDK with CodeDeploy:

import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';

const alias = fn.currentVersion.addAlias('live');

new codedeploy.LambdaDeploymentGroup(this, 'DeploymentGroup', {
  alias,
  deploymentConfig: codedeploy.LambdaDeploymentConfig.CANARY_10PERCENT_10MINUTES,
  alarms: [errorAlarm, latencyAlarm],
  autoRollback: {
    failedDeployment: true,
    stoppedDeployment: true,
    deploymentInAlarm: true,
  },
});

Deployment Hooks

Pre-traffic hook (validation):

# hooks.py
import boto3

lambda_client = boto3.client('lambda')
codedeploy = boto3.client('codedeploy')

def pre_traffic(event, context):
    """
    Validate new version before traffic shift
    """
    function_name = event['DeploymentId']
    version = event['NewVersion']

    try:
        # Invoke new version with test payload
        response = lambda_client.invoke(
            FunctionName=f"{function_name}:{version}",
            InvocationType='RequestResponse',
            Payload=json.dumps({'test': True})
        )

        # Validate response
        if response['StatusCode'] == 200:
            codedeploy.put_lifecycle_event_hook_execution_status(
                deploymentId=event['DeploymentId'],
                lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
                status='Succeeded'
            )
        else:
            raise Exception('Validation failed')

    except Exception as e:
        print(f'Pre-traffic validation failed: {e}')
        codedeploy.put_lifecycle_event_hook_execution_status(
            deploymentId=event['DeploymentId'],
            lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
            status='Failed'
        )

Post-traffic hook (verification):

def post_traffic(event, context):
    """
    Verify deployment success after traffic shift
    """
    try:
        # Check CloudWatch metrics
        cloudwatch = boto3.client('cloudwatch')

        metrics = cloudwatch.get_metric_statistics(
            Namespace='AWS/Lambda',
            MetricName='Errors',
            Dimensions=[{'Name': 'FunctionName', 'Value': function_name}],
            StartTime=deployment_start_time,
            EndTime=datetime.utcnow(),
            Period=300,
            Statistics=['Sum']
        )

        # Validate no errors
        total_errors = sum(point['Sum'] for point in metrics['Datapoints'])

        if total_errors == 0:
            codedeploy.put_lifecycle_event_hook_execution_status(
                deploymentId=event['DeploymentId'],
                lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
                status='Succeeded'
            )
        else:
            raise Exception(f'{total_errors} errors detected')

    except Exception as e:
        print(f'Post-traffic verification failed: {e}')
        codedeploy.put_lifecycle_event_hook_execution_status(
            deploymentId=event['DeploymentId'],
            lifecycleEventHookExecutionId=event['LifecycleEventHookExecutionId'],
            status='Failed'
        )

Rollback and Safety

Automatic Rollback

Configure rollback triggers:

DeploymentPreference:
  Type: Canary10Percent10Minutes
  Alarms:
    - !Ref ErrorAlarm
    - !Ref LatencyAlarm
  # Automatically rolls back if alarms trigger

Rollback scenarios:

  • CloudWatch alarm triggers during deployment
  • Pre-traffic hook fails
  • Post-traffic hook fails
  • Deployment manually stopped

CloudWatch Alarms for Deployment

Critical alarms during deployment:

// Error rate alarm
const errorAlarm = new cloudwatch.Alarm(this, 'ErrorAlarm', {
  metric: fn.metricErrors({
    statistic: 'Sum',
    period: Duration.minutes(1),
  }),
  threshold: 5,
  evaluationPeriods: 2,
  treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
});

// Duration alarm (regression)
const durationAlarm = new cloudwatch.Alarm(this, 'DurationAlarm', {
  metric: fn.metricDuration({
    statistic: 'Average',
    period: Duration.minutes(1),
  }),
  threshold: previousAvgDuration * 1.2, // 20% increase
  evaluationPeriods: 2,
  comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
});

// Throttle alarm
const throttleAlarm = new cloudwatch.Alarm(this, 'ThrottleAlarm', {
  metric: fn.metricThrottles({
    statistic: 'Sum',
    period: Duration.minutes(1),
  }),
  threshold: 1,
  evaluationPeriods: 1,
});

Version Management

Use Lambda versions and aliases:

const version = fn.currentVersion;

const prodAlias = version.addAlias('prod');
const devAlias = version.addAlias('dev');

// Gradual rollout with weighted aliases
new lambda.Alias(this, 'LiveAlias', {
  aliasName: 'live',
  version: newVersion,
  additionalVersions: [
    { version: oldVersion, weight: 0.9 }, // 90% old
    // 10% automatically goes to main version (new)
  ],
});

Best Practices Checklist

Pre-Deployment

  • Code review completed
  • Unit tests passing
  • Integration tests passing
  • Security scan completed
  • Dependencies updated
  • Infrastructure validated (CDK synth, SAM validate)
  • Environment variables configured

Deployment

  • Use IaC (SAM, CDK, Terraform)
  • Separate environments (dev, staging, prod)
  • Automate deployments via CI/CD
  • Use gradual deployment (canary or linear)
  • Configure CloudWatch alarms
  • Enable automatic rollback
  • Use deployment hooks for validation

Post-Deployment

  • Monitor CloudWatch metrics
  • Check CloudWatch Logs for errors
  • Verify X-Ray traces
  • Validate business metrics
  • Check alarm status
  • Review deployment logs
  • Document any issues

Rollback Preparation

  • Keep previous version available
  • Document rollback procedure
  • Test rollback in non-prod
  • Configure automatic rollback
  • Monitor during rollback
  • Communication plan for rollback

Deployment Patterns

Multi-Region Deployment

Active-Passive:

// Primary region
new ServerlessStack(app, 'PrimaryStack', {
  env: { region: 'us-east-1' },
  isPrimary: true,
});

// Secondary region (standby)
new ServerlessStack(app, 'SecondaryStack', {
  env: { region: 'us-west-2' },
  isPrimary: false,
});

// Route 53 health check and failover
const healthCheck = new route53.CfnHealthCheck(this, 'HealthCheck', {
  type: 'HTTPS',
  resourcePath: '/health',
  fullyQualifiedDomainName: 'api.example.com',
});

Active-Active:

// Deploy to multiple regions
const regions = ['us-east-1', 'us-west-2', 'eu-west-1'];

for (const region of regions) {
  new ServerlessStack(app, `Stack-${region}`, {
    env: { region },
  });
}

// Route 53 geolocation routing
new route53.ARecord(this, 'GeoRecord', {
  zone: hostedZone,
  recordName: 'api',
  target: route53.RecordTarget.fromAlias(
    new targets.ApiGatewayDomain(domain)
  ),
  geoLocation: route53.GeoLocation.country('US'),
});

Feature Flags with AppConfig

Safe feature rollout:

import { AppConfigData } from '@aws-sdk/client-appconfigdata';

const appconfig = new AppConfigData({});

export const handler = async (event: any) => {
  // Fetch feature flags
  const config = await appconfig.getLatestConfiguration({
    ConfigurationToken: token,
  });

  const features = JSON.parse(config.Configuration.toString());

  if (features.newFeatureEnabled) {
    return newFeatureHandler(event);
  }

  return legacyHandler(event);
};

Summary

  • IaC: Use SAM or CDK for all deployments
  • Environments: Separate dev, staging, production
  • CI/CD: Automate build, test, and deployment
  • Testing: Unit, integration, and load testing
  • Gradual Deployment: Use canary or linear for production
  • Alarms: Configure and monitor during deployment
  • Rollback: Enable automatic rollback on failures
  • Hooks: Validate before and after traffic shifts
  • Versioning: Use Lambda versions and aliases
  • Multi-Region: Plan for disaster recovery