Serverless, AWS

AWS Lambda Cold Starts: Causes, Impact, and How to Eliminate Them

Naveen Teja

3/2/2026

AWS Lambda cold starts are one of the most searched pain points in serverless architecture. A cold start occurs when Lambda must initialize a new execution environment to handle an incoming request. During this initialization phase, AWS allocates compute resources, downloads your deployment package, and bootstraps the runtime. For Node.js functions this typically adds 200-400 ms of latency; for Java or .NET runtimes, the penalty can exceed 2 seconds.
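Because an execution environment's module scope runs exactly once, a module-level flag is a common way to observe cold starts from inside your own function. A minimal sketch (handler shape and names are illustrative, not a specific AWS SDK API):

```typescript
// Module scope executes once per execution environment (the "init" phase).
// Warm invocations reuse the environment, so this flag is true exactly once.
let coldStart = true;

// Illustrative handler; real Lambda handlers also receive event and context.
export async function handler(): Promise<{ coldStart: boolean }> {
  const wasCold = coldStart;
  coldStart = false; // every later invocation in this environment is warm
  return { coldStart: wasCold };
}
```

Logging that flag (or emitting it as a CloudWatch metric) lets you measure how often real traffic actually hits a cold environment.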

Cold starts disproportionately affect user-facing APIs where latency directly impacts conversion rates. They occur most frequently after a function has been idle for 5-15 minutes, during traffic spikes requiring concurrent scale-out, or after a new deployment is published. The impact compounds when your function initializes a database connection pool or loads ML models during the init phase.

There are three effective mitigation strategies. First, keep your deployment package lean — strip unused dependencies, use tree-shaking, and prefer a lightweight runtime such as Node.js or Python. Second, move expensive initialization (DB connections, SDK clients) outside the handler function so it is reused across warm invocations. Third, for latency-critical functions, use Provisioned Concurrency to pre-warm a fixed number of execution environments. The Terraform below configures Provisioned Concurrency on a specific Lambda alias, so requests served by that pre-warmed capacity never incur a cold start.

lambda-provisioned-concurrency.tf
# Provisioned Concurrency must target a published version or alias, never
# $LATEST — the function needs publish = true so numbered versions exist.
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.api_handler.function_name
  function_version = aws_lambda_function.api_handler.version
}

resource "aws_lambda_provisioned_concurrency_config" "warm" {
  function_name                     = aws_lambda_function.api_handler.function_name
  qualifier                         = aws_lambda_alias.live.name
  provisioned_concurrent_executions = 10
}

# Auto-scale provisioned concurrency based on utilization.
resource "aws_appautoscaling_target" "lambda_target" {
  max_capacity       = 50
  min_capacity       = 5
  resource_id        = "function:${aws_lambda_function.api_handler.function_name}:live"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"

  depends_on = [aws_lambda_alias.live]
}

# A scalable target does nothing on its own — this target-tracking policy
# adjusts provisioned concurrency to keep utilization near 70%.
resource "aws_appautoscaling_policy" "lambda_policy" {
  name               = "lambda-pc-utilization"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda_target.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda_target.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value = 0.7
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
  }
}