Optimizing lambda coldstarts

Lambo kicking it into 6th gear

You thought you could just set some flags and your node.js lambda function would be bundled, tree-shaken, minified and go from ice cold to glowing hot in milliseconds? My sweet summer child. I thought that too, but then I analyzed the bundle, pored over some traces and realized how wrong I was. What follows is the rabbit hole I went down to optimize my lambda coldstarts.

How it started

I had an api with a 3.4M zipped bundle that was using require to import dependencies and a 1567 ms coldstart. Loading the secrets I needed for every request added 375 ms. Most invocations were coldstarts due to low traffic and before any processing happened there was 1974 ms of waiting. Dang. Before you tell me I can use provisioned concurrency, I know; I was more interested in what was making my coldstarts slow. This is what my package.json looked like:

  "dependencies": {
    "@aws-lambda-powertools/logger": "^1.12.1",  //powertools packages
    "@aws-lambda-powertools/metrics": "^1.12.1",
    "@aws-lambda-powertools/tracer": "^1.12.1",
    "axios": "^1.1.0", //for http requests
    "hash36": "^1.0.0", 
    "sha256-uint8array": "^0.10.3",
    "jose": "^4.6.0", //for jwt
    "lambda-api": "^1.0", //for the api
    "@aws-sdk/client-dynamodb": "^3.131.0", //for interacting with AWS
    "@aws-sdk/client-iam": "^3.131.0",
    "@aws-sdk/client-secrets-manager": "^3.131.0",
    "@aws-sdk/client-sts": "^3.131.0",
    "@aws-sdk/lib-dynamodb": "^3.131.0"
  }

Note

These benchmarks were done using node.js 18, with 256MB of memory running on arm64.
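
If you want to collect similar numbers yourself, one option is the REPORT line that Lambda writes to CloudWatch Logs after each invocation. It carries an Init Duration field only when the invocation was a coldstart. The sketch below pulls that value out of a log line. The sample lines are made up for illustration, and this is not necessarily how the numbers in this post were gathered.

```javascript
// Returns the init (coldstart) time in ms from a Lambda REPORT log line,
// or null when the line has no "Init Duration" field (a warm invocation).
function initDurationMs(reportLine) {
  const match = /Init Duration: ([\d.]+) ms/.exec(reportLine);
  return match ? Number(match[1]) : null;
}

// Made-up sample lines in the REPORT format, not real measurements.
const coldLine =
  'REPORT RequestId: example-1\tDuration: 400.00 ms\tBilled Duration: 400 ms\t' +
  'Memory Size: 256 MB\tMax Memory Used: 90 MB\tInit Duration: 600.50 ms';
const warmLine =
  'REPORT RequestId: example-2\tDuration: 20.00 ms\tBilled Duration: 20 ms\t' +
  'Memory Size: 256 MB\tMax Memory Used: 90 MB';

console.log(initDurationMs(coldLine)); // 600.5
console.log(initDurationMs(warmLine)); // null
```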

Knocking down the size with a bundler

The first step was to switch to static imports and use a bundler.

Before:

const { Logger } = require('@aws-lambda-powertools/logger');

After:

import { Logger } from '@aws-lambda-powertools/logger';

For a bundler, I chose esbuild over webpack because the first rule of doing something fast is to never use something slow. Benchmarking build times, esbuild took less than a second and webpack took 29 seconds. I am using the serverless framework to build and deploy my lambda, so I added the serverless-esbuild plugin.

Here's how I added it:

npm install -D serverless-esbuild esbuild

Then I added the following to my serverless.yml:

plugins:
  - serverless-esbuild

custom:
  esbuild:
    bundle: true
    minify: true

What's crazy about esbuild is that without bundling the build took 4 seconds and with bundling it took 1 second. esbuild made it faster!

Before:

> time sls package
Packaging speedrun-federation for stage dev (us-west-2) Service packaged (4s)
sls package  5.39s user 1.11s system 116% cpu 5.561 total

After:

> time sls package
Packaging speedrun-federation for stage dev (us-west-2) Service packaged (0s)
sls package  1.65s user 0.34s system 149% cpu 1.333 total

Note

If you are using the CDK to deploy, it uses esbuild under the hood, so just pay attention to the flags I'm using. You can specify them using BundlingOptions in the NodejsFunction constructor, but as @dreamorosi points out, it chokes on plugins. If you need plugins, which you only need if you use sourcemaps, you may need to use this instead.

The results were that my zip file went from 3.4 MB to 425K and my coldstart went from 1567 ms to 591 ms. My secrets loading stayed steady at 375 ms. That's an 87.5% reduction in size and a 62.3% reduction in coldstart time. Not bad for a few minutes of work.
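
The percentages come from a simple before/after calculation. A minimal sketch, assuming the 3.4 MB zip is treated as 3400K:

```javascript
// Percentage by which a value shrank going from `before` to `after`.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

const sizeReduction = percentReduction(3400, 425); // zip size in K (3.4 MB assumed to be 3400K)
const coldstartReduction = percentReduction(1567, 591); // coldstart in ms

console.log(`size: ${sizeReduction.toFixed(1)}%`);
console.log(`coldstart: ${coldstartReduction.toFixed(1)}%`);
```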

Fixing useless stacktraces

But minification has its downsides: I couldn't figure out anything useful from the minified stacktraces. Sourcemaps fix this, so I added the following to my serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true

I also set the corresponding node environment variable so Node 18 would use them:

environment:
  NODE_OPTIONS: '--enable-source-maps'

This ballooned the zip file from 425K to 1.5 MB, raised my coldstart to 650 ms and the loading of my secrets to 435 ms. I was better off than baseline, but I wanted to see if I could do better.

Analyzing the bundle

esbuild lets you analyze the bundle it creates by specifying the metafile flag. There is a nice analyzer on the website to visualize it. The analyzer is incredibly useful in determining what is contributing to bundle size and the reason it was included. I added the following config to my serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    metafile: true
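
The online analyzer is the easiest way to explore the metafile, but the file is plain JSON, so a few lines of script can also total up what each package contributes. This is a minimal sketch that relies on the `outputs[...].inputs[...].bytesInOutput` shape of esbuild's metafile. The sample object is made up, and where your build writes the real metafile depends on your setup.

```javascript
// Derive an npm package name from an esbuild input path, e.g.
// "node_modules/@aws-sdk/client-iam/dist-cjs/index.js" -> "@aws-sdk/client-iam".
function packageName(inputPath) {
  const marker = 'node_modules/';
  const idx = inputPath.lastIndexOf(marker);
  if (idx === -1) return '(own code)';
  const parts = inputPath.slice(idx + marker.length).split('/');
  return parts[0].startsWith('@') ? `${parts[0]}/${parts[1]}` : parts[0];
}

// Sum bytesInOutput per package across all outputs, largest first.
function bytesByPackage(metafile) {
  const totals = new Map();
  for (const output of Object.values(metafile.outputs)) {
    for (const [inputPath, info] of Object.entries(output.inputs)) {
      const name = packageName(inputPath);
      totals.set(name, (totals.get(name) ?? 0) + info.bytesInOutput);
    }
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up metafile in esbuild's shape. In practice you would load the real one,
// e.g. JSON.parse(fs.readFileSync(pathToMetafile, 'utf8')).
const sampleMetafile = {
  inputs: {},
  outputs: {
    'dist/index.js': {
      bytes: 1500,
      inputs: {
        'src/index.js': { bytesInOutput: 200 },
        'node_modules/@aws-sdk/client-iam/dist-cjs/index.js': { bytesInOutput: 900 },
        'node_modules/@aws-sdk/client-iam/dist-cjs/models.js': { bytesInOutput: 100 },
        'node_modules/axios/lib/axios.js': { bytesInOutput: 300 },
      },
    },
  },
};

console.log(bytesByPackage(sampleMetafile));
```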

@aws-sdk packages were the problem

Looking at the output, it was clear the @aws-sdk packages were the biggest offenders and tree shaking wasn't working properly. I found this GitHub Issue that explained how to fix it. I added the following to my serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    metafile: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'

This enabled treeshaking. The @aws-sdk/client-iam package went from 466K to 13K.
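
Why does the order of mainFields matter? When a package ships both a CommonJS and an ESM build, the bundler takes the first listed field that the package's package.json defines, and tree shaking works far better on the ESM build. With esbuild's node platform defaults, main is tried before module. The sketch below is a simplified model of that lookup: real resolution also considers things like the exports field, and the package.json here is hypothetical.

```javascript
// Simplified model of mainFields resolution: the first field present wins.
function pickEntry(pkgJson, mainFields) {
  for (const field of mainFields) {
    if (typeof pkgJson[field] === 'string') {
      return { field, file: pkgJson[field] };
    }
  }
  return { field: null, file: 'index.js' };
}

// Hypothetical package.json that ships both a CommonJS and an ESM build.
const examplePkg = {
  name: 'example-sdk-client',
  main: './dist-cjs/index.js',
  module: './dist-es/index.js',
};

console.log(pickEntry(examplePkg, ['main', 'module']).file); // ./dist-cjs/index.js
console.log(pickEntry(examplePkg, ['module', 'main']).file); // ./dist-es/index.js
```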

@aws-sdk properly tree-shaken

But I noticed a few other things in there: @aws-sdk/client-s3 remained, and neither it nor the aws-xray-sdk-core package was getting tree-shaken. @aws-sdk/client-s3 was a dependency of the lambda-api package and wasn't actually used in my code, so I removed it. The aws-xray-sdk-core package doesn't have the necessary ESM build for tree shaking. I could have pulled it in using the lambda layer for AWS PowerTools, but I left it for now. This was how I excluded the @aws-sdk/client-s3 package from bundling. You can do something similar if you want to omit the other @aws-sdk packages and use the versions that exist in the lambda runtime:

Danger

Removing @aws-sdk packages like this is risky, because they must be lazy loaded and unused. If either of those isn't true, the on-disk version will be loaded, and that is super slow. To confirm that you are not using anything from disk, test your code using node 16, which only has the aws-sdk v2 baked into it. I had to submit a pull request to lambda-api to meet these requirements.

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    metafile: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    exclude:
      - '@aws-sdk/client-s3'
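
"Lazy loaded" here means the excluded package is only required inside the code path that needs it, never at the top of a module. A minimal sketch of that pattern follows. The `lazy` helper and the stand-in loader are hypothetical, and this is not the actual change that went into lambda-api.

```javascript
// Wrap a loader so it runs at most once, and only when first called.
function lazy(load) {
  let cached;
  return () => (cached ??= load());
}

// Stand-in loader for illustration. In real code the loader would be something
// like () => require('@aws-sdk/client-s3'), which never runs if the S3 code
// path is never hit.
let loadCount = 0;
const getHeavyModule = lazy(() => {
  loadCount += 1;
  return { name: 'heavy-module' };
});

console.log(loadCount); // 0, nothing is loaded at module initialization
getHeavyModule();
getHeavyModule();
console.log(loadCount); // 1, loaded once on first use
```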

@aws-sdk tree-shaken without s3 client

I removed the metafile: true from my serverless.yml file and rebuilt.

Cool, the bundle went from 2.0 MB to 745K uncompressed. Compressed, the bundle was now 955K with sourcemaps. Coldstart was 649 ms and loading secrets was 435 ms.

Removing sources content and json from sourcemaps

Adding the sourcemap was a major contributor to the bloat of the bundle. Using the source map visualizer I noticed that the sourcemaps were including the entire source. I really just needed method names and line numbers in my stacktraces. I added the following to my serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    exclude:
      - '@aws-sdk/client-s3'
    keepNames: true
    sourcesContent: false
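
A source map is also plain JSON, so you can check what it carries without a visualizer. The sketch below counts the sources listed in a map, how many of them come from node_modules or are json files, and how many bytes of embedded source sit in sourcesContent. The sample map is made up for illustration.

```javascript
// Summarize what a (version 3) source map object contains.
function summarizeSourceMap(map) {
  const sources = map.sources ?? [];
  const contents = map.sourcesContent ?? [];
  return {
    sources: sources.length,
    vendorSources: sources.filter((s) => s.includes('node_modules/')).length,
    jsonSources: sources.filter((s) => s.endsWith('.json')).length,
    // Bytes of original source text embedded in the map (0 when omitted).
    embeddedBytes: contents.reduce((n, c) => n + (c ? Buffer.byteLength(c) : 0), 0),
  };
}

// Made-up example map. A real one would come from JSON.parse of the .map file.
const sampleMap = {
  version: 3,
  sources: [
    '../src/index.js',
    '../node_modules/axios/lib/axios.js',
    '../node_modules/mime-db/db.json',
  ],
  sourcesContent: ['export const a = 1;', 'module.exports = {};', '{}'],
  mappings: '',
};

console.log(summarizeSourceMap(sampleMap));
```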

That brought the compressed bundle down from 955K to 935K. Digging in further, I found the sourcemap contained the details for everything in node_modules. I mainly care about my own code, so I started looking into omitting node_modules from the sourcemap. I found this GitHub Issue and created a plugin as suggested. When I ran it, I discovered the sourcemap was also including json files, so I omitted those and, for good measure, reduced the size of the mime-db json to only the mime type I needed. I also used the copy loader for the files needed by the aws-xray-sdk-core package. Using the copy loader kept them out of the sourcemap and avoided the json errors that come from comments not being allowed in json. This is what the final plugin looked like:

/plugins/excludeVendorFromSourceMap.js
const fs = require('fs');

const excludeVendorFromSourceMap = {
  name: 'excludeVendorFromSourceMap',
  setup(build) {
    // Inline source map that decodes to {"version":3,"sources":[""],"mappings":"A"},
    // an essentially empty map. Appending it to a file keeps that file's
    // details out of the final source map.
    const sourceMapExcludeSuffix =
      '\n//# sourceMappingURL=data:application/json;base64,eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbIiJdLCJtYXBwaW5ncyI6IkEifQ==';

    // ignore source map for vendor .js and .mjs files
    build.onLoad({ filter: /node_modules\/.*\.m?js$/ }, (args) => {
      return {
        contents: fs.readFileSync(args.path, 'utf8') + sourceMapExcludeSuffix,
        loader: 'default',
      };
    });

    // for vendor .json files
    build.onLoad({ filter: /node_modules.*\.json$/ }, (args) => {
      // if it is the mime-db, replace the content with just the mime type we need
      const json = args.path.includes('/mime-db/db.json')
        ? `{"application/json":{"source":"iana","charset":"UTF-8","compressible":true,"extensions":["json","map"]}}`
        : fs.readFileSync(args.path, 'utf8');
      // wrap the json in a js module so the source map suffix can be appended
      const js = `export default ${json}${sourceMapExcludeSuffix}`;
      // if it is in the /resources folder, use copy instead of trying to ignore it,
      // it is required by aws-xray-sdk-core
      return args.path.includes('/resources')
        ? { contents: json, loader: 'copy' }
        : { contents: js, loader: 'js' };
    });
  },
};

// exported as an array of plugins for serverless-esbuild to load
module.exports = [excludeVendorFromSourceMap];
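
The base64 string in the plugin looks opaque, but it is just a tiny inline source map. Decoding it shows a map with a single empty source and a minimal mapping, which is presumably why nothing from the vendor file ends up in the final sourcemap:

```javascript
// The base64 payload from sourceMapExcludeSuffix in the plugin above.
const payload = 'eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbIiJdLCJtYXBwaW5ncyI6IkEifQ==';

// Decode it back into the source map JSON it represents.
const inlineMap = JSON.parse(Buffer.from(payload, 'base64').toString('utf8'));

console.log(inlineMap); // { version: 3, sources: [ '' ], mappings: 'A' }
```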

And the corresponding serverless.yml to use it:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    exclude:
      - '@aws-sdk/client-s3'
    keepNames: true
    sourcesContent: false
    plugins: plugins/excludeVendorFromSourceMap.js

This reduced the bundle from 935K to 223K. The coldstart was 650 ms and secrets loading was 345 ms, or 995 ms total.
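
If you are not using serverless-esbuild, the settings at this point map fairly directly onto esbuild's own build options. The sketch below is an approximation under a few assumptions: the entry point, output directory, platform and target are hypothetical values that the plugin normally fills in for you, and serverless-esbuild's exclude is expressed as esbuild's external.

```javascript
// Approximate esbuild build options matching the serverless.yml above.
function makeBuildOptions(plugins = []) {
  return {
    entryPoints: ['src/handler.js'], // hypothetical entry point
    outdir: 'dist', // hypothetical output directory
    platform: 'node', // assumption: targeting the Lambda node.js runtime
    target: 'node18', // assumption: matches the node.js 18 runtime used here
    bundle: true,
    minify: true,
    sourcemap: true,
    treeShaking: true,
    mainFields: ['module', 'main'],
    external: ['@aws-sdk/client-s3'], // serverless-esbuild calls this "exclude"
    keepNames: true,
    sourcesContent: false,
    plugins,
  };
}

// Usage (requires esbuild to be installed):
//   const esbuild = require('esbuild');
//   const plugins = require('./plugins/excludeVendorFromSourceMap.js');
//   esbuild.build(makeBuildOptions(plugins));
console.log(makeBuildOptions());
```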

Using the bigger CPU of initialization

I learned recently that during initialization, Lambda allocates more CPU and memory to the function. So one final thing I wanted to do was to move the secrets loading to initialization and see if more CPU would help the SSL connection setup go faster. To do this, I needed to use TLA (top level await). This required switching to ESM, adding a banner to support require and changing the output file extension to .mjs. It also required patching the serverless framework to support .mjs. This was my final serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    exclude:
      - '@aws-sdk/client-s3'
    keepNames: true
    sourcesContent: false
    plugins: plugins/excludeVendorFromSourceMap.js
    format: esm
    outputFileExtension: .mjs
    banner:
      js: import { createRequire } from 'module';const require = (await import('node:module')).createRequire(import.meta.url);const __filename = (await import('node:url')).fileURLToPath(import.meta.url);const __dirname = (await import('node:path')).dirname(__filename);
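
To make the idea concrete, here is a minimal sketch of kicking off the secrets fetch during initialization instead of inside the handler. It is written as CommonJS with a stubbed `loadSecrets` so it runs on its own; both the stub and the handler body are hypothetical. In the ESM (.mjs) bundle described above, the same thing can be written as a top level `const secrets = await loadSecrets();`, which has the extra property that initialization does not finish until the secrets have arrived.

```javascript
// Hypothetical stand-in for a Secrets Manager call. Real code would fetch the
// secret with @aws-sdk/client-secrets-manager instead of returning a constant.
async function loadSecrets() {
  return { apiKey: 'example-value' };
}

// Runs once per execution environment, while the module is being initialized.
const secretsPromise = loadSecrets();

async function handler(event) {
  // Already resolved on warm invocations, so this await costs next to nothing.
  const secrets = await secretsPromise;
  return {
    statusCode: 200,
    body: JSON.stringify({ hasApiKey: Boolean(secrets.apiKey) }),
  };
}

module.exports = { handler };
```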

In the end, my bundle went from 3.4M to 223K (a roughly 15x reduction) and my cold starts went from 1974 ms to 892 ms (a 2.2x reduction).

coldstart improvements

The biggest speedup was adding bundling. Adding sourcemaps actually caused a regression that was only countered by the final optimization.

bundling improvements

The biggest reduction in size was adding bundling. Adding sourcemaps caused a regression but that was countered by the other optimizations to enable tree shaking and removal of unused sourcemap content.

What's next

I think I'm going to replace the axios library with something like wretch. What I'm doing to reduce the size of the mime types json is fairly sketchy and jumping to wretch obviates this. I'll also keep watching the aws-xray-sdk-core package to see if they add an ESM build and I might experiment with what using the lambda layer for AWS PowerTools does to my coldstarts. I suspect it will be a wash, but I'm curious. Finally, I may try priming the aws clients in the initialization code to see what that does.

The gold is in the comments

Update 9/27/2023: After tweeting this, there was some buzz in the Twitter comments and the rabbit hole got deeper.

Using the lambda layer for AWS PowerTools

I hadn't yet tested the Lambda layer for PowerTools. Obviously it would shrink the bundle size, but I wanted to know if it impacted coldstarts. I added the following to my serverless.yml:

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    keepNames: true
    sourcesContent: false
    plugins: plugins/excludeVendorFromSourceMap.js
    format: esm
    outputFileExtension: .mjs
    banner:
      js: import { createRequire } from 'module';const require = (await import('node:module')).createRequire(import.meta.url);const __filename = (await import('node:url')).fileURLToPath(import.meta.url);const __dirname = (await import('node:path')).dirname(__filename);
    exclude:
      - '@aws-sdk/client-s3'
      - '@aws-lambda-powertools'
      - 'aws-xray-sdk-core'

functions:
  speedrun-federation:
    layers:
      - !Sub 'arn:aws:lambda:${AWS::Region}:094274105915:layer:AWSLambdaPowertoolsTypeScript:18'

This reduced the bundle size by 47K, from 223K to 176K, but increased the coldstart by 116 ms, from 892 to 1008 ms. To make sure calls to secrets manager weren't impacting the result, I ran some more tests and subtracted the secrets manager time.

Init without secrets (ms)   Lambda layer     Delta
728                         none (esbuild)   +0 ms
922                         v18              +194 ms
882                         v20              +154 ms
894                         v21              +166 ms

It was +171 ms on average, so I ripped layers out and followed up with the PowerTools team.
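
The average is just the mean of the three layer deltas against the esbuild-only baseline, using the numbers from the measurements above:

```javascript
// Init time without secrets loading, in ms, from the measurements above.
const baselineMs = 728; // no layer, everything bundled with esbuild
const withLayerMs = { v18: 922, v20: 882, v21: 894 };

// Delta of each layer version against the baseline, then their mean.
const deltas = Object.values(withLayerMs).map((ms) => ms - baselineMs);
const meanDelta = deltas.reduce((sum, d) => sum + d, 0) / deltas.length;

console.log(deltas); // [ 194, 154, 166 ]
console.log(Math.round(meanDelta)); // 171
```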

Switching to redaxios

@munawwarfiroz suggested that I use redaxios instead of axios.

I was already planning on switching to wretch, but this was a lightweight drop-in replacement for axios.

npm i redaxios
npm uninstall axios

And a three-character change in my code:

-import axios from 'axios';
+import axios from 'redaxios';

This reduced the bundle size by 25K, from 223K to 198K. Correcting for secrets manager, the coldstart changed by somewhere between -7 ms and +25 ms. I chose to stick with it, since that difference is lost in the noise.

Revisiting source maps

@thdxr mentioned that sourcemaps added a performance penalty.

I'd actually seen his comment here when I was first exploring sourcemaps. It appears that a fix landed in node.js 18.8 in this issue #41541. As of this writing, Lambda is using 18.12. I tried to confirm it in this benchmark, but they never tested with node.js 18.8.

Regardless of whether it's fixed or not, it can't have a penalty if it's not there, right? What if we turn off minify and sourcemaps? That will make the bundle bigger, but I'll have stacktraces for everything, including the packages I depend on. With treeshaking, my final bundled code fits within the limits of the Lambda console, so I can easily go to the line number. Let's do this! Here's the final serverless.yml:

custom:
  esbuild:
    bundle: true
    # minify: true
    # sourcemap: true
    treeShaking: true
    mainFields:
      - 'module'
      - 'main'
    keepNames: true
    # sourcesContent: false
    plugins: plugins/excludeVendorFromSourceMap.js
    format: esm
    outputFileExtension: .mjs
    banner:
      js: import { createRequire } from 'module';const require = (await import('node:module')).createRequire(import.meta.url);const __filename = (await import('node:url')).fileURLToPath(import.meta.url);const __dirname = (await import('node:path')).dirname(__filename);
    exclude:
      - '@aws-sdk/client-s3'

I also removed the NODE_OPTIONS: '--enable-source-maps' environment variable.

Note

I attempted to remove the plugin too and use:

loader: 
   '.json':'copy'

to replace the only remaining thing it was doing, but I got this error:

✖ TypeError [ERR_IMPORT_ASSERTION_TYPE_MISSING]: Module "file:///Users/david/Documents/code/speedrun-api/.esbuild/.build/partitions-SKMMJENC.json" needs an import assertion of type "json"

If you can't use plugins, just remove it completely and don't add the loader line.

This one felt fast to me when I tested it, and it turns out it was. The bundle grew 87K from 198K to 285K, but the coldstart dropped 85 ms from 898 to 813 ms. The exhilaration was fleeting: I ran a few more coldstarts and saw 904 ms, 873 ms and 863 ms. Still, that is about 30 ms faster than my previous best on average. Note that these tests were run without correcting for secrets loading latency, which varies by +/-20 ms. The improvement is just outside that margin and I get better stack traces, so I kept it.

For a final test, I tried throwing exceptions with sourcemaps on and off. There still appears to be a penalty for sourcemaps even with the fix that landed in node.js 18.8: I was seeing an additional 45 ms or so on warmed instances with sourcemaps enabled. That settles it, I'll leave sourcemaps off!
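
If you want to try a similar comparison locally, a rough sketch is to time how long it takes to throw errors and read their stack, then run the same bundled file (one that has a source map) once with plain node and once with node --enable-source-maps. This is an assumed local micro-benchmark, not the test that was run on Lambda, and the iteration count is arbitrary.

```javascript
// Time how long it takes to throw `iterations` errors and read each stack.
// Reading err.stack is what forces the stack trace to be formatted.
function timeStackAccess(iterations) {
  const start = performance.now();
  let chars = 0;
  for (let i = 0; i < iterations; i++) {
    try {
      throw new Error(`boom ${i}`);
    } catch (err) {
      chars += err.stack.length;
    }
  }
  return { iterations, totalMs: performance.now() - start, chars };
}

const result = timeStackAccess(1000);
console.log(`${result.iterations} stack traces in ${result.totalMs.toFixed(1)} ms`);
```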

That's weird, why is SSO bundled in there?

Poking around the bundle some more, I noticed that the @aws-sdk/client-sso package was in there for loading credentials. If you know anything about credentials in Lambda, you know they are injected via environment variables. There is no end user interaction, so it can't possibly get credentials using SSO. Since SSO adds 39 ms (when minified) to the coldstart if you use any AWS v3 client, it's surprising there is no way to omit it. I opened this issue with some code on how to patch the default credentials provider if you don't need it. It's unfortunate the SDK team didn't pursue this.

Further reading

I found Robert Slootjes' article about optimization after I had independently done something very similar. He cuts right to the final configuration, but doesn't go as deep as me.

Yan Cui wrote All you need to know about lambda cold starts. He concludes bundling and treeshaking are the most important optimizations.

AJ Stuyvenberg has a few coldstart benchmarks comparing the AWS V2 and V3 javascript SDKs and the difference bundling makes.

This AWS blog article talks about Using ESM and TLA (top level await) with Lambda which is how I used the higher compute power of initialization to speed up my secrets loading.

Finally, cost optimization is the other dimension for optimization in Lambda. This Serverlessland article about Cost Optimization describes a few approaches to reduce your costs.