Using CloudFront functions as a REST API
While writing an AWS Step Function, I needed to sort an array of objects by an object path -- something that was beyond the ability of the built-in intrinsic functions. It was easily done with a Lambda, but that struck me as overkill. Could I use something lightweight like CloudFront functions? Yes! In this post, I'll show how I use CloudFront as a performant, inexpensive REST API that runs on the edge.
The limitations
Before you get too excited, I'd be remiss not to mention the many limitations of CloudFront Functions:
- Function must be written using a limited subset of JavaScript
- Function must be <= 10KB in size
- Function must execute in a few milliseconds (the exact number isn't mentioned)
- Function has no network access
- Inputs must be passed as query parameters or headers (you cannot access the request body)
- Actual deployment takes a few minutes (thankfully, testing is quick)
Oof -- that's a lot of limitations. But it's a great solution if you just need to run a tiny bit of JavaScript to generate a ULID, create a presigned S3 URL, or sort an array, and you want to take advantage of CloudFront's global network of edge locations with zero cold start time!
Architecture
With those limitations in mind, here is the architecture for a simple REST API. Given a URL like the following:
https://dxxxxxxxx.cloudfront.net/sortByPath?arr=[{"name":"Alice","age":25},{"name":"Bob","age":30},{"name":"Charlie","age":20}]&path=age
- Use the path (sortByPath) to determine the API name
- Use URL-encoded query parameters (arr and path) as the inputs
- Write some code to process the inputs for each API
- Return the result in the response body as JSON or, if the output is a URL, as a 302 redirect.
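The example URL above is shown unencoded for readability; a client would need to percent-encode the JSON before sending it. Here is a minimal sketch of doing that, assuming a hypothetical `buildSortUrl` helper (not part of the original code) and the placeholder domain from the example:

```javascript
// Hypothetical helper (not from the original post): builds a request URL
// for the sortByPath API with URL-encoded query parameters.
function buildSortUrl(baseUrl, arr, path) {
  // JSON-encode the array, then percent-encode it so brackets, braces,
  // and quotes survive as a single query parameter value.
  const encodedArr = encodeURIComponent(JSON.stringify(arr));
  const encodedPath = encodeURIComponent(path);
  return `${baseUrl}/sortByPath?arr=${encodedArr}&path=${encodedPath}`;
}

const people = [
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30 },
  { name: 'Charlie', age: 20 },
];

// dxxxxxxxx.cloudfront.net is the placeholder domain used in the example above.
const url = buildSortUrl('https://dxxxxxxxx.cloudfront.net', people, 'age');
console.log(url);
```

Browsers often encode such URLs for you when they are pasted into the address bar, but programmatic clients generally do not.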
Visualized, it looks like this:
Although you need a dummy origin like S3 to deploy the CloudFront distribution, the origin is never actually used. The function is invoked at the edge location closest to the user and the function returns the response body without hitting the origin.
Tip
Multiple APIs can be supported by using different paths. For example, https://dxxxxxxxx.cloudfront.net/ulid. If necessary, you can have a different function per API using CloudFront CacheBehaviors so you can get the full 10KB of code per API. If need be, you could route some paths to a Lambda@Edge function or a Lambda function URL origin for more power (but that adds cold starts and cost, and negates most of this experiment).
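One way to serve several paths from a single function is a lookup table keyed by the first path segment. The following is only a sketch of that idea, not the code from the repository; the stub handlers and the 404 response are assumptions made for illustration.

```javascript
// Hypothetical route table: maps the first path segment to a handler.
// The stubs only illustrate dispatch; real handlers would do the work.
const routes = {
  sortByPath: function (request) { return { api: 'sortByPath' }; },
  ulid: function (request) { return { api: 'ulid' }; },
};

function route(uri) {
  // '/ulid'.split('/') yields ['', 'ulid'], so index 1 is the API name.
  const name = uri.split('/')[1];
  // Only dispatch on own properties so names like 'toString' are rejected.
  return Object.prototype.hasOwnProperty.call(routes, name) ? routes[name] : null;
}

function handler(event) {
  const apiHandler = route(event.request.uri);
  const found = apiHandler !== null;
  const body = found ? apiHandler(event.request) : 'Not Found';
  return {
    statusCode: found ? 200 : 404,
    statusDescription: found ? 'OK' : 'Not Found',
    headers: { 'content-type': { value: 'application/json' } },
    body: JSON.stringify(body),
  };
}
```

A table like this keeps the dispatch logic small, which matters when every API shares the same 10KB budget.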
The Code
Here's the code for the sortByPath API:
function handler(event) {
  let body = 'Bad Request';
  let statusCode = 400;
  let statusDescription = 'Bad Request';
  // Step 1: Get the API name from the path ('/sortByPath' -> ['', 'sortByPath'])
  const path = event.request.uri.split('/', 2);
  if (path[1]) {
    switch (path[1]) {
      case 'sortByPath':
        try {
          // Step 2: Extract parameters (inside the try so a missing parameter yields a 400)
          let value = event.request.querystring.arr.value;
          // Decode twice if the value arrived double-encoded ('%2522' is an encoded '%22')
          value = value.includes('%2522') ? decodeURIComponent(decodeURIComponent(value)) : decodeURIComponent(value);
          // Step 3: Invoke the function
          body = {requestId: event.context.requestId, value: sortByPath(JSON.parse(value), event.request.querystring.path.value)};
          statusCode = 200;
          statusDescription = 'OK';
        } catch (e) {
          body = e.message;
        }
        break;
      default:
        break;
    }
  }
  // Step 4: Return the response
  const response = {
    statusCode,
    statusDescription,
    headers: {
      'content-type': { value: 'application/json' }
    },
    body: JSON.stringify(body)
  };
  return response;
}
// Some helper functions

// Sort the array in place using the value found at the given path
function sortByPath(arr, path) {
  return arr.sort(compareByPath(path));
}

// Get a property of an object
function index(o, p) { return o[p]; }

// Get a property of an object by dotted path, e.g. 'a.b.c'
function get(o, p) { return p.split('.').reduce(index, o); }

// Build a comparator that compares two objects by the value at path p
function compareByPath(p) {
  return function (a, b) { return compare(get(a, p), get(b, p)); };
}

// Compare numbers numerically and strings by byte order
function compare(a, b) {
  if (typeof a === 'number') {
    return a - b;
  } else if (typeof a === 'string') {
    return Buffer.from(a).compare(Buffer.from(b));
  } else {
    throw new Error('value must be a number or string');
  }
}
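Because the path is split on dots, the helpers should also handle nested properties. The sketch below repeats condensed versions of the helpers so it can run on its own in Node.js; the sample data is made up for illustration.

```javascript
// Condensed copies of the helpers above, repeated so this snippet is standalone.
function index(o, p) { return o[p]; }
function get(o, p) { return p.split('.').reduce(index, o); }
function compare(a, b) {
  if (typeof a === 'number') return a - b;
  if (typeof a === 'string') return Buffer.from(a).compare(Buffer.from(b));
  throw new Error('value must be a number or string');
}
function sortByPath(arr, path) {
  return arr.sort(function (a, b) { return compare(get(a, path), get(b, path)); });
}

// Illustrative data (not from the original post) with a nested property.
const users = [
  { name: 'Bob', address: { zip: 98101 } },
  { name: 'Alice', address: { zip: 10001 } },
  { name: 'Charlie', address: { zip: 60601 } },
];

const byZip = sortByPath(users, 'address.zip').map(function (u) { return u.name; });
console.log(byZip); // [ 'Alice', 'Charlie', 'Bob' ]

const byName = sortByPath(users, 'name').map(function (u) { return u.name; });
console.log(byName); // [ 'Alice', 'Bob', 'Charlie' ]
```

Note that strings are compared by byte order, so uppercase letters sort before lowercase ones, and that `sort` reorders the input array in place.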
Going further
After I had successfully gotten objects sorted by path, I tried a few other things. I was able to generate a ULID (albeit using Math.random() as a PRNG instead of the more secure crypto library) and create S3 presigned URLs. For the ULID generation I heavily modified this ulid library, and for presigning I modified mhart's aws4 package.
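To give a feel for how little code a ULID needs, here is a minimal sketch written from the ULID format itself (a 48-bit millisecond timestamp in 10 Crockford Base32 characters, followed by 16 random characters). It is not the modified library code from the repository, and the function names are my own.

```javascript
// Crockford's Base32 alphabet used by ULIDs (no I, L, O, or U).
const ENCODING = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Encode a millisecond timestamp as 10 Base32 characters, most significant first.
function encodeTime(ms) {
  let out = '';
  for (let i = 0; i < 10; i++) {
    const mod = ms % 32;
    out = ENCODING.charAt(mod) + out;
    ms = (ms - mod) / 32;
  }
  return out;
}

// 16 random Base32 characters (80 bits). Math.random() is NOT
// cryptographically secure; it is used here only to mirror the trade-off
// described in the post.
function encodeRandom() {
  let out = '';
  for (let i = 0; i < 16; i++) {
    out += ENCODING.charAt(Math.floor(Math.random() * 32));
  }
  return out;
}

// Optional `now` argument (an assumption for testability) overrides the clock.
function ulid(now) {
  const ms = now === undefined ? Date.now() : now;
  return encodeTime(ms) + encodeRandom();
}

console.log(ulid());
```

Because the timestamp comes first, ULIDs generated in different milliseconds sort lexicographically by creation time.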
Trying it yourself
I've included a CDK stack that creates a CloudFront distribution with three APIs (sortByPath, ulid, and presign) in the coldstart-zero GitHub repository. Read the README for important instructions on how to deploy and interact with these APIs.
Too lazy to deploy it yourself? Here are a few links where you can see how fast it is in action:
- Sort by path: https://dqcixxp5socxd.cloudfront.net/sortByPath?arr=[{"name":"Alice","age":25},{"name":"Bob","age":30},{"name":"Charlie","age":20}]&path=age
- ULID: https://dqcixxp5socxd.cloudfront.net/ulid
- Presigned url: https://dqcixxp5socxd.cloudfront.net/presign/coldstart-zero-demo.s3.us-west-2.amazonaws.com/fast.txt
Danger
This code works, but it is meant for reference only and is missing robust error checking. Before using the presign function in production, you'll need to modify it to authorize the caller, using a JWT or some other method, before vending the presigned URL.
Network access
One of the things you can't do is access the network. However, you can write logs with console.log, which are streamed to CloudWatch. If you are OK with the network access being asynchronous, you can use a CloudWatch subscription filter to invoke a Lambda that can access the network, or emit an event to EventBridge to route it to an EventBridge destination.
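The logging half of that pattern might look like the sketch below. The `formatLogLine` helper, the `ASYNC_EVENT` marker, and the field names are all hypothetical; the subscription filter and the consuming Lambda are not shown.

```javascript
// Hypothetical helper: formats one JSON log line that a CloudWatch Logs
// subscription filter could match on. Field names are illustrative only.
function formatLogLine(type, detail) {
  return JSON.stringify({ marker: 'ASYNC_EVENT', type: type, detail: detail });
}

function handler(event) {
  // console.log is the side channel described above; nothing here waits
  // for the downstream consumer.
  console.log(formatLogLine('sortRequested', { uri: event.request.uri }));
  return {
    statusCode: 200,
    statusDescription: 'OK',
    headers: { 'content-type': { value: 'application/json' } },
    body: JSON.stringify({ logged: true }),
  };
}
```

Emitting a single JSON object per line keeps the log entry easy to match and parse on the consuming side.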
Benchmarking
On the first request I was seeing 100-130 ms due to SSL handshake setup time. Subsequent requests on the same edge location were 6-15 ms. You won't be able to get this E2E latency with Lambda.
Cost
Using this approach costs $0.10 per million requests, or $0.13 per million requests if you use the KeyValueStore like the presign function does. The first 2 million requests are free with the free tier. By comparison, with Lambda you'll pay $0.20 per million requests if the average runtime of your function is 6 ms and you use a 128 MB arm64 instance. Your first million requests are free with the free tier.
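For a rough monthly estimate, the figures above can be plugged into a few lines of code. This is a back-of-the-envelope sketch that uses only the prices quoted in this post; `estimateMonthlyCost` is a made-up name, and how the free tier interacts with KeyValueStore reads is an assumption.

```javascript
// All figures below are the ones quoted in this post, not live AWS prices.
const FREE_TIER_REQUESTS = 2000000;

// ratePerMillion: 0.10 for a plain function, 0.13 when KeyValueStore is used.
// Assumption: the free tier is subtracted from the request count before
// pricing, regardless of which rate applies.
function estimateMonthlyCost(requests, ratePerMillion) {
  const billable = Math.max(0, requests - FREE_TIER_REQUESTS);
  return (billable / 1000000) * ratePerMillion;
}

console.log(estimateMonthlyCost(10000000, 0.10).toFixed(2)); // "0.80"
console.log(estimateMonthlyCost(10000000, 0.13).toFixed(2)); // "1.04"
console.log(estimateMonthlyCost(1500000, 0.10).toFixed(2));  // "0.00"
```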
Musings
- In my experiment I disabled caching so my function would be hit with each request. If you have a cacheable API, you may be able to cache the response at the edge location and reduce the number of invocations.
- You may be able to use a bundler and minifier to fit more code into the 10KB limit, but I haven't attempted this.
Conclusion
Using a CloudFront Function is a clever way of running code close to your users at low cost and with zero cold start time. Although the many limitations rule it out for some use cases, I've outlined a few cases it worked for, and I'm confident you can come up with many more. If you have questions or issues with the sample code, reach out on Twitter or open an issue on the GitHub repository.
Further Reading
- The CloudFront Functions Developer Guide
- The ULID GitHub repository the ULID code was based on.
- mhart's aws4 library the presigned url code was based on.
- I left out how to call these APIs from a Step Function. If you're interested in how I do that, read this guide and Ian Mckay's blog post on calling HTTPS endpoints from a Step Function.