Faster Kotlin APIs on AWS Lambda

Forget about 10-second cold starts

Andrew O'Hara
Better Programming



It’s well known that Kotlin/JVM functions on AWS Lambda have exceptionally bad cold starts. The JVM itself is often cited as the culprit, but while it does contribute an unavoidable penalty, the rest can be managed through careful design of your own application.

Why not SnapStart?

Let’s start with the elephant in the room: AWS Lambda SnapStart exists and promises to eliminate cold starts. However, there are some caveats:

  • only supported in select regions
  • no ARM, X-Ray, or EFS (X-Ray has since gained SnapStart support; see the update at the end)
  • limited ephemeral storage
  • requires published function versions
  • instances cannot be assumed to be unique (restored snapshots can share state like random seeds)

Personally, I like to run in ca-central-1, use X-Ray, and barely understand function versions, so SnapStart is a bit of a non-starter for me. But if SnapStart meets your needs, then please do use it. You may find some useful tidbits in this guide, nonetheless.

Keys to a Faster Cold Start

In my experience, the key to building a fast Lambda function is to keep it lean; try to constrain the scope of your function to a single “micro” or “nano” service. Let’s break this idea down into components.

Choose your dependencies carefully

Eliminate any “heavy” libraries like Apache HTTP, log4j, Hibernate, Jackson, and the official AWS SDK. Beware of transitive dependencies that sneak them in. You should avoid frameworks; they’ll do more harm than good in a Lambda.

Just because a library performs well in benchmarks, is popular, or is “ubiquitous” doesn’t mean it’s a good fit for Lambda. We need lightweight libraries that start up immediately, even if that means they perform worse in the long run.

Reduce Jar size

A larger jar just takes longer to load. Aim for a 10 MB binary or less. The best way to manage this is to choose lighter or fewer dependencies, but you can also leverage tools like Shadow to “minimize” the final binary.
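As an illustration, Shadow’s minimize() can strip unreachable classes when building the jar. This is a sketch using the Gradle Kotlin DSL; the plugin versions are assumptions, not from the original:

```kotlin
// build.gradle.kts — a minimal sketch; versions are illustrative
plugins {
    kotlin("jvm") version "1.9.22"
    id("com.github.johnrengelman.shadow") version "8.1.1"
}

tasks.shadowJar {
    // Remove classes not reachable from your own code.
    // Libraries loaded reflectively (e.g. logging backends) may
    // need exclusions:
    // minimize { exclude(dependency("org.slf4j:slf4j-simple:.*")) }
    minimize()
}
```

Minimization can strip classes that are only loaded reflectively, so verify the function still runs after shrinking.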

Minimize initialization

This may seem like a no-brainer, but it’s more of a factor than you might think. Not only will init code directly count against your cold start, but it’s also running while your function is at its slowest. Try to eliminate or defer outgoing requests and computation.

If you use a relational database, consider using the Amazon RDS Proxy to pool your connections.

If you need to load secrets, consider deferring the fetch until the first request instead of doing it during init.
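One way to defer that work is Kotlin’s built-in lazy delegate, which postpones the fetch until the value is first touched. This is a sketch; fetchSecret is a hypothetical stand-in for a real Secrets Manager or SSM Parameter Store call:

```kotlin
// Hypothetical stand-in for a real network call to a secret store
fun fetchSecret(name: String): String {
    println("fetching $name") // would be an outgoing request in reality
    return "s3cr3t"
}

// `by lazy` runs the block on first access, not during init,
// so the fetch no longer counts against the cold start.
val dbPassword: String by lazy { fetchSecret("db-password") }

fun main() {
    println("init done") // no fetch has happened yet
    println(dbPassword)  // first access triggers the fetch
}
```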

Avoid reflection

This might be a hard pill to swallow, but Kotlin reflection should be avoided in Lambda. Not only is it slow, but it also requires an additional heavyweight dependency (kotlin-reflect). There are alternatives, though, most notably compile-time code generation.
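As a trivial sketch of the idea (this example is mine, not from the original): a dynamic property lookup via kotlin-reflect can usually be replaced with explicit, compile-time-checked code, and for serialization the code-generation libraries covered later do the same thing at scale:

```kotlin
data class Person(val id: Int, val name: String)

// With kotlin-reflect (avoid on Lambda):
//   Person::class.memberProperties.first { it.name == key }.get(person)
// Without reflection: explicit and checked at compile time.
fun Person.attribute(key: String): Any? = when (key) {
    "id" -> id
    "name" -> name
    else -> null
}

fun main() {
    println(Person(1, "Jimmy").attribute("name")) // Jimmy
}
```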

Optimize JIT compilation

Believe it or not, the JVM actually performs additional compilation of your bytecode at runtime. This leads to a faster application in the long run, but if our goal is to optimize cold start, it can be more hindrance than benefit.

Tweaking the JVM can be intimidating, but to start seeing substantial improvements, all you need to do is disable the C2 compiler with this environment variable in the Lambda config:

JAVA_TOOL_OPTIONS: -XX:+TieredCompilation -XX:TieredStopAtLevel=1

Use a large container

A Lambda function’s CPU scales with the memory you allocate it. Choosing a larger amount of memory will drastically affect your cold start and runtime performance. Experiment to find the best balance for your function; you will face greatly diminishing returns at a certain point. I’ve had the best results with 2048 MB containers.

Choosing the Right Dependencies

This is easier said than done, so let’s break it down a bit more.

HTTP server

While you can integrate directly with Lambda’s HTTP Proxy Interface, you would lose all the routing, error handling, and other utilities that a good HTTP server would normally offer. But with the right library, you can have your cake and eat it, too; it just needs to be lightweight and have an adapter for Lambda’s interface.

Despite Spring supporting Lambda, it’s far too heavy to start in a reasonable amount of time.

Javalin is a nice lightweight server that can run on Lambda with awslabs/aws-serverless-java-container. However, Javalin bundles the Jetty server, adding unnecessary baggage and slowing the cold start. It may be possible to exclude Jetty, but it’s hard to say how well that would work, so YMMV. If you try, please let me know how it went.

Http4k has a minimal core module and can run on Lambda with its official serverless-lambda plugin. The small footprint and functional programming model make it an ideal fit for AWS Lambda, and it is my recommendation.

import org.http4k.core.HttpHandler
import org.http4k.core.Response
import org.http4k.core.Status
import org.http4k.server.SunHttp
import org.http4k.server.asServer
import org.http4k.serverless.ApiGatewayV2LambdaFunction
import org.http4k.serverless.AppLoader

val httpServer: HttpHandler = {
    Response(Status.OK).body("Hello World")
}

// Run locally during development
fun main() {
    httpServer
        .asServer(SunHttp(8000))
        .start()
        .block()
}

// Run on Lambda
class ApiLambdaHandler : ApiGatewayV2LambdaFunction(AppLoader {
    httpServer
})

HTTP client

On Lambda, the best part is no part, and Java 11 comes with a highly functional HttpClient. Unfortunately, it does suffer from a cold start delay, so it’s not an ideal choice.

Http4k’s core module provides an alternative: the Java8HttpClient, which is a wrapper around Java’s built-in HttpURLConnection. There are better choices for high throughput, but it’s a perfect fit for cold starts. Even if you’re not using http4k for the server, this is my recommendation.

import org.http4k.client.Java8HttpClient
import org.http4k.core.Method
import org.http4k.core.Request

val client = Java8HttpClient()
val request = Request(Method.GET, "https://httpbin.org/json")

val response = client(request)
println(response.status)
// 200 OK

JSON serialization

Jackson is an excellent serializer, but it’s large, has a cold start delay, and uses reflection, so it isn’t a great fit for an API on Lambda. Moshi is lighter but still uses reflection. This still leaves us with several options:

  • Manually process JSON with Argo
  • Use kotlinx-serialization, which uses code generation instead of reflection
  • Use Moshi with Kotshi, which also uses code generation instead of reflection (moshi-kotlin-codegen ironically still requires kotlin-reflect)

I don’t think you can go wrong with kotlinx-serialization or Kotshi, but due to kotlinx’s aversion to JVM platform types, I recommend Kotshi.

import com.squareup.moshi.JsonAdapter
import com.squareup.moshi.Moshi
import se.ansman.kotshi.JsonSerializable
import se.ansman.kotshi.KotshiJsonAdapterFactory

@JsonSerializable
data class Person(val id: Int, val name: String)

// Kotshi generates KotshiMyJsonAdapterFactory at compile time
@KotshiJsonAdapterFactory
object MyJsonAdapterFactory : JsonAdapter.Factory by KotshiMyJsonAdapterFactory

val moshi = Moshi.Builder()
    .add(MyJsonAdapterFactory)
    .build()

fun main() {
    val person = Person(1, "Jimmy")
    val json = moshi.adapter(Person::class.java).toJson(person)
    println(json)
    // {"id":1,"name":"Jimmy"}
}

Database

Amazon DynamoDB’s serverless model has perfect synergy with AWS Lambda, so it will always have my recommendation.

There are official AWS SDKs for DynamoDB. The Java SDK can support mapping Kotlin data classes with a plugin like the one I made. There’s also a Kotlin SDK, but it’s in preview and doesn’t support object mapping. Neither choice is a good fit for Lambda, though: they’re heavy libraries and use reflection.

Instead, I recommend the DynamoDB module for http4k-connect. It synergizes well with an http4k server and requires no reflection. There’s an optional table mapper model, which can easily be backed by a non-reflective JSON backend like Kotshi.

val dynamoDb = DynamoDb.Http()
val tableName = TableName.of("people")

fun `document model`() {
    dynamoDb.putItem(tableName, mapOf(
        AttributeName.of("id") to AttributeValue.Num(1),
        AttributeName.of("name") to AttributeValue.Str("Jimmy")
    ))
}

fun `dao model with Kotshi`() {
    @JsonSerializable
    data class DynamoPerson(val id: Int, val name: String)

    val kotshiMarshaller = ConfigurableMoshi(
        Moshi.Builder()
            .add(MyJsonAdapterFactory) // <-- Kotshi
            .add(ListAdapter)
            .add(MapAdapter)
            .asConfigurable()
            .withStandardMappings()
            .done()
    )

    val peopleDao = dynamoDb.tableMapper<DynamoPerson, Int, Unit>(
        tableName = tableName,
        hashKeyAttribute = Attribute.int().required("id"),
        autoMarshalling = kotshiMarshaller
    )

    peopleDao += DynamoPerson(1, "Jimmy")
}

Logging

On Lambda, you don’t need to worry about writing to log files or emitting logs to an aggregator. Everything that goes to stdout is sent directly to CloudWatch Logs. You can even get away with println, but slf4j is still useful for configuring log levels and appending metadata. You’ll need to attach a logging backend, but slf4j-simple gets the job done with minimal bloat.

import org.http4k.core.Filter
import org.http4k.core.HttpHandler
import org.http4k.core.Method
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status
import org.http4k.core.then
import org.slf4j.LoggerFactory

val logger = LoggerFactory.getLogger("root")

val server: HttpHandler = {
    Response(Status.OK)
}

val loggingFilter = Filter { next ->
    { request ->
        val response = next(request)
        logger.info("${request.method} ${request.uri}: ${response.status}")
        response
    }
}

val loggedServer = loggingFilter.then(server)

fun main() {
    loggedServer(Request(Method.GET, "foo"))
    // [main] INFO root - GET foo: 200 OK
}

Swagger UI

A Swagger UI is invaluable for testing and consuming a REST API. You could write the OpenAPI spec yourself, but manual documentation can easily get out of date; I prefer to have the server generate it.

The options you have for a spec generator typically depend on your server, so your mileage may vary.

If you’re using http4k, there’s an excellent contract module that will generate the spec and serve a thin Swagger UI. The primary generator requires Jackson, but you can easily substitute it with the non-reflective generator at the expense of a slightly simpler spec.

val api = contract {
    descriptionPath = "spec"

    // Define an operation to be documented
    routes += "/hello" bindContract Method.GET to { _: Request ->
        Response(Status.OK)
    }

    // Define an OpenApi spec renderer
    renderer = OpenApi3(
        apiInfo = ApiInfo("My API", "1.0"),
        json = Moshi,
        apiRenderer = OpenApi3ApiRenderer(Moshi) // non-reflective renderer
    )
}

val ui = swaggerUiLite {
    url = "spec"
}

fun main() {
    routes(api, ui)
        .asServer(SunHttp(8000))
        .start()
        .block()
    // Swagger UI available at http://localhost:8000
}

Case Study: A Minimal Bookshelf API

I’ve designed a minimal Bookshelf API that incorporates all of these lessons.

The Bookshelf Swagger UI

Despite its minimal footprint, it still has most of the elements a fully production-ready application needs: a Swagger UI, JSON serialization, database persistence, and logging.

Final Jar size of 5.4 MB

Thanks to the careful design choices we explored in the guide, this fully functional API can achieve a cold start time as low as 500 ms on a 2048 MB container. This doesn’t instantly elevate the JVM to a top-performing runtime, but it does make it a perfectly valid option. Kotlin enthusiasts can rejoice!

The source code is available on GitHub.

It’s Your Turn Now

With these tips in mind, I’d love to see where you can take your own APIs. Please let me know if you can reproduce similar results — or even better!

Updated 2024-02-04: X-Ray is now supported by SnapStart
