
Tapir serverless: a proof of concept

Adam Warski

The serverless model has a lot of appeal: simple to deploy, simple to scale, no hardware to manage, and no upfront financial commitments. However, we have to realize that using a serverless platform is in fact programming the [Amazon|Google|Microsoft] “virtual machine” which is induced by the myriad of services each of these clouds offers.

Nowadays, this programming is most often done using YAML. Opinions vary on whether that’s a good approach. I don’t think so: each YAML configuration is in fact its own untyped DSL, often including an incomplete, ad-hoc encoding of programming constructs known from higher-level languages, such as conditionals, function calls, variable references, etc.

Still, if that’s the low-level language we have to work with, then so be it — but nothing stops us from using a proper, high-level language on top. One possibility that we’ll explore here is to use a typed, functional language, Scala, together with the sttp tapir library for describing and deploying HTTP APIs.

A warm-up riddle

A tapir is a large, herbivorous mammal, similar in shape to a pig, with a short, prehensile nose trunk.

Tapir is also a project with which it’s possible to describe HTTP endpoints using a programmer-friendly API in the Scala language. Endpoints are represented as immutable values. Such a value captures the shape of the endpoint’s inputs, as well as the shape of the outputs: which values are read from the headers, URL, or body; which values are required, and which are optional; and how they should be parsed into higher-level types.

An endpoint description can be interpreted in a number of ways: as a server (when coupled with the logic to execute), as a client calling the described endpoint, or as documentation.

As the endpoint description contains all of the endpoint’s metadata, we should be able to leverage this to deploy an endpoint to a serverless platform. Let’s explore that idea, using Amazon’s AWS Lambda. We’ll discuss which AWS services we’ll be using and how to generate the necessary configuration. As we’re deploying a JVM-based service, we’ll also compare the plain Java and GraalVM-based versions.

We won’t go into details as to how to define a tapir endpoint, as this is thoroughly described in other blogs and the documentation. A working example should be enough. Below is an immutable value describing a simple GET /pets/{petId} endpoint, which defines that an integer id should be extracted from the path, along with a string from a query parameter, and that the output should be an instance of Animal serialized to JSON:

import sttp.tapir._
import sttp.tapir.json.circe._
import sttp.tapir.generic.auto._
import io.circe.generic.auto._

case class Animal(id: Int, name: String)

val getPet: Endpoint[(Int, String), Unit, Animal, Any] =
  endpoint
    .get
    .in("pets" / path[Int]("petId"))
    .in(query[String]("name"))
    .out(jsonBody[Animal])

The getPet value is pure metadata: there’s no logic specifying how the Int and String values extracted from the request map to the Animal instance that should be serialized as the response.

In the following sections, our goal will be to deploy an HTTP API, described by a number of Endpoint values, such as getPet above, on the AWS Serverless platform (it should be possible to implement the same using either GCloud or Azure as well).

We don’t want to repeat the information contained in the endpoint description, so that should be our single source of truth. As much AWS configuration as possible should be generated from these descriptions.

To implement the logic, we’ll couple each endpoint description with an appropriate server logic function, which in tapir is represented as a value of type ServerEndpoint; we’ll see an example later in the article. A collection of server endpoints will then form the basis of our serverless function implementation.

The full code that is used below is available in the tapir-serverless repository.

To expose the API to the outside world, we’ll use the AWS API Gateway. The gateway is where HTTP connections are terminated, and requests are parsed and turned into events. The gateway uses the concept of routes: each route is a combination of an HTTP method and a path. The path can contain path variables: in /pets/{petId}, petId is a path variable.

For each route (method + path), we need to specify an integration. The integration passes the parsed HTTP request as a JSON event to some handler and expects another JSON in return. This is then transformed to an HTTP response.

Our integrations will be AWS Lambda functions, which form the basis of Amazon’s serverless offering. They are fully managed, event-driven pieces of logic that are started and scaled on demand. They can be deployed by providing an archive with the Lambda implementation in one of the supported languages (such as Python, JavaScript, Java, and others), or by providing a Docker container.

We’ll be using the latter option, as it’s more flexible and makes it simpler to provide custom runtimes. As the Docker registry, we’ll use AWS ECR (Elastic Container Registry).

This definitely isn’t the first attempt to generate AWS configuration from a higher-level language, even when we restrict our consideration to the Scala ecosystem.

The scalambda project is one such example. It allows generating Terraform modules that create Lambda functions, API Gateway routes, as well as integrations for other event types, which are available as Lambda triggers. To define the event metadata (such as the route path and HTTP method), it uses annotations.

That’s one possible approach, and it definitely has its advantages: it supports multiple event types, so it’s more complete. On the other hand, tapir’s endpoint descriptions contain much more information and can be used in many other contexts — not only for defining Lambda functions — such as generating documentation, clients, or stand-alone servers.

One Lambda or many Lambdas?

Let’s start by deploying the logic for our service as a Lambda function. The first dilemma that we have to solve is whether we should have a single Lambda function for all of our endpoints, or a separate Lambda function for each. Each option has its pros and cons.

If we have a Lambda-per-endpoint, it’s possible to scale the handler for each endpoint independently; we can define different memory quotas, timeouts, etc. for each such function. Observability (logging, metrics) might also be simpler with that separation.

However, when implementing an application, we probably have a number of related endpoints which are developed together, as part of a single codebase. Most probably they share code, which is used to handle user requests. Hence when deploying separate Lambdas, each one would most probably contain exactly the same code — with the same third-party dependencies.

There is also a dark side to using Lambdas, and that is the problem of cold starts. When the first request arrives (either after deployment, after a period of inactivity, or when scaling kicks in), the container implementing the Lambda function is started and initialized by the Lambda framework. This takes a non-trivial amount of time, especially with JVM-based containers: they tend to be large (> 100MB), so apart from the JVM taking some time to start up, first the image needs to be downloaded to the Lambda-handling server.

By using a single Lambda for all of our endpoints, we can slightly mitigate this problem, as we have fewer Lambdas to keep warm and to spin up. Deployment is also simpler, as we simply use a single Docker image with all of our endpoint implementations. With some effort, we can also recover the observability benefits, by including the identity of the invoked endpoint in the logs.

Hence, we’ll go with the single-Lambda approach, but it’s fairly easy to modify the code to deploy multiple Lambdas if you decide that it’s a better alternative.

Accepting events

To implement a JVM-based Lambda, we have to use Amazon’s base Java image, public.ecr.aws/lambda/java:11. We need to COPY all jar files to the /var/task/lib directory in the container, and the class to handle the incoming events needs to be specified as the CMD in the Dockerfile. For example:

CMD ["com.softwaremill.app.AppHandler::handleRequest"]

The base image contains a daemon process that receives Lambda events, parses them, performs appropriate logging and error handling, instantiates the provided class (here, AppHandler) and invokes the given method (handleRequest). We “only” have to implement a function that takes the input JSON and produces the output JSON.

There’s a couple of ways to do this. The Amazon-provided Java runtime can do the parsing for us using Jackson, but we’ll bypass that and implement a variant that takes in a raw input stream of bytes, and writes the response in a similar way:

import com.amazonaws.services.lambda.runtime.{Context, RequestStreamHandler}
import java.io.{InputStream, OutputStream}

trait TapirHandler extends RequestStreamHandler {
  override def handleRequest(
      input: InputStream,
      output: OutputStream,
      context: Context): Unit = ...
}

That way, we don’t do any JSON parsing using reflection; instead, we generate the JSON codecs at compile-time using circe. You can browse the whole source code of TapirHandler (and other files) in the tapir-serverless repository.

Here’s a fragment of the incoming HTTP event that the Lambda receives from the API Gateway:

{
  "version": "2.0",
  "routeKey": "GET /hello",
  "rawPath": "/hello",
  "rawQueryString": "",
  "headers": {
    "accept": "*/*",
    "content-length": "3",
    "content-type": "application/x-www-form-urlencoded",
    "x-amzn-trace-id": "Root=1-60250d19-7182a3ff0e9dffb334e2bf74",
    ...
  },
  "requestContext": {
    "accountId": "1234567890",
    "apiId": "9abc9",
    "domainName": "9abc9.execute-api.eu-central-1.amazonaws.com",
    "domainPrefix": "9abc9",
    "http": {
      "method": "GET",
      "path": "/hello",
      "protocol": "HTTP/1.1"
    },
    "requestId": "ak78CiA8FiAEPWQ=",
    "routeKey": "POST /hello",
    "stage": "$default",
    "time": "11/Feb/2021:10:55:21 +0000",
    "timeEpoch": 1613040921706
  },
  "pathParameters": {},
  "body": "OTg3",
  "isBase64Encoded": true
}

As you can see, all the necessary information is here: the method, path, query string, headers, and base64-encoded body. We parse that into an AwsRequest class, and equipped with that, we can proceed to actually handling the event and running the appropriate logic.
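
For illustration, a simplified sketch of such a class, with compile-time-derived circe decoders, might look as follows (the field set below is a subset chosen for this example; see the repository for the full definition):

import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder

case class AwsHttp(method: String, path: String, protocol: String)
case class AwsRequestContext(http: AwsHttp)
case class AwsRequest(
    rawPath: String,
    rawQueryString: String,
    headers: Map[String, String],
    requestContext: AwsRequestContext,
    body: Option[String],
    isBase64Encoded: Boolean
)

object AwsRequest {
  // codecs generated at compile time, no runtime reflection involved
  implicit val httpDecoder: Decoder[AwsHttp] = deriveDecoder
  implicit val contextDecoder: Decoder[AwsRequestContext] = deriveDecoder
  implicit val requestDecoder: Decoder[AwsRequest] = deriveDecoder
}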

Handling events

To handle incoming requests, we need a list of endpoint descriptions, each coupled with the server logic. In tapir, each server interpreter supports server logic functions of a slightly different shape. For example, the akka-http based one requires that the logic functions return Futures. That is, given an endpoint that parses inputs into a type I and returns a type O, the logic function needs to be of type I => Future[O]. The http4s interpreter can use any functional effect wrapper that’s compatible with cats-effect, e.g. the built-in IO, Monix’s or ZIO’s Task.

In our proof-of-concept, we’ll use synchronous logic functions, that is, functions of type I => O, without any wrapper (or, to be more precise, with the type Identity[X] = X wrapper). This is good enough for testing and can be easily generalized to use either Futures or IOs if that’s the requirement.
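
To make the difference concrete, here’s an illustrative comparison of the logic function shapes for the getPet inputs and output (simplified, with error handling omitted; these are not tapir’s exact signatures, and cats-effect is assumed on the classpath):

import scala.concurrent.Future
import cats.effect.IO

type Identity[X] = X

// akka-http interpreter: results wrapped in a Future
val futureLogic: ((Int, String)) => Future[Animal] =
  { case (id, name) => Future.successful(Animal(id, name)) }

// http4s interpreter: any cats-effect-compatible wrapper, e.g. IO
val ioLogic: ((Int, String)) => IO[Animal] =
  { case (id, name) => IO.pure(Animal(id, name)) }

// our PoC: synchronous, no wrapper
val syncLogic: ((Int, String)) => Identity[Animal] =
  { case (id, name) => Animal(id, name) }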

Following our example, a server endpoint for getPet can look as follows:

val getPetServerEndpoint = getPet
  .serverLogic[Identity] { case (id, name) =>
    Right(Animal(id, name))
  }

As you can see, we are simply echoing the input, although in the HTTP response, it will be serialized in a different format (as JSON). The Right wrapper signals a successful result, as opposed to a bad-request result.

A service typically consists of multiple endpoints, each having different input and output parameters. Hence, an HTTP service implementation is a list of server endpoint instances, such as the one defined above.

The application needs to provide the list of server endpoints, as sketched below.
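
A minimal sketch, assuming the ServerEndpoint type shape of tapir versions from that period (input, error, output, required capabilities, effect wrapper):

val allEndpoints: List[ServerEndpoint[_, _, _, Any, Identity]] = List(
  getPetServerEndpoint
)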

We can then implement the AwsServerInterpreter which, given a list of ServerEndpoints, does the following: it decodes the incoming AwsRequest (including the base64-encoded body), tries to match the request’s method, path, query parameters and headers against each endpoint’s description, runs the server logic of the matching endpoint, and turns the result into an AwsResponse.

The TapirHandler then needs to serialize the response as JSON, again encoding the body as Base64, and write it to the output stream. The rest is handled by the Lambda framework and the API Gateway.
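
Putting this together, a minimal sketch of the handler’s body might look as follows. AwsServerInterpreter’s API and the AwsResponse encoder are simplified assumptions here (and the trait from the repository is collapsed into a single class); see the real implementation in the repository:

import com.amazonaws.services.lambda.runtime.{Context, RequestStreamHandler}
import io.circe.parser.decode
import io.circe.syntax._
import java.io.{InputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

class AppHandler extends RequestStreamHandler {
  override def handleRequest(input: InputStream, output: OutputStream, context: Context): Unit = {
    // parse the JSON event into an AwsRequest, using the compile-time-derived codecs
    val event = new String(input.readAllBytes(), UTF_8)
    val request = decode[AwsRequest](event).fold(e => throw e, identity)
    // match the request against the endpoints & run the appropriate server logic
    val response: AwsResponse = AwsServerInterpreter(allEndpoints).handle(request)
    // serialize the AwsResponse back to JSON (assuming a circe Encoder) and write it out
    output.write(response.asJson.noSpaces.getBytes(UTF_8))
  }
}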

Packaging the Lambda

We have the implementation of the Lambda handler, but we still need to package it as a Docker container. This is done using sbt-native-packager, which, with some tweaking to make it compatible with the Amazon-provided base Java image (such as putting the .jar files where the runtime expects them), produces the appropriate container.

We further configure the plugin to upload the container to an ECR repository, which needs to be created beforehand (by hand or automatically). The amazon-ecr-credential-helper might be useful to push Docker images to a private ECR repository.

As a result, we get a single sbt lambda/docker:publish command which creates and uploads the container. Here, lambda is the name of the build module that contains the Lambda handler code.
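
For reference, the relevant build configuration might look roughly like this (a sketch; the registry URL is a placeholder, and the exact settings in the repository differ):

// build.sbt (fragment)
lazy val lambda = (project in file("lambda"))
  .enablePlugins(JavaAppPackaging, DockerPlugin)
  .settings(
    // Amazon's base image for JVM-based Lambdas
    dockerBaseImage := "public.ecr.aws/lambda/java:11",
    // push to a (pre-created) private ECR repository; placeholder URL
    dockerRepository := Some("123.dkr.ecr.eu-central-1.amazonaws.com")
  )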

We have the container with the Lambda handler ready, but we haven’t yet created anything on the AWS cloud. We need to configure both the API Gateway and the Lambda service itself, so that a route for each endpoint is created, together with an integration forwarding the events to our Lambda function.

There’s a couple of ways in which we can create the resources on Amazon’s cloud. The first is directly using the API: creating API Gateway routes, integrations, and Lambda functions. But this is only one part of the story: we also need code that updates an existing configuration, adding new endpoints and removing old ones. It’s possible to implement this directly using Amazon’s API, but it gets significantly more complex.

Luckily, we’re not the only ones who would like to keep the configuration of our serverless application centralized. For that purpose, Amazon has created the AWS Serverless Application Model (SAM). Using it, we author a single YAML configuration file that captures the HTTP API and Lambda functions that we want to create. It can also contain information about other events (such as ones originating from Kafka, SQS, or S3) which can trigger Lambda functions, but we won’t be using these capabilities here.

We can then run the sam command-line application, which inspects the current state of the application’s resources on AWS, and generates a changeset to update them so that they’re in sync with what is described by the configuration file. This changeset can then be applied, and our service is immediately live with the changes in place.

Behind the scenes, the sam application generates and applies a CloudFormation template, which is in turn translated into direct API calls. As you can see, there are quite a lot of layers of abstraction here, and we are adding a new one!

Both creating the ECR repository and deploying a serverless application using SAM assume that you have a configured AWS account. Note that while there is a free tier, at some point these services start incurring costs. To conveniently use the aws and sam command-line applications, you’ll need to have AWS credentials either in the environment, in a .aws/credentials file, or passed in another supported way.

Of course, we won’t be writing the YAML by hand. Instead, we’ll generate it from the typed endpoint descriptions. We don’t need the server logic for that; the pure Endpoint descriptions are enough.

To generate a full SAM YAML, apart from the endpoints, we’ll also need two other parameters: the name of the application (which will be used as the basis of identifiers for AWS resources and to differentiate our service from others), and the URI of the Docker image that we have uploaded to ECR, with the Lambda implementation.

Having that, we can write another interpreter for our Endpoint datatype, which produces a SAM YAML such as the following:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  TapirServerlessFunction:
    Type: AWS::Serverless::Function
    Properties:
      ImageUri: 123.dkr.ecr.eu-central-1.amazonaws.com/xyz
      Timeout: 10
      MemorySize: 256
      Events:
        GetPetsPetid:
          Type: HttpApi
          Properties:
            ApiId: !Ref 'TapirServerlessHttpApi'
            Method: GET
            Path: /pets/{petId}
            TimeoutInMillis: 10000
            PayloadFormatVersion: '2.0'
      PackageType: Image
  TapirServerlessHttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      StageName: $default
Outputs:
  TapirServerlessUrl:
    Description: Base URL of your endpoints
    Value:
      Fn::Sub: https://${TapirServerlessHttpApi}.execute-api.${AWS::Region}.${AWS::URLSuffix}

The above defines a container-based Lambda function that will run the ImageUri container. We allocate 256MB of memory, which seems to be the minimum for non-trivial JVM-based Lambdas, and specify a timeout of 10 seconds — this is needed in case of cold starts.

We also define an event which will trigger the Lambda. The event is of type HttpApi, and will translate to an API Gateway route and a Lambda integration. The timeout of the integration is also 10 seconds. Once you have more endpoints, this YAML can get quite long. But since we are generating it from Scala, that’s not an issue. All we need to do is (see also the sources for SamTemplateInterpreter):

val endpoints: List[Endpoint[_, _, _, _]] = ...
val namePrefix: String = ...
val imageUri: String = ...

val samTemplate = SamTemplateInterpreter(
  endpoints,
  namePrefix,
  imageUri
)

If we save the template to a template.yaml file, we can now run:

sam deploy --guided

which will prompt us for information such as the AWS region to use, and have our service deployed to AWS.

There’s still one more automation step that we can take. We can use our build tool — sbt — to perform the following: build and publish the Docker image, generate the SAM template and save it to template.yaml, and run sam deploy.

This is implemented in the sbt deploy command, which builds & deploys a new version of our API — a great way of rapidly prototyping and exposing a Scala-based API! A sketch of how such a task could be wired up follows below.
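
A minimal sketch of such a task in build.sbt, assuming the SamTemplateInterpreter from before and an assumed toYaml helper on the generated template; the actual task in the repository is more elaborate:

// build.sbt (fragment)
import scala.sys.process._

lazy val deploy = taskKey[Unit]("Build, publish & deploy the serverless API")

deploy := {
  // 1. build & push the Docker image with the Lambda handler
  (lambda / Docker / publish).value
  // 2. generate the SAM template from the endpoint descriptions
  IO.write(file("template.yaml"), SamTemplateInterpreter(endpoints, namePrefix, imageUri).toYaml)
  // 3. apply the template to the AWS account
  "sam deploy --no-confirm-changeset".!
}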

As mentioned before, cold starts are a known problem when dealing with Lambda functions. JVM-based images are quite large, so downloading the image from the repository alone takes some time. Additionally, the JVM has a non-negligible startup time, which adds to the total amount of time an initial invocation of our Lambda function takes.

In my non-scientific measurements, invoking a brand new image that hasn’t been used before took up to 28 seconds, but the timings varied a lot. However, once the image has been used at least once, subsequent cold start requests took about 5 seconds (that is, after redeploying the function using an image that has been used before, or when the image was stopped by Lambda due to inactivity).

Once the function’s image is started, subsequent invocations are as fast as you would expect them to be. The latency then mostly consists of round trip times between my laptop and Amazon’s datacenter.

(One explanation of the very high latency for brand-new images is that AWS might be somehow processing the images before using them for the first time, but I have no way of verifying that.)

GraalVM

The natural question arises: what happens if we try GraalVM’s native-image? Using it, we can create a small native binary, without most of the Java runtime, containing only the Java classes that are in fact used by the application. That binary contains a baked-in garbage collector, but no JIT optimizing compiler, and lacks many other JRE features.

The analysis to determine which classes are used, done by GraalVM’s native-image, is performed when building the binary and is fully static (no application code is run). That’s why using reflection, or any reflection-based libraries, is mostly out of the question. However, for Scala, that’s often not an issue — reflection, or any kinds of reflection-based containers or frameworks, are rarely used. This analysis takes quite a lot of time — even for simple applications, it might be 5–10 minutes — so it’s not something you’ll want to run for everyday local builds, but rather on a CI server.

Moreover, if we want to run the native binary in a Linux-based container (and we do), we should build the binary in a Linux environment as well. Hence, to make sure the build works on macOS, Windows, etc., we need to run GraalVM’s native-image tool inside yet another (Linux-based) Docker container.

Luckily, this is all handled by sbt-native-packager. We “only” need to provide the necessary configuration — which sadly isn’t that trivial to get right.

Even though we are in Scala, some of the libraries that we depend on might still use reflection. An example here is Logback. We also need to provide some hints to the analyzer as to how to handle Scala internals. For example, we need this “magical” file in our sources:

import com.oracle.svm.core.annotate.Substitute;
import com.oracle.svm.core.annotate.TargetClass;

// replaces scala.runtime.Statics.releaseFence with a direct storeFence call
@TargetClass(className = "scala.runtime.Statics")
final class Target_scala_runtime_Statics {
  @Substitute
  public static void releaseFence() {
    UnsafeUtils.UNSAFE.storeFence();
  }
}

public class ScalaSubstitutions {}

If you look at the configuration of the graalLambda project, you’ll notice quite a lot of configuration options are used. However, this is mostly a one-time, upfront effort (in fact, I copied almost all of this from ElasticMQ’s configuration). That is, once we get over the initial hurdles of making our logging and concurrency libraries work with GraalVM, we shouldn’t have much trouble with that configuration later.
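
As an illustration, the sbt wiring for the native-image build might look roughly like this (a sketch using sbt-native-packager’s GraalVM plugin; the options and the builder image are examples, not the exact configuration from the repository):

// build.sbt (fragment)
lazy val graalLambda = (project in file("graal-lambda"))
  .enablePlugins(GraalVMNativeImagePlugin)
  .settings(
    graalVMNativeImageOptions ++= Seq(
      "--no-fallback", // fail the build instead of falling back to a JVM-based image
      "--enable-http", // allow http:// URLs (needed to talk to the Lambda runtime API)
      "--static"       // statically link, so the binary runs on a minimal base image
    ),
    // run native-image inside a Linux container, so the build works on macOS/Windows too
    containerBuildImage := GraalVMNativeImagePlugin
      .generateContainerBuildImage("ghcr.io/graalvm/graalvm-ce:java11-21.0.0")
      .value
  )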

Endpoints and server logic are just plain code, which is easily handled by GraalVM’s native-image.

What do we get in return?

A cold-start request for a brand-new image took about 3–4 seconds, and for an image that has been used before, about 0.7 seconds. A huge improvement!

Custom runtime

One important note: when creating the native binary using GraalVM, we have to write a custom Lambda runtime. We cannot use the one provided by Amazon, as we want to use a small base image (alpine in our case), instead of Amazon’s, which by itself weighs 506MB.

Luckily, implementing such a runtime isn’t difficult. There’s a step-by-step tutorial in the documentation. Custom runtimes can also be tested locally, which significantly speeds up development.

The runtime can be written in any language, as long as we can package the result as a Docker container. When run, it needs to perform a couple of simple steps in a loop: poll the Lambda runtime API for the next event, handle the event by running our server logic, and post the response (or an error) back to the runtime API; see the sketch below.

Additionally, the runtime should implement some logging and tracing; however, this is skipped in our PoC implementation. The sources are available in the AwsRuntime file.
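
A minimal sketch of such a loop, using sttp client and the documented Lambda runtime API endpoints; the handle function stands in for running the tapir interpreter on the event JSON, and error reporting and logging are omitted:

import sttp.client3._

object AwsRuntimeSketch {
  def handle(eventJson: String): String = ??? // run the AwsServerInterpreter here

  def main(args: Array[String]): Unit = {
    val backend = HttpURLConnectionBackend()
    val api = sys.env("AWS_LAMBDA_RUNTIME_API")
    while (true) {
      // 1. poll for the next invocation (a long-polling GET)
      val event = basicRequest
        .get(uri"http://$api/2018-06-01/runtime/invocation/next")
        .response(asStringAlways)
        .send(backend)
      val requestId = event.header("Lambda-Runtime-Aws-Request-Id").get
      // 2. handle the event, 3. post back the response
      basicRequest
        .post(uri"http://$api/2018-06-01/runtime/invocation/$requestId/response")
        .body(handle(event.body))
        .send(backend)
    }
  }
}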

We’ve gone through quite a detailed description, so let’s summarize what happened along the way.

We started with a bunch of endpoint descriptions, captured as immutable Scala values of type Endpoint.

We then created a TapirHandler class which, given an AwsRequest extracted from the JSON event, runs the AwsServerInterpreter. This in turn uses the endpoint descriptions, each coupled with a server logic function, to create the AwsResponse, which is then serialized as JSON.

This Lambda function implementation is packaged as a Docker image and published to an ECR repository.

Then, we created a SamTemplateInterpreter, which takes a List[Endpoint] along with the URI of the image with the Lambda implementations, and generates a YAML file describing our application’s endpoints and functions.

We can use the sam command line utility to apply this template to our AWS account, either directly, or through a wrapper sbt deploy command which performs all of the necessary steps for us.

Finally, we can build an image with a native binary of our application using GraalVM, to mitigate the cold start problem.

As a result, we get a quick way of deploying even complex APIs to Amazon’s cloud, without having to provision any hardware or pay any upfront fees. At the same time, we are using a high-level, modern, functional language to define our endpoints and their logic.

Thanks to using sttp tapir, we have a single, rich, centralized description of our application’s HTTP endpoints, which we can interpret in many ways: as documentation, as clients, or, as described above, to expose the API using AWS’s serverless stack. At the same time, we can quite easily migrate to a stand-alone server, if the circumstances (either financial or organizational) require it, using the same endpoint descriptions and the same server logic values. Finally, testing is possible using the included utilities.

Time for some reader participation! Where should we go next with this project? Let us know! Thanks! 🙂
