For a long time, I was looking for a serverless full-text search solution. Of course, there are managed solutions for this, like Elastic Cloud or Amazon Elasticsearch Service. But all these solutions have a fixed cost and do not scale automatically.
I also saw a couple of attempts to run Apache Lucene with Lambda and S3, but these had two issues. Firstly, Lucene is a Java library, and the JVM had terrible cold start issues. Well, not anymore with GraalVM, which generates a native executable out of a JAR, reducing cold start time and improving overall performance. Secondly, S3 is a bit too slow. Guess what? You can use EFS with a Lambda now, and it has much better latency.
So I thought to myself, why not build a simple proof of concept and see how it goes.
I used the Quarkus framework because of its small footprint and the ease of creating a native executable. This helps me solve the cold start problem.
Reading a Lucene index is thread-safe and process-safe. This means that read throughput is only limited by the AWS Lambda and EFS service limits. Writing, however, is a different story. An IndexWriter needs to obtain an exclusive lock on an index and will fail if another process is holding it.
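To make the exclusive-lock behaviour concrete, here is a minimal sketch that uses only the Java standard library. It is not Lucene code: the class name WriterLockSketch and the lock file name are assumptions made for illustration, and a plain java.nio file lock stands in for whatever an IndexWriter does internally. It shows the shape of the problem: the first writer gets the lock, a second one is turned away until the first releases it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an index writer lock; this is not Lucene code.
class WriterLockSketch implements AutoCloseable {
    // Lock files held by this JVM. Checked first so that a second channel is never
    // opened and closed on a held file, which on some systems would drop the lock.
    private static final Set<Path> HELD = ConcurrentHashMap.newKeySet();

    private final Path path;
    private final FileChannel channel;

    private WriterLockSketch(Path path, FileChannel channel) {
        this.path = path;
        this.channel = channel;
    }

    /** Returns the held lock, or null if another writer already owns it. */
    static WriterLockSketch tryAcquire(Path lockFile) {
        Path key = lockFile.toAbsolutePath().normalize();
        if (!HELD.add(key)) {
            return null; // already held by a writer in this JVM
        }
        FileChannel channel = null;
        try {
            channel = FileChannel.open(key, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            if (channel.tryLock() != null) { // null means another process holds it
                return new WriterLockSketch(key, channel);
            }
            channel.close();
            HELD.remove(key);
            return null;
        } catch (IOException e) {
            HELD.remove(key);
            try { if (channel != null) channel.close(); } catch (IOException ignored) { }
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            channel.close(); // closing the channel releases the file lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            HELD.remove(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempDirectory("index").resolve("write.lock");
        try (WriterLockSketch first = tryAcquire(lockFile)) {
            System.out.println("first writer got the lock: " + (first != null));
            System.out.println("second writer got the lock: " + (tryAcquire(lockFile) != null));
        }
        try (WriterLockSketch third = tryAcquire(lockFile)) {
            System.out.println("writer after release got the lock: " + (third != null));
        }
    }
}
```

With many Lambda instances running at once, every instance but one would end up in the "turned away" branch, which is why writes need to be serialized somewhere.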
A common solution in the serverless world when dealing with less scalable components is to buffer the requests through a queue. In my case, a lambda called enqueue-index puts the write requests into SQS with unlimited concurrency. Another lambda called index, which has limited concurrency, writes the requests to an EFS drive with a predictable throughput. This solves the exclusive locking problem and also makes the EFS load more predictable, eliminating spikes.
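The buffering pattern can be sketched in plain Java without any AWS dependencies. Everything here is an assumption made for illustration: an in-memory BlockingQueue stands in for SQS, a list stands in for the index, the producer threads play the role of the enqueue-index lambda, and the single worker thread plays the role of the index lambda. The batch size of 10 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory model of "many enqueuers, one writer"; no AWS calls involved.
class BufferedWriteSketch {

    /** Enqueues documents from many producers and returns what the single worker wrote. */
    static List<String> run(int producers, int docsPerProducer) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(); // stand-in for SQS
        List<String> index = new ArrayList<>(); // stand-in for the index, touched by the worker only
        int total = producers * docsPerProducer;

        // "enqueue-index": as many concurrent producers as you like.
        ExecutorService enqueuers = Executors.newFixedThreadPool(producers);
        for (int p = 0; p < producers; p++) {
            final int id = p;
            enqueuers.submit(() -> {
                for (int d = 0; d < docsPerProducer; d++) {
                    queue.add("doc-" + id + "-" + d);
                }
            });
        }
        enqueuers.shutdown();

        // "index": exactly one worker, so there is never more than one writer.
        Thread worker = new Thread(() -> {
            List<String> batch = new ArrayList<>();
            try {
                while (index.size() < total) {
                    String first = queue.poll(1, TimeUnit.SECONDS);
                    if (first == null) {
                        continue; // nothing queued yet
                    }
                    batch.add(first);
                    queue.drainTo(batch, 9); // take up to 10 messages per batch
                    index.addAll(batch);     // one "commit" per batch
                    batch.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join(); // join makes the worker's writes visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> written = run(8, 50);
        System.out.println("documents written by the single worker: " + written.size());
    }
}
```

The producers never wait for the writer, and the writer sees a steady stream of batches regardless of how bursty the producers are.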
This approach has its issues, of course. Write operations don’t happen immediately, and the throughput is limited. The throughput issue, however, could be somewhat improved by creating a queue and a worker for each separate Lucene index.
I had to apply a single workaround for Lucene to work on a network file system. Props to Michael McCandless for his helpful contributions to the Lucene source code and for documenting everything. I set a deletion policy to never delete index files so a writer would never delete something a reader could still use. If you’re planning to use it in production, you’ll need to replace it with a better policy, since a policy that never deletes anything lets files from old commits pile up on EFS indefinitely.
Using GraalVM kept cold starts low, at 250–300 ms. The main driving factor here is the package size, which was around 15 MB.
Running a load test on the enqueue-index lambda by itself was fairly pointless, as it was just writing to SQS. But I still did it to see how fast the index lambda would index the documents. I got figures between 50,000 and 100,000 documents per minute, which is roughly 833 to 1,667 documents per second.
Not bad, but please don’t quote me on these numbers since the documents used in that test were quite small.
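Since writes go through a queue, the indexing rate also determines how long a backlog takes to clear. The sketch below just does that arithmetic with the figures above; the one-million-document backlog is a made-up example, not something that was measured.

```java
// Back-of-the-envelope arithmetic only; the backlog size is a hypothetical example.
class ThroughputSketch {

    /** Converts a documents-per-minute rate to documents per second. */
    static double perSecond(long docsPerMinute) {
        return docsPerMinute / 60.0;
    }

    /** Seconds needed to drain a backlog at a given documents-per-minute rate. */
    static double secondsToDrain(long backlog, long docsPerMinute) {
        return backlog / perSecond(docsPerMinute);
    }

    public static void main(String[] args) {
        System.out.println("low:  " + Math.round(perSecond(50_000)) + " docs/s");   // 833
        System.out.println("high: " + Math.round(perSecond(100_000)) + " docs/s");  // 1667
        // A hypothetical backlog of one million small documents:
        System.out.println("drain at low rate:  " + Math.round(secondsToDrain(1_000_000, 50_000) / 60) + " min");
        System.out.println("drain at high rate: " + Math.round(secondsToDrain(1_000_000, 100_000) / 60) + " min");
    }
}
```

At these rates, such a backlog would take somewhere between 10 and 20 minutes to become searchable, which is the delay trade-off mentioned earlier.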
I also ran a load test on the read lambda and got around 2700 requests per second with an average latency of around 500ms, mostly limited by the EFS throughput.
Running 1m test @ https://<api-id>.execute-api.<region>.amazonaws.com/dev/query
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   538.90ms  858.99ms  12.62s    84.59%
    Req/Sec   235.47    148.52     0.85k    61.56%
  162949 requests in 1.00m, 160.22MB read
The EFS throughput could be improved in several ways, so it could be even faster. The latency could also be improved greatly by reducing the package size, reusing IndexReader objects properly, and caching.
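As a rough idea of what reusing a reader could look like, here is a dependency-free sketch. It keeps one expensive object alive between invocations of a warm Lambda instance and reopens it only when a version stamp changes. The class name, the version supplier, and the opener are all hypothetical; a real implementation would plug in Lucene's own reader and its refresh mechanism instead (Lucene's DirectoryReader has an openIfChanged method for this, which the sketch deliberately does not use).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder; R would be a reader type in a real project.
class ReusableReaderSketch<R> {
    private final LongSupplier currentVersion; // e.g. a cheap check of the index version (assumption)
    private final Supplier<R> opener;          // the expensive "open a reader" step (assumption)
    private R cached;
    private long cachedVersion;

    ReusableReaderSketch(LongSupplier currentVersion, Supplier<R> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns the cached object, reopening it only if the version has changed. */
    synchronized R get() {
        long version = currentVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            // A real reader would also need closing once no search is still using it.
            cached = opener.get();
            cachedVersion = version;
        }
        return cached;
    }

    public static void main(String[] args) {
        AtomicLong version = new AtomicLong(1);
        AtomicInteger opens = new AtomicInteger();
        ReusableReaderSketch<String> holder =
                new ReusableReaderSketch<>(version::get, () -> "reader-" + opens.incrementAndGet());

        holder.get();
        holder.get();                 // served from the cache
        System.out.println("opens after two queries: " + opens.get());
        version.set(2);               // pretend the index worker committed new documents
        holder.get();
        System.out.println("opens after a commit: " + opens.get());
    }
}
```

In a Lambda handler, such a holder would typically live in a static field so that it survives between invocations of the same warm instance.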
I also ran a couple of tests with writing and reading concurrently, and there were no errors.
While I wouldn’t dare to use this project in production, it shows how many possibilities the ability to use EFS with Lambda unlocks.