Boost Your Application Performance with Lucene Caching and Regular Updates

Hatim Tachi
Published in CodeX
4 min read · Jul 11, 2024

In modern applications, optimizing database access is crucial for fast response times. By using Apache Lucene as an in-memory caching layer and refreshing that cache periodically, we can serve most HTTP requests without touching the database. This article walks through the implementation and illustrates the expected improvement with a worked comparison of response times.

Performance Comparison: Before and After Caching

To demonstrate the performance improvements, let’s compare response times before and after implementing Lucene caching.

Scenario:

  • Database Response Time: Average time to fetch data from the database is 100 ms.
  • Lucene Cache Response Time: Average time to fetch data from Lucene cache is 10 ms.

Before Caching:

  • Each request results in a database hit.
  • Average response time = 100 ms.

After Caching:

  • First request results in a database hit and data is indexed.
  • Subsequent requests are served from the Lucene cache.
  • Average response time for requests served from the cache = 10 ms.

Results:

Before Caching:

  • 100 requests -> Total time = 100 ms/request * 100 requests = 10 000 ms.
  • Average response time per request = 100 ms.

After Caching:

  • 1st request (database hit): 100 ms.
  • 99 subsequent requests (cache hits): 10 ms/request * 99 requests = 990 ms.
  • Total time = 100 ms + 990 ms = 1 090 ms.
  • Average response time per request = 1 090 ms / 100 requests = 10.9 ms.
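The arithmetic above generalizes to any mix of cache hits and misses. As a minimal sketch (the class and method names are illustrative and not part of the article's code), the average latency is the mean of the two response times weighted by how many requests take each path:

```java
public class AverageLatency {

    // Average latency per request for a given number of cache misses and hits.
    // dbMs and cacheMs are the assumed per-request latencies from the scenario.
    static double averageMs(int misses, int hits, double dbMs, double cacheMs) {
        int total = misses + hits;
        if (total == 0) {
            throw new IllegalArgumentException("at least one request is required");
        }
        return (misses * dbMs + hits * cacheMs) / total;
    }

    public static void main(String[] args) {
        // 1 database hit followed by 99 cache hits, as in the scenario above
        System.out.println(averageMs(1, 99, 100.0, 10.0));
        // Every request goes to the database (no caching)
        System.out.println(averageMs(100, 0, 100.0, 10.0));
    }
}
```

A lower hit ratio moves the average back toward the database latency: under the same assumed latencies, 50 misses and 50 hits average 55 ms per request.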

Approach

1. Set Up Apache Lucene

  • Include Lucene dependencies in your project.
  • Create a utility class to manage indexing and searching.

2. Periodic Cache Update

  • Implement a scheduled task to refresh the cache every 10 seconds.

3. Integrate with HTTP Requests

  • Modify your resource class to use Lucene for fast data retrieval.
  • Fall back to the database if data is not found in the cache.
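Before the Lucene-specific code, the flow of these steps can be sketched in plain Java. This is only an illustration under stated assumptions: a ConcurrentHashMap stands in for the Lucene index, a Function stands in for the database, and the class ReadThroughCache and its methods are hypothetical names that do not appear in the implementation that follows.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class ReadThroughCache {

    // Stand-in for the Lucene index (assumption: a simple key/value lookup)
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the database access layer
    private final Function<String, String> database;
    // Counts database calls so the effect of caching is observable
    private final AtomicInteger databaseHits = new AtomicInteger();

    public ReadThroughCache(Function<String, String> database) {
        this.database = database;
    }

    // Step 3: serve from the cache, fall back to the database on a miss
    public String get(String id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        String fresh = loadFromDatabase(id);
        if (fresh != null) {
            cache.put(id, fresh);
        }
        return fresh;
    }

    // Step 2: meant to be called periodically; reloads every cached entry
    public void refresh() {
        for (String id : cache.keySet()) {
            String fresh = loadFromDatabase(id);
            if (fresh != null) {
                cache.put(id, fresh);
            } else {
                cache.remove(id);
            }
        }
    }

    public int databaseHits() {
        return databaseHits.get();
    }

    private String loadFromDatabase(String id) {
        databaseHits.incrementAndGet();
        return database.apply(id);
    }

    public static void main(String[] args) {
        ReadThroughCache cache = new ReadThroughCache(id -> "row " + id);
        cache.get("1"); // miss: goes to the database
        cache.get("1"); // hit: served from the cache
        System.out.println(cache.databaseHits()); // prints 1
    }
}
```

The database counter makes the trade-off visible: repeated reads of the same ID cost one database call, and each refresh costs one call per cached entry.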

Implementation

Step 1: Add Lucene Dependency

Add the following Lucene dependencies to your pom.xml file:

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>8.11.1</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>8.11.1</version>
</dependency>

Step 2: Create Lucene Utility Class

Create a utility class LuceneCache.java to manage Lucene operations:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

import javax.enterprise.context.ApplicationScoped;
import java.io.IOException;

@ApplicationScoped
public class LuceneCache {

    private final StandardAnalyzer analyzer = new StandardAnalyzer();
    // In-memory index; ByteBuffersDirectory replaces the deprecated RAMDirectory
    private final Directory index = new ByteBuffersDirectory();

    // synchronized: only one IndexWriter may be open on a directory at a time
    public synchronized void indexData(String id, String content) throws IOException {
        // An IndexWriterConfig cannot be shared between IndexWriter instances,
        // so a fresh one is created for every writer
        try (IndexWriter writer = new IndexWriter(index, new IndexWriterConfig(analyzer))) {
            Document doc = new Document();
            doc.add(new StringField("id", id, Field.Store.YES));
            doc.add(new TextField("content", content, Field.Store.YES));
            // Replace any existing document with the same id instead of adding a duplicate
            writer.updateDocument(new Term("id", id), doc);
            writer.commit();
        }
    }

    public String searchData(String id) throws IOException {
        // Nothing has been committed yet, so there is no index to open
        if (!DirectoryReader.indexExists(index)) {
            return null;
        }
        try (DirectoryReader reader = DirectoryReader.open(index)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // StringField values are indexed verbatim, so an exact TermQuery matches them;
            // this also avoids the separate lucene-queryparser module
            TopDocs docs = searcher.search(new TermQuery(new Term("id", id)), 1);
            if (docs.scoreDocs.length > 0) {
                Document doc = searcher.doc(docs.scoreDocs[0].doc);
                return doc.get("content");
            }
            return null;
        }
    }
}

Step 3: Implement Scheduled Cache Update

Use the Quarkus scheduler extension (the quarkus-scheduler dependency) to refresh the Lucene cache every 10 seconds:

import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import io.quarkus.scheduler.Scheduled;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

@ApplicationScoped
public class CacheUpdater {

    @Inject
    LuceneCache luceneCache;

    @Scheduled(every = "10s")
    void updateCache() {
        // Fetch updated data from the database.
        // For demonstration, assume fetchDataFromDatabase() returns a Map<String, String>
        // where the key is the ID and the value is the content.
        Map<String, String> updatedData = fetchDataFromDatabase();
        for (Map.Entry<String, String> entry : updatedData.entrySet()) {
            try {
                luceneCache.indexData(entry.getKey(), entry.getValue());
            } catch (IOException e) {
                // Keep refreshing the remaining entries; use a real logger in production
                e.printStackTrace();
            }
        }
    }

    private Map<String, String> fetchDataFromDatabase() {
        // Implement your database fetch logic here.
        // Returning mock data for demonstration purposes.
        Map<String, String> data = new HashMap<>();
        data.put("1", "Updated content for ID 1");
        data.put("2", "Updated content for ID 2");
        return data;
    }
}

Step 4: Integrate Lucene with HTTP Requests

Modify your resource class to use Lucene for data retrieval:

import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.inject.Inject;

@Path("/data")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class DataResource {

    @Inject
    LuceneCache luceneCache;

    @GET
    @Path("/{id}")
    public String getData(@PathParam("id") String id) throws Exception {
        String result = luceneCache.searchData(id);
        if (result == null) {
            // Fallback to database if data is not found in cache
            String dataFromDb = fetchDataFromDatabase(id);
            // Index the data for future requests
            luceneCache.indexData(id, dataFromDb);
            return dataFromDb;
        }
        return result;
    }

    private String fetchDataFromDatabase(String id) {
        // Implement your database fetch logic here
        return "Data from database for id: " + id;
    }
}

Testing and Validation

  • Index Data: The scheduled task re-indexes data from the database every 10 seconds.
  • Retrieve Data: Use the GET /data/{id} endpoint to fetch data. If the ID is not yet in the index, the first request fetches it from the database and indexes it. Subsequent requests are served from the Lucene cache, which the scheduled task keeps refreshing every 10 seconds.
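To check the response times on your own setup, a small timing helper is enough for a first impression. The sketch below is not from the original implementation: LatencyProbe and averageMillis are hypothetical names, and the Supplier would wrap whatever call you want to measure (for example an HTTP GET against /data/{id}). It only averages wall-clock time over repeated calls; a rigorous benchmark would also need warm-up runs and a dedicated harness.

```java
import java.util.function.Supplier;

public class LatencyProbe {

    // Runs the call `iterations` times and returns the mean wall-clock time in milliseconds
    static double averageMillis(Supplier<?> call, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            call.get();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        // Placeholder workload: sleeps about 5 ms to imitate a slow lookup
        Supplier<String> slowLookup = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "value";
        };
        System.out.printf("average: %.2f ms%n", averageMillis(slowLookup, 20));
    }
}
```

Measuring the same ID once on a cold cache and then repeatedly on a warm one gives the two figures needed to compare against the scenario at the start of the article.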

Conclusion

Integrating Apache Lucene as a caching layer with periodic updates can significantly boost application performance. In the illustrative scenario above, the average response time drops from 100 ms to approximately 10.9 ms per request. The approach keeps data retrieval fast and reduces the load on the database, which makes it a good fit for read-intensive applications where data that is up to 10 seconds stale is acceptable.

My name is Hatim and I am a passionate Data Tech Lead in the technology industry.