ETag 101: Tips and Tricks for Implementation
By Abhishek R and Naveen S.R
What is ETag? 🤔
The ETag, or entity tag, is an HTTP response header that identifies a specific version of a resource. It is one of the mechanisms HTTP provides for web cache validation. It makes caches more efficient and saves bandwidth, because a web server or backend service does not need to resend the full response if the content has not changed. ETags also help prevent simultaneous updates of a resource from overwriting each other.
If the resource at a given URL changes, a new ETag value must be generated. ETags work like fingerprints: comparing them quickly tells you whether two representations of a resource are the same. A server or service may also set them to persist indefinitely.
Where can we use ETag?
Avoiding mid-air collisions
Let’s imagine a scenario where multiple clients are trying to change a wiki page. How can we detect a mid-air edit collision?
We can hash the current wiki content and send it in the ETag response header:
ETag: "006540df2072ef320c644e61720c754f3"
While saving the wiki page, the client sends a POST request with the If-Match request header set to the ETag value it received earlier, so the server can check that the page has not changed in the meantime.
If-Match: "006540df2072ef320c644e61720c754f3"
If the hashes don’t match, the document has been edited in between and the server responds with 412 Precondition Failed.
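As a minimal sketch of that check, the Java class below hashes the stored page and compares the result with the If-Match value. The names IfMatchCheck, etagOf and statusForSave are illustrative assumptions, not part of our service. The sketch handles only a single ETag value or * in If-Match, and it assumes that a missing header means an unconditional save.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the service described here.
final class IfMatchCheck {

    // Quoted MD5 hex digest of the content, in the shape of a strong ETag.
    static String etagOf(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            return "\"" + String.format("%032x", new BigInteger(1, digest)) + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // 200 if the save may proceed, 412 if the page changed since the client read it.
    static int statusForSave(String storedContent, String ifMatchHeader) {
        if (ifMatchHeader == null || "*".equals(ifMatchHeader.trim())) {
            return 200; // assumption: no precondition, or "any current version"
        }
        return etagOf(storedContent).equals(ifMatchHeader.trim()) ? 200 : 412;
    }
}
```

For example, statusForSave(page, etagOf(page)) yields 200, while the same ETag checked against an edited page yields 412.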
Validation of Cached data
Let’s say a mobile app or browser has cached a response from the server. How can it check the freshness of the cache and decide whether to show the cached copy or fetch a fresh response from the server?
Used together with the If-None-Match request header, the ETag lets clients take advantage of caching. The server generates the ETag, which changes whenever the page changes. Essentially, clients ask the server to validate their cache by passing the ETag back to the server.
The process looks like this:
- The client requests page A.
- The server sends the response along with an ETag.
ETag: "006540df2072ef320c644e61720c754f3"
- The client stores the response in its cache, along with the ETag.
- When the client requests page A again, it sends the same ETag in the If-None-Match request header.
If-None-Match: "006540df2072ef320c644e61720c754f3"
- The server compares the If-None-Match value with the current ETag. If they match, it sends back 304 Not Modified with an empty body; otherwise it sends the full response with a new ETag.
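The comparison in the last step can be sketched as a small pure function. This is an illustration under assumptions, not our production code: the class name IfNoneMatchCheck is hypothetical, the header is split on commas (fine for hex digests, which never contain commas), and tags are compared weakly, meaning a W/ prefix is ignored.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the service described here.
final class IfNoneMatchCheck {

    // Drops the optional weak-validator prefix so that W/"x" and "x" compare equal.
    static String opaque(String tag) {
        String trimmed = tag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    // 304 if any tag the client presents matches the current ETag, else 200.
    static int statusFor(String currentEtag, String ifNoneMatchHeader) {
        if (ifNoneMatchHeader == null || ifNoneMatchHeader.trim().isEmpty()) {
            return 200; // nothing to validate against
        }
        if ("*".equals(ifNoneMatchHeader.trim())) {
            return 304; // "*" matches any current representation
        }
        String current = opaque(currentEtag);
        boolean match = Arrays.stream(ifNoneMatchHeader.split(","))
                .map(IfNoneMatchCheck::opaque)
                .anyMatch(current::equals);
        return match ? 304 : 200;
    }
}
```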
How we use ETag in our back end service 😉
Let’s look into how we can actually implement this on the back-end!
ETag Generation
We use the MD5 hash of the response body from our service as the ETag. It is computed at the servlet filter level.
How can we improve the server time? 🤔
Comparing the ETag with If-None-Match and sending 304 Not Modified saves bandwidth, but the server time stays the same, if it does not increase: the server still has to build the full response before it can hash it.
So the trick is to introduce caching and an interceptor in the Spring Boot service.
Caching
We introduce a caching mechanism in the service that stores the request hash as part of the key and the response hash as the value, so that we can map one to the other.
Cache namespace: ETag
Cache key: consumerID::userID::MD5Hash(Request Payload)
Cache value: MD5Hash(Response Body), which is the ETag value
At any point in time, there will be a single entry for a particular userID for a particular consumerID.
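To make the key layout and the one-entry-per-user rule concrete, here is a minimal in-memory sketch. It is an assumption-laden stand-in for a real cache: ETagCacheSketch and its methods are hypothetical names, a ConcurrentHashMap replaces the actual cache store, and the evict-then-put sequence is not atomic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the ETag cache namespace.
final class ETagCacheSketch {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    // Builds the key in the consumerID::userID::requestHash layout.
    static String key(String consumerId, String userId, String requestHash) {
        return consumerId + "::" + userId + "::" + requestHash;
    }

    // Stores the ETag after dropping any older entry for the same consumer and user.
    void put(String consumerId, String userId, String requestHash, String etag) {
        // The trailing "::" keeps user "u1" from also evicting user "u10".
        String prefix = consumerId + "::" + userId + "::";
        entries.keySet().removeIf(k -> k.startsWith(prefix));
        entries.put(key(consumerId, userId, requestHash), etag);
    }

    // True if the cached ETag for this exact request equals the one presented.
    boolean matches(String consumerId, String userId, String requestHash, String etag) {
        return etag != null && etag.equals(entries.get(key(consumerId, userId, requestHash)));
    }

    int size() {
        return entries.size();
    }
}
```

Storing a second request hash for the same user replaces the first entry, so a user who changes the request payload never accumulates stale ETags.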
Servlet Filters and Interceptor
Servlet Filter
A filter is an object that intercepts the HTTP requests and responses of your application. With a filter, we can act at two points:
- Before sending the request to the controller
- Before sending a response to the client.
A servlet filter that the container instantiates on its own is not managed by Spring, so you can’t use @Autowired inside it to get Spring beans.
So register the filter as a bean using the following method, or, if the filter class is itself defined as a component, inject it with @Autowired and register that instance.
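import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ETagConfig {

    // Declaring the filter as a bean lets Spring manage it and inject its dependencies.
    @Bean
    public CustomETagHeaderFilter customETagHeaderFilter() {
        return new CustomETagHeaderFilter();
    }

    // Alternatively, if CustomETagHeaderFilter is annotated with @Component:
    // @Autowired
    // CustomETagHeaderFilter customETagHeaderFilter;
    // and pass that field to the FilterRegistrationBean below.

    @Bean
    public FilterRegistrationBean<CustomETagHeaderFilter> customETagHeaderFilterRegistrationBean() {
        FilterRegistrationBean<CustomETagHeaderFilter> filterRegistrationBean =
                new FilterRegistrationBean<>(customETagHeaderFilter());
        // Apply the filter only to the endpoints that support ETag validation.
        filterRegistrationBean.addUrlPatterns("/cs/v1/content/data", "/cs/v1/content/userintent/data");
        filterRegistrationBean.setName("etagFilter");
        return filterRegistrationBean;
    }
}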
Interceptor: HandlerInterceptor
A HandlerInterceptor is similar to a servlet filter, but it works inside Spring MVC: it allows custom pre-processing, with the option of preventing the handler itself from executing, and custom post-processing.
This interface contains three main methods:
- preHandle(): called before the actual handler is executed. Return true to continue processing; returning false stops the handler from executing.
- postHandle(): called after the handler is executed, but before the view is rendered.
- afterCompletion(): called after the complete request has finished and the view has been rendered.
The following piece of code registers the interceptor with the InterceptorRegistry. Note that you can also choose the URL patterns the interceptor applies to.
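import java.util.ArrayList;
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Component
public class WebMvcConfig implements WebMvcConfigurer {

    @Autowired
    ETagInterceptor eTagInterceptor;

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Apply the interceptor only to the endpoints that support ETag validation.
        List<String> pattern = new ArrayList<>();
        pattern.add("/cs/v1/content/userintent/data");
        pattern.add("/cs/v1/content/data");
        registry.addInterceptor(eTagInterceptor).addPathPatterns(pattern);
    }
}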
Design
From the design, you can make out the following three cases:
- When the If-None-Match header is absent: preHandle() returns true without any processing, control goes to the controller, and the controller sends the response with a 200 status.
- When the If-None-Match header is present and the cache misses:
  - The interceptor computes the hash of the request body to form the cache key. If there is no cache entry for this user with that request body, it clears the user’s entries using the “consumerID::userID*” pattern and stores the request hash as a request attribute for later use.
  - Next, control goes to the controller, which returns the response with the status set to 200.
  - The filter then picks up this response, because it sees the If-None-Match request header.
  - The filter reads the request hash from the request attribute, which saves recomputing it, then computes the response hash and caches it.
  - It also adds the ETag and Last-Modified (taken from the server time) response headers and finally sends the response to the client.
- When the If-None-Match header is present and the cache hits:
  - The interceptor computes the hash of the request body to form the cache key. On a cache hit, it sets the ETag header to the value received in If-None-Match, sets the status code to 304, and returns false. The response thus goes to the client with an empty body, without entering the controller layer.
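The three cases can be condensed into one decision function. The sketch below is only an illustration: PreHandleDecision, Outcome and decide are hypothetical names, and a plain Map stands in for the cache. As in our cache, the stored value is the ETag without its surrounding quotes.

```java
import java.util.Map;

// Hypothetical condensation of the interceptor's three cases.
final class PreHandleDecision {

    enum Outcome {
        PASS_TO_CONTROLLER,        // no If-None-Match: plain 200 from the controller
        MISS_RUN_CONTROLLER_CACHE, // header present, cache miss: run controller, filter caches the ETag
        NOT_MODIFIED               // header present, cache hit: 304 without entering the controller
    }

    static Outcome decide(String ifNoneMatch, String cacheKey, Map<String, String> etagCache) {
        if (ifNoneMatch == null) {
            return Outcome.PASS_TO_CONTROLLER;
        }
        // The cache holds the bare hash, so strip the quotes before comparing.
        String presented = ifNoneMatch.replace("\"", "");
        if (presented.equals(etagCache.get(cacheKey))) {
            return Outcome.NOT_MODIFIED;
        }
        return Outcome.MISS_RUN_CONTROLLER_CACHE;
    }
}
```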
Implementation
Filter Implementation
Here are a few methods from the CustomETagHeaderFilter class, which extends the OncePerRequestFilter abstract class.
@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
        throws ServletException, IOException {
    String previousToken = request.getHeader(HttpHeaders.IF_NONE_MATCH);
    if (previousToken != null) {
        HttpServletResponse responseToUse = response;
        // Wrap the request so that its body can be read more than once.
        HttpServletRequest requestWrapper = new BodyHttpServletRequestWrapper(request);
        if (!isAsyncDispatch(request) && !(response instanceof ContentCachingResponseWrapper)) {
            // Buffer the response body so that it can be hashed before it is written out.
            responseToUse = new ConditionalContentCachingResponseWrapper(response, request);
        }
        filterChain.doFilter(requestWrapper, responseToUse);
        if (!isAsyncStarted(request) && !isContentCachingDisabled(request)) {
            updateResponse(requestWrapper, responseToUse);
        }
    } else {
        // If the header is not present, just pass control to the next filter in the chain.
        filterChain.doFilter(request, response);
    }
}

private void updateResponse(HttpServletRequest request, HttpServletResponse response) throws IOException {
    ContentCachingResponseWrapper wrapper =
            WebUtils.getNativeResponse(response, ContentCachingResponseWrapper.class);
    Assert.notNull(wrapper, "ContentCachingResponseWrapper not found");
    HttpServletResponse rawResponse = (HttpServletResponse) wrapper.getResponse();

    if (isEligibleForEtag(request, wrapper, wrapper.getStatus(), wrapper.getContentInputStream())) {
        String previousToken = request.getHeader(HttpHeaders.IF_NONE_MATCH);
        String eTag = wrapper.getHeader(HttpHeaders.ETAG);
        if (!StringUtils.hasText(eTag)) {
            eTag = generateETagHeaderValue(wrapper.getContentInputStream(), this.writeWeakETag);
            rawResponse.setHeader(HttpHeaders.ETAG, eTag);
            logger.info("ETAG: " + eTag);
        }

        // The opt-out is a request header, so read Cache-Control from the request.
        String cacheControl = request.getHeader(HttpHeaders.CACHE_CONTROL);
        if (cacheControl == null || !cacheControl.contains(DIRECTIVE_NO_STORE)) {
            String finalCacheKey = String.valueOf(
                    request.getAttribute(Constants.RequestAttributes.NAMED_ATTR_REQUEST_HASH));
            // Store the ETag without its surrounding quotes.
            cacheHelper.setCache(Constants.Cache.ETAG_CACHE_NAMESPACE, finalCacheKey, eTag.replace("\"", ""));
        }

        if (compareETagHeaderValue(previousToken, eTag)) { // compare previous token with current one
            // Use the same date we sent when we created the ETag the first time through.
            rawResponse.setHeader(HttpHeaders.LAST_MODIFIED, request.getHeader(HttpHeaders.IF_MODIFIED_SINCE));
            logger.info("ETag match: returning 304 Not Modified");
            // setStatus rather than sendError: 304 is not an error and must not trigger error handling.
            rawResponse.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return; // a 304 carries no body, so the buffered content is not copied
        } else {
            // First time through: set Last-Modified to now, truncated to whole seconds.
            Calendar cal = Calendar.getInstance();
            cal.set(Calendar.MILLISECOND, 0);
            rawResponse.setDateHeader(HttpHeaders.LAST_MODIFIED, cal.getTimeInMillis());
        }
    }
    wrapper.copyBodyToResponse();
}
Note that the client can tell the filter not to store the ETag in the cache by sending the Cache-Control request header with a value containing no-store.
To read the request body multiple times in the filters and interceptors, we wrote a wrapper called BodyHttpServletRequestWrapper and pass it through the filter chain instead of the original HttpServletRequest, whose body can be read only once.
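import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

public class BodyHttpServletRequestWrapper extends HttpServletRequestWrapper {
    // Copy of the request body, kept so that it can be read any number of times.
    private final byte[] body;

    public BodyHttpServletRequestWrapper(HttpServletRequest request) {
        super(request);
        this.body = HttpRequestHelper.getBodyString(request).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public BufferedReader getReader() throws IOException {
        // Decode with the same charset that was used to store the body.
        return new BufferedReader(new InputStreamReader(this.getInputStream(), StandardCharsets.UTF_8));
    }

    @Override
    public ServletInputStream getInputStream() throws IOException {
        // Each call returns a fresh stream over the stored bytes.
        final ByteArrayInputStream bais = new ByteArrayInputStream(this.body);
        return new ServletInputStream() {
            @Override
            public boolean isFinished() {
                return bais.available() == 0;
            }

            @Override
            public boolean isReady() {
                return true; // the body is already in memory
            }

            @Override
            public void setReadListener(ReadListener readListener) {
                // Non-blocking reads are not supported by this in-memory stream.
            }

            @Override
            public int read() throws IOException {
                return bais.read();
            }
        };
    }
}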
Here we store the request body in a field, which basically serves as a cache for the request body, so it can be read as many times as you want!
Interceptor
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpHeaders;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.handler.HandlerInterceptorAdapter;

@Slf4j
@Component
public class ETagInterceptor extends HandlerInterceptorAdapter {

    private static final String DIRECTIVE_NO_STORE = "no-store";

    @Autowired
    HashRequestKey hashRequestKey;
    @Autowired
    CacheHelper cacheHelper;
    @Autowired
    CacheEvictService cacheEvictService;

    @Override
    public final boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler)
            throws IOException {
        // The opt-out is a request header; a missing header must not cause a NullPointerException.
        String cacheControl = request.getHeader(HttpHeaders.CACHE_CONTROL);
        if (cacheControl != null && cacheControl.contains(DIRECTIVE_NO_STORE)) {
            return true;
        }
        String method = request.getMethod();
        if (!"GET".equalsIgnoreCase(method) && !"POST".equalsIgnoreCase(method)) {
            return true;
        }
        String previousETag = request.getHeader(HttpHeaders.IF_NONE_MATCH);
        if (previousETag == null) {
            return true; // no If-None-Match header: nothing to validate
        }

        HttpServletRequest requestWrapper = request;
        if (!(request instanceof BodyHttpServletRequestWrapper)) {
            requestWrapper = new BodyHttpServletRequestWrapper(request);
        }
        String finalCacheKey = hashRequestKey.getCacheKey(requestWrapper);
        if (finalCacheKey == null) {
            return true;
        }

        // Cache hit: the ETag the client holds is still current.
        if (cacheHelper.cacheCheck(Constants.Cache.ETAG_CACHE_NAMESPACE, finalCacheKey,
                previousETag.replace("\"", ""))) {
            response.setHeader(HttpHeaders.ETAG, previousETag);
            // Re-use the original Last-Modified timestamp.
            response.setHeader(HttpHeaders.LAST_MODIFIED, request.getHeader(HttpHeaders.IF_MODIFIED_SINCE));
            log.info("ETag match: returning 304 Not Modified");
            // setStatus rather than sendError: 304 is not an error and must not trigger error handling.
            response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return false; // no further processing required
        }

        log.info("ETag no match found");
        // Keep the request hash as a request attribute so that the filter does not recompute it.
        request.setAttribute(Constants.RequestAttributes.NAMED_ATTR_REQUEST_HASH, finalCacheKey);
        // Evict the user's older entries, unless the userID is null.
        String[] splitCacheKey = finalCacheKey.split("::", 3);
        if (!"null".equals(splitCacheKey[1])) {
            String pattern = Constants.Cache.ETAG_CACHE_NAMESPACE + "::" + splitCacheKey[0] + "::"
                    + splitCacheKey[1] + "*";
            cacheEvictService.clearCacheByType(pattern);
        }
        return true;
    }
}
Note that the client can skip all processing at the interceptor level by sending the Cache-Control request header with a value containing no-store.
How can a client use the ETag?
For a client to enable the ETag and use it for cache validation, it simply sends two request headers:
- If-None-Match: initially an empty string
- If-Modified-Since: initially an empty string
When the client gets the response from the server, it has to store the following response headers for future use:
- ETag
- Last-Modified
On later calls to the server, it should send the following:
- If-None-Match = ETag from the cache; if not present, send an empty string
- If-Modified-Since = Last-Modified from the cache; if not present, send an empty string
Validating the cache
- When the response code 200 is sent from the server, invalidate the present cache and update the cache with the response from the server.
- When the response code 304 is sent from the server, the cache still holds good. Serve the data from the cache.
Make sure the client actually holds a cached response for the user before relying on the ETag feature, because a 304 response carries no body.
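The client-side rules above can be sketched as a small state holder. This is a hypothetical illustration, not a real client library: ClientETagCache and its method names are assumptions, and it caches a single response for one user and one endpoint.

```java
// Hypothetical client-side cache for one user and one endpoint.
final class ClientETagCache {
    private String etag;
    private String lastModified;
    private String body;

    // Value for the If-None-Match request header; empty when nothing is cached yet.
    String ifNoneMatch() {
        return etag == null ? "" : etag;
    }

    // Value for the If-Modified-Since request header; empty when nothing is cached yet.
    String ifModifiedSince() {
        return lastModified == null ? "" : lastModified;
    }

    // Applies a server response and returns the body the client should display.
    String onResponse(int status, String responseEtag, String responseLastModified, String responseBody) {
        if (status == 200) {
            // Fresh data: replace the cached copy and its validators.
            etag = responseEtag;
            lastModified = responseLastModified;
            body = responseBody;
            return body;
        }
        if (status == 304 && body != null) {
            return body; // the cached copy still holds good
        }
        throw new IllegalStateException("Unexpected status " + status + " for the current cache state");
    }
}
```

A 304 received while nothing is cached is treated as an error here, which is the situation the note above warns about.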
Hope you found our ETag guide useful!