HTTP caching with Hazelcast, and other tricks to make my Angular app load faster

Piotr Szybicki
12 Developer Labors
10 min read · Jun 19, 2018


Front-end

HTTP Caching

For beginner developers, HTTP caching is a subject usually researched at the end of the project, when it turns out that not everybody will load the page from localhost:4200 :) and that the amount of bandwidth needed to load it is far more than today's 3G networks can handle. I titled this post HTTP caching, but I will share all the tricks I had to use to make my application load faster, from the very basic to the more advanced (maybe a little specific to my application). To demonstrate the concepts described here, I created a demo application in my repo. The stack is Angular 6, Spring 5 and Hazelcast 3.10.1. The link is in the section at the bottom.

The first thing you have to do, if you haven't already, is set the --prod flag. It enables a lot of optimizations, like tree shaking, that can reduce the size of your page significantly; in my case it was almost an 80% reduction. It can also surface some interesting errors that have to be fixed. For example, the visibility of fields might be enforced more strictly.

ng build --deploy-url /your_url/ --prod --env=prod

Enable compression

In Spring, this is done by setting a few parameters in application.properties:

server.compression.enabled=true
server.compression.min-response-size=2048
server.compression.mime-types=application/json,application/xml,text/html,text/xml,text/plain,text/css,application/javascript

That gave me another 40%. The files that had a combined size of 25 MB before I started now weigh below 1.5 MB. If you want to be extra fancy, there is a newer compression algorithm from Google called Brotli that is worth looking into.

Cache Control settings

And that could be good enough, but I decided to keep going and try to figure out a way to make my application load even faster. I started to look at what was being loaded, and that's when I noticed a Cache-Control header, set on every response for a static resource, that disabled caching completely.

It turns out that if you are using Spring Security, this header is added to every outgoing response. In order to enable caching of static resources you have to do some configuration. Example below.

import static java.util.concurrent.TimeUnit.DAYS;
import static org.springframework.http.CacheControl.maxAge;

import org.springframework.context.annotation.Configuration;
import org.springframework.http.CacheControl;
import org.springframework.web.servlet.config.annotation.ResourceHandlerRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurerAdapter;

@Configuration
public class HttpClientConfiguration extends WebMvcConfigurerAdapter {

    @Override
    public void addResourceHandlers(ResourceHandlerRegistry registry) {
        // Let the browser (but not shared proxies) cache static resources for one day.
        final CacheControl cacheControl = maxAge(1, DAYS)
                .cachePrivate();

        registry.addResourceHandler("/**")
                .setCacheControl(cacheControl)
                .addResourceLocations("/");
    }
}

After that I could see the proper header set and, in the Size column of the Chrome dev tools network tab, the "from memory cache" and "from disk cache" messages.

I think this is a good moment to explain what is actually in those headers and how we can leverage them to reduce the traffic to the server. Let's start with Pragma. This is a legacy header kept for backwards compatibility; it was created as part of the HTTP/1.0 standard, and if Cache-Control is present it is ignored. The same goes for Expires: if Cache-Control contains a max-age value, the Expires header is also ignored.

OK, so Cache-Control is the only header that really interests us, and the next question is what values can be put in it. First, we can specify public, private or no-cache. No-cache means that the resource has to be re-validated with the origin server every time (some strategies for leveraging that are described later); it is important to understand that no-cache does not mean the resource can't be cached. Public indicates a resource that can be shared by many users; in practice it is a directive for proxy servers and API gateways telling them it is OK to cache the response, so if any other user behind the proxy requests the same resource it will be served from the proxy. Private means that the response is for a specific user only and therefore can be cached only by the browser.

Using public caches is outside the scope of this tutorial; all my caching settings will be either private or no-cache. The first choice we have to make is cache or no cache. If we want to fetch the resource every time, the full header has to look like this:

Cache-Control: no-store, no-cache, must-revalidate, max-age=0

Technically the only directive we have to set is no-store if we want to make sure our response is not saved anywhere, but unfortunately some older browsers treat no-cache as if it were no-store. We also have to take into consideration what should happen when the user hits the back button: many browsers (Chrome among them) will reuse the previous response. When dealing with sensitive information, like the data from your bank or stock broker, this default behaviour might not be desired. As I mentioned above, if you are using Spring Security (and have a single-bundle deployment that includes the static resources), this header is added to every outgoing response.

Of course you don't want to use that for every request. Let's talk about how a single-page Angular application is structured. We have the index.html that contains links to the *.js files. The HTML file is usually small, but it changes once per deployment, so we can cache it. We can't, however, use a fixed value to determine whether the data is stale (as I did in the example above with Cache-Control: max-age=86400). It is easy to understand why: if the user cached the page and we then deployed a new one, he or she would keep using the old version, which could, for example, break our API and cause the page to function incorrectly. This is where ETags come to the rescue. An ETag (short for entity tag) has two applications: confirming that a resource has not changed, and helping us avoid mid-air collisions (the situation where we want to modify a previously fetched resource while that resource has been modified by somebody else in the meantime). I will focus only on the caching application.

The flow is simple. First the browser sends a GET request and fetches index.html; the full page is sent in the response, along with an ETag header. Then some time passes and the browser requests index.html again, but this time an If-None-Match header containing the ETag value is sent along with it. The back-end server can use that to verify whether the response body has changed; if it hasn't, only the headers are sent back with the status 304 Not Modified, thus reducing the bandwidth. You can see the 304 status code in the network tab.

If, however, the response has changed, meaning the ETags do not match, the full response is sent along with a new ETag.
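
To make the flow concrete, here is a minimal sketch of handling the conditional request by hand in a Spring MVC controller, using WebRequest.checkNotModified. The endpoint, the naive ETag computation and the loadIndexHtml() helper are illustrative only; in this application the ShallowEtagHeaderFilter described below does the same work automatically for every response.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.WebRequest;

@RestController
public class IndexController {

    @GetMapping("/index.html")
    public String index(WebRequest request) {
        String html = loadIndexHtml();
        // Naive ETag for illustration: a hash of the current body, wrapped in quotes.
        String etag = "\"" + Integer.toHexString(html.hashCode()) + "\"";

        // Compares the ETag with the If-None-Match header sent by the browser.
        // If they match, Spring sets 304 Not Modified on the response and we send no body.
        if (request.checkNotModified(etag)) {
            return null;
        }
        // Otherwise the full body goes out together with the (new) ETag header.
        return html;
    }

    private String loadIndexHtml() {
        return "<html><body>demo</body></html>"; // placeholder content
    }
}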

This gives us an interesting caching scenario. As I mentioned, index.html in Angular contains links to the *.js files, and every time we do a build with ng build the generated files get names of the format name.unique_id.bundle.js. That means we can apply a much more relaxed caching policy to the *.js files and cache them practically forever: if we do another deployment, the ETag flow will detect the change in the HTML and everything will be reloaded (as the file names will have changed). All we then need is to verify that index.html has not changed. That means we have to make some changes to our configuration.

// Inside addResourceHandlers, with the same static imports as before
// (CacheControl.maxAge and TimeUnit.DAYS):
final CacheControl cacheControl = maxAge(100, DAYS)
        .cachePrivate();

registry.addResourceHandler("*.js")
        .setCacheControl(cacheControl)
        .addResourceLocations("classpath:/public/");

registry.addResourceHandler("*.css")
        .setCacheControl(cacheControl)
        .addResourceLocations("classpath:/public/");

That's it when it comes to handling static resources. All you have to do to enable the ETag behaviour is to add a ShallowEtagHeaderFilter to your Spring Security configuration and disable the default cache-control behaviour. That means that, left as it is, the Cache-Control header will not be set. Normally this is not a problem, as the ETag will be added to every outgoing response, which gives us predictable behaviour.

@Override
protected void configure(HttpSecurity http) throws Exception {
    http.authorizeRequests().antMatchers("/*").hasRole("ROLE").and().httpBasic();
    // Stop Spring Security from adding its default "disable all caching" headers.
    http.headers().cacheControl().disable();
    // Compute an ETag over each response body and answer If-None-Match requests with 304.
    http.addFilterAfter(new ShallowEtagHeaderFilter(), FilterSecurityInterceptor.class);
}

OK, so we now handle our static resources as efficiently as we possibly can. The screenshot from the network traffic tab tells the following story: we sent the request for index.html and got a 304 Not Modified response; the browser then fetched the HTML file from its cache, parsed it and determined that all the remaining resources could be loaded either from memory or from disk. 378 B was transferred. Cool, huh?

Screenshot from the network traffic tab.

Protocol version

This is actually quite important. Make sure that your hosting supports HTTP/2: it allows resources to be fetched from the server in parallel, instead of the request waterfall you would otherwise see in the network tab. If you run on Spring Boot 2 with an embedded server, this usually comes down to the server.http2.enabled property plus serving the application over TLS.

Lazy Loading

There is also something we can do in Angular itself: lazy loading of modules. In the demo's root router definition, the route with the path 'customers' lazy-loads its components from the customers module (via loadChildren). I would also like to draw your attention to the router options, where I set {preloadingStrategy: PreloadAllModules}. It means that as soon as the resources from index.html finish loading, the loading of the other modules begins. A good scenario is a login screen: you display it as fast as possible, and while the user types his or her credentials the rest of the page loads.

Backend

Now let's do something about the dynamic part. Here we basically have the same options: set the ETag, set the max-age, or disable caching. I will cover them next.

Starting from the bottom of the controller in the demo: if there is some value that we do not, under any circumstances, want to cache, all we have to do is set noStore().noTransform().mustRevalidate(). This is for bank account data, medical history, porn hub searches :) etc.
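
As a rough sketch of that case (the endpoint path and the payload below are made up, not taken from the demo), such a method can look like this in Spring:

import org.springframework.http.CacheControl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AccountController {

    @GetMapping("/api/account-balance")
    public ResponseEntity<String> accountBalance() {
        // Sends Cache-Control with no-store, no-transform and must-revalidate:
        // nothing may be written to any cache, browser or proxy.
        return ResponseEntity.ok()
                .cacheControl(CacheControl.noStore().noTransform().mustRevalidate())
                .body("{\"balance\": 42}"); // placeholder payload
    }
}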

The next one is a little bit tricky; here we can actually use Hazelcast (a great in-memory cache; a separate article on it is also coming). Let's assume there is some resource that takes a lot of CPU time to compute and whose response is quite big. We have two problems: we don't want to send the entire response again and again, and we don't want to trigger the recalculation every time. Here the ETag alone will not help us; remember that in order to calculate the checksum of the response you have to have the response first, so going to your DB, fetching the query result and then calculating some complicated statistic on every request is not an option.
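
For context, here is a minimal sketch of wiring the Spring cache abstraction to Hazelcast with the hazelcast-spring module. The cache name used in the following snippets, "expensiveReport", and the default Hazelcast configuration are my assumptions, not necessarily what the demo does.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.spring.cache.HazelcastCacheManager;

import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;

@Configuration
@EnableCaching      // turns @Cacheable / @CacheEvict annotations on
@EnableScheduling   // needed later for the schedule-based invalidation
public class CacheConfiguration {

    @Bean
    public HazelcastInstance hazelcastInstance() {
        // Embedded Hazelcast member with default configuration.
        return Hazelcast.newHazelcastInstance();
    }

    @Bean
    public CacheManager cacheManager(HazelcastInstance hazelcastInstance) {
        // Backs Spring's cache abstraction with Hazelcast maps.
        return new HazelcastCacheManager(hazelcastInstance);
    }
}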

There is a saying in the developer community: "The two hardest problems in software development are variable naming and cache invalidation." Cache invalidation is a complicated subject that gave me a lot of headaches, and I will cover it in a separate article. Today, however, I want to talk about schedule-based invalidation, probably the simplest kind and therefore only good in certain cases. In short, I have a scheduler service that runs every 30 seconds and clears the Hazelcast ICache, and the first incoming request after that triggers the recalculation and stores the result in an object called CachedObject, which has two fields: insertionTime and payload. We use the insertionTime to calculate the number of seconds the resource will remain valid and then use that number to set max-age in Cache-Control. Here it is important to mention in what order the cache-controlling headers are processed (remember, we are still using ETags): according to the spec, if we set max-age in Cache-Control it takes precedence over the ETag, so the resource will not be fetched from the server again before it expires from the browser cache (of course, once it does expire, the request will include the ETag, so the rest of that logic still applies).
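
A minimal sketch of that max-age calculation, assuming the CachedObject described above (the field types, the 30-second constant and the ReportService used here are illustrative; ReportService is sketched after the next paragraph):

import java.io.Serializable;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeUnit;

import org.springframework.http.CacheControl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Value stored in Hazelcast: when it was computed, plus the payload itself.
class CachedObject implements Serializable {

    private final Instant insertionTime;
    private final String payload;

    CachedObject(Instant insertionTime, String payload) {
        this.insertionTime = insertionTime;
        this.payload = payload;
    }

    Instant getInsertionTime() { return insertionTime; }
    String getPayload() { return payload; }
}

@RestController
class ReportController {

    private static final long VALIDITY_SECONDS = 30; // matches the scheduler interval

    private final ReportService reportService; // sketched after the next paragraph

    ReportController(ReportService reportService) {
        this.reportService = reportService;
    }

    @GetMapping("/api/report")
    public ResponseEntity<String> report() {
        CachedObject cached = reportService.expensiveReport();

        // Seconds left until the scheduler wipes the cache again.
        long age = Duration.between(cached.getInsertionTime(), Instant.now()).getSeconds();
        long secondsLeft = Math.max(0, VALIDITY_SECONDS - age);

        // Within that window the browser reuses the response without asking the server;
        // once it expires, the ETag added by ShallowEtagHeaderFilter takes over.
        return ResponseEntity.ok()
                .cacheControl(CacheControl.maxAge(secondsLeft, TimeUnit.SECONDS).cachePrivate())
                .body(cached.getPayload());
    }
}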

There is something very subtle that makes this work, and I want to direct your attention to it. I mentioned it before, but I want to add some specifics; take a look at the sketch below. I used annotations to implement the caching, and instead of having the scheduler call the recalculation method, I simply evict all entries. The next recalculation happens when a user asks for the data, so the response to that first request will be slower than the rest. The danger here is that many users may trigger the recalculation at once, but I found that it is generally still worth it: 10 slow requests versus 100 fast and cheap ones.
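
A sketch of that arrangement, built on the Spring cache abstraction and the Hazelcast-backed CacheManager from the configuration above (the demo works against Hazelcast's ICache directly, but the idea is the same; the report computation is a placeholder):

import java.time.Instant;

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
class ReportService {

    // The first request after an eviction pays the full price of the computation;
    // every request after that gets the CachedObject straight from the cache.
    @Cacheable("expensiveReport")
    public CachedObject expensiveReport() {
        String payload = computeExpensiveStatistic(); // stands in for the heavy DB + CPU work
        return new CachedObject(Instant.now(), payload);
    }

    private String computeExpensiveStatistic() {
        return "{\"result\": 42}"; // placeholder
    }
}

@Service
class CacheInvalidationScheduler {

    // Every 30 seconds drop everything; no recalculation happens here,
    // the next incoming request triggers it through @Cacheable above.
    @Scheduled(fixedRate = 30_000)
    @CacheEvict(cacheNames = "expensiveReport", allEntries = true)
    public void evictExpensiveReport() {
        // Intentionally empty: the annotations do all the work.
    }
}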

Of course, if there is no way to determine when the response may change (and the user always has to have fresh data), the ETag is all we've got. In that case the desired behaviour is: fetch the value from the cache and pass it in the ResponseEntity, let the ShallowEtagHeaderFilter add all the necessary headers, and set max-age to zero.
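
And a minimal sketch of that last variant, reusing the hypothetical ReportService from above (again, the endpoint is illustrative, not the demo's API):

import java.util.concurrent.TimeUnit;

import org.springframework.http.CacheControl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class FreshReportController {

    private final ReportService reportService;

    FreshReportController(ReportService reportService) {
        this.reportService = reportService;
    }

    @GetMapping("/api/report/fresh")
    public ResponseEntity<String> freshReport() {
        // max-age=0 forces revalidation on every request; the body still comes out of the
        // cache, and ShallowEtagHeaderFilter turns unchanged bodies into 304 responses.
        return ResponseEntity.ok()
                .cacheControl(CacheControl.maxAge(0, TimeUnit.SECONDS).cachePrivate().mustRevalidate())
                .body(reportService.expensiveReport().getPayload());
    }
}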

The demo repository contains a controller with all three methods, with comments, so you can see them side by side.

That is it for today. What I didn't mention in this tutorial I definitely expressed in the code, so I welcome you to take a look. Link below:

https://github.com/kwaspl/blog_posts/tree/master/http%20cache%20demo
