Optimizing CPU Usage with Java Couchbase Client

Onur Yılmaz
Published in Trendyol Tech
3 min read, Jul 2, 2020

At Trendyol, we use a microservice architecture. The claim service is one of these microservices; it is responsible for the claim operations of the order management process. The service is written in Java with Spring Boot, Java's most popular framework. Spring Boot is a great framework with many features that let you write code without worrying about dependency injection or other configuration, and it makes applications easy to configure. It brings a lot of convenience, but it also brings some problems that are difficult to detect.

The Problem

In this article, I will briefly explain a problem we encountered that can, seemingly at random, cause a CPU leak. It is a difficult problem to solve because the cause is very hard to detect. The CPU usage of one of the application pods increased significantly within two days. We monitored the application's behavior for a month: CPU consumption stayed within normal levels at first, then increased gradually. When we profiled this pod with the New Relic thread profiler, which lets us profile running threads, we saw that a lot of threads were hanging in the RUNNABLE state. Among them were Java threads and a small number of Netty threads.
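As a rough illustration of what such a thread profile summarizes, the sketch below counts RUNNABLE threads per thread pool inside the current JVM using the standard `ThreadMXBean` API. It is not the New Relic profiler and not the code we ran; the class name `RunnableThreadSummary` and the helper `poolName` are hypothetical, and grouping by thread-name prefix is an assumption about how the pools are named.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes RUNNABLE threads of the current JVM by pool name.
final class RunnableThreadSummary {

    // Strips trailing "-<number>" style suffixes so threads of one pool share a key,
    // e.g. "nioEventLoopGroup-2-7" becomes "nioEventLoopGroup".
    static String poolName(String threadName) {
        return threadName.replaceAll("(?:[-_ #]+\\d+)+$", "");
    }

    // Returns the number of RUNNABLE threads per pool name, sorted by name.
    static Map<String, Integer> countRunnableByPool() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = bean.getThreadInfo(bean.getAllThreadIds());
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            if (info.getThreadState() == Thread.State.RUNNABLE) {
                counts.merge(poolName(info.getThreadName()), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countRunnableByPool().forEach((pool, count) ->
                System.out.println(pool + ": " + count + " runnable"));
    }
}
```

Sampling this periodically and watching for a pool whose RUNNABLE count keeps growing is one way to narrow down which library's threads are involved before reaching for a full profiler.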

cpu leaks

Accordingly, we started looking for the clients that used the Netty package. We eventually found that both the Couchbase client and the Redisson client use Netty in the background.

The application depended on two different storage solutions: Couchbase and Redis. In the service, we used Couchbase through the Spring Boot Couchbase template. The template brought in a lot of reflection-based processing that we would never use. We used Redis for distributed locking, and the Redisson client gave us an easy implementation through its lock method.

Redis Client Modification

We switched the Redis client from Redisson to Jedis and then observed the application's behavior for a few days. Unfortunately, this changed nothing: one of the pods was consuming a lot of CPU again. So we went back to the Redisson client and upgraded it. After two days of monitoring, we found that the CPU leak continued, but this time the number of hung Netty threads had decreased compared to before.

before redisson version update
after redisson version update

Couchbase Client Modification

Given this improvement, we decided to change the Couchbase client as well, since it uses the Netty NIO transport. After some research, we found a Netty NIO bug, https://github.com/netty/netty/issues/327, and focused on this issue. We also identified the Spring Boot Couchbase client as the source of our CPU leak, so we decided to switch to the native Couchbase Java client.

The native client uses Netty's epoll event loop group, a native Linux transport that generates less garbage and performs better than NIO.

We paired with Erdem Erbas to implement the native Couchbase client. After several days of monitoring, we could congratulate ourselves: the CPU leak was solved. Because the client uses the latest version of the Netty package, we also saw CPU usage decrease by 60% across all pods.

after couchbase native client implementation

Conclusion

Version 2.1.8 of the Spring Boot Couchbase client did not work for us. Perhaps we failed to implement it correctly, but the native client gave us better CPU consumption.

If you would like to research and fix these kinds of scaling challenges with us at Trendyol, #cometotrendyol
