Micro optimizations in Java. String.equalsIgnoreCase()

Dmytro Dumanskiy
Published in Javarevisited
4 min read · Aug 6, 2020

In the previous post, we considered two approaches to improving the performance of the well-known String.equals() method. Now we’ll look at two other String class methods that you use every day.

(Please consider all the code below from the point of view of performance.)

(Please do not focus on the numbers; they are just examples to prove the point.)

equalsIgnoreCase

As usual, we’ll start with very simple, basic code. This code is pretty common in web servers that expose an HTTP API and expect some kind of parameter:

if ("hello world".equals(param.toLowerCase())) {}

What is wrong here?

It’s pretty obvious. toLowerCase() allocates a new String instance whenever it has to change at least one character. The GC then has to collect that instance, and keeping the GC busy is always a bad idea for performance.

We can easily replace it with String.equalsIgnoreCase():

if ("hello world".equalsIgnoreCase(param)) {}
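As a minimal sketch of the two patterns side by side (the class and method names are hypothetical, made up for illustration), both checks accept the same plain ASCII inputs, but only the first one may allocate a lowercased copy of the parameter:

```java
// Hypothetical helper class, for illustration only.
final class GreetingCheck {
    private static final String EXPECTED = "hello world";

    // Creates a temporary lowercased String whenever param
    // contains at least one upper-case character.
    static boolean viaToLowerCase(String param) {
        return EXPECTED.equals(param.toLowerCase());
    }

    // Compares the characters in place, without an intermediate String.
    static boolean viaEqualsIgnoreCase(String param) {
        return EXPECTED.equalsIgnoreCase(param);
    }

    public static void main(String[] args) {
        String[] params = {"hello world", "Hello World", "HELLO WORLD", "bye"};
        for (String param : params) {
            System.out.println(param + " -> "
                    + viaToLowerCase(param) + " / " + viaEqualsIgnoreCase(param));
        }
    }
}
```

A side effect worth knowing: equalsIgnoreCase(null) simply returns false, while param.toLowerCase() throws a NullPointerException when param is null.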

Let’s create a benchmark and check whether it’s actually faster (I’ll add more drama by using one of the parameters in the upper case):

Note: I added the equals() method so we can use it as a baseline. I also tried Apache Commons Lang 3 StringUtils.equalsIgnoreCase(), but it was slower than the Java version, so I excluded it for simplicity.

Results (lower score means faster):

So what do we see?

equalsIgnoreCase() is 20–50% faster than the equals(param.toLowerCase()) pattern, and 25 times faster when the parameter doesn’t match the constant string. Let’s take a look at String.equalsIgnoreCase() to understand why the difference is so huge for the non-matching parameters:

Aha, that’s because equalsIgnoreCase() has a string length check, and in that particular test the lengths are different. So no char matching actually happens.
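The shape of that length check can be sketched with public API only. This is a simplified rewrite for illustration, not the JDK source, and the class and method names are hypothetical:

```java
// Hypothetical sketch of the fast path; not the actual JDK implementation.
final class EqualsIgnoreCaseSketch {

    // Assumes self is non-null, as it would be for an instance method.
    static boolean equalsIgnoreCase(String self, String other) {
        if (self == other) {
            return true; // same instance, nothing to compare
        }
        return other != null
                // Fast path: strings of different length can never match,
                // so no character is ever looked at.
                && other.length() == self.length()
                // Only now compare the characters, ignoring case.
                && self.regionMatches(true, 0, other, 0, self.length());
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCase("hello world", "HELLO WORLD"));
        System.out.println(equalsIgnoreCase("hello world", "hello"));
    }
}
```

Because && short-circuits, the regionMatches() call is skipped entirely when the lengths differ.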

Another observation: the more upper-case letters the string has, the slower the results. That’s because toLowerCase() has to convert every upper-case char, and equalsIgnoreCase() falls back to Character.toUpperCase() and Character.toLowerCase() for each pair of chars that doesn’t match exactly.
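A rough sketch of such a per-character loop is shown here. It assumes the upper-case-then-lower-case comparison order used by the JDK's case-insensitive region matching. The names are hypothetical, and the real implementation differs in detail:

```java
// Hypothetical sketch of a case-insensitive char loop; not the JDK source.
final class CharLoopSketch {

    // Assumes a and b are non-null and have the same length.
    static boolean sameIgnoringCase(String a, String b) {
        for (int i = 0; i < a.length(); i++) {
            char c1 = a.charAt(i);
            char c2 = b.charAt(i);
            if (c1 == c2) {
                continue; // cheap path: identical chars need no conversion
            }
            // Slow path: taken only for chars that differ, for example
            // an upper-case letter compared against a lower-case one.
            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) {
                continue;
            }
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        return true;
    }
}
```

With an all-lower-case parameter the loop never leaves the cheap path, which is in line with the upper-case inputs being the slow ones.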

And finally, the equals() method outperforms equalsIgnoreCase() by 2.5 times.

“Okay, but Java 8 is a bit outdated these days”, you might say. I agree, so let’s repeat the test with Java 11.

Wait a minute… That’s not what we expected. It looks like in Java 11 the performance of both options is similar. And with this specific input data, equals(param.toLowerCase()) is even a bit faster!

How is that possible? If you look at the numbers carefully, you’ll find that the performance of equalsIgnoreCase() is very close in both tests. What actually changed is the execution time of equals(param.toLowerCase()). So either the equals() or the toLowerCase() method was improved in Java 11.

(I did a small investigation. It looks like the main difference in performance between Java 8 and Java 11 comes from the toLowerCase() method: roughly 70% of it was refactored.)

Here are some conclusions from the above results:

  • Use Java 11 to get free improvements without any code changes
  • Go with equals() when possible. For example, if you control the data or the server API, you can do toLowerCase() on the UI, or you can require the API parameters to be case sensitive. We do this a lot at Blynk
  • When you don’t control the input, choose equalsIgnoreCase() over equals(param.toLowerCase()). It doesn’t allocate unnecessary objects, it has a fast path when the string lengths don’t match, and it’s still faster on Java 8

Fortunately, this time Spring and Hibernate did much better. I didn’t find any equals(param.toLowerCase()) patterns. It looks like they already fixed that.

startsWith

Let’s look at another use case:

if (url.toLowerCase().startsWith("http")) {}

or

if (url.substring(0, 4).equalsIgnoreCase("http")) {}

Again, this is valid, commonly used code, and it’s pretty clear what it’s doing. But we can do better. Let’s rewrite it with the lesser-known method String.regionMatches() that we saw inside String.equalsIgnoreCase():

if (url.regionMatches(true, 0, "http", 0, "http".length())) {
}
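For readers who haven't used it, here is a small sketch that labels the five arguments of String.regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len). The wrapper class and its method names are hypothetical:

```java
// Hypothetical demo class; only regionMatches() itself is a JDK method.
final class RegionMatchesDemo {

    static boolean hasHttpPrefix(String url) {
        String prefix = "http";
        return url.regionMatches(
                true,             // ignoreCase: compare case-insensitively
                0,                // toffset: where the region starts in url
                prefix,           // other: the string to compare against
                0,                // ooffset: where the region starts in prefix
                prefix.length()); // len: number of chars to compare
    }

    // The same method with a non-zero offset and a case-sensitive comparison:
    // does "://" appear in url starting at the given position?
    static boolean hasSchemeSeparatorAt(String url, int offset) {
        return url.regionMatches(false, offset, "://", 0, 3);
    }

    public static void main(String[] args) {
        System.out.println(hasHttpPrefix("HTTP://example.com"));
        System.out.println(hasSchemeSeparatorAt("http://example.com", 4));
    }
}
```

When the requested region doesn't fit into the string (for example, a url shorter than the prefix), regionMatches() returns false instead of throwing an exception.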

Now, let’s prove it’s actually faster:

Note: I also tried Apache Commons Lang 3 StringUtils.startsWithIgnoreCase() method, but it was slower than the Java version. So I excluded it for simplicity.

Results (lower score means faster):

So, in all cases String.regionMatches() wins, by 50–400% over the closest competitor.

The only downside of regionMatches() is that it’s not very self-descriptive: it has 5 parameters, and it’s hard to see what the call does if you aren’t used to it. However, you can wrap it in a utility method like this:

static boolean startsWithIgnoreCase(String url, String param) {
    return url.regionMatches(true, 0, param, 0, param.length());
}
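To see how the utility method behaves next to the two original patterns, here is a minimal sketch. The PrefixChecks class and the viaToLowerCase/viaSubstring names are hypothetical. Note the short-input case, where the substring variant fails:

```java
// Hypothetical comparison of the three patterns discussed above.
final class PrefixChecks {

    static boolean startsWithIgnoreCase(String url, String param) {
        return url.regionMatches(true, 0, param, 0, param.length());
    }

    // Lowercases the whole url just to look at its first four chars.
    static boolean viaToLowerCase(String url) {
        return url.toLowerCase().startsWith("http");
    }

    // Allocates a substring and throws if url has fewer than four chars.
    static boolean viaSubstring(String url) {
        return url.substring(0, 4).equalsIgnoreCase("http");
    }

    public static void main(String[] args) {
        String url = "HTTPS://example.com";
        System.out.println(startsWithIgnoreCase(url, "http"));
        System.out.println(viaToLowerCase(url));
        System.out.println(viaSubstring(url));

        // A url shorter than the prefix: regionMatches() just returns false...
        System.out.println(startsWithIgnoreCase("ftp", "http"));
        // ...while substring(0, 4) throws StringIndexOutOfBoundsException.
        try {
            viaSubstring("ftp");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring failed on short input");
        }
    }
}
```

So besides avoiding the extra allocation, the regionMatches() version also needs no separate length check before it is called.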

And here again, Spring and Hibernate did great: only one or two similar code constructions. But there are other sources of code.

Let’s search for similar code on GitHub:

A lot of slow code out there! It could have been worse.

Have you checked your code already?

Conclusion

  • Use Java 11
  • Prefer String.equals() over String.equalsIgnoreCase() whenever you can
  • If you can’t use String.equals(), use String.equalsIgnoreCase() instead of the equals(toLowerCase()) combination
  • Always choose String.regionMatches() over alternatives like toLowerCase().startsWith() (third-party library methods included)

Here is the source code of the benchmarks, so you can try it yourself.

Thank you for your attention and stay tuned.

The next post: Micro optimizations in Java. Good, nice and slow Enum
