Comparing the Performance of Empty String Checks in Mendix

Jan 18, 2024

In my previous article about empty Strings, I proposed that the expression trim($String) != '' is best practice for both readability and performance reasons. However, after I posted the article on LinkedIn, it got a lot of traction, specifically on this performance claim. Hence, in this article, we will use two experiments to find out the performance differences between the following four common empty-string-check expressions, and settle the dispute once and for all!

trim($Object/StringAttribute) != ''                // Expression 1

length(trim($Object/StringAttribute)) > 0          // Expression 2

$Object/StringAttribute != empty
and trim($Object/StringAttribute) != ''            // Expression 3

$Object/StringAttribute != empty
and length(trim($Object/StringAttribute)) > 0      // Expression 4

Experiment 1

Our methodology for the first experiment was as follows. In a project in Mendix 10.6.1, we built a Microflow with a while loop that ran as long as an $Iterator variable stayed at or below a $MaxIterations of 1 million. We tracked the start time of the loop, and at the end of the loop, we calculated the delta time in milliseconds. The Microflow was duplicated so that we had four variants, each added to its own button on a page. Each of the four Microflows contained one of the expressions (1-4) listed at the top of the article. The only difference between the flows was the expression, ensuring that any time differences were the result of the expression itself.

Screenshot of part 1/2 of the Microflow testing expression 3

In the while loop, we started with a Create-Variable (enum) activity that calculated $Iterator modulo 5 and assigned the result to an enum value: 1/2/3/4/0. Next, a Decision Switch on the just-created $Modulo enum determined which String value would be created: 1) ' trim me ', 2) empty, 3) '', 4) ' ', 0) 'trimme'. The values were deliberately chosen to include whitespace-padded, empty, and non-whitespace values. As long as $MaxIterations modulo 5 equals 0, each test value occurs equally often, so speed differences due to shorter/longer switch paths should be negated, on average. Next in the loop, a Create-Variable (Boolean) activity ran the Microflow's expression (one of 1-4). Finally, the $Iterator was incremented.

After the loop, we calculated the delta time it took to run the iterations using the function millisecondsBetween($StartTime, [%CurrentDateTime%]). Next, we showed a message that included this value, and finally, we committed the floored result to the database.

Screenshot of part 2/2 of the Microflow testing expression 3
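
For readers who do not have Mendix at hand, the overall structure of each test Microflow corresponds roughly to the plain-Java sketch below. This is illustrative only: the real Microflow uses an enum plus a Decision Switch, millisecondsBetween(), and a database commit instead of a println; Expression 3 is shown to match the screenshots.

public class Experiment1Sketch {

    // Mirrors the $Modulo enum + Decision Switch: five test values that
    // each occur equally often when $MaxIterations is a multiple of 5.
    static String testValue(long iterator) {
        switch ((int) (iterator % 5)) {
            case 1:  return " trim me ";
            case 2:  return null;        // Mendix 'empty'
            case 3:  return "";
            case 4:  return " ";
            default: return "trimme";    // case 0
        }
    }

    public static void main(String[] args) {
        final long maxIterations = 1_000_000L;       // $MaxIterations
        long startMs = System.currentTimeMillis();   // $StartTime

        long notEmptyCount = 0; // sink so the work cannot be optimized away
        for (long iterator = 1; iterator <= maxIterations; iterator++) {
            String value = testValue(iterator);
            // Expression 3: $Object/StringAttribute != empty
            //               and trim($Object/StringAttribute) != ''
            boolean notEmpty = value != null && !value.trim().equals("");
            if (notEmpty) notEmptyCount++;
        }

        // Analogue of millisecondsBetween($StartTime, [%CurrentDateTime%])
        long deltaMs = System.currentTimeMillis() - startMs;
        System.out.println(deltaMs + " ms, " + notEmptyCount + " non-empty values");
    }
}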

All four Microflows were run 50 times, and the average time it took to run each was calculated. We deliberately chose to mimic form hand-in behavior by clicking a button for each run, thereby hoping to average out the interference of background OS processes. All tests were run in rotation (one run for expression 1, one for expression 2, etc.) after a fresh system start-up, and all non-Mendix programs were closed. Seeing as the differences between the Microflows boiled down to just the expressions, we should be able to determine which is fastest, although we recognize that OS background processes might still interfere. Hence, we ran the same test on four different devices, with four different developers. Finally, we used a one-way ANOVA in Excel to test whether the differences in calculation time (ms) between the expressions (1-4) were significant.
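
For those who want to verify the statistics outside Excel: a one-way ANOVA boils down to comparing the variance between the expression groups with the variance within them. A minimal sketch, assuming each group is simply an array of per-run timings (the numbers in the example are made up):

public class AnovaSketch {

    // One-way ANOVA F statistic for k groups of measurements.
    static double oneWayAnovaF(double[][] groups) {
        int k = groups.length;
        int n = 0;
        double grandSum = 0;
        for (double[] g : groups) {
            n += g.length;
            for (double v : g) grandSum += v;
        }
        double grandMean = grandSum / n;

        double ssBetween = 0; // spread of group means around the grand mean
        double ssWithin = 0;  // spread of measurements around their group mean
        for (double[] g : groups) {
            double sum = 0;
            for (double v : g) sum += v;
            double mean = sum / g.length;
            ssBetween += g.length * (mean - grandMean) * (mean - grandMean);
            for (double v : g) ssWithin += (v - mean) * (v - mean);
        }
        // A large F (and thus a small p) means the average timings differ
        // more between expressions than random noise would explain.
        return (ssBetween / (k - 1)) / (ssWithin / (n - k));
    }

    public static void main(String[] args) {
        double[][] timingsMs = {  // made-up per-run timings for expressions 1-4
            {9500, 9540, 9510},
            {9700, 9680, 9720},
            {9650, 9660, 9640},
            {9900, 9880, 9910},
        };
        System.out.println("F = " + oneWayAnovaF(timingsMs));
    }
}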

The results of Experiment 1 were as follows. It should be mentioned that, due to computer issues, Expression 4 received fewer runs from one user, as is visible below.

Table 1, showing descriptive statistics for each of the four expressions for Experiment 1

On average, it turned out that expression 1 was significantly faster than the others (p = 0.00), although the observed average differences were small. Interestingly, and contrary to expectations, expression 3 seemed to outperform expression 2 on average. A more in-depth statistical analysis was out of scope for this article.

A Conundrum

Our hypothesis had been that expression 1 should be faster than 2, 2 faster than 3, and 3 faster than 4; we expected the extra empty checks to add processing time: more operations = slower code. As the results from Experiment 1 were not completely in line with these expectations, we began to question the methodology. I decided it was time to call in help from Marien Krouwel, who holds a doctorate in IT and has a lot more experience with these kinds of experiments.

The main feedback he had on the set-up of our experiment was that 1) it measured much more than just the expressions: the creation of the modulo enum and the loop itself would probably have an impact, mixing the fine-grained detail of the results with other data; and 2) it didn't test larger Strings, whose length is known to affect processing time. He proposed a different setup, which led to Experiment 2!

Experiment 2

For Experiment 2, we decided to measure only the time it takes to execute the Change-Variable activity containing expressions 1-4. However, it initially turned out that the function millisecondsBetween() did not allow for enough resolution, as we would always get 1 ms as a result. Hence, we used 10 successive Change-Variable activities, but the function still did not provide fine-grained results. Consequently, we turned to a Java action to help us out.

Screenshot of Java Action JA_GetNanoTime

This action returns the system time in nanoseconds (ns) and is intended for comparing two sequential measurements. However, it is limited in the sense that no guarantees are given that it returns an actually updated value when queried in very quick succession; we observed a rough 15 ns margin of error.
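
The action itself is a thin wrapper around Java's System.nanoTime(). A minimal sketch of what the action class could look like, assuming Mendix's usual generated Java-action boilerplate and a placeholder module name:

package myfirstmodule.actions;  // placeholder module name

import com.mendix.systemwideinterfaces.core.IContext;
import com.mendix.webui.CustomJavaAction;

// Returns the JVM's high-resolution timer value in nanoseconds. Only the
// difference between two calls is meaningful; the absolute value has no
// relation to wall-clock time.
public class JA_GetNanoTime extends CustomJavaAction<java.lang.Long> {

    public JA_GetNanoTime(IContext context) {
        super(context);
    }

    @Override
    public java.lang.Long executeAction() throws Exception {
        // BEGIN USER CODE
        return System.nanoTime();
        // END USER CODE
    }
}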

It turned out that this gave enough resolution, as we would now get results above 100 ns per iteration. In the following while loop, you can see how we measured time and executed the expression evaluations.

Inner loop in which we take a StartTime (ns), execute 10 Change-Variable activities using Expression 1, and measure the EndTime (ns).

These while loops were run 10,000 times, and each time the difference between start and end times was added to a tally for the expression. This was done 100 times for each combination of expression (1-4) and test String, leading to 100 x 4 datapoints per String, where each datapoint represents 10,000 timed runs of the expression.
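
Stripped of Mendix specifics, one datapoint of this measurement loop amounts to the plain-Java sketch below. The inner evaluation stands in for the 10 Change-Variable activities, and the test String is just one of the 14 values (names are illustrative):

public class Experiment2Sketch {
    public static void main(String[] args) {
        String testString = " trim me ";  // one of the 14 test Strings
        long tallyNs = 0;
        long hits = 0; // sink so the JIT cannot eliminate the evaluations

        // One datapoint: the tallied duration of 10,000 timed iterations,
        // mirroring the Microflow's while loop.
        for (int i = 0; i < 10_000; i++) {
            long start = System.nanoTime();   // JA_GetNanoTime
            for (int j = 0; j < 10; j++) {    // 10 Change-Variable activities
                // Expression 1: trim($Object/StringAttribute) != ''
                if (!testString.trim().equals("")) hits++;
            }
            long end = System.nanoTime();     // JA_GetNanoTime
            tallyNs += end - start;
        }

        System.out.println("Datapoint: " + tallyNs + " ns (" + hits + " true results)");
    }
}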

A second limitation of Experiment 1 was that no long Strings were included in the test. Thus, for Experiment 2, longer random Strings were added, making up a total of 14 test Strings. In all, this meant we tested 14 (Strings) x 100 x 10,000 = 14 million timed runs of each of the four expressions, and with 10 Change-Variable activities per run, 140 million calculations each.

The results of Experiment 2 were as follows.

Table 2, showing descriptive statistics for each of the four expressions for Experiment 2

On average, it turned out that expression 1 was significantly faster than the others (F = 940.8, p = 0.00). Moreover, expression 2 was slower than 1, expression 3 slower than 2, and expression 4 slower than 3. In all, the more operations an expression used, the slower it was, which was in line with expectations. Nevertheless, the observed differences were very small; note that the average processing times listed here are for 10,000 runs of an expression. A more in-depth statistical analysis was out of scope for this article.

In conclusion, it seems clear that the performance differences between the empty checks in both experiments are very small. A clear limitation of these experiments was the use of System.nanoTime() and the uncontrollable influence of background OS processes, which may have led to unreliable results. However, this is not a scientific paper; it is just a fun way to look at performance within Mendix ;-)

That being said, for readability's sake, there is clearly one expression that is superior:

trim($Object/StringAttribute) != ''            // Expression 1 - Winner!

Hence, let it be settled now, once and for all! ;-)

Want to try out the experiment yourself? You can download the projects here.

Special thanks to Lennart Spaans, Dominique Munten, and Bartosz Hetmanski for helping gather data for Experiment 1 and for engaging in an interesting conversation about performance on LinkedIn. Special thanks to Marien Krouwel for setting up and optimizing Experiment 2, critically reviewing, and thinking along!

Do you want to make your validations even more efficient? Check out my other articles below ;-)

#About: Wouter Penris
Wouter is a (lead) Mendix Software Engineer (Expert/MVP/Trainer) with a background in education (English & Music) and in music (Jazz Vocals & Piano); a multi-potentialite. Whereas as a teacher he spent years perfecting the art of instructing his students as simply and understandably as possible, he now immensely enjoys using that same skill to build high-tech products with an intuitive and easily understandable UX. He is a highly creative problem solver who blends research, technology, and aesthetics into actual business value and advice. Wouter is a communicator at heart and especially enjoys working with international clients due to his background in English. Wouter currently works as Senior Principal Expert in Mendix at Ordina.
