The unexpected complexity of simple software

Tigran Saluev · JoomTech · 8 min read · Jul 15, 2022

Again and again I encounter surprise when announcing an estimate of a project's complexity: "Why does it take so long?", "But it's just three easy steps!", "But you can just take X and combine it with Y". Programmers are used to estimating deadlines as the time to write and debug code, but there is a lot more to implementing a big feature.

[Figure: the iceberg of software development stages: implementation of a feature, writing tests, configuring metrics and alerts, debugging, refinement of requirements, API review and approval, fixes after QA, backward compatibility, validation with an A/B test, hotfixes, analytics.]
Did you know that real icebergs float horizontally in the water, not vertically as in most stock images?

Even if we set aside the traditional bundle of enterprise concerns like analytics, backward-compatibility support, and A/B testing, and focus purely on the code directly related to the implemented functionality, we can see that its complexity often grows out of control.

In this article, I'm going to talk about several features that my colleagues and I have implemented at Joom at different times, from problem definition to implementation details, and show how easily seemingly simple things turn into a tangle of endlessly complex logic requiring many iterations of development.

Profile search

One of the big sections of the Joom app is an internal social network where customers can write reviews of products, like and discuss them, and subscribe to each other. And what kind of a social network would it be without profile search!

Of course, search as a feature is not as easy as it may seem. But I already had all the necessary knowledge, and we had a ready-made joom-mongo-connector component that could transfer data from a MongoDB collection to an Elasticsearch index, enriching it with additional data and doing some post-processing along the way. The task sounded pretty simple.

Task. Implement an API for searching by social network profiles. No search filters — sorting by the number of followers will do for a start.

Okay, that sounds easy enough. We set up a transfer from the socialUsers collection to Elasticsearch by writing a YAML config. On the backend, we add a new API endpoint similar to the product search API, but without filter or sorting support for now (just query text and pagination). In the handler we make a simple request to the Elasticsearch cluster (the main thing is not to mix up the clusters!), then take the IDs of the found documents, which are also user IDs, and convert them to client JSON, hiding private information from prying eyes. That's it, or is it?
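In Python-flavored pseudocode, the whole handler fits in a dozen lines. Here's a minimal sketch, assuming an Elasticsearch index named social-profiles with name and numFollowers fields; the index name, field names, and client setup are all invented for illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # the profiles cluster, not the products one!

def search_profiles(query: str, offset: int = 0, limit: int = 20) -> list[str]:
    resp = es.search(
        index="social-profiles",
        query={"match": {"name": query}},
        sort=[{"numFollowers": "desc"}],  # no filters: followers count will do
        from_=offset,
        size=limit,
    )
    # Document IDs double as user IDs; the caller converts them to client
    # JSON, hiding private fields.
    return [hit["_id"] for hit in resp["hits"]["hits"]]
```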

The first problem we faced was transliteration. The user names were taken from social networks, where users from Eastern Europe (they were the majority at the time) often wrote them in Latin letters while their languages used Cyrillic. You may try to find “Юля”, but she’s “Julia” on Facebook, so she’s not in the search results. Similarly, you cannot find “Иван” by “Ivan”, though you’d really like to!

Here comes the first complication: during indexing, we would call the Microsoft Translator API for transliteration and store two versions of first and last names. As an unpleasant side effect, the common indexing component became dependent on the transliterator client (and still is).
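Schematically, the indexing step now stores both spellings. A sketch, with transliterate standing in for a call to the Microsoft Translator API and the field names invented:

```python
def build_profile_doc(user: dict, transliterate) -> dict:
    """Build the search index document for one social profile."""
    first, last = user["firstName"], user["lastName"]
    return {
        "firstName": [first, transliterate(first)],  # e.g. ["Юля", "Yulya"]
        "lastName": [last, transliterate(last)],
        "numFollowers": user["numFollowers"],
    }
```

Elasticsearch indexes an array of strings into a single field, so both spellings become searchable at once.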

The second problem, easy to anticipate if your native language is Russian (though it exists in other European languages as well), is diminutive forms and abbreviations of names. If "Ivan" decides to call himself "Vanya" on Facebook, you won't find him by querying "Ivan", no matter how much you transliterate it.

So the next complication: we found an index of diminutive names, added it to the code base as a hardcoded table (a mere two thousand lines), and began to index not only the names and their transliterations, but also all the diminutive forms we obtained (fun fact: in English these are officially called "hypocorisms"). We took every word in the username and looked it up in our humble table.

[Official screenshot of the Joom codebase, circa 2018.]
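In code, the lookup was about this simple; the table below is a toy (the real one was about two thousand lines):

```python
HYPOCORISMS = {
    "ivan": ["vanya", "vanechka"],
    "yulia": ["yulya"],
    "william": ["bill", "will", "billy"],
    # ...roughly two thousand more lines in the real table
}

def name_variants(word: str) -> list[str]:
    """All spellings to index for a single word of the username."""
    return [word] + HYPOCORISMS.get(word.lower(), [])
```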

Then we put the word out to Joom’s regional managers and asked them to find us reference books of national name abbreviations in their countries of operation. If not academic, then at least whatever they could provide. It turned out that in some languages, in addition to the tradition of having a compound name (“Juan Carlos”, “Maria Aurora”), there are also abbreviations of two, three or even four words into one (“María de las Nieves” → “Marinieves”).

This new fact made it impossible for us to make a one-word lookup. Now we had to break a sequence of words into fragments of arbitrary length, and what’s more, different breaks can lead to different abbreviations! We didn’t want to delve into the depths of linguistics and write an AI that would abbreviate a Spanish name the way it’d be abbreviated by a real Spaniard, so we sketched out (I’m so sorry, Dr. Knuth) a combinatorial search.

And, as always happens with combinatorial search, it exploded on one of the users and we had to urgently add a limit on the maximum number of generated spelling variants. This further complicated the code (which was already surprisingly complex for this kind of task).
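Here is roughly what that search looks like, complete with the cap we added after the explosion. The abbreviation table, the cap value, and the function shape are all illustrative:

```python
ABBREVIATIONS = {
    ("maria", "de", "las", "nieves"): ["marinieves"],
    ("juan", "carlos"): ["juanca"],
}
MAX_VARIANTS = 100  # the limit we urgently added after the search blew up

def expand(words: tuple[str, ...]) -> list[str]:
    """Enumerate spellings of a name by trying every break into fragments."""
    if not words:
        return [""]
    variants: list[str] = []
    for cut in range(1, len(words) + 1):
        head, tail = words[:cut], words[cut:]
        # A longer prefix is only interesting if it has a known contraction;
        # the plain spelling needs to be emitted just once, for cut == 1.
        heads = list(ABBREVIATIONS.get(head, []))
        if cut == 1:
            heads.insert(0, head[0])
        for h in heads:
            for t in expand(tail):
                variants.append(f"{h} {t}".strip())
                if len(variants) >= MAX_VARIANTS:
                    return variants
    return variants
```

For example, expand(("maria", "de", "las", "nieves")) yields both "maria de las nieves" and "marinieves".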

Machine translation of products

Task. Translate product names and descriptions, provided by sellers in English, into the user's language.

You might have seen memes about the bizarre translation of Chinese product names. We’ve seen them too, but the desired time to market didn’t let us come up with anything better than using some existing API for translation.

It is easy to write an HTTP client, create an account, and translate a product into the required language when it's displayed to a user. But translations aren't cheap, and it would be wasteful to translate the same popular product into French on each of tens of thousands of views. So we implemented caching: for each product we saved translations to the database, and when a translation was already available there, we didn't call the translator.
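A sketch of this first cache, with db and translate_api as hypothetical stand-ins for the real storage and HTTP client:

```python
def translate_product(product_id: str, text: str, lang: str, db, translate_api) -> str:
    cached = db.translations.find_one({"productId": product_id, "lang": lang})
    if cached is not None:
        return cached["text"]  # someone already paid for this translation
    translated = translate_api(text, target=lang)  # the expensive call
    db.translations.insert_one(
        {"productId": product_id, "lang": lang, "text": translated}
    )
    return translated
```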

But there was still room for improvement. We figured that a reasonable trade-off between translation quality and price would be to break the descriptions into sentences and cache them. After all, products often have the same patterned phrases, and constantly translating them is just wasteful. So we added another layer of abstraction to our translator component — a layer between the HTTP client and the cache (holding entire products in different languages) that breaks the text into fragments.
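Schematically, the new layer looks like this; the sentence splitting is grossly simplified, and sentence_cache is a hypothetical key-value store:

```python
import re

def translate_text(text: str, lang: str, sentence_cache: dict, translate_api) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text)
    out = []
    for s in sentences:
        key = (s, lang)
        if key not in sentence_cache:
            # Boilerplate phrases ("Free shipping!", care instructions, ...)
            # hit the cache and are only ever translated once.
            sentence_cache[key] = translate_api(s, target=lang)
        out.append(sentence_cache[key])
    return " ".join(out)
```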

After release, the quality of translations was, of course, a great concern. We thought: what if we used a more expensive translation API? Would it work better on our specific texts? You can't compare translators with the naked eye, so we had to run an A/B test. So we added the translation API name to our translation cache key, in addition to the product ID, and started choosing the translation API depending on which A/B test group the user was in.
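The cache key itself just grows one more component (a sketch; the names are made up):

```python
def cache_key(product_id: str, lang: str, translator: str) -> tuple[str, str, str]:
    # ("product123", "fr", "expensive") vs ("product123", "fr", "cheap"):
    # both A/B arms are cached side by side without collisions.
    return (product_id, lang, translator)
```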

The expensive translator performed well, but it was still too wasteful to use on all products. At some point, however, Joom launched in new countries whose national languages were handled so poorly by our primary translator that we were ready to spend more for a successful launch, and the logic for choosing a translation API became even more complicated.

Then we decided that some stores on the platform are so good, and the platform cares so much about their success, that it's OK to translate their products with the more expensive translator. So the choice of translator became dependent on the user, the country, and the store ID as well.
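By this point the translator choice looked roughly like this. Every language code, store ID, and group name below is invented for illustration:

```python
POORLY_HANDLED_LANGS = {"xx", "yy"}  # languages the cheap translator mangles
PREMIUM_STORES = {"store-42"}        # stores whose success we pay extra for

def choose_translator(ab_group: str, lang: str, store_id: str) -> str:
    if lang in POORLY_HANDLED_LANGS:
        return "expensive"  # needed for a successful launch in new countries
    if store_id in PREMIUM_STORES:
        return "expensive"  # the platform cares about these stores
    if ab_group == "expensive-translator-test":
        return "expensive"  # the ongoing quality experiment
    return "cheap"
```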

Finally, we decided that our primary translator might have improved over the few years of Joom's existence, so it could make sense to refresh the translation cache periodically. And how could we go without an A/B test here? To that end, our cache got a "freshness" field, and things got even more complicated once again. As a result, our translation component became incredibly complex, and that's despite the fact that we haven't enabled any homemade computational linguistics… Yet.
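A sketch of how such a freshness field could work. The generation-counter mechanics are my assumption for illustration, not a description of the actual code:

```python
CURRENT_FRESHNESS = 3  # bumped whenever we believe the translator has improved

def get_translation(key, text: str, lang: str, cache: dict, translate_api) -> str:
    entry = cache.get(key)
    if entry is not None and entry["freshness"] >= CURRENT_FRESHNESS:
        return entry["text"]  # fresh enough, reuse it
    translated = translate_api(text, target=lang)  # stale or missing: redo
    cache[key] = {"text": translated, "freshness": CURRENT_FRESHNESS}
    return translated
```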

Converting clothing sizes

Perhaps one of the most painful problems when buying clothes and shoes online is choosing the right size. When shipping from local warehouses, domestic businesses like Lamoda can simply deliver several sizes at once and just as easily take back what didn't fit, but it doesn't work that way with cross-border sales. Parcels take a long time to arrive, the cost of each extra kilogram is high, and senders don't expect a large flow of returns.

The problem is further complicated by the fact that sellers from different countries may have completely different ideas of size. A Chinese "M" can easily turn out to be a European "XS", and the horrendous-sounding "9XL" may not be all that different from an "XXL". Experienced users rely on measurements, but even those are not always correct. For example, the user expects to see a measurement of chest girth, but the seller lists measurements of the garment itself; these differ by 5–10%. We don't want users to work that hard to shop at Joom!

Task. Instead of sizes provided by sellers, show users the sizes calculated by us using some uniform conversion table based on measurements.

Okay. We take the size table, which is parsed from the product description (there's a dedicated rocket-science component of 5 KLOC to handle this) and stored in a separate field, and substitute the sizes in it with the ones we calculate. Then we hardcode a table converting girth to size (one we simply found on the Internet) and rejoice.
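A sketch of that lookup, with invented numbers standing in for the table we found on the Internet:

```python
CHEST_GIRTH_TO_SIZE = [  # (max chest girth in cm, size); numbers illustrative
    (86, "XS"), (94, "S"), (102, "M"), (110, "L"), (118, "XL"),
]

def girth_to_size(girth_cm: float) -> str | None:
    for max_girth, size in CHEST_GIRTH_TO_SIZE:
        if girth_cm <= max_girth:
            return size
    return None  # off the chart: turn the feature off for this product
```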

But if there is no size table in a product or there are not enough rows in it, it doesn’t work. There goes the first reason for implicitly turning off the feature on a particular product.

Hmm, the size table is supposed to show sizes for real body measurements, but most sellers provide them by measuring clothes rather than mannequins. OK, let’s tweak them by a difference factor. Product manager Rodion, the lucky owner of the perfect size “M” shirt, goes to a nearby mall, tries on a bunch of different clothes, and comes up with factors — they’re similar, but vary significantly for different kinds of clothing. For a tight turtleneck the difference is almost 0%, but for a sweater it’s 10%. Also, outerwear of the same kind may vary in fit (“slim fit”, “normal fit” and “loose fit”), and that gives a spread of another ±5%. Now our difference factor (carved in the code as the Rodion coefficient) consists of two multipliers.
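As code, the Rodion coefficient is just two lookups and a division; the numbers are illustrative, in the spirit of Rodion's mall measurements:

```python
GARMENT_FACTOR = {"turtleneck": 1.00, "shirt": 1.05, "sweater": 1.10}
FIT_FACTOR = {"slim fit": 0.95, "normal fit": 1.00, "loose fit": 1.05}

def body_girth_from_garment(garment_cm: float, kind: str, fit: str) -> float | None:
    """Convert a garment measurement back to the wearer's body girth."""
    if kind not in GARMENT_FACTOR or fit not in FIT_FACTOR:
        return None  # a category Rodion didn't check: feature implicitly off
    return garment_cm / (GARMENT_FACTOR[kind] * FIT_FACTOR[fit])
```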

To define the fit, we make another parser that attempts to extract it from the product’s name or description. If an item does not fall into one of the categories checked by Rodion, the feature is implicitly turned off — for the second reason.
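The fit parser can be as dumb as a regex over the name and description (a sketch; the patterns are illustrative):

```python
import re

FIT_RE = re.compile(r"\b(slim|normal|regular|loose)[\s-]?fit\b", re.IGNORECASE)

def parse_fit(name: str, description: str) -> str | None:
    m = FIT_RE.search(f"{name} {description}")
    if m is None:
        return None  # fit unknown: yet another reason to turn the feature off
    fit = m.group(1).lower()
    return "normal fit" if fit == "regular" else f"{fit} fit"
```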

The final touch: in a large number of products, the chest girth is listed from armpit to armpit, that is, it’s only half the girth, which leads to ridiculously small sizes. We add the logic that if the girth is less than X, well, it just can’t be, it’s obviously half the girth, and we multiply it by two. We’re lucky that adults don’t usually differ from each other in size by more than 100%.
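In code, the heuristic is a one-liner, with the threshold invented for illustration:

```python
MIN_PLAUSIBLE_CHEST_GIRTH_CM = 60  # adults don't differ by more than ~100%

def normalize_girth(girth_cm: float) -> float:
    if girth_cm < MIN_PLAUSIBLE_CHEST_GIRTH_CM:
        return girth_cm * 2  # armpit-to-armpit width: double it to get girth
    return girth_cm
```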

Now everything is so complicated that when testing the feature, it's impossible to tell, just by looking at a product, why the conversion didn't turn on or worked out one way rather than another. So we add a large layer of logic that logs detailed reasons why the conversion is turned off. To fully trace the reason on a particular product, we have to forward error messages up the stack, adding details at each level, several times over. The code becomes dreadful.
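Schematically, every step now returns either a value or a detailed turn-off reason that callers enrich on the way up the stack. A sketch reusing parse_fit from above; all names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Conversion:
    size: str | None
    off_reason: str | None = None  # set when the feature is implicitly off

def convert_size(product: dict) -> Conversion:
    table = product.get("sizeTable")
    if not table or len(table.get("rows", [])) < 2:
        return Conversion(None, "no size table or too few rows (reason #1)")
    fit = parse_fit(product["name"], product["description"])
    if fit is None:
        return Conversion(None, f"fit not recognized in {product['name']!r}")
    # ...girth normalization, the Rodion coefficient, the table lookup: each
    # step appends its own reason on failure...
    return Conversion("M")
```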

And it all works in various ways depending on the A/B test group, of course.

Beware of G̶r̶e̶e̶k̶s̶ ̶b̶e̶a̶r̶i̶n̶g̶ ̶g̶i̶f̶t̶s developers optimistically estimating deadlines. Estimating development time is very difficult, no matter how simple the task sounds, and surprises await at every turn!
