The Dreaded Boolean Search

Getting into the nitty-gritty of UX and Boolean Search

If we get into our way-back machine and travel all the way back to 1997, here is Jakob Nielsen’s advice regarding search:

Summary: Search is the primary interface to the Web for many users. Search should be global (not scoped to a subsite) and available from every page; booleans should be made intimidating since users usually use them wrong.

Nielsen couldn’t be more right about it. In my job, search is critical to just about every major workflow my users conduct across a suite of about 15 different tools. Almost twenty years after Nielsen’s advice, search is still central to how we design websites and web apps. Search should be prominent at the top of the page; it should cover all of the items across the website (including the image metadata for people with vision impairments); and, best of all, Boolean search is just about the worst thing ever created.

A Boolean search is one where the user creates a search string that uses specific operators to link terms together. Common Boolean operators are “And”, which means that both terms have to be found together in order to return that result; “Or”, which means that at least one of the two terms has to be found in order to return that result; and “Not”, which means to exclude any results that include that term. Users can combine these operators to create complex and nuanced searches.
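To make the three operators concrete, here is a minimal sketch that treats each term as the set of documents containing it. The tiny `DOCUMENTS` collection and the helper names are made up for illustration; a real search engine works on an index, not a dictionary, but the set logic is the same idea.

```python
# Hypothetical toy collection: document ID -> the words found in that document.
DOCUMENTS = {
    "doc1": {"cat", "dog"},
    "doc2": {"cat"},
    "doc3": {"dog"},
    "doc4": {"bird"},
}


def containing(term):
    """Return the IDs of every document that contains the term."""
    return {doc_id for doc_id, words in DOCUMENTS.items() if term in words}


def boolean_and(left, right):
    """AND: keep only documents present in both result sets."""
    return left & right


def boolean_or(left, right):
    """OR: keep documents present in at least one result set."""
    return left | right


def boolean_not(left, right):
    """NOT: keep documents from the left set that are absent from the right."""
    return left - right


cat_and_dog = boolean_and(containing("cat"), containing("dog"))
cat_or_dog = boolean_or(containing("cat"), containing("dog"))
cat_not_dog = boolean_not(containing("cat"), containing("dog"))

print(sorted(cat_and_dog))  # ['doc1']
print(sorted(cat_or_dog))   # ['doc1', 'doc2', 'doc3']
print(sorted(cat_not_dog))  # ['doc2']
```

Note that “And” gives the smallest result here and “Or” the largest, which is the opposite of what the everyday meaning of the words suggests.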

Because Boolean search is really powerful, it is inevitably going to lead to user errors. It is hard to understand, users are overconfident in how to use it, and the terms are jargon. To give you an idea of how poor Boolean search is, I am going to review some of the research in Marti A. Hearst’s book Search User Interfaces.

Dr. Hearst starts her description of the problems with Boolean search by saying:

“…studies have shown time and again that most users have difficulty specifying queries in Boolean format and often misjudge what the results will be.” p108

The problems with Boolean search are:

  1. Users add too many terms and the search results are not useful. Think of a set of terms with a whole bunch of “OR”s. Each “OR” can pull in a whole different set of results. For example, if a user were searching for images of ‘Tardis OR DeLorean’, the search engine could return images with Dr. Who, Marty McFly, and this sweet mashup (GREAT SCOTT!). The search also does not exclude other time-defying majesties. This is particularly hurtful because users get frustrated with the results being too broad even though they are using awesome Boolean logic terms.
Image caption: TARDIS OR DELOREAN, or is it TARDIS AND DELOREAN?

  2. Conversely, users can restrict the results so tightly that useful ones are excluded. To continue with the previous example, if a user were to do an image search for ‘Tardis AND DeLorean’, the search engine would return only sweet mashups like the one pictured above. Anything with only ‘Tardis’ would not be returned.

  3. The jargon is pretty confusing. For example, most users think that the term “AND” widens the set of things that are going to be returned. Instead, what “AND” actually does is heavily restrict what is returned. This is terribly counterintuitive, even for expert users. The other terms are not much better either.
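The broadening and narrowing described in the list above can be seen by counting results as terms are added. This is only a sketch under assumptions: the `IMAGES` tags, including the extra ‘phonebooth’ term, are invented for illustration and do not come from any real image search.

```python
# Hypothetical tagged image collection; "img5" stands in for the mashup.
IMAGES = {
    "img1": {"tardis"},
    "img2": {"tardis", "doctor"},
    "img3": {"delorean"},
    "img4": {"delorean", "marty"},
    "img5": {"tardis", "delorean"},
    "img6": {"phonebooth"},
}


def tagged(tag):
    """Return the IDs of every image carrying the tag."""
    return {image_id for image_id, tags in IMAGES.items() if tag in tags}


def result_counts(terms, operator):
    """Return the number of results after each successive term is joined on."""
    results = tagged(terms[0])
    counts = [len(results)]
    for term in terms[1:]:
        if operator == "OR":
            results = results | tagged(term)   # each OR can only add results
        elif operator == "AND":
            results = results & tagged(term)   # each AND can only remove results
        else:
            raise ValueError(f"unsupported operator: {operator}")
        counts.append(len(results))
    return counts


terms = ["tardis", "delorean", "phonebooth"]
or_counts = result_counts(terms, "OR")
and_counts = result_counts(terms, "AND")

print(or_counts)   # [3, 5, 6]
print(and_counts)  # [3, 1, 0]
```

With “OR” the count climbs until every image in the collection is returned; with “AND” it drops to the single mashup and then to nothing.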

When users start making use of Boolean search it is a bit like Pandora and that pesky box: it is a source of unforeseen trouble. Users have no idea what they are getting into until they are already using it; they have no idea what they are missing, and they get frustrated when the results are consistently too broad or too narrow even though they used sophisticated search terms.

Boolean search is the worst.

Yet no matter how many times I beg my users to rely on the power of smart ranking algorithms instead of Boolean search, I cannot convince them. I hear things like, “But other tools have it!” and “But we’re really smart users”. In the end I always design a Boolean search component against my better judgment.

With that in mind, I have some advice and a sample design pattern that is working relatively well. But I also have a caveat: I design web applications, not many websites. The design below works for web applications where the user is trying to do a set of tasks, one of which is to search across an extremely large dataset. This kind of search is in contrast to a website search, where there may be a lot of content within the website. For instance, a user may want to search across a website that has many subpages. That kind of search is a simpler keyword search, versus the more complex search across multiple different kinds of values (e.g., keywords, IDs, numerical values). The design pattern illustrated below may have to be simplified in order to be used for a website’s search.

Below is a basic Boolean search modal that I have now used in multiple systems, with good results in user testing. I’ll step through the pieces that reflect the design decisions we made.

  1. We followed Hearst’s advice to use terms like “Any” and “All” rather than “Or” and “And”. The text at the top of each group says “Include ‘ANY’ of these requirements”. The drop-down lets the user choose whether results must match any of the terms listed below or all of them. This helps the user understand what is restricting or expanding her search results.
  2. We removed the need for parentheses. Most Boolean search forms allow users to enter parentheses to make sets of search terms; ours doesn’t. Users can still create sets of terms that are joined together, but they do it visually by using plus buttons. We also grouped terms inside gray boxes to show which terms belong together.
  3. We made a conscious choice to support only two layers of nesting. If users *wanted* more layers, we would wait for them to come to us and say so before designing it. So far, two years in, no one has asked for it.
  4. There was a decent amount of debate on what to call a set of terms that have been grouped together. We settled on “Group” since it was the most generic term that most users understood.
  5. We made it easy to continue adding terms: after the user adds the first one, they can hit the plus sign on the right to add the next one. When the user hits the plus, an “Or” appears between the terms if ANY is selected in the drop-down, to further explain how the terms are related. Conversely, if ALL is selected, “And” is shown between the terms.
  6. When the user adds a group, a drop-down appears, again allowing the user to link by “And”s and “Or”s.
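One way to picture what such a form produces is a small tree of groups, each set to ANY or ALL, that gets turned into a conventional Boolean string behind the scenes so the user never types a parenthesis. The sketch below is not the implementation behind the modal described here; the `Group` class, the `to_query` function, and the reading of “two layers” as a top-level group plus one layer of nested groups are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Assumption: "two layers" means the top-level group plus one nested layer.
MAX_LAYERS = 2


@dataclass
class Group:
    """A hypothetical group of search terms joined by ANY (or) or ALL (and)."""
    mode: str  # "ANY" or "ALL", as chosen in the drop-down
    items: List[Union[str, "Group"]] = field(default_factory=list)


def layers(group):
    """Count how many layers of groups are stacked inside this group."""
    nested = [layers(item) for item in group.items if isinstance(item, Group)]
    return 1 + max(nested, default=0)


def render(group):
    """Join a group's items with the operator that matches its mode."""
    if group.mode == "ANY":
        joiner = " OR "
    elif group.mode == "ALL":
        joiner = " AND "
    else:
        raise ValueError(f"unknown mode: {group.mode}")
    parts = []
    for item in group.items:
        if isinstance(item, Group):
            # The parentheses are generated here, never typed by the user.
            parts.append("(" + render(item) + ")")
        else:
            parts.append(item)
    return joiner.join(parts)


def to_query(group):
    """Build the Boolean string, refusing anything nested too deeply."""
    if layers(group) > MAX_LAYERS:
        raise ValueError("only two layers of nesting are supported")
    return render(group)


search = Group("ALL", ["tardis", Group("ANY", ["delorean", "phonebooth"])])
query = to_query(search)
print(query)  # tardis AND (delorean OR phonebooth)
```

Keeping the ANY/ALL choice on the group rather than between individual terms means a single group can never mix “And” and “Or”, which is what removes the need for parentheses in the interface.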

I can’t get away from Boolean search, but I can do my best to make it usable for everyone involved. Boolean logic is hard to understand, even for experts. So being kind and thoughtful about how we let users do complex queries is the best solution, because in the end, no matter how well designed, I know my users are going to get confused and frustrated. It is inherent to the beast that is Boolean search. That doesn’t mean we cannot iterate, user test, and continue the great research into how to improve.

It also doesn’t mean you can’t remind yourself how powerful natural language recognition and ranking search algorithms really are, and that one day users will stop asking for this outdated and problematic feature.


Written by Laurian Vega.