Innovation Sprint Series: The ins and outs of Elasticsearch

“Give up searching and the ten thousand things become one
Take up searching and one becomes the ten thousand
The unborn mind does not cling to either”

Hi, my name is Brooke Stephens and I’m a web developer working on the Ukulele project, a.k.a. the new CBC Listen web app prototype. Currently, Radio One and CBC Music content lives in separate apps, but Ukulele will offer both content types in one web application. This creates a new set of challenges and opportunities for search.

For our two-week department-wide Innovation Sprint, I pitched “Ukulele Search,” a new CBC Listen search feature that indexes both talk radio and music content. I conceived this idea by experimenting with our existing search features and asking questions like:

  • What if the user misspells the host’s name while searching?
  • How do we index multiple data sources all with different data structures?
  • What makes audio content results relevant to our audience?
  • What are CBC’s most popular podcasts?

After pondering these questions and doing some basic research on existing search tools, I decided that Elasticsearch would help us solve many of these issues. I formed a team with developers Abdel Bolanos Martinez, Deven Juta and Dryden Zarate, and we got down to business.

This is the search results page. The user can either tap on the image to dig deeper into that particular show or music stream, or they can tap the play button to start streaming audio instantly. (Ukulele Search Team/CBC)

What is Elasticsearch?

Elasticsearch is open-source software that provides search indexing through a RESTful API (more on that below). It supports a range of search functions, such as keeping separate indexes for music streams and radio shows, which makes it easy to filter results when you search. It also provides other useful features, like scaling for high usage, fuzzy search for spelling mistakes, and weighting metadata fields to control search result relevance (e.g. making the title worth more than the description).
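
To make the fuzzy search and field weighting ideas more concrete, here is a minimal sketch of what a search request could look like. It is not our actual code: the index names (`shows`, `music_streams`), the field names (`title`, `host`, `description`) and the boost values are assumptions for illustration. The `multi_match` query, the `^` boost syntax and the `fuzziness` option are part of Elasticsearch's query DSL. The sketch only builds the request; it does not send it.

```python
import json


def build_search_request(text, indexes=("shows", "music_streams")):
    """Build the path and JSON body for a fuzzy, field-weighted search.

    The index and field names are hypothetical placeholders.
    """
    # Several indexes can be searched at once by comma-separating their names.
    path = "/" + ",".join(indexes) + "/_search"
    body = {
        "query": {
            "multi_match": {
                "query": text,
                # "^3" makes a title match count more than a description match.
                "fields": ["title^3", "host^2", "description"],
                # "AUTO" tolerates small typos, scaled to the length of each term.
                "fuzziness": "AUTO",
            }
        }
    }
    return path, body


path, body = build_search_request("Tremonty")  # misspelled on purpose
print("GET", path)
print(json.dumps(body, indent=2))
```

Filtering to one content type would then be a matter of passing a single index name, for example `build_search_request("jazz", indexes=("music_streams",))`.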

My top three takeaways from the Innovation Sprint:

  1. Building a search index up front improves performance and query capabilities.
    Elasticsearch is a RESTful search index that accepts JSON requests and serves JSON responses. Doing the heavy lifting up front by continuously indexing audio metadata means queries run against a prebuilt index at search time, which can greatly improve search performance. Furthermore, structuring your search index appropriately expands how the audience can search and what they can surface. For example, we will be able to allow users to find CBC shows by host name.
  2. Including auto-complete/auto-suggest helps guide users on search possibilities.
    Enabling auto-complete on the search box helps the audience understand what is searchable, which allows for better discovery of our audio content. It also provides quicker access to audio content when users know what they are looking for. Type in the query “Tremon” and you will get an auto-complete list including “The Current with Anna Maria Tremonti.”
  3. Empowering teams to make critical decisions for themselves fosters creativity and a strong work ethic.
    The psychology of ownership is powerful. When teams can make democratic decisions together and act on those decisions, they develop a sense of ownership and become invested in their work. For example, during the Innovation Sprint our team worked overtime because we were excited about building something in an environment where we were free to make decisions that would impact the outcome of our project.

This is how the autocomplete feature works. As the user types into the search field, autocomplete helps the user finish their search query. (Ukulele Search Team/CBC)
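
The first two takeaways can be illustrated with a toy example. This is a simplified in-memory sketch, not how Elasticsearch works internally and not the code from our prototype: it builds a prefix index once, up front, so that each keystroke becomes a cheap lookup. The function names and the sample titles are assumptions for illustration.

```python
from collections import defaultdict


def build_prefix_index(titles, min_len=2):
    """Map every lowercase word prefix to the set of titles containing that word."""
    index = defaultdict(set)
    for title in titles:
        for word in title.lower().split():
            # Store each prefix of the word that is at least min_len characters long.
            for end in range(min_len, len(word) + 1):
                index[word[:end]].add(title)
    return index


def suggest(index, typed, limit=5):
    """Return up to `limit` titles that contain a word starting with `typed`."""
    matches = index.get(typed.strip().lower(), set())
    return sorted(matches)[:limit]


# Illustrative sample data only.
TITLES = [
    "The Current with Anna Maria Tremonti",
    "As It Happens",
    "Quirks & Quarks",
]

INDEX = build_prefix_index(TITLES)  # the up-front work happens once
print(suggest(INDEX, "Tremon"))     # each lookup is a single dictionary access
```

The trade-off is the usual one for search indexes: more work and storage at indexing time in exchange for faster, more flexible lookups at query time.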

The End Result

By the end of the sprint, we had built an awesome prototype search feature that indexes both Radio One shows and CBC Music streams. In just two weeks, our team was able to create an Elasticsearch index, a search index aggregator, a search results web interface with playable items and an auto-complete feature. We also developed a better understanding of the Ukulele product more generally.

This is how Elasticsearch fits into the overall infrastructure of our CBC Listen applications. The right-hand side of the diagram shows the data sources that feed into the Elasticsearch index via the aggregator. The left-hand side shows how our applications call Elasticsearch through our API. GraphQL can be used to transform API data into consistent JSON responses. (Ukulele Search Team/CBC)
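
As a rough sketch of the aggregator's role in that diagram, the snippet below maps two differently shaped source records onto one shared document shape and then builds a request body for bulk indexing. Everything about the source records is hypothetical (the field names `showId`, `showTitle`, `about`, `hostName`, `name`, `blurb`, and the index name `listen`); our real data sources and aggregator may look quite different. The newline-delimited action/document format is what Elasticsearch's `_bulk` endpoint expects. Nothing is sent over the network here.

```python
import json


def normalize_show(raw):
    """Map a hypothetical Radio One show record to the shared document shape."""
    return {
        "id": f"show-{raw['showId']}",
        "type": "show",
        "title": raw["showTitle"],
        "description": raw.get("about", ""),
        "host": raw.get("hostName", ""),
    }


def normalize_stream(raw):
    """Map a hypothetical CBC Music stream record to the shared document shape."""
    return {
        "id": f"stream-{raw['id']}",
        "type": "music_stream",
        "title": raw["name"],
        "description": raw.get("blurb", ""),
        "host": "",
    }


def to_bulk_payload(index_name, documents):
    """Build a newline-delimited JSON body: one action line, then one document line."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Made-up records standing in for the two data sources.
docs = [
    normalize_show({"showId": 1, "showTitle": "Example Talk Show", "hostName": "Example Host"}),
    normalize_stream({"id": 7, "name": "Example Music Stream", "blurb": "Sample description"}),
]
payload = to_bulk_payload("listen", docs)
print(payload)
```

Normalizing at indexing time is one way to handle sources with different data structures: the search interface only ever has to deal with a single document shape.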

Working on this project with this team was a positive experience for me. I enjoyed taking on a leadership role and learning new technologies. I’m looking forward to learning more about search and discovering ways to improve our current prototype.

Thanks for reading and thanks to the Ukulele Search team for all the hard work!

Abdel Bolanos Martinez and Brooke Stephens give Elasticsearch a thumbs up. (Evan Mitsui/CBC)