Discovering Azure’s Computer Vision and Cloud Search Services — Part 2

Nicholas Hurt
8 min read · Aug 9, 2018


Recap

In the first part of this blog I set myself the challenge of writing a solution which would extract text from an image and then match it to something in our dataset. To solve this, we looked at two Azure services (at a high level) and conducted a very simple test to check that we were heading in the right direction. So far so good. The theory is that these fully managed services, particularly Azure Search, are going to give us a rich set of features to tune our matching logic without a ton of “hackery” or bespoke coding. Best of all, we don’t have to worry about hosting, patching or upgrading the service, and it can be scaled quickly and cost-effectively.

In this post we’re going to try a more complicated test case and write the core logic to process the image.

Curve ball

What we need to do next is test what happens to our search results when we introduce an ambiguous test case. For example, let’s say we used a poor-quality image, or the receipt had been severely crumpled in someone’s pocket, and the text recognition algorithm read the word Lewis as Levis. That might not be a problem if JL were the only entry in our database resembling it, but what if we also had a Levi’s?

So I’ll add Levi’s to the store table and rerun the indexer to load in the new data. Now when I run the search string search=john levis~&$select=Id,Store&api-version=2017–11–11 we definitely have an issue…

Our search scores Levis the highest. What happened is that the default search behaviour tokenizes the query and matches on any of the words rather than on the phrase. And even if we matched the phrase, we’d still have an issue with the spelling error, so I think it’s time to introduce Lucene.

Lucene’s fuzzy matching is based on the Damerau–Levenshtein distance algorithm, and one can optionally specify a distance between 0 and 2. There are a number of options to refine your search, such as proximity searching, term boosting and regex patterns. If you need to go deeper I highly recommend starting with the following blog and also this one. If Lucene can’t yield the results you’re looking for, chances are you need to customise the way the data is being indexed, using index operations such as custom analysers or scoring profiles. Custom analysers tell the service how to convert the text into indexable, searchable tokens, for example using a custom tokenizer or adding phonetic (sounds-alike) search capabilities. Scoring profiles define how fields should be weighted in order of relevance, for example boosting results based on a category field or a relevance field.
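For illustration, here are a few of those expressions. This is standard full-Lucene syntax rather than anything specific to our index, and all of them require the queryType=full setting we’ll get to in a moment:

search=levis~1 (fuzzy match with a maximum edit distance of 1)
search="john lewis"~2 (proximity: both terms within two words of each other)
search=lewis^2 levis (term boosting: documents matching lewis score higher)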

Tell Search to use the Lucene query parser by setting the queryType parameter to full, and, when necessary, instruct the engine to use fuzzy matching by appending a tilde to the end of the token. Unfortunately you can’t fuzzy match against a phrase, only against each individual term/token.

Lastly and crucially, let’s also tell Search to require all the terms (the whole phrase) by setting the searchMode parameter to all. Our query string now resembles:

search=john~ levis~&$select=Id,Store&queryType=full&searchMode=all

Whilst the addition of a tilde after each word may seem like a bit of a headache, what I would recommend is a graded search which starts with the most specific phrase and, if no results are found, falls back to a fuzzy search. Out of interest, to run an exact search, enclose the search term in (escaped) quotes like this:

search=\”john levis\”

The general rule is that the more specific the search request, the faster the response, and the converse also applies. Even if we do have to make a second call to the API, it’s still going to be pretty quick, and it won’t cost any more, as the service is billed by the hour, not by how many calls you make.
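To make that graded approach concrete, here is a minimal sketch in Python calling the REST API directly. The service name, index name and key are placeholders; substitute your own:

import requests

SEARCH_URL = "https://<your-service>.search.windows.net/indexes/stores/docs"
HEADERS = {"api-key": "<your-query-key>"}

def graded_search(term):
    # Pass 1: exact phrase match using the full Lucene parser
    params = {
        "api-version": "2017-11-11",
        "queryType": "full",
        "searchMode": "all",
        "$select": "Id,Store",
        "search": '"{}"'.format(term),
    }
    hits = requests.get(SEARCH_URL, headers=HEADERS, params=params).json()["value"]
    if hits:
        return hits
    # Pass 2: fall back to fuzzy matching by appending a tilde to each token
    params["search"] = " ".join(t + "~" for t in term.split())
    return requests.get(SEARCH_URL, headers=HEADERS, params=params).json()["value"]

print(graded_search("john levis"))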

Let’s change gears and wrap this all into some code. Time to head over to one of my favourite development/execution environments for interactive, shareable and collaborative coding…

Azure Notebooks

Azure Notebooks (powered by Jupyter) is a free, browser-based development environment for Python, R and F#. There are numerous benefits to notebooks, which are beyond the scope of this post, but for our purposes they provide an easy way to develop, run and visualise the output of our code without the complexity of any local environment set-up.

Sign up for an account and then open my notebook here. Click on the clone button, which will copy the library into your account, then click on the process-image notebook to open it. This is your version of the code, so you can substitute the placeholders with your details. For those not familiar with Python, I’ve added a few comments (beginning with #) so you can hopefully grasp what I’m trying to achieve. To run each block or cell of code, click on it and then click the run button.

Once you’ve successfully searched for the store, return to this post to index the product data.

Product Indexing

Now we’re ready to load our next index for products, so flip back to the Azure portal, find your Search resource, and click on the import link. Select Azure SQL Database as the data source (as we did initially), but this time specify the Products table. In the target index panel, mark the product field retrievable and searchable, and additionally make StoreId filterable. This is so we can search for products belonging to a particular store.

Click OK and provide an indexer name, but this time let’s define a custom refresh schedule which will track changes in the products table. Click Custom schedule and specify an interval such as 5 minutes, a start time, and Id as the high watermark column. This is going to “pull” new rows into our index, which will be, at worst, 5 minutes stale. If you need to synchronise updates as well, you’d need a version number or timestamp column that changes every time the record is updated. If you have more complex requirements, here’s a great document which provides additional information on implementing incremental indexing and how to synchronise deletes.
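If you prefer scripting to clicking, the same settings can be expressed through the REST API. As a rough sketch (names are placeholders, and note that the change detection policy belongs to the data source definition, while the schedule belongs to the indexer definition):

# Fragment of the data source definition: high-watermark change detection on Id
datasource_fragment = {
    "dataChangeDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
        "highWaterMarkColumnName": "Id",
    }
}

# Fragment of the indexer definition: run every 5 minutes (ISO 8601 duration)
indexer_fragment = {
    "schedule": {"interval": "PT5M"}
}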

Notebooking — part deux

Back in the notebook, I’ll reveal how we’re going to extract the products from the receipt metadata and run them through a Search request. You’ll also see how I used the store ID, found in part one, to filter the results in the product search.
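The filter itself is just an OData expression on the StoreId field we marked filterable earlier. A minimal sketch, with placeholder URL and key again (and quote the value if StoreId is a string field rather than a number):

import requests

url = "https://<your-service>.search.windows.net/indexes/products/docs"
params = {
    "api-version": "2017-11-11",
    "queryType": "full",
    "searchMode": "all",
    "$select": "Id,Product",
    "$filter": "StoreId eq 1",  # the store ID found in part one
    "search": " ".join(t + "~" for t in "76770159 Novelty Gift".split()),
}
hits = requests.get(url, headers={"api-key": "<your-query-key>"}, params=params).json()["value"]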

Hopefully you were successful. My final result looked like this:

Searching for product S 76770159 Novelty Gift 
Product is 76770159 Novelty Gift, ID 1
Searching for product S 76770148 Novelty Gift
Product is 76770148 Novelty Gift, ID 2
Searching for product S 74400103 Travel Accessory
Product could not be found
Searching for product 5 62630101 Carrier bas charge
Product is 62630101 Carrier bag charge, ID 3

Final plug for Search

Two other features I thought I should mention are synonym search and search traffic analytics. Synonyms provide a seemingly “intelligent” solution for those cases where no amount of fuzzy matching is going to yield the correct result: either it’s a different but equivalent word altogether, or it’s an acronym or an abbreviation. I say “intelligent” because there’s actually no real intelligence or AI used to solve this; you literally have to write your own synonym map and load it into the service. It would be really useful in our scenario because I can foresee an issue with, say, TV and television.
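As a sketch of what that looks like in practice: synonym maps use the Apache Solr format (comma-separated lists for equivalent terms, => for one-way mappings), and you upload them via the REST API and then reference them from a field in the index. The names below are placeholders:

import requests, json

BASE = "https://<your-service>.search.windows.net"
HEADERS = {"api-key": "<your-admin-key>", "Content-Type": "application/json"}

# Equivalent terms on one line; "=>" maps the left side onto the right
synonym_map = {
    "name": "acme-synonyms",
    "format": "solr",
    "synonyms": "TV, television\nJL => John Lewis",
}
requests.put(BASE + "/synonymmaps/acme-synonyms?api-version=2017-11-11",
             headers=HEADERS, data=json.dumps(synonym_map))

# A searchable field then opts in via its "synonymMaps" property, e.g.
# {"name": "Store", ..., "synonymMaps": ["acme-synonyms"]}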

Finally, traffic analytics provides a way to gather telemetry data from all search requests, so that you can analyse the results for issues, or perhaps for particular terms that are frequently returning no results.

Before we go on to describe the architecture in the next post, I want to introduce you to two key players in our final solution: Functions and Queues.

Azure Functions

Simply put, Functions are a way to run code without worrying about servers. There are two pricing models, but normally people choose to pay for what they use via the Consumption plan, which is great if you have sporadic and unpredictable workloads throughout the day. On this plan, each Function shouldn’t be long-running (10 minutes max) or use excessive amounts of memory (1.5 GB max); otherwise you should consider the App Service plan or another type of service. If you want to orchestrate a set of stateful functions you should check out Durable Functions. In theory we should decompose our script into two functions, one to perform the text recognition and the other to perform the search, but for simplicity here we’ll keep them as one.

One last thing to note before we talk about queues: Azure Functions currently offers first-class support for a number of languages; however, Python is not among them yet. This means there are a few additional steps we’ll need to take to make our code work, but believe me, it’s worth it.

Azure Storage Queues

To me, queues epitomise resilient, fault-tolerant workloads. The primary advantage of queues is that they act as a “buffer” of incoming units of work. Typically a worker, perhaps a function app, Docker container or low-priority VM, runs some code to pick items off the queue and process them. The beauty of this design is resiliency: should a worker crash halfway through processing, or your low-priority VM be evicted, the item, or message as it’s known, will remain in the queue and some other worker will pick it up. How do you prevent two or more workers processing the same message at the same time? The instant a message is read from the queue, that thread obtains a “lock” on the message, effectively making it invisible to all other worker threads. If there is a failure, that same message reappears on the queue (by default 4 retries) for another thread/worker to process. If you want a great deep dive into event processing using Azure Functions, I highly recommend reading Jeff Hollan’s blog.
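To make that lock behaviour concrete, here is a minimal worker loop using the Python azure-storage-queue SDK of the time. The account credentials and queue name are placeholders, and process stands in for our image-processing logic:

from azure.storage.queue import QueueService

queue = QueueService(account_name="<account>", account_key="<key>")

def process(content):
    print("processing", content)  # placeholder for the OCR + search logic

while True:
    # get_messages "locks" each message: it becomes invisible to other
    # workers for the visibility timeout (here, 5 minutes)
    for msg in queue.get_messages("receipts", visibility_timeout=300):
        process(msg.content)
        # If we crash before this delete, the message reappears on the
        # queue after the timeout and another worker picks it up
        queue.delete_message("receipts", msg.id, msg.pop_receipt)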

Another advantage of queues is that they allow you to easily monitor and react to how many items of work, or messages, are in the “backlog”. Thanks to the scale controller, however, you won’t need to scale your Function yourself; it will automatically scale (both out and in) according to the number and age of the messages in the queue. Lastly, and perhaps less obviously, queues can provide Acme with an alternative entry point to their service. What if not all images land in our storage account? If a third party held the images but wanted to provide us with a reference to theirs, they could simply push messages onto our queue for processing.

For those curious, here is a great article about the difference between Azure Storage Queues and Azure Service Bus Queues.

Part 2 — Conclusion

In this post we dove a bit deeper into the search service, and looked at a trickier test case which involved a fuzzy search. We wrote the core part of our logic using Notebooks and we introduced some important cloud services. In the next post we’ll look at the architecture and deploy a working prototype to the Azure Cloud.
