Realtime Recommendations in Retail using OpenVINO™

Rishabh Banga
Intel Software Innovators
7 min read · Jul 1, 2019

INTRODUCTION

With the retail market getting more competitive by the day, there has been a sharp rise in efforts to optimize services and business processes while satisfying customer expectations. To stay competitive, big retail players all over the world are applying analytics at every stage of the retail process: tracking emerging and popular products, forecasting sales and future demand via predictive simulation, targeting offers through heat-mapping of customers, and many others. Yet even though retailers collect enormous amounts of data from transactional systems to help inform decisions, making the most of that data can be challenging. In 2016, U.S. and European retailers used only about 30-40 percent of the business data available to them.

This is where our solution comes into the picture. By identifying customers (for now, those who have opted in for store membership) and assessing their past purchases, the store can suggest products a customer is likely to be interested in, reach them through targeted marketing strategies such as real-time offers, and then decide what to sell next.

STRATEGIC AREA OF RETAIL

Although not as prevalent as in other vertical segments, analytics has still proven useful in retail, in areas such as:

  • Churn rate reduction
  • Demand prediction
  • Discount efficiency
  • Explanatory analytics
  • Forecasting trends
  • Future performance prediction
  • Identifying customers
  • Price optimization

THE IDEA

The idea revolves around limiting the data stream to just one source: cameras. Facial recognition is employed to detect customers inside the shop and cross-check them against the store's database to verify their membership status. Members who have opted in for this service are given discounts on products they have recently looked at, say a pair of blue Levi's jeans. This is achieved by tracking the member's movements and identifying which clothing article and brand they are interested in via object recognition, so discounts can be offered on the fly.
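To make the flow concrete, here is a minimal sketch of the detect-then-identify loop. This is not the project's actual code: the Haar cascade stands in for a production face-recognition model, and lookup_membership is a hypothetical placeholder for matching a face against the store's member database.

import cv2

# Stand-in face detector (a production system would use a dedicated
# face-recognition model rather than a Haar cascade).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def lookup_membership(face_crop):
    # Hypothetical helper: embed the face and match it against
    # enrolled members in the store's database.
    return None  # None -> not a member / not opted in

cap = cv2.VideoCapture(0)  # in-store camera feed
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        member = lookup_membership(frame[y:y + h, x:x + w])
        if member is not None:
            pass  # track the member and watch which products they inspect
cap.release()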

THE GEEKY STUFF

For the bulk of this project, i.e. training and testing, the main factor was obviously data, loads and loads of it, most of it uncategorized. Some of it was derived by combining datasets, including Apparel Sales by Store Type and Retail Sales by Category among others, to get the kind of dataset I was looking for. For the uninitiated, this falls under the category of data mining: the computational process of discovering patterns in large data sets using quantitative methods. It involves finding hidden patterns and relationships in large databases and inferring rules to predict future behavior with a probability of occurrence.

Analytics methods are commonly grouped into categories, each more advanced than the last in terms of complexity, level of automation, the amount of data required, difficulty of application, and business value. Each successive category also represents a reduction in the level of human input required for decision making.

Descriptive analytics tells us what happened, when, and where. Diagnostic analytics tells us why. Predictive analytics goes a step further, telling us what is likely to happen, and prescriptive analytics tells us what we can do to make a particular outcome more likely. Lastly, cognitive analytics advises us on the best action to take.

IMPLEMENTATION

To create this PoC, I mixed and matched many combinations of base machine, cameras, inferencing software, and inferencing hardware before settling on the setup described below:

CAMERA

The camera (a Logitech camera with 4K Ultra HD quality) had built-in light-adjusting features and a 90-degree field of view, which meant I had to worry less about factors like background noise and visual clutter. With native picture/video recording capabilities, the option of switching between three different fields of view, and 4K quality, it was the best camera to go with.

INTEL® DEVCLOUD

For most of this project I used Intel's IoT Edge on the Cloud service.

The main reason was access to the latest Intel® hardware, Intel® optimized frameworks, the latest computer vision tools, and 50 GB of file storage, which let me train my model and run inference easily from my laptop without worrying about configuration or overheating from long training sessions.

OPENVINO™ TOOLKIT AND ASYNC API

For inferencing purposes, I used the capabilities of Intel's Open Visual Inference & Neural network Optimization (OpenVINO™) Toolkit, particularly the following:

  • Convolutional neural network (CNN) based deep learning inference on the edge
  • Heterogeneous execution across OpenVINO toolkit accelerators — CPU, and Intel® Movidius™ Neural Compute Stick
  • Optimized calls for CV standards, including OpenCV*, OpenCL™, and OpenVX.

To learn more about the OpenVINO™ Toolkit, its capabilities, and installation steps, you can refer to the links here and here.

Once the setup was done and the model ready, all that remained was to test it. Luckily, the toolkit provides a nifty way of ensuring a model trained with frameworks such as TensorFlow or Caffe is adjusted for optimal execution on end-point target devices: the Model Optimizer. It can be run by navigating to <INSTALL_DIR>/deployment_tools/ and invoking mo_tf.py or mo.py respectively.
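For illustration, a typical Model Optimizer invocation for a frozen TensorFlow graph looks roughly like this. The model filename and output directory are placeholders; the flags shown are standard Model Optimizer options:

# Convert a frozen TensorFlow graph to OpenVINO IR (illustrative paths),
# run from the deployment_tools folder mentioned above.
!python3 mo_tf.py --input_model frozen_inference_graph.pb \
                  --data_type FP16 \
                  --output_dir ./IR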

Async API usage can improve the overall frame rate of the application: rather than waiting for inference to complete, the app can keep working on the host while the accelerator is busy. Specifically, this demo keeps two parallel infer requests, and while the current one is being processed, the input frame for the next one is being captured. This essentially hides the latency of capturing, so the overall frame rate is determined by MAXIMUM(detection time, input capturing time) rather than SUM(detection time, input capturing time).
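The demo's own code isn't reproduced here, but a minimal sketch of the two-request pattern looks like the following. It assumes the 2020-era Inference Engine Python API; the model paths and the preprocess helper are illustrative:

import cv2
from openvino.inference_engine import IECore

def preprocess(frame, shape):
    # Resize and repack the frame from HWC to NCHW for the network.
    n, c, h, w = shape
    blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1))
    return blob.reshape((n, c, h, w))

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="MYRIAD", num_requests=2)
input_name = next(iter(net.input_info))
shape = net.input_info[input_name].input_data.shape

cap = cv2.VideoCapture("input_video.mp4")
cur, nxt = 0, 1
ok, frame = cap.read()
exec_net.start_async(request_id=cur, inputs={input_name: preprocess(frame, shape)})
while True:
    ok, next_frame = cap.read()  # capturing overlaps with inference
    if ok:
        exec_net.start_async(request_id=nxt,
                             inputs={input_name: preprocess(next_frame, shape)})
    if exec_net.requests[cur].wait(-1) == 0:
        result = exec_net.requests[cur].output_blobs  # parse detections here
    if not ok:
        break
    cur, nxt = nxt, cur
cap.release()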

Finally, once all this is done, it's time to run inference, which leads to the third item required for the PoC.

PUTTING IT ALL TOGETHER — THE CODE

Now that we've gone through all the individual elements, it's time to talk about the glue that binds them together. The code breaks the entire process down into these three flows:

  • Video as input is supported via OpenCV and fed to the model, with OpenCV providing the bounding boxes, labels, and other information (a sketch of drawing these appears after this list). The code for supporting video for inferencing can be seen below:
videoHTML('IEI Tank + Myriad (Intel® Core + Movidius NCS 2)',
          ['results/myriad/inference_output_Video_0.mp4',
           'results/myriad/inference_output_Video_1.mp4',
           'results/core/Statistics.mp4'],
          'results/myriad/stats.txt')
  • Inference is performed on edge compute nodes with various target compute devices. The code for submitting to a node with an Intel® Neural Compute Stick 2 is as follows:
print("Submitting a job to an edge compute node with Intel Movidius NCS 2...")

# Submit job to the queue
job_id_myriad = !qsub shopper_detection_job.sh -l nodes=1:tank-870:i5-6500te:intel-ncs2 -F "results/myriad MYRIAD FP16 3" -N monitor_myriad
print(job_id_myriad[0])

# Progress indicator
if job_id_myriad:
    progressIndicator('results/myriad', 'i_progress_' + job_id_myriad[0] + '.txt', "Inference", 0, 100)
  • Visualization of the resulting bounding boxes. The code for visualizing results from the node with the Intel® Neural Compute Stick 2 is as follows:
stats_list = []
stats_list.append(('results/myriad/stats.txt', 'Intel\nNCS2'))
summaryPlot(stats_list, 'Architecture', 'Time, seconds', 'Inference Engine Processing Time', 'time')
summaryPlot(stats_list, 'Architecture', 'Frames per second', 'Inference Engine FPS', 'fps')
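As referenced in the first flow above, here is a sketch of how the bounding boxes and labels might be drawn onto a frame. It assumes an SSD-style detection output of shape [1, 1, N, 7] (image_id, label, confidence, xmin, ymin, xmax, ymax); the helper name and labels dictionary are illustrative, not the project's actual code:

import cv2

def draw_detections(frame, detections, labels, conf_threshold=0.5):
    # Scale normalized box coordinates to pixels and draw box plus label.
    h, w = frame.shape[:2]
    for det in detections[0][0]:
        conf = float(det[2])
        if conf < conf_threshold:
            continue
        xmin, ymin = int(det[3] * w), int(det[4] * h)
        xmax, ymax = int(det[5] * w), int(det[6] * h)
        cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
        label = labels.get(int(det[1]), str(int(det[1])))
        cv2.putText(frame, "%s %.2f" % (label, conf), (xmin, ymin - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return frame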

RESULT

Now that the major work is done, it's time to visualize performance across the different architectures. In terms of Inference Engine processing time, the Intel Xeon took the least time, unsurprisingly, followed by the Intel Core i5 CPU and the GPU.

In terms of Inference Engine frames per second, the Intel Xeon peaked at 155, followed by the Intel Core i5 CPU and the GPU at 127 and 11 respectively.

CONCLUSION

The PoC turned out way better than originally anticipated, especially thanks to the OpenVINO™ Toolkit together with the Intel® DevCloud capabilities. Note that while I limited the scope of this article to the components used in the final PoC, one could also use the following: the Intel® AI Vision Kit, which comes with the OpenVINO toolkit pre-installed on an UP Squared board, and the Intel® RealSense™ Camera, which not only judges depth but also extracts and records 3D information in .bag format, which can be manipulated using the instructions here.

WHAT'S NEXT?

Now that I have a working PoC, it's time to turn it into a deployable solution. The first step is to gather more data and enable online learning so the model can improve on its failures; the second is to link it to a database for data storage and retrieval.

Thank you for taking the time to read this article. The project is published on Intel DevMesh, where all the relevant details can be found. If you'd like to contribute to this project or any of the projects on my feed, feel free to drop me a mail.

