The Difference Between Implicit and Explicit Data for Business
Every customer interaction is a chance to learn and to improve user engagement
We’re going to talk about explicit and implicit data in Recommendation Systems, especially looking at the problems that negative or unclear responses can cause. It helps to know something about the subject upfront. If you are new to the topic, you can catch up by reading the first two articles in this series that will walk you through it:
- Building Python Recommendation Systems that Work (Jakub Cwynar)
- Everything You Need to Know Before Building a Recommendation System (Tim Clayton)
All caught up? Now, let’s start by looking at the two different types of data.
Explicit data
A customer buys a product, rates a film, or gives a thumbs up or down to a post. The customer is clearly showing how they feel about a product. The data we receive is clean and actionable.
Implicit data
A customer views a product but does not make a purchase. A user watches a film trailer or reads an article about something. We’ve got a statement of intent but no clear, affirmative action.
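One practical way to treat both types in a single pipeline is to record every interaction as an event carrying a signal strength: explicit actions carry a known weight, implicit ones only a provisional hint. Here is a minimal sketch of the idea; the event names and weight values are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass

# Illustrative weights: explicit actions express clear intent,
# implicit ones only a provisional hint that may need confirmation.
EVENT_WEIGHTS = {
    "purchase": 1.0,       # explicit
    "rating_up": 0.8,      # explicit
    "rating_down": -0.8,   # explicit
    "view": 0.1,           # implicit
    "trailer_watch": 0.15, # implicit
}

@dataclass
class Interaction:
    user_id: str
    item_id: str
    event: str

    @property
    def signal(self) -> float:
        # Unknown events contribute nothing rather than raising an error.
        return EVENT_WEIGHTS.get(self.event, 0.0)

# An implicit hint followed by an explicit action for the same item:
events = [
    Interaction("u1", "film42", "trailer_watch"),
    Interaction("u1", "film42", "purchase"),
]
total = sum(e.signal for e in events)  # combined evidence for u1 -> film42
```

The point is not the exact numbers but the shape: once everything is an event with a weight, explicit and implicit feedback can feed the same model.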
Is one type of data better than the other?
You might assume that explicit data is always more valuable to businesses. It is clear, unambiguous, and gives us a definite picture of the user. However, some businesses may actually prefer, or take more value from, implicit data — often because explicit data is much harder to collect. Let’s look at Spotify. Simply listening to a song is not explicit data in itself. The system does not know for sure that the user likes that song. Actual explicit data is when the user adds a specific tune to a playlist or hits the heart icon to say that they enjoy listening to it. But how many of us actually do that? I have personally listened to thousands of songs on Spotify without really noticing the heart button. If you are in the office listening to a recommended playlist in a browser while busily working in a different tab, are you really going to click back and forth to like each song that comes up?
The same applies to YouTube, IMDB, and a host of other websites where people browse and view but do not always leave a rating. In such cases, there is vastly more implicit than explicit data being created by user activity.
Explicit data can also be shallow. Users may be asked to give binary reactions: like or dislike, thumbs up or thumbs down. Even when a site like IMDB allows for ratings from 1 to 10, human nature means that people tend to rate in the extremes. Users regularly rate everything as 10 or 1; not many people take the time to leave a 4-out-of-10 rating because they clearly didn’t have a strong opinion in the first place.
The leading builders of Recommendation Systems have learned to harness the abundance of implicit data, learning as much from suggestion as they do from clear, explicit reactions.
Are explicit and implicit data weighted differently?
Explicit data will always have more obvious value than implicit data, and will naturally be favored (if, as highlighted above, users can actually be encouraged to give enough clear feedback). The main reason that implicit data is harder for companies to interpret is that it requires clarification. Explicit data is one-action feedback: a single click tells us that a user liked a video or rated a product positively. With implicit data, we sometimes need to observe what the user does next. If someone listens to a single song, we cannot know whether they liked that artist. The system needs to store that information and see what happens in the future. If the user then purchases an album download a few days later, that second action backs up the initial assumption. The system can then learn how that single user or all users interact with the system and make better assumptions as time passes and more data is generated.
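A well-known way to encode this “wait and see” logic is confidence weighting, in the spirit of the classic implicit-feedback collaborative filtering approach: the preference itself is a weak binary guess, while confidence in that guess grows with every repeated or follow-up action. A rough sketch, where the `ALPHA` value is an illustrative assumption (in practice it is tuned per dataset):

```python
# Confidence weighting for implicit feedback: preference is a weak
# binary guess, confidence grows with the number of observed signals.
ALPHA = 40.0  # illustrative scaling factor

def preference(interaction_count: int) -> int:
    # Any interaction at all is a tentative "yes"; none is "unknown".
    return 1 if interaction_count > 0 else 0

def confidence(interaction_count: int) -> float:
    # A single listen yields low confidence; follow-up actions such as
    # an album purchase (logged as further interactions) raise it.
    return 1.0 + ALPHA * interaction_count

print(confidence(1))  # 41.0  -> one listen: weak evidence
print(confidence(5))  # 201.0 -> later actions back up the assumption
```

The model never claims the user *likes* the artist after one listen; it only becomes harder to dislodge that assumption as confirming actions accumulate.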
How do the best systems learn from user behavior?
Each person is unique. Although sites are designed to encourage us to interact with them in certain, predictable ways — to create the clearest user paths — we all still behave differently.
My wife and I love watching movies together, but we certainly don’t enjoy trailers in the same way. To decide if she is interested in a film, my wife will watch the entire trailer — sometimes more than once. If she is not interested in a film, she will turn the trailer off within ten seconds.
My behavior is the polar opposite. I can usually tell if a film is going to interest me within the first ten seconds of the trailer. I will then switch it off immediately to avoid any spoilers. If I am not interested in a film, I will watch the entire trailer (although now that I actually see that in writing, I have no reasonable explanation of why I do it).
A really good Recommendation System also has to learn to interpret and explain opposing behaviors. If my wife and I both stop watching a trailer on a streaming service after ten seconds, it may be right to assume that neither of us is interested in the movie. If I then go back and watch the film within the month and my wife does not, that tells the system a little something about our behavior. If we then repeat that behavior several more times, the most powerful Recommendation System will be able to interpret my seemingly negative response as a positive signal. However, as with almost all implicit data, it requires confirmation of the initial assumption.
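The trailer story above can be sketched as a small per-user interpreter: track how often an early stop is later followed by actually watching the film, and only flip the meaning of the signal once the pattern has repeated. The class name, threshold, and minimum-evidence count are all illustrative assumptions:

```python
from collections import defaultdict

# Per-user interpretation of an ambiguous signal: does an early
# trailer stop mean disinterest, or spoiler avoidance?
class SkipInterpreter:
    def __init__(self):
        self.stops = defaultdict(int)     # early trailer stops per user
        self.followups = defaultdict(int) # stops later followed by a watch

    def record_early_stop(self, user: str) -> None:
        self.stops[user] += 1

    def record_watch_after_stop(self, user: str) -> None:
        self.followups[user] += 1

    def skip_means_interest(self, user: str, min_evidence: int = 3) -> bool:
        # Require the behavior to repeat before flipping the interpretation.
        if self.stops[user] < min_evidence:
            return False
        return self.followups[user] / self.stops[user] > 0.5

interp = SkipInterpreter()
for _ in range(4):                        # my pattern: stop, then watch anyway
    interp.record_early_stop("me")
    interp.record_watch_after_stop("me")
interp.record_early_stop("wife")          # her stop is never followed by a watch

print(interp.skip_means_interest("me"))   # True: confirmed positive signal
print(interp.skip_means_interest("wife")) # False: not enough evidence yet
```

Real systems would do this with learned features rather than hand-written counters, but the principle is the same: the same raw event can mean opposite things for different users, and only repetition settles it.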
How do systems deal with negative ratings?
Whether the data is explicit or implicit, the biggest challenge for Recommendation Systems is often dealing with negative feedback. If, for example, a user watches ten heavy metal videos on YouTube and gives them all a solid thumbs down, what does the system learn from that activity? Does it stop showing the user Metallica in the recommendations list because he repeatedly had negative reactions to metal music? Or does it decide that he watched ten metal videos in a row and, therefore, suggest more of the same (because, presumably, the user is a glutton for punishment)?
In reality, it depends on the goal of the system. If the aim is simply to keep the user consuming content, because the revenue comes from advertising, the Recommendation System will probably suggest whatever content the user is willing to consume. If the aim is to drive sales, the system will be more inclined to show the user products or content that he or she will actually like. When you have an online store in which users can rate items of clothing, you don’t want your Recommendation System to keep showing red dresses to a user who has negatively rated everything in that color, as it is unlikely to lead to a sale. You don’t want your user to see red.
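For the sales-driven case, the red-dress rule can be sketched as a simple attribute filter: collect the attributes the user has repeatedly down-rated and drop matching candidates from the recommendations. The field names and the threshold are illustrative assumptions:

```python
from collections import Counter

# Goal-dependent handling of negative feedback: for a sales-driven
# store, block attributes the user has repeatedly disliked.
def disliked_attributes(negative_ratings, min_count=3):
    counts = Counter(item["color"] for item in negative_ratings)
    # Only block after repeated negative reactions, not a single one.
    return {color for color, n in counts.items() if n >= min_count}

def recommend(candidates, negative_ratings):
    blocked = disliked_attributes(negative_ratings)
    return [item for item in candidates if item["color"] not in blocked]

negatives = [{"color": "red"}, {"color": "red"}, {"color": "red"}]
catalog = [
    {"name": "red dress", "color": "red"},
    {"name": "blue dress", "color": "blue"},
]
print(recommend(catalog, negatives))  # only the blue dress survives
```

An engagement-driven system would invert this logic and keep serving whatever the user interacts with, which is exactly the goal split described above.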
What if a user doesn’t interact at all?
If you have an online store with two rows of products and you find that people are only clicking on the items you present to them on the first row, what does it really mean? When your recommendation engine decides upon the products that users see, is it so perfectly calibrated that users’ favorite choices are always the first row? Or is there something lurking beneath?
Users could be clicking only on the products in the first row because those are of most interest, but it could also be a problem with the UX. Maybe the site is designed in such a way that the second row is not displayed prominently and doesn’t catch the eye. Remember, I love Spotify, but I didn’t notice that little heart that lets me confirm that I like a track… which perhaps means it is not working in the way it should.
It is therefore not enough to implement a Recommendation System and take the results at face value. You need to ‘A/B test’ your results as much as possible. Test the system against a second version of your store without the system running, so you can measure results. Then test different versions of your interface against one another to see which works best for business.
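The mechanics of such an A/B test can be sketched in a few lines: assign each visitor deterministically to a variant (so they see the same store on every visit), then compare conversion rates between the groups. The experiment name and variant labels are illustrative assumptions:

```python
import hashlib

# Deterministic A/B assignment: hashing the user id guarantees each
# visitor consistently lands in the same variant across sessions.
def variant(user_id: str, experiment: str = "recsys-v1") -> str:
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "with_recommendations" if int(digest, 16) % 2 == 0 else "control"

def conversion_rate(conversions: int, visitors: int) -> float:
    return conversions / visitors if visitors else 0.0

# Compare the store with and without the recommendation engine running.
print(variant("user-123"))            # same answer every time for this user
print(conversion_rate(45, 1000))      # 0.045, i.e. 4.5% in this sample
```

In practice you would also run a significance test on the two rates before declaring a winner, but the stable assignment step is what makes the comparison meaningful at all.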
This is a key element of our own upcoming Saleor Cloud e-commerce solution. You can build multiple versions of your storefront to check that you are getting it right.
You don’t have to go so deep
Some of the explicit and implicit data issues in this article only really matter for the world’s biggest companies. However, they do give you an overview of the deeper data analysis that goes on even in less complex systems. In the first articles of this series, we discussed how you can build a recommendation system of your own, and we then answered some of the FAQs you might ask before you get started. If you are going it alone, we wish you successful and profitable recommendations! If you are looking to build something more complex or want us to do the hard work for you, feel free to contact our data science department and let’s start the conversation. They come highly recommended!
Still hungry for more about data? Check out how far Netflix takes its recommendation system. It’s a great read and helps put the content of our three-part series into a fresh context.
We love to hear your thoughts on our thoughts, so please leave a comment.
Mirumee guides clients through their digital transformation by providing a wide range of services from design and architecture, through business process automation to machine learning. We tailor services to the needs of organizations as diverse as governments and disruptive innovators on the ‘Forbes 30 Under 30’ list. Find out more by visiting our services page.