Teaching Machines to Recommend

Recommendation and personalization are two of the major opportunities of our time. Better recommendations for the content users consume drive better engagement and a better user experience.

Good recommendations help you build high-quality, dynamic products that show promise and perform well.

This is an interesting opportunity for any product, but it is an even bigger one for a publishing company. A publisher puts out a plethora of stories every day, and those stories cater to audiences across a wide spectrum, which presents a unique opportunity for us technology enthusiasts. The opportunity, and the challenge, is to match each story with the readers it will interest, driving engagement, social shares, and revenue, rather than letting the story float through the world-wide web like space junk.

This requires us to understand the core features of our site and tune them through data-driven experimentation, so that we can personalize content to match each reader's preferences and probability of engagement. One very important tool here is a decision tree, so defining your decision tree from the start is critical. That leads naturally to automating recommendation: teaching machines to learn the patterns of personalization and surface the content users are likely to explore. With relevant information flowing to the right users, the increased engagement translates into higher content consumption and higher revenue, along with a better user experience.
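To make the decision-tree idea concrete, here is a minimal sketch of one defined up front as hand-written rules that route each request to a recommendation strategy. The attribute names (`articles_read`, `days_since_visit`) and strategy labels are hypothetical illustrations, not part of the original system; in practice such trees are often learned from data, but fixing the branches early keeps experiments comparable.

```python
# Hypothetical decision tree routing users to a recommendation strategy.
# Attribute names and strategy labels are illustrative assumptions.
def choose_strategy(user):
    """Pick a recommendation strategy from simple user attributes."""
    if user["articles_read"] == 0:
        return "trending"            # cold start: nothing to personalize on
    if user["articles_read"] < 5:
        return "category_popular"    # thin history: lean on broad categories
    if user["days_since_visit"] > 30:
        return "reactivation_mix"    # lapsed reader: blend old interests with fresh news
    return "personalized"            # rich, recent history: full personalization

print(choose_strategy({"articles_read": 12, "days_since_visit": 2}))
```

Defining the tree this explicitly also makes A/B experimentation straightforward, since each branch can be tweaked and measured independently.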

Many companies face this challenge; the goal here is to describe a solution at a high level for enthusiastic engineers looking to solve similar problems.

One area of growing opportunity in publishing is recommending and personalizing the latest news. For example, the latest technology news may be irrelevant to a healthcare professional, BUT technology news that impacts healthcare may be very relevant to that user, and vice versa. The core of the opportunity lies in striking the right balance: surfacing the right content, based on each user's consumption patterns, as soon as new content is published.

This becomes possible with the right kind of data collated and used across the board. The raw data collected should be normalized, and uniquely identified sections of it should be parameterized to best leverage the story the data tells.

Sifting through the content users have historically consumed, and storing the categories, tags, and keywords per profile, provides context for the basic set of algorithms. NLP analysis can extract the keywords and the sentiment of content consumed within a designated recent window. This helps us deliver content a reader is likely to engage with, based on those historical reference points.
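One way to sketch that per-profile history step, under the assumption that each article record carries tags and a read timestamp (the field names here are hypothetical), is a windowed tag counter:

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical article records: each has tags and a read timestamp.
history = [
    {"tags": ["technology", "ai"], "read_at": datetime(2023, 5, 1)},
    {"tags": ["technology", "chips"], "read_at": datetime(2023, 5, 3)},
    {"tags": ["health"], "read_at": datetime(2022, 1, 1)},  # outside the window
]

def build_profile(history, now, window_days=90):
    """Count tags read within a recent window to form a preference profile."""
    cutoff = now - timedelta(days=window_days)
    counts = Counter()
    for article in history:
        if article["read_at"] >= cutoff:
            counts.update(article["tags"])
    return counts

profile = build_profile(history, now=datetime(2023, 5, 10))
print(profile.most_common())  # "technology" counted twice; old "health" read excluded
```

A production version would add keyword extraction and sentiment scores per article, but the windowing and per-profile aggregation are the core of the idea.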

This can be achieved using simple content-based data modeling. The tags and categories play a vital role here: how unique or how broad a tag is determines how strongly it influences the content model. This approach, while very data-heavy, may not be the most effective one on its own.
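A minimal sketch of that tag-weighting idea, with an invented three-article corpus, is to score article-to-article similarity by weighted tag overlap, where rare (unique) tags count for more than broad ones:

```python
import math
from collections import Counter

# Hypothetical corpus: article id -> tag set. A broad tag like "news"
# carries less signal than a narrow one like "quantum-computing".
corpus = {
    "a1": {"news", "technology", "quantum-computing"},
    "a2": {"news", "technology", "chips"},
    "a3": {"news", "health", "nutrition"},
}

# Weight each tag by inverse document frequency: rarer tags score higher.
doc_freq = Counter(tag for tags in corpus.values() for tag in tags)
n_docs = len(corpus)

def tag_weight(tag):
    return math.log(n_docs / doc_freq[tag]) + 1.0

def similarity(a, b):
    """Weighted Jaccard overlap of two articles' tag sets."""
    shared = corpus[a] & corpus[b]
    union = corpus[a] | corpus[b]
    return sum(tag_weight(t) for t in shared) / sum(tag_weight(t) for t in union)

# a1 and a2 share two tags; a1 and a3 share only the broad "news" tag.
print(similarity("a1", "a2") > similarity("a1", "a3"))  # True
```

This is the "data heavy" part in miniature: every article must be tagged, and the weights shift whenever the corpus grows, which is one reason the approach is not sufficient by itself.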

For example, someone reading about fast food may be interested in reading about how organic food is better than fast food.

The most important change from the previous algorithm is incorporating the context and sentiment of the content being consumed. This lets us place relevant ads informed by sentiment analysis, ads the user is most likely to engage with, and it helps us deal with the problem of negative ad placements.

For example, a consumer engaging with negative content about phones catching fire is less likely to interact with an ad about the same phone.
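The phone example above can be sketched as a sentiment gate in front of ad selection. The lexicon, the `eligible_ads` helper, and the brand names are all hypothetical; a real system would use a trained sentiment model, but the gating logic is the same:

```python
# Toy lexicon-based sentiment scorer (illustrative only; a production
# system would use a trained model with the same gating logic).
NEGATIVE = {"fire", "recall", "explosion", "lawsuit", "defect"}
POSITIVE = {"award", "breakthrough", "praised", "record"}

def sentiment(text):
    """Positive minus negative word hits; sign indicates overall tone."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def eligible_ads(article_text, article_entities, candidate_ads):
    """Drop ads for brands mentioned in negatively scored content."""
    if sentiment(article_text) < 0:
        return [ad for ad in candidate_ads if ad["brand"] not in article_entities]
    return list(candidate_ads)

article = "phone recall after battery fire injures user"
ads = [{"brand": "AcmePhone"}, {"brand": "HealthyCo"}]
print(eligible_ads(article, {"AcmePhone"}, ads))  # only HealthyCo remains
```

The same signal can feed recommendation ranking too: content with strongly negative sentiment about an entity is a poor anchor for promoting that entity.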

Network filtering based on the nearest-neighbor algorithm should be layered on top of the historical-data algorithm. It is powerful because we can analyze similar content consumed by similar profiles, growing the test data exponentially, and it helps us tune the features of the algorithm to serve more accurately engaging content. Unfortunately, it works well only for content within the well-defined boundaries of the already-explored realm. This approach still leaves open questions: How do we handle incorrectly labelled content? What about new content, or entirely new types of content? These are unique problems whose answer is simple in concept yet complex in structure. The complexity stems from extrapolating logic across Network Modeling, Content Modeling, Sentiment Analysis, Historical Content Modeling, and Preference-based Profile Modeling. The union of these data models teaches the machine the best approach to pick and choose recommended content for the user.
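The nearest-neighbor layer can be sketched as follows, with invented user profiles and read sets: represent each profile as a tag-count vector, find the most similar profiles by cosine similarity, and recommend what the neighbors have read that the user has not:

```python
import math

# Hypothetical user profiles (tag -> read count) and reading histories.
profiles = {
    "u1": {"technology": 5, "ai": 3},
    "u2": {"technology": 4, "ai": 2, "chips": 1},
    "u3": {"health": 6, "nutrition": 2},
}
reads = {"u1": {"a1"}, "u2": {"a1", "a2"}, "u3": {"a3"}}

def cosine(p, q):
    """Cosine similarity between two sparse tag-count vectors."""
    shared = set(p) & set(q)
    dot = sum(p[t] * q[t] for t in shared)
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def recommend(user, k=1):
    """Recommend unread articles from the k most similar profiles."""
    neighbors = sorted((u for u in profiles if u != user),
                       key=lambda u: cosine(profiles[user], profiles[u]),
                       reverse=True)[:k]
    seen = reads[user]
    return sorted({a for n in neighbors for a in reads[n]} - seen)

print(recommend("u1"))  # u2 is the nearest neighbor, so u1 gets 'a2'
```

Note how this sketch also exposes the limitations named above: a brand-new article appears in no neighbor's read set, and a mislabelled tag distorts the similarity, which is exactly why the other data models must be blended in.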

The algorithm should start by modeling each piece of content as an amalgamation of the data models it falls under, sifting for topic keywords that describe the information within it. This lets us model each user through a Preference Profile. The algorithm should break the Preference Profile into sections based on how the user consumes content across the topics they prefer to read. Then a spatial analysis, using the nearest-neighbor algorithm, identifies how and where the content a user consumes aligns with other profiles. This analysis reveals the reading pattern of each uniquely identified profile, which in turn lets us draw a map connecting profiles: a network of reading patterns that can be fed as parameters into the classifiers. This can be done with various algorithms, but the basic underlying concept is NLP. (This helps us recognize when a keyword in a piece of content is merely explanatory rather than the topic of the piece.)
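The unison of the models described above can be sketched as a weighted blend: each model scores a candidate article in [0, 1], and a tunable weight vector combines them into one ranking score. The signal names and weights here are hypothetical placeholders, not values from the original system:

```python
# Hypothetical blend of the per-model signals; each scorer is assumed to
# return a value in [0, 1], and the weights are parameters to tune later.
def blended_score(content_sim, profile_match, neighbor_overlap, sentiment_fit,
                  weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted combination of the per-model scores for one candidate article."""
    signals = (content_sim, profile_match, neighbor_overlap, sentiment_fit)
    return sum(w * s for w, s in zip(weights, signals))

# Two candidate articles with illustrative per-model scores.
candidates = {
    "a1": (0.9, 0.8, 0.5, 0.7),
    "a2": (0.2, 0.9, 0.9, 0.5),
}
ranked = sorted(candidates, key=lambda a: blended_score(*candidates[a]),
                reverse=True)
print(ranked)
```

A linear blend is the simplest choice; once enough engagement data accumulates, the combiner itself can be replaced by a learned model, as the closing section suggests.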

The results of this algorithm need to be tested for offsets within the result sets we get, and corrections should be made by passing training data through it. It is preferable to have multiple layers of these algorithms, constantly tuned via their parameters, until you reach the result set you expect on the training data.
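That tuning loop can be sketched as a coarse grid search: score labeled training examples (the tiny data set below is invented) under candidate weight settings and keep the weights that best reproduce observed engagement:

```python
# Hypothetical labeled training data: (signal tuple, did the user engage?).
training = [
    ((0.9, 0.8), 1),
    ((0.8, 0.2), 1),
    ((0.1, 0.9), 0),
    ((0.2, 0.1), 0),
]

def score(signals, weights):
    return sum(w * s for w, s in zip(weights, signals))

def accuracy(weights, threshold=0.5):
    """Fraction of training examples where the thresholded score matches the label."""
    hits = sum((score(s, weights) >= threshold) == bool(label)
               for s, label in training)
    return hits / len(training)

# Coarse grid search over weight pairs summing to 1.
grid = [(w, 1 - w) for w in (0.0, 0.25, 0.5, 0.75, 1.0)]
best = max(grid, key=accuracy)
print(best, accuracy(best))
```

In practice each layer would get its own held-out validation set, and a proper optimizer would replace the grid, but the correct-against-training-data loop is the same.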

The recipe you settle on should be validated in a large-scale experiment to see whether the data you collate corroborates your algorithm. Once you hit the right rhythm, you can consistently bring the right content to the right consumers at the right time, increasing engagement, performance, and user experience. And once you have collected enough data from your algorithm, you can run it through neural networks to further improve your data set and results.

Hopefully this is helpful in improving reader engagement and user experience.

Note — This would not have been possible without the help and support of my amazing rockstar team! Thank you — Ryan B, Karen Rosenblatt, Guvenc, John Xitas, David Rankin, Asad Richardson, Milan T, Kevin Meltzer, Adam Childers, Justin Grady, Prasad.