The Effect of Explanations and Algorithmic Accuracy on Visual Recommender Systems of Artistic Images

Vicente Dominguez, Pablo Messina, Ivania Donoso-Guzmán, Denis Parra

Yoav Navon
Oct 6, 2019


This paper evaluates how useful explanations are in a recommender system for works of art. The authors ran the study with participants recruited through the Mechanical Turk platform, who were shown different variations of the system.

The first variation was the recommendation algorithm; both options were content-based. One used features extracted by a Deep Neural Network (DNN), and the other used Attractiveness Visual Features (AVF). The second variation was the explanation shown to users, across 3 different interfaces. The first interface presented recommendations without any explanation. The second explained each recommendation by showing similar images. The third applied only to AVF and consisted of a bar chart of the features.
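To make the "similar images as explanations" idea concrete, here is a minimal sketch of a content-based recommender that attaches the most similar images as the explanation for each recommendation. It is not the authors' implementation. The function names, the toy 3-dimensional vectors, the use of cosine similarity, the choice to score a candidate by its best match, and the assumption that the explanation images come from the user's liked items are all mine, chosen only for illustration. In practice the vectors would be DNN embeddings or AVF values.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recommend_with_explanations(liked, candidates, n_recs=2, n_explain=3):
    """Rank candidates by their best similarity to any liked image.

    Returns (candidate_id, explanation) pairs, where the explanation is the
    list of liked images most similar to that candidate.
    Scoring by the single best match is an assumption of this sketch.
    """
    scored = []
    for cand_id, cand_vec in candidates.items():
        # Similarity of this candidate to every image the user liked.
        sims = sorted(
            ((cosine(cand_vec, vec), liked_id) for liked_id, vec in liked.items()),
            reverse=True,
        )
        score = sims[0][0]
        explanation = [liked_id for _, liked_id in sims[:n_explain]]
        scored.append((score, cand_id, explanation))
    scored.sort(reverse=True)
    return [(cand_id, explanation) for _, cand_id, explanation in scored[:n_recs]]


# Hypothetical feature vectors standing in for real image features.
liked = {
    "liked_a": [1.0, 0.0, 0.0],
    "liked_b": [0.0, 1.0, 0.0],
    "liked_c": [0.0, 0.0, 1.0],
    "liked_d": [1.0, 1.0, 0.0],
}
candidates = {
    "cand_x": [0.9, 0.1, 0.0],
    "cand_y": [0.0, 0.2, 0.9],
}

recs = recommend_with_explanations(liked, candidates)
for cand_id, explanation in recs:
    print(cand_id, "is recommended because it resembles", explanation)
```

An interface without explanations would show only the recommended item from each pair, while an interface with image-based explanations would also display the three listed images next to it.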

As expected, perceived explainability was higher in the interfaces with explanations. The same held for the Trust, Relevance and Diversity ratings. The authors also noted that the recommendation algorithm mattered, because the DNN model achieved significantly better ratings.

One question I have about the research is that, for the DNN model, interfaces 2 and 3 are exactly the same: both give the user a view of the top-3 images most similar to the recommended item. Nevertheless, some ratings (Diversity, Interface Satisfaction) differed significantly between the two interfaces, which shouldn’t happen if they are identical. I suspect this is due to the small sample size of the study, since only 41 and 39 people participated in interfaces 2 and 3 respectively.

Another question concerns the differences between algorithms in interface 3. It is not clear whether they were due to the model or to the bar chart visualization used for AVF. It is probably a mix of the two, but measuring this would be interesting.

