A study on beer: logo detection and analysis on social media
A new class of state-of-the-art object detectors from Computer Vision can provide novel insights into brand analysis.
Why does logo detection matter?
Social media analytics has grown considerably over the past decades and has proved to be a great way for companies to get to know their public better, now that social networks have taken over our lives. This kind of study can provide clues on how the public perceives and interacts with a specific brand. Some companies provide platforms dedicated to social media analytics, such as Buffer, Hootsuite and Sprout Social, among others. In recent years, however, much of the engagement on social media has happened through pictures and videos. This has become a trend not only on image-based social networks such as Instagram and Snapchat, but also on previously more text-based ones such as Twitter and Facebook.
It is now estimated that more than 80% of social media posts contain images or videos. Analyzing only the text and metadata of such posts is clearly not enough.
This makes it a great challenge for companies to automatically extract meaningful information from these posts. But fear not, Computer Vision (CV) comes to the rescue. Analyzing images in order to extract meaningful information has been the objective of this field for many decades. Many CV applications have gained great commercial importance in recent years, such as systems able to detect and recognize faces or read documents.
Despite the success of these applications, until recently it was very difficult to detect logos in unconstrained (in-the-wild) images, such as the ones found on social media. One of the major issues is the variety of objects on which a brand can appear. Think of a Red Bull logo appearing on a can, a billboard, a t-shirt or an F1 car. The system must be able to model the huge variability in how a logo can be used.
Here at Meerkat we started developing a completely new system based on Deep Learning, which is revolutionizing several areas of Computer Vision, including object (and logo) detection. Given some real image examples of a logo (the learning phase), the system learns to identify and localize this logo in any image.
Once we have these images with their associated metadata from social media, we can generate useful information. In this post, we describe a sample study evaluating six beer brands in Twitter posts.
To briefly show the kind of information that can be retrieved by detecting logos in social network images, we gathered tweets from the last 6 months that contained words commonly used in the context of beer, such as beer, cerveza, bar, barbecue, etc. We trained our system to detect the logos of the following beer brands: Budweiser, Bud Light, Corona, Guinness, Heineken and Stella Artois.
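As a rough illustration of the keyword filter described above, the sketch below keeps tweets that carry an image and mention at least one beer-related word. The tweet layout (`text`, `has_image`) and the function names are assumptions made for this example, not the format used by our pipeline; the keyword list contains only the words quoted above.

```python
import re

# Keywords quoted in the post; the list actually used was longer ("etc.").
BEER_KEYWORDS = {"beer", "cerveza", "bar", "barbecue"}


def mentions_beer(text, keywords=BEER_KEYWORDS):
    """Return True if any keyword appears as a whole word in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return not keywords.isdisjoint(words)


def select_candidates(tweets):
    """Keep tweets that carry an image and mention a beer keyword.

    Each tweet is assumed to be a dict with "text" and "has_image" keys
    (a hypothetical layout chosen for this sketch).
    """
    return [t for t in tweets if t.get("has_image") and mentions_beer(t["text"])]
```

Matching on whole words avoids false hits such as "bar" inside "Barbara", at the cost of missing hashtags like "#beertime"; a real filter would need to decide how to treat those cases.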
Over this 6-month period we retrieved more than one million tweets with images. Only a small fraction of these images contained a logo from one of these brands. All the images in these posts were gathered automatically, without any manual input.
Take a look at some of the images gathered by the system for the Heineken brand:
We can clearly observe several different instances of the same logo. The fact that our detector can correctly model this huge variability is a direct result of the modern deep-learning-based systems proposed in the last couple of years.
Here are some more cool examples from other brands:
So, we found which tweets contained a picture of a given brand. Now what do we do with this information? A no-brainer is to index the tweet metadata so that we can search this information and extract some statistics. We did exactly this using Elasticsearch and Kibana. More details on the whole pipeline will be provided in a different Medium post ;-). In the meantime, here is a gif showing the system working:
Twitter posts stats
Disclaimer: Because the tweets were extracted from a relatively short period (six months), the following statistics are only an illustration of what a logo detection and brand analysis system can do. They should be taken with a grain of salt, since they may be affected by outliers.
So, here come the stats. Ready? First, let's take a look at the raw number of posts for each brand of beer and how these numbers relate to the market share of each company:
Notice that the brand most present in tweets was Corona, followed by Heineken and Bud Light. Also, there is no apparent correlation between the number of posts containing a logo and the market share of that brand. A clear example is Guinness, which has a small market share (~1%), yet has a big presence on social media (~11% in our dataset).
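One simple way to express this comparison is the ratio between a brand's share of logo posts and its market share. The sketch below is only an illustration: the function names are made up for this example, and the only figures used are the approximate Guinness numbers quoted above.

```python
def post_shares(logo_counts):
    """Convert per-brand post counts into percentages of all logo posts."""
    total = sum(logo_counts.values())
    if total == 0:
        return {brand: 0.0 for brand in logo_counts}
    return {brand: 100.0 * n / total for brand, n in logo_counts.items()}


def presence_ratio(post_share_pct, market_share_pct):
    """Share of posts divided by market share.

    A value above 1 means the brand shows up in posts more often than its
    market share alone would suggest.
    """
    return post_share_pct / market_share_pct


# Approximate Guinness figures from the text: ~11% of posts, ~1% market share.
guinness_ratio = presence_ratio(11.0, 1.0)
```

With those rounded figures, Guinness appears in posts roughly eleven times more often than its market share would suggest.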
This type of information is obviously of great importance for a company, since it reveals a high or low level of connection with the brand. This can have different causes. From the work of Doorn and colleagues: "[…] customer engagement behavior can be associated with antecedents Customer-Based factors like customer satisfaction, brand commitment, trust, brand attachment, and brand performance perceptions. Generally speaking, very high or very low levels of these factors can lead to engagement."
Additionally, it would be nice to see how brand presence varies across geographic locations. However, only 4% of the tweets with images had geo coordinates, which is not enough for a meaningful analysis. What we chose to do instead was to extract the location string that users provide on their profile pages. By processing those entries with the Google Maps API we were able to extract a location for around 73% of the dataset.
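The fallback described above can be sketched as follows: use the tweet's own coordinates when they exist, otherwise geocode the free-text location from the user profile. The field names (`coordinates`, `user_location`) are assumptions for this example, and `geocode` stands for any callable that maps a place name to a (lat, lon) pair or None. In our study that role was played by the Google Maps API, whose client code is not shown here.

```python
def resolve_location(tweet, geocode):
    """Return (lat, lon) for a tweet, or None if it cannot be located.

    `geocode` is any callable mapping a free-text place name to a
    (lat, lon) tuple or None (a stand-in for a real geocoding service).
    """
    coords = tweet.get("coordinates")
    if coords is not None:
        return tuple(coords)  # exact geotag, when present
    profile = (tweet.get("user_location") or "").strip()
    if not profile:
        return None  # no geotag and no profile location
    return geocode(profile)


def coverage(tweets, geocode):
    """Fraction of tweets for which some location could be resolved."""
    if not tweets:
        return 0.0
    located = sum(resolve_location(t, geocode) is not None for t in tweets)
    return located / len(tweets)
```

Keeping the geocoder as a parameter makes it easy to cache results or to swap in a different service, which matters because many users share the same profile location string.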
To evaluate this data, we plot the percentage of each beer for the top 5 countries that appeared in the dataset (graph on the left). It's interesting to see that Heineken and Guinness appear to have more geographically distributed posts, while Bud Light is heavily concentrated in North America.
More Computer Vision: face and gender analysis
Within our system we also detected people and classified them according to their gender, which is right in Meerkat's area of expertise. This can lead to interesting data as well. One aspect we can investigate is the probability of a share (in this case a retweet, RT) given that a face appears in the image.
It’s well known that faces can contribute greatly to the number of shares/likes/retweets a post gets. In the graph on the left we compute the ratio of retweets to tweets (RT/T) for posts with no face and for posts with at least one face present. In our dataset, this ratio is 56% larger when faces are present.
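For readers who want to reproduce this kind of comparison, here is a minimal sketch of the RT/T computation. It assumes each post is a dict with a `retweets` count and a `faces` count (hypothetical field names), and it reads RT/T as total retweets divided by the number of original tweets, which is one plausible interpretation of the ratio.

```python
def rt_per_tweet(posts):
    """Total retweets divided by the number of original tweets (RT/T)."""
    if not posts:
        return 0.0
    return sum(p["retweets"] for p in posts) / len(posts)


def face_uplift(posts):
    """Relative increase in RT/T for posts with a face over posts without.

    Returns None when there is no usable no-face baseline.
    """
    with_face = [p for p in posts if p["faces"] >= 1]
    no_face = [p for p in posts if p["faces"] == 0]
    base = rt_per_tweet(no_face)
    if base == 0:
        return None
    return rt_per_tweet(with_face) / base - 1.0
```

Under this reading, a result of 0.56 from `face_uplift` would correspond to the 56% figure reported above.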
Another interesting possibility here is to look at the relationship between gender and posts. In the graph we show the overall percentage of each detected gender across all posts.
It's clear that combining different sources of image information can further increase the relevance of the extracted data. Below are some pictures showing the detection of different faces/genders for different beers:
That's it! The takeaway here is that the recent leap in Computer Vision provided by Deep Learning has not yet been fully translated into useful applications. This study was just a small sample of what is possible with logo detection technology. In the future we plan to add more brands to our system and get much more data from social networks. This will allow us to extract many different insights, such as the relationships between brands. For instance, we may discover a high correlation between Nike users and McDonald’s. We can also use computer vision methods to detect inappropriate content linked with a brand. This technology is also being ported to mobile and can serve many AR applications.
So, how is your brand being advertised by your consumers? Are you following the references to your brand on social media? How is your brand positioned in relation to your competitors?
If you want to contact us: firstname.lastname@example.org