Reporting / Visualizations

To visualize our results, we used a few different techniques.

Word Clouds

We found a word cloud library online, created by Andreas Mueller, and used it to create word clouds for our wines. For our top-reviewed wine, we created a preliminary word cloud. It does not take the weight of the words into account, so some uninteresting words remain (we removed a few using the built-in stop words feature), but it contains plenty of interesting words and shows what we can do with the word cloud library.
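A minimal sketch of how such a preliminary cloud might be built is shown here. It assumes the `wordcloud` package is installed; the function names `merge_stopwords` and `make_word_cloud` and the extra stop words are our own illustrations, not part of the library.

```python
def merge_stopwords(builtin, extra):
    """Combine the library's stop words with our own, lower-cased."""
    return set(builtin) | {word.lower() for word in extra}


def make_word_cloud(review_text, extra_stopwords=()):
    """Render an unweighted word cloud from raw review text."""
    # Imported here so the helper above works without the package installed.
    from wordcloud import WordCloud, STOPWORDS

    stopwords = merge_stopwords(STOPWORDS, extra_stopwords)
    cloud = WordCloud(stopwords=stopwords, background_color="white")
    # generate() tokenizes the text and sizes words by raw count.
    return cloud.generate(review_text)


# Hypothetical usage: drop a few words that appear in nearly every review.
# make_word_cloud(all_reviews_text, ["wine", "bottle"]).to_file("cloud.png")
```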

As we move forward, we can use our word weights to improve the word cloud. We can also change its shape, colors, size, and so on. Using the mask feature of the wordcloud library, we were able to create the following.
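To illustrate the mask feature, here is a rough sketch that confines the words to a circle. The circular shape and the helper names are assumptions made for the example; a mask loaded from an image file would work the same way, since the library treats pure white (255) cells as off limits.

```python
def circular_mask(size=300, margin=20):
    """Return a size x size grid: 0 inside a centred circle, 255 outside."""
    centre = (size - 1) / 2
    radius = size / 2 - margin
    return [
        [0 if (x - centre) ** 2 + (y - centre) ** 2 <= radius ** 2 else 255
         for x in range(size)]
        for y in range(size)
    ]


def masked_word_cloud(text, mask_rows):
    """Draw a word cloud only where the mask is not white (255)."""
    # Imported here so circular_mask() works without these packages.
    import numpy as np
    from wordcloud import WordCloud

    mask = np.array(mask_rows, dtype=np.uint8)
    return WordCloud(mask=mask, background_color="white").generate(text)
```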

Update: We were able to take the frequencies of each word into account and use those to weight our word clouds. We decided to create word clouds of the most frequently used words to describe each variety of wine.
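The frequency weighting could look roughly like the sketch below. The input format (pairs of variety and review text), the simple regex tokenizer, and the function names are assumptions for illustration; `generate_from_frequencies` is the library call that accepts a word-to-count mapping.

```python
import re
from collections import Counter


def word_frequencies(reviews, stopwords=frozenset()):
    """Count how often each non-stop word appears across the reviews."""
    counts = Counter()
    for review in reviews:
        for word in re.findall(r"[a-z']+", review.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


def frequencies_by_variety(rows, stopwords=frozenset()):
    """rows: iterable of (variety, review_text) pairs."""
    grouped = {}
    for variety, text in rows:
        grouped.setdefault(variety, []).append(text)
    return {v: word_frequencies(texts, stopwords) for v, texts in grouped.items()}


def frequency_cloud(freqs):
    """Size each word by its count instead of re-tokenizing raw text."""
    from wordcloud import WordCloud  # lazy import; needs the wordcloud package

    return WordCloud(background_color="white").generate_from_frequencies(freqs)
```

With this in place, something like `frequency_cloud(frequencies_by_variety(rows)["Chardonnay"])` would produce one cloud per variety.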

Here is the word cloud for Chardonnay:


Additionally, we were able to find the most frequently used unique words for wines binned by price, which had some interesting results.

0–15: {‘drinkable’, ‘everyday’, ‘fairly’, ‘ok’, ‘smell’, ‘grapefruit’, ‘acidic’, ‘thin’, ‘refreshing’}
15–20: {‘sweetness’}
20–30: {‘syrah’}
30–40: {‘champagne’}
40–70: EMPTY
70–100: {‘cabernet’}
100+: {‘rim’, ‘cork’, ‘huge’, ‘concentrated’, ‘powerful’, ‘bouquet’, ‘pure’, ‘blind’, ‘closed’}
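One way to compute lists like these is sketched below. It rests on several assumptions: "unique" is taken to mean a word that is among the top `top_n` words of one price bin and of no other, the bins are treated as half-open (a wine priced at exactly 15 falls in 15–20), and the cutoff `top_n` is an arbitrary value chosen for the example.

```python
from collections import Counter

# Bin edges from the list above; the last bin is open-ended.
PRICE_BINS = [(0, 15), (15, 20), (20, 30), (30, 40),
              (40, 70), (70, 100), (100, float("inf"))]


def price_bin(price):
    """Return the (low, high) bin containing price, with low <= price < high."""
    for low, high in PRICE_BINS:
        if low <= price < high:
            return (low, high)
    raise ValueError(f"price out of range: {price}")


def unique_top_words(rows, top_n=50):
    """rows: iterable of (price, list_of_words) pairs, one per review."""
    counts = {b: Counter() for b in PRICE_BINS}
    for price, words in rows:
        counts[price_bin(price)].update(words)
    # The top_n most frequent words in each bin.
    top = {b: {w for w, _ in c.most_common(top_n)} for b, c in counts.items()}
    unique = {}
    for b, words in top.items():
        others = set().union(*(w for other, w in top.items() if other != b))
        unique[b] = words - others  # keep words no other bin shares
    return unique
```

A bin can come out empty, as 40–70 did, when every one of its frequent words is also frequent in a neighboring bin.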

Heat Map

Another visualization of our data is a heat map of our wine similarity matrix. Since the similarity matrix for all our wines was so large, we decided to create a heat map of the similarity matrix for two hundred wines.

Heatmap of similarity matrix (N = 200)
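A sketch of how a 200-wine heat map could be drawn from the full matrix follows. Choosing the wines at random with a fixed seed is an assumption made for the example, as are the function names and the colormap; the plotting helper needs matplotlib.

```python
import random


def sample_submatrix(similarity, n=200, seed=0):
    """Pick n wines at random and return (indices, n x n submatrix)."""
    size = len(similarity)
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(size), min(n, size)))
    # Keep the same wines on both axes so the diagonal stays self-similarity.
    sub = [[similarity[i][j] for j in indices] for i in indices]
    return indices, sub


def plot_heatmap(sub, title="Heatmap of similarity matrix"):
    """Draw the submatrix as a colored grid with a colorbar."""
    import matplotlib.pyplot as plt  # lazy import; needs matplotlib

    fig, ax = plt.subplots(figsize=(6, 5))
    image = ax.imshow(sub, cmap="viridis")
    fig.colorbar(image, ax=ax, label="similarity")
    ax.set_title(f"{title} (N = {len(sub)})")
    return fig
```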

Wine Aroma Wheel

As the project progressed, we realized that we would not create a typical wine aroma wheel. Instead, we took the top words used to describe different varieties of wine and created a sunburst graph of those.

Here we took eight varieties of wine (which we picked by hand) and looked at the top words for each variety based on frequency. We then picked five words from each, leaving out words shared by all the varieties and words that were not exactly tasting notes (like smooth, medium, etc.). This gave us five flavors/tasting notes for each variety of wine.
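The selection described above can be approximated in code, as in the sketch below. It is an automated stand-in for a step we did partly by hand: the `non_notes` set, the function names, and the choice of plotly for the sunburst are all assumptions made for the example.

```python
def pick_tasting_notes(top_words, non_notes, per_variety=5):
    """top_words: dict of variety -> words ordered by descending frequency."""
    if not top_words:
        return {}
    # Words that every variety shares tell us nothing about any one of them.
    shared = set.intersection(*(set(words) for words in top_words.values()))
    picked = {}
    for variety, words in top_words.items():
        keep = [w for w in words if w not in shared and w not in non_notes]
        picked[variety] = keep[:per_variety]
    return picked


def sunburst_nodes(picked):
    """Flatten the picks into parallel ids/labels/parents lists."""
    ids, labels, parents = [], [], []
    for variety, notes in picked.items():
        ids.append(variety)
        labels.append(variety)
        parents.append("")  # varieties form the inner ring
        for note in notes:
            ids.append(f"{variety}/{note}")  # unique even if notes repeat
            labels.append(note)
            parents.append(variety)
    return ids, labels, parents


def sunburst_figure(picked):
    import plotly.graph_objects as go  # lazy import; needs plotly

    ids, labels, parents = sunburst_nodes(picked)
    return go.Figure(go.Sunburst(ids=ids, labels=labels, parents=parents))
```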