What we talk about when we talk about space heaters

Adventures in Data Mining

My Victorian terrace in Melbourne, Australia

The Inspo

I attended graduate school in Melbourne, Victoria, Australia, where I lived in a two-story brick terrace house built in Victorian times. I loved the sunny morning coffees, summer “sunbakes” on the terrace and convenient location of this beautiful, historic house, but I did not love the winter cold! Forty-degree (Fahrenheit) weather in a house with high ceilings and heat-absorbing bricks gave the house the feeling of a cave. Or an icebox. I’m sure it felt comfortable to settlers who considered southern Victoria a sauna compared to the British Isles (or they just used the four fireplaces, which have since been blocked). But for me, from mid-May to August it was a struggle to get out of bed or study in a house so cold.

Yet the volume of the house, with two stories of 14-foot ceilings, was so large that I was reluctant to turn on the furnace; plus, the lack of upstairs vents meant that the heat never made it to my bedroom. I thought maybe I could save on the gas bill by getting space heaters for just a couple of rooms. As a lazy American who likes to acquire products by clicking a button, I turned to my trusty old friend Amazon.

You can’t actually buy most products from Amazon in Australia due to import restrictions and the high cost of shipping, but I took a look for research purposes anyway. I quickly found that the seemingly simple task of buying a heater was easier said than done. I realized I knew very little about thermodynamics (is that even the right word??) and had a lot of reading to do. Immediately I was overwhelmed. So many types of heaters! Such a range of prices! The design of each heater is suited to a different purpose, and each may have different effects; for example, one might keep you toasty warm but completely run up your electric bill. I had to analyze: Which one is best for my needs? What trade-offs do I need to make?

Software and Economics

I ended up doing a software development project on this problem: a multi-objective search for a product about which the consumer does not necessarily have prior knowledge. Before, people would go into Best Buy or Sears and get an earful from some sales rep in a polo shirt. Now, in e-commerce, the product knowledge can be crowd-sourced from the reviews, which in aggregate may or may not be more accurate than a single voice claiming to have knowledge.

I faced the problem with a vision for what might be useful to a consumer: a web app containing interactive scatterplot visualizations of the product features. I studied economics in undergrad. What would a micro-economist say about choosing between products in a multi-objective search, for example with the objectives of low Price and high “Quality”?

Well, economics is a science of 2D line charts, like Supply and Demand. A microeconomist might say: plot the entire product market space along the axes of Price and Quality, then look for the outliers that give you a good Quality-Price trade-off. The same logic could be used for any two aspects of a product, for example a space heater’s ability to heat to a nice temperature and its effect on your electric bill.

Discover trade-offs and look for outliers

This thinking recognizes that there is no such thing as a “perfect” product for someone with specific tastes and multiple objectives, because no product is going to score 5/5 on every dimension by the laws of physics and of the market. The best we can do is get ourselves to some optimum.
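One way to make "look for the outliers" concrete is a Pareto front: keep only the products that no other product beats on both axes at once. The sketch below is a minimal illustration in Python, not the project's actual code; the `Product` type, the heater names and the numbers are all made up for the example.

```python
from typing import NamedTuple


class Product(NamedTuple):
    name: str
    price: float    # lower is better
    quality: float  # higher is better (e.g. a 1-5 score)


def dominates(a: Product, b: Product) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.price <= b.price and a.quality >= b.quality
    strictly_better = a.price < b.price or a.quality > b.quality
    return no_worse and strictly_better


def pareto_front(products: list[Product]) -> list[Product]:
    """Keep only the products that no other product dominates."""
    return [p for p in products
            if not any(dominates(q, p) for q in products)]


# Hypothetical heaters, purely for illustration.
heaters = [
    Product("A", 30.0, 3.1),
    Product("B", 45.0, 4.2),
    Product("C", 60.0, 4.0),  # dominated by B: pricier and lower quality
    Product("D", 90.0, 4.7),
]
front = pareto_front(heaters)
print([p.name for p in front])  # ['A', 'B', 'D']
```

Everything left on the front is a defensible choice; which point you pick depends on how much you personally value each axis.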

But how can we figure out the labels to those axes without feeding them in by hand? And how can we score each product along those axes? That’s where data mining comes in.

Fun with Data Mining

Overall, the problem can then be broken down into two sub-tasks:

  1. Identify the relevant product “features” or aspects that people need to know about (label the axes)
  2. Determine consumer attitudes to each feature or aspect of a product as implied by the reviews (score each product along the axes)

The second problem is essentially sentiment analysis, which is so prevalent these days that I won’t go into detail about it. (For a comprehensive overview, see Pang and Lee.) I tested a few machine learning methods, SentiWordNet, and also simply the human-supplied star ratings.
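The simplest of those options, leaning on the star ratings, might look something like the sketch below: for one product, average the stars of every review that mentions a feature. This is an assumed reading of "using the star ratings", not the project's actual scoring code; the function name `score_features`, the substring matching and the sample reviews are all hypothetical.

```python
from collections import defaultdict


def score_features(reviews, features):
    """Average star rating of the reviews that mention each feature.

    reviews: list of (text, stars) pairs for ONE product.
    features: list of lowercase feature words or phrases.
    Returns {feature: mean stars}; features nobody mentions are omitted.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, stars in reviews:
        lowered = text.lower()
        for feature in features:
            # Naive substring match: "fan" would also match "fantastic".
            if feature in lowered:
                totals[feature] += stars
                counts[feature] += 1
    return {f: totals[f] / counts[f] for f in counts}


# Made-up reviews for a single hypothetical heater.
reviews = [
    ("The thermostat is accurate and the fan is quiet.", 5),
    ("Thermostat broke after a week.", 1),
    ("Quiet fan, nice remote.", 4),
]
scores = score_features(reviews, ["thermostat", "fan", "remote", "timer"])
print(scores)  # {'thermostat': 3.0, 'fan': 4.5, 'remote': 4.0}
```

A whole-review star rating is a blunt proxy for opinion about one feature, which is the main reason to compare it against sentence-level sentiment methods.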

But how do we identify the product features?

Researchers have been working on this problem for over ten years, with Hu and Liu’s 2004 paper on opinion mining of Amazon product reviews being one of the seminal works. Hu and Liu used an algorithm called association rule mining to extract product features like “picture quality” and “size” from consumer reviews of a digital camera. They also made the important observation that most product features are nouns, so they looked only at nouns and noun phrases.

There are many more tricks you can do, applying techniques like clustering, association rule mining, SVM classifiers, topic modeling…

But this all makes the process seem like very intimidating computer science, when the underlying assumption is simple and should lead to a simple conclusion: When people write product reviews about products in the same category, they all end up talking about the same things. Essentially, the discussion converges upon the relevant features or aspects of the products.

So, all we really have to do is find the most frequently mentioned words and phrases aggregated across the products and reviews, and filter to find only nouns using a part-of-speech tagger. I used what I call a “Frankenstein” method of finding popular words, two-word phrases and three-word phrases, each through a slightly different method.

For single words, I looked for popular words but also benchmarked them against a body of text representing more “generic” web English (the NPS Chat corpus), using the assumption that people tend to mention product features at higher frequency in reviews than otherwise.
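As a rough illustration of that benchmarking step, the sketch below ranks words by the ratio of their relative frequency in the reviews to their relative frequency in a reference text. The function name `keyness`, the add-one smoothing, the `min_count` threshold and the toy token lists are assumptions made for the example; the real project used the NPS Chat corpus as the reference and also filtered to nouns, which is left out here.

```python
from collections import Counter


def keyness(review_tokens, reference_tokens, min_count=2):
    """Rank words by how over-represented they are in the reviews.

    Returns (word, ratio) pairs, highest ratio first, where
    ratio = relative frequency in reviews / relative frequency in reference.
    Add-one smoothing keeps words unseen in the reference finite.
    """
    review_counts = Counter(review_tokens)
    reference_counts = Counter(reference_tokens)
    n_review = len(review_tokens)
    n_reference = len(reference_tokens)
    vocab_size = len(set(review_counts) | set(reference_counts))

    scored = []
    for word, count in review_counts.items():
        if count < min_count:
            continue  # too rare in the reviews to be a "popular" word
        p_review = count / n_review
        p_reference = (reference_counts[word] + 1) / (n_reference + vocab_size)
        scored.append((word, p_review / p_reference))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data standing in for the review text and the generic corpus.
review_tokens = ("the heater has a thermostat and the thermostat works "
                 "the heater is quiet").split()
reference_tokens = "the cat sat on the mat and the dog is on the mat".split()
ranked = keyness(review_tokens, reference_tokens)
for word, ratio in ranked:
    print(f"{word}: {ratio:.2f}")
```

In this toy run "heater" and "thermostat" rank above "the", because "the" is just as common in the generic text as in the reviews.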

For two-word phrases, I looked for sets of words which were highly dependent on each other, using Pointwise Mutual Information, though also above a frequency threshold.
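For reference, the PMI of a word pair is log2( P(x, y) / (P(x) P(y)) ): how much more often the two words appear together than they would if they were independent. Below is a minimal pure-Python sketch of that scoring with a frequency floor. The function name `bigram_pmi`, the `min_count` default and the toy sentence are assumptions for illustration, not the project's code.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ).
    Pairs seen fewer than min_count times are dropped, since PMI is
    inflated for rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy token stream, made up for the example.
tokens = ("the remote control works and the remote control is small "
          "and the heater is small and the heater works").split()
top_bigrams = bigram_pmi(tokens)
for pair, score in top_bigrams:
    print(pair, round(score, 2))
```

On this toy input "remote control" comes out on top, but "is small" scores just as high, which is a good reminder of why the noun filter matters.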

I found using three-word phrases led to too much redundancy due to synonyms, so I only included trigrams with high PMI and frequency that were composed of important bigrams (for example “oil filled radiator”).
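The "composed of important bigrams" rule can be sketched as a simple filter: keep a trigram only if it is frequent enough and both of its overlapping bigrams already made the cut. The sketch below leaves out the trigram PMI check for brevity; `filter_trigrams`, the threshold and the sample text are hypothetical.

```python
from collections import Counter


def filter_trigrams(tokens, important_bigrams, min_count=2):
    """Keep frequent trigrams whose two overlapping bigrams were both kept."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    kept = {}
    for (a, b, c), count in trigrams.items():
        if count < min_count:
            continue
        if (a, b) in important_bigrams and (b, c) in important_bigrams:
            kept[(a, b, c)] = count
    return kept


# Toy text; assume these two bigrams survived the bigram step.
tokens = ("this oil filled radiator is great . "
          "my oil filled radiator is silent . "
          "the radiator is great").split()
important_bigrams = {("oil", "filled"), ("filled", "radiator")}
trigrams = filter_trigrams(tokens, important_bigrams)
print(trigrams)  # {('oil', 'filled', 'radiator'): 2}
```

Frequent but uninteresting trigrams such as "radiator is great" drop out because "radiator is" was never an important bigram.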

This required a bit of threshold tuning, which is where your good ol’ human brain comes in. Caveats remain. Synonyms may arise, which probably calls for some kind of thesaurus-based matching. Also, people might write the same word in slightly different forms, such as plural and singular; this may require additional rules, or comparing word roots (stems) instead of raw strings in the mining phase.
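To give a feel for the plural/singular caveat, here is a deliberately crude sketch that folds plurals into singulars before counting. The suffix rules in `crude_singular` are hypothetical and will get plenty of English words wrong; a proper stemmer or lemmatizer would be the more robust choice.

```python
from collections import Counter


def crude_singular(word):
    """Very rough English singularizer; a stand-in for a real stemmer."""
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"   # batteries -> battery
    if len(word) > 4 and word.endswith(("ches", "shes", "sses", "xes")):
        return word[:-2]         # switches -> switch
    if (len(word) > 3 and word.endswith("s")
            and not word.endswith(("ss", "us", "is"))):
        return word[:-1]         # heaters -> heater
    return word


def merged_counts(tokens):
    """Count words after lowercasing and folding plurals into singulars."""
    return Counter(crude_singular(t.lower()) for t in tokens)


counts = merged_counts(
    "heater Heaters heater switches switch batteries glass".split())
print(counts["heater"], counts["switch"], counts["battery"], counts["glass"])
# 3 2 1 1
```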

Still, with my Frankensteined method I was able to find a set of interesting words and phrases:

With this simple strategy, we “bootstrap” our knowledge a bit, and get ourselves to think about words and phrases that we might not have been thinking of before. For example, when I tested this on “wireless printer”, one of the mined features I found was “Google Cloud”. Having not purchased a printer in many years, I did not even know that was an available feature but definitely wanted to look at printers with Google Cloud print capability.

Further, once we mine a set of features we now have created structure in our unstructured data. From there we can move to sentiment analysis and relationship mining, and go from this:

to this:

Other Possibilities: Market Research

My web app was envisioned to help consumers make a better-informed choice, but review mining can easily lead us to useful information for manufacturers, product designers, marketers and others on the supply side.

Somewhat obviously, these reviews provide instant feedback on a product’s shortcomings, as well as helpful suggestions, and this aggregation framework also saves the marketer a lot of time. When the reviews are aggregated across the market, a marketer can also see how their product stacks up against the competition.

The reviews also shed light on consumers’ specific use cases. I was surprised to discover that many consumers use their space heaters in the bathroom. They hate stepping out of a shower into cold air. Some even start their heaters a few minutes before they enter the bathroom in the morning. The frequency with which this use case popped up certainly motivates features like timers and bathroom-safe cords.

We still end up having to inspect the text to discover stories like this, but extracting features also bootstraps this process, identifying the spots in the data where we should examine context.

From Numbers to Creativity

That the original text remains important shows that data mining is powerful but does not tell the whole human story. In the end, I aggregated the numbers, which created quick and easy-to-read visual summaries, but I found I still always wanted to see the original text. Luckily, linking the two is easy with JavaScript.

OK, so we looked at frequently mentioned nouns and noun phrases. But what about the infrequently mentioned words?

I had a lot of fun looking at the rarer words in the space heater reviews. Many of the “singletons” were just misspellings of more common words, but in other cases, you get to thinking, why on earth did someone use that word in a review of a space heater?
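Pulling out the singletons takes only a few lines. The sketch below is a minimal version, assuming a simple lowercase regex tokenizer and made-up reviews; note that lowercasing is exactly what hides proper nouns.

```python
import re
from collections import Counter


def singletons(reviews):
    """Return the words that occur exactly once across all reviews, sorted."""
    words = re.findall(r"[a-z']+", " ".join(reviews).lower())
    counts = Counter(words)
    return sorted(w for w, c in counts.items() if c == 1)


# Made-up reviews for illustration.
reviews = [
    "This heater is quiet",
    "The heater is as loud as an airliner",
    "Quiet heater, this is great",
]
rare = singletons(reviews)
print(rare)  # ['airliner', 'an', 'great', 'loud', 'the']
```

On a real review set you would want to drop stopwords first, so that the oddities stand out.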

Examples: saipan, scavenge, counseling, airliner, suicidal, membrane, galaxies, dutchman, etruscan, whoopi, yellowstone, macaws, bodega

Some of these turn out to be proper nouns that got lost when I made all the words lowercase (e.g. Dutchman is a kind of RV — many people use heaters in their RVs or camping tents; “bodega” was Bodega Bay) but it makes a fun game of mad libs that encourages creativity:

(My guess) “This space heater was as noisy as an airliner, it triggered my PTSD bringing me back to the Battle of Saipan. I had to go in for counseling lest I become suicidal.”

Too dark?

I found some really great imagery and ideas throughout the actual reviews in which people exercised their creativity:

“Our Chihuahua loves to lay in front of it and soak up the heat.”

“Yeah, I could see Whoopi Golberg’s [sic] character warming her hands in front of this artsy and effective space heater.”

“I have been RV trailer camping for 14 years. My hobby is dark remote locations to photograph galaxies with my telescope. It can get very cold at 8000 feet.”

“alright ladies if youre looking for a heater that will keep you warm those nights your man is up late playing video games or something nerdy this heater is the one for you! It is an excellent listen and will never complain about drama you bring to the table. Just dont cuddle with it…”

If these aren’t good fodder for an ad campaign, I don’t know what is. The image of the chihuahua loving the heater brings so much warmth to a cold metal product. The joke about a heater being better than a man is so clever. The idea of a guy bundled in Patagonia at 8000 feet photographing galaxies with a telescope and then warming up with that space heater at night brings so much adventure to a potentially boring product.

This project involved a computer mining text and aggregating it into numbers, but let’s not forget these human stories that create volumes of emotional association with just a few nuggets.