Want to know customers' preferences for a new product's features? Here is a simple example of conjoint analysis.
We have all read about conjoint analysis, but when it comes to implementation we get stuck on one thing or another. Here is a simple approach that focuses on the basic concepts and on the silly mistakes that are easy to make, which should make life simpler :)
Quick steps for Conjoint Analysis:
- Selection of attributes and levels
- Formation of Bundles
- Data Collection
- Data Analysis
- Data Interpretation
Let's consider one example:
I came up with a new product idea, “Clinky”, for one of my interviews. (My husband named it :P)
USP: A professional assistant that makes sure you are always on time
- Customized wake-up alarm: Depending on your reporting time at the office (considering meeting schedule, traffic, etc.), it wakes you up after calculating the time required for all your morning activities. Inputs: your calendar (meetings, scheduled trips, gym schedule, holidays), traffic congestion, weather (rain, fog), location, and an audio option for reminders
- Switch on the geyser and coffee machine: The appliances are scheduled to switch on based on the ‘wake up time’ and the ‘appliance time’
- Book your cab: A shower sensor, washroom door sensor or wardrobe door sensor can be used to estimate the time taken to get ready, and a cab is booked accordingly within a specified time interval (maybe 10–15 min)
- Last-minute checklist: When the cab notification arrives on the smartphone, a screen pops up with a list of key things to check before you leave. If you use your own car, the list appears when the keys are picked up
- The smartphone is linked with the Clinky app, which also enables Bluetooth connectivity to sync your calendar and location.
Attributes and Levels
Here, Size, Price, WiFi, Color and No. of Sensors are the attributes, while the options available for each attribute are its levels.
By combining levels, one from each attribute, we get different product offerings, which are called bundles.
For example, Size: 9.3" x 3.3" x 3.3", Price: 8000, WiFi: Dual Band, Color: White, No. of Sensors: 5 is one possible bundle.
We can either test all the possible combinations and see how much the rating varies. For instance, a respondent may rate a product offering 5 out of 5 at the lowest price but only 2 out of 5 for the same offering at the highest price. Or we can restrict ourselves to feasible product offerings only. For instance, the highest price comes with all the best features, while the lowest price comes with all the standard features.
The value a respondent places on each level of an attribute is called its part-worth.
For the above example, the maximum number of bundles will be:
3C1 * 3C1 * 2C1 * 2C1 * 3C1 = 3*3*2*2*3 = 108
Three types of conjoint analysis:
There are three techniques for collecting preferences in conjoint analysis:
- Pairwise: Respondents are asked to pick one bundle from each pair, and this is repeated across all bundle combinations. This needs software designed specifically for conjoint analysis, as calculating part-worths is complex.
- Rating: Respondents rate each card on a given scale; this is the approach I discuss below. (Excel can be used, but only for one respondent's survey results at a time. Since the rating output is directly related to the input levels, part-worths can be calculated with a regression.)
- Ranking: Respondents simply rank all the cards in order. Excel cannot be used, as the input is not directly related to the output (forced ranking).
As mentioned above, we will go with rating.
Conjoint Analysis Methodology using ‘Rating’ Approach:
Since ratings depend on the person (some are very lenient, others very strict), it is a good idea to give respondents guidance on how to rate. We are considering five rating options. Within each, we allow two decimal places to make the scale finer, which gives the analysis more variation to work with.
For the sake of simplicity, I am considering only four attributes with two levels each.
So total bundles will be:
2C1 * 2C1 * 2C1 * 2C1 = 2⁴ = 16
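The 16 bundles can also be listed explicitly. Here is a minimal sketch in base R, assuming the four attributes are Size, Color, WiFi and Price with the level names that appear in the utility equation later in this post; the variable names are my own.

```r
# Two levels per attribute; names taken from the utility equation later in the post
attribute_levels <- list(
  Size  = c("Small", "Large"),
  Color = c("Blue", "White"),
  WiFi  = c("Dual", "Standard"),
  Price = c("Low", "High")
)

# expand.grid builds every combination of one level per attribute (full factorial)
bundles <- expand.grid(attribute_levels, stringsAsFactors = FALSE)

print(nrow(bundles))     # number of bundles / cards
print(head(bundles, 3))  # first few cards
```

Each row of bundles is one card to show to respondents.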
Let's see how the data will look for one respondent:
For all the respondents:
Conjoint Analysis by using R
1. Download R
2. Launch R
3. Execute the command: install.packages("conjoint"). While installing the conjoint package, R will ask you to choose a mirror (these are the various locations the package can be downloaded from). Select USA CA1
Then run the command library(conjoint)
You might get a message that the library is not writable, asking whether a new library folder should be created. Type y and press Enter.
If you still get an error and it mentions rgl, try using the command:
If the error message says X11 was not found, download XQuartz (X11) from the site below.
4. Loading Data in R
There are three kinds of files, all saved in csv format:
1. Preferences: Stores the respondents' rating data for all bundles / cards
Command to load in R:
2. Profiles: Stores the various bundles which have been used for the conjoint analysis
Command to load in R:
3. Levelnames: Stores the names of levels of each attribute
Command to load in R:
After loading the files, you can use the print command to validate them:
print(preferences), print(profiles), print(levelnames)
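As a rough sketch of the loading step, the three files can be read with base R's read.csv. The file names below are hypothetical, and the placeholder files are written to a temporary folder only so that the example runs end to end. The layout is also an assumption on my part (one row per respondent and one column per card in preferences, one row per card in profiles with levels coded 1, 2, and one level name per row in levelnames), so check it against the conjoint package documentation.

```r
# Hypothetical file names; replace with the paths to your own csv files
data_dir   <- tempdir()
pref_file  <- file.path(data_dir, "preferences.csv")
prof_file  <- file.path(data_dir, "profiles.csv")
level_file <- file.path(data_dir, "levelnames.csv")

# Placeholder files so the sketch is runnable (NOT real survey data)
profiles_demo <- expand.grid(Size = 1:2, Color = 1:2, WiFi = 1:2, Price = 1:2)
prefs_demo <- as.data.frame(matrix(seq(1, 5, length.out = 16), nrow = 1))
names(prefs_demo) <- paste0("profile", 1:16)
levels_demo <- data.frame(levels = c("Small", "Large", "Blue", "White",
                                     "Dual", "Standard", "Low", "High"))
write.csv(prefs_demo, pref_file, row.names = FALSE)
write.csv(profiles_demo, prof_file, row.names = FALSE)
write.csv(levels_demo, level_file, row.names = FALSE)

# Load the three files
preferences <- read.csv(pref_file, header = TRUE)
profiles    <- read.csv(prof_file, header = TRUE)
levelnames  <- read.csv(level_file, header = TRUE)

# Basic consistency checks before running the analysis
stopifnot(ncol(preferences) == nrow(profiles))  # one rating column per card
levels_per_attribute <- sapply(profiles, function(col) length(unique(col)))
stopifnot(nrow(levelnames) == sum(levels_per_attribute))  # one name per level
```

The two checks at the end catch the most common mismatch: a card or level that exists in one file but not in the others.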
Some common errors in loading files:
- A respondent may have given the same rating to all the cards. This suggests the respondent did not take the survey carefully, so such responses should be deleted before calculating importance and part utilities. If the preferences file contains such records, the caImportance command returns NaN.
- In case of line errors while reading the file (typically a missing line break at the end of the file), open the csv file in Notepad, press Enter at the end of the last line and save. The issue will get resolved.
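The first check can be automated. Below is a minimal base R sketch that drops respondents who gave every card the same rating; the ratings are made-up illustration values, and it assumes one row per respondent and one column per card.

```r
# Made-up ratings: 3 respondents x 4 cards (illustration only)
preferences_demo <- data.frame(
  card1 = c(4.5, 3.0, 2.0),
  card2 = c(3.0, 3.0, 4.0),
  card3 = c(1.5, 3.0, 5.0),
  card4 = c(5.0, 3.0, 1.0)
)

# TRUE when a respondent used more than one distinct rating
has_variation <- apply(preferences_demo, 1, function(r) length(unique(r)) > 1)

# Keep only respondents whose ratings vary across cards
preferences_clean <- preferences_demo[has_variation, , drop = FALSE]
print(preferences_clean)
```

Here the second respondent rated every card 3.0 and is removed before the analysis.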
Calculation of Importance and Part Utilities
To compute the importance of each attribute and the part utilities of each level, we will use two commands:
importance <- caImportance(preferences, profiles)
partutil <- caPartUtilities(preferences, profiles, levelnames)
Use the print command to see the results.
The importance score for each attribute turns out to be:
The above results indicate that Price is the most important attribute to be considered while making the purchase decision for the product, while Wi-Fi attribute is the least important.
Calculating Willingness to Pay from Importance Score:
As per the importance score, when the price is reduced by Rs. 2,000 (the gap between the two price levels), the consumer gains 32.9% of ‘uti’ (utility). So the amount per uti = 2000 / 32.9 = Rs. 60.79.
Willingness to pay for Wi-Fi = 60.79 * 19.88 = Rs. 1208.5. The same calculation applies to the other attributes.
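The same arithmetic in R, as a small sketch using only the numbers quoted above (the variable names are my own):

```r
price_gap        <- 2000   # Rs., price difference used in the text
importance_price <- 32.9   # importance score of Price (%)
importance_wifi  <- 19.88  # importance score of Wi-Fi (%)

# Rupees that correspond to one unit of utility
rupees_per_uti <- price_gap / importance_price

# Willingness to pay for Wi-Fi = rupees per uti * importance of Wi-Fi
wtp_wifi <- rupees_per_uti * importance_wifi

print(round(rupees_per_uti, 2))
print(round(wtp_wifi, 1))
```

Swapping importance_wifi for another attribute's importance score gives that attribute's willingness to pay.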
We can get the final equation from the command:
caUtilities(preferences, profiles, levelnames)
y = Intercept + M11*X11 + M12*X12 + M21*X21 + M22*X22 + M31*X31 + M32*X32 + M41*X41 + M42*X42
preference = 2.98489865 - 0.02831081*Small + 0.02831081*Large - 0.18608108*Blue + 0.18608108*White - 0.06733108*Dual + 0.06733108*Standard + 0.24280405*Low - 0.24280405*High
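To see how the equation is used, here is a small sketch that plugs in the coefficients above and predicts the rating of a bundle. It assumes each X is 1 when the bundle has that level and 0 otherwise; predict_rating is a hypothetical helper of mine, not a function of the conjoint package.

```r
intercept <- 2.98489865

# Coefficients copied from the equation above
part_worth <- c(
  Small = -0.02831081, Large    =  0.02831081,
  Blue  = -0.18608108, White    =  0.18608108,
  Dual  = -0.06733108, Standard =  0.06733108,
  Low   =  0.24280405, High     = -0.24280405
)

# Hypothetical helper: intercept plus the part-worths of the chosen levels
predict_rating <- function(bundle) {
  intercept + sum(part_worth[bundle])
}

best  <- predict_rating(c("Large", "White", "Standard", "Low"))
worst <- predict_rating(c("Small", "Blue", "Dual", "High"))
print(c(best = best, worst = worst))
```

Picking the level with the positive coefficient for every attribute gives the highest predicted rating, and picking the negative ones gives the lowest.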
Cross Validation from Excel
1. Validation of PartUtility results:
Let's see how Excel does this.
We will use regression for our analysis, considering the first respondent's survey results. (As described above, we cannot use Excel for more than one respondent at a time: with several respondents we would be trying to fit a relationship to identical inputs with different outputs, whereas for a single respondent all inputs are different.)
Consider the rating as y, and size, color, wifi and price as the x parameters.
The “Part Utility from R” column has been taken from the first row of the caPartUtilities function results. When we double the part utility scores, we get the same results as in Excel.
You can repeat the same exercise for other respondents.
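One plausible explanation for the factor of two is the coding of the levels: I am assuming here that the Excel sheet codes each level as 0/1, while the R results correspond to -1/+1 (effects) coding. The sketch below runs both regressions in base R on made-up ratings for a single hypothetical respondent and shows that the 0/1 coefficients are exactly twice the -1/+1 ones.

```r
# Full factorial design: 4 attributes x 2 levels = 16 cards, coded 0/1
design <- expand.grid(size = c(0, 1), color = c(0, 1),
                      wifi = c(0, 1), price = c(0, 1))

# Made-up ratings for ONE hypothetical respondent (illustration only)
rating <- c(3.25, 3.50, 3.75, 4.10, 2.90, 3.10, 3.40, 3.80,
            2.10, 2.40, 2.75, 3.00, 1.80, 2.00, 2.30, 2.60)

# Regression with 0/1 (dummy) coding, as one might set it up in Excel
dummy_fit <- lm(rating ~ size + color + wifi + price, data = design)

# Same regression with -1/+1 (effects) coding
effects_design <- 2 * design - 1
effects_fit <- lm(rating ~ size + color + wifi + price, data = effects_design)

# Compare the slopes side by side
slopes <- cbind(dummy = coef(dummy_fit)[-1], effects = coef(effects_fit)[-1])
print(slopes)
```

The fitted ratings are identical in both cases; only the scale of the coefficients changes.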
2. Cross validation of Importance Score:
Importance of an attribute measures the relevance of that attribute in the buying decision for that product.
Importance is the maximum swing between the levels of each attribute. Here, with two levels per attribute, that is double the absolute value of the part utility returned by caPartUtilities.
One person might have a part utility of -4 for white color, while another has +4. For both, Color is a very important attribute, but the first person does not want white while the second wants white very much.
Rather than running a regression for each respondent separately in Excel, I will directly use the part utility scores from the R results and then do the calculation in Excel.
The small differences in percentages are due to the fact that R only reports results to three decimal places. With a large number of records, the values might differ more.
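The importance calculation can be sketched in base R as well. The part utilities below are made up (they reuse the -4 / +4 white color example), one value per attribute for two hypothetical respondents, and the sketch assumes that importance is each attribute's swing as a percentage of the respondent's total swing, averaged over respondents.

```r
# Made-up part utilities of one level per attribute (illustration only)
part_utils <- rbind(
  respondent_A = c(Size = 0.5, Color = -4, WiFi = 1, Price = 2.5),
  respondent_B = c(Size = 0.5, Color =  4, WiFi = 1, Price = 2.5)
)

# With two levels per attribute, swing = 2 * |part utility|
swing <- 2 * abs(part_utils)

# Each attribute's share of a respondent's total swing, in percent
importance_by_respondent <- 100 * swing / rowSums(swing)

# Average over respondents
importance <- colMeans(importance_by_respondent)
print(round(importance, 2))
```

Both respondents end up with the same importance for Color even though one dislikes white and the other likes it, which is why the absolute value is used.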
While collecting preference data from the target audience, we can also collect demographic data. This lets us analyze preferences by cohort, which helps with market segmentation and market share estimation.