THE AMBASSADOR CAMPAIGN — BOUTIQ TRAVEL

For my capstone in the Data Science Immersive course at General Assembly, I had the chance to work with Boutiq Travel, a travel social network where users share and recommend hotels, restaurants, cafes and beautiful places from around the world.
A few months after launch, Boutiq wants to roll out a new campaign, the Ambassador Program, with the goal of increasing user interaction. The main purpose of the program is to find the most active and influential group of users and grant them the Ambassador title. These representative users would potentially be motivated to post more quality recommendations and help the social network grow faster.
Approach
The approach combines two techniques: cluster analysis to separate out the most active group of users, and Social Network Analysis to detect the most central users of the network. I then take an inner join of the two groups: a user who sits in both is more likely to be a strong candidate for the Ambassador title.
Overview
To keep the results general, I excluded users who are relatives, friends or financially incentivized members from the data set, because we only want to evaluate real users.

Most of the users come from London, which accounts for over 1,000 users. Next come Sydney, Berlin, Stockholm, Paris, New York and Melbourne.

The heatmap also shows that the most active region is Europe (especially London), including parts of Italy and Spain. Unsurprisingly, Australia (Sydney) and the United States (New York and San Francisco) also made it to the top.
K-means clustering
I used K-means clustering to segment the most active group of users based on features such as a user's number of followers, number of likes and post quality (average likes per post). After several experiments I settled on five clusters, and decided to combine the ‘Very Highly Active Users’ (green) cluster and the ‘Highly Active Users’ (red) cluster into a single group called ‘Highly Active Users’ for later analysis.
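Since the project code is under NDA, here is only a minimal, dependency-free sketch of the idea, not the code used in the project. It standardizes the three features and runs plain Lloyd's K-means. The user ids, the feature values, the farthest-point seeding and the choice of k=2 for the toy data are all assumptions made for illustration (the analysis above used five clusters on the real data).

```python
from math import dist
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas)) for row in rows]

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(mean(axis) for axis in zip(*members))
    return labels, centroids

# Hypothetical users: (followers, likes, average likes per post).
users = {
    "u1": (12, 30, 1.5),
    "u2": (8, 22, 1.1),
    "u3": (15, 41, 2.0),
    "u4": (10, 18, 0.9),
    "u5": (420, 1900, 14.2),
    "u6": (510, 2300, 16.8),
}
names = list(users)
labels, centroids = kmeans(standardize(list(users.values())), k=2)

# Treat the cluster whose centroid has the largest feature sum as "highly active".
top = max(range(len(centroids)), key=lambda j: sum(centroids[j]))
highly_active = sorted(n for n, lab in zip(names, labels) if lab == top)
print(highly_active)
```

Standardizing first matters here because raw like counts are orders of magnitude larger than average likes per post and would otherwise dominate the Euclidean distance.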

The red and green clusters really stand out from the crowd in terms of the number of reviews they wrote and the number of followers, likes and wish-list entries they received. We are quite confident that these clusters, with 20 users, mark the boundary between highly active and less active users.

I went further and investigated the online patterns of these highly active users, and it turned out that only 84% of them were active in the last 100 days. This information is valuable for filtering out highly active users who have stopped using the app for a long time.

Another point is that they have different online patterns on different days of the week. We can use this to send out ambassador requests and promotions at an appropriate time.
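As a rough illustration of how such a weekday pattern could be extracted, the sketch below counts sessions per (weekday, hour) and picks the busiest hour for each day. The event log, its format and the function name are hypothetical, and the timestamps are assumed to already be in each user's local time.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical activity log: (user_id, ISO timestamp of a session).
events = [
    ("u5", "2018-03-05T08:15:00"),
    ("u6", "2018-03-05T08:40:00"),
    ("u5", "2018-03-05T21:05:00"),
    ("u5", "2018-03-10T11:30:00"),
    ("u6", "2018-03-10T11:55:00"),
    ("u6", "2018-03-10T19:20:00"),
]

def busiest_hour_by_weekday(events):
    """Return {weekday name: hour of day with the most sessions on that weekday}."""
    counts = Counter()
    for _, ts in events:
        t = datetime.fromisoformat(ts)
        counts[(DAY_NAMES[t.weekday()], t.hour)] += 1
    best = {}
    for (day, hour), n in counts.items():
        # Strict comparison: on a tie, the hour seen first is kept.
        if day not in best or n > counts[(day, best[day])]:
            best[day] = hour
    return best

print(busiest_hour_by_weekday(events))
```

A table like this could feed the scheduling of ambassador requests, for example by sending them shortly before the busiest hour of the current weekday.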
Social Network Analysis
In the second step, I used Social Network Analysis (SNA) to calculate the ‘betweenness centrality’ of each user, i.e. the number of shortest paths between other pairs of users in the network that pass through that user. The higher this index, the more central a user's position in the network.
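To make the measure concrete, here is a small sketch of betweenness centrality using Brandes' algorithm on a toy graph. It is an illustration only: the five-node graph is made up, and treating the follow relationships as unweighted and undirected is a simplifying assumption, not necessarily how the real network was modeled.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph given as
    {node: set_of_neighbours}. Returns unnormalised scores."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that also counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        n_paths = dict.fromkeys(graph, 0)
        n_paths[source] = 1
        depth = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in depth:  # first time we reach w
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical follow graph: a - b - c, with d and e both attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

scores = betweenness_centrality(graph)
most_central = max(scores, key=scores.get)
print(scores, most_central)
```

In this toy graph, user c lies on the shortest path between five pairs of other users and b on three, while a, d and e bridge nobody, so c would be flagged as the most central user.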

In general, the social network is not fully connected: many peripheral (‘edgy’) groups of users remain far away from the ‘central hub’. However, we can still detect the 50 most central users (big red points), who have the most connections in the network.
Putting things together
Finally, I took an inner join of the results of the two methods, filtered out users who were not active in the last 100 days, and came up with a list of the best of the best candidates for the Ambassador Program. They come from different parts of the world, but all share similar attributes: they write many reviews, have many followers, and… take nice photos.
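The combination step can be sketched in a few lines. This is a hedged illustration rather than the project code: the user ids, dates and the function name `ambassador_candidates` are hypothetical, and the inner join is expressed as a set intersection on user id.

```python
from datetime import date, timedelta

def ambassador_candidates(highly_active, most_central, last_seen, today, window_days=100):
    """Users in both groups who were seen within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    both = set(highly_active) & set(most_central)  # the "inner join" on user id
    # Users with no recorded activity are treated as inactive.
    return sorted(u for u in both if last_seen.get(u, date.min) >= cutoff)

# Hypothetical outputs of the clustering and SNA steps.
highly_active = {"u5", "u6", "u9"}
most_central = {"u2", "u5", "u6", "u9"}
last_seen = {
    "u2": date(2018, 3, 1),
    "u5": date(2018, 3, 10),
    "u6": date(2017, 9, 1),   # in both groups, but inactive for too long
    "u9": date(2018, 2, 1),
}

candidates = ambassador_candidates(
    highly_active, most_central, last_seen, today=date(2018, 3, 15)
)
print(candidates)
```

Here u2 is dropped because it is central but not highly active, and u6 is dropped by the 100-day recency filter.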

Further Recommendations
- Combine results from K-means clustering, SNA and the recency filter to select the best users for the Ambassador Campaign.
- Send out promotions, pop-ups, Ambassador requests… during the most active hours for the current day of the week.
- Check the SNA model regularly to monitor the interaction and connections within the community.
Next Steps
- Add relationship weights to the Social Network Analysis (people with followers are more influential than people who only follow others).
- Explore other clustering algorithms (DBSCAN, EM using GMM).
- Build a recommender system to suggest ambassadors for other users to follow.
*Because of an NDA, I am not allowed to share the source code.
Connect with me on LinkedIn.
