Cataloguing Indonesian Perfume — Part 2: Recommender System

Syifa Addini
2 min read · May 8, 2024

Since I already have the perfume data anyway, I thought I could put it to further use by building a simple recommender for myself (or perhaps for anyone who reads this).

I adapted this tutorial, https://medium.com/dare-to-be-better/movie-recommender-system-netflix-youtube-e4c8cdb11f77, to build a simple web page with Streamlit, then hosted it for free on the Streamlit sharing platform. This is the URL I got:

https://perfumerecommenderindo.streamlit.app/

(The link is broken now 😭, and I can’t even log in.)

I also plan to add more perfume data, plus a hyperlink to the official shop for each recommendation.

So, basically, the code below serves the recommender web page once Streamlit has installed all the required packages.

import streamlit as st
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load the perfume data
perfumes = pd.read_csv('Perfume Data.csv', encoding='latin1')

# Preprocess the data
perfumes[['Top Notes', 'Heart Notes', 'Based Notes', 'Mood']] = perfumes[['Top Notes', 'Heart Notes', 'Based Notes', 'Mood']].fillna('')
perfumes['tags'] = perfumes[['Top Notes', 'Heart Notes', 'Based Notes', 'Mood']].agg(', '.join, axis=1)
perfumes['tags'] = perfumes['tags'].apply(lambda x: ', '.join(filter(None, x.split(', '))))

# Vectorize the tags
cv = CountVectorizer(max_features=5000, stop_words='english')
vector = cv.fit_transform(perfumes['tags']).toarray()
similarity = cosine_similarity(vector)

# Filtered dataframe based on selected brand
def filter_by_brand(brand_name):
    return perfumes[perfumes['Brand'] == brand_name]['Product Name'].tolist()

# Recommendation function
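def recommend_3(brand_name, perfume):
    # Row index of the selected perfume (assumes the default 0..n-1 index from read_csv)
    index = perfumes[(perfumes['Brand'] == brand_name) & (perfumes['Product Name'] == perfume)].index[0]
    # Pair every perfume's position with its similarity score, highest score first
    distances = sorted(list(enumerate(similarity[index])), reverse=True, key=lambda x: x[1])
    recommended_perfumes = []
    # Skip position 0 (expected to be the selected perfume itself) and keep the next five,
    # despite the function name
    for i in distances[1:6]:
        recommended_perfume = perfumes.iloc[i[0]]
        recommended_perfumes.append((recommended_perfume['Brand'], recommended_perfume['Product Name']))
    return recommended_perfumes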

# Streamlit app
st.title('Perfume Recommender')

# Dropdown list for selecting brand
brand_name = st.selectbox('Select a Brand', perfumes['Brand'].unique())

# Dropdown list for selecting product name based on selected brand
product_names = filter_by_brand(brand_name)
product_name = st.selectbox('Select a Product Name', product_names)

# Button to trigger recommendation
if st.button('Show Similar Products'):
    # Get recommendations
    recommendations = recommend_3(brand_name, product_name)

    # Display recommendations
    st.subheader('Similar Perfumes:')
    for i, recommendation in enumerate(recommendations, start=1):
        st.write(f'{i}. {recommendation[0]} / {recommendation[1]}')
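
One detail in the ranking step is worth spelling out. The slice `distances[1:6]` assumes the selected perfume always sorts into position 0. That may not hold when another perfume has exactly the same tags (similarity 1.0) and sits earlier in the sheet. Below is a minimal sketch of a variant that excludes the selected row explicitly instead. The helper name `top_k_similar` and the similarity row are hypothetical, made up for illustration; this is not the code running in the app.

```python
def top_k_similar(similarity_row, selected_index, k=5):
    """Return (index, score) pairs for the k most similar items,
    leaving out the selected item itself."""
    scored = [
        (i, score)
        for i, score in enumerate(similarity_row)
        if i != selected_index  # drop the selected perfume explicitly
    ]
    # Highest score first; Python's sort is stable, so ties keep index order
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical similarity row for the perfume at index 2.
# Perfume 0 has identical tags, so it also scores 1.0.
row = [1.0, 0.2, 1.0, 0.75, 0.0]
print(top_k_similar(row, selected_index=2, k=3))
```

With this row, sorting everything and slicing from position 1 would drop perfume 0 and return the selected perfume itself as the first result, while the explicit filter keeps perfume 0 at the top of the list.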

On the webpage, when I pick my favorite local Indonesian perfume like below,

Picking my favorite from the list

then click the “Show Similar Products” button, the app returns similar perfumes, ranked by cosine similarity computed from the ingredients and mood data.

Recommender system result
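
To give a feel for what the cosine similarity measures here, this is a rough sketch that does the same kind of calculation by hand, using only the standard library. It roughly mirrors the default tokenization of `CountVectorizer` (lowercased words of two or more characters) but ignores the `stop_words` and `max_features` settings. The helper names `count_vector` and `cosine` and the tag strings are made up for illustration and are not taken from the actual sheet.

```python
import math
import re
from collections import Counter


def count_vector(tags):
    """Count lowercase word tokens of two or more characters in a tag string."""
    return Counter(re.findall(r"\b\w\w+\b", tags.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical tag strings (notes plus mood), not real rows from the data
p1 = count_vector("Bergamot, Jasmine, Vanilla, Fresh")
p2 = count_vector("Bergamot, Rose, Vanilla, Fresh")
p3 = count_vector("Oud, Leather, Amber, Bold")

print(cosine(p1, p2))  # three of four tokens shared -> 0.75
print(cosine(p1, p3))  # no tokens shared -> 0.0
```

So two perfumes score close to 1 when they share most of their notes and mood words, and 0 when they share none.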

If anyone wants to take a look at the data, here is the sheet with the perfume data:

https://github.com/spdini/perfume_recommender/blob/main/Perfume%20Data.csv

Syifa Addini

Currently a caregiver/caretaker. Marketer with CRM specialty, learning mainly text analytics. Open for commission/remote/freelance. Contact me through LinkedIn~