Ben Yi
Jul 16, 2023

--

There are two issues with this example:

a) The cosine similarities are dominated by Project Cost. At the very least, this column needs to be normalised before the similarity is computed.
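To illustrate point a), here is a minimal sketch with made-up numbers. The feature layout (`[project_cost, is_category_A, is_category_B]`), the cost values, and the choice of min-max scaling are all assumptions for illustration, not taken from the original example; any other reasonable normalisation would make the same point.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical rows: [project_cost, is_category_A, is_category_B]
projects = [
    [100_000.0, 1.0, 0.0],
    [500_000.0, 1.0, 0.0],
    [900_000.0, 0.0, 1.0],
]

# Raw features: the cost column swamps the 0/1 category columns, so two
# projects with a 9x cost gap and different categories still look identical.
raw_sim = cosine(projects[0], projects[2])

# Min-max scale the cost column to [0, 1] (one possible normalisation).
costs = [p[0] for p in projects]
lo, hi = min(costs), max(costs)
scaled = [[(p[0] - lo) / (hi - lo)] + p[1:] for p in projects]

scaled_sim = cosine(scaled[0], scaled[2])

print(f"raw:    {raw_sim:.6f}")
print(f"scaled: {scaled_sim:.6f}")
```

With the raw values, the similarity between the first and third project is almost exactly 1; once the cost column is scaled, the category columns carry comparable weight and the score drops sharply.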

b) For Project Category on its own, this method gives credit only when the two items being compared have exactly the same category; otherwise the Project Category part of the cosine similarity score is 0. If there is sufficient data, consider embedding the categories instead.
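Point b) can be sketched the same way. The category names and the two-dimensional embedding vectors below are hypothetical, hand-picked values used only to show the effect; in practice the embeddings would be learned from data.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# One-hot encoding: each (hypothetical) category gets its own axis.
one_hot = {
    "Road":     [1.0, 0.0, 0.0],
    "Bridge":   [0.0, 1.0, 0.0],
    "Software": [0.0, 0.0, 1.0],
}

# Different categories are orthogonal, so they contribute nothing,
# no matter how related they are in reality.
one_hot_sim = cosine(one_hot["Road"], one_hot["Bridge"])

# Hypothetical dense embeddings (made-up values, NOT learned):
# related categories are placed close together.
embedding = {
    "Road":     [0.9, 0.1],
    "Bridge":   [0.8, 0.3],
    "Software": [0.1, 0.95],
}

road_bridge = cosine(embedding["Road"], embedding["Bridge"])
road_software = cosine(embedding["Road"], embedding["Software"])

print(f"one-hot  Road vs Bridge:   {one_hot_sim:.3f}")
print(f"embedded Road vs Bridge:   {road_bridge:.3f}")
print(f"embedded Road vs Software: {road_software:.3f}")
```

With one-hot vectors, any two distinct categories score 0, whereas embeddings allow partial credit for categories that are similar but not identical.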
