Categorization: Company Title to Company Types

karanjude
MLFeatureEngineering
1 min read · Dec 31, 2017

Problem: You have a company title and you want to get the corresponding company category.

Detailed Statement: Let's say you have a list of company titles and their corresponding categories, which you can get from sources such as Yelp or Google. Given a new company title whose category is unknown, how do you classify it?

Solution: Multi-class classification

One Solution: Use a simple 3-layer DNN.

  • Input layer: word2vec embedding of the company title.
  • Hidden layer with roughly 200 nodes.
  • Softmax output layer with N units, one per category.
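The three layers can be sketched as a plain forward pass. This is only an illustration under assumptions not stated above: the ReLU activation on the hidden layer, the random (untrained) weights, the value of N, and all function names are hypothetical choices, and a real model would be trained with a deep learning library.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine transform: one dot product plus bias per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    # Assumption: the post does not name a hidden activation; ReLU is a common default.
    return [max(0.0, a) for a in v]


def softmax(v):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """Input embedding -> hidden layer -> softmax over categories."""
    h = relu(dense(x, *hidden))
    return softmax(dense(h, *output))


EMBEDDING_DIM = 200   # size of the title embedding
HIDDEN_NODES = 200    # "something like 200 nodes"
NUM_CATEGORIES = 5    # hypothetical N for this sketch

rng = random.Random(0)
hidden = init_layer(EMBEDDING_DIM, HIDDEN_NODES, rng)
output = init_layer(HIDDEN_NODES, NUM_CATEGORIES, rng)

# Stand-in for a real averaged word2vec title embedding.
title_vector = [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
probs = forward(title_vector, hidden, output)
predicted = max(range(NUM_CATEGORIES), key=lambda i: probs[i])
```

Here `probs` holds one probability per category and `predicted` is the index of the most likely one. With untrained weights the prediction is meaningless; the point is only the shape of the network.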

Feature Input: Embedding layer

  • Averaged word2vec embedding of the company title: look up the vector for each word in the title, sum those vectors, and divide the sum by the number of words in the title.
  • The word2vec model should be a custom one trained on the company title corpus. The embedding dimension could be 200.
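The averaging step can be sketched as follows. The function name, the lowercase whitespace tokenization, and the toy 3-dimensional vectors are assumptions made for illustration; in practice `word_vectors` would come from the word2vec model trained on the title corpus. The sketch also skips words missing from the vocabulary and divides by the number of words that were found, which is a choice the post does not specify.

```python
def average_title_embedding(title, word_vectors, dim):
    """Average the vectors of the words in a company title.

    word_vectors: mapping from word to a list of floats of length dim.
    Words without a vector are skipped (an assumption of this sketch).
    """
    words = title.lower().split()
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        # No known words: fall back to a zero vector.
        return [0.0] * dim
    # Sum component-wise, then divide by the number of words found.
    summed = [sum(component) for component in zip(*known)]
    return [s / len(known) for s in summed]


# Toy vectors standing in for a trained word2vec model.
toy_vectors = {
    "joes": [1.0, 0.0, 0.0],
    "pizza": [0.0, 1.0, 0.0],
}

embedding = average_title_embedding("Joes Pizza", toy_vectors, dim=3)
# embedding == [0.5, 0.5, 0.0]
```

The resulting vector has the same dimension as the word vectors regardless of title length, which is what lets titles of different lengths feed a fixed-size input layer.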
