Entity Recognition
Daniel Tunkelang

Hi Daniel – Thanks for this post and this great blog.

Forgive me if this is a naive question. I have tried the default models of libraries like Stanford NLP, NLTK, and spaCy, as well as some NLP-as-a-service offerings, including the mighty IBM Watson on Bluemix. Why do all of these NER models recognise ONLY strings or phrases in Titlecase, and not in lower case or mixed case?

In natural-language search, users seldom pay attention to Titlecase. Google search seems to pick up entities written in any case. Does this mean Google has trained its NER model on all case variants of an entity (say Adidas, ADIDAS, adidas), or should individual tokens be Titlecased as part of query rewriting before running NER?

Search Query: wide fit running shoes for men adidas.

Toy version of a rewritten query: Wide Fit Running Shoes For Men Adidas.
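
To make the toy rewrite concrete, here is a minimal sketch of what I mean, assuming plain whitespace tokenization. The helper name `titlecase_query` is my own and hypothetical, not part of any of the libraries mentioned above, and I am not claiming this is how any real search engine does it.

```python
def titlecase_query(query: str) -> str:
    """Capitalize the first letter of every whitespace-separated token.

    Hypothetical helper for illustration only: it blindly Titlecases
    every token, including words like "for" that are not entities.
    """
    # str.capitalize() upper-cases the first character and lower-cases
    # the rest, so "ADIDAS" and "adidas" both become "Adidas".
    return " ".join(token.capitalize() for token in query.split())


if __name__ == "__main__":
    print(titlecase_query("wide fit running shoes for men adidas."))
```

The rewritten string would then be passed to the NER model in place of the raw query. The obvious weakness is that every token ends up looking like a proper noun, which is part of why I suspect I am missing something.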

I know I am missing something.

Please advise.

