3 subword algorithms help to improve your NLP model performance

Photo by Edward Ma on Unsplash

Classic word representation cannot handle unseen word or rare word well. Character embeddings is one of the solution to overcome out-of-vocabulary (OOV). However, it may too fine-grained any missing some important information. Subword is in between word and character. It is not too fine-grained while able to handle…

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Edward Ma

Edward Ma

Focus in Natural Language Processing, Data Science Platform Architecture. https://makcedward.github.io/