Chemistry patent search engine

Harsh
Published in iReadRx
2 min read · Jun 15, 2021

http://ichemist.ai is a search engine we developed to help researchers find chemistry information that appears in patents.

(If the link above is broken, check out our demo videos at https://www.youtube.com/playlist?list=PLZ0CqxpJ8nQsPeIg5QowE1s5esIONoFQt)

We take data from chemistry patents and index the text of the claims in our database. We then process the text of each claim to extract named entities that are specific to this domain. These are keywords with a specific meaning in organic and medicinal chemistry, such as the names of diseases, proteins, genes, and of course chemical compounds.
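To make the idea of entity extraction concrete, here is a minimal sketch in Python. It is not how iChemist works internally: our back end uses trained NER models, while this toy version swaps in a simple dictionary lookup. The lexicon, the labels, and the `tag_entities` function are all made up for illustration.

```python
import re

# Hypothetical toy lexicon. A real system would use a trained NER model,
# not a hand-written word list like this one.
LEXICON = {
    "DISEASE": ["breast cancer", "diabetes"],
    "PROTEIN": ["EGFR", "kinase"],
    "CHEMICAL": ["aspirin", "ibuprofen"],
}

def tag_entities(claim_text):
    """Return (start, end, label, text) tuples for lexicon matches, sorted by position."""
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            # Match whole words only, ignoring case.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, claim_text, flags=re.IGNORECASE):
                found.append((match.start(), match.end(), label, match.group(0)))
    return sorted(found)

# Made-up claim text, for illustration only.
claim = ("A method of treating breast cancer comprising administering "
         "aspirin and an EGFR inhibitor.")
entities = tag_entities(claim)
for start, end, label, text in entities:
    print(label, text)
```

The character offsets returned with each entity are what would let an interface highlight the matching text inside the claim.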

We developed iChemist to enable patent analytics by offering context at various stages of the user journey.

  • Our search provides typeahead so that you can quickly arrive at an appropriate term.
  • After the search is submitted, we showcase similar keywords. This lets analysts increase coverage of the topic they are researching. We’ve seen users capture these keywords and paste them into http://patents.google.com to improve their keyword patterns.
  • As you browse through the search results, you are offered the chance to find similar patents. We use large text embeddings to find similar patents based on title and abstract texts.
  • When you expand the claims section, you are presented with each claim so that it can be read independently. You are also presented with entity labels, which can be clicked to highlight the corresponding entity values in the text of the claims. This provides a quick summary and helps you decide whether to linger and read more or move on to the next claim or search result.
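The similar-patents step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vectors are made-up 3-dimensional stand-ins for real title and abstract embeddings, the patent IDs are hypothetical, and cosine similarity is assumed as the comparison measure (this post does not say which measure iChemist uses).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(query_id, embeddings, top_k=2):
    """Rank every other patent by similarity to the query patent."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vector), patent_id)
        for patent_id, vector in embeddings.items()
        if patent_id != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [patent_id for _, patent_id in scored[:top_k]]

# Made-up embeddings; real ones would come from a text embedding model
# applied to each patent's title and abstract.
EMBEDDINGS = {
    "patent_A": [0.9, 0.1, 0.0],
    "patent_B": [0.8, 0.2, 0.1],
    "patent_C": [0.0, 0.1, 0.9],
    "patent_D": [0.1, 0.9, 0.1],
}

similar = most_similar("patent_A", EMBEDDINGS)
print(similar)
```

Scanning every patent like this only suits small collections. A large index would more likely rely on an approximate nearest-neighbor search.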

All of our back-end models, both the text embeddings and the NER models, are trained with state-of-the-art (SOTA) algorithms.

Do take http://ichemist.ai for a spin and share your feedback with us.

Harsh, editor for iReadRx

Thank you for reading. I write about technology and ideas that keep me up at night. I also blog at www.harshsinghal.dev