Sharing ‘Entity Embeddings’ with New York’s AI Community

Photograph — Courtesy Fintan Quill of Shakti Software

On 6th June 2019, I had the good fortune to address around 50 Artificial Intelligence (AI) professionals of the New York City (NYC) Deep Learning (DL) Meetup Group on the topic of ‘Entity Embeddings’. I was kindly invited to speak by Kris Skrinak @skrinak, AI Architect at Amazon AWS, who hosts the NYC DL meetup along with Pallavi Gadgil.

The venue was Amazon’s midtown NYC office, 10 levels above its bookstore. With pizza & wine served at the start, the hour-long talk drew keen interest and lively interaction from an audience of AI and software professionals, data scientists, industry executives, enthusiasts, students and others.

A view of the audience before the talk began

‘Entity Embeddings’ is an emerging technique in applied deep learning. It involves representing each categorical field of an information systems entity (for example, a store or a product) as a learned vector with multiple dimensions, which helps generate better quality predictions. It is being used extensively in several large AI production systems at companies such as Google, Instacart, OpenAI, Twitter & many others.
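To make the idea concrete, here is a minimal sketch of an embedding lookup in plain Python. It is an illustration only, not code from the talk: the column of drink names, the helper names `build_embedding_table` and `embed`, and the choice of 4 dimensions are all assumptions. In a real model the vectors would start out random, as here, and then be trained along with the rest of the network.

```python
import random


def build_embedding_table(categories, dim, seed=0):
    """Give each distinct category a row index and a small random vector."""
    rng = random.Random(seed)
    # Hypothetical helper: sort so the category-to-row mapping is reproducible.
    index = {cat: i for i, cat in enumerate(sorted(set(categories)))}
    table = [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in index]
    return index, table


def embed(value, index, table):
    """Look up the vector (one row of the table) for a categorical value."""
    return table[index[value]]


# Hypothetical categorical column; real data would come from a business table.
drinks = ["cola", "lemonade", "cola", "ginger_ale"]
index, table = build_embedding_table(drinks, dim=4)
vectors = [embed(d, index, table) for d in drinks]

print(len(table), "distinct categories,", len(vectors[0]), "dimensions each")
```

Both occurrences of "cola" map to the same row, so whatever the model learns about that category is reused wherever it appears.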

Why is it important? Business leaders can no longer ignore AI, which Forbes estimates will be a US$ 150 trillion industry by 2025. Within AI, Entity Embeddings is a powerful technique that works across different business domains and verticals. I call it the ‘Mathematization of Organizational Intelligence’. For AI technology leaders, it is important to know that Entity Embeddings is independent of any specific Machine Learning (ML) method and does not need domain-specific feature engineering knowledge or sector expertise for designing AI models.

Scope of my talk: In my presentation, I introduced the ‘Entity Embeddings’ concept and examined its usage in the following 3 AI papers: two Kaggle competition winner papers, Artificial Neural Networks Applied to Taxi Destination Prediction (Yoshua Bengio’s team) and Entity Embeddings of Categorical Variables, and a Google Research paper, Deep Neural Networks for YouTube Recommendations.

One of the slides explaining the concept using examples of popular soft drinks

The YouTube recording is here. I covered the following subjects at the respective timestamps:

0:00 — Why Talk of Entity Embeddings ?

2:45 — What Are Entity Embeddings ?

8:32 — Importance of Entity Embeddings

9:34 — Two Perspectives — Word Embeddings & Real World Tabular Data

10:15 — Word Embeddings (including references to contemporary research)

27:00 — Real World Tabular Data

34:55 — Machine Learning Library Support

39:39 — Artificial Neural Networks Applied to Taxi Destination Prediction

42:33 — Entity Embeddings of Categorical Variables

45:24 — Deep Neural Networks for YouTube Recommendations

52:30 — Industry Usage — Twitter, OpenAI, Healthcare Domain, etc.

55:13 — Articles, Summary, Call To Action

Interactions & Discussions: My talk was interspersed with many questions, which I tried to answer to the best of my ability with simple everyday examples. Some of the areas where people had questions were as follows:

  • Clarity on Embeddings, initialization, update & sharing mechanisms
  • t-SNE tool to visualize Embeddings outputs and understand their impact
  • Understanding Embeddings usage in various organizational data types
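As a rough illustration of the first point (initialization, update and sharing), the sketch below uses plain Python. It is not taken from the talk: the function names, the learning rate and the gradient values are made up for the example, and in a real network the gradients would come from backpropagation rather than being typed in by hand.

```python
import random


def init_table(n_categories, dim, seed=0):
    """Initialization: every category starts as a small random vector."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(n_categories)]


def sgd_update(table, row, grad, lr=0.1):
    """Update: one gradient step that changes only the looked-up row."""
    table[row] = [w - lr * g for w, g in zip(table[row], grad)]


table = init_table(n_categories=3, dim=2)
before = [row[:] for row in table]  # copy so the change can be inspected

# Sharing: two hypothetical input columns (say, origin and destination)
# index the SAME table, so both contribute to the same set of vectors.
origin_id, destination_id = 0, 2
sgd_update(table, origin_id, grad=[1.0, -1.0])      # made-up gradient
sgd_update(table, destination_id, grad=[0.5, 0.5])  # made-up gradient

changed = [i for i in range(len(table)) if table[i] != before[i]]
print("rows changed by this step:", changed)
```

Only the rows that were actually looked up move during a step; categories that did not appear in the batch keep their current vectors.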

Since Entity Embeddings is an extremely important area of applied research, I hope to clarify further on some of the questions raised in a future article.

Earlier References: I had earlier presented on the same topic at a ‘This Week in Machine Learning & AI’ (TWiML & AI, of Sam Charrington) study group session & also wrote about its usage in Collaborative Filtering algorithms for Movie Recommendations.

FastAI Shoutout: A huge thanks to Jeremy Howard & Rachel Thomas of fastai for introducing me to the concept of Entity Embeddings and for the excellent, groundbreaking work they have been doing in AI research and cutting-edge online AI education for the masses. References to FastAI were made at the following timestamps:

14:07 — Size of Embedding

26:23 & 35:35 — Fastai Library Support function — add_datepart

37:38 — Jeremy Howard on Embedding size

44:37 — Rachel Thomas on ‘Rossmann Stores Competition’ paper

52:30 — Jeremy Howard on commercial & scientific opportunities

Thanks to Kris Skrinak @skrinak & Pallavi Gadgil for their amazing support and hard work in organizing the meetup and also to all the participants who spared time for the same. All feedback & suggestions appreciated. You are also welcome to visit our Easy AI page for more information from time to time.

Written by

Engineer, MBA (Finance) - Entrepreneurship, Software Architecture, Business, Management, IT Consulting, Advisory & Mentoring services.
