Out of Einstein’s Lab: Salesforce Research Unveils Breakthroughs in Deep Learning

By Richard Socher, Chief Scientist, Salesforce

Nov 15, 2016


Artificial intelligence will be at the core of most enterprise products that deal with large amounts of text, structured data and image data. Salesforce Einstein strives to bring AI to use cases in service, sales, marketing and more by embedding it directly into Salesforce products and enabling developers through our platform. AI will empower every company to deliver smarter, more personalized customer experiences.

Democratizing AI for everyone is challenging, but it is something Salesforce is uniquely positioned to take on. Our mission is to bring the power of AI to CRM, and Salesforce Research is focused on bringing cutting-edge algorithms into the Salesforce CRM ecosystem, ensuring that our customers benefit from the latest breakthroughs in AI.

In less than two months since its inception, the Salesforce Research team has made incredible progress. Today I’m excited to announce our breakthroughs in deep learning, including question answering, joint many-task learning, faster and more accurate neural networks, and models that can predict previously unseen words. The team’s findings have broad applicability as AI continues to move into the enterprise.

Some of the authors of the first large batch of Salesforce Research papers. From left to right: Caiming Xiong, Kazuma Hashimoto, Richard Socher, Victor Zhong, James Bradbury

See below for more details on the four most interesting projects and papers.

Better Question Answering through Conditional Understanding of Text with Dynamic Coattention Networks

Open-domain question answering, a system’s ability to answer arbitrary questions about arbitrary documents, is one of the most difficult, albeit important, challenges we face in natural language processing (NLP). We introduced the Dynamic Coattention Network (DCN), a state-of-the-art neural network system. The DCN interprets a document in light of the specific question asked, building a different conditional representation of the document for each question; it then iteratively hypothesizes multiple answers and weeds out incorrect initial predictions to arrive at the most accurate and intuitive answers. The DCN achieves the highest accuracy on the Stanford Question Answering Dataset (SQuAD), significantly outperforming all systems submitted to date, including those developed by the Allen Institute for Artificial Intelligence, Microsoft, Google and IBM. Learn more by viewing:
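To make the coattention idea concrete, here is a minimal PyTorch sketch of the core step: an affinity matrix between document and question encodings is normalized in both directions, and the two attention summaries are fused into a question-conditioned document representation. The tensor layout and function name are our own illustration, not the paper’s released code; the full DCN adds encoders below and a dynamic pointer decoder on top.

```python
import torch
import torch.nn.functional as F

def coattention(D, Q):
    """A minimal sketch of the DCN's coattention step.

    D: (batch, m, dim) document encoding
    Q: (batch, n, dim) question encoding
    Returns a question-conditioned document representation,
    shape (batch, m, 2 * dim), to feed a downstream decoder.
    """
    # Affinity between every document word and every question word.
    L = torch.bmm(D, Q.transpose(1, 2))               # (batch, m, n)

    # Attention in both directions over the same affinity matrix.
    A_Q = F.softmax(L, dim=1)                          # normalize over document words
    A_D = F.softmax(L, dim=2)                          # normalize over question words

    # Summaries of the document for each question word ...
    C_Q = torch.bmm(A_Q.transpose(1, 2), D)            # (batch, n, dim)

    # ... then summaries of the question and of C_Q for each document
    # word: the coattention context, conditioned on the question.
    C_D = torch.bmm(A_D, torch.cat([Q, C_Q], dim=2))   # (batch, m, 2 * dim)

    return C_D
```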

Growing a Neural Network for Multiple NLP Tasks

When learning a language, we start with basic words, move on to short phrases, and eventually understand the meaning of complex sentences. NLP models, by contrast, are usually designed to handle a single task or a few closely related tasks. Salesforce Research developed a joint many-task model that handles a variety of NLP tasks within a single deep model. The model is first trained on basic tasks and gradually moves to more complex ones until it has learned them all. A simple regularization strategy prevents the model from forgetting previously learned tasks while still letting the tasks interact with each other to improve accuracy. In our experiments on five different types of NLP tasks, this single model achieves state-of-the-art results on syntactic chunking, dependency parsing, semantic relatedness and textual entailment. Learn more about the research:
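As a rough illustration of the regularization idea, the sketch below adds a penalty that keeps parameters close to a snapshot taken after the previous training stage, so learning a new task does not overwrite what earlier tasks taught the model. The helper name, the single global snapshot, and the coefficient `lam` are simplifying assumptions, standing in for the paper’s per-layer scheme:

```python
import torch

def successive_regularization_loss(task_loss, model, prev_params, lam=1e-2):
    """Sketch: penalize drift from the parameters learned on the
    previous, lower-level tasks while training the current task.

    prev_params: dict of parameter name -> tensor snapshot taken
    after the previous training stage finished.
    """
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in prev_params:
            penalty = penalty + ((p - prev_params[name]) ** 2).sum()
    return task_loss + lam * penalty

# After finishing a stage (e.g. the most basic task), snapshot the
# parameters before moving on to the next, more complex task:
#   prev_params = {n: p.detach().clone() for n, p in model.named_parameters()}
```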

Understanding Text Faster and More Accurately with New Neural Network Building Block

The traditional deep learning approach to processing text is the “recurrent neural network,” which takes in text word by word, from beginning to end, just as a person reads a paragraph. Salesforce Research improved on this idea by introducing the “quasi-recurrent neural network,” or QRNN. This model processes an entire sequence of text at once (in parallel) and then quickly takes context into account to make any needed corrections. Amazingly, the quasi-recurrent approach is up to 16 times faster than the traditional approach and yields better results than conventional deep learning models on sentiment analysis, next-word prediction and translation. Learn more about the research:
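The sketch below shows roughly how a quasi-recurrent layer works: a causal convolution computes candidate values and gates for every timestep in parallel, and only a cheap element-wise pooling recurrence runs sequentially. This is a simplified single-layer reading of the paper’s “fo-pooling” variant, with our own class and argument names, not the released implementation:

```python
import torch
import torch.nn as nn

class QRNNLayer(nn.Module):
    """Sketch of one quasi-recurrent layer: convolution in parallel
    over time, followed by an element-wise sequential pooling step."""

    def __init__(self, input_size, hidden_size, kernel_size=2):
        super().__init__()
        self.pad = kernel_size - 1  # left-padding keeps the conv causal
        # One convolution produces candidates Z and gates F, O at once.
        self.conv = nn.Conv1d(input_size, 3 * hidden_size, kernel_size)

    def forward(self, x):
        # x: (batch, time, input_size); Conv1d wants (batch, channels, time).
        x = nn.functional.pad(x.transpose(1, 2), (self.pad, 0))
        z, f, o = self.conv(x).chunk(3, dim=1)
        z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)

        # fo-pooling: the only sequential part, and it is element-wise,
        # so it is far cheaper than a full recurrent matrix multiply.
        c = torch.zeros_like(z[:, :, 0])
        outputs = []
        for t in range(z.size(2)):
            c = f[:, :, t] * c + (1 - f[:, :, t]) * z[:, :, t]
            outputs.append(o[:, :, t] * c)
        return torch.stack(outputs, dim=1)  # (batch, time, hidden_size)
```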

Improving Language Modeling and Translation with the Pointer Sentinel Mixture Model

Most neural networks use a predefined vocabulary of tens of thousands of words, but even that fails to capture the full vocabulary of natural language. As a result, a system’s translation abilities are somewhat limited. Salesforce Research introduced the pointer sentinel mixture model, which allows a neural network to “point” to relevant previous words, just as a child points to an object it doesn’t know the name for. This helps in machine translation and language modeling, where rare words such as uncommon foreign names may not be covered by a standard vocabulary. The pointer sentinel mixture model promises to expand the effective vocabulary of existing neural networks, assisting models on tasks from question answering to machine translation. Learn more about the research:
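Conceptually, the mixture works as sketched below: a learned sentinel score is appended to the pointer’s attention scores over earlier positions, and the probability mass the softmax assigns to the sentinel decides how much to trust the ordinary vocabulary softmax versus pointing at previously seen words. Shapes and argument names here are illustrative assumptions, not the paper’s code:

```python
import torch
import torch.nn.functional as F

def pointer_sentinel_mixture(p_vocab, ptr_scores, sentinel_score, history_ids):
    """Sketch: mix a softmax vocabulary distribution with a pointer
    distribution over previously seen words.

    p_vocab:        (batch, vocab_size) softmax over the fixed vocabulary
    ptr_scores:     (batch, history)    attention scores over earlier positions
    sentinel_score: (batch, 1)          score for the learned sentinel
    history_ids:    (batch, history)    word ids at those earlier positions
    """
    # Softmax over pointer scores together with the sentinel: the mass g
    # on the sentinel is how much we trust the vocabulary softmax, and
    # the pointer mass automatically sums to 1 - g.
    joint = F.softmax(torch.cat([ptr_scores, sentinel_score], dim=1), dim=1)
    p_ptr, g = joint[:, :-1], joint[:, -1:]

    # Scatter the pointer mass back onto vocabulary ids: a rare word that
    # occurred earlier in the text gets probability even if the ordinary
    # softmax assigns it almost none.
    p_ptr_vocab = torch.zeros_like(p_vocab).scatter_add_(1, history_ids, p_ptr)

    return g * p_vocab + p_ptr_vocab
```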
