Salesforce Data Science Interview: Acing the AI Interview

In its 10-K, Salesforce describes the Salesforce Platform as a full suite of services spanning Trailhead, Einstein AI, Lightning, IoT, Heroku, Analytics and the AppExchange. It is a complete product suite for the enterprise, and AI is part of it. To improve data flow and fuel growth, Salesforce bought MuleSoft at an enterprise value of roughly 21x its revenue.


Salesforce has built up a sizable AI arsenal through the acquisitions it has made since 2015. If there is a company where building AI tools can directly impact the enterprise, it is Salesforce. Salesforce has overtaken Tesla Motors (No. 2) to regain the top spot as the world’s most innovative company, a title it held from the list’s inception in 2011 until 2015. Salesforce has followed the 1–1–1 philanthropy model since its inception, which I am personally a big fan of. I also consider Salesforce CEO Marc Benioff a visionary for the cloud, as Salesforce was at the forefront of the move to the cloud when it started in 1999. He is also a champion of initiatives such as Equal Pay.

Interview Process

The interview process is a bit different from the usual AI interview process. The phone screen usually lasts 30 minutes and is a dialogue about your research background. After the phone screen, there are 5–6 rounds of on-site interviews. On site, you give a 30-minute research talk to the entire group before the interviews start. The interviews are a mixture of 1–1 and 2–1 sessions, focusing on your background and problem-solving skills in machine learning.

Important Reading Specific to Salesforce

  1. Learning Salesforce AI (Einstein): Trailmix
  2. Using Einstein Platform: Einstein Dev Tools and API
  3. Auto-Machine Learning: The Magic Behind Einstein (YouTube video)
  4. Inside Salesforce AI: Salesforce’s Quest to bring AI to everyone

AI/Data Science Related Questions

  • Find the minimum number of coins required for a given amount with a given set of coins.
  • Implement scalable top K words for Amazon product descriptions using count-min sketch.
  • What is the computational complexity of finding the most frequent word in a document?
  • Imagine you were hired by a hospital to analyze some of their patient reports. The diagnosis in each report consists of a list of disease IDs, while the body of the report contains the symptoms and medical history of the patient, logged by the doctor in natural language. Propose a model to describe the correlations between the set of symptoms and the set of diseases extracted from the patient reports.
  • If you have 10 TB of unstructured customer data and need to find clever ways to extract valuable information from it, what do you do? Explain at the most technical level you can.
  • Explain decision trees, multivariate regression and PCA.
  • How would you build a classifier to predict the outcome of NFL games in real time?
  • Explain a project on your resume in layman’s terms.
  • Explain the assumptions of K-means clustering and what happens when they are violated.
  • Describe feature selection for document classification.
  • Given a dataset, explain feature importance.
  • Describe the parameters of logistic regression and of SVM.
  • Describe decision trees and random forests.
  • Explain SVM to a non-technical person.
  • Describe your experience with qualitative and quantitative methods.
  • What is the difference between a decision tree and a random forest?
  • Based on the data we provide, what kinds of questions would you ask to show that the programs generating that data are adding value to the company?
  • How would you design a system that scales to billions and billions of records? How do you decide which big data tools should be used?
  • Explain the different types of probability distributions, and how you would plot them with a tool like ggplot.
  • How would you model a distribution for churn?
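
For the first question in the list (minimum number of coins for a given amount), one common approach is bottom-up dynamic programming. The sketch below is one possible answer, not necessarily the one an interviewer expects. The function name `min_coins` and the convention of returning -1 when the amount cannot be formed are assumptions made for illustration.

```python
def min_coins(coins, amount):
    """Return the fewest coins needed to make `amount`, or -1 if impossible.

    Assumes an unlimited supply of each positive-integer denomination.
    """
    INF = float("inf")
    # best[t] holds the fewest coins found so far that sum to exactly t.
    best = [0] + [INF] * amount
    for total in range(1, amount + 1):
        for coin in coins:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
    return best[amount] if best[amount] != INF else -1


if __name__ == "__main__":
    print(min_coins([1, 2, 5], 11))
```

The running time is proportional to the amount multiplied by the number of denominations, which is worth stating out loud in an interview. A greedy strategy (always take the largest coin) is simpler but fails for sets such as 1, 3 and 4 with a target of 6.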

All questions are taken from public sources. They are meant as a descriptive guide rather than a prescriptive one.
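
The top-K words question mentions a count-min sketch, so here is a minimal single-machine sketch of that idea. It is an illustration under stated assumptions, not a production design: the class `CountMinSketch`, the function `top_k_words`, the default `width` and `depth`, and the whitespace tokenizer are all hypothetical choices of mine. Counts from the sketch are estimates that can overshoot the true count but never undershoot it.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency table; estimates never fall below the true count."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row, so each row uses a different hash function.
        data = item.encode("utf-8")
        for row in range(self.depth):
            salt = row.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(self.table[row][col] for row, col in self._cells(item))


def top_k_words(descriptions, k, width=2048, depth=4):
    """Return up to k (word, estimated_count) pairs, most frequent first."""
    sketch = CountMinSketch(width, depth)
    candidates = {}  # word -> latest estimate, never more than k entries
    for text in descriptions:
        for word in text.lower().split():
            sketch.add(word)
            est = sketch.estimate(word)
            if word in candidates or len(candidates) < k:
                candidates[word] = est
            else:
                weakest = min(candidates, key=candidates.get)
                if est > candidates[weakest]:
                    del candidates[weakest]
                    candidates[word] = est
    return sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = ["red shoes red laces", "blue shoes", "red hat"]
    print(top_k_words(docs, 2))
```

Memory stays fixed at `width * depth` counters plus `k` candidates, no matter how many distinct words appear. That is the contrast with the exact approach from the most-frequent-word question, where a hash map gives the answer in time linear in the number of words but needs memory proportional to the vocabulary. To scale out, each worker could keep its own sketch of the same shape, and the tables could be merged by adding them cell by cell.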

Reflecting on the Questions

Salesforce interviews are rated highly for quality. People who go through the process mention that the company is very professional. The research talk on your own topic tests a skill that few other interviews ask for. I believe that skill is of paramount importance to any data scientist, and practicing it improves their abilities considerably. The questions are a healthy mix of coding and presenting your thoughts. There are also good questions about the scale of systems and data.


The sole motivation of this blog article is to learn about Salesforce and its AI technologies, and to help people get a job there. All data is taken from public sources. I aim to make this a living document, so updates and suggested changes are always welcome. Please provide relevant feedback.
