Most Asked AI/ML Interview Questions in India

Springboard India
22 min read · Dec 4, 2019


Once you have up-skilled and mastered Machine Learning and Artificial Intelligence, it is time to crack that interview. For those of you who still want to master the topic, visit our course on AI/ML.

In an interview for a Machine Learning and/or Artificial Intelligence job role, the process is much like a standard software engineering hiring process. Interview questions cover basic theoretical knowledge, software development, deep learning and applications.

So check out the following, and get ready to ace the interview!

Make sure to do your homework. Check out the hiring company's profile, the nature of its business, its exposure to typical industry problems, and other news related to its business model and future plans. Your preparation can set you apart from the herd!

The interviewer will check not only your knowledge, skillsets, programming know-how and project experience, but also what value you will bring to their team and how you will fit into their business plans.

A. INTRODUCTORY & GENERAL

MACHINE LEARNING

A1. How would you explain machine learning in layman terms?

This question checks your ability to explain complex concepts in easy terms.

What comes to mind when we see a baby? How the baby keeps learning from experience: it stumbles, it falls, and yet it gets up and tries to walk again. That is the essence of machine learning. The algorithms work the same way, refining the learning process iteratively to keep improving and give the best results. Just as the baby learns from experience, a machine learning model learns from data without explicit instructions.

A2. Do you have any training experience in machine learning? If so, what types of hands-on experience can you tell us about?

This answer depends upon your level of machine learning training and experience. Mention your machine learning certifications. Call attention to the projects you have worked on, whether under mentorship or as direct projects in a company. Highlight how it has helped to prepare you for a job role in machine learning.

A3. How do you catch up on your machine learning knowledge?

This question checks the interest level of the candidate and whether he or she is up-to-date with trends and innovative use cases.

Mention blogs, books and research papers you have read. Make sure to get into the habit of reading as part of your interview preparation.

Do you think it is important?

Yes, as this field is constantly evolving with new research methodologies, use cases and practical methods.

Mention the books and papers you have been reading.

Some examples would be:

  • Machine Learning by Tom M. Mitchell
  • Machine Learning & Big Data by Kareem Alkaseer
  • Learning Scikit-learn: Machine Learning in Python by Raul Garreta & Guillermo Moncecchi
  • Python Machine Learning by Sebastian Raschka & Vahid Mirjalili

A4. Do you have experience on Spark or any big data tools for machine learning?

This is a tricky question that calls for a forthright answer. To begin with, make sure to be familiar with big data and the tools used. Talk about Spark only if you know it, and limit your answer to your extent of knowledge.

To learn what Spark is, check out the big data tools available for machine learning.

Read on to prepare your answer on Spark.

Spark is the most popular big data tool because of its ability to handle large datasets with speed. It supports in-memory parallel computing, is open source and is compatible with Hadoop. It works well in an IoT network, and has a vast range of tools to work with, such as machine learning, interactive SQL and real-time processing, that help analyse real-time streaming data at high speed.
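To make this concrete, here is a minimal sketch of training a model with Spark's MLlib. It assumes a local Spark installation with pyspark available; the toy data and app name are hypothetical.

```python
# A minimal PySpark sketch; the toy data and app name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("ml-sketch").getOrCreate()

# Toy labelled data: two features per row
train = spark.createDataFrame(
    [(Vectors.dense([0.0, 1.1]), 0.0),
     (Vectors.dense([0.5, 1.2]), 0.0),
     (Vectors.dense([2.0, 1.0]), 1.0),
     (Vectors.dense([2.2, 1.3]), 1.0)],
    ["features", "label"])

# Fit a logistic regression model, distributed across the cluster
model = LogisticRegression(maxIter=10).fit(train)
print(model.coefficients)

spark.stop()
```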

A5. What is your favourite algorithm? Why?

This question checks your ability to explain complex technical concepts simply and summarise them efficiently. Select an algorithm that you can explain well.

Here, we have mentioned a sample algorithm. You can select any other algorithm you want.

Decision Trees

Decision Trees help determine feature importance by finding the "best" attribute to split the data on at each node of the tree. The problem of overfitting can be overcome by specifying the maximum depth of the tree, by setting the minimum sample size required to allow another split, or by pruning the final tree after it has been built (a short sketch follows the list below).

Its advantages are:

  • It is an easy-to-understand model
  • Feature selection is performed by the algorithm itself
  • Little data preparation is required
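As a minimal sketch of the overfitting controls mentioned above, assuming scikit-learn, a decision tree can be trained with a capped depth and a minimum split size; the parameter values here are purely illustrative.

```python
# Decision tree with overfitting controls, on a built-in dataset
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Limit tree depth and require a minimum sample size per split
clf = DecisionTreeClassifier(max_depth=3, min_samples_split=10)
clf.fit(X, y)

# The fitted tree exposes per-feature importance scores
print(clf.feature_importances_)
```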

A6. Tell us what algorithms are used in driverless cars.

This question checks your grasp of machine learning in the context of real world applications, and awareness of the latest trends.

The most common machine learning approach used in an autonomous vehicle is based on object tracking. It improves the accuracy of profiling and distinguishing between objects: is it another vehicle, a pedestrian, a bicycle, or an animal? A sophisticated machine learning or pattern recognition algorithm is trained on a dataset containing many images of such objects.

ARTIFICIAL INTELLIGENCE

A7. What is the philosophy behind Artificial Intelligence?

The exploding capabilities of computer systems led to the question, "Can a machine think and behave like humans do? Can machines have the same intelligence mechanisms as humans?" This curiosity led to the development of AI. AI was rooted in the philosophy that machines can behave in a similarly intelligent manner, with the add-on benefits of sophisticated automation.

A8. What are some advantages of Artificial Intelligence?

  1. Low error rate compared to humans, if coded properly. It is capable of very high precision, accuracy and speed.
  2. Can replace humans in repetitive and tedious tasks, thus saving time and resources.
  3. Its ability to predict what a human wants is significant, as in applications like digital "assistants".
  4. Unlike humans, AI can think logically without emotions, thus making rational decisions with near-zero mistakes.
  5. It has the ability to assess people intuitively, which is used in health-sector applications.
  6. Can organise and manage records in an intuitive manner.
  7. It has applications in daily life and across multiple scenarios, like browser search engines or banking fraud detection.
  8. As AI is not affected by hostile environments, it can perform dangerous and risk-ridden tasks, as in mining and space exploration.
  9. Can interact with humans to provide entertainment as avatars or robots, for instance in video games.
  10. Can do repetitive jobs without any breaks.

A9. Name some common uses and applications of AI.

This question tests your understanding of the AI field and how well you have grasped the far-reaching applications of AI.

When going for an interview, you are expected to know about the company and its business. So, if possible, highlight uses that are relevant to the interviewing company. It can earn you brownie points!

Applications and/or use cases:

  • object detection and classification in navigation
  • image recognition and tagging
  • predictive maintenance
  • data processing
  • automation of manual tasks
  • data-driven reporting
  • natural language processing
  • chatbots
  • sentiment analysis
  • sales prediction
  • self-driving cars
  • facial expression recognition
  • gaming
  • speech recognition

A10. Why is image recognition a key function of AI?

AI mimics humans. As humans are visual, AI is designed to imitate the human brain. Teaching machines to recognise and categorise images helps machines learn and become intuitive. The more images it processes, the more proficient AI becomes at recognising and processing them, whether objects, people, places, writing or photographs. The image recognition function of AI is especially important today, as it finds widespread application in daily life: in security systems, driverless cars, navigation, search engines, robots in logistics, and medical imaging.

A11. How is Game Theory related to AI?

Game theory is a framework for reasoning about strategic situations among competing players. AI uses game theory to evaluate the potential actions of opponents, where actions have a cost and some value. For instance, when writing 'agent software' to bid in auctions, the agent has to be intelligent enough to understand the game theory and strategy at work behind the bidding.

A12. What is your coding background? What kind of projects are you interested in?

This is a common question to check your proficiency and depth of project involvement. You need to be clear and concise when you answer this question. Try to mention projects that are aligned with the interviewing company’s business, as it can sometimes be a tie-breaker.

A13. What is AI technique?

An AI technique is an organised set of methods, derived from advanced statistical and mathematical models, designed for easy modification and error correction. It makes it possible for machines to perform tasks done by humans.

Some AI Techniques are:

  • Heuristics
  • Support Vector Machines
  • Artificial Neural Networks
  • Markov Decision Process
  • Natural Language Processing

A14. Can you list some disadvantages of Artificial Intelligence?

  1. High Costs of creation.
  2. High costs of repair and maintenance.
  3. The power to replicate human intelligence is limited, as intelligence is believed to be a gift of nature. Of course, there has been plenty of debate surrounding this thought process.
  4. Lacks the personal warmth of a human being, despite super-intelligent robots like Sophia.
  5. Lacks the original creativity that a human is capable of.

A15. What do you think is the future of Artificial Intelligence?

Artificial Intelligence is used today for the benefit of society and businesses. AI is already deeply present in our everyday lives, with more and more applications being implemented every day.

This poses several questions.

i) Is it possible for AI to outperform humans?

No. Although AI research has made huge advances, AI is still limited in that it lacks the human touch and creativity.

ii) Can artificial intelligence replace humans or take away human jobs?

No. Humans, after all, are required to train AI models. Besides, the computing costs of implementing AI widely are huge, so the benefits have to outweigh the costs for widespread implementation. Jobs are lost whenever there is automation, but such fears have always been offset by the new job roles that follow.

iii) For any organisation to implement AI, the factors to be considered are

  • Costs
  • Skillsets supporting AI development
  • Training with rules and boundaries, so that the right insights are automated.

AI has huge scope, and the future possibilities are unlimited. However, these issues must be carefully addressed so that AI is implemented for proactive development and useful applications.

B. THEORY / ALGORITHM

MACHINE LEARNING

B1. How do you differentiate between deductive and inductive machine learning?

Deductive reasoning allows you to make statements based on known facts. Inductive Reasoning on the other hand, allows you to make statements based on evidence collected.

Deductive machine learning starts with a conclusion based on facts and learns by deducing what is right or wrong about that conclusion. Inductive machine learning starts with examples and learns by drawing generalised conclusions from them.

B2. Explain SVM and why it is called a maximum margin classifier.

Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. It sorts the data into one of two categories, and outputs a map of the sorted data with the margin between the two categories as wide as possible.

It is known as a maximum margin classifier because, in a binary classification dataset, it places the decision boundary such that the distance between it and the nearest points of each class (the support vectors) is maximised. The SVM aims to find a separating hyperplane between positive and negative instances; establishing the largest margin helps avoid overfitting.
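A minimal sketch, assuming scikit-learn and toy two-dimensional data: a linear SVM exposes the support vectors that define the maximum margin.

```python
# Linear SVM on toy 2-D data; support vectors define the margin
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.support_vectors_)   # points lying on the margin
print(clf.predict([[4, 4]]))  # classify a new point
```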

B3. Explain feature selection

Feature selection is the automatic or manual selection of the attributes in the data (such as columns in tabular data) that are most significant to the predictive modelling problem. A subset of features is selected with a focus on accuracy, relevance and value to the output. Irrelevant and redundant attributes are identified and removed from the data to improve the accuracy of the predictive model. Using less, but more representative, data helps reduce the complexity of the model. Feature selection contributes to a robust predictive model by reducing overfitting, improving accuracy and reducing training time.
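A minimal sketch, assuming scikit-learn: univariate feature selection keeps only the k attributes most strongly associated with the target (k=2 is illustrative).

```python
# Automatic feature selection: keep the k most significant attributes
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)

print(selector.get_support())  # boolean mask of the selected features
print(X_reduced.shape)         # (150, 2)
```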

B4. Differentiate between Precision and Recall.

Precision and Recall are model evaluation metrics that measure Relevance of results. They are used in pattern recognition, information retrieval and binary classification.

a) Precision means the percentage of retrieved results that are relevant. Recall, on the other hand, refers to the percentage of all relevant instances that have been retrieved.

For example, in a text search over a set of documents, Precision is the fraction of retrieved documents that are relevant to the query, whereas Recall is the fraction of the relevant documents that are successfully retrieved.

b) Precision attempts to answer the following question:

What proportion of positive identifications was actually correct?

Recall attempts to answer the following question:

What proportion of actual positives was identified correctly?

c) Precision and Recall typically trade off against each other: tuning a model to increase one tends to reduce the other, and vice versa.

d) In simple terms, high precision means that an algorithm returned substantially more relevant results than irrelevant ones, while high recall means that an algorithm returned most of the relevant results.

e) Precision measures the quality, or relevancy, of the model's results. Recall measures the quantity of relevant results returned by the model.
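These definitions reduce to two simple formulas: Precision = TP / (TP + FP) and Recall = TP / (TP + FN). A minimal sketch with hypothetical labels, assuming scikit-learn:

```python
# Precision = TP / (TP + FP), Recall = TP / (TP + FN)
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # 3 TP, 1 FP, 1 FN

print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75
```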

B5. What is deep learning? How does it compare with other machine learning algorithms?

Deep learning is a machine learning technique that draws inspiration from the structure and function of the brain, in the form of artificial neural networks. Models are trained using large sets of labelled data and neural network architectures that contain many layers, and learn useful features from the data on their own. Deep learning models can achieve very high accuracy, and are well suited to large sets of unstructured or semi-structured data. Simply put, deep learning teaches computers to do what comes naturally to humans, i.e. learning by example. Classification tasks are performed directly on images, text or sound.

A classical machine learning algorithm parses data, learns from that data, and iteratively makes informed decisions based on what it has learnt. A deep learning algorithm, however, stacks layers to create an artificial "neural network" that learns and makes intelligent decisions on its own.
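A minimal sketch of such a layered network, assuming TensorFlow's Keras API; the toy data and layer sizes are hypothetical.

```python
# A small layered neural network in Keras; toy data is hypothetical
import numpy as np
import tensorflow as tf

X = np.random.rand(100, 4)            # 100 samples, 4 features
y = (X.sum(axis=1) > 2).astype(int)   # toy binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
```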

B6. How do you choose an algorithm for a classification problem?

There is no one-size-fits-all solution. Several factors go into the choice of a machine learning algorithm. The choice depends upon the level of accuracy required and the size of the training set. Here is a sample answer.

The method followed would be:

a) Define the problem.

Herein, the output of the model is a class, as it is a classification problem.

b) Identify available algorithms from linear and non-linear classifiers

  • Logistic Regression
  • Linear Discriminant Analysis
  • k-Nearest Neighbours
  • Classification and Regression Trees
  • Naive Bayes
  • Support Vector Machines

c) Implement all of them.

Next, set up a machine learning pipeline that compares the performance of each algorithm on the dataset, using a set of evaluation criteria or chosen metrics, and select the best performing one (a sketch of this comparison follows below). Depending on the results, the pipeline would either be run once, or rerun at intervals as new data is added.

d) Improve results using various optimisation methods

Using cross-validation (like k-fold) and hyperparameter tuning or ensembling (bagging, boosting, etc.), each algorithm would be tuned to optimise performance, if time is not a constraint. Otherwise, select the hyperparameters manually.
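A minimal sketch of the comparison in step (c), assuming scikit-learn: evaluate several candidate classifiers on one dataset with k-fold cross-validation and a common metric.

```python
# Compare candidate classifiers with 5-fold cross-validation
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "kNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(),
    "NaiveBayes": GaussianNB(),
    "SVM": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f}")
```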

B7. If your model suffers from low bias and high variance, which algorithm would you use to tackle it? Why?

The error of a model can come from bias and/or variance. Very low bias but high variance indicates overfitting, as well as an overly complex model. By averaging predictions out, we can reduce the variance at the cost of an increase in bias.

a) A bagging algorithm can handle the high variance. The dataset is randomly subsampled m times and a model is trained on each subsample. The models are then combined by averaging out their predictions (a sketch follows this list).

b) By using the k-nearest neighbour algorithm, the trade-off between bias and variance can be achieved. The value of k is increased to increase the number of neighbours that contribute to the prediction, and this in turn increases the bias of the model.

c) By using the support vector machine algorithm, the trade-off can be achieved by increasing the C parameter that influences the number of violations of the margin allowed in the training data, and this in turn increases the bias but decreases the variance.
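A minimal sketch of option (a), assuming scikit-learn: bagging decision trees (a high-variance base model) reduces variance by averaging their predictions.

```python
# Bagging a high-variance base model (decision trees by default)
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

single = DecisionTreeClassifier()
bagged = BaggingClassifier(n_estimators=50)  # 50 trees on random subsamples

print(cross_val_score(single, X, y, cv=5).mean())
print(cross_val_score(bagged, X, y, cv=5).mean())
```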

Credit: Understanding bias-variance tradeoff

ARTIFICIAL INTELLIGENCE

B8. What is the difference between Statistical AI and Classical AI?

Statistical AI has its roots in machine learning, and is more concerned with "inductive" thought: given a set of patterns, induce the trend. Classical AI is concerned with "deductive" thought: given a set of constraints, deduce a conclusion.

C++ is favoured for statistical AI, while LISP is used in classical AI.

However, a system cannot be truly intelligent without displaying properties of both inductive and deductive thought. So it is believed there will be some synthesis of statistical and classical AI in the future.

B9. In Artificial Intelligence, what is semantic analysis used for?

Semantic analysis is the process of understanding natural language, the way humans communicate, based on meaning and context of verbal expressions. In AI, semantic analysis is used to identify the most relevant elements in the text and understand the topic discussed.

B10. What is Fuzzy Logic?

In the real world, we often encounter situations where we cannot determine whether a state is true or false. In such cases, fuzzy logic provides a valuable reasoning method that closely reflects human reasoning. The approach accounts for the inaccuracies and uncertainties of any situation, much as humans weigh possibilities. Thus, fuzzy logic is based on "degrees of truth" rather than the usual "true or false" (1 or 0) Boolean logic on which modern computers are based. As a subset of AI, it encodes human learning for artificial processing and is represented as IF-THEN rules.
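A minimal sketch of "degrees of truth": a triangular membership function for a hypothetical fuzzy set such as "warm temperature" (the thresholds are illustrative).

```python
# A triangular membership function for the fuzzy set "warm temperature"
def warm_membership(temp_c, low=15.0, peak=25.0, high=35.0):
    """Return a degree of truth in [0, 1] instead of a strict True/False."""
    if temp_c <= low or temp_c >= high:
        return 0.0
    if temp_c <= peak:
        return (temp_c - low) / (peak - low)
    return (high - temp_c) / (high - peak)

print(warm_membership(18))  # 0.3 -> "somewhat warm"
print(warm_membership(25))  # 1.0 -> "fully warm"
print(warm_membership(33))  # 0.2 -> "barely warm"
```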

B11. What are some applications of Fuzzy Logic?

  • Facial pattern recognition
  • Home appliances like air conditioners, washing machines and vacuum cleaners
  • Antiskid braking systems, transmission systems
  • Control of subway systems and unmanned helicopters
  • Weather forecasting systems
  • Project risk assessment
  • Medical diagnosis and treatment plans
  • Aerospace, for altitude control of spacecraft and satellite
  • Speed control and traffic control in automotive systems
  • Decision making support systems and personal evaluation in large companies
  • Chemical industry applications for controlling the pH, drying, and chemical distillation process
  • Natural language processing and other intensive applications of AI
  • Stock trading

B12. What are the advantages of Artificial Neural Networks?

  • Artificial Neural Networks have the ability to learn and model non-linear and complex relationships between variables
  • ANNs can generalise, inferring relationships on unseen data as well
  • They require less formal statistical training
  • They have the ability to detect non-linear relationships between variables
  • They can better model data with high volatility and non-constant variance

B13. What is TensorFlow?

TensorFlow is an open-source machine learning library for numerical computation using data-flow graphs. It is a fast and flexible toolkit for implementing complex algorithms, offering developers the ability to build learning architectures for desired outputs. TensorFlow is cross-platform and runs on nearly everything: GPUs and CPUs, including mobile and embedded platforms, and even tensor processing units. TensorFlow also has a large number of open-sourced models that can be found in the tensorflow/models repo.
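A minimal sketch of TensorFlow as a numerical computation library, assuming TensorFlow 2.x: build a small computation and differentiate it automatically.

```python
# Numerical computation with automatic differentiation in TensorFlow
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x       # y = x^2 + 2x

grad = tape.gradient(y, x)     # dy/dx = 2x + 2 = 8 at x = 3
print(grad.numpy())
```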

B14. What are the various branches of AI?

  1. Artificial Neural Networks — A model based on the premise of biological neural networks.
  2. Fuzzy logic — A reasoning method used where the truth values of variables vary between 0 and 1.
  3. Pattern recognition — The automated recognition of regularities and patterns in data.
  4. Swarm Intelligence — The collective behaviour of decentralised, self-organised systems, whether natural or artificial.
  5. Genetic algorithm — A search enabler inspired by Charles Darwin's theory of natural evolution. The algorithm reflects the process of natural selection, where the fittest individuals are selected for reproduction in order to produce the offspring of the next generation.
  6. Expert Systems — A computer system that emulates the decision-making ability of a human expert. These are designed to solve complex problems by reasoning through bodies of knowledge, represented as IF-THEN rules rather than conventional procedural code.
  7. Data mining — The process of discovering patterns in large data sets, using intersecting methods from machine learning, statistics, and database systems.
  8. Statistical AI — A sub-discipline of artificial intelligence and machine learning that deals with domain models that exhibit uncertainty and a complex, relational structure.

B15. What is the Greedy Best-First Search algorithm?

It is a search algorithm that explores a graph by expanding the most promising node, chosen according to a specified rule. It is a heuristic search that efficiently selects the current best candidate, typically implemented using a priority queue. The A* search algorithm is an example of a best-first search algorithm, as is B*.
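A minimal sketch of greedy best-first search over a hypothetical graph: always expand the frontier node with the lowest heuristic value, using a priority queue.

```python
# Greedy best-first search on a toy graph, using a priority queue
import heapq

def greedy_best_first(graph, heuristic, start, goal):
    frontier = [(heuristic[start], start)]   # priority queue keyed on h(n)
    came_from = {start: None}
    while frontier:
        _, node = heapq.heappop(frontier)    # most promising node first
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in came_from:
                came_from[neighbour] = node
                heapq.heappush(frontier, (heuristic[neighbour], neighbour))
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
heuristic = {"A": 3, "B": 2, "C": 1, "D": 0}   # estimated distance to goal
print(greedy_best_first(graph, heuristic, "A", "D"))  # ['A', 'C', 'D']
```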

B16. In AI, what are an alternate key, artificial key, compound key and natural key?

Alternate Key — Any candidate key, except for primary keys.

Artificial Key — A key created artificially by assigning a number to an individual record, in the absence of a standalone or compound key.

Compound Key — Integration of multiple elements to create a unique identifier, in the absence of any data element that uniquely defines the occurrence within a construct.

Natural Key — A data element stored within a construct, utilised as the primary key.

C. PRACTICAL / PROGRAMMING

MACHINE LEARNING

C1. Where do you usually source datasets from?

This question displays your interest in machine learning and your experience. It gauges your ability to experiment with data and perform under different problem scenarios. Sometimes this can be the tie-breaker, if you have experimented with some interesting and prolific datasets that the interviewing company is keen on.

There are many free and open repositories of datasets available for building machine learning models: Amazon Product Data, Kaggle, sentiment analysis corpora, Socrata Open Data and the UCI Machine Learning Repository, for instance.

Spend time to explore and analyse. Have fun experimenting and visualising your data, while you prepare yourself for the ultimate machine learning job interview.

C2. How do you handle missing or corrupted data in a dataset?

The action taken depends upon the pattern of the missing or corrupt data: missing completely at random, missing at random, or missing not at random, which affects the primary dependent variables. A short pandas sketch follows the list below.

Possible options are:

  • Delete rows with missing values, if missing completely at random and percentage of missing values is low.
  • Recover the values by contacting the participants for sourcing the missing data
  • Use the average value, if there is not much variability in the data.
  • Use a more structured guess with a common point imputation
  • Use multiple regression analysis to predict the missing value from other values.
  • Use correlations between data with multiple imputation. Plausible values are created based on the correlations for the missing data, and the simulated datasets are averaged, incorporating random errors into the predictions.
  • Assign a unique category, when we want to prevent loss of data
  • Predict missing values using linear regression
  • Use algorithms that support missing values, like KNN, random forest or a tree-based method.
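A minimal pandas sketch of two of the options above, deleting rows with missing values and mean imputation; the data is hypothetical.

```python
# Handling missing values: row deletion vs. mean imputation
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 31, 40],
                   "income": [50_000, 62_000, np.nan, 58_000]})

dropped = df.dropna()            # option 1: delete rows with NaNs
imputed = df.fillna(df.mean())   # option 2: impute the column mean

print(dropped)
print(imputed)
```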

C3. What are some methods of reducing dimensionality?

Popular Techniques include:

i) Missing Values Ratio — When data columns contain too many missing values, remove those columns by setting a threshold on the ratio of missing values.

ii) Low Variance Filter — When a data column has near-constant values, its variance is close to 0 and such variables will not explain the variation in the target variable, so they can be dropped.

iii) High Correlation Filter — When data columns are interdependent and contain similar information, they add to the redundancy of the model. Highly correlated columns are identified using correlation coefficients and removed.

iv) Random Forest — Random Forests handle missing values and outliers well, and their built-in feature importance can be used to find the most informative subset of features.

v) Backward Feature Elimination — Eliminates features that do not add value to the model, one at a time, checking the error rate after each elimination, until the maximum tolerable error rate is reached. The smallest viable number of features is thereby found.

vi) Forward Feature Construction — Finds the most significant features that improve the performance of the model, adding them one at a time.

vii) Principal Component Analysis (PCA) — The existing set of variables is transformed into a new set of variables, each a linear combination of the original variables, ordered by the variance they explain.

viii) Factor Analysis — The variables are modelled as linear combinations of potential factors, plus "error" terms. It is based on the assumption that there exist several unobserved latent variables that account for the correlations among the observed variables.

ix) t-Distributed Stochastic Neighbour Embedding (t-SNE) — It considers the probability that pairs of data points in the high-dimensional space are related, and chooses low-dimensional embeddings that produce a similar distribution.

x) ISOMAP — Uses a matrix of pair-wise distances between all points and computes a position for each point. ISOMAP then uses classic multi-dimensional scaling (MDS) to compute reduced-dimensional positions of the points.
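A minimal sketch of technique (vii), assuming scikit-learn: project four-dimensional data onto the two principal components that explain the most variance.

```python
# PCA: project 4-D data onto its top 2 principal components
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (150, 2)
print(pca.explained_variance_ratio_)  # share of variance per component
```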

C4. What is stratified cross-validation and when is it used?

Where there is a large imbalance in the response variable, stratified cross-validation rearranges the data between training and validation sets so that each fold is a good representation of the whole dataset. It forces each fold to have at least m instances of each class.

Stratified cross-validation is used in the following events:

  1. When the dataset is small and has multiple categories, which creates an imbalance.
  2. When the dataset has different distributions and validation is required to prevent a generalisation problem.
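A minimal sketch, assuming scikit-learn: StratifiedKFold keeps the class ratio of an imbalanced target roughly constant across folds.

```python
# Stratified folds preserve the class distribution of an imbalanced target
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(20).reshape(-1, 1)
y = np.array([0] * 16 + [1] * 4)      # imbalanced: 80% class 0, 20% class 1

skf = StratifiedKFold(n_splits=4)
for train_idx, val_idx in skf.split(X, y):
    print(np.bincount(y[val_idx]))    # each fold: [4 1] -> same 80/20 ratio
```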

C5. Solve a problem relating to decision trees.

Consider the problem: from data on 70 patients, we have to estimate which of them are more prone to lung cancer. Only two attributes, 'Age' and 'Smoking habit', have been tested against the possibility of having lung cancer.

The decision tree model can estimate the probability of patients having lung cancer based on these two main attributes. It can also help predict and identify which new patients are most likely to have lung cancer.
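A minimal sketch of this problem, assuming scikit-learn; the patient records below are entirely hypothetical.

```python
# Toy decision tree on two attributes: age and smoking habit
from sklearn.tree import DecisionTreeClassifier

# Hypothetical records: [age, smoker (1 = yes, 0 = no)]
X = [[25, 0], [34, 1], [45, 1], [52, 0], [61, 1], [67, 1], [39, 0], [58, 0]]
y = [0, 0, 1, 0, 1, 1, 0, 0]   # 1 = lung cancer observed

clf = DecisionTreeClassifier(max_depth=2)
clf.fit(X, y)

# Estimated probability of lung cancer for a new 55-year-old smoker
print(clf.predict_proba([[55, 1]]))
```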

C6. What is an imbalanced dataset? Can you list some ways to deal with it?

An imbalanced dataset is one where the distribution of data across the target categories is not uniform. For example, in an email classification problem, there will typically be more spam mails than ham (relevant) mails. The class imbalance may be as high as 70–95% for the spam class and as low as 5–30% for the relevant mails. This disproportionate distribution between the two classes of data is an imbalanced dataset.

Using an imbalanced dataset affects the performance and accuracy of the trained model, and needs to be corrected.

Good ways to deal with imbalanced datasets focus on correcting the imbalance, when there is no option to use another algorithm. Some ways are (a resampling sketch follows the list):

  • Oversampling of the minority class when the data is insufficient.
  • Undersampling of the majority class when there is a good quantity of data to work with.
  • Collecting more data and adding the data in the lighter category to control the imbalance.
  • Cluster-based oversampling so that all classes are of the same size and clusters of the same class have equal number of instances.
  • Generate synthetic samples by randomly sampling the attributes from instances in the minority class and adding to the dataset.
  • Resampling with different ratios between the rare and abundant class
  • Using appropriate metrics to deal with the imbalance, for instance precision, recall, F-score and the confusion matrix, to ensure better accuracy of the model.
  • Modifying existing classification algorithms and designing own models that work best with imbalanced datasets.
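A minimal sketch of random oversampling of the minority class, assuming scikit-learn's resample utility; the data is hypothetical.

```python
# Random oversampling of the minority class with sklearn.utils.resample
import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(-1, 1)
y = np.array([0] * 8 + [1] * 2)       # 8 majority vs. 2 minority samples

X_min, y_min = X[y == 1], y[y == 1]
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=8, random_state=42)

X_bal = np.vstack([X[y == 0], X_min_up])
y_bal = np.concatenate([y[y == 0], y_min_up])
print(np.bincount(y_bal))             # [8 8] -> balanced classes
```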

ARTIFICIAL INTELLIGENCE

C7. How to connect to the Amazon server using PuTTY?

Ideally, you will have had exposure to working with an Amazon server. However, you can check out this resource.

C8. Which is the best way to approach a game playing problem?

A heuristic approach is the best way, as it avoids brute-force computation over hundreds of thousands of positions; for instance, in a chess match between a human and an AI-based computer.
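A minimal sketch of heuristic game playing: minimax with alpha-beta pruning over a hypothetical game tree whose leaves hold heuristic scores. Pruning skips branches the opponent would never allow, avoiding brute-force evaluation.

```python
# Minimax with alpha-beta pruning over a toy game tree
def minimax(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):       # leaf: heuristic evaluation
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, minimax(child, False, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:                # prune: opponent won't allow this
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, minimax(child, True, alpha, beta))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best

# Each inner list is a move choice; numbers are heuristic leaf values.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, maximizing=True))  # 3: best guaranteed outcome
```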

C9. What libraries are already available in a deep learning AMI?

  • MXNet
  • TensorFlow
  • Keras with TensorFlow as the default backend
  • Keras with MXNet as the default backend
  • Caffe
  • CNTK
  • Theano
  • PyTorch
  • NVIDIA CUDA
  • cuDNN

C10. How would you train your Deep Neural Network?

Prepare yourself by looking up these excellent resources.

How to train your Deep Neural Network

A bunch of tips and tricks

Building your Deep Neural Network: Step by Step

C11. What are the benefits of using artificial intelligence in testing?

To streamline software testing and make it more effective, or smarter, within time constraints, an AI-powered continuous testing platform is a proven method. It can identify changed controls much more effectively than a human. With steady and continuous updates to the algorithms, even a small change can be observed.

Advantages:

a) AI test automation has much more capability than manual testing as it can simulate any number of virtual sets of users to interact with a software, a network, or web-based apps.

b) High accuracy of test results.

c) Supports both developers and testers by sharing automated tests before the code reaches Quality Assurance.

d) Saves time and money as it ensures faster time to market. AI automated tests can be implemented again and again, with low to zero additional cost at a speedy pace.

e) The overall test coverage can be increased for better software quality

C12. Can we apply a deep learning classifier to biometric authentication?

Yes, using these five basic steps: 1) Acquisition, 2) Pre-processing, 3) Registration and segmentation, 4) Feature extraction, and 5) Classification.

Suggested reads: a) Learning Pairwise SVM on Hierarchical Deep Features for Ear Recognition

b) Deep Features for Efficient Multi-Biometric Recognition with Face and Ear Images

C13. Which method would you prefer for pattern recognition, and why?

Pattern recognition methods can be parametric or non-parametric. Choosing the best method depends on many factors, such as the available computational power, the amount of available data, the dimension of the feature space, the distribution of the data, the application and the task.

The best pattern recognition algorithm depends on the class of problem. If the conditional probability distributions of objects of the different classes are known, we may use Bayesian methods of classification. If the conditional probabilities are not known, we would use discriminant methods such as SVM. To recognise optical images, we could implement convolutional neural networks.

C14. How does Facebook use Image Analysis?

Facebook uses a deep learning application called DeepFace that works as an advanced image recognition tool. It detects users' friends by matching newly uploaded pictures with the ones tagged elsewhere. The system is fed large amounts of training data and uses machine learning techniques like neural networks to classify and recognise the uploaded images.

C15. Mention how you would use AI for fraud detection in banking transactions.

The following AI techniques would be used:

  • Data mining to classify, cluster and segment the data; like high transaction or cross-border transactions. Then, automatically find associations and rules in the data that signify suspicious patterns related to fraud.
  • Expert systems to encode expertise in detecting fraud in the form of rules governing banking regulations.
  • Pattern recognition to detect approximate classes, clusters, or patterns of suspicious behaviour either automatically (unsupervised) or to match given inputs.
  • Machine learning techniques to identify characteristics of fraud.
  • Neural networks that can learn suspicious patterns from samples and can later be used to detect them.

D. DOMAIN / INDUSTRY BASED / COMPANY SPECIFIC

D1. How would you implement a recommendation system for our company’s users?

This is merely a sample question. You can expect many such questions relating to the implementation of machine learning models in the hiring company's systems.

To prepare for such question types, you need to do your homework. Study the company profile, its current financials, its customer profile, and its business and service/product offerings as part of your interview preparation. A minimal recommender sketch follows the resources below.

Some great resources:

How to implement a recommender system

Learning to make recommendations

How do recommendation engines work?
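A minimal item-based collaborative filtering sketch using cosine similarity on a hypothetical user-item ratings matrix (0 means unrated); real systems would add normalisation and far more data.

```python
# Item-based collaborative filtering via cosine similarity
import numpy as np

# Rows = users, columns = items; ratings are hypothetical
ratings = np.array([[5, 4, 0, 1],
                    [4, 5, 1, 0],
                    [1, 0, 5, 4],
                    [0, 1, 4, 5]], dtype=float)

# Cosine similarity between item columns
norms = np.linalg.norm(ratings, axis=0)
item_sim = (ratings.T @ ratings) / np.outer(norms, norms)

# Score unseen items for user 0 by similarity-weighted ratings
user = ratings[0]
scores = item_sim @ user
scores[user > 0] = -np.inf            # don't re-recommend rated items
print(np.argmax(scores))              # index of the recommended item
```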

D2. What is your opinion on our current data process?

As suggested, you need to have a good knowledge of the hiring company business process. Identify their business model; understand what processes they use, and the areas of potential improvement.

Your reply should be constructive, precise and insightful. Give your interviewer the chance to understand your potential and your value to their team.

D3. How can we use your machine learning skills to generate revenue?

This is a tricky question, and often can be the tiebreaker. Your answer should exhibit knowledge of the industry, the company’s business process and the relevance of your skills.

For instance, you could mention your skills at developing NLP algorithms to make customer interactions more personal for CX; automation of financial processes for cost savings; in supply chain management; to discover patterns in pilferage, and so on.

Although these are merely examples, your answers have to be relevant to the hiring company's business process and the associated industry problems.

D4. How can you help our marketing team be more efficient?

The answer depends on the type of company.

Here are some examples.

  • Clustering algorithms to build customer segments customised to the marketing campaign.
  • Predict conversion of website visits based on a 360-degree view of user behaviour, to create better upselling and cross-selling campaigns.

D5. How would you suggest we implement virtual personal shoppers in our company?

This is a sample question you can expect if the interviewing company is an e-commerce business. Here are some excellent resources to notch up your knowledge base. Use open source data and work on similar projects to hone your experience.

How to develop a personal shopper app based on artificial intelligence

How to develop a shopping assistant app using data intelligence and human consultants

