AI Interview System using Generative AI

Using Multi-Agent Orchestration to automate a multi-round interview process, with the code explained

Mehul Gupta
Data Science in your pocket


Multi-Agent Orchestration is an emerging sub-domain of Generative AI that uses a combination of AI Agents to solve complex tasks going well beyond plain text generation and completion. In my previous posts, I’ve already explained the basics of Multi-Agent Orchestration, followed by detailed posts on a Multi-Agent Debate application and Movie Script writing. Extending this list of POCs, this post covers a more complex use case: an entire AI interview panel for interviewing a candidate for a specific role.

Let’s first understand the entire flow of the application:

The user inputs the role to be interviewed for

Using Generative AI, an interview panel suitable for this role is decided

Each interviewer asks a set of questions (one round per interviewer)

At the end of each round, the interviewer passes on feedback for that round

Once all rounds are done, the feedback from every interviewer is considered and a final decision on the interviewee is taken
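Put together, the control flow we are about to wire up with LangGraph looks roughly like this (a plain-Python sketch; the function names are illustrative, not the actual node names):

#Rough sketch of the interview loop (illustrative names, not the LangGraph nodes)
panel = decide_panel(role)                 # LLM suggests the interviewers
feedback = []
history = ""
for interviewer in panel:                  # one round per panel member
    for _ in range(questions_per_round):   # fixed number of questions per round
        question = interviewer_asks(history)
        answer = candidate_answers(question)
        history += question + answer
    feedback.append(rate_round(history))   # interviewer rates the round
verdict = final_decision(feedback)         # final hire / no-hire call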

For this prototype, we will again be using LangGraph. If you’re new to it, you can check the beginner’s tutorial below (this is a prerequisite):

Before we move ahead, you will also need to generate a free Google Gemini API key, as explained in the short video below.

Once you’re done with the prerequisites, let’s get our hands on the code.

My debut book: “LangChain in your Pocket” is out now

1. Install and import packages and set up your Gemini API key.

!pip3 install langchain==0.1.11 langgraph==0.0.26 -q -U google-generativeai langchain-google-genai
#Importing packages
from typing import Dict, TypedDict, Optional
from langgraph.graph import StateGraph, END
from langchain.output_parsers import CommaSeparatedListOutputParser
import os
import google.generativeai as genai
from langchain_google_genai import ChatGoogleGenerativeAI

#Setting up the Google API key and the Gemini model
GOOGLE_API_KEY = ''   # paste your Gemini API key here
genai.configure(api_key=GOOGLE_API_KEY)
model = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)

#Helper so the nodes can call the LLM and get back plain text
def llm(x):
    return model.invoke(x).content
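Optionally, a quick sanity check to confirm the key and model are wired up (assuming the API key above is valid):

#Optional sanity check: should print a short reply if the key works
print(llm("Say hello in one short sentence"))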

2. Next, let’s set up the required StateGraph variables.

#Setting StateGraph variables

class GraphState(TypedDict):
    all_history: Optional[str] = None
    history: Optional[str] = None
    result: Optional[str] = None
    final_result: Optional[str] = None
    total_questions: Optional[int] = None
    interviewer: Optional[str] = None
    candidate: Optional[str] = None
    current_question: Optional[str] = None
    current_answer: Optional[str] = None
    round: Optional[int] = None
    panel: Optional[list] = []
    total_rounds: Optional[int] = None
    total_questions_per_round: Optional[int] = None

workflow = StateGraph(GraphState)

Some of the important ones to understand are:

all_history : Saves the entire interview

history : Saves the current round’s conversation

result : Feedback for the current round

final_result : Feedback for the entire interview and whether the interviewee was selected or not

The rest of the variables are self-explanatory, so I’ll skip them for now.
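To make these concrete, here is a hypothetical snapshot of the state midway through the second round (all values are illustrative):

#Hypothetical state snapshot during round 2 (values are illustrative)
{
  'interviewer': 'Data science manager',    # current panel member
  'candidate': 'data scientist',
  'round': 1,                               # 0-indexed round counter
  'total_questions': 2,                     # questions asked so far this round
  'total_questions_per_round': 3,
  'history': 'Data science manager: ...',   # current round's conversation
  'all_history': '...',                     # everything said across rounds
  'result': '...feedback so far...',        # accumulated round feedback
}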

3. Define the required prompts alongside the nodes for the graph object.

#Different Agent prompts used

prompt_interviewer = "You're a {}. You need to interview a {}. This is the interview so far:\n{}\n\
Ask your next question and don't repeat your questions. \
Output just the question and no extra text"

prompt_interviewee = "You're a {}. You've appeared for a job interview. \
Answer the question asked. Output just the answer as a paragraph and no extra text. Question:{}"

prompt_result = "Evaluate the performance of the candidate based on the responses given to the questions asked and \
give a rating on a scale of 10. Explain the rating in a very short paragraph.\nThe interview:\n{}"

prompt_verdict = "Given the interview, should we select the candidate? \
Give output as Yes or No with a reason.\nThe interview:\n{}"

prompt_cleanup = "Remove empty dialogues, repeated sentences and repeated names to convert this input into a conversation. \
Output just the interview and no extra text.\nInterview:\n{}"

#Defining nodes for LangGraph
def handle_question(state):
    history = state.get('history', '').strip()
    role = state.get('interviewer', '').strip()
    candidate = state.get('candidate', '').strip()
    prompt = prompt_interviewer.format(role, candidate, history)
    question = role + ":" + llm(prompt)
    if history == 'Nothing':   # placeholder marking a freshly started round
        history = ''
    return {"history": history + '\n\n' + question,
            "current_question": question,
            "total_questions": state.get('total_questions') + 1}

def handle_response(state):
    history = state.get('history', '').strip()
    question = state.get('current_question', '').strip()
    candidate = state.get('candidate', '').strip()
    prompt = prompt_interviewee.format(candidate, question)
    answer = candidate + ":" + llm(prompt)
    return {"history": history + '\n\n' + answer, "current_answer": answer}

def handle_result(state):
    history = state.get('history', '').strip()
    interviewer = state.get('interviewer', '').strip()
    round = state.get('round')
    panel = state.get('panel')
    prompt = prompt_result.format(history)
    result = llm(prompt)
    all_history = state.get('all_history', '').strip()
    result = " \nInterviewed by {}\n{}".format(interviewer, result)
    print("ROUND {} done by {}".format(round + 1, interviewer))
    try:
        next = panel[round + 1]   # next interviewer in the panel
    except IndexError:            # last round: nobody left to interview
        next = ''
    return {"result": state.get('result') + '\n' + result,
            "history": 'Nothing',
            "all_history": all_history + '\n\n INTERVIEWED BY {} \n'.format(interviewer) + history,
            "round": round + 1,
            "interviewer": next,
            "total_questions": 0}

def handle_selection(state):
    result = state.get('result', '').strip()
    prompt = prompt_verdict.format(result)
    result = llm(prompt)
    return {"final_result": result}

The nodes deserve a brief explanation:

handle_question : Lets the current interviewer agent generate the next question based on the conversation so far (history)

handle_response : Generates the candidate’s answer to the question just asked

handle_result : Based on the entire round, the interviewer gives feedback and a rating for the interviewee, and the next panel member takes over

handle_selection : Based on the feedback from all rounds, decides whether the interviewee should be selected or not
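Since the nodes are plain functions, you can test one question-answer cycle in isolation before wiring the graph; a minimal sketch (the state keys mirror GraphState):

#Testing a single question-answer cycle outside the graph (illustrative)
state = {'history': 'Nothing', 'interviewer': 'Hiring manager',
         'candidate': 'data scientist', 'total_questions': 0}
state.update(handle_question(state))   # interviewer asks the first question
state.update(handle_response(state))   # candidate answers it
print(state['history'])                # one full exchange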

4. Add the above-defined nodes to the workflow graph object.

workflow.add_node("handle_question",handle_question)
workflow.add_node("handle_response",handle_response)
workflow.add_node("handle_result",handle_result)
workflow.add_node("handle_selection",handle_selection)

5. Add the required conditional edges.

#Defining conditional edges

def check_conv_length(state):
    return "handle_question" if state.get("total_questions") < state.get("total_questions_per_round") else "handle_result"

def check_rounds(state):
    return "handle_question" if state.get("round") < state.get("total_rounds") else "handle_selection"

workflow.add_conditional_edges(
    "handle_response",
    check_conv_length,
    {
        "handle_question": "handle_question",
        "handle_result": "handle_result"
    }
)

workflow.add_conditional_edges(
    "handle_result",
    check_rounds,
    {
        "handle_question": "handle_question",
        "handle_selection": "handle_selection"
    }
)

workflow.set_entry_point("handle_question")
workflow.add_edge('handle_question', "handle_response")
workflow.add_edge('handle_selection', END)

Let’s understand the different conditional edges:

check_conv_length: Ends the current round once the interviewer has asked the threshold number of questions (total_questions_per_round, say 3 or 5).

check_rounds: Ends the interview process once all the rounds are done.
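Since the edge functions are plain functions too, you can sanity-check the routing by hand (illustrative):

#Checking the routing logic directly
print(check_conv_length({'total_questions': 1, 'total_questions_per_round': 3}))  # handle_question
print(check_conv_length({'total_questions': 3, 'total_questions_per_round': 3}))  # handle_result
print(check_rounds({'round': 5, 'total_rounds': 5}))                              # handle_selection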

6. Compile the graph object

app = workflow.compile()
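Depending on your langchain-core/LangGraph versions, you may also be able to print an ASCII view of the compiled graph (this typically needs the grandalf package installed; treat it as an optional extra):

#Optional: visualize the compiled graph (may require: pip install grandalf)
app.get_graph().print_ascii()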

7. Decide the panel for the interview.

#Determining panel

interview_role = 'data scientist'
interview = "Suggest a interview panel to interview a {}. Suggest just the roles as comma separated list and nothing else. Dont use new line character".format(interview_role)
output_parser = CommaSeparatedListOutputParser()
classes = output_parser.parse(llm(interview))
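For the run shown later in this post, the model suggested a five-member panel, so classes parsed to something like:

#Panel parsed from the LLM's comma-separated output (from the run below)
['Hiring manager', 'Data science manager', 'Senior data scientist', 'Data engineer', 'Software engineer']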

8. Invoke the graph object and initiate the interview.

conversation = app.invoke(
    {'total_questions': 0,
     'candidate': interview_role,
     'total_rounds': len(classes),
     'total_questions_per_round': 3,
     'round': 0,
     'result': '',
     'all_history': '',
     'interviewer': classes[0],
     'panel': classes,
     'history': 'Nothing'},
    {"recursion_limit": 1000})

We start the execution with initial values for the StateGraph variables. The recursion_limit parameter raises LangGraph’s recursion cap, which is required for longer conversations like this one.
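If you would rather watch the interview unfold than wait for the final state, the compiled graph can also be streamed; a minimal sketch using the same inputs as above:

#Streaming node outputs as the interview progresses (sketch)
inputs = {'total_questions': 0, 'candidate': interview_role, 'total_rounds': len(classes),
          'total_questions_per_round': 3, 'round': 0, 'result': '', 'all_history': '',
          'interviewer': classes[0], 'panel': classes, 'history': 'Nothing'}
for step in app.stream(inputs, {"recursion_limit": 1000}):
    node, output = next(iter(step.items()))   # each step is {node_name: state update}
    print(node, '->', (output.get('current_question') or output.get('current_answer') or '')[:80])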

9. Check out the final decision.

print(conversation['final_result'])

Yes. The candidate has consistently received high ratings from all interviewers across various domains of data science, including data modeling, machine learning, NLP, Agile methodologies, data preprocessing, and software engineering. The candidate demonstrates a strong understanding of the concepts and techniques used in each area, coupled with practical experience in applying them to real-world problems. The candidate's willingness to address their weaknesses and continuously improve further strengthens their suitability for the role.

10. The entire interview process.

print(conversation['all_history'])
INTERVIEWED BY Hiring manager 
Hiring manager:What are your experiences with data modeling and machine learning algorithms?
data scientist:In my previous role, I was responsible for developing and implementing data models and machine learning algorithms to solve a variety of business problems. I have experience with a wide range of data modeling techniques, including supervised and unsupervised learning, and have successfully applied these techniques to problems in areas such as customer segmentation, churn prediction, and fraud detection. I am also familiar with a variety of machine learning algorithms, including linear and logistic regression, decision trees, and neural networks, and have used these algorithms to build predictive models that have improved business outcomes.
Hiring manager:How do you approach the challenge of working with large and complex datasets?
data scientist:To approach the challenge of working with large and complex datasets, I follow a structured methodology:
1. **Data Exploration and Understanding:** I begin by exploring the dataset to grasp its structure, variable types, and potential relationships. This helps me identify data quality issues, outliers, and missing values.
2. **Data Cleaning and Preprocessing:** I then clean and preprocess the data to address any inconsistencies or errors. This involves handling missing values, standardizing formats, and removing duplicate or irrelevant data.
3. **Feature Engineering:** To extract meaningful insights, I perform feature engineering to create new features or transform existing ones. This step enhances the predictive power of the data and improves model performance.
4. **Model Selection and Training:** Based on the problem at hand, I select appropriate machine learning or statistical models. I train these models on the preprocessed data, optimizing hyperparameters to achieve optimal performance.
5. **Model Evaluation and Validation:** I thoroughly evaluate the trained models using metrics relevant to the business objective. This involves cross-validation, testing on unseen data, and assessing model stability and robustness.
6. **Data Visualization and Communication:** I present my findings through clear and concise data visualizations. I effectively communicate the insights derived from the analysis to stakeholders, ensuring that they can make informed decisions based on the data.
Hiring manager:Can you provide an example of a successful data modeling or machine learning project that you led?
data scientist:In my previous role, I led a project to develop a predictive model for customer churn. We utilized a combination of supervised machine learning techniques, including logistic regression and decision trees, to identify key factors influencing customer attrition. The model was implemented in a production environment and resulted in a 15% reduction in churn rate, leading to significant cost savings for the organization. Additionally, we employed feature engineering techniques to optimize model performance, enhancing its accuracy and interpretability.

INTERVIEWED BY Data science manager
Data science manager:What is your experience with natural language processing (NLP) techniques?
data scientist:In my previous role, I led a team responsible for developing and implementing NLP solutions for a major e-commerce platform. My responsibilities included:
- Designing and building NLP models for tasks such as text classification, named entity recognition, and machine translation
- Developing and maintaining NLP pipelines for data preprocessing, feature engineering, and model training
- Collaborating with engineers and product managers to integrate NLP solutions into the platform's core functionality
- Monitoring and evaluating NLP models to ensure optimal performance and identify areas for improvement
Through these experiences, I have gained a deep understanding of NLP techniques, including:
- Supervised and unsupervised learning algorithms for NLP
- Feature engineering techniques for text data
- Evaluation metrics for NLP models
- Cloud-based platforms and tools for NLP development
I am also familiar with the latest advancements in NLP, including the use of deep learning and transfer learning techniques. I am confident that my skills and experience in NLP would be a valuable asset to your team.
Data science manager:What are your strengths and weaknesses as a data scientist?
data scientist:**Strengths:**
* Strong analytical and problem-solving skills, with a deep understanding of statistical modeling and machine learning techniques.
* Proficient in data manipulation, cleaning, and visualization using tools like Python, R, and SQL.
* Excellent communication and presentation skills, capable of translating complex technical concepts to stakeholders.
* Proven ability to work independently and as part of a team, contributing to the successful completion of data science projects.
* Passionate about data-driven decision-making and leveraging data to solve business problems.
**Weaknesses:**
* Limited experience in cloud-based data processing platforms like AWS and Azure.
* Still developing expertise in deep learning and natural language processing.
* Sometimes prone to perfectionism, which can lead to extended project timelines.
Data science manager:What are your career goals and how do you see this role aligning with them?
data scientist:As an experienced data scientist with a passion for leveraging data to drive business outcomes, I am drawn to this Data Science Manager position as a pivotal step in my career trajectory. My ultimate goal is to lead a team of data scientists and analysts in developing innovative solutions that empower organizations to make data-driven decisions. This role aligns perfectly with my aspirations, providing me with the opportunity to manage a team, contribute to strategic decision-making, and shape the future of data science within your organization.

INTERVIEWED BY Senior data scientist
Senior data scientist:What are your experiences with working in an Agile environment?
data scientist:I have extensive experience working in an Agile environment, utilizing Scrum and Kanban methodologies to deliver value-driven solutions. I am proficient in sprint planning, backlog management, and retrospectives, ensuring effective collaboration and continuous improvement within cross-functional teams. I have a deep understanding of Agile principles and practices, including user-centered design, iterative development, and test-driven development. This experience has enabled me to successfully contribute to Agile projects, consistently meeting deadlines and exceeding stakeholder expectations.
Senior data scientist:How have you applied Agile methodologies to complex data science projects?
data scientist:In my experience, applying Agile methodologies to complex data science projects has proven transformative. By embracing an iterative and incremental approach, I break down projects into smaller, manageable sprints. This allows us to gather feedback early and adjust our course as needed, ensuring timely delivery of value. Regular sprint retrospectives foster continuous improvement, empowering the team to identify areas for optimization and enhance collaboration. By leveraging tools like Scrum and Kanban, we maintain transparency and accountability, keeping stakeholders informed and aligned throughout the project lifecycle. This Agile mindset has enabled us to navigate complex data science challenges effectively, delivering high-quality solutions that meet evolving business needs.
Senior data scientist:Can you describe a specific example of how you used Agile methodologies to solve a challenging data science problem?
data scientist:In my previous role, we encountered a complex data science problem involving predicting customer churn. We adopted Agile methodologies to tackle this challenge effectively. We broke the project into smaller sprints, focusing on specific aspects of the problem. This allowed us to iterate quickly, gather feedback, and make adjustments as needed. By leveraging daily stand-up meetings and sprint retrospectives, we fostered collaboration, identified bottlenecks, and continuously improved our approach. Through this iterative process, we developed a robust churn prediction model that significantly reduced customer attrition and improved business outcomes.

INTERVIEWED BY Data engineer 
Data engineer:What is your experience with data preprocessing techniques?
data scientist:During my previous role as a data scientist at [Company Name], I was extensively involved in data preprocessing to prepare raw data for analysis and modeling. I possess proficiency in a range of data preprocessing techniques, including data cleaning, data transformation, feature engineering, and data reduction.
In particular, I am highly skilled in handling missing data, dealing with outliers, and performing data normalization and standardization. I have successfully applied these techniques to various datasets, including customer data, financial data, and sensor data.
Additionally, I have experience in using machine learning algorithms for data preprocessing tasks, such as anomaly detection and feature selection. I am also familiar with big data tools and technologies, such as Hadoop and Spark, for handling large-scale data preprocessing.
Data engineer:What are your strengths and weaknesses as a data scientist?
data scientist:As a data scientist, my strengths lie in my analytical mindset, coupled with a strong foundation in statistics and machine learning techniques. I am adept at extracting meaningful insights from complex datasets, utilizing both structured and unstructured data. Furthermore, my proficiency in programming languages such as Python and R enables me to efficiently implement data science solutions. However, I recognize that my weakness lies in the area of cloud computing technologies. While I have a foundational understanding of concepts such as AWS and Azure, I am eager to expand my knowledge and gain hands-on experience in these platforms.
Data engineer:What are your experiences working with different stakeholders?
data scientist:In my previous role, I collaborated extensively with various stakeholders across the organization, including business analysts, product managers, and executives. I actively participated in requirement gathering sessions, translating business needs into technical specifications. I also provided regular updates on project progress and data analysis insights, ensuring alignment with business objectives. Additionally, I worked closely with IT teams to ensure data pipelines met performance and security requirements. Through effective communication and a deep understanding of stakeholder perspectives, I successfully facilitated data-driven decision-making and fostered a collaborative work environment.

INTERVIEWED BY Software engineer
Software engineer:What are your experiences with working with large datasets?
data scientist:Throughout my career, I have consistently worked with large datasets, ranging from hundreds of gigabytes to several terabytes in size. In my previous role as a Data Scientist at ABC Company, I was responsible for managing and analyzing a dataset of over 100 million customer records. I utilized a combination of Hadoop, Spark, and SQL to process and analyze this data, identifying key trends and patterns that informed business decisions. Additionally, I have experience working with large-scale datasets in the healthcare industry, where I developed machine learning models to predict patient outcomes using a dataset of over 1 million medical records.
Software engineer:What specific techniques do you employ to handle data quality issues, such as missing values, outliers, and data inconsistencies?
data scientist:To handle missing values, I utilize imputation techniques like mean, median, or mode, considering the data distribution and variable characteristics. For outliers, I employ statistical approaches such as winsorization or capping to mitigate their impact while preserving valuable information. In cases of data inconsistencies, I perform data validation and cleansing, leveraging domain knowledge and data profiling tools to identify and correct errors. Additionally, I implement data quality checks and monitoring mechanisms to ensure ongoing data integrity and consistency.
Software engineer:How do you approach feature engineering and selection for complex datasets with a large number of features?
data scientist:When approaching feature engineering and selection for complex datasets with a large number of features, I employ a systematic and iterative process:
1. **Data Exploration and Understanding:** I begin by exploring the data to gain insights into its distribution, relationships, and patterns. This helps me identify potential features and their relevance to the target variable.
2. **Feature Generation:** I generate new features by transforming, combining, or creating new ones from the existing features. This process aims to extract meaningful information and reduce dimensionality.
3. **Feature Selection:** I employ various techniques such as correlation analysis, information gain, and recursive feature elimination to select the most relevant and informative features. This helps eliminate redundant or irrelevant features and improves model performance.
4. **Feature Scaling and Normalization:** I scale and normalize the selected features to ensure they are on the same scale and have a similar distribution. This enhances model stability and comparability.
5. **Model Training and Evaluation:** I train and evaluate multiple models using different combinations of features to determine the optimal feature set. I use metrics such as accuracy, precision, recall, and area under the curve (AUC) to assess model performance.
6. **Iterative Refinement:** Based on the evaluation results, I refine the feature set by adding or removing features and re-train the model. This iterative process continues until I achieve the desired model performance.

11. Round-wise ratings.

print(conversation['result'])
Interviewed by Hiring manager
**Rating: 9/10**
The candidate demonstrates a strong understanding of data modeling and machine learning algorithms. They provide detailed explanations of their approach to working with large and complex datasets, including data exploration, cleaning, feature engineering, model selection, evaluation, and visualization. The candidate also provides a specific example of a successful data modeling project they led, showcasing their ability to apply their skills to real-world problems and achieve positive business outcomes. Overall, the candidate's responses indicate a high level of competence and experience in data science.

Interviewed by Data science manager
**Rating: 9/10**
The candidate has a strong understanding of NLP techniques and has applied them successfully in a previous role. They are also familiar with the latest advancements in NLP. The candidate's strengths as a data scientist include strong analytical and problem-solving skills, proficiency in data manipulation and visualization, excellent communication and presentation skills, and proven ability to work independently and as part of a team. The candidate's weaknesses include limited experience in cloud-based data processing platforms, still developing expertise in deep learning and natural language processing, and sometimes prone to perfectionism. However, the candidate is aware of these weaknesses and is taking steps to address them. Overall, the candidate is a strong candidate for the Data Science Manager position and has the potential to be a valuable asset to the team.

Interviewed by Senior data scientist
**Rating: 9/10**
The candidate demonstrates a strong understanding of Agile methodologies and their application to data science projects. They provide clear and detailed explanations of their experiences, highlighting their proficiency in Scrum and Kanban methodologies and their deep understanding of Agile principles. The candidate also provides a concrete example of how they successfully applied Agile methodologies to solve a complex data science problem, showcasing their ability to break down projects, iterate quickly, and continuously improve. Overall, the candidate's responses indicate a high level of competence and experience in working in an Agile environment.

Interviewed by Data engineer
Rating: 9/10
The candidate demonstrates a strong understanding of data preprocessing techniques and has applied them successfully in various projects. They also possess a solid foundation in statistics and machine learning, as well as proficiency in programming languages. However, they acknowledge their weakness in cloud computing technologies and express a willingness to learn and improve in this area. The candidate's experience in working with different stakeholders, including business analysts, product managers, and executives, showcases their ability to collaborate effectively and translate business needs into technical specifications.

Interviewed by Software engineer
Rating: 9/10
The candidate demonstrates a strong understanding of working with large datasets, handling data quality issues, and feature engineering and selection for complex datasets. They provide specific examples of techniques and methodologies used, indicating a practical and hands-on approach. The candidate also emphasizes the importance of data exploration, iterative refinement, and model evaluation, showcasing a comprehensive and data-driven approach to data science. Overall, the candidate's responses are well-structured, informative, and reflect a high level of expertise in the field.

So with this, we are able to build an end-to-end AI interview system using LangGraph. Many modifications could make the system even more useful, such as:

Add the interviewee’s resume for context

Make the feedback sequential and stop the interview if the interviewee fails any round (see the sketch after this list)

Have interviewers deep-dive into a topic rather than asking independent questions

…, etc.
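As a sketch of the second idea, the check_rounds edge could ask the LLM whether the latest feedback is a clear fail and jump straight to the verdict; this is illustrative only (the pass/fail prompt is my own assumption, not from the code above):

#Sketch: end the interview early on a clearly failed round (illustrative)
def check_rounds_with_early_exit(state):
    verdict = llm("Based on the feedback below, did the candidate clearly fail "
                  "the most recent round? Answer just Yes or No.\nFeedback:\n{}".format(state.get('result')))
    if 'yes' in verdict.lower():
        return "handle_selection"   # skip the remaining rounds
    return "handle_question" if state.get("round") < state.get("total_rounds") else "handle_selection"
#Wire this in place of check_rounds when adding the conditional edge for handle_result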

If you don’t want such a complex system, even a two-agent interview system without a panel can be built using the tutorial below.

With this, it’s a wrap. See you soon with something more interesting on GenAI!
