ChatGPT/Gemini: A Friend or Foe to Manual QA Engineers?

Hari Janapareddy
Published in Byborg Engineering
Mar 1, 2024 · 13 min read

Introduction:

In the rapidly evolving landscape of software testing, the integration of artificial intelligence (AI) tools has sparked both curiosity and concern among QA professionals. Among the tools that have garnered attention are ChatGPT and Gemini, advanced large language models (LLMs). We embarked on a journey to explore whether these advanced LLMs are friends or foes to manual QA engineers. In this article, we will delve into various topics, such as "Does ChatGPT/Gemini have the potential to replace manual QA?", the pros and cons of leveraging AI models in software testing, and some basics of prompt engineering. Let's not wait any longer; it's time to dive into the heart of the matter and uncover the true impact of ChatGPT/Gemini on the world of manual QA testing.

The Rise of AI in Software Testing:

As technology continues to advance, the role of AI in software testing has become increasingly prominent. AI-powered tools offer the promise of automating repetitive tasks, enhancing efficiency, and uncovering complex issues that may elude manual testing. However, alongside the potential benefits, there are concerns about AI’s impact on manual QA engineers and whether it poses a threat to traditional testing methodologies. First, let’s tackle a question at the forefront of industry conversations, sparking curiosity and debate alike: “Does the advent of ChatGPT/Gemini and similar large language models signal the end for manual QA, or is it merely the beginning of a new chapter?”

ChatGPT/Gemini Arrival: A Threat to Manual QA or Simply Uncharted Territory?

The emergence of AI-powered tools has prompted speculation about the future role of manual QA engineers in software testing. While ChatGPT or Gemini offers significant capabilities in automating certain aspects of the testing process, it is unlikely to replace manual QA in the foreseeable future. Here’s why:

1. Complexity of Testing Scenarios: Manual QA engineers possess critical thinking skills and domain expertise that are essential for navigating complex testing scenarios. While AI models can automate routine tasks and generate test cases based on predefined criteria, they may struggle to handle nuanced or ambiguous situations that require human judgment and intuition.

2. Adaptability to Changing Requirements: Software development is inherently dynamic, with requirements evolving rapidly in response to user feedback, market trends, and technological advancements. Manual QA engineers are adept at adapting to these changes, adjusting their testing strategies and priorities accordingly. AI models, on the other hand, rely on pre-defined rules and data patterns, which may limit their ability to adapt to evolving requirements and testing contexts.

3. Creativity and Exploration: Manual QA engineers play a crucial role in exploring edge cases, uncovering hidden defects, and pushing the boundaries of software functionality. Their creativity and curiosity enable them to identify potential risks and vulnerabilities that automated tools like ChatGPT may overlook. While AI models can assist in generating test scenarios, they may lack the intuition and ingenuity required to explore novel testing scenarios effectively.

4. Human-Centric Testing: Software testing goes beyond technical validation; it also encompasses user experience, accessibility, and usability testing. Manual QA engineers bring a human-centric perspective to testing, empathizing with end-users and advocating for their needs. While AI models can analyze data and generate test cases, they may struggle to assess subjective factors such as user satisfaction, emotional resonance, and cultural relevance.

5. Collaboration and Communication: Effective software testing requires collaboration and communication among diverse stakeholders, including developers, product managers, designers, and QA engineers. Manual QA engineers serve as liaisons between these stakeholders, facilitating dialogue, clarifying requirements, and aligning priorities. While AI models can automate documentation and reporting tasks, they may lack the interpersonal skills and contextual understanding required for effective collaboration.

6. Continuous Learning and Improvement: The landscape of technology and user expectations is continually evolving, thus requiring QA Engineers to constantly update their knowledge and skills. Manual QA engineers can learn from each testing cycle, gaining insights that can be applied to future projects. This continuous learning loop enhances their ability to anticipate problems, innovate testing methodologies, and refine their approach based on new trends, technologies, and user feedback. While AI models can learn from new data, the scope of their learning is confined to the data they are trained on and may not capture the full spectrum of technological advancements and user expectations.

7. Crisis Management and Rapid Response: In situations where critical bugs or security vulnerabilities are identified, especially close to product launch dates, manual QA engineers can prioritize issues effectively, strategize rapid responses, and work closely with development teams to implement fixes. Their ability to understand the broader impact of these issues on the product and the business is invaluable. While AI tools can assist in identifying problems, the human insight provided by QA engineers is crucial for managing crises, making informed decisions under pressure, and mitigating risks in a timely manner.

8. Nurturing a Quality Culture: Manual QA engineers contribute to fostering a culture of quality within organizations. They advocate for best practices, quality standards, and continuous improvement, influencing not just the testing process but also the development lifecycle and organizational mindset towards quality. Their role extends beyond identifying bugs to encompass mentoring, knowledge sharing, and promoting a holistic approach to quality. This cultural aspect of quality assurance is something that automated tools cannot replicate, as it involves human values, attitudes, and behaviors.

9. Test Execution Capabilities: AI models, while highly sophisticated in natural language processing and generation, lack the ability to interact with software interfaces the way a human does. Manual test execution typically requires a QA engineer to interact with a software application's user interface, observe behaviors, compare expected outcomes against actual outcomes, and use judgment to determine whether a feature is working correctly. This includes clicking buttons, entering data, handling unexpected behaviors, and intuitively understanding the nuances of human-computer interaction. AI models do not have the capability to manipulate a mouse or keyboard, or to interpret visual interfaces, which are critical components of manual testing. They operate within a text-based environment and cannot perceive graphical user interfaces or execute actions within them. Moreover, manual testing often relies on the tester's experience, intuition, and tacit knowledge, which are inherently human traits that AI models do not possess.

In conclusion, while AI models hold promise in augmenting certain aspects of the testing process, they are unlikely to replace manual QA engineers. Instead, the future of software testing lies in synergistic collaboration between human expertise and AI-powered automation. By leveraging AI models as a valuable tool in their toolkit, manual QA engineers can enhance their efficiency, expand their capabilities, and focus on higher-level testing activities that require human judgment and creativity.

Embracing AI models: The Pros

It's virtually impossible that AI models could take over the entire role of manual QA in software testing. However, they have great potential to support manual QA engineers significantly by optimizing and reducing the time spent on repetitive, time-consuming QA processes. Let's discuss some of the capabilities that AI models possess to help manual QA engineers.

  1. Natural Language Understanding: AI models possess a remarkable ability to comprehend and generate human-like text, making them adept at understanding the nuances of test cases, requirements, and user stories. This natural language understanding capability enables AI models to interpret complex instructions, extract pertinent information, and formulate test scenarios with remarkable accuracy. For manual QA engineers accustomed to the tedious task of documenting processes by hand, including requirement analysis and test case creation, AI models serve as a valuable tool. By automating the interpretation and synthesis of testing requirements, AI models streamline the test planning phase, reducing the workload for QA engineers.
  2. Automated Test Case Generation: One of the most compelling advantages of AI models is their capacity to automate the generation of test cases based on given specifications and requirements. In traditional test case creation, QA engineers typically invest considerable effort into manually drafting test scenarios based on their interpretation of system functionalities and user interactions. However, with AI models, this process is accelerated and enhanced. The model autonomously generates comprehensive test cases by analyzing input data and deriving potential scenarios. This automation not only accelerates the test case creation process but also ensures consistency and thoroughness in test coverage, thereby enhancing the overall quality of testing outcomes.
  3. Enhanced Test Coverage: Comprehensive test coverage is essential for ensuring the robustness and reliability of software applications. However, manual QA engineers may inadvertently overlook certain edge cases or exceptional scenarios, leading to gaps in test coverage and potential vulnerabilities in the system. AI models serve as a valuable ally in addressing this challenge by assisting in the identification of edge cases, corner scenarios, and potential pitfalls that may evade manual detection. Through their sophisticated language processing capabilities, AI models can analyze system specifications, user requirements, and historical testing data to identify areas of potential risk and formulate targeted test scenarios. By augmenting manual testing efforts with AI-powered analysis, QA teams can achieve enhanced test coverage and mitigate the risk of undiscovered defects in the software.
  4. Efficient Documentation: Documentation is a critical aspect of the software testing process, providing stakeholders with insights into test objectives, methodologies, results, and actionable insights. However, manual documentation processes can be time-consuming and prone to errors, especially in environments with rapidly changing requirements and tight deadlines. AI models offer a solution to this challenge by facilitating efficient and accurate documentation of test results, insights, and observations. Leveraging their natural language generation capabilities, AI models can autonomously generate detailed test reports, summaries, and documentation artifacts based on testing outcomes and observations. This automation not only expedites the documentation process but also ensures consistency, clarity, and completeness in reporting, thereby enabling effective communication and decision-making within QA teams and with other stakeholders.
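As a small illustration of automating documentation, the sketch below (our own example, not tied to any AI API; the function and field names are illustrative) turns raw pass/fail outcomes into a markdown summary that could be shipped directly, or handed to an AI model as structured input for a fuller narrative report:

```python
def summarize_results(results):
    """Render a short markdown test summary from (test_name, passed) pairs.

    Illustrative only: in practice the raw outcomes might be fed to an AI
    model to draft the narrative report; this shows the structured input
    such a prompt would carry.
    """
    passed = [name for name, ok in results if ok]
    failed = [name for name, ok in results if not ok]
    lines = [
        "# Test Run Summary",
        f"Total: {len(results)}, Passed: {len(passed)}, Failed: {len(failed)}",
    ]
    if failed:
        lines.append("## Failed tests")
        lines += [f"- {name}" for name in failed]
    return "\n".join(lines)

print(summarize_results([("login", True), ("payment", False)]))
```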

In summary, AI models represent a transformative tool for manual QA engineers, offering unparalleled capabilities in natural language understanding, automated test case generation, enhanced test coverage, and efficient documentation. By harnessing the power of AI models, QA teams can streamline testing processes, improve test coverage, and elevate the overall quality of software products, thereby delivering value to teams and stakeholders.

Navigating the Challenges: The Cons

While AI models certainly offer significant benefits to manual QA engineers, it's essential to acknowledge and address the potential challenges and limitations associated with their usage. Let's explore some of the key cons:

  1. Bias and Error Propagation: ChatGPT or Gemini, like any AI model, is susceptible to biases present in the data it was trained on. These biases can manifest as skewed interpretations, erroneous recommendations, or unintended preferences in generated test cases. For example, if the training data contains imbalanced representations of certain scenarios or demographics, an AI model may inadvertently propagate these biases in its outputs, leading to skewed test coverage or inaccurate recommendations. QA engineers must remain vigilant in identifying and mitigating potential biases in AI-generated test cases to ensure fair, comprehensive, and unbiased testing outcomes.
  2. Complexity and Interpretability: Understanding the inner workings of AI models and interpreting their outputs can be challenging, particularly for manual QA engineers who may not have a background in AI or natural language processing. AI models operate as black boxes, meaning that their decision-making processes are opaque and difficult to interpret or scrutinize. This lack of transparency can pose challenges in understanding how AI models arrive at their recommendations, evaluating the validity of their outputs, and diagnosing potential errors or biases. To leverage AI models effectively, QA engineers may require training and upskilling in AI fundamentals, interpretability techniques, and model evaluation methodologies to enhance their understanding of, and confidence in, utilizing AI models for testing purposes.
  3. Integration and Adaptation: Integrating AI models into existing testing workflows and tools may present technical and logistical challenges. QA teams must assess the compatibility of AI models with their current infrastructure, tools, and processes, and invest in appropriate integration strategies to seamlessly incorporate AI models into their testing workflows. Additionally, adapting to the use of AI models may require changes in established testing methodologies, workflows, and team dynamics, which may encounter resistance or skepticism from stakeholders accustomed to traditional testing practices. QA leaders must proactively address these challenges by advocating for the value of AI models, providing training and support to QA teams, and fostering a culture of experimentation and continuous improvement.
  4. Quality Assurance and Validation: While AI models can assist in generating test cases and documentation, it's essential to ensure the quality and validity of their outputs. QA engineers must exercise critical judgment to verify the accuracy, relevance, and completeness of AI-generated test cases and documentation artifacts. This validation process may involve manual review, peer collaboration, and cross-referencing with domain knowledge and requirements to ensure that the AI models' outputs align with testing objectives and stakeholder expectations. Additionally, QA teams must establish robust quality assurance processes and guidelines for utilizing AI models to mitigate the risk of erroneous or misleading recommendations and maintain the integrity of testing outcomes.

In summary, while ChatGPT and Gemini offer significant potential to augment manual testing efforts, it's imperative to recognize and address challenges related to biases, complexity, integration, and quality assurance. By proactively addressing these cons and adopting best practices for leveraging AI models effectively, QA teams can harness their capabilities to enhance testing efficiency, improve test coverage, and deliver high-quality software products.

Leveraging Testing Techniques with Prompt Engineering

  • Prompt engineering essentially involves crafting and fine-tuning inputs (prompts) to steer AI models towards producing specific outputs or answers.
  • By blending the principles of prompt engineering with classic testing methods, we’ve created a “Structured approach” to prompts that significantly enhances output efficiency. This approach is tailored to enable QA engineers to more effectively utilize the trained model’s strengths, seamlessly incorporating these into standard testing practices.

Guidance for Crafting Effective Prompts:

In the realm of interacting with artificial intelligence, the art of crafting effective prompts stands as a cornerstone for achieving meaningful and precise outcomes. As AI models, particularly in natural language processing, become more intricate and capable, the ability to guide these models through well-structured prompts has emerged as a critical skill. We tried to shed some light on the nuances of prompt engineering — a field that combines the precision of language with the analytical depth of technology. Let us delve into the methodologies behind constructing effective prompts and the various parameters of a prompt.

  • Contextual Background: Begin your prompt with a detailed overview of your system to give the AI a clear understanding, as it lacks inherent knowledge about your specific scenario. Providing thorough background details is crucial for optimizing the AI’s responses; the more context you provide, the more accurately the AI can customize its output. For instance, ensure your prompt encompasses all pertinent information that might affect the AI’s comprehension and the quality of its responses.
  • Role Specification: Explicitly specify the perspective you're requesting a response from. Whether you seek insights from a Senior Business Analyst, Senior QA Engineer, or another professional role, it significantly shapes the response's character. Each role carries its own set of expertise and viewpoints, which will be reflected in the responses you receive. Try applying the same prompt across multiple roles to see how the responses vary and to extract valuable insights from each distinct viewpoint. You would be surprised by the results!
  • Desired Output Format: Articulate the specific format in which you expect the AI to deliver its response. Whether it’s a list, a detailed explanation, or a structured report, the defined format helps the AI to structure its responses accordingly and meet your expectations.
  • Requirements and Scenarios: Elaborate on the particular flows, use cases, or scenarios you need. The inclusion of detailed requirements, such as positive and negative cases, directs the AI’s output generation process, leading to more accurate and useful responses.
  • Technique Specification: Indicate any specific techniques or methodologies the AI should utilize or consider in its response, such as Boundary Value Analysis or Equivalence Partitioning. The AI’s extensive knowledge base includes a variety of techniques, and specifying one can steer the generation process towards outputs that align with those methods.
  • Purpose of Inquiry: Explain the intention behind your question or the problem you’re aiming to solve. Understanding the purpose can help the AI prioritize information and provide a response that is more actionable and relevant.
  • Constraints and Limitations: Mention any constraints or limitations that should be considered. This can include technical boundaries, regulatory requirements, or resource limitations, which can significantly impact the direction and applicability of the AI’s output.
  • Target Audience: Identify the audience for whom the output is intended. Tailoring the language, technical depth, and complexity to the appropriate audience ensures that the AI’s output is understandable and meets the needs of its intended users.
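To make the structure concrete, the elements above can be combined into a reusable prompt template. Below is a minimal Python sketch; the function name, parameter names, and section labels are our own illustrative choices, not part of any library or standard:

```python
def build_prompt(context, role, output_format, requirements,
                 techniques=None, purpose=None, constraints=None, audience=None):
    """Assemble a structured prompt from the elements discussed above.

    The first four elements are required; the rest are optional
    refinements. All names here are illustrative, not a standard API.
    """
    sections = [
        f"Context: {context}",
        f"Act as a {role}.",
        f"Desired output format: {output_format}",
        f"Requirements: {requirements}",
    ]
    if techniques:
        sections.append("Apply these techniques: " + ", ".join(techniques))
    if purpose:
        sections.append(f"Purpose: {purpose}")
    if constraints:
        sections.append(f"Constraints: {constraints}")
    if audience:
        sections.append(f"Target audience: {audience}")
    return "\n".join(sections)

prompt = build_prompt(
    context="Online digital payment solution providing digital transactions.",
    role="Senior QA Engineer",
    output_format="Gherkin acceptance scenarios with a copy-code option",
    requirements="Positive, negative, and edge cases for the payment flow",
    techniques=["Equivalence Partitioning", "Boundary Value Analysis"],
)
print(prompt)
```

The same template can then be re-run with a different `role` (e.g. Senior Business Analyst) to compare viewpoints, as suggested under Role Specification.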

These critical elements within the prompt are pivotal for enhancing dialogues with AI models to guarantee the delivery of productive and significant outcomes. Follow along as we explore a demonstrative prompt aimed at producing Acceptance Tests in BDD format, starting from specified requirements.

Example Prompt: “We are working on an online digital payment solution which provides services to the user like digital transactions. The goal of this requirement is to improve the user experience of digital payments. Act as a Senior Business Analyst/Senior QA Engineer and generate Gherkin acceptance scenarios from these functional requirements, including positive cases, negative cases, and edge cases, covering different testing techniques such as equivalence partitioning, boundary value analysis, decision table, state transition, and pairwise testing approaches, with a copy-code option.”

  • Input the requirements as plain text, ideally one requirement at a time.
  • Review the validity of the generated scenarios based on the output.

Follow-up Prompt: “Any other positive/negative/edge cases?”

  • Repeat this 4–5 times until you see all the cases.
  • Use “Copy code” to copy the valid scenarios.

Explaining the Follow-up Prompt structure:

  • AI models/LLMs will generate outputs based on the initial inputs. It's always crucial to ask the LLM to keep generating new cases until you feel that the requirements are covered from all sides. Remember that exhaustive testing is not possible, and an LLM can generate hundreds of cases in just a few minutes, so always take the scenarios that are relevant and important from the business side, considering parameters like project deadlines, test execution timelines, priority of the use case, etc.
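The follow-up workflow above can be sketched as a simple loop. Here `ask_llm` is a hypothetical stand-in for whatever model client you use (not a real API); the loop keeps requesting new cases and stops once a round adds nothing new or a round cap is reached:

```python
def collect_scenarios(ask_llm, initial_prompt,
                      follow_up="Any other positive/negative/edge cases?",
                      max_rounds=5):
    """Iteratively prompt a model until no new scenarios appear.

    `ask_llm` is a placeholder callable: it takes a prompt string and
    returns a list of scenario strings. Scenarios already seen in earlier
    rounds are ignored, mirroring the manual dedup/review step above.
    """
    seen = []
    for round_no in range(max_rounds):
        prompt = initial_prompt if round_no == 0 else follow_up
        new = [s for s in ask_llm(prompt) if s not in seen]
        if not new:  # the model has stopped producing fresh cases
            break
        seen.extend(new)
    return seen
```

A human review step still belongs after this loop: only the scenarios that are relevant and important to the business should be kept.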

Conclusion:

In the exploration of AI models such as ChatGPT and Gemini, and their relationship with manual QA engineering, a nuanced understanding emerges. These advanced tools are neither outright adversaries nor complete substitutes for the intricate work of manual QA engineers. Instead, they represent powerful allies in the quest for quality and efficiency in software testing. By automating repetitive tasks and providing insightful analyses, AI models can significantly augment the capabilities of QA professionals, allowing them to focus on more complex, value-added activities that require human insight and creativity. The future of QA engineering, therefore, seems not to be overshadowed by these technologies but enhanced by them, opening up new avenues for collaboration, innovation, and quality assurance. As we continue to navigate this evolving landscape, it becomes clear that the integration of AI into manual QA processes signifies not an end but a transformative shift towards a more dynamic, efficient, and effective testing paradigm.

Note: This article is intended to spark discussion and exploration within the QA community. We encourage readers to share their insights, experiences, and perspectives on leveraging ChatGPT in software testing.

Penned by:

  1. Hari Janapareddy (QA Lead)
  2. Tibor Farkas (QA Lead)

Thank you Gary Kovács for your writing advice and editing skills.

Legend:

QA — Quality Assurance

AI — Artificial Intelligence

ChatGPT — Chat Generative Pre-trained Transformer

LLM — Large Language Model

AI models — Artificial intelligence models like ChatGPT, Gemini
