What is Possible with Natural Language Processing (NLP) in Software Testing?

Cemre Çınar
Orion Innovation techClub
6 min read · Jun 28, 2024

What is NLP?

Natural Language Processing (NLP) is a branch of AI that uses machine learning and deep learning to interpret what natural language expresses and relate it to a machine-readable representation [1]. NLP works in two directions: understanding and interpreting natural language, and generating natural language from machine-readable data. Translation applications, document processing, voice recognition, chatbots, and many other areas benefit from this technology.

Some NLP Techniques

Tokenization: It breaks a text into smaller parts for easier analysis. These parts are called tokens, and they can be sentences, phrases, words, or symbols [2].

Stemming: It removes affixes from words and reduces them to their root forms [2].

Lemmatization: Unlike stemming, lemmatization takes the context of the word into account and reduces it to a root form (lemma) that is itself a valid word [2].

Sentiment Analysis: It classifies the emotion expressed in a text, for example as positive or negative [2].

Named Entity Recognition: It identifies proper names in the text, such as the names of people and places [2].

Semantic Analysis: It analyzes the meaning of the text and whether its parts make sense in context [4].

Part-of-Speech (POS) Tagging: It labels each word with its grammatical category, such as noun, verb, or adjective [7].

Dependency Parsing: It analyzes the grammatical structure of a sentence by identifying which words depend on which others [8].
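
To make the first two techniques concrete, here is a minimal sketch of tokenization and stemming that uses only the Python standard library. The regex tokenizer and the suffix list are assumptions chosen for illustration; production code would normally rely on an NLP library with far richer rules.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens with a deliberately simple regex."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

# Hypothetical suffix list for a toy stemmer; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

sentence = "The user clicked the buttons and opened two reports."
tokens = tokenize(sentence)
stems = [naive_stem(t) for t in tokens]
print(tokens)
print(stems)
```

In this sketch, "clicked" and "reports" are reduced to "click" and "report", while short words such as "the" are left unchanged.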

What is the role of NLP in software testing?

Software testing is a process for ensuring that software products meet customer requirements and contain as few errors as possible. Its goal is a product that is as accurate and high in quality as possible. There are multiple types of tests, depending on the requirements of the project and the stage of the process, and they can be performed manually or through automation. To keep the testing process systematic and its results accurate, the “Software Testing Life Cycle (STLC)” approach is used. It consists of the following steps [3]:

1. Requirement Analysis: customer requirements are examined

2. Planning: the tests to be performed and what they require are planned

3. Test Case Development: test cases to be run are generated

4. Test Environment Setup: test environment suitable for the project is prepared

5. Test Execution: includes manual or automated execution of tests and reporting of failure results

6. Test Closure: the whole process is completed and reported

Figure 1: STLC Steps [11]

NLP helps us automate not only the execution of tests but also the entire process. The steps of the software testing process carry a heavy workload, and NLP can reduce the time and effort spent on them. In addition, using artificial intelligence in these processes can reduce human error. The use of NLP in software testing also supports agile methodology, as it contributes to quality, speed, and early defect detection [10].

How does NLP contribute to facilitating these steps?

1. Test Case Generation:

In this step, test cases are generated according to customer requirements. As the number of scenarios grows, this becomes difficult, and NLP can help. It can detect customer requirements expressed in natural language and write matching scenarios in the required format, which saves test engineers a lot of time. Test cases should be very detailed, and as many scenarios as possible should be generated to minimize errors. Using artificial intelligence for this work can catch scenarios that humans might overlook. NLP techniques that can be used in this process include tokenization, Part-of-Speech (POS) tagging, dependency parsing, Named Entity Recognition (NER), and semantic analysis [5].

For instance, qTest, a test tool produced by Tricentis, can detect customer requirements and generate various test cases [5]. Chunhui Wang et al. developed Use Case Modeling for System-level, Acceptance Tests Generation (UMTG), an approach aimed at embedded systems that generates acceptance test cases; it can accurately reproduce 96% of the constraints expressed in natural language [9].
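
As a rough illustration of turning a requirement into test cases, the sketch below uses a single regular expression in place of the NLP pipeline described above. It is not how qTest or UMTG work. The requirement template ("<actor> shall <action> when <condition>"), the `GeneratedCase` class, and the wording of the generated cases are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

@dataclass
class GeneratedCase:
    title: str
    precondition: str
    expected: str

# Assumed requirement template: "<actor> shall <action> when <condition>."
PATTERN = re.compile(
    r"^(?P<actor>.+?)\s+shall\s+(?P<action>.+?)\s+when\s+(?P<condition>.+?)\.?$",
    re.IGNORECASE,
)

def generate_test_cases(requirement: str) -> list[GeneratedCase]:
    """Return one positive and one negative case, or [] if the template does not match."""
    match = PATTERN.match(requirement.strip())
    if match is None:
        return []
    actor = match["actor"].strip()
    action = match["action"].strip()
    condition = match["condition"].strip()
    return [
        GeneratedCase(
            title=f"Positive: {condition}",
            precondition=condition,
            expected=f"{actor} must {action}",
        ),
        GeneratedCase(
            title=f"Negative: not ({condition})",
            precondition=f"it is not the case that {condition}",
            expected=f"{actor} must not {action}",
        ),
    ]

requirement = (
    "The system shall lock the account when the password "
    "is entered incorrectly three times."
)
for case in generate_test_cases(requirement):
    print(case.title, "->", case.expected)
```

Requirements that do not follow the assumed template yield an empty list, which is where techniques such as POS tagging and dependency parsing would be needed to handle freer wording.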

Many AI tools rely on NLP for their tasks. The screenshots below show an AI tool generating test cases using NLP:

Figure 2: Screenshot of ChatGPT Conversation [12]
Figure 3: Screenshot of ChatGPT Conversation [12]
Figure 4: Screenshot of ChatGPT Conversation [12]
Figure 5: Screenshot of ChatGPT Conversation [12]

2. Difference Analysis:

NLP techniques are not only used to prepare test cases. By comparing the prepared test cases with the customer requirements, NLP can also reveal gaps in the scenarios, using semantic similarity analysis [6]. We can use NLP to improve existing test cases against customer requirements, to analyze differences as requirements change, and to update the scenarios easily. When the test case analysis framework proposed by Viggiato et al. was evaluated on test cases written for the Prodigy Math game, it reached a precision of up to 88% [9].
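
The comparison between requirements and test cases can be sketched as follows. This example substitutes a simple bag-of-words cosine similarity for true semantic similarity, which in practice usually relies on embeddings, and it is not the framework of Viggiato et al. The stop-word list and the 0.3 threshold are assumptions picked for illustration.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list.
STOPWORDS = {"a", "an", "the", "that", "can", "by", "as", "with", "to", "of", "and", "is"}

def vectorize(text: str) -> Counter:
    """Count the non-stop-word tokens of a text."""
    return Counter(t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS)

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def uncovered_requirements(requirements, test_cases, threshold=0.3):
    """Return requirements whose best-matching test case scores below the threshold."""
    case_vectors = [vectorize(tc) for tc in test_cases]
    gaps = []
    for req in requirements:
        req_vec = vectorize(req)
        best = max((cosine_similarity(req_vec, cv) for cv in case_vectors), default=0.0)
        if best < threshold:
            gaps.append(req)
    return gaps

requirements = [
    "User can reset the password by email",
    "User can export the report as PDF",
]
test_cases = [
    "Verify that the user can reset the password by email link",
    "Verify login with valid credentials",
]
print(uncovered_requirements(requirements, test_cases))
```

Here the password-reset requirement is matched by the first test case, while the export requirement has no sufficiently similar test case and is reported as a gap.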

3. Test Automation:

During test execution, scenarios can be run manually or through automation scripts written in a programming language. NLP can be used not only to generate test cases but also to write the code that executes them: it can translate scenarios expressed in natural language into the necessary commands in the target programming language. As in the other phases, this saves time, and it also lets the test team prepare automation and run tests without advanced coding knowledge.
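
A minimal sketch of this translation is shown below, assuming a small controlled vocabulary of steps. The step phrasings and the command names (`open`, `type`, `click`, `assert_text`) are hypothetical and do not come from any particular tool; a real NLP-based tool would accept much freer wording and would pass the commands to a browser driver.

```python
import re

# Hypothetical step patterns; each maps a natural-language step to a command tuple.
STEP_PATTERNS = [
    (re.compile(r'^open "(?P<url>[^"]+)"$', re.I),
     lambda m: ("open", m["url"])),
    (re.compile(r'^type "(?P<text>[^"]*)" into "(?P<field>[^"]+)"$', re.I),
     lambda m: ("type", m["field"], m["text"])),
    (re.compile(r'^click "(?P<target>[^"]+)"$', re.I),
     lambda m: ("click", m["target"])),
    (re.compile(r'^expect "(?P<text>[^"]+)"$', re.I),
     lambda m: ("assert_text", m["text"])),
]

def translate_step(step: str) -> tuple:
    """Translate one natural-language step into a command tuple."""
    for pattern, build in STEP_PATTERNS:
        match = pattern.match(step.strip())
        if match:
            return build(match)
    raise ValueError(f"No automation command known for step: {step!r}")

def translate_scenario(steps: list[str]) -> list[tuple]:
    """Translate every step of a scenario, failing on the first unknown step."""
    return [translate_step(s) for s in steps]

scenario = [
    'Open "https://example.com/login"',
    'Type "alice" into "Username"',
    'Click "Sign in"',
    'Expect "Welcome, alice"',
]
for command in translate_scenario(scenario):
    print(command)
```

Raising an error for an unrecognized step, rather than skipping it, keeps a scenario from silently running with missing actions.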

4. Reports and Documentation:

Reporting brings together the defects found and the results obtained during test execution. The expected results should already be specified when the test cases are written, so the report is produced by comparing the actual results with the expected ones. After this comparison, NLP can be used to write the report in natural language. Unlike the steps described earlier, this is a matter of generating natural language from the work done rather than processing natural language to reach a conclusion. NLP can also be used to evaluate reports. According to Fazzini et al., a method called Yakusu can generate new test cases from bug reports by processing their natural language; in a study of 62 bug reports, it successfully generated suitable scenarios for 59.7% of them [9].
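
The comparison of expected and actual results, followed by a natural-language summary, could look roughly like the sketch below. It uses fixed sentence templates as a stand-in for a language-generation model, and the `Result` class and report wording are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Result:
    case_id: str
    expected: str
    actual: str

    @property
    def passed(self) -> bool:
        # A case passes only when the observed result equals the expected one.
        return self.expected == self.actual

def write_report(results: list[Result]) -> str:
    """Summarize a test run in plain sentences, listing every failed case."""
    failed = [r for r in results if not r.passed]
    lines = [f"{len(results) - len(failed)} of {len(results)} test cases passed."]
    for r in failed:
        lines.append(
            f"{r.case_id} failed: expected '{r.expected}' but observed '{r.actual}'."
        )
    if not failed:
        lines.append("No defects were found in this run.")
    return "\n".join(lines)

results = [
    Result("TC-1", "Dashboard is shown", "Dashboard is shown"),
    Result("TC-2", "Error message is shown", "Page crashed"),
]
print(write_report(results))
```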

References:

[1] A. S. Yuksel and M. A. Karabıyık, “A study on text-to-SQL query prediction with natural language processing methods,” NOHU J. Eng. Sci., vol. 11, no. 4, pp. 846–855, Oct. 2022.

[2] H. Dhaduk, “8 Must-Know NLP Techniques to Extract Actionable Insights from Data,” Simform, https://www.simform.com/blog/nlp-techniques/ (accessed Apr. 19, 2023).

[3] “Software Testing Life Cycle (STLC),” Geeksforgeeks, https://www.geeksforgeeks.org/software-testing-life-cycle-stlc/ (accessed Feb. 23, 2024).

[4] “The 5 Steps in Natural Language Processing (NLP),” Twilio, https://www.twilio.com/en-us/blog/nlp-steps (accessed Jun. 21, 2023).

[5] J. Paul, “Natural Language Processing (NLP) in Software Testing: Automating Test Case Creation and Documentation,” Dzone, https://dzone.com/articles/natural-language-processing-nlp-in-software-testin (accessed Mar. 30, 2023).

[6] “NLP for Streamlined Test Case Documentation & Analysis,” Medium, https://medium.com/@workboxtech/nlp-for-streamlined-test-case-documentation-analysis-373c92f03f30 (accessed May 24, 2024).

[7] “POS (Parts-of-Speech) Tagging in NLP,” Geeksforgeeks, https://www.geeksforgeeks.org/nlp-part-of-speech-default-tagging/ (accessed Jan. 03, 2024).

[8] “Constituency Parsing and Dependency Parsing,” Geeksforgeeks, https://www.geeksforgeeks.org/constituency-parsing-and-dependency-parsing/ (accessed Feb. 22, 2023).

[9] H. Ayenew and M. Wagaw, “Software Test Case Generation Using Natural Language Processing (NLP): A Systematic Literature Review,” Artificial Intelligence Evolution, vol. 5, no. 1, Jan. 2024.

[10] M. Leotta et al., “An empirical study to compare three web tests automation approaches: NLP-based, programmable, and capture&replay,” J Softw Evol Proc., vol. 36, no. 5, Jul. 2024.

[11] U. Akpolat, “Yazılım Test Yaşam Döngüsü (STLC) Nedir?” Medium, https://medium.com/@umudigo48official/yaz%C4%B1l%C4%B1m-test-ya%C5%9Fam-d%C3%B6ng%C3%BCs%C3%BC-stlc-nedir-2632ad8b8324 (accessed Apr. 23, 2023).

[12] OpenAI, “Screenshot of ChatGPT conversation,” Online. (accessed: June 25, 2024).
