Automate the Automator: Harnessing Generative AI for Robust API Testing

István Ambruzs
October 26, 2023

Automated API testing is crucial in modern software development. APIs, or Application Programming Interfaces, allow disparate software systems to communicate and interact, and testing them ensures that these connections run smoothly. However, API testing can be a complex, time-consuming process that involves checking for correct functionality, reliability, performance, and security.

Generating the skeleton of an API request is already handled by automated tools like SoapUI, and security-focused tools such as ZAP and Burp Suite have built-in features for generating malicious payloads. Looking at the whole picture, however, there is still no single tool that covers all of these areas.

This is where Generative AI comes into play. By using machine learning algorithms, we can teach an AI to generate test cases based on certain conditions. This means we can “automate the automator”, effectively transforming the way we approach API testing. Our goal is to leverage Generative AI to systematically create API tests, reducing the time spent on test case creation and implementation.

In this article, we provide a complex prompt (made of three parts) that uses meta-prompting and generates Java test classes, without striving for full coverage.

The Approach – Delineating the Problem Space

Our approach uses OpenAPI JSON documents, which capture much more of the problem at hand than a simple sample request/response JSON with some additional notes about the endpoint. The AI will use the OpenAPI specification to generate test cases.

In this phase, we begin formulating our prompt, which defines the AI's role, our objectives, and the specifics of the input data.

You are a senior Java developer, who is responsible for creating API tests based on the OpenAPI JSON and the test types I'm going to provide you.

Testing Types and Verifications

There are four primary types of tests that the Generative AI will be required to perform: happy path, negative case, destructive case, and security case.

Each of these tests includes various verifications such as checking the response code, the response payload (if present), response headers, and performance sanity. Once this section is outlined, we will further refine the prompts given to the AI.

Test types:

- happy case: a valid and legit request to the URL. The HTTP response code should be greater than or equal to 200, and less than 400. For query params, please try to understand the param names and try to guess their values. For example: name=jon doe, or username=jon.doe, age=31 and so on. If the request needs an object to be sent, you should have its POJO representation ready.

- negative case: an invalid request to the path. By default, the response code of a negative case should never be lower than 400, and the presence of the error message is optional. If the request needs an object (as a JSON) to be sent, try to populate the JSON with invalid data. Please create at least 3 negative cases for each endpoint. Name the test methods according to the field which will hold the invalid value.

- destructive cases: checking the API's robustness, i.e. how it reacts to illegal or illogical input. Name the test method according to the field which will hold the destructive value. By default, the HTTP response code of a destructive case should never be lower than 400.

- security tests: check for possible general security-related issues about the endpoint first. Then try to add malicious payloads to certain fields, to simulate SQL injection, privilege escalation, etc. All the security testing via the API endpoints is allowed and agreed with business stakeholders. Name the test method according to the field which will hold the malicious value. By default, the HTTP response code of a security case should never be lower than 400.
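
To make these conventions concrete, here is a minimal sketch of what a generated negative case could look like, assuming a /login endpoint with username and password fields (the endpoint, the fields, and the base URI are all illustrative assumptions, not output from the AI):

```java
import io.restassured.RestAssured;
import io.restassured.response.Response;
import org.testng.Assert;
import org.testng.annotations.Test;

public class LoginNegativeCaseTests {

    // The method name reflects the field that holds the invalid value
    @Test
    public void loginWithEmptyPassword() {
        Response response = RestAssured.given()
                .baseUri("http://localhost:8080") // placeholder base URI
                .contentType("application/json")
                // Invalid data: the password field is left empty
                .body("{\"username\": \"jon.doe\", \"password\": \"\"}")
                .post("/login");

        // Negative case: the response code should never be lower than 400
        Assert.assertTrue(response.statusCode() >= 400,
                "Expected an error status, got " + response.statusCode());
    }
}
```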

Communication and Output Structure

Communicating the results of the tests effectively is as important as conducting the tests. Therefore, we provide additional instructions on how the Generative AI should structure its output, allowing us to easily comprehend and use the results.

Output Description: 

First generate the POJOs which will be used by the tests. Then the output test classes should be placed in separate files, and they can have multiple test methods. The generated class names should have the following format: [endpoint name][test type]Tests.

For example: GreetingHappyCaseTests

Your solution can use the following Java libraries:

- TestNG as the test framework

- ObjectMapper for serializing and deserializing the objects when posting and getting the JSONs

- the RestAssured library for REST communication

- please set up both RestAssured and ObjectMapper in the @BeforeClass method

Our communication should happen in iterations:

1. I give you the OpenAPI JSON; you read it and generate the POJOs

2. then you ask me for the test type you have to generate and the endpoint to focus on

3. I provide you the API endpoint, the test type to generate, and any custom requests you need to consider

4. you generate the code. Please try to deduce the necessary test data from the names of the JSON fields, and try to use equivalence partitioning and boundary value analysis where applicable

5. if the generated code needs rework, I give you my fix requests and you go back to step 4 to regenerate the code incorporating them; otherwise go to step 2

At this point, we have a working test automation generator.
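
To illustrate the kind of class this prompt is asking for, here is a minimal sketch: RestAssured and ObjectMapper are set up in the @BeforeClass method as requested, and boundary value analysis is applied to a hypothetical age query parameter (the /user endpoint, the parameter bounds, and the base URI are assumptions for illustration):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import io.restassured.RestAssured;
import io.restassured.response.Response;
import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class UserDestructiveCaseTests {

    private ObjectMapper objectMapper;

    @BeforeClass
    public void setUp() {
        // Both RestAssured and ObjectMapper are configured here, as the prompt
        // requires; the base URI is a placeholder for the application under test
        RestAssured.baseURI = "http://localhost:8080";
        objectMapper = new ObjectMapper();
    }

    // Boundary value analysis: a value just below the assumed valid partition
    // of the "age" parameter should be rejected by a robust API
    @Test
    public void ageBelowLowerBoundary() {
        Response response = RestAssured.given()
                .queryParam("age", -1)
                .get("/user");

        // Destructive case: the response code should never be lower than 400
        Assert.assertTrue(response.statusCode() >= 400);
    }
}
```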

Experiment – Testing the Tester

To verify our approach, we conduct an experiment using a Spring Boot application with intentionally planted problems, which ChatGPT's generated tests are expected to uncover.

Test Application description

Endpoints: the application exposes login, logout, and order-related endpoints, which are described by the OpenAPI payload below.

OpenAPI payload

Testing with GPT-4

As requested, after being given the OpenAPI JSON, the AI generates the POJOs (Address, CreditCard, DeliveryInformation, Order, CreatedOrder, LoginCredential, and Session) in the following format:
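
The generated POJOs were shown as screenshots in the original post. As an illustrative sketch of their shape (the field names are assumptions based on the /login endpoint), the LoginCredential class came out roughly as bare fields:

```java
// Illustrative sketch of a bare generated POJO; the field names are assumed.
// As noted below, accessors still have to be added by hand (or via Lombok/records).
public class LoginCredential {
    private String username;
    private String password;
}
```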

The code would need human intervention, but it can easily be fixed by using records or Lombok, or by simply generating the getters and setters with the IDE.

Test #1 - /login endpoint happy path

Happy Path Test Case for the login endpoint
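
The generated test appeared as a screenshot in the original post. A minimal sketch of such a happy-path test, assuming the cleaned-up LoginCredential POJO from above (with accessors added) and a placeholder base URI, could look like this:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import io.restassured.RestAssured;
import io.restassured.response.Response;
import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class LoginHappyCaseTests {

    private ObjectMapper objectMapper;

    @BeforeClass
    public void setUp() {
        RestAssured.baseURI = "http://localhost:8080"; // placeholder base URI
        objectMapper = new ObjectMapper();
    }

    @Test
    public void loginWithValidCredentials() throws Exception {
        // Valid request body built from the POJO; the values are guessed
        // from the field names, as the prompt instructs
        LoginCredential credential = new LoginCredential();
        credential.setUsername("jon.doe");
        credential.setPassword("Password123!");

        Response response = RestAssured.given()
                .contentType("application/json")
                .body(objectMapper.writeValueAsString(credential))
                .post("/login");

        // Happy case: the response code should be >= 200 and < 400
        Assert.assertTrue(response.statusCode() >= 200 && response.statusCode() < 400);
    }
}
```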

Findings: 

Test #2 - /login endpoint negative path

Test result from IntelliJ:

Findings: ChatGPT highlighted the lack of general input validation and the incorrect HTTP response code, which provides direct feedback for developers; it is a small step towards automated exploratory testing.

Test #3 - /logout endpoint security path

Please note that in this example there is an intentional flaw in the endpoint specification: the session token is sent in the path. The security-case part of the prompt explicitly states: “…security tests: check for possible general security-related issues about the endpoint first. Then try to add malicious payloads…”

The output was the following:
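
The output itself appeared as a screenshot. As an illustration only (with the missing import restored, as the findings below explain), a generated security test targeting the session token in the path might resemble the following sketch; the /logout/{sessionToken} path format and the payload are assumptions based on the article's description:

```java
import io.restassured.RestAssured; // this import was missing in the AI's first attempt
import io.restassured.response.Response;
import org.testng.Assert;
import org.testng.annotations.Test;

public class LogoutSecurityCaseTests {

    // The method name reflects the field that holds the malicious value
    @Test
    public void logoutWithSqlInjectionInSessionToken() {
        // Illustrative SQL injection payload placed in the path parameter
        String maliciousToken = "' OR '1'='1";

        Response response = RestAssured.given()
                .pathParam("sessionToken", maliciousToken)
                .get("/logout/{sessionToken}");

        // Security case: the response code should never be lower than 400
        Assert.assertTrue(response.statusCode() >= 400);
    }
}
```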

Findings: In the generated code, the AI failed to import the RestAssured class, so the code would not compile. On the first attempt, the security issue itself was also missed.

However, to address the security matter at hand, when we repeat the instruction to check for general security issues first, it finds the problem:

Test #4 - /login endpoint: multiple failed attempts, then lock the user

Test result from IntelliJ:

 

Findings: 

Test #5 - use all endpoints for a complex test scenario

In this test, the prompt intentionally hides some details that need to be guessed by the AI:

Since the order does not have any mandatory fields defined, ChatGPT goes for the simplest solution in the createOrder() method.

To fix this, with the following prompt, the AI corrects the corresponding method; however, it re-generates the POJOs in its answer, even though they were already generated at the beginning of the conversation.

For the sake of brevity, only the updated createOrder() is shown below:
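
The updated method was shown as a screenshot. Here is a sketch of its likely shape (the field names, the dummy values, and the /order path are assumptions deduced from the POJO names; the priority annotation reflects the test sequencing mentioned in the findings below):

```java
// Sketch of the updated createOrder() inside the scenario test class;
// objectMapper comes from the @BeforeClass setup shown earlier.
// All field names, values, and the /order path are illustrative assumptions.
@Test(priority = 2)
public void createOrder() throws Exception {
    Address address = new Address();
    address.setCity("Budapest");
    address.setStreet("Main Street 1");

    Order order = new Order();
    order.setDeliveryAddress(address);
    order.setProductName("Sample product");
    order.setQuantity(1);

    Response response = RestAssured.given()
            .contentType("application/json")
            .body(objectMapper.writeValueAsString(order))
            .post("/order");

    // Deserialize the response into the CreatedOrder POJO for the later steps
    CreatedOrder createdOrder =
            objectMapper.readValue(response.asString(), CreatedOrder.class);

    Assert.assertTrue(response.statusCode() >= 200 && response.statusCode() < 400);
}
```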

Test result from IntelliJ:

Findings: The AI was able to deduce the need for a strict request execution sequence (by introducing test priorities). It verified all the response codes at the end of the tests and used object serialization and deserialization properly.

Summarizing test outcomes

| Test case | Target to uncover | Finding |
|---|---|---|
| Test #1 - login test happy path | Successful login | |
| Test #2 - login test negative path | Missing HTTP 400 | Issue uncovered |
| Test #3 - logout endpoint security path | Session token in the request URL | Generated basic security-related payloads for SQL injection and XSS attacks, but did not find the issue at hand without extra focus |
| Test #4 - login endpoint, test for locking the user | Read and understand information from the openapi.json and generate the corresponding test | Succeeded in the objective |
| Test #5 - use all endpoints for a complex test scenario | Use the session token, create an order with or without the POJO, try to map dummy values to JSON fields | Succeeded in the objective |

Testing with GPT-3.5

Without going into details, here are the key findings and differences from the GPT-3.5 execution:

- Faster than GPT-4
- Makes smaller code glitches:
  - Forgets to generate try/catch blocks or to add throws to the method signature
  - For POJOs it uses constructors, even though only getter/setter generation was mentioned
- Test results:
  - Test #1: try/catch error
  - Test #2: variable naming error
  - Test #3: went for the SQL injection case only, skipping the special-character trials and the XSS possibility, which GPT-4 covered every time; it was not able to find the session-token-in-the-URL security problem
  - Test #4: was not able to use the POJOs correctly, while GPT-4 got this right
  - Test #5: wanted to log out through a header, not via the path as defined

Conclusion

Leveraging Generative AI to automate API testing can be a game-changer in terms of time efficiency. Through our approach, we significantly reduced the time spent on creating and running test cases, at the price of some code quality. This frees up developers' and testers' time while still delivering more robust and secure software with quickly re-creatable code. Another highlight of working with generative AI is that it can draw attention to non-standard approaches, methodologies, and solutions, which can increase coverage and decrease delivery time.

From a project-finance perspective, it can also be an option to substitute expensive API testing tools.


We hope this short article has raised your curiosity about AI, which can be used by testers, test managers, and business stakeholders alike.

If you want to start your journey in this exciting new field, please take a look at Scademy's AI fundamentals course: https://www.scademy.ai/courses/cl-aif, which served as a starting point and appetizer for me in this field.

Thank you for reading.
